Unsupervised domain adaptation multi-level adversarial network for semantic segmentation based on multi-modal features

Zeyu WANG; Shuhui BU; Wei HUANG; Yuanpan ZHENG; Qinggang WU; Huawen CHANG; Xu ZHANG

doi:10.11959/j.issn.1000-436x.2022212


Research article | Updated: 2024-06-05

- Unsupervised domain adaptation multi-level adversarial network for semantic segmentation based on multi-modal features
- Journal on Communications, 2022, 43(12): 157-171
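- Affiliations:

  1. School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, Henan
  2. School of Aeronautics, Northwestern Polytechnical University, Xi'an 710072, Shaanxi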
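- About the authors:

  Zeyu WANG (1989- ), male, born in Zhengzhou, Henan, Ph.D., lecturer at Zhengzhou University of Light Industry. Main research interests include computer vision, image processing, and deep learning.
  Shuhui BU (1978- ), male, born in Luoyang, Henan, Ph.D., professor and doctoral supervisor at Northwestern Polytechnical University. Main research interests include computer vision, image processing, and machine learning.
  Wei HUANG (1982- ), male, born in Zhengzhou, Henan, Ph.D., associate professor and master's supervisor at Zhengzhou University of Light Industry. Main research interests include remote sensing image processing and deep learning.
  Yuanpan ZHENG (1983- ), male, born in Zhengzhou, Henan, Ph.D., associate professor and master's supervisor at Zhengzhou University of Light Industry. Main research interests include image processing and smart emergency response.
  Qinggang WU (1984- ), male, born in Puyang, Henan, Ph.D., associate professor and master's supervisor at Zhengzhou University of Light Industry. Main research interests include computer vision, remote sensing image processing, and deep learning.
  Huawen CHANG (1980- ), male, born in Zhengzhou, Henan, Ph.D., lecturer and master's supervisor at Zhengzhou University of Light Industry. Main research interests include image quality assessment and computer vision.
  Xu ZHANG (1979- ), female, born in Nanyang, Henan, lecturer at Zhengzhou University of Light Industry. Main research interests include image processing and model checking.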
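- Funding:

  Science and Technology Research Project of Henan Province (222102210021); Key Scientific Research Project Plan of Higher Education Institutions of Henan Province (21A520049); Key Scientific Research Project Plan of Higher Education Institutions of Henan Province (23A520004)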
- DOI: 10.11959/j.issn.1000-436x.2022212
  CLC number: TP391
- Published online: 2022-12

  Published in print: 2022-12-25
Zeyu WANG, Shuhui BU, Wei HUANG, et al. Unsupervised domain adaptation multi-level adversarial network for semantic segmentation based on multi-modal features[J]. Journal on Communications, 2022, 43(12): 157-171. DOI: 10.11959/j.issn.1000-436x.2022212.

Abstract

To address the differences in the distributions of visual, spatial, and semantic features between domains in domain adaptation, an unsupervised domain adaptation multi-level adversarial network for semantic segmentation based on multi-modal features is proposed. First, an attentive fusion semantic segmentation network with a three-layer structure is designed to learn these three types of features from the source domain and the target domain, respectively. Second, a self-supervised learning method that combines distribution confidence and semantic confidence is introduced into single-level adversarial learning, so that more target-domain pixels are aligned while the distribution distance between the features learned in the two domains is minimized. Finally, three adversarial branches and three adaptive sub-networks are jointly optimized by a multi-level adversarial learning method based on multi-modal features, which effectively learns domain-invariant representations of the features extracted by each sub-network. Experimental results show that, compared with existing state-of-the-art methods, the proposed network achieves the best mean intersection over union on GTA5 to Cityscapes, SYNTHIA to Cityscapes, and SUN-RGBD to NYUD-v2, at 62.2%, 66.9%, and 59.7%, respectively.
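The reported figures are mean intersection over union (mIoU) scores. As a rough illustration of how this metric is commonly computed, the sketch below averages the per-class IoU over flattened label maps. It is not taken from the paper: the function name `mean_iou`, the `ignore_index=255` convention for unlabeled pixels, and the choice to skip classes that appear in neither map are all assumptions made for the example.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over flattened label maps (illustrative sketch, not the paper's code).

    pred, target: equal-length sequences of integer class labels.
    Pixels whose target equals ignore_index (assumed convention) are skipped.
    Classes absent from both pred and target do not enter the mean.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel: excluded from the metric
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of the predicted and the true class.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    # Per-class IoU: class 0 = 1/2, class 1 = 2/3, class 2 = 1/1.
    print(round(mean_iou(pred, target, num_classes=3), 4))  # 0.7222
```

In practice the counts are usually accumulated over the whole validation set before dividing, rather than averaged image by image; the loop above does the same when it is given the concatenated labels of all images.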


