Semantic segmentation of 3D point cloud based on contextual attention CNN

Jun YANG; Jisheng DANG

doi:10.11959/j.issn.1000-436x.2020128


Academic Communications | Updated: 2024-06-05

- Semantic segmentation of 3D point cloud based on contextual attention CNN
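- Journal on Communications, 2020, Vol. 41, No. 7, pp. 195-203
- Author affiliation:
  
  School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China
- Author biographies:
  
  Jun YANG (1973- ), male, Hui ethnicity, born in Wuzhong, Ningxia; Ph.D., professor and doctoral supervisor at Lanzhou Jiaotong University. His main research interests include computer graphics and digital image processing.
  Jisheng DANG (1991- ), male, born in Wuwei, Gansu; master's student at Lanzhou Jiaotong University. His main research interests include machine vision and pattern recognition.
- Funding:
  
  National Natural Science Foundation of China (61862039)
- DOI: 10.11959/j.issn.1000-436x.2020128
  Chinese Library Classification (CLC) number: TP391
- Online publication date: 2020-07
  
  Print publication date: 2020-07-25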
Jun YANG, Jisheng DANG. Semantic segmentation of 3D point cloud based on contextual attention CNN[J]. Journal on Communications, 2020, 41(7): 195-203. DOI: 10.11959/j.issn.1000-436x.2020128.

Abstract

To address the under-segmentation in 3D point cloud semantic segmentation caused by failing to incorporate the contextual fine-grained information of the point cloud, a 3D point cloud semantic segmentation algorithm based on a contextual attention convolutional neural network (CNN) is proposed. First, an attention encoding mechanism mines the fine-grained features within local regions of the point cloud. Second, a contextual recurrent neural network encoding mechanism captures the contextual features between multi-scale local regions, which complement the fine-grained local features. Finally, a multi-head mechanism is used to enhance the generalization ability of the network. Experimental results show that the proposed algorithm achieves a mean intersection over union (mIoU) of 85.4%, 56.7%, and 38.1% on the ShapeNet Parts, S3DIS, and vKITTI standard datasets, respectively, indicating good segmentation performance and good generalization ability.
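The abstract reports results as mIoU, so a small worked example of that metric may help. The following is a minimal sketch of the generic per-class definition, not the authors' evaluation code: the function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are all assumptions made for illustration. Benchmark protocols differ in how they average (for example per shape, per category, or per area), and the page does not state which convention produced the reported figures.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], truth: Sequence[int], num_classes: int) -> float:
    """Average the per-class intersection over union (IoU) of point labels.

    Classes that appear in neither `pred` nor `truth` are skipped
    (an assumption of this sketch, not a rule taken from the paper).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")

    intersection = [0] * num_classes  # points labeled c in both pred and truth
    pred_count = [0] * num_classes    # points predicted as c
    truth_count = [0] * num_classes   # points whose true label is c
    for p, t in zip(pred, truth):
        pred_count[p] += 1
        truth_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + truth_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six points, three classes (labels are made up).
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, truth, 3), 3))  # per-class IoU: 1/3, 2/3, 1/2
```

In the toy example the three per-class IoU values are 1/3, 2/3, and 1/2, so their mean is 0.5.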


