Study on traffic scene semantic segmentation method based on convolutional neural network

Linhui LI; Bo QIAN; Jing LIAN; Weina ZHENG; Yafu ZHOU

doi:10.11959/j.issn.1000-436x.2018053


Academic paper | Last updated: 2024-06-05

- Journal on Communications, 2018, 39(4): 123-130
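- Author affiliation:
  
  School of Automotive Engineering, Faculty of Vehicle Engineering and Mechanics, State Key Laboratory of Structural Analysis for Industrial Equipment, Dalian University of Technology, Dalian, Liaoning 116024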
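- Author biographies:
  
  Linhui LI (1981-), male, born in Huixian, Henan, Ph.D., is an associate professor at Dalian University of Technology. His research interests include automotive safety and driver assistance, intelligent vehicles, and on-board environment perception based on vision sensors.
  Bo QIAN (1991-), male, born in Suqian, Jiangsu, is a master's student at Dalian University of Technology. His research interests include image semantic segmentation and stereo vision.
  Jing LIAN (1980-), female, born in Gongzhuling, Jilin, Ph.D., is an associate professor at Dalian University of Technology. Her research interests include intelligent electric vehicles and the powertrain and vehicle control of new energy vehicles.
  Weina ZHENG (1994-), female, born in Rizhao, Shandong, is a master's student at Dalian University of Technology. Her research interests include traffic scene semantic segmentation and image understanding.
  Yafu ZHOU (1962-), male, born in Tieling, Liaoning, is a professor at Dalian University of Technology. His research interests include intelligent technologies for new energy vehicles, on-board information acquisition and remote monitoring, vehicle matching design and control for electric vehicles, and electric vehicle drive motors and their control.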
- Funding:
  
  The National Natural Science Foundation of China (51775082, 61473057, 61203171)
- DOI: 10.11959/j.issn.1000-436x.2018053
  CLC number: U495, TP391.4
- Published online: 2018-04
  
  Print publication: 2018-04-25

Linhui LI, Bo QIAN, Jing LIAN, et al. Study on traffic scene semantic segmentation method based on convolutional neural network[J]. Journal on Communications, 2018, 39(4): 123-130. DOI: 10.11959/j.issn.1000-436x.2018053.

Abstract

To improve the semantic segmentation accuracy of traffic scenes, a segmentation method based on RGB-D images and convolutional neural networks is proposed. First, a disparity map D is obtained with the semi-global stereo matching algorithm and fused with the RGB image into a four-channel RGB-D image, from which the sample library is built. Second, two convolutional neural networks with different structures are each trained with two different learning rate adjustment strategies. Finally, the trained networks are tested and compared. The experimental results show that the traffic scene semantic segmentation algorithm based on RGB-D images achieves higher segmentation accuracy than the algorithm based on RGB images.
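The fusion step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the disparity values are scaled before fusion, so rescaling to the 0-255 range and treating non-positive disparities as unmatched pixels are assumptions made here, as is the helper name `fuse_rgbd`. In practice the disparity map would come from a semi-global matcher (for example OpenCV's `cv2.StereoSGBM_create`); a small synthetic array stands in for it below.

```python
import numpy as np


def fuse_rgbd(rgb: np.ndarray, disparity: np.ndarray) -> np.ndarray:
    """Stack an (H, W, 3) uint8 RGB image and an (H, W) disparity map into (H, W, 4)."""
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("rgb must have shape (H, W, 3)")
    if disparity.shape != rgb.shape[:2]:
        raise ValueError("disparity must have shape (H, W)")

    d = disparity.astype(np.float32)
    # Assumption: non-positive disparity marks pixels the matcher could not resolve.
    valid = d > 0
    d_scaled = np.zeros_like(d)
    if valid.any():
        # Assumption: rescale valid disparities so the D channel shares the RGB value range.
        d_scaled[valid] = d[valid] / d[valid].max() * 255.0
    d_u8 = np.round(d_scaled).astype(np.uint8)

    # Append D as the fourth channel after R, G, B.
    return np.dstack([rgb, d_u8])


# Synthetic stand-ins for a rectified left image and its disparity map.
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
rgb[..., 0] = 200
disparity = np.array([[0.0, 16.0, 32.0], [64.0, -1.0, 8.0]], dtype=np.float32)

rgbd = fuse_rgbd(rgb, disparity)
print(rgbd.shape, rgbd.dtype)
```

Keeping D in the same 8-bit range as the color channels lets the four-channel image be stored and loaded like an ordinary image; a network consuming it would need a first convolution layer that accepts four input channels instead of three.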


