Journal of Transportation Systems Engineering and Information Technology ›› 2023, Vol. 23 ›› Issue (1): 236-244. DOI: 10.16097/j.cnki.1009-6744.2023.01.025

• Systems Engineering Theory and Methods •

双注意力引导的跨层优化交通场景语义分割

谢新林*1,2,罗臣彦1,2,续欣莹3,谢刚1,2,3   

  1. 太原科技大学,电子信息工程学院,太原 030024;2. 先进控制与装备智能化山西省重点实验室,太原 030024;3. 太原理工大学,电气与动力工程学院,太原 030024
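  • Received: 2022-10-10 Revised: 2022-11-16 Accepted: 2022-12-05 Online: 2023-02-25 Published: 2023-02-16
  • About the author: XIE Xin-lin (1990- ), male, from Yuncheng, Shanxi; lecturer, Ph.D.
  • Supported by:
    National Natural Science Foundation of China (62006169); Key Research and Development Program of Shanxi Province (202102020101005); Doctoral Scientific Research Start-up Fund of Taiyuan University of Science and Technology (20192047)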

Dual Attention Guided Cross-layer Optimized Traffic Scene Semantic Segmentation

XIE Xin-lin*1,2, LUO Chen-yan1,2, XU Xin-ying3, XIE Gang1,2,3   

  1. School of Electronic and Information Engineering, Taiyuan University of Science and Technology, Taiyuan 030024, China; 2. Shanxi Key Laboratory of Advanced Control and Equipment Intelligence, Taiyuan 030024, China; 3. College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China
  • Received: 2022-10-10 Revised: 2022-11-16 Accepted: 2022-12-05 Online: 2023-02-25 Published: 2023-02-16
  • Supported by:
    National Natural Science Foundation of China (62006169); Key Research and Development Program of Shanxi Province (202102020101005); Taiyuan University of Science and Technology Scientific Research Initial Funding (20192047)

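Abstract: To address unsmooth object segmentation edges and the difficulty of accurately segmenting small objects in traffic scenes, this paper proposes a dual attention guided, cross-layer optimized semantic segmentation algorithm for traffic scenes. First, a multi-branch feature extraction encoding network is constructed, and serial atrous convolutions with non-proportional dilation rates are used to extract spatial context information, thereby reducing the loss of small-object information. Second, a cross-layer feature fusion decoding network based on spatial alignment is constructed to fuse semantic and detail information and strengthen the representation of objects at different scales. Finally, channel and spatial attention mechanisms are proposed to model global channel correlations and long-range positional correlations, strengthening the network's ability to learn key features. Experimental results on the traffic scene datasets Cityscapes and CamVid show that the proposed feature extraction encoding network, cross-layer feature fusion decoding network, and attention modules are effective; the proposed algorithm achieves mean intersection over union values of 77.79% and 78.66%, smooths object segmentation edges, and is especially robust for long, thin strip-shaped objects.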

Key words: intelligent transportation, semantic segmentation, deep learning, attention mechanism, small object, traffic scene

Abstract: This paper proposes a dual attention guided, cross-layer optimized semantic segmentation algorithm for traffic scenes, addressing two problems: object segmentation edges are not smooth, and small objects are difficult to segment accurately. First, an encoding network with multi-branch feature extraction is established. Its atrous convolutions are applied in series with non-proportional dilation rates to extract spatial context information, which reduces the loss of small-object information. Then, a cross-layer feature fusion decoding network based on spatial alignment is constructed to fuse semantic information and detail information, thereby enhancing the representation of objects at different scales. Finally, channel and spatial attention mechanisms are proposed to model global channel correlations and long-distance location correlations, which enhances the network's ability to learn key features. Experimental results on the traffic scene datasets Cityscapes and CamVid show that the proposed feature extraction encoding network, cross-layer feature fusion decoding network, and attention mechanisms are effective. The proposed semantic segmentation algorithm achieves mean intersection over union (mIoU) values of 77.79% and 78.66%, and smooths object segmentation edges, especially for long and thin objects.
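The abstract does not list the dilation rates used, so the following is only an illustrative sketch of why non-proportional rates can help when atrous convolutions are stacked in series. It assumes 1-D, 3-tap kernels and compares two hypothetical rate sets, (2, 4, 8) and (1, 2, 5), neither of which is taken from the paper. The function names are likewise hypothetical. Rates that share a common factor sample the receptive field on a sparse grid, which is one way fine detail from small objects can be skipped.

```python
from itertools import product


def reachable_offsets(rates, taps=(-1, 0, 1)):
    """Input offsets that can influence one output position after stacking
    3-tap dilated (atrous) convolutions in series with the given rates."""
    return {
        sum(k * r for k, r in zip(ks, rates))
        for ks in product(taps, repeat=len(rates))
    }


def coverage(rates):
    """Fraction of positions inside the receptive field that are sampled."""
    offsets = reachable_offsets(rates)
    span = max(offsets) - min(offsets) + 1
    return len(offsets) / span


if __name__ == "__main__":
    # Hypothetical rate sets chosen for illustration only.
    for rates in [(2, 4, 8), (1, 2, 5)]:
        print(rates, f"coverage = {coverage(rates):.2f}")
```

With these assumed rates, the proportional set reaches only even offsets, while the non-proportional set reaches every position in its receptive field.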

Key words: intelligent transportation, semantic segmentation, deep learning, attention mechanism, small object, traffic scene
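For readers unfamiliar with the reported metric, here is a minimal sketch of mean intersection over union (mIoU) under its standard definition: the per-class ratio of correctly labeled pixels to the union of predicted and ground-truth pixels, averaged over classes. The function name `mean_iou` and the choice to skip classes that appear in neither prediction nor ground truth are assumptions made for this sketch. The paper's own evaluation code is not described in the abstract, and benchmark toolkits typically also handle ignore labels and accumulate counts over a whole dataset.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over flat sequences of integer class labels.

    Classes absent from both pred and target are skipped (an assumption
    of this sketch), so they do not affect the average.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            intersection[p] += 1
            union[p] += 1
        else:
            # A mismatched pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    # Per-class IoU here is 1/3, 2/3, and 1/2.
    print(f"mIoU = {mean_iou(pred, target, num_classes=3):.3f}")
```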

CLC number: