Journal of Transportation Systems Engineering and Information Technology, 2023, Vol. 23, Issue (2): 233-241. DOI: 10.16097/j.cnki.1009-6744.2023.02.025


Semantic Segmentation of Railway Scene Based on Reticulated Multi-scale and Bidirectional Channel Attention

LU Tong (a), YU Zu-jun (a, b), GUO Bao-qing* (a, c), RUAN Tao (a, b)

  1. a. School of Mechanical and Electronic Control Engineering; b. Frontier Science Center for Intelligent High-speed Railway System; c. Collaborative Innovation Center of Railway Traffic Safety, Beijing Jiaotong University, Beijing 100044, China
  • Received: 2022-12-29 Revised: 2023-02-15 Accepted: 2023-02-22 Online: 2023-04-25 Published: 2023-04-19
  • Supported by:
    Fundamental Research Funds for the Central Universities of Ministry of Education of China (2022JBXT005); National Natural Science Foundation of China (52072026)

Original Chinese title: 网状多尺度与双向通道注意力的铁路场景语义分割

Authors (Chinese names): 路通 (a), 余祖俊 (a, b), 郭保青* (a, c), 阮涛 (a, b)

  • About the author: LU Tong (1995- ), male, from Liaoyuan, Jilin Province; PhD candidate

Abstract: Semantic segmentation underpins intelligent perception. To address two difficulties in complex railway scenes, namely that categories are easily confused and that effective features are hard to extract, this paper proposes a railway-scene semantic segmentation network based on reticulated multi-scale fusion and bidirectional channel attention. To strengthen the model's ability to discriminate between different railway facilities, a reticulated multi-scale fusion module is proposed. Embedded in the backbone network, the module connects features of different scales in parallel and exchanges information among them in a reticulated pattern at the fusion layer, so that features from different branches are fused. Because each branch aggregates the inputs from the other branches, the model output retains multi-resolution features simultaneously. To improve the extraction of effective features in complex railway scenes, a bidirectional channel attention module is proposed. The forward channel attention module is placed after the up- and down-sampling operations of the backbone network; it reweights and recombines the output feature map using input features of different scales, adaptively strengthening the expression of effective features. The inverted channel attention module is inserted before the final output of the model; it generates effective high-level semantic information while retaining low-level spatial information. Experiments on the RailSem19 railway dataset show that the proposed method significantly improves the segmentation of easily confused categories and of railway facility categories such as track area, catenary mast, train, and protective fence. It reaches an mIoU of 65.12%, an improvement over the other methods compared.

Key words: intelligent transportation, railway semantic segmentation, deep learning, complex railway scene, reticulated multi-scale fusion, bidirectional channel attention
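
The abstract reports its headline result as mIoU (mean intersection over union). For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition, in which the IoU of a class is TP / (TP + FP + FN) and mIoU is the average over classes. It is not the authors' evaluation code: the function names, the flattened label lists, and the `ignore_index` value of 255 are assumptions made for illustration.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count pixels per (true class, predicted class) pair.

    y_true and y_pred are flattened label maps of equal length.
    Pixels whose true label equals ignore_index (an assumed convention)
    are skipped.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_index:
            continue
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                    # true class c, predicted otherwise
        fp = sum(row[c] for row in cm) - tp     # predicted c, true class differs
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both ground truth and prediction
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes and six pixels.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(confusion_matrix(y_true, y_pred, num_classes=3))
print(f"mIoU = {score:.4f}")
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. Evaluation protocols differ in how they treat classes that never occur and which labels they ignore, so published figures are only comparable under the same protocol.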

