Journal of Transportation Systems Engineering and Information Technology, 2026, Vol. 26, Issue (2): 137-147. DOI: 10.16097/j.cnki.1009-6744.2026.02.013

• Intelligent Transportation Systems and Information Technology •

Collaborative Deep Reinforcement Learning Method for Expressways Integrating Dual Attention Mechanism

SUN Jian* (a,b), JI Yuwei (a), YU Kewei (a), LI Zihao (a), ZHAO Yulin (a)

  1. a. School of Future Transportation; b. School of Transportation Engineering, Chang'an University, Xi'an 710064, China
  • Received: 2025-11-24  Revised: 2026-01-05  Accepted: 2026-01-12  Online: 2026-04-25  Published: 2026-04-20
  • About the author: SUN Jian (1977-), male, from Wuhu, Anhui Province; professor, Ph.D.
  • Supported by:
    National Natural Science Foundation of China (52172319); National Social Science Foundation of China (22XJY030).

Abstract: Ramp merging areas on urban expressways are frequent traffic bottlenecks, and mixed traffic flow composed of connected and automated vehicles (CAVs) and human-driven vehicles (HDVs) poses new challenges for traffic control. This study proposes a variable speed limit (VSL) control method for mixed traffic flow that accounts for ramp queue spillover, with the aim of alleviating congestion and improving traffic flow stability. The collaborative optimization control problem in the merging area is formulated as a Markov decision process (MDP), and an integrated method, DDQN-CBAM, is proposed that combines a double deep Q-network (DDQN) with the convolutional block attention module (CBAM). Specifically, an extended state space is constructed from multidimensional core parameters, such as ramp queue length and merging area density, together with grid-based features. The CBAM dual attention mechanism is introduced to strengthen the extraction of key features. A reward function is designed that integrates multiple objectives, including traffic efficiency and queue control, and training is improved with strategies such as prioritized experience replay and progressive traffic input. The method is validated on the Simulation of Urban Mobility (SUMO) platform, using the North Third Ring Road Expressway in Xuzhou, China, as a case study. Experimental results show that, compared with traditional control strategies, the proposed method reduces total travel time by 26.49%, increases total travel distance by 35.95%, and lowers the standard deviation of traffic flow by more than 22.5%, while the hourly control frequency and the speed adjustment rate stabilize at approximately 10 times per hour and 0.14, respectively. The method combines engineering applicability with robustness and provides reliable support for traffic control in ramp merging areas on urban expressways.

Key words: intelligent transportation, variable speed limit control, deep reinforcement learning, urban expressway, merge area collaborative impacts, attention mechanism
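
For readers unfamiliar with the DDQN component named in the abstract, the following is a minimal sketch of the standard Double DQN bootstrap target, in which the online network selects the next action and the target network evaluates it. It is not the authors' implementation: the function name `double_dqn_target`, the three-entry action set of candidate speed limits, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Return the Double DQN target for a single transition.

    next_q_online and next_q_target hold the Q-values of the next state
    under the online and target networks, one entry per action.
    """
    if done:
        # Terminal transitions do not bootstrap from the next state.
        return reward
    # Step 1: the online network chooses the greedy next action.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # Step 2: the target network supplies the value of that action.
    return reward + gamma * next_q_target[best_action]


# Hypothetical discrete action set (candidate speed limits in km/h);
# these values are illustrative and do not come from the paper.
SPEED_LIMITS_KMH = [40, 60, 80]

y = double_dqn_target(
    reward=1.0,
    next_q_online=[0.2, 0.9, 0.5],
    next_q_target=[0.3, 0.4, 0.8],
    gamma=0.9,
)
```

In this toy call the online network prefers action index 1, so the target is 1.0 + 0.9 × 0.4 = 1.36. Taking the maximum over the target network's own values, as plain DQN does, would instead give 1.0 + 0.9 × 0.8 = 1.72; decoupling selection from evaluation is what reduces this overestimation.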
