Journal of Transportation Systems Engineering and Information Technology ›› 2025, Vol. 25 ›› Issue (4): 63-72. DOI: 10.16097/j.cnki.1009-6744.2025.04.007

• Intelligent Transportation Systems and Information Technology •

Hierarchical Collaborative Control of Urban Traffic Signals Based on Deep Reinforcement Learning

DAI Liang*, DU Pengfei, HUANG Zibin, YANG Pengbo

  1. School of Electronics and Control Engineering, Chang'an University, Xi'an 710064, China
  • Received: 2025-02-28 Revised: 2025-03-28 Accepted: 2025-04-09 Online: 2025-08-25 Published: 2025-08-25
  • About the author: DAI Liang (born 1981), male, from Ankang, Shaanxi; professor, Ph.D.
  • Supported by:
    Transportation Research Project of Shaanxi Provincial Department of Transport (24-15R); Fundamental Research Funds for the Central Universities, CHD (300102323201).

Abstract: Reinforcement learning is highly adaptive: it continuously adjusts its policy and behavior in response to environmental changes and feedback signals, which makes it a promising new tool for urban traffic signal control. Existing reinforcement learning methods for coordinated traffic signal control suffer from inefficient cooperation among agents and lack a mechanism for partitioning the control area. To address these problems, this paper proposes a hierarchical collaborative control architecture for traffic signals. Intersection agents are constructed with a state space and reward function that are designed jointly so that they remain correlated. A traffic control sub-area partitioning model based on congestion diffusion is established to partition control sub-areas dynamically. Finally, sub-area agents are constructed to coordinate the intersection agents within each sub-area; each intersection agent optimizes its signal control scheme using the global advice provided by its sub-area agent together with the conditions at its own intersection, achieving hierarchical collaborative control of regional traffic signals. Simulation results show that, compared with existing fixed-time control and reinforcement learning methods, the proposed method reduces average travel time by 56.78% and 29.23%, respectively. Compared with MPLight (Max Pressure Light), average speed increases by 7.21%, while average travel time and the number of stops decrease by 22.62% and 3.98%, respectively. In addition, comparisons across road networks of different scales and topologies indicate that the proposed method has a certain degree of portability in networks of homogeneous intersections.
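
The abstract does not specify the congestion-diffusion partitioning model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each intersection has a scalar congestion index in [0, 1] and groups neighbouring congested intersections into one sub-area by breadth-first search. The function name `partition_subareas`, the index itself, and the 0.6 threshold are hypothetical.

```python
from collections import deque


def partition_subareas(adjacency, congestion, threshold=0.6):
    """Group adjacent congested intersections into sub-areas (illustrative only).

    adjacency:  dict mapping intersection id -> list of neighbouring ids
    congestion: dict mapping intersection id -> congestion index in [0, 1]
    threshold:  hypothetical cut-off above which an intersection is congested
    """
    visited = set()
    subareas = []
    for start in sorted(adjacency):
        if start in visited or congestion[start] < threshold:
            continue
        # Breadth-first search spreads outward through congested neighbours,
        # loosely imitating congestion diffusing along connected links.
        queue = deque([start])
        visited.add(start)
        members = []
        while queue:
            node = queue.popleft()
            members.append(node)
            for nxt in adjacency[node]:
                if nxt not in visited and congestion[nxt] >= threshold:
                    visited.add(nxt)
                    queue.append(nxt)
        subareas.append(sorted(members))
    # Intersections that are not congested remain singleton sub-areas.
    for node in sorted(adjacency):
        if node not in visited:
            subareas.append([node])
    return subareas


# Toy corridor A - B - C - D with made-up congestion values.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
congestion = {"A": 0.8, "B": 0.7, "C": 0.2, "D": 0.9}
print(partition_subareas(adjacency, congestion))
```

Re-running the function whenever the congestion indices are refreshed would give a dynamic partition in this simplified setting. In the architecture described above, each resulting group would then be assigned its own sub-area agent.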

Key words: intelligent transportation, traffic signal control, deep reinforcement learning, multi-agent, sub-area division

CLC Number: