Journal of Transportation Systems Engineering and Information Technology, 2020, Vol. 20, Issue (2): 76-82.

• Intelligent Transportation Systems and Information Technology •

Distributed Signal Control of Multi-agent Reinforcement Learning Based on Game

QU Zhao-wei, PAN Zhao-tian, CHEN Yong-heng*, LI Hai-tao, WANG Xin

  1. College of Transportation, Jilin University, Changchun 130022, China
  • Received: 2019-12-10 Revised: 2020-02-11 Online: 2020-04-25 Published: 2020-04-30
  • About the author: QU Zhao-wei (1962-), male, from Da'an, Jilin; professor, Ph.D.
  • Funding:

    National Natural Science Foundation of China (51705196).

Abstract:

Unbalanced and fluctuating traffic demand makes distributed signal control harder to optimize. In existing independent-action multi-agent reinforcement learning (IA-MARL), each agent decides solely on the basis of its own historical experience, so IA-MARL-based distributed signal control struggles to relieve the effects of unbalanced and fluctuating demand in a timely manner. This paper integrates the game-theoretic concept of mixed-strategy Nash equilibrium into the IA-MARL decision process and proposes a game-based multi-agent reinforcement learning (G-MARL) framework. Distributed control methods based on IA-MARL and G-MARL were each simulated numerically on a grid network with unbalanced traffic input and Poisson arrival rates to obtain curves of unit travel time and unit average vehicle delay. Compared with IA-MARL, G-MARL improves unit travel time by 59.94% and unit average vehicle delay by 81.45%. The results show that G-MARL is suitable for distributed signal control under unsaturated conditions with unbalanced and fluctuating traffic demand.

Key words: intelligent transportation, distributed traffic signal control, multi-agent reinforcement learning, urban road network under unbalanced demand, game theory, numerical simulation
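
The abstract does not spell out how the mixed-strategy Nash equilibrium enters the agents' decisions, so the following is only an illustrative sketch of the general concept, not the authors' G-MARL procedure. It assumes two neighboring intersection agents with two actions each (for example, keep or switch the current phase) and hypothetical payoff matrices `a` and `b`, which in a learning setting might be derived from each agent's learned value estimates. The function names `mixed_nash_2x2` and `sample_action` are invented for this example. The equilibrium is found from the standard indifference conditions: each agent randomizes so that the other agent's two actions yield equal expected payoff.

```python
import random
from typing import Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def mixed_nash_2x2(a: Matrix, b: Matrix) -> Optional[Tuple[float, float]]:
    """Fully mixed Nash equilibrium of a 2x2 bimatrix game.

    a[i][j] is the payoff to agent 1 (rows), b[i][j] the payoff to
    agent 2 (columns), when agent 1 plays i and agent 2 plays j.
    Returns (p, q), where p is the probability that agent 1 plays
    action 0 and q the probability that agent 2 plays action 0,
    or None if no fully mixed equilibrium exists.
    """
    # p must make agent 2 indifferent between its two columns.
    denom_p = b[0][0] - b[1][0] - b[0][1] + b[1][1]
    # q must make agent 1 indifferent between its two rows.
    denom_q = a[0][0] - a[0][1] - a[1][0] + a[1][1]
    if denom_p == 0 or denom_q == 0:
        return None
    p = (b[1][1] - b[1][0]) / denom_p
    q = (a[1][1] - a[0][1]) / denom_q
    # Both probabilities must lie strictly inside (0, 1) to be fully mixed.
    if not (0.0 < p < 1.0 and 0.0 < q < 1.0):
        return None
    return p, q


def sample_action(prob_action0: float, rng: random.Random) -> int:
    """Draw action 0 with probability prob_action0, otherwise action 1."""
    return 0 if rng.random() < prob_action0 else 1
```

With hypothetical payoffs expressed as negative delays, such as `a = [[-4, -1], [-2, -3]]` and `b = [[-4, -2], [-1, -3]]`, each agent would call `mixed_nash_2x2(a, b)` and then draw its action with `sample_action`. When the function returns `None`, a real controller would need a fallback such as a pure-strategy choice; that case is deliberately left out of this sketch.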

CLC Number: