Journal of Transportation Systems Engineering and Information Technology ›› 2025, Vol. 25 ›› Issue (2): 180-189. DOI: 10.16097/j.cnki.1009-6744.2025.02.017

• Systems Engineering Theory and Methods •

Risk Identification Method for Abnormal Driving Behavior Based on Interpretable Ensemble Learning Model

DENG Yuanchang*, JIANG Yunxuan, TAO Shengqin

  1. Guangdong Provincial Key Laboratory of Intelligent Transportation System, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, Guangdong, China
  • Received: 2024-11-14 Revised: 2025-02-12 Accepted: 2025-02-24 Online: 2025-04-25 Published: 2025-04-20
  • About the author: DENG Yuanchang (born 1972), male, from Linchuan, Jiangxi; associate professor, PhD.
  • Supported by:
    National Natural Science Foundation of China (U1611461).

Abstract: To identify the risk of abnormal driving behavior accurately and to overcome the poor interpretability of existing models, this study collected vehicle motion data through naturalistic driving experiments. Five types of abnormal driving behavior were considered: speeding, abrupt speed changes, sharp turning, following too closely, and dangerous lane changing. A threshold method was used to quantify the risk coefficient of each behavior, and the CRITIC (Criteria Importance Through Inter-criteria Correlation) weighting method was combined with a quantile method to classify the risk levels of abnormal driving behavior. A Stacking-based ensemble learning identification model was then constructed. Based on the training results of different learner combinations, GBDT (Gradient Boosting Decision Tree), AdaBoost, and XGBoost, which showed the best overall performance, were selected as the primary learners, and logistic regression was used as the secondary learner. On this basis, the SHAP (SHapley Additive exPlanations) algorithm was used to analyze how the feature variables influence the identification results of the optimal Stacking model. The results show that the optimal Stacking model reaches an identification accuracy of 92.68%, identifying abnormal driving behavior risk with high accuracy. Vehicle speed and lane-change time-to-collision are the features with the greatest influence on the model's identification results: a vehicle speed above 95 km·h⁻¹ or a lane-change time-to-collision below 2.8 s increases behavioral risk, and the risk level is higher still when vehicle speed exceeds 150 km·h⁻¹. This study provides a feasible framework for identifying and interpreting the risk of abnormal driving behavior and is expected to offer technical support for improving traffic safety.

Key words: traffic engineering, risk identification, Stacking ensemble learning, abnormal driving behavior, naturalistic driving experiment, SHAP algorithm
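
The abstract names CRITIC weighting and a quantile method for grading risk but gives no formulas on this page. The following is a minimal sketch of the standard CRITIC procedure (min-max normalisation, contrast measured by standard deviation, conflict measured by one minus the Pearson correlation) followed by quantile-based grading. It is not the authors' implementation: the sample values, the use of the population standard deviation, the choice of three risk levels, and the function names `critic_weights` and `risk_levels` are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, pstdev, quantiles


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    # Clamp to [-1, 1] to absorb floating-point round-off.
    return max(-1.0, min(1.0, cov / sqrt(var_x * var_y)))


def critic_weights(rows):
    """Return one CRITIC weight per criterion; rows are samples, columns are criteria."""
    cols = [list(c) for c in zip(*rows)]
    # Min-max normalise each criterion to [0, 1].
    norm = []
    for c in cols:
        lo, hi = min(c), max(c)
        span = hi - lo
        norm.append([(v - lo) / span if span else 0.0 for v in c])
    # Information content = contrast (std dev) * conflict (sum of 1 - r over other criteria).
    info = []
    for j, cj in enumerate(norm):
        conflict = sum(1.0 - pearson(cj, ck) for k, ck in enumerate(norm) if k != j)
        info.append(pstdev(cj) * conflict)
    total = sum(info)
    if total == 0:
        raise ValueError("all criteria are constant; CRITIC weights are undefined")
    return [i / total for i in info]


def risk_levels(scores, n_levels=3):
    """Grade each score from 1 (lowest) to n_levels using quantile cut points."""
    cuts = quantiles(scores, n=n_levels, method="inclusive")
    return [1 + sum(s > c for c in cuts) for s in scores]


# Hypothetical risk coefficients per sample, in the order: speeding, abrupt speed
# change, sharp turning, following too closely, dangerous lane changing.
samples = [
    [0.10, 0.20, 0.05, 0.30, 0.00],
    [0.40, 0.10, 0.20, 0.10, 0.50],
    [0.80, 0.60, 0.10, 0.70, 0.90],
    [0.20, 0.90, 0.70, 0.20, 0.10],
    [0.60, 0.30, 0.90, 0.50, 0.40],
    [0.00, 0.50, 0.40, 0.90, 0.70],
]

weights = critic_weights(samples)
# Composite risk score: weighted sum of the raw coefficients (an assumption).
scores = [sum(w * v for w, v in zip(weights, row)) for row in samples]
levels = risk_levels(scores)

print("weights:", [round(w, 3) for w in weights])
print("levels:", levels)
```

CRITIC gives more weight to a behavior whose coefficient varies widely across samples and is weakly correlated with the others, so no hand-tuned weights are required. The quantile cut points then depend only on the distribution of the composite scores.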

CLC Number: