Journal of Transportation Systems Engineering and Information Technology (交通运输系统工程与信息) ›› 2022, Vol. 22 ›› Issue (1): 195-208. DOI: 10.16097/j.cnki.1009-6744.2022.01.021

• Systems Engineering Theory and Methods •

Risk Recognition of Older Pedestrian Traffic Crashes Based on Gradient Association Rules

袁振洲 a, b, 郭曼泽 a, b, 彭泳鑫 a, b, 杨洋* a, b

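  1. Beijing Jiaotong University, a. Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport; b. School of Traffic and Transportation, Beijing 100044, China
  • Received: 2021-07-23 Revised: 2021-09-08 Accepted: 2021-10-11 Publication date: 2022-02-25 Release date: 2022-02-23
  • About the author: 袁振洲 (YUAN Zhen-zhou, 1966- ), male, from Shulan, Jilin; professor, Ph.D.
  • Supported by:
    China Postdoctoral Science Foundation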

Risk Recognition of Older Pedestrian Traffic Crashes Based on XGB-Apriori Algorithm

YUAN Zhen-zhou a, b, GUO Man-ze a, b, PENG Yong-xin a, b, YANG Yang* a, b

  1. a. Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport; b. School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China
  • Received: 2021-07-23 Revised: 2021-09-08 Accepted: 2021-10-11 Online: 2022-02-25 Published: 2022-02-23
  • Supported by:
    China Postdoctoral Science Foundation (2021M700333).

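摘要 (Abstract): To build a methodological framework for identifying the risk factors associated with the severity of older pedestrian traffic crashes, this paper applies the Extreme Gradient Boost-Apriori (XGB-Apriori) association rule mining algorithm to identify risk factors for older pedestrian crashes on urban roads. Machine learning is used to optimize the structure of the association rule algorithm: variable features are selected using the XGBoost (Extreme Gradient Boost) algorithm and the SFM (Select From Model) feature selection class in the machine learning library scikit-learn. Ordered, directional constraints are then imposed on the Apriori algorithm, yielding a data mining technique suited to crash causation analysis. Associated items are identified through level-wise iteration, frequent itemsets are selected, and association rules with high confidence and high lift are summarized. The model evaluation shows that the SFM function used in this paper reaches an accuracy of 78.31%, and that the XGB-Apriori association rule algorithm improves accuracy by 91% over the traditional algorithm. The mining results show that driver and pedestrian characteristics, vehicle characteristics, collision conditions, and road characteristics all have an important influence on the severity of older pedestrian crashes. Specifically, male drivers are involved more frequently in fatal pedestrian crashes, while female drivers are involved more frequently in injury crashes; large and heavy vehicles (SUVs, trucks, construction vehicles) are involved in fatal crashes more frequently than passenger cars; and on ramps and other curved, graded road sections, fatal crashes involving older pedestrians occur more frequently than on sections with gentler alignment. This paper comprehensively identifies the coupled factors in older pedestrian crashes and proposes a targeted method for accurate risk prediction, prevention, and control, providing necessary theoretical support for the effective protection of vulnerable road users.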
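The feature selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it only mirrors the idea behind threshold-based selection, where a fitted model supplies one importance score per feature and features at or above a threshold are kept. The function name `select_by_importance`, the use of the mean importance as the default threshold, and the importance scores themselves are assumptions made for illustration; the paper's actual threshold and scores are not given in the abstract.

```python
from statistics import mean


def select_by_importance(importances, threshold=None):
    """Keep features whose importance is at or above the threshold.

    `importances` maps feature name -> importance score (for example,
    scores taken from a fitted gradient-boosting model). If no threshold
    is given, the mean importance is used (an assumption of this sketch).
    """
    if threshold is None:
        threshold = mean(importances.values())
    return [name for name, score in importances.items() if score >= threshold]


# Hypothetical importance scores, invented for illustration only.
toy_importances = {
    "driver_sex": 0.30,
    "vehicle_type": 0.25,
    "road_alignment": 0.25,
    "weather": 0.12,
    "day_of_week": 0.08,
}

selected = select_by_importance(toy_importances)
selected_loose = select_by_importance(toy_importances, threshold=0.10)
```

With the toy scores, the mean threshold keeps the three highest-scoring features, and lowering the threshold to 0.10 additionally keeps `weather`. Only the selected features would then be passed on to rule mining.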

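关键词 (Key words): traffic engineering, crash risk recognition, Apriori algorithm, older pedestrian, XGBoost algorithm, pedestrian traffic crash, machine learning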

Abstract: To provide a system for recognizing the safety features associated with the severity of older pedestrian traffic crashes, this paper applies the Extreme Gradient Boost-Apriori (XGB-Apriori) algorithm to recognize the features of older pedestrian crashes in the road network. First, the methodology optimizes the weights of crash-associated features by applying the Select From Model (SFM) function to the XGBoost algorithm in scikit-learn. This process trains an XGBoost model, selects important features against a preset threshold, and calculates a relative feature score (F-score) for the selected features. A directional constraint is then set up to obtain a data mining program suited to the causality analysis of traffic crashes, yielding a multi-dimensional interaction Apriori algorithm. The algorithm recognizes the associated features, selects the frequent features, and outputs association rules with relatively high confidence and lift. The study also evaluates the SFM function and the XGB-Apriori algorithm: the accuracy of the SFM function is 78.31%, and XGB-Apriori improves accuracy by 91% over the traditional algorithm, which indicates that the proposed algorithm and system can accurately predict the correlations among the causes and features leading to the severity of traffic crashes involving older pedestrians. The study provides insights into the influence of driver and pedestrian demographics, vehicle features, and road structure on the severity of older pedestrian crashes: (1) more fatal crashes involve male drivers than female drivers; (2) SUVs, pickup trucks, and utility vehicles are involved in more fatal pedestrian crashes than passenger cars; and (3) an older pedestrian crash on a graded, curved road is more likely to be fatal. This paper proposes an accurate prognostic method for the comprehensive identification of the coupling factors of older pedestrian crashes and the implementation of targeted risk prevention and control, providing the necessary theoretical support for the effective protection of vulnerable groups on the road.
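The rule-mining stage summarized above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' XGB-Apriori code: the directional constraint is interpreted here as restricting each rule's consequent to a severity outcome, the function names `frequent_itemsets` and `severity_rules` are hypothetical, and the six crash records are invented toy data rather than values from the study. Support is the share of records containing an itemset, confidence is support(itemset) / support(antecedent), and lift is confidence / support(consequent).

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori) search: grow candidates only from frequent itemsets."""
    n = len(transactions)
    support = {}
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items]  # level 1: single items
    k = 1
    while level:
        frequent = []
        for cand in level:
            s = sum(1 for t in transactions if cand <= t) / n
            if s >= min_support:
                support[cand] = s
                frequent.append(cand)
        k += 1
        # Join step: unions of frequent itemsets that have exactly k items.
        candidates = {a | b for a, b in combinations(frequent, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(sub) in support for sub in combinations(c, k - 1))]
    return support


def severity_rules(support, targets, min_confidence, min_lift):
    """Rules {factors} -> {severity}; the consequent is restricted to `targets`."""
    rules = []
    for itemset, s in support.items():
        for target in targets & itemset:
            antecedent = itemset - {target}
            if not antecedent or antecedent & targets:
                continue  # need a non-empty antecedent made of factors only
            confidence = s / support[antecedent]
            lift = confidence / support[frozenset([target])]
            if confidence >= min_confidence and lift >= min_lift:
                rules.append((antecedent, target, s, confidence, lift))
    return rules


# Toy crash records, invented for illustration only.
crashes = [
    {"male_driver", "suv", "fatal"},
    {"male_driver", "suv", "fatal"},
    {"male_driver", "car", "injury"},
    {"female_driver", "car", "injury"},
    {"female_driver", "car", "injury"},
    {"female_driver", "suv", "fatal"},
]

support = frequent_itemsets(crashes, min_support=0.3)
rules = severity_rules(support, {"fatal", "injury"}, min_confidence=0.9, min_lift=1.5)
for antecedent, target, s, confidence, lift in rules:
    print(sorted(antecedent), "->", target, round(s, 2), confidence, lift)
```

Restricting the consequent keeps only rules that read as "factors imply severity" and discards rules between factors (for example, vehicle type implying driver sex), which is one way to make Apriori output usable for crash causation analysis.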

Key words: traffic engineering, crash risk recognition, Apriori algorithm, older pedestrian, XGBoost algorithm, pedestrian crash, machine learning

CLC Number: