Journal of Transportation Systems Engineering and Information Technology, 2023, Vol. 23, Issue (6): 227-238. DOI: 10.16097/j.cnki.1009-6744.2023.06.023

• Systems Engineering Theory and Methods •

轨道交通站点聚类及其对客流预测的影响分析

户佐安*1a, 1b, 1c,邓锦程1a,杨江浩1a,赵妍2   

  1. 西南交通大学,a. 交通运输与物流学院,b. 综合交通大数据应用技术国家工程实验室,c. 综合交通运输智能化国家地方联合工程实验室,成都 611756;2. 成都信息工程大学,资源环境学院,成都 610225
  • 收稿日期:2023-08-02 修回日期:2023-09-27 接受日期:2023-10-07 出版日期:2023-12-25 发布日期:2023-12-23
  • About the author: HU Zuo-an (1979- ), male, native of Huangmei, Hubei; associate professor, Ph.D.
  • 基金资助:
    国家自然科学基金(61104175);四川省科技计划项目(2021YJ0067)

Rail Transit Station Clustering and Its Impact on Passenger Flow Forecasting

HU Zuo-an*1a, 1b, 1c,DENG Jin-cheng1a,YANG Jiang-hao1a,ZHAO Yan2   

  1. a. School of Transportation and Logistics, b. National Engineering Laboratory of Integrated Transportation Big Data Application Technology, c. National United Engineering Laboratory of Integrated and Intelligent Transportation, Southwest Jiaotong University, Chengdu 611756, China; 2. College of Resources and Environment, Chengdu University of Information Technology, Chengdu 610225, China
  • Received: 2023-08-02 Revised: 2023-09-27 Accepted: 2023-10-07 Online: 2023-12-25 Published: 2023-12-23
  • Supported by:
    National Natural Science Foundation of China (61104175); Sichuan Science and Technology Program (2021YJ0067)

摘要: 城市轨道交通站点受多层面因素交互作用而反映出异质性,为实现站点精细化分类,本文统筹考虑地铁刷卡数据、兴趣点数据和地铁网络数据,提取客流、土地利用和网络性质等特征,其中,客流层面考虑工作日、周末和节假日等不同日期类型下客流状态,土地利用层面考虑站点辐射区用地强度和均衡性,网络层面考虑节点自身特性和影响能力。构建基于主成分分析与K-means++算法的聚类模型,综合聚类评价指标确定簇数,辨析不同类型站点多维度特性,结合站区土地利用和站点网络特征探讨对出行活动的影响,并设计簇内联合预测和整体联合预测策略,采用3种多元时序预测方法探究站点聚类对预测性能的影响。研究结果表明:考虑全部客流特征时,划分为10簇,考虑工作日进站客流特征时,划分为5簇,充分挖掘客流时变特征能够获得更加精细化的聚类结果;各簇站点客流分布特征与其土地利用及网络特征间存在一定的反馈关系;相比于整体联合预测,通过聚类联合相关性强的站点进行预测,以间接捕获空间相关性的方式能有效提升预测性能,各模型均方根误差平均降低9.04%,平均绝对误差平均降低4.94%。研究结果为站点精细化管理和站区设施建设规划提供依据。

关键词: 城市交通, 站点聚类, 机器学习, 轨道交通站点, 多源数据, 客流预测

Abstract: Urban rail transit stations are heterogeneous because factors at multiple levels interact. To classify stations at a fine granularity, this paper combines subway smart-card data, point-of-interest (POI) data, and subway network data, and extracts passenger flow, land use, and network features. The passenger flow features describe flow patterns on weekdays, weekends, and holidays; the land use features describe the intensity and balance of land use within each station's catchment area; and the network features describe each node's own characteristics and its influence on the network. A clustering model based on principal component analysis and the K-means++ algorithm is built, and the number of clusters is determined from several clustering evaluation indices. The multi-dimensional characteristics of each station type are then analyzed, and the effect of station-area land use and station network features on travel activity is discussed. Two strategies, joint prediction within each cluster and overall joint prediction across all stations, are designed, and three multivariate time series forecasting methods are used to evaluate how station clustering affects forecasting performance. The results show that the stations are divided into 10 clusters when all passenger flow features are considered and into 5 clusters when only weekday inbound passenger flow features are considered, so fully mining the time-varying characteristics of passenger flow yields a more refined clustering. The passenger flow distribution of each cluster shows a certain feedback relationship with its land use and network features. Compared with overall joint prediction, jointly predicting strongly correlated stations grouped by clustering captures spatial correlation indirectly and effectively improves forecasting performance: across the models, the root mean square error decreases by 9.04% on average and the mean absolute error by 4.94% on average. The results provide a basis for refined station management and for planning the construction of station-area facilities.
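The clustering step named in the abstract (K-means++ with the number of clusters chosen from evaluation indices) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not the authors' implementation: the 30 synthetic "stations", their two hypothetical features (morning-peak and evening-peak inbound share), the helper names, and the use of the silhouette coefficient as the only evaluation index, since the abstract does not say which indices were used. The principal component analysis step is skipped because the toy data has only two features.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: the first centre is uniform at random; each later
    centre is drawn with probability proportional to its squared distance
    to the nearest centre chosen so far."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        d2 = [min(sq_dist(p, c) for c in centres) for p in points]
        centres.append(rng.choices(points, weights=d2, k=1)[0])
    return centres


def kmeans(points, k, rng, max_iter=100):
    """Lloyd iterations from a k-means++ start; returns (labels, inertia)."""
    centres = kmeans_pp_init(points, k, rng)
    labels = None
    for _ in range(max_iter):
        new = [min(range(k), key=lambda j: sq_dist(p, centres[j])) for p in points]
        if new == labels:
            break
        labels = new
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centre
                centres[j] = tuple(sum(c) / len(members) for c in zip(*members))
    inertia = sum(sq_dist(p, centres[lab]) for p, lab in zip(points, labels))
    return labels, inertia


def best_of(points, k, n_init=5, seed=0):
    """Run k-means several times and keep the labelling with lowest inertia."""
    rng = random.Random(seed)
    runs = [kmeans(points, k, rng) for _ in range(n_init)]
    return min(runs, key=lambda r: r[1])[0]


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    clusters = set(labels)
    total = 0.0
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            ds = [math.dist(p, q) for j, q in enumerate(points)
                  if labels[j] == c and j != i]
            if ds:
                mean_dist[c] = sum(ds) / len(ds)
        own = labels[i]
        if own not in mean_dist:
            continue  # singleton: contributes 0
        a = mean_dist[own]
        b = min(v for c, v in mean_dist.items() if c != own)
        total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical data: three station types, ten noisy stations each.
rng = random.Random(42)
prototypes = [(0.35, 0.10), (0.10, 0.35), (0.20, 0.20)]
stations = [(m + rng.gauss(0, 0.01), e + rng.gauss(0, 0.01))
            for m, e in prototypes for _ in range(10)]

results = {k: best_of(stations, k) for k in range(2, 6)}
scores = {k: silhouette(stations, labs) for k, labs in results.items()}
best_k = max(scores, key=scores.get)
best_labels = results[best_k]
print(best_k, {k: round(s, 3) for k, s in scores.items()})
```

With real station data, the feature vectors would normally be standardized and reduced by principal component analysis before clustering, and several indices would be compared rather than the silhouette coefficient alone.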

Key words: urban traffic, station clustering, machine learning, rail transit station, multi-source data, passenger flow forecast
