Journal of Transportation Systems Engineering and Information Technology ›› 2025, Vol. 25 ›› Issue (6): 118-128. DOI: 10.16097/j.cnki.1009-6744.2025.06.011

• Systems Engineering Theory and Methods •

Method for Short-term Passenger Flow Prediction in Urban Rail Transit Networks Considering Data and Model Uncertainty

MU Liang1, KANG Yurui1, YAN Zixu1, ZHU Guangyu*2

  1. School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China; 2. School of Automation and Software, Shanxi University, Taiyuan 030031, China
  • Received: 2025-06-20 Revised: 2025-09-06 Accepted: 2025-09-09 Online: 2025-12-25 Published: 2025-12-24
  • About the author: MU Liang (born 1996), male, from Qingdao, Shandong; doctoral candidate.
  • Supported by:
    National Natural Science Foundation of China (62433005, 62272036).

Abstract: To obtain probabilistic information about network-wide passenger flow at the prediction time step, this paper proposes PD-STGCN, a short-term probabilistic passenger flow prediction model for urban rail transit networks that accounts for both data and model uncertainty. The model is a collaborative framework of two modules: a Spatio-Temporal Uncertainty Prediction Module (STUPM) and a Probabilistic Quantification Module (PQM). In the STUPM, to address the uncertainty inherent in both the passenger flow data and the prediction model, the Gaussian negative log-likelihood (GNLL) loss and Monte Carlo Dropout (MC Dropout) are used to develop a new loss function that fuses the two uncertainty quantification results. In the PQM, a discrete sample set is drawn from the prediction results by random normal sampling, and Gaussian kernel density estimation (KDE) turns it into a continuous probabilistic forecast. Using passenger flow data from the urban rail transit system of a large city as a case study, the model is validated in two scenarios: working days and non-working days. The results show that, compared with the baseline prediction models, PD-STGCN improves the prediction interval coverage probability (PICP) and the continuous ranked probability score (CRPS) by 8.01% and 20.77%, respectively, so it covers the true passenger flow values better and predicts more accurately. Ablation experiments confirm that uncertainty is the most significant factor affecting model performance: considering both uncertainties improves PICP and CRPS by at least 1.91% and 4.02%, respectively, over models that consider only a single type of uncertainty.
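The pipeline named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the abstract does not say how the two uncertainty estimates are fused, so the sketch assumes the common decomposition in which data variance is the average predicted variance over dropout passes and model variance is the spread of the predicted means. The fused loss of PD-STGCN is not reproduced; only the standard GNLL is shown. All function names, the bandwidth rule, and the numbers are hypothetical.

```python
import math
import random


def gaussian_nll(y, mu, var):
    """Gaussian negative log-likelihood of target y under N(mu, var)."""
    return 0.5 * (math.log(2.0 * math.pi * var) + (y - mu) ** 2 / var)


def combine_mc_passes(means, variances):
    """Aggregate T stochastic forward passes (dropout kept active).

    Assumed decomposition, not taken from the paper:
      data variance  = average of the predicted variances
      model variance = spread of the predicted means across passes
    """
    t = len(means)
    mean = sum(means) / t
    data_var = sum(variances) / t
    model_var = sum((m - mean) ** 2 for m in means) / t
    return mean, data_var, model_var


def gaussian_kde(samples):
    """Return a density function built from samples with Gaussian kernels."""
    n = len(samples)
    centre = sum(samples) / n
    std = math.sqrt(sum((s - centre) ** 2 for s in samples) / (n - 1))
    h = 1.06 * std * n ** (-0.2)  # normal-reference rule-of-thumb bandwidth
    scale = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def pdf(x):
        return scale * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

    return pdf


def picp(y_true, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)


def crps_from_samples(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    n = len(samples)
    term_obs = sum(abs(s - y) for s in samples) / n
    term_pair = sum(abs(a - b) for a in samples for b in samples) / (2.0 * n * n)
    return term_obs - term_pair


# Hypothetical outputs of five dropout passes for one station and time step.
pass_means = [102.0, 98.0, 100.0, 101.0, 99.0]
pass_vars = [16.0, 18.0, 15.0, 17.0, 14.0]
mean, data_var, model_var = combine_mc_passes(pass_means, pass_vars)

# Random normal sampling from the combined prediction, then KDE.
rng = random.Random(0)
total_std = math.sqrt(data_var + model_var)
samples = [rng.gauss(mean, total_std) for _ in range(500)]
density = gaussian_kde(samples)

observed = 103.0  # hypothetical true passenger count
print("GNLL:", gaussian_nll(observed, mean, data_var + model_var))
print("density at observation:", density(observed))
print("CRPS:", crps_from_samples(samples, observed))
```

With these illustrative inputs the combined mean is 100.0, the data variance 16.0, and the model variance 2.0. `picp` is applied across many observations once interval bounds have been read off each predictive distribution, for example from sample quantiles.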

Key words: urban traffic, passenger flow prediction, probability prediction, urban rail transit station passenger flow, multi-source data, uncertainty quantification

CLC number: