Journal of Transportation Systems Engineering and Information Technology ›› 2020, Vol. 20 ›› Issue (6): 99-105.

• Forum on Comprehensive Transportation Systems •

Trip Purpose Inference of Group Passengers Based on Ticket Sales Data

QIAN Jian-pei1, SHAO Chun-fu*1, LI Jun1,2, CAI Nan3, HUANG Shi-chen1

  1. Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing Jiaotong University, Beijing 100044, China; 2. Institute of Transportation Information Standardization, China Transport Telecommunications & Information Center, Beijing 100011, China; 3. Nantong Urban Planning & Design Institute Co., Ltd., Nantong 226004, Jiangsu, China
  • Received: 2020-07-07 Revised: 2020-08-07 Online: 2020-12-25 Published: 2020-12-25
  • About the author: QIAN Jian-pei (1990-), male, from Nantong, Jiangsu, PhD candidate.
  • Supported by:

    Science Fund for Creative Research Groups of the National Natural Science Foundation of China (71621001).

Abstract:

Applying big data from intercity transportation more deeply requires filling in the missing trip purpose information. By drawing an analogy between how group passengers decide on a trip purpose and how topics are generated in texts, this study develops a trip purpose inference approach under an unsupervised learning framework. First, a modified topic model with an embedded departure-time generation module is proposed. Second, methods for reconstructing passenger groups and designing semantic features are presented. Finally, the parameters are estimated by Gibbs sampling. A model comparison based on survey data shows that the proposed model improves the identification of general personal affairs by 7.7%. A case study based on ticket sales data shows that the model predicts departure time with a precision of 90.9%, which indirectly supports its reliability. Topic annotation shows that the model not only infers four trip purposes consistent with typical patterns but also identifies unconventional patterns beyond existing knowledge. An analysis of road passenger transport shows that the composition of trip purposes varies by region, and that the opening of high-speed rail (HSR) reduces trip volumes to different degrees depending on trip purpose.

Key words: traffic engineering, trip purpose inference, topic model, panel regression model, road passenger transport, ticket sales data

CLC Number: