Journal of Transportation Systems Engineering and Information Technology ›› 2026, Vol. 26 ›› Issue (3): 302-314. DOI: 10.16097/j.cnki.1009-6744.2026.03.027

• Systems Engineering Theory and Methods •

A Hierarchical Clustering Method for Public Transit Demand Based on Geographic Autoencoder and Cross-Domain Transfer

TIAN Junhao1, XING Lu1, LIAO Shihao1, GUI Gui*1, JIANG Xiaoqing2

  1. School of Automation, Central South University, Changsha 410083, China; 2. Hunan CRRC Intelligent Transport Technology Co., Ltd., Changsha 410017, China
  • Received: 2026-01-22 Revised: 2026-03-25 Accepted: 2026-04-17 Online: 2026-06-25 Published: 2026-06-23
  • About the author: TIAN Junhao (born 1997), male, from Shangrao, Jiangxi; doctoral student.
  • Supported by:
    National Natural Science Foundation of China (62473381); Hunan Provincial Science and Technology Innovation Plan (2025JK2068).

Abstract: To improve the accuracy of urban public transit travel demand clustering and to overcome the limitations of traditional clustering methods in non-convex shape capture, noise adaptability, and parameter dependence, this paper proposes a general framework for public transit travel demand clustering that supports cross-domain application and, on that basis, designs a Geographic AutoEncoder-based Hierarchical Clustering (GeoAE-HC) algorithm. First, sine-cosine positional encoding is introduced and a geographical self-attention mechanism is designed to capture both feature similarity and geographical proximity among transit travel demand data. Then, a hierarchical clustering method that integrates density-based clustering and mean-based clustering is designed and applied in the latent feature space extracted by the autoencoder, so that the travel demand distribution is clustered accurately. Finally, to improve the model's cross-domain generalization across cities, a transfer learning strategy is designed that combines domain adversarial training with freezing the encoder and fine-tuning the decoder, stripping away city-specific noise. Simulation results show that on the Chengdu dataset, with 375 cluster centers, GeoAE-HC outperforms the comparison algorithms on all evaluation metrics: its Silhouette Coefficient is 11.82%, 23.14%, and 45.01% higher than those of Deep Embedded Clustering (DEC), DK-means, and k-means, respectively, and its Calinski-Harabasz (CH) Index is 6.14%, 5.62%, and 14.44% higher, respectively. After cross-domain transfer to the Beijing dataset, GeoAE-HC also outperforms the comparison algorithms, which validates its effectiveness and generalization capability.

Key words: intelligent transportation, deep clustering, autoencoder, public transit demand, transfer learning
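
The abstract names sine-cosine positional encoding as the way geographic position enters the model but does not give its exact form. The following is a minimal sketch, assuming the standard Transformer-style formulation applied separately to longitude and latitude and then concatenated. The function names and the `dim`, `base`, and `scale` parameters are illustrative assumptions, not taken from the paper, and the coordinates are made-up points near Chengdu.

```python
import math


def sincos_encoding(value, dim=8, base=10000.0):
    """Encode one scalar as `dim` interleaved sine/cosine features.

    Assumed form (Transformer-style): for i in 0..dim/2-1,
    freq_i = 1 / base**(2*i/dim), features sin(value*freq_i), cos(value*freq_i).
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    features = []
    for i in range(dim // 2):
        freq = 1.0 / (base ** (2 * i / dim))
        features.append(math.sin(value * freq))
        features.append(math.cos(value * freq))
    return features


def encode_location(lon, lat, dim=8, scale=100.0):
    """Concatenate per-axis encodings of a (lon, lat) pair.

    `scale` is a hypothetical multiplier so that a 0.01-degree offset
    changes the highest-frequency component noticeably.
    """
    return sincos_encoding(lon * scale, dim) + sincos_encoding(lat * scale, dim)


def dot(u, v):
    """Plain dot product, used here as a similarity score."""
    return sum(x * y for x, y in zip(u, v))


# Made-up demand points: `near` is 0.01 degrees east of `ref`, `far` is 0.30.
ref = encode_location(104.06, 30.67)
near = encode_location(104.07, 30.67)
far = encode_location(104.36, 30.67)

print("similarity to near point:", dot(ref, near))
print("similarity to far point: ", dot(ref, far))
```

With this form, the dot product of two encodings reduces to a sum of cosines of the scaled coordinate differences, so it tends to be larger for nearby points. That is one way an attention score could reflect geographical proximity alongside feature similarity; how GeoAE-HC actually combines the two is not specified in the abstract.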

CLC Number: