Journal of Transportation Systems Engineering and Information Technology, 2021, Vol. 21, Issue (6): 298-309.

• Development of the Transportation Industry under "Carbon Peaking and Carbon Neutrality" •

Automatic Identification of High-emitting Vehicle Based on Deep Feature Clustering

XU Zhen-yi* 1, WANG Ren-jun 1,2, ZHANG Cong 3, WANG Rui-bin 1,2, XIA Xiu-shan 4

  1. Hefei Comprehensive National Science Center Artificial Intelligence Research Institute, Hefei 230088, China; 2. School of Computer Science and Technology, Anhui University, Hefei 230601, China; 3. Hefei Municipal Environmental Protection Bureau, Hefei 230601, China; 4. Institute of Advanced Technology, University of Science and Technology of China, Hefei 230088, China
  • Received: 2021-07-09 Revised: 2021-08-31 Accepted: 2021-09-08 Online: 2021-12-25 Published: 2021-12-24
  • About the author: XU Zhen-yi (1993- ), male, from Suqian, Jiangsu; associate researcher.
  • Supported by:
    National Natural Science Foundation of China (62103124)

Abstract: High-emission mobile sources are traditionally identified by comparing collected tailpipe data against preset emission thresholds. However, these thresholds depend largely on human-defined standards, and the approach ignores the influence of external factors on tailpipe emissions, so it cannot accurately reflect the emission level of a mobile source. To address this problem, this paper combines machine learning algorithms and proposes a method for identifying high-emission mobile sources based on deep feature clustering. First, the random forest algorithm is used to select the main influencing features for the emissions of each pollutant (CO, HC, NO). Second, the multidimensional influencing features are clustered to obtain high-emission category labels. Finally, a deep forest-based mobile pollution source classification model is trained to identify high-emission targets automatically. Comparative experiments on a vehicle emission remote sensing dataset from Hefei verify the effectiveness of the proposed method.

Key words: information technology, high-emitter identification, feature clustering, deep forest
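
The labeling step described in the abstract (clustering the selected features to obtain high-emission labels rather than applying a fixed threshold) might be sketched as follows. This is only a minimal illustration, not the authors' implementation. The abstract does not name the clustering algorithm, so plain k-means with k = 2 is assumed here. The function names `kmeans` and `label_high_emitters` are hypothetical, and the readings are made-up toy values assumed to be already normalized to comparable scales. The random forest feature selection and the deep forest classifier are not shown.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    assign = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_assign = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_assign == assign:
            break  # assignments are stable, so the centers are too
        assign = new_assign
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign, centers


def label_high_emitters(readings, seed=0):
    """Cluster readings into two groups and flag the higher one."""
    assign, centers = kmeans(readings, k=2, seed=seed)
    # Assumption: the cluster whose center has the larger summed
    # (normalized) concentration is treated as the high-emission class.
    high = max(range(2), key=lambda c: sum(centers[c]))
    return [a == high for a in assign]


# Toy, made-up (CO, HC, NO) readings, assumed normalized to [0, 1].
readings = [
    (0.10, 0.12, 0.08),
    (0.15, 0.10, 0.11),
    (0.09, 0.14, 0.10),
    (0.12, 0.09, 0.13),
    (0.85, 0.90, 0.80),
    (0.92, 0.88, 0.95),
]
labels = label_high_emitters(readings)
print(labels)  # [False, False, False, False, True, True]
```

In a pipeline like the one the abstract outlines, labels produced this way would serve as training targets for the classifier. Real remote sensing data would also need feature scaling and a principled choice of the number of clusters, neither of which the abstract specifies.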

CLC Number: