Journal of Transportation Systems Engineering and Information Technology ›› 2025, Vol. 25 ›› Issue (4): 104-115. DOI: 10.16097/j.cnki.1009-6744.2025.04.011

• Intelligent Transportation Systems and Information Technology •

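Multi-directional Vehicle Detection in Aerial Images Based on Anchor-free Oriented Bounding Box

WANG Weifeng*, HUANG Jianxin, WANG Xiaoquan, WU Xinhan, BIAN Zixin

  1. College of Civil and Transportation Engineering, Hohai University, Nanjing 210098, China
  • Received: 2025-03-07 Revised: 2025-05-05 Accepted: 2025-06-03 Online: 2025-08-25 Published: 2025-08-25
  • About the author: WANG Weifeng (born 1979), male, from Jingshan, Hubei; professor, Ph.D.
  • Supported by:
    Fundamental Research Funds for the Central Universities of the Ministry of Education of China (B240201168); Transportation Science and Technology Project of Jiangsu Province (2024Y19).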



Abstract: Aerial images of traffic scenes feature complex backgrounds, unevenly distributed vehicle aspect ratios, and constantly changing vehicle heading angles, which often lead to missed or false detections. To address this, this paper improves the YOLOv8-OBB (You Only Look Once version 8-Oriented Bounding Box) network and proposes a method for detecting vehicles at arbitrary heading angles in aerial images. First, a Large Selective Kernel Attention Mechanism (LSKAM) is integrated into the network's neck to strengthen feature extraction for vehicles with different aspect ratios. Second, to better separate targets from the background, a deep feature extraction module with a dimension of 10×10 is added to the Path Aggregation Network (PANet) in the head. Finally, a lightweight VoV-GSCSP (VoVNet GSConv Cross Stage Partial) module is embedded in the neck to balance detection accuracy and speed. Experiments on the large-scale DroneVehicle dataset show that the proposed method achieves higher detection accuracy and lower computational complexity than typical detection methods such as Oriented R-CNN (Oriented Regions with Convolutional Neural Networks), R-YOLOv3-tiny, YOLOv6-OBB, YOLOv8-OBB, and YOLOv12-OBB. Specifically, the detection accuracy for the "Car" and "Bus" categories exceeds 95%, the mean average precision (mAP) over all vehicle categories is 73.7%, and the computational complexity is 26.9 GFLOPs (giga floating-point operations). In addition, validation on data collected in the field by drones indicates that the proposed method effectively reduces missed and false detections and meets the requirements of vehicle detection from an aerial perspective.
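
For readers unfamiliar with oriented bounding boxes, the following minimal Python sketch illustrates how a box described by a center, a width, a height, and a heading angle maps to four corner points. It is an illustration only, not the paper's implementation: the function names `obb_corners` and `polygon_area` are hypothetical, and the angle convention (radians, rotating the x-axis toward the y-axis) is an assumption, since conventions differ between detection frameworks.

```python
import math


def obb_corners(cx, cy, w, h, theta):
    """Return the four corners of a rotated box as (x, y) tuples.

    Assumed convention (not taken from the paper): theta is in radians
    and rotates the box's width axis from the x-axis toward the y-axis.
    """
    c, s = math.cos(theta), math.sin(theta)
    # Corner offsets in the box's own (unrotated) frame.
    half = [(w / 2, h / 2), (-w / 2, h / 2), (-w / 2, -h / 2), (w / 2, -h / 2)]
    # Rotate each offset by theta, then translate to the box center.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


def polygon_area(pts):
    """Area of a simple polygon via the shoelace formula."""
    acc = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        acc += x1 * y2 - x2 * y1
    return abs(acc) / 2


if __name__ == "__main__":
    # A 4 x 2 box (aspect ratio 2:1) heading 30 degrees off the x-axis.
    corners = obb_corners(10.0, 5.0, 4.0, 2.0, math.radians(30))
    for x, y in corners:
        print(f"({x:.2f}, {y:.2f})")
    # Rotation preserves area, so this stays w * h up to rounding.
    print(f"area = {polygon_area(corners):.2f}")
```

Unlike an axis-aligned box, this representation keeps the box tight around an elongated vehicle regardless of its heading, which is why the aspect ratio and the angle both matter for detection from an aerial view.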

Key words: intelligent transportation, vehicle detection, YOLOv8-OBB, aerial image, attention mechanism

CLC number: