Journal of Transportation Systems Engineering and Information Technology ›› 2025, Vol. 25 ›› Issue (6): 87-100. DOI: 10.16097/j.cnki.1009-6744.2025.06.008

• Systems Engineering Theory and Methods •

面向高层建筑应急救援的无人机螺旋搜索轨迹控制方法

陈德启1 ,张自设1 ,张文会*1 ,闫学东2 ,蒋贤才1   

  1. 东北林业大学,土木与交通学院,哈尔滨150040;2. 西南交通大学,交通运输与物流学院,成都611756
  • 收稿日期:2025-06-30 修回日期:2025-08-21 接受日期:2025-09-02 出版日期:2025-12-25 发布日期:2025-12-24
  • About the author: CHEN Deqi (1990-), male, born in Harbin, Heilongjiang; lecturer.
  • 基金资助:
    黑龙江省哲学社会科学研究规划项目 (23GLCo22);国家自然科学基金 (52272311)。

Trajectory Control Method for UAV Spiral Search Oriented to High-rise Building Emergency Rescue

CHEN Deqi1, ZHANG Zishe1, ZHANG Wenhui*1, YAN Xuedong2, JIANG Xiancai1

  1. School of Civil Engineering and Transportation, Northeast Forestry University, Harbin 150040, China; 2. School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 611756, China
  • Received:2025-06-30 Revised:2025-08-21 Accepted:2025-09-02 Online:2025-12-25 Published:2025-12-24
  • Supported by:
    Heilongjiang Province Philosophy and Social Science Research Planning Project (23GLCo22); National Natural Science Foundation of China (52272311).

摘要: 在灾后黄金救援时期,可使用无人机率先抵达受损高层建筑进行螺旋上升式全覆盖扫描感知灾情。然而,由于受灾现场复杂的动态环境,无人机在抵近立体扫描时,容易出现轨迹跟踪精度低和碰撞风险高等问题。为此,本文提出融合优先经验回放的软演员-评论家(PER-SAC)控制模型,并基于六自由度(6DOF)非线性动力学模型搭建高保真仿真平台。模型通过优先学习高时序差分误差(TD-error)的关键经验,提升复杂任务中的学习效率与控制策略的鲁棒性。仿真对比实验表明,所提PER-SAC策略的收敛速度和最终性能均优于软演员-评论家(SAC)和近端策略优化(PPO)算法。在静态轨迹跟踪任务中,PER-SAC的任务成功率达99.0%,平均轨迹误差较SAC降低了66.3%;在动态避障任务中,其任务成功率高达97.0%,且规避动作更平滑高效,模型控制的鲁棒性得到充分验证。通过融合优先经验回放机制显著提升无人机在未知动态环境下的自主飞行性能。所构建的PER-SAC模型既可以兼顾飞行控制精度、飞行品质与安全性,也可直接应用于灾后对高层受损建筑物的自主螺旋式扫描,通过稳定的飞行姿态获取高清影像,从而辅助救援团队快速感知灾情,提升应急搜救效率。

关键词: 航空运输, 动态避障, 深度强化学习, 无人机, 应急救援

Abstract: During the golden rescue period following a disaster, unmanned aerial vehicles (UAVs) can be deployed first to reach damaged high-rise buildings and perform spiral-ascent, full-coverage scanning to search for survivors. However, because of the complex and dynamic environment at disaster sites, UAVs are prone to low trajectory tracking accuracy and high collision risk during close-range three-dimensional (3D) scanning. To address these issues, this paper proposes a Prioritized Experience Replay Soft Actor-Critic (PER-SAC) control model and builds a high-fidelity simulation platform based on a six-degree-of-freedom (6-DOF) nonlinear dynamic model. By prioritizing key experiences with high temporal-difference error (TD-error) during learning, the model improves learning efficiency and policy robustness in complex tasks. Comparative simulation experiments show that the proposed PER-SAC strategy outperforms both the Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO) algorithms in convergence speed and final performance. In static trajectory tracking tasks, PER-SAC achieves a success rate of 99.0% and reduces the average trajectory error by 66.3% compared with SAC. In dynamic obstacle avoidance tasks, it achieves a success rate of 97.0% with smoother and more efficient evasive maneuvers, which validates the robustness of the controller. Incorporating prioritized experience replay significantly improves the UAV's autonomous flight performance in unknown dynamic environments. The proposed PER-SAC model balances control precision, flight quality, and safety, and can be applied directly to autonomous spiral scanning of damaged high-rise buildings after a disaster, capturing high-definition imagery from a stable flight attitude. This helps rescue teams rapidly locate trapped individuals and improves emergency search and rescue efficiency.
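
The abstract states that the model prioritizes experiences with high TD-error but does not say which prioritization scheme is used. As an illustration only, the following is a minimal sketch of the common proportional variant of prioritized experience replay, in which transition i is sampled with probability p_i^alpha / sum_k p_k^alpha and p_i = |TD-error_i| + eps. The class name, the parameters `alpha`, `beta`, and `eps`, and their default values are assumptions made for this sketch and are not taken from the paper.

```python
import random


class PrioritizedReplayBuffer:
    """Illustrative proportional prioritized replay (hypothetical helper)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha    # 0 gives uniform sampling; larger values skew toward high TD-error
        self.eps = eps        # keeps zero-error transitions sampleable
        self.data = []
        self.priorities = []
        self.pos = 0          # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions start at the current maximum priority,
        # so they are likely to be replayed soon after insertion.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        idxs = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias of non-uniform sampling;
        # they are normalized by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.data[i] for i in idxs]
        return batch, idxs, weights

    def update_priorities(self, idxs, td_errors):
        # Called after a critic update with the fresh TD-errors of the sampled batch.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a SAC-style training loop, one would typically multiply each sample's critic loss by its importance weight and then call `update_priorities` with the new TD-errors; how the authors integrate this step is not described in the abstract.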

Key words: air transportation, dynamic obstacle avoidance, deep reinforcement learning, unmanned aerial vehicle (UAV), emergency rescue

CLC Number: