[1] 胡大伟, 张世鹏, 刘慧甜, 等. 应急响应初期“卡车-无人机”联合配送路径问题[J]. 长安大学学报(自然科学版), 2024, 44(1): 105-119. [HU D W, ZHANG S P, LIU H T, et al. Routing problem of truck-drones joint distribution in the initial stage of emergency response[J]. Journal of Chang'an University (Natural Science Edition), 2024, 44(1): 105-119.]
[2]康柳江,李浩,孙会君,等.复杂山区工程建设物资运输无人机巡航模型构建与实证研究[J].交通运输系统工程与信息,2023, 23(3): 290-299. [KANG L J, LI H,
SUN H J, et al. UAV cruising for material transportation
under engineering construction in complex mountainous
areas: Modeling and case study[J]. Journal of
Transportation Systems Engineering and Information
Technology, 2023, 23(3): 290-299.]
[3] 许云鹏, 谢雅琪, 于然, 等. 感-通-物多目标融合应急无人机路径规划方法[J]. 通信学报, 2024, 45(4): 1-12. [XU Y P, XIE Y Q, YU R, et al. Integrated perception-communication-logistics multi-objective oriented path planning for emergency UAVs[J]. Journal on Communications, 2024, 45(4): 1-12.]
[4]张洪海,李翰,刘皞,等.城市区域物流无人机路径规划[J]. 交通运输系统工程与信息,2020, 20(6): 22-29.
[ZHANG H H, LI H, LIU H, et al. Path planning for
logistics unmanned aerial vehicle in urban area[J].
Journal of Transportation Systems Engineering and
Information Technology, 2020, 20(6): 22-29.]
[5]胡小兵,杨常澍,周隽.复杂城市环境下无人机路网模型研究[J]. 交通运输系统工程与信息, 2023, 23(4):
251-261. [HU X B, YANG C S, ZHOU J. Route network
modeling for unmanned aerial vehicle in complex urban
environment[J]. Journal of Transportation Systems
Engineering and Information Technology, 2023, 23(4):
251-261.]
[6]梁吉,王立松,黄昱洲,等.基于深度强化学习的四旋翼无人机自主控制方法[J].计算机科学,2023,50(S2):
13-19. [LIANG J, WANG L S, HUANG Y Z, et al.
Autonomous control algorithm for quadrotor based on
deep reinforcement learning[J]. Computer Science, 2023,
50(S2): 13-19.]
[7] YU X, FAN Y, XU S, et al. A self-adaptive SAC-PID control approach based on reinforcement learning for mobile robots[J]. International Journal of Robust and Nonlinear Control, 2022, 32(18): 9625-9643.
[8] WILLIAMS G, WAGENER N, GOLDFAIN B, et al. Information theoretic MPC for model-based reinforcement learning[C]//2017 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2017: 1714-1721.
[9] 王硕, 李洋, 赵蕴龙, 等. 无人机航迹规划算法综述[J/OL]. 哈尔滨工程大学学报, (2025-06-16)[2025-07-07]. https://link.cnki.net/urlid/23.1390.U.20250616.1544.003. [WANG S, LI Y, ZHAO Y L, et al. A review of UAV trajectory planning algorithms[J/OL]. Journal of Harbin Engineering University, (2025-06-16)[2025-07-07]. http://kns.cnki.net/kcms/detail/23.1390.U.20250616.1544.003.html.]
[10] 吕超, 李慕宸, 欧家骏. 基于分层深度强化学习的无人机混合路径规划[J]. 北京航空航天大学学报, 2025, 51(10): 3451-3459. [LV C, LI M C, OU J J. UAV hybrid path planning based on hierarchical deep reinforcement learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(10): 3451-3459.]
[11] 杜江涛, 于家明, 齐辉. 无人机集群不完全信息路径规划方法[J]. 哈尔滨工程大学学报, 2024, 45(11): 2210-2217. [DU J T, YU J M, QI H. Incomplete information path planning method for an UAV cluster[J]. Journal of Harbin Engineering University, 2024, 45(11): 2210-2217.]
[12] 滕菲, 王迎春, 姚永辉, 等. 基于深度强化学习的无人机动态避障规划[J/OL]. 北京航空航天大学学报, (2025-05-23)[2025-07-07]. https://doi.org/10.13700/j.bh.1001-5965.2025.0084. [TENG F, WANG Y C, YAO Y H, et al. Dynamic obstacle avoidance planning for UAV based on deep reinforcement learning[J/OL]. Journal of Beijing University of Aeronautics and Astronautics, (2025-05-23)[2025-07-07]. https://doi.org/10.13700/j.bh.1001-5965.2025.0084.]
[13] FENG Z, HUANG M, WU D, et al. Multi-agent
reinforcement learning with policy clipping and average
evaluation for UAV-assisted communication Markov
game[J]. IEEE Transactions on Intelligent Transportation
Systems, 2023, 24(12): 14281-14293.
[14] AL-HILO A, SAMIR M, ASSI C, et al. UAV-assisted content delivery in intelligent transportation systems: Joint trajectory planning and cache management[J]. IEEE Transactions on Intelligent Transportation Systems, 2020, 22(8): 5155-5167.
[15] XI M, DAI H, HE J, et al. A lightweight reinforcement
learning-based real-time path-planning method for
unmanned aerial vehicles[J]. IEEE Internet of Things
Journal, 2024, 11(12): 21061-21071.
[16] HAARNOJA T, ZHOU A, HARTIKAINEN K, et al. Soft actor-critic algorithms and applications[J]. arXiv preprint arXiv:1812.05905, 2018.
[17] SCHAUL T, QUAN J, ANTONOGLOU I, et al. Prioritized experience replay[J]. arXiv preprint arXiv:1511.05952, 2015.
[18] GOK M. Dynamic path planning via Dueling Double
Deep Q-Network (D3QN) with prioritized experience
replay[J]. Applied Soft Computing, 2024, 158: 111503.
[19] ZHOU Y, YANG J, GUO Z, et al. An indoor blind area
oriented autonomous robotic path planning approach
using deep reinforcement learning[J]. Expert Systems
with Applications, 2024, 254: 124277.
[20] HASSANI H, NIKAN S, SHAMI A. Traffic navigation via
reinforcement learning with episodic-guided prioritized
experience replay[J]. Engineering Applications of
Artificial Intelligence, 2024, 137: 109147.
[21] BEARD R W, MCLAIN T W. Small unmanned aircraft: Theory and practice[M]. Princeton: Princeton University Press, 2012.
[22] SUTTON R S, BARTO A G. Reinforcement learning: An
introduction[M]. Cambridge: MIT Press, 1998.
[23] BERROCAL E, SIERRA B, HERRERO H. Evaluating PyBullet and Isaac Sim in the scope of robotics and reinforcement learning[C]//2024 7th Iberian Robotics Conference (ROBOT). IEEE, 2024. DOI: 10.1109/ROBOT61475.2024.10797383.
[24] RAFFIN A, HILL A, GLEAVE A, et al. Stable-baselines3: Reliable reinforcement learning implementations[J]. Journal of Machine Learning Research, 2021, 22(268): 1-8.