Reinforcement Learning Based UAV Swarm Fission-Fusion Approach with Integrated Validation of Perception and Control

Oct 1, 2024 · Xiaorong Zhang, Wenrui Ding, Qingyi Liu, Dacheng Qi, Zhilan Zhang, Shutong Wang, Yufeng Wang · 5 min read
Framework of our IPCVSA System
Publication: 2024 IEEE International Conference on Unmanned Systems

Best Paper Award

This paper received the Best Paper Award.


Abstract

Swarm motion of unmanned aerial vehicle (UAV) clusters is a complex research area, mainly because a UAV cluster system comprises multiple subsystems, such as perception and control. However, the split (fission) motion of UAV swarms in response to multiple unknown dynamic disturbances has received relatively little attention compared with static flight behavior. In this paper, we propose a reinforcement learning-based method for the integrated verification of swarm control and perception for UAV swarms facing multiple unknown dynamic disturbances. First, we develop a self-organized swarm control framework for UAV clusters, which realizes multi-cluster swarm motion. Second, we propose a reinforcement learning-based sub-cluster adversarial algorithm that aims to counter multiple unknown disturbances dynamically with minimal resource consumption. Finally, we introduce an Integrated Perception and Control Validation System based on AirSim (IPCVSA), which realizes integrated verification of UAV clusters based on real environments. Simulation experiments show that the UAV swarm can successfully perform self-organized sub-swarm motion in an environment with multiple unknown disturbances, effectively protecting the main swarm from those disturbances.

Introduction

The study of UAV swarm systems is a key focus in unmanned control technology, demonstrating that the collective capabilities of a swarm surpass the sum of individual units. In nature, biological swarms enhance survival through collective behavior, such as bats optimizing foraging via sub-swarm movements, fish using fission-fusion to defend against predators, and birds reducing wind resistance through formation flying. Similarly, UAV swarms leverage swarm dynamics to enhance mission performance, such as improving the ability to handle dynamic disturbances through fission-fusion movements or increasing the efficiency of roundup missions through coordinated planning.

Most existing research focuses on individual UAV swarms. However, complex real-world tasks often exceed the capacity of a single swarm, requiring multiple swarms to perform specialized tasks, such as surveillance, data collection, or communication relay. The fission-fusion movement in UAV swarms improves adaptability to complex missions, mirroring behaviors observed in nature. This adaptability is crucial in areas like disaster response, environmental monitoring, and large-scale agriculture, where dynamic adjustments to swarm size can significantly enhance mission performance.

However, real-world UAV swarms operate in dynamic and unpredictable environments with various disturbances, such as unknown flying objects and mission changes. These factors can interact in complex ways, making the results of individual system tests insufficient for predicting real-world swarm performance. Additionally, real-world swarm flight experiments are costly and challenging. While reinforcement learning-based fission-fusion methods have been proposed, they require extensive training.

To address these challenges, this paper focuses on the methodology for multi-UAV swarm merging in dynamic environments and proposes an integrated perception-control validation system.

The contributions of this study are as follows:

  • A reinforcement learning-based method for UAV swarm merging in multi-dynamic interference environments is proposed, enabling antagonistic swarm movements against unknown dynamic interference while minimizing energy consumption.

  • An integrated perception and control validation system is developed using AirSim, which validates UAV swarm electromagnetic environment perception, visual perception, mission planning, and swarm control based on real-world data.

  • The feasibility and effectiveness of the proposed method are demonstrated through extensive numerical simulations, supported by various evaluation metrics.

Simulation System

In this section, I will present my work on the design and development of the simulation system.

User Interface Overview

The interface of our validation system (IPCVSA) is shown in the figure below. The left side is the flight visualization, displaying the scene rendered directly by the engine. The right side shows the output of the verification system, divided into three sections: flight control, visual verification, and electromagnetic signal verification. The top section, flight control, presents the aircraft's flight trajectory and real-time attitude, demonstrating the effectiveness of the control algorithms. The middle section, visual verification, shows the detection results of computer vision algorithms applied to images captured by the drones. The bottom section, electromagnetic signal verification, displays the raw IQ signals and an approximate estimate of the enemy position obtained with direction-of-arrival (DOA) techniques.

User Interface
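As a rough illustration of how a direction-of-arrival estimate can be derived from raw IQ samples, the sketch below uses two-antenna phase interferometry. It is not the estimator used in the validation system, which the text does not specify. The function names, the half-wavelength antenna spacing, the noise-free single-source signal, and the sign convention of the phase shift are all assumptions made for this example.

```python
import cmath
import math


def simulate_iq(angle_deg, spacing_wavelengths=0.5, n_samples=256, tone_cycles=8):
    """Noise-free IQ samples of one narrowband source at two antennas (hypothetical setup)."""
    # Phase lead of antenna 1 over antenna 0 for a plane wave arriving at angle_deg.
    phase_shift = 2 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    ant0, ant1 = [], []
    for n in range(n_samples):
        s = cmath.exp(2j * math.pi * tone_cycles * n / n_samples)
        ant0.append(s)
        ant1.append(s * cmath.exp(1j * phase_shift))
    return ant0, ant1


def estimate_doa_deg(ant0, ant1, spacing_wavelengths=0.5):
    """Estimate the arrival angle from the average inter-antenna phase difference."""
    # Summing b * conj(a) averages the phase difference over all samples.
    cross = sum(b * a.conjugate() for a, b in zip(ant0, ant1))
    phase_diff = cmath.phase(cross)
    sin_theta = phase_diff / (2 * math.pi * spacing_wavelengths)
    # Clamp to the valid range of asin to guard against rounding.
    sin_theta = max(-1.0, min(1.0, sin_theta))
    return math.degrees(math.asin(sin_theta))


if __name__ == "__main__":
    a0, a1 = simulate_iq(25.0)
    print(round(estimate_doa_deg(a0, a1), 3))
```

With a spacing of at most half a wavelength the phase difference stays within one cycle, so the angle is unambiguous. A single bearing gives only a direction; turning it into a position estimate needs bearings from several drones or additional range information.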

Simulation Steps

To validate the system, we simulate a scenario in which a formation of quadrotor drones conducts a reconnaissance mission over a strategically sensitive maritime region patrolled by three adversarial fixed-wing drones. The formation's objective is to infiltrate this zone and collect intelligence. The simulation steps are illustrated in the figure below.

Simulation Steps

Experiments

Flight Experiments

During the reconnaissance mission, we assume that each enemy fixed-wing drone autonomously tracks and follows the nearest quadrotor in the friendly formation. To counter this threat and complete the mission, the quadrotor formation splits into sub-groups that engage the enemy drones and draw them away through coordinated swarming behaviors. The simulation process is illustrated in the figure below.

Visual Depiction of Critical Simulation Stages
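The pursuit assumption and the split into sub-groups can be made concrete with a small sketch. This is a greedy geometric heuristic written for illustration only; it is not the reinforcement learning-based sub-cluster algorithm of the paper. The function names, the 2D coordinates, and the `decoys_per_enemy` parameter are hypothetical.

```python
import math


def nearest_quad(enemy, quads):
    """Index of the quadrotor closest to the enemy (the assumed pursuit rule)."""
    return min(range(len(quads)), key=lambda i: math.dist(enemy, quads[i]))


def split_into_subgroups(quads, enemies, decoys_per_enemy=2):
    """Greedy fission: assign each enemy its nearest unassigned quadrotors as a decoy sub-group."""
    unassigned = set(range(len(quads)))
    decoy_groups = []
    for enemy in enemies:
        # Rank the remaining quadrotors by distance to this enemy.
        ranked = sorted(unassigned, key=lambda i: math.dist(enemy, quads[i]))
        group = ranked[:decoys_per_enemy]
        unassigned -= set(group)
        decoy_groups.append(group)
    # Quadrotors that were not picked as decoys form the main swarm.
    main_group = sorted(unassigned)
    return decoy_groups, main_group


if __name__ == "__main__":
    quads = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 5), (5, 6)]
    enemies = [(-1, 0), (12, 0)]
    print(nearest_quad(enemies[0], quads))
    print(split_into_subgroups(quads, enemies))
```

Because each enemy follows its nearest quadrotor, placing a decoy sub-group closer to the enemy than the main swarm is what lets the main swarm continue the mission undisturbed. A learned policy would additionally weigh factors such as energy consumption, which this heuristic ignores.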

Vision Experiments

When an enemy aircraft approaches our drone cluster, the onboard camera is needed to locate it accurately. During close-to-ground reconnaissance, the camera also identifies ground targets, which improves the precision of image capture. To assess our system, we apply the Segment Anything Model to the camera images for semantic segmentation, a dense prediction task in which every pixel in an image is assigned a label. The figure below shows the segmentation results in several of the established scenarios, with most of the key objects distinctly marked.

Semantic segmentation results
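To make the idea of dense, per-pixel labeling concrete, the following sketch merges a set of binary object masks into a single label map. It does not call the Segment Anything Model; the nested-list mask format, the function names, and the rule of painting larger masks first are assumptions chosen for this example.

```python
def masks_to_label_map(masks, height, width, background=0):
    """Merge binary masks into one label map; mask k receives label k + 1."""
    label_map = [[background] * width for _ in range(height)]
    # Paint larger masks first so that smaller objects stay visible on top.
    order = sorted(range(len(masks)), key=lambda k: -sum(map(sum, masks[k])))
    for k in order:
        for r in range(height):
            for c in range(width):
                if masks[k][r][c]:
                    label_map[r][c] = k + 1
    return label_map


def pixel_coverage(label_map, background=0):
    """Fraction of pixels that received a non-background label."""
    total = sum(len(row) for row in label_map)
    labeled = sum(1 for row in label_map for v in row if v != background)
    return labeled / total


if __name__ == "__main__":
    masks = [
        [[1, 1, 0], [1, 1, 0]],  # a larger object
        [[0, 1, 0], [0, 0, 0]],  # a smaller object overlapping the first
    ]
    label_map = masks_to_label_map(masks, height=2, width=3)
    print(label_map, pixel_coverage(label_map))
```

A coverage figure of this kind is one simple way to quantify how much of a frame has been labeled, though the text does not state which metric the system reports.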