📄 Abstract
Recognizing the motion of Micro Aerial Vehicles (MAVs) is crucial for
enabling cooperative perception and control in autonomous aerial swarms. Yet,
vision-based recognition models relying only on RGB data often fail to capture
the complex spatiotemporal characteristics of MAV motion, which limits their
ability to distinguish different actions. To overcome this problem, this paper
presents MAVR-Net, a multi-view learning-based MAV action recognition
framework. Unlike traditional single-view methods, the proposed approach
combines three complementary types of data, including raw RGB frames, optical
flow, and segmentation masks, to improve the robustness and accuracy of MAV
motion recognition. Specifically, ResNet-based encoders are used to extract
discriminative features from each view, and a multi-scale feature pyramid is
adopted to preserve the spatiotemporal details of MAV motion patterns. To
enhance the interaction between different views, a cross-view attention module
is introduced to model the dependencies among various modalities and feature
scales. In addition, a multi-view alignment loss is designed to ensure semantic
consistency and strengthen cross-view feature representations. Experimental
results on benchmark MAV action datasets show that our method clearly
outperforms existing approaches, achieving 97.8%, 96.5%, and 92.8% accuracy
on the Short MAV, Medium MAV, and Long MAV datasets, respectively.
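The abstract does not give the form of the multi-view alignment loss. As a rough illustration only, one common way to encourage semantic consistency across views is to penalize low cosine similarity between per-view embeddings. The sketch below assumes that form; the function names, the view keys (`rgb`, `flow`, `mask`), and the toy vectors are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # small epsilon avoids division by zero


def alignment_loss(view_embeddings):
    """ASSUMED loss form: mean of (1 - cosine similarity) over all view pairs.

    The paper's actual alignment loss may differ; this only illustrates the idea
    of pulling embeddings of the same clip from different views together.
    """
    pairs = list(combinations(view_embeddings.values(), 2))
    return sum(1.0 - cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Toy embeddings for one clip (hypothetical values).
views = {
    "rgb": [1.0, 0.0, 1.0],
    "flow": [1.0, 0.0, 1.0],   # identical to rgb: contributes zero loss
    "mask": [0.0, 1.0, 0.0],   # orthogonal to the others: contributes loss 1
}
loss = alignment_loss(views)
```

In this toy case two of the three view pairs are orthogonal and one is identical, so the loss is about 2/3. A loss of 0 would mean all three views agree in direction.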
Authors (2)
Nengbo Zhang
Hann Woei Ho
Submitted
October 17, 2025
Key Contributions
MAVR-Net is a multi-view learning framework for MAV action recognition that combines RGB frames, optical flow, and segmentation masks. It uses cross-view attention to enhance interaction between views, improving robustness and accuracy for complex MAV motions.
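To make the idea of cross-view attention concrete, here is a minimal sketch of scaled dot-product attention in which tokens from one view (for example RGB features) attend over tokens from another view (for example optical-flow features). It is an assumption about the general mechanism, not the paper's module: it omits learned query/key/value projections, multiple heads, and the multi-scale pyramid, and the names `cross_view_attention`, `rgb_tokens`, and `flow_tokens` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_view_attention(query_tokens, context_tokens):
    """Each query token (from one view) attends over tokens of another view.

    Simplification (assumption): the context tokens serve as both keys and
    values, and no learned projections are applied.
    """
    dim = len(query_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for q in query_tokens:
        # Scaled dot-product score between this query and every context token.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in context_tokens]
        w = softmax(scores)
        # Weighted sum of the context tokens.
        out = [sum(w_i * v[d] for w_i, v in zip(w, context_tokens)) for d in range(dim)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy feature tokens (hypothetical values): 2 RGB tokens, 3 optical-flow tokens.
rgb_tokens = [[1.0, 0.0], [0.0, 1.0]]
flow_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, attn = cross_view_attention(rgb_tokens, flow_tokens)
```

Each row of `attn` sums to 1 and shows how strongly an RGB token draws on each flow token. In a full model the fused output would typically be combined with the original RGB features, for instance through a residual connection.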
Business Value
More reliable MAV action recognition could strengthen autonomous aerial swarms in tasks such as coordinated surveillance, formation flying, and complex maneuvers.