Vision-based detection and tracking of vehicles is a critical application for unmanned aerial vehicles (UAVs), particularly in challenging operational contexts. This study aims to improve UAV capabilities in complex environments by developing a specialized dataset, derived from battlefield scenarios, to support research on vehicle detection and multi-target tracking under complex conditions. We collected videos of vehicle movement across diverse scenarios and annotated them manually to create the "Cross-Scenario Vehicle Detection" (CSVD) dataset. The dataset covers a wide range of environments, including urban landscapes, plains, and forests, across all four seasons, for a total of 13,025 annotated images. Using several state-of-the-art deep learning models, we established benchmarks for object detection. We also evaluated state-of-the-art multi-object tracking algorithms on the CSVD dataset using a range of assessment metrics. The experiments demonstrate the dataset's applicability and versatility, supporting its use for developing and evaluating UAV-based vehicle detection and multi-target tracking systems in complex settings.
In multi-person tracking, ID switches cause confusion among tracked targets, which significantly reduces tracking efficiency in practical use and degrades the user experience. This paper therefore proposes a new multi-person tracking method that builds on BoT-SORT and focuses on identity switching. The proposed framework integrates a spatial location constraint and occlusion awareness into the tracking process. By combining spatial location and occlusion cues for track initialization with methods for ID switch verification and rectification, we improve the utilization of low-confidence detections and effectively reduce ID switches in multi-person tracking. The effectiveness of the proposed method was verified through ablation experiments and validated on the MOT20 benchmark: HOTA increased by 0.3 to 63.6, MOTA increased by 0.3 to 78.1, and IDF1 increased by 0.9 to 78.4. In particular, the number of ID switches decreased from 1313 to 1007, a reduction of 23.3%.
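The reported reduction in ID switches can be checked with a short calculation. The sketch below is illustrative only: the function name `relative_reduction` is an assumption introduced here, not part of the paper's code, and the two counts are the ones quoted in the abstract.

```python
def relative_reduction(before: int, after: int) -> float:
    """Return the percentage decrease from `before` to `after`.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if before <= 0:
        raise ValueError("`before` must be positive")
    return (before - after) / before * 100.0


# ID switch counts quoted in the abstract (MOT20 benchmark).
id_switches_before = 1313
id_switches_after = 1007

reduction = relative_reduction(id_switches_before, id_switches_after)
print(f"ID switches reduced by {reduction:.1f}%")
```

The absolute drop is 306 switches, which is where the quoted relative figure comes from.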