Paper
11 September 2024 HT-DETR: accelerating convergence for detection transformer with hybrid task allocation
Shuheng Zhao, Zhong Qu, Zhenjun Wu
Proceedings Volume 13253, Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024); 132531C (2024) https://doi.org/10.1117/12.3041273
Event: 4th International Conference on Signal Image Processing and Communication, 2024, Xi'an, China
Abstract
The Detection Transformer (DETR), built on the Transformer architecture, has shown great potential as an end-to-end object detector. DETR uses the Hungarian algorithm to match predictions to ground-truth objects during training. However, because Hungarian matching is itself unstable, the model faces inconsistent optimization targets in the early stages of training, which slows convergence. To address this challenge, this paper proposes a hybrid task assignment algorithm. The algorithm increases the number of positive-sample queries through one-to-many matching, allowing queries to predict multiple aspects of a single target and improving matching stability. Additionally, targets are grouped by the size of their ground-truth boxes, and the queries are grouped correspondingly. Each query group is responsible for matching targets of a specific size range, which significantly enhances matching stability and accelerates convergence. Experimental results on the COCO dataset demonstrate the effectiveness of this approach: a single-scale DETR model with a ResNet-50 backbone achieves an average precision (AP) of 38.8% within 12 training epochs, a 3.2% AP improvement over baseline models with similar settings.
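The two ideas in the abstract, one-to-many matching and size-based grouping of targets and queries, can be illustrated with a minimal sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: the three size groups use COCO's small/medium/large area boundaries, queries are split into equal contiguous index ranges, the matching cost is a plain L1 box distance, and a greedy top-k selection stands in for Hungarian matching. The names `size_group`, `l1_cost`, and `assign` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

# Assumption: COCO's small/medium/large area boundaries (32^2 and 96^2).
SIZE_BOUNDS = (32.0 ** 2, 96.0 ** 2)
NUM_GROUPS = 3  # one query group per size range (assumed)


def size_group(box: Box) -> int:
    """Return 0 (small), 1 (medium) or 2 (large) from the box area."""
    area = max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])
    if area < SIZE_BOUNDS[0]:
        return 0
    if area < SIZE_BOUNDS[1]:
        return 1
    return 2


def l1_cost(a: Box, b: Box) -> float:
    """Assumed matching cost: L1 distance between box coordinates."""
    return sum(abs(p - q) for p, q in zip(a, b))


def assign(pred_boxes: List[Box], gt_boxes: List[Box], k: int = 2) -> Dict[int, int]:
    """Map query index -> target index; unlisted queries are background.

    Each target may only be matched by queries in the group for its size,
    and it receives up to k queries (one-to-many), chosen greedily by cost.
    """
    n = len(pred_boxes)
    per_group = n // NUM_GROUPS
    assignment: Dict[int, int] = {}
    for t, gt in enumerate(gt_boxes):
        g = size_group(gt)
        start = g * per_group
        # The last group absorbs any remainder queries.
        end = n if g == NUM_GROUPS - 1 else start + per_group
        free = [q for q in range(start, end) if q not in assignment]
        free.sort(key=lambda q: l1_cost(pred_boxes[q], gt))
        for q in free[:k]:
            assignment[q] = t
    return assignment


# Six queries: indices 0-1 serve small, 2-3 medium, 4-5 large targets.
preds = [
    (0, 0, 50, 50), (0, 0, 12, 12),
    (0, 0, 60, 60), (0, 0, 70, 70),
    (0, 0, 190, 190), (5, 5, 20, 20),
]
targets = [(0, 0, 10, 10), (0, 0, 200, 200)]  # one small, one large
print(assign(preds, targets, k=2))
```

With `k=1` the sketch reduces to a one-to-one assignment inside each size group; raising `k` adds positive queries per target, which is the effect the abstract attributes to one-to-many matching.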
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Shuheng Zhao, Zhong Qu, and Zhenjun Wu "HT-DETR: accelerating convergence for detection transformer with hybrid task allocation", Proc. SPIE 13253, Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531C (11 September 2024); https://doi.org/10.1117/12.3041273
KEYWORDS
Object detection
Education and training
Detection and tracking algorithms
Transformers
Data modeling
Performance modeling
Ablation