Most correlation-based trackers fuse template features and search-region features in a straightforward way. The correlation operation, however, is a local linear matching process that can lose semantic information and lead to local optima. We propose a tracking method that combines the attention mechanism with feature fusion. An attention-based feature fusion network integrates the features of the template region and the search region, which effectively avoids the semantic information loss and local optima induced by the correlation operation. In addition, a modified ResNet50 network extracts multilayer target features so that both shallow and deep features are used, allowing the method to exploit the template information more fully. Extensive experiments on four large-scale tracking benchmarks (VOT2019, OTB100, GOT-10k, and LaSOT) show that our tracker achieves more accurate tracking results and better overall performance.
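To make the contrast between correlation and attention-based fusion concrete, the following is a minimal sketch and not the network described in the article. It assumes features are flattened to short sequences of vectors, uses a single attention head with no learned projections, and adds the attended template feature back as a residual. The function names (`correlation_response`, `cross_attention_fuse`) and the toy inputs are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def correlation_response(template, search):
    """Slide the template over the search sequence (linear matching).

    Each output is one scalar similarity per offset, so the feature
    content of the template itself is not carried forward.
    """
    k = len(template)
    return [
        sum(dot(search[i + j], template[j]) for j in range(k))
        for i in range(len(search) - k + 1)
    ]


def cross_attention_fuse(search, template):
    """Fuse template features into every search position.

    Assumption: queries come from the search features, keys and values
    from the template features, with scaled dot-product weights.
    """
    scale = math.sqrt(len(template[0]))
    fused = []
    for s in search:
        weights = softmax([dot(s, t) / scale for t in template])
        attended = [
            sum(w * t[i] for w, t in zip(weights, template))
            for i in range(len(s))
        ]
        # Residual connection keeps the original search feature.
        fused.append([si + ai for si, ai in zip(s, attended)])
    return fused


# Toy features: 2 template vectors, 3 search vectors, dimension 2.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

response = correlation_response(template, search)
fused = cross_attention_fuse(search, template)
```

In this sketch the correlation output is a list of scalars, one per offset, whereas the attention output keeps a full feature vector at every search position. That difference is one way to read the abstract's point about correlation losing semantic information.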
Detection and tracking algorithms
Feature fusion
Semantics
Feature extraction
Education and training
Optical tracking
Head