Scene text detection with improved receptive field and adaptive feature fusion
5 October 2021
Liangjun Wang, Weijie Gu, Yuhang Ji
Proceedings Volume 11911, 2nd International Conference on Computer Vision, Image, and Deep Learning; 119110V (2021) https://doi.org/10.1117/12.2604527
Event: 2nd International Conference on Computer Vision, Image and Deep Learning, 2021, Liuzhou, China
Abstract
Regression-based text detection methods are currently a research focus because of their simple network structure and fast inference speed. However, most of them suffer from the limited receptive field of convolutional neural networks and from simplistic feature fusion in the feature pyramid. As a consequence, previous algorithms still have shortcomings, such as difficulty in accurately detecting long text and inconsistency across feature scales. To address these two problems, we first incorporate a densely connected atrous convolution module into the feature extraction network, which enlarges the receptive field and in turn strengthens the extraction of high-level semantic information. Second, we weight and re-fuse the features from different levels of the feature pyramid, which filters out conflicting information across levels and maintains the scale invariance of the features. Extensive experiments on the ICDAR2015 and MSRA-TD500 datasets demonstrate the effectiveness of the method.
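The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give kernel sizes, dilation rates, or the exact fusion formula, so the 3x3 kernel, the dilation rates, the softmax weighting, and the function names `receptive_field` and `fuse_levels` are all assumptions chosen for illustration. The first function computes the receptive field along the longest path through a stack of stride-1 atrous convolutions; the second fuses one value per pyramid level with normalized weights.

```python
import math


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    In a densely connected stack, this is the field of the longest path.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(features, scores):
    """Weighted sum of per-level feature values at one spatial position.

    `scores` are hypothetical learned logits, one per pyramid level;
    the softmax keeps the weights positive and summing to one.
    """
    weights = softmax(scores)
    return sum(w * f for w, f in zip(weights, features))


# Five plain 3x3 convolutions versus five atrous ones (assumed rates).
plain = receptive_field(3, [1, 1, 1, 1, 1])
atrous = receptive_field(3, [3, 6, 12, 18, 24])

# Equal logits reduce the fusion to a plain average of the levels.
fused = fuse_levels([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
```

Under these assumed rates the atrous stack covers a much wider span than the plain stack with the same number of layers, which is the intuition behind using it for long text. In the fusion step, a large logit for one level suppresses the others, which is one way conflicting responses across levels could be filtered.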
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Liangjun Wang, Weijie Gu, and Yuhang Ji "Scene text detection with improved receptive field and adaptive feature fusion", Proc. SPIE 11911, 2nd International Conference on Computer Vision, Image, and Deep Learning, 119110V (5 October 2021); https://doi.org/10.1117/12.2604527
KEYWORDS
Feature extraction
Convolution
Algorithm development
Visualization
Network architectures
Image resolution
Sensor performance