Point cloud-based object detection for autonomous driving has advanced rapidly, yet detecting small objects with high precision remains an urgent challenge. To address this issue, we introduce a single-stage 3D detection network, termed self-attention voting single-stage detection (SAV-SSD). It extracts features directly from raw point cloud data and introduces an innovative self-attention voting mechanism that generates object center points through weighted voting based on feature correlations. In addition to feature prediction, we explicitly predict the center point, which better constrains the position and size of the bounding boxes and improves the accuracy and stability of the predictions. To capture more features of small objects, a cross multi-scale feature fusion module is designed to establish connections between deep and shallow features. Experimental results demonstrate that SAV-SSD significantly improves pedestrian and cyclist detection accuracy while maintaining real-time performance. On the KITTI dataset, SAV-SSD outperforms many state-of-the-art 3D object detection methods.
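As a rough illustration of how such a self-attention voting layer might operate, the sketch below weights per-point features by their pairwise feature correlations and regresses an offset from each point toward an object center. This is a minimal PyTorch-style sketch under stated assumptions, not the paper's implementation: the module name `SelfAttentionVoting`, the input shapes, and the offset head are all hypothetical.

```python
# Minimal sketch of a self-attention voting layer (hypothetical; the
# paper's exact formulation may differ). Assumes per-point features of
# shape (B, N, C) and point coordinates of shape (B, N, 3).
import torch
import torch.nn as nn

class SelfAttentionVoting(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.q = nn.Linear(channels, channels)
        self.k = nn.Linear(channels, channels)
        self.v = nn.Linear(channels, channels)
        # Regress a per-point offset toward the object center
        # from the correlation-weighted features.
        self.offset_head = nn.Sequential(
            nn.Linear(channels, channels), nn.ReLU(), nn.Linear(channels, 3)
        )

    def forward(self, xyz: torch.Tensor, feats: torch.Tensor):
        # Attention weights from pairwise feature correlations: (B, N, N).
        q, k, v = self.q(feats), self.k(feats), self.v(feats)
        attn = torch.softmax(q @ k.transpose(1, 2) / feats.shape[-1] ** 0.5, dim=-1)
        attended = attn @ v                    # correlation-weighted features
        offsets = self.offset_head(attended)   # per-point vote offsets
        centers = xyz + offsets                # voted center candidates
        return centers, attended

# Usage: centers, feats = SelfAttentionVoting(128)(xyz, point_feats)
```

In this reading, the softmax over feature correlations plays the role of the "weighted voting", so points with strongly correlated features pull their votes toward a shared center candidate.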
Keywords: Object detection, Point clouds, Feature extraction, Feature fusion, Detection and tracking algorithms, Ablation, Optical engineering