KEYWORDS: Action recognition, Feature extraction, Data modeling, Convolution, Video, Matrices, RGB color model, Solar thermal energy, Education and training, Visualization
Human action recognition is a research hotspot in computer vision. Focusing on the problem of recognizing similar actions, we propose an improved two-stream adaptive graph convolutional network for skeleton-based action recognition that incorporates a multiscale temporal convolutional network and a spatiotemporal excitation network. The multiscale temporal convolutional network extracts temporal information with dilated convolutions at different scales, which broadens the temporal network and captures more of the temporal features that differ only slightly between categories. The spatiotemporal excitation network applies channel pooling to the input features to form single-channel features for two-dimensional convolution, which excites important spatiotemporal information and strengthens the role of local nodes in similar actions. We conducted extensive tests and ablation studies on three large-scale datasets: NTU-RGB+D60, NTU-RGB+D120, and Kinetics-Skeleton. On average, our model outperforms the baseline by 7.2% and the state-of-the-art model by 4% in similar action recognition on the NTU-RGB+D60 dataset, which demonstrates the superiority of our model.
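The idea of extracting temporal information with dilated convolutions at several scales can be illustrated with a minimal sketch. It assumes a single 1-D feature sequence, an odd-length kernel, and zero padding; the function names and the dilation rates (1, 2, 3) are hypothetical and are not taken from the paper, which operates on multi-channel skeleton features.

```python
def dilated_conv1d(x, kernel, dilation):
    """Zero-padded 1-D convolution (cross-correlation) with a given dilation.

    Assumes an odd-length kernel so the output has the same length as x.
    """
    k = len(kernel)
    pad = dilation * (k - 1) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(kernel[j] * padded[t + j * dilation] for j in range(k))
        for t in range(len(x))
    ]


def multiscale_temporal_features(x, kernel, dilations=(1, 2, 3)):
    """Run the same kernel at several dilation rates (hypothetical choice).

    Each branch sees a different temporal span; the branches are returned
    side by side, which widens the temporal representation.
    """
    return [dilated_conv1d(x, kernel, d) for d in dilations]


sequence = [1, 2, 3, 4, 5, 6]
branches = multiscale_temporal_features(sequence, kernel=[1, 1, 1], dilations=(1, 2))
```

With a kernel of length 3, the branch with dilation 1 covers 3 consecutive frames, while the branch with dilation 2 covers a span of 5 frames using the same number of weights.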
To improve facial expression recognition accuracy on resource-constrained, real-time platforms such as mobile and embedded devices, a lightweight method for facial expression recognition is proposed based on an attention mechanism and key-region fusion. To reduce computational complexity, a lightweight convolutional neural network, mini_Xception, is used as the basic model for expression classification. The attention mechanism is introduced to enhance the learning of important features of the whole face. Then a parameter is introduced to locate the key regions and construct key-region models. Finally, to make the models complementary and learn more comprehensive features, the whole-face expression recognition model is fused with the key-region models. Class activation mapping visualization shows that the proposed method captures and uses the important facial expression information in the relevant regions. Experimental results on the JAFFE and CK+ datasets and on a real-scene dataset verify the effectiveness of the proposed method.
With the spread of the epidemic around the world, wearing masks has become the simplest and most effective way to block COVID-19. To address the lack of data and of model designs suited to the epidemic setting, we propose an integrated masked face recognition system with three cascaded convolutional neural networks. First, an SSD model detects the masked face to eliminate interference from irrelevant background. Then, we use an Hourglass network to regress the key points of the occluded face and crop the aligned eye-brow area that is not covered by the mask. Finally, we fine-tune a pretrained FaceNet to adapt fully to the data of eye-brow regions. Experiments on a number of laboratory and in-the-wild images show that our method can effectively recognize subjects wearing masks.
Skeleton-based action recognition is a significant direction in human action recognition because the skeleton contains important information for recognizing actions. Spatial–temporal graph convolutional networks (ST-GCN) automatically learn both temporal and spatial features from skeleton data and achieve remarkable performance in skeleton-based action recognition. However, ST-GCN learns only local information within a certain neighborhood and does not capture the correlation between all joints (i.e., global information). Global information therefore needs to be introduced into ST-GCN. We propose a model of dynamic skeletons called attention module-based ST-GCN, which addresses this problem by adding an attention module. The attention module can capture some global information, which brings stronger expressive power and generalization capability. Experimental results on two large-scale datasets, Kinetics and NTU-RGB+D, demonstrate that our model achieves significant improvements over previous representative methods.
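The abstract does not specify the form of the attention module. As a purely illustrative sketch of how correlation between all joints could be captured, the following assumes scaled dot-product self-attention over per-joint feature vectors; this is a swapped-in, generic technique, and the function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def global_joint_attention(features):
    """Scaled dot-product self-attention over joints (assumed form).

    features: one feature vector per joint.
    Returns the joint-to-joint attention matrix and the attended features,
    where each joint is a weighted mix of ALL joints, not only its
    graph neighbors.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    attention = [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in features])
        for q in features
    ]
    mixed = [
        [sum(w * f[c] for w, f in zip(row, features)) for c in range(dim)]
        for row in attention
    ]
    return attention, mixed


# Three toy joints; joints 0 and 2 have identical features.
joints = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
attention, mixed = global_joint_attention(joints)
```

In this toy input, joint 0 attends equally to itself and to joint 2 and less to joint 1, even though no skeleton adjacency is used; this is the kind of non-local relation a fixed neighborhood cannot express.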
Face detection from images is an essential task in computer vision and a prerequisite for face recognition. To address the low accuracy of face detection on multiscale face images in unconstrained settings, we propose IMS-SSH, a face detection method that extends the single-stage headless face detector (SSH) with IRNN, multilayer fusion, and soft-NMS. First, considering the importance of context for small-scale face detection, a recurrent neural network initialized with the identity matrix (IRNN) is added to fully learn the contextual information. Second, to further improve accuracy for multiscale face detection, a multilayer fusion strategy is proposed, which learns facial texture features from the lower layers in more detail. Finally, to handle face occlusion in unconstrained settings, soft non-maximum suppression (soft-NMS) is used to merge the predicted boxes from different scales into the final detection results. Experimental results show that IMS-SSH significantly improves the performance of multiscale face detection in unconstrained settings, especially for small-scale faces, and achieves state-of-the-art performance on the unconstrained WIDER Face dataset.
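The soft non-maximum suppression step mentioned above can be sketched as follows. This is a minimal version of the generic Gaussian soft-NMS procedure, not the paper's implementation; the `sigma` and `score_threshold` values, the `(x1, y1, x2, y2)` box format, and the function names are assumptions made for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian soft-NMS: decay overlapping scores instead of deleting boxes.

    Returns (original_index, final_score) pairs in the order boxes are selected.
    """
    remaining = [(b, float(s), i) for i, (b, s) in enumerate(zip(boxes, scores))]
    kept = []
    while remaining:
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        box, score, idx = remaining.pop(best)
        kept.append((idx, score))
        rescored = []
        for b, s, i in remaining:
            # The more a box overlaps the selected one, the more its score decays.
            s *= math.exp(-(iou(box, b) ** 2) / sigma)
            if s >= score_threshold:
                rescored.append((b, s, i))
        remaining = rescored
    return kept


boxes = [(0, 0, 10, 10), (0, 0, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
```

Unlike hard NMS, the duplicate box is not discarded outright; its score is reduced, so a partially occluded face that overlaps a stronger detection can still survive with lower confidence.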