Convolutional neural network (CNN)-based facial emotion recognition (FER) lacks structural information, which limits its accuracy. To automatically extract structural features and enrich the representation of expressions, we propose a dual-branch network combining a CNN and a graph CNN (GCNN) for FER. One branch uses a CNN to obtain global expression features; in the parallel branch, we adopt a sparse coding strategy and propose a landmark-guided GCNN that extracts facial structure information under different expressions, enhancing the feature representation of key facial regions. The features of the two branches are fused through a channel attention mechanism, which increases their semantic strength and yields more comprehensive and accurate expression features for classification. The network is trained end-to-end, and the extracted features are more discriminative, improving classification performance. Experimental results demonstrate the effectiveness of the proposed method on three publicly available datasets: CK+ (99.24%), FER2013 (73.26%), and RAF-DB (87.42%). The proposed algorithm for extracting image structure information can be flexibly inserted into any model, such as for image classification or image segmentation, whose input images contain structural information.
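The abstract does not specify the exact form of the channel attention used to fuse the two branches, but a common squeeze-and-excitation-style scheme fits the description. The sketch below is illustrative only, under the assumption of SE-style attention over the concatenated branch features; the function name, shapes, and reduction ratio are not from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention_fusion(feat_cnn, feat_gcnn, w1, w2):
    """Fuse two (C, H, W) branch feature maps via SE-style channel attention.

    feat_cnn, feat_gcnn : (C, H, W) outputs of the CNN and GCNN branches
    w1 : (2C, 2C // r) squeeze weights; w2 : (2C // r, 2C) excitation weights
    (all names and shapes are illustrative assumptions, not the paper's spec)
    """
    fused = np.concatenate([feat_cnn, feat_gcnn], axis=0)  # (2C, H, W)
    # Squeeze: global average pooling over spatial dims -> per-channel descriptor
    z = fused.mean(axis=(1, 2))                            # (2C,)
    # Excitation: bottleneck MLP producing per-channel weights in (0, 1)
    s = sigmoid(np.maximum(z @ w1, 0.0) @ w2)              # (2C,)
    # Reweight channels so the classifier sees attention-scaled fused features
    return fused * s[:, None, None]

# Usage: two 64-channel branch outputs on an 8x8 grid, reduction ratio r = 4
rng = np.random.default_rng(0)
c, h, w, r = 64, 8, 8, 4
out = channel_attention_fusion(
    rng.standard_normal((c, h, w)), rng.standard_normal((c, h, w)),
    rng.standard_normal((2 * c, 2 * c // r)) * 0.1,
    rng.standard_normal((2 * c // r, 2 * c)) * 0.1,
)
print(out.shape)  # (128, 8, 8)
```

In this formulation the attention weights rescale each channel of the concatenated representation before classification, which is one plausible reading of "increases the semantic strength of the features."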