Generalisation and robustness investigation for facial and speech emotion recognition using bio-inspired spiking neural networks
Emotion recognition through facial expression and non-verbal speech represents an important area in affective computing. Both have been extensively studied, from classical feature extraction techniques to more recent deep learning approaches. However, most of these approaches face two major challenges: (1) robustness: in the face of degradation such as noise, can a model still make correct predictions? and (2) cross-dataset generalisation: when a model is trained on one dataset, can it be used to make inferences on another dataset? To directly address these challenges, we first propose the application of a Spiking Neural Network (SNN) in predicting emotional states based on facial expression and speech data, then investigate and compare their accuracy when facing data degradation or unseen new input. We evaluate our approach on third-party, publicly available datasets and compare it to state-of-the-art techniques. Our approach demonstrates robustness to noise: it achieves an accuracy of 56.2% for facial expression recognition (FER), compared to 22.64% and 14.10% for CNN and SVM respectively, when input images are degraded with a noise intensity of 0.5, and the highest accuracy of 74.3% for speech emotion recognition (SER), compared to 21.95% for CNN and 14.75% for SVM, when audio white noise is applied. For generalisation, our approach achieves consistently high accuracy of 89% for FER and 70% for SER in cross-dataset evaluation, which suggests that it can learn more effective feature representations that lead to good generalisation of facial features and vocal characteristics across subjects.
Mansouri Benssassi, E & Ye, J 2021, 'Generalisation and robustness investigation for facial and speech emotion recognition using bio-inspired spiking neural networks', Soft Computing, vol. First Online. https://doi.org/10.1007/s00500-020-05501-7
Copyright © The Author(s) 2021. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Items in the St Andrews Research Repository are protected by copyright, with all rights reserved, unless otherwise indicated.