Investigating multisensory integration in emotion recognition through bio-inspired computational models
Emotion understanding represents a core aspect of human communication. Our social behaviours are closely linked to expressing our emotions and understanding others' emotional and mental states through social signals. Most existing work extracts meaningful features from each modality and applies fusion techniques at either the feature level or the decision level. However, these techniques cannot capture the constant talk and feedback between different modalities. Such constant talk is particularly important in continuous emotion recognition, where one modality can predict, enhance and complement the other. This paper proposes three multisensory integration models, based on different pathways of multisensory integration in the brain: integration by convergence, early cross-modal enhancement, and integration through neural synchrony. The proposed models are designed and implemented using third-generation neural networks, Spiking Neural Networks (SNNs). The models are evaluated on widely adopted, third-party datasets and compared to state-of-the-art multimodal fusion techniques, such as early, late and deep learning fusion. Evaluation results show that the three proposed models achieve results comparable to state-of-the-art supervised learning techniques. More importantly, this paper demonstrates plausible ways to translate constant talk between modalities during the training phase, which also brings advantages in generalisation and robustness to noise.
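To make the contrast between feature-level (early) and decision-level (late) fusion concrete, the following is a minimal sketch of the two baseline strategies the abstract refers to. It is not the paper's method: the function names, the toy feature values and the equal weighting are assumptions chosen purely for illustration.

```python
from typing import List, Sequence


def early_fusion(audio_feats: Sequence[float],
                 visual_feats: Sequence[float]) -> List[float]:
    """Feature-level fusion: concatenate per-modality features into one
    vector, which a single downstream classifier would then consume."""
    return list(audio_feats) + list(visual_feats)


def late_fusion(audio_probs: Sequence[float],
                visual_probs: Sequence[float],
                audio_weight: float = 0.5) -> List[float]:
    """Decision-level fusion: each modality is classified separately and
    the per-class probabilities are combined by a weighted average."""
    if len(audio_probs) != len(visual_probs):
        raise ValueError("both modalities must score the same classes")
    w = audio_weight
    return [w * a + (1.0 - w) * v for a, v in zip(audio_probs, visual_probs)]


# Toy inputs (hypothetical values, not taken from any dataset).
audio_feats = [0.2, 0.7]
visual_feats = [0.9, 0.1, 0.4]
fused_features = early_fusion(audio_feats, visual_feats)

# Hypothetical per-class probabilities from two independent classifiers.
audio_probs = [0.6, 0.4]
visual_probs = [0.2, 0.8]
fused_probs = late_fusion(audio_probs, visual_probs)

# Index of the class with the highest fused probability.
predicted_class = max(range(len(fused_probs)), key=fused_probs.__getitem__)
```

In both cases the modalities are processed independently up to the point of fusion, so neither strategy lets one modality influence how the other is processed, which is the limitation the abstract highlights.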
Mansouri Benssassi, E & Ye, J 2021, 'Investigating multisensory integration in emotion recognition through bio-inspired computational models', IEEE Transactions on Affective Computing, vol. Early Access. https://doi.org/10.1109/TAFFC.2021.3106254
IEEE Transactions on Affective Computing
Copyright © 2021 IEEE. This work has been made available online in accordance with publisher policies or with permission. Permission for further reuse of this content should be sought from the publisher or the rights holder. This is the author created accepted manuscript following peer review and may differ slightly from the final published version. The final published version of this work is available at https://doi.org/10.1109/TAFFC.2021.3106254.
Items in the St Andrews Research Repository are protected by copyright, with all rights reserved, unless otherwise indicated.