Multi-modal emotion recognition using semi-supervised learning and multiple neural networks in the wild

Dae Ha Kim, Min Kyu Lee, Dong Yoon Choi, Byung Cheol Song

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

43 Scopus citations

Abstract

Human emotion recognition is a research topic receiving continuous attention in the computer vision and artificial intelligence domains. This paper proposes a method for classifying human emotions with multiple neural networks based on multi-modal signals, consisting of image, facial landmark, and audio data, in a wild environment. The proposed method has the following features. First, the learning performance of the image-based network is greatly improved by employing both multi-task learning and semi-supervised learning that exploit the spatio-temporal characteristics of videos. Second, a model for converting one-dimensional (1D) facial landmark information into two-dimensional (2D) images is newly proposed, and a CNN-LSTM network based on this model is presented for better emotion recognition. Third, based on the observation that audio signals are often very effective for specific emotions, we propose an audio deep learning mechanism that is robust for those emotions. Finally, so-called emotion-adaptive fusion is applied to enable synergy among the multiple networks. On the fifth submission for the given test set of the EmotiW2017 challenge, the proposed method achieved a classification accuracy of 57.12%.
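The abstract's two most concrete ideas, rasterizing 1D landmark coordinates into 2D images for a CNN-LSTM and fusing per-network outputs with emotion-dependent weights, can be illustrated with a short sketch. The PyTorch code below is a minimal illustration under stated assumptions, not the authors' implementation: the rasterization scheme in `landmarks_to_image`, the network sizes, and the per-class weighting rule in `emotion_adaptive_fusion` are all hypothetical choices consistent with the abstract's description.

```python
# Minimal sketch (not the authors' code) of two ideas from the abstract:
# (1) rendering per-frame 1D facial landmarks as 2D images for a CNN-LSTM,
# (2) emotion-adaptive fusion as a per-class weighted sum of network outputs.
import numpy as np
import torch
import torch.nn as nn

def landmarks_to_image(landmarks, size=64):
    """Rasterize (x, y) landmark coordinates (assumed normalized to [0, 1])
    onto a blank single-channel image. The paper's exact 1D-to-2D conversion
    is not given in the abstract; plotting points is one plausible choice."""
    img = np.zeros((size, size), dtype=np.float32)
    for x, y in landmarks:
        col = min(int(x * (size - 1)), size - 1)
        row = min(int(y * (size - 1)), size - 1)
        img[row, col] = 1.0
    return img

class LandmarkCNNLSTM(nn.Module):
    """Per-frame CNN features followed by an LSTM over the frame sequence."""
    def __init__(self, num_classes=7, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
        )
        # 64x64 input halved twice by pooling -> 32 channels of 16x16 maps.
        self.lstm = nn.LSTM(input_size=32 * 16 * 16, hidden_size=hidden,
                            batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, frames):            # frames: (B, T, 1, 64, 64)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.reshape(b * t, 1, 64, 64)).reshape(b, t, -1)
        _, (h, _) = self.lstm(feats)      # final hidden state summarizes the clip
        return self.head(h[-1])           # (B, num_classes) logits

def emotion_adaptive_fusion(probs_per_model, class_weights):
    """Fuse per-model class probabilities with per-(model, class) weights,
    so a modality that is strong for a specific emotion counts more for
    that class. This weighting rule is an illustrative assumption, not
    the paper's exact fusion formula."""
    fused = sum(w * p for p, w in zip(probs_per_model, class_weights))
    return fused / fused.sum(dim=-1, keepdim=True)
```

For instance, given softmax outputs from the image, landmark, and audio networks, assigning the audio network a larger weight on an emotion class it handles well would reflect the abstract's observation that audio signals are especially effective for specific emotions.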

Original language: English
Title of host publication: ICMI 2017 - Proceedings of the 19th ACM International Conference on Multimodal Interaction
Editors: Edward Lank, Eve Hoggan, Sriram Subramanian, Alessandro Vinciarelli, Stephen A. Brewster
Publisher: Association for Computing Machinery, Inc
Pages: 529-535
Number of pages: 7
ISBN (Electronic): 9781450355438
DOIs
State: Published - 3 Nov 2017
Event: 19th ACM International Conference on Multimodal Interaction, ICMI 2017 - Glasgow, United Kingdom
Duration: 13 Nov 2017 - 17 Nov 2017

Publication series

Name: ICMI 2017 - Proceedings of the 19th ACM International Conference on Multimodal Interaction
Volume: 2017-January

Conference

Conference: 19th ACM International Conference on Multimodal Interaction, ICMI 2017
Country/Territory: United Kingdom
City: Glasgow
Period: 13/11/17 - 17/11/17

Bibliographical note

Publisher Copyright:
© 2017 ACM.

Keywords

  • EmotiW 2017 challenge
  • Emotion recognition
  • Multi-modal signal
  • Multi-task learning
  • Semi-supervised learning
