Robust user context analysis for multimodal interfaces

Prasenjit Dey, Muthuselvam Selvaraj, Bowon Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Multimodal interfaces that enable natural means of interaction using multiple modalities such as touch, hand gestures, speech, and facial expressions represent a paradigm shift in human-computer interfaces. Their aim is to allow rich and intuitive multimodal interaction similar to human-to-human communication and interaction. From the multimodal system's perspective, apart from the various input modalities themselves, user context information such as states of attention and activity, and the identities of interacting users, can greatly improve the interaction experience. For example, when sensors such as cameras (webcams, depth sensors, etc.) and microphones are always on and continuously capturing signals in their environment, user context information is very useful for distinguishing genuine system-directed activity from ambient speech and gesture activity in the surroundings, and for distinguishing the "active user" from among a set of users. Information about user identity may be used to personalize the system's interface and behavior - e.g., the look of the GUI, modality recognition profiles, and information layout - to suit the specific user. In this paper, we present a set of algorithms and an architecture that perform audiovisual analysis of user context using sensors such as cameras and microphone arrays, integrating components for lip activity and audio direction detection (speech activity), face detection and tracking (attention), and face recognition (identity). The proposed architecture allows the component data flows to be managed and fused with low latency, low memory footprint, and low CPU load, since such a system is typically required to run continuously in the background and report events of attention, activity, and identity in real time to consuming applications.
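The abstract describes fusing per-modality detector outputs (face tracking for attention, lip activity plus audio direction for system-directed speech, face recognition for identity) into real-time user-context events. The sketch below illustrates one plausible way such event fusion could be organized; the event schema, component names, thresholds, and fusion heuristic are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of low-latency fusion of per-modality events into
# user-context reports. All names, fields, and thresholds are assumptions
# made for illustration; they do not reproduce the paper's architecture.
from dataclasses import dataclass
import time


@dataclass
class ModalityEvent:
    source: str       # "face_track", "lip_activity", "audio_direction", or "face_id"
    timestamp: float  # capture time in seconds
    payload: dict     # detector-specific fields (e.g. frontal flag, azimuth, user id)


def fuse(window):
    """Combine recent per-modality events into a single user-context report.

    Heuristic (assumed): speech is treated as system-directed only when lip
    activity and an audio direction toward the device co-occur with a frontal face.
    """
    attention = any(e.source == "face_track" and e.payload.get("frontal")
                    for e in window)
    lips_moving = any(e.source == "lip_activity" and e.payload.get("active")
                      for e in window)
    audio_toward = any(e.source == "audio_direction"
                       and abs(e.payload.get("azimuth_deg", 180.0)) < 30.0
                       for e in window)
    identity = next((e.payload.get("user_id")
                     for e in window if e.source == "face_id"), None)
    return {
        "attention": attention,
        "system_directed_speech": attention and lips_moving and audio_toward,
        "identity": identity,
    }


if __name__ == "__main__":
    # A short window of detector events, as a continuously running background
    # service might accumulate them before emitting one fused context event.
    now = time.time()
    window = [
        ModalityEvent("face_track", now, {"frontal": True}),
        ModalityEvent("lip_activity", now, {"active": True}),
        ModalityEvent("audio_direction", now, {"azimuth_deg": 10.0}),
        ModalityEvent("face_id", now, {"user_id": "user_42"}),
    ]
    print(fuse(window))
    # -> {'attention': True, 'system_directed_speech': True, 'identity': 'user_42'}
```

In a real deployment the window would be maintained continuously (e.g. keeping only events from the last few hundred milliseconds) so that fused attention, activity, and identity events can be pushed to consuming applications with low latency, in line with the abstract's emphasis on a lightweight, always-on background service.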

Original language: English
Title of host publication: ICMI'11 - Proceedings of the 2011 ACM International Conference on Multimodal Interaction
Pages: 81-88
Number of pages: 8
DOIs
State: Published - 2011
Externally published: Yes
Event: 2011 ACM International Conference on Multimodal Interaction, ICMI'11 - Alicante, Spain
Duration: 14 Nov 2011 - 18 Nov 2011

Publication series

NameICMI'11 - Proceedings of the 2011 ACM International Conference on Multimodal Interaction

Conference

Conference: 2011 ACM International Conference on Multimodal Interaction, ICMI'11
Country/Territory: Spain
City: Alicante
Period: 14/11/11 - 18/11/11

Keywords

  • human-computer interaction
  • multimodal systems
  • speech
  • user context
