A multisensory perspective on human auditory communication

Katharina von Kriegstein
Symposium Talk
Last modified: 2008-05-13

Abstract


Human face-to-face communication is essentially audio-visual. Typically, people talk to us face-to-face, providing concurrent auditory and visual input. Understanding someone is easier when there is visual input, because visual cues such as mouth and tongue movements provide complementary information about speech content. I will present data showing that, for auditory-only speech, the human brain exploits previously encoded audio-visual correlations to optimize communication. The data are derived from behavioural and functional magnetic resonance imaging experiments in prosopagnosics (i.e. people with a face-recognition deficit) and controls. The results show that, in the absence of visual input, the brain optimizes both auditory-only speech recognition and speaker recognition by harvesting speaker-specific predictions and constraints from distinct visual face-processing areas. These findings challenge current uni-sensory models of speech processing. They suggest that optimization of auditory speech processing is based on speaker-specific audio-visual internal models, which are used to simulate a talking face.
