Audio-visual speech perception and attention

Jyrki Tuomainen, Raeya Abbas, Michael Coleman
Poster
Time: 2009-06-29  11:00 AM – 12:30 PM
Last modified: 2009-06-04

Abstract


An increasing number of reports suggest that audio-visual (AV) speech integration is not completely automatic but instead depends on attentional resources (e.g., Alsius et al. 2005; Tiippana et al. 2003). Two aspects of attention seem to be crucial: the perceptual set (Tuomainen et al., 2005) and perceptual capacity (resources allocated to processing relevant stimuli and ignoring distracters, e.g., Lavie & Tsal, 1994).

We investigated the latter issue by measuring susceptibility to the McGurk effect (e.g., an incongruent audio-visual pairing of auditory /aba/ with visual /aga/, typically heard as the fused percept /ada/) while the participants (N=19) simultaneously performed a demanding primary visual task under conditions of high or low perceptual load, with the instruction to ignore the articulating face. According to the ‘perceptual load hypothesis’ (Lavie & Tsal, 1994), distracter stimuli (in the current experiment, the articulating mouth) are processed only if detection of the target stimuli in the primary visual task does not exhaust attentional capacity. If this hypothesis is accurate, then especially under high perceptual load we should observe an increase in the number of “auditory responses” on the “McGurk trials” (indicating reduced integration of the auditory and visual speech signals), suggesting that AV speech integration is not a fully pre-attentive process.

The results showed a significant main effect of the attention manipulation (p<0.001): both the low and high load conditions differed significantly from the baseline condition. However, the attentional effect did not increase further as a function of perceptual load (Figure 1, see supplementary file).

The increase in task difficulty (perceptual load) with the number of different nontargets was confirmed by a significant reduction in the number of correct responses from the low load condition (77%) to the high load condition (58%) (p<0.001). Accuracy in the high load condition did not differ significantly from chance, and accuracy was also significantly lower on incongruent trials than on congruent trials (p=0.026).

The predictions of the perceptual load hypothesis were thus not fully supported by our results, as the increase in load did not further reduce integration. We speculate that under very high perceptual load participants cannot maintain constant focus on the primary task and may be distracted by the mouth movements, which automatically capture attention.
