YOUNG AND OLD PROCESS AUDIOVISUAL SPEECH MORE EFFICIENTLY THAN AUDITORY SPEECH: AN ERP STUDY OF AUDIOVISUAL SPEECH IN NOISE
Axel H Winneke, Natalie A Phillips
Poster
Time: 2009-06-29 11:00 AM – 12:30 PM
Last modified: 2009-06-04
Abstract
Background: In a sample of young adults (YA) and older adults (OA), we employed event-related brain potentials (ERPs) to examine audiovisual (AV) speech perception in background babble noise. There is ample evidence that even healthy OAs with clinically normal hearing thresholds show deficits in speech perception, particularly in noisy environments. We were interested in the extent to which visual speech cues can offset those perceptual deficits and whether the brain processes underlying AV speech integration differ between OAs and YAs. According to the inverse effectiveness hypothesis, older adults should gain more from multisensory cues than young adults because of their age-related sensory decline.
Method: ERPs were recorded while participants categorized 80 spoken words as denoting either natural (e.g., tree) or artificial (e.g., bike) objects via button-press responses. Young adults (N = 12; mean age = 24.3) and older adults (N = 9; mean age = 68.6) had clinically normal visual contrast sensitivity and hearing thresholds and were cognitively healthy. Single spoken words were presented in random order in unimodal auditory-alone (A) and visual-alone (V; i.e., lip-reading) trials and in bimodal audiovisual (AV) trials. We adjusted the signal-to-noise ratio (SNR) in the A-alone modality to equate the groups on perceptual load, so that we could measure the gain obtained by adding visual speech cues (i.e., the visual enhancement effect).
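The abstract does not report the exact SNR values or mixing procedure; as a minimal sketch of how a spoken word can be embedded in babble at a target SNR (all signal arrays and the -6 dB value below are hypothetical placeholders, not the study's stimuli), the noise can be rescaled from the definition SNR(dB) = 10 * log10(P_speech / P_noise):

    import numpy as np

    def mix_at_snr(speech, babble, snr_db):
        # Solve SNR(dB) = 10 * log10(P_speech / P_noise) for the noise
        # scaling factor that yields the requested mixture SNR.
        p_speech = np.mean(speech ** 2)   # average power of the speech signal
        p_noise = np.mean(babble ** 2)    # average power of the babble noise
        scale = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
        return speech + scale * babble

    # Hypothetical example: embed a 0.5 s word in babble at -6 dB SNR.
    rng = np.random.default_rng(0)
    word = rng.standard_normal(22050)    # stand-in for a spoken word (44.1 kHz)
    babble = rng.standard_normal(22050)  # stand-in for multi-talker babble
    mixture = mix_at_snr(word, babble, snr_db=-6.0)

Per-group SNR calibration of this kind is one plausible way to equate perceptual load before measuring visual enhancement.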
Results & Discussion: Compared to unimodal trials, responses to AV trials were the fastest and most accurate for both age groups (p< .001). In young and older adults this AV benefit was accompanied by a reduced amplitude of the auditory N1 ERP component at central sites compared to A-alone trials (p=.03) and to the summed response of unimodal trials (A+V) (p<.01). Furthermore, the N1 peaked significantly earlier (22ms) during AV trials (p<.001). This indicates that the addition of visual speech cues enabled more efficient speech processing because speech perception is more accurate and faster yet fewer neural resources are recruited. Interestingly, we did not see a main effect of Age nor an Age by Condition interaction when comparing responses to A-only and AV trials. In other words OA are as proficient in integrating auditory and visual speech cues as YA, and hence benefit just as much from audiovisual speech when the signal/noise is equated for A-only performance. Noteworthy is the fact that OA performed less accurate than YA in the V-only (lipreading) task (p=.006). However, when both modalities were combined in the AV condition OA performed as well as YA. This means that, with respect to the V-only condition, OA benefitted more from AV speech than YA. These results are consistent with the inverse effectiveness hypothesis which states that the less efficient unisensory information processing is the more is to be gained from combining the unisensory signals.