A Visual or Tactile Signal Can Make the Auditory System More Efficient but Not Less Noisy
Ewen A. Chao, Bosco S. Tjan, Lynne E. Bernstein
Poster
Time: 2009-07-01 09:00 AM – 10:30 AM
Last modified: 2009-06-04
Abstract
Multisensory stimuli improve auditory speech detection thresholds, particularly under noisy background conditions. In a previous study [1], the auditory stimulus “ba” was detected at lower signal-to-noise ratios (SNRs) when it was presented simultaneously with a video of the talker’s face or with one of several non-speech visual stimuli, including a static square, a dynamic square, and a dynamic circle. We hypothesized that this effect could be due to a change in sensory noise under multisensory conditions or to a change in perceptual efficiency. We implemented an external noise paradigm [2] to model two separate, independent sources of potential multisensory threshold improvement: a reduction in intrinsic sensory noise and/or an increase in sampling efficiency. Intrinsic noise is the inherent noise in the sensory system and is theoretically stimulus-invariant. Efficiency is a measure of how well the available stimulus information is used; it is considered to reflect the tuning of a perceptual system to the specific task-relevant stimulus attributes, such as the stimulus onset time or spatial properties.
In Experiment 1, participants were tested on detection of the spoken “ba” from [1] with the external noise paradigm [2] at four fixed noise levels (white noise at 20, 40, and 60 dB SPL, and no noise). An adaptive staircase method was used in which the acoustic signal level was varied and the noise level was fixed, to obtain the 79.4% correct detection thresholds for a 2IFC task. Four conditions were tested: audio only (AO), audio with a vibrotactile pulse-train stimulus (AT), audio with a rectangular visual stimulus (AVR), and audio with visual speech (AVS). The statistically reliable relationship among speech detection thresholds across the noise levels was AO > (AT ≈ AVR) > AVS. The AVS efficiencies were higher than the AO efficiencies. The AT and AVR efficiencies were also higher than the AO efficiencies, but with individual variation in the relative order of effectiveness of the two stimuli. Intrinsic noise was near zero. Thus, for normal-hearing participants, threshold improvement with multisensory stimuli is mainly due to an improvement in efficiency, with no significant effect on intrinsic sensory noise across conditions.
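As an illustration of how efficiency and intrinsic noise can be separated in an external noise paradigm, the sketch below fits a straight line to threshold signal energy as a function of external noise power. It assumes the common linear form E_t = (d’^2 / efficiency) × (N_ext + N_int), which is consistent with the d’ relation used in this abstract but is not necessarily the exact fitting procedure of the study. The function names, the arbitrary linear energy units, and all numeric values are hypothetical and chosen only for the example.

```python
import numpy as np
from statistics import NormalDist


def dprime_2ifc(p_correct):
    """d' of an unbiased observer in a 2IFC task: d' = sqrt(2) * z(Pc)."""
    return np.sqrt(2.0) * NormalDist().inv_cdf(p_correct)


def fit_noise_model(noise_power, threshold_energy, d_prime):
    """Fit E_t = (d'^2 / efficiency) * (N_ext + N_int) with a straight line.

    The slope gives d'^2 / efficiency; intercept / slope gives the
    intrinsic (equivalent input) noise in the same units as noise_power.
    """
    slope, intercept = np.polyfit(noise_power, threshold_energy, 1)
    efficiency = d_prime ** 2 / slope
    intrinsic_noise = intercept / slope
    return efficiency, intrinsic_noise


# Hypothetical data: a no-noise condition plus three noise levels in dB,
# converted to linear power (arbitrary reference).
noise_db = np.array([20.0, 40.0, 60.0])
noise_power = np.concatenate(([0.0], 10 ** (noise_db / 10)))

# d' corresponding to the 79.4% correct threshold criterion.
d_thr = dprime_2ifc(0.794)

# Synthetic, noise-free thresholds generated from assumed parameters.
true_eff, true_nint = 0.05, 50.0
threshold_energy = d_thr ** 2 / true_eff * (noise_power + true_nint)

eff, n_int = fit_noise_model(noise_power, threshold_energy, d_thr)
print(f"efficiency = {eff:.3f}, intrinsic noise = {n_int:.1f}")
```

In this formulation a multisensory benefit that lowers the slope of the fitted line indicates higher efficiency, whereas a benefit that only lowers the intercept-to-slope ratio indicates lower intrinsic noise. With real data, an unweighted linear fit is dominated by the highest noise level, so a weighted or log-domain fit may be preferable.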
When the masking noise is much higher than the intrinsic sensory noise, d’ = sqrt(efficiency × SignalEnergy / NoiseEnergy), which allows a direct estimate of sampling efficiency. In Experiment 2, the method of constants was used to obtain d’, and two new multimodal conditions were introduced: AVR + tactile (AVRT) and AVS + tactile (AVST). A preliminary experiment fixed the signal level at 55 dB SPL and varied the noise level adaptively to determine a range of SNRs suitable for constant-stimuli presentation. Detection was then tested at -13, -14, and -15 dB SNR. AT, AVRT, AVS, and AVST all resulted in higher d’ than did the AO condition, and this elevation in d’ was indistinguishable among conditions. This result replicates our previous finding that a concurrent non-informative stimulus without speech-related qualities can be as effective as a natural speech stimulus in its threshold enhancement effect. Overall, the results support the view that auditory speech detection in noise is enhanced whenever a visual and/or tactile stimulus affords information to improve sampling of the auditory stimulus, and that internal sensory noise is unaffected by those visual and/or tactile stimuli.
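The high-noise relation d’ = sqrt(efficiency × SignalEnergy / NoiseEnergy) can be inverted to give efficiency = d’^2 / (SignalEnergy / NoiseEnergy). The sketch below applies this inversion under the simplifying assumption that the SNR in dB can be treated directly as the energy ratio; an absolute efficiency would need the signal energy relative to the noise spectral density, but ratios between conditions measured at the same SNR do not depend on that scaling. The function names and d’ values are hypothetical and are not results from the study.

```python
def db_to_ratio(db):
    """Convert a level difference in dB to a linear power (energy) ratio."""
    return 10 ** (db / 10)


def efficiency_from_dprime(d_prime, snr_db):
    """Invert d' = sqrt(efficiency * E / N) for efficiency (high-noise limit)."""
    return d_prime ** 2 / db_to_ratio(snr_db)


def relative_efficiency(d_prime_multi, d_prime_ao):
    """Efficiency ratio of two conditions tested at the same SNR."""
    return (d_prime_multi / d_prime_ao) ** 2


# Hypothetical d' values at -14 dB SNR (illustration only).
snr_db = -14.0
d_ao, d_avs = 1.0, 1.2

eff_ao = efficiency_from_dprime(d_ao, snr_db)
eff_avs = efficiency_from_dprime(d_avs, snr_db)
gain = relative_efficiency(d_avs, d_ao)
print(f"relative efficiency (AVS / AO) = {gain:.2f}")
```

Because the SNR term cancels, the relative efficiency depends only on the squared ratio of the two d’ values, so a modest increase in d’ corresponds to a proportionally larger increase in efficiency.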
1. Bernstein, L.E., E.T. Auer, Jr., and S. Takayanagi, Auditory speech detection in noise enhanced by lipreading. Speech Communication, 2004. 44(1-4): p. 5-18.
2. Legge, G.E., D. Kersten, and A.E. Burgess, Contrast discrimination in noise. Journal of the Optical Society of America A, 1987. 4(2): p. 391-404.