Integration of speech and gesture: an ERP study
Boukje Habets, Sotaro Kita, Asli Özyürek, Peter Hagoort
Talk
Last modified: 2008-05-13
Abstract
During face-to-face communication, listeners not only hear speech but also see a speaker’s hand movements and body language. To comprehend, they therefore have to integrate spoken language information with several forms of visual information, including the hand gestures that accompany speech. ERPs have been used successfully to demonstrate that such integration between gesture and speech takes place.
The present ERP study investigated the optimal timing for this integration process. Videos of a person gesturing were combined with speech segments whose onsets varied relative to the gesture stroke (0 ms, i.e. simultaneous with stroke onset; 160 ms after stroke onset; or 360 ms after stroke onset) and whose meaning either matched or mismatched the gesture. We analyzed the negative ERP deflection around 400 ms (the N400), an index of semantic integration.
When matching speech-gesture pairs with different asynchronies were compared, the N400 increased as the speech-gesture asynchrony grew, indicating increasing integration difficulty with longer delays.
For mismatching pairs, the N400 was largest in the no-delay condition and decreased as asynchrony increased (no longer significant at 360 ms), implying that integration is disrupted most strongly when the asynchrony is smallest.
Our results imply that speech and gesture are integrated best when their asynchrony falls within a narrow time window of 160 ms.