Crossmodal Attention &
Multisensory Integration

Friday 1st & Saturday 2nd October,
Wolfson Hall, Somerville College, Oxford

Funded by the Experimental Psychology
Society & Unilever Research

Organized by Charles Spence (Oxford)
& Philip Quinlan (York)


Talk Abstracts

Cue-driven shifts of anticipatory intermodal

attention: High-density electrical mapping


John J. Foxe

Department of Neuroscience, Albert Einstein College of Medicine,

Kennedy Centre 915, 1300 Morris Park Avenue, Bronx, N.Y. 10461, USA


In this study, we used high-density mapping of the human event-related potential (ERP) to examine the brain activity associated with directing attention to one of two possible modalities. Attention was directed to a particular modality by visually presented symbolic word-cues that instructed subjects, on a trial-by-trial basis, which modality was to be attended. By examining the spatiotemporal pattern of activation in the approximately 1 second period between the cue-instruction and a subsequent compound auditory-visual imperative stimulus, we could assess the brain regions that were involved in setting up and maintaining intermodal selective attention prior to the actual selective processing of those stimuli. We identified a system of frontal, parietal and sensory association areas that were involved in this deployment of intermodal attention. Frontal and parietal areas showed modality-specific increases in activation during the early part of the anticipatory period (with onset at ~230 ms), likely representing the activation of fronto-parietal attentional deployment systems. In the later period preceding the arrival of the "to-be-attended" stimulus, sustained differential activity was seen over fronto-central regions and parieto-occipital regions, suggesting the maintenance of modality-specific "biased" attentional states which would allow for subsequent selective processing. These late sensory-specific biasing effects were accompanied by sustained activations over frontal cortex that also showed modality-specific activation patterns, suggesting that maintenance of the biased state involves top-down inputs from frontal executive areas. Robust attention effects were also seen during the stimulus processing phase of the imperative stimulus.





Neural mechanisms of intermodal attention and

multisensory integration in macaque monkeys

Charles E. Schroeder

Program in Cognitive Neuroscience and Schizophrenia, Nathan Kline Institute for

Psychiatric Research, 140 Old Orangeburg Rd., Orangeburg, New York 10962, USA


Two sets of experiments will be described. In both, ERP, current source density (CSD) and multiunit activity (MUA) profiles were sampled with linear array multielectrodes from cortical and subcortical brain regions in awake monkeys. These profiles delineate the physiology and circuitry of cortical processing, as well as the neural origins of scalp ERPs. Monkeys were studied while: 1) alternating between a foveal color/intensity discrimination and a binaural frequency discrimination across trial blocks or 2) passively receiving combinations of visual, auditory and somatosensory stimuli. Three aspects of our findings will be discussed. First, in our paradigm, non-spatial, intermodal attention entails feedback-mediated enhancement of the net amplitude and duration of stimulus-evoked excitation. Second, attentional modulation of visual and auditory cortical processing contributes to ERP "selection negativities," thus positing specific circuits and processes as bases for effects noted in human ERPs. Third, regions of auditory association cortex adjacent to A1 receive visual and somatosensory inputs, contrasting in both timing and laminar profile with multisensory inputs to the nearby classic multisensory region (the superior temporal polysensory area). Thus, the substrates for auditory-visual and auditory-somatosensory integration are present at early auditory processing stages.



Selective attention within and between sensory modalities

Lawrence E. Marks & Gail Martino

John B. Pierce Laboratory, 290 Congress Avenue, New Haven, CT 06519, USA

Beginning in the 1960s, studies by Garner and others have revealed clear limitations on selective attention to components of multidimensional stimuli. Using methods of speeded classification, Garner and others have shown, for example, that subjects' ability to classify visual stimuli with respect to saturation is impeded (response time is increased) if lightness varies randomly and irrelevantly over trials. The failure to filter out irrelevant variation in an unattended stimulus dimension, sometimes called 'Garner interference', is often attributed to early holistic processing of the stimuli. In addition to interference, speeded classification often reveals congruence effects, whereby, for example, subjects respond more quickly to less saturated stimuli when lightness is low and to more saturated stimuli when lightness is high than to 'mismatching' combinations of saturation and lightness. Patterns of interference and congruence are not limited to multidimensionally varying unimodal stimuli; they also appear when subjects try to attend selectively to one modality in a multimodal (auditory-visual, auditory-tactile) stimulus. With unimodal stimuli, interference and congruence may reflect the operation of 'early' perceptual processes. With multimodal stimuli, the interactions may in part reflect relatively 'late' processes whereby information from different modalities is coded into an abstract, perhaps even semantic, format.





The attentional blink across stimulus modalities


Karen Arnell

Department of Psychology, North Dakota State University, USA


When two masked targets (T1 and T2), both requiring attention, are presented within half a second of each other, report of the second target is poor, a phenomenon known as the Attentional Blink (AB). Early theories of the AB emphasised its uniquely visual nature. However, I will present several studies demonstrating that the AB can be found when both targets are auditory, and across modalities when one target is visual and the other target is auditory. Crossmodal AB has been observed when: 1) auditory targets are spoken letters, 2) auditory targets are pure tones, 3) a speeded response is required to an unmasked auditory target, and 4) the experiment allows no preparatory task-set switching. The existence of crossmodal AB supports theories of the AB emphasising central processing limitations on stimulus consolidation.



Dual task interference on visual marking: Modality-dependent

and modality-independent components of the marking state


Glyn W. Humphreys (1) & Derrick Watson (2)

(1) School of Psychology, University of Birmingham, UK

(2) Department of Psychology, University of Warwick, UK

Efficient selection of new visual objects depends not only on a passive prioritization of new items but also on top-down inhibition of old objects: a process we have termed visual marking (Watson & Humphreys, 1997). Marking can be demonstrated in visual search tasks in which potentially harmful 'old' distractors can be shown to have little effect on search through large sets of old items, when the old and new items are temporally separated. However, there is a systematic decrease in the ability to ignore the old items if participants perform a secondary task while the old items are present. In this paper we varied the timing of the secondary task with regard to the old items, and the modality of presentation. When the second task started along with the onset of the old items, performance was disrupted equally by visual and auditory secondary tasks. When the second task started after the old items had been in the field for some period, there was only disruption from a visual secondary task. The results suggest that there may be both modality-dependent and modality-independent components of marking. The ability to establish the marking state appears to involve central processes utilised in processing auditory as well as visual signals. The ability to maintain the marking state relies on processes specific to vision.


Crossmodal perceptual grouping: Evidence from the ventriloquist effect


Paul Bertelson

Laboratoire de Psychologie expérimentale, Université Libre de Bruxelles,

CP 191, B-1050 Bruxelles, BELGIUM

The term ventriloquism refers to various manifestations of crossmodal spatial interaction between auditory and visual inputs that are observed in sensory conflict situations. These involve on-line manifestations like perceptual fusion in spite of spatial discrepancy and immediate crossmodal bias, by which localization of data in one modality is attracted toward conflicting data in the other modality, and also off-line aftereffects. Existing data concerning the variables on which the occurrence of the phenomena depends are reviewed. A question which has been neglected in the literature is how to separate genuine perceptual contributions to the phenomena from voluntary postperceptual adjustments. Using a new paradigm based on psychophysical staircases, we have shown that the visual bias of auditory location occurs even when the subject is not aware of the auditory-visual discrepancy, and hence cannot be reduced to voluntary corrections. Other experiments have demonstrated that ventriloquism cannot be explained by deliberate shifts of spatial attention. It is concluded that ventriloquism reflects a phenomenon of automatic crossmodal pairing, i.e. formation of a crossmodal perceptual unit which takes place at a pre-conscious processing stage and thus must be clearly distinguished from conscious perceptual fusion.



Sound enhances visual perception: Crossmodal effects

of auditory organization on visual perception


Jean Vroomen

KUB / FSW, Tilburg University, P.O. Box 90153, 5000 LE Tilburg, THE NETHERLANDS

Four experiments demonstrated crossmodal influences from the auditory on the visual modality at an early level of perceptual organization. Participants had to detect a visual target in a rapidly changing sequence of visual distractors. An abrupt tone that segregated from a tone sequence (stream segregation; Bregman, 1990) improved detection of the visual target (Expt. 1). Synchronization between the tone and the visual target proved to be critical (Expt. 2). Perceptual organization of the tone sequence was of crucial importance: Detection of the visual target was better when the tone segregated from the tone sequence than when the same tone was part of a melody (Expt. 3). This finding was robust under various display times of the visual target (Expt. 4). These results are compatible with the notion that the perceptual system makes coherent decisions about the input across modalities. Stream segregation in the auditory modality can therefore have an effect on segregation in the visual modality.



Cognitive and non-cognitive influences on "intersensory bias"


Robert B. Welch

NASA-Ames Research Center, Moffett Field, CA 94035, USA


When two sensory modalities conflict with one another as, for example, when viewing the hand through a prism, each modality modifies the other to a greater or lesser extent. This so-called "intersensory bias" is modulated by a variety of factors. The major "non-cognitive" controlling variable is the number (and perhaps weighting) of "amodal properties" shared by the two sensory modalities. Proposed "cognitive" variables include: (1) the degree to which the two sensory sources are related through experience (e.g., the sight of a speaker's moving mouth and the sound of his voice), (2) instructional or situational modifications of the observer's assumption that the two sensory inputs are emanating from a single event (the "unity assumption"), and (3) the distribution of the observer's attention between the two modalities. However, before it can be concluded that these or any other cognitive variables actually influence intersensory bias, precautions must be taken to minimize the possibility that the perceptual measures of this bias are being corrupted by response factors, such as the observer's knowledge about the objective situation. Bertelson and Aschersleben (1998) have argued recently that this problem can be solved by the "procedure of psychophysical staircases," a stratagem that, although quite promising, has certain limitations.



Integration of redundant and non-redundant auditory-visual information

during object recognition in humans: Behavioral and electrophysiological data


Marie-Helene Giard & Alexandra Fort

Mental Processes and Brain Activation, INSERM U280,

151, Cours Albert Thomas 69424 Lyon Cedex 03, FRANCE


A fundamental question for cognitive psychologists and neuroscientists is how the sensory systems "integrate" separate (e.g. auditory and visual) features of a multimodal object to form a unitary percept. In paradigms where subjects have to detect stimuli defined either by unimodal features alone or by the combination of the unimodal features, behavioral results generally show a facilitation effect (shorter RT and better performance) for multimodal processing ("redundant target effect"). Recent ERP recordings during a (redundant) multimodal object recognition task have revealed the existence of several neural patterns of auditory-visual interactions at early stages of sensory analysis (<200 ms), in both sensory-specific and non-specific cortices. In addition, these interactions were found to depend partly on the perceptually more dominant sensory modality of the subject: the predominant effects were evident in the sensory areas of the non-dominant modality (Giard & Peronnet, J. Cog. Neurosci., 1999). When the sensory information included in the bimodal stimulus is not redundant for the task (i.e. both unimodal cues are necessary for object identification), no facilitation effect is observed. Yet ERP results show that auditory and visual information still interact at early analysis stages in sensory-specific and non-specific cortices, with smaller amplitudes but still the same effect of modality-dominance of the subject. We will examine these different patterns of results and discuss their implications for multisensory integration.




Gemma Calvert

FMRIB Centre, University of Oxford, OX3 9DU, UK

Combining information from the different sensory channels greatly improves the detection, localisation and classification of objects and events in the environment. Such crossmodal synthesis is highly dependent on the congruity of the intersensory inputs. Matching sensory inputs (such as the sight and sound of a cat) constructively combine to refine classification, whilst mismatched inputs (e.g. the sight of a cat and sound of a dog) destructively interfere, hampering classification. Little is known about the neural basis of this crossmodal synthesis in man. FMRI is now providing insights into both the method by which this crossmodal binding is achieved in humans and the neurophysiological mechanisms underlying the resulting perceptual gains. 


Crossmodal links in exogenous spatial attention revealed by

the orthogonal temporal order judgment task

Juan Lupiáñez (1), Roland Baddeley (2), & Charles Spence (3)

(1) Departamento de Psicologia Experimental, Universidad de Granada, 18071 - Granada, SPAIN

(2) Department of Psychology, Sussex University, UK

(3) Department of Psychology, University of Oxford, UK

In this study, we investigated the effects of exogenous visual and auditory cues on the perception of visual, tactile and auditory stimuli. Participants were presented with pairs of stimuli (one visual and the other tactile), one to either side of fixation, and were required to make unspeeded temporal order judgments (TOJs) about which modality was perceived first (Experiment 1) or last (Experiment 2). A spatially-uninformative auditory cue was presented unpredictably on the left or right shortly before (150 ms interval) the onset of the visuotactile stimuli. Sensory processing was faster on the side of the auditory cue than on the uncued side. This exogenous cuing effect was no longer present when the interval was increased to 700 ms, supporting the view that TOJs are unaffected by IOR. In a final experiment, visual cues were presented prior to pairs of auditory stimuli differing in frequency, and participants judged which frequency was perceived first/last. The sound presented on the cued side was perceived more rapidly than the sound on the uncued side, revealing that visual cues can lead to exogenous shifts of auditory spatial attention, at least under certain conditions.





Crossmodal interactions in exogenous spatial

attention: Behavioral and ERP measures


John J. McDonald & Lawrence M. Ward

University of British Columbia, Canada

It is well known that sensory events of one modality can influence judgments of sensory events in other modalities. For example, people respond more quickly to a target appearing at the location of a previous cue than to a target appearing at another location, even when the two stimuli are from different modalities. Such crossmodal interactions suggest that involuntary spatial attention mechanisms are not entirely modality-specific. The present study recorded event-related brain potentials (ERPs) to examine the timing and neuroanatomical basis of involuntary, crossmodal spatial attention effects. We found that orienting spatial attention to an irrelevant sound modulates the ERP to a subsequent visual target over modality-specific, extrastriate visual cortex, but only after the initial stages of sensory processing were completed. Specifically, ERPs measured over the occipital scalp were negatively displaced when an auditory cue and visual target appeared at the same location compared to when they appeared at different locations. This occipital negative displacement (Nd) began approximately 200 ms after the appearance of the target and continued for 200 ms. These crossmodal ERP effects are consistent with the proposal that involuntary spatial attention orienting to auditory and visual stimuli involves shared, or at least linked, brain mechanisms.



Event-related fMRI of visuo-tactile interactions during spatial covert orienting

Emiliano Macaluso (1), C.D. Frith (1), & J. Driver (2)

(1) Wellcome Department of Cognitive Neurology, Institute of Neurology, London, UK

(2) Institute of Cognitive Neuroscience, Department of Psychology, University College London, UK

Visuo-tactile links have been demonstrated using single cell recording in monkeys and behavioural procedures in humans. Using event-related fMRI we investigated the possibility that spatially congruent visuo-tactile stimulation may modulate brain responses to peripheral visual stimuli. Six right-handed volunteers were tested using a 2x2 factorial design. One factor was the side of the visual stimulation (left or right). The second factor was the presence of tactile stimulation on the right index finger (present or absent). The right visual stimulation was delivered near to the right index finger. The task was to quickly respond to every visual stimulus. The first step of the analysis was to identify brain areas showing a main effect of side of the visual stimulation. These were found in the contralateral extrastriate cortex. Within these areas, we then tested for an interaction between side and presence of the tactile stimulation. The left lingual gyrus showed significant modulation. The effect of side of the visual stimulation was increased in the context of the contralateral tactile stimulation. Maximal response was observed when both visual and tactile events occurred simultaneously in the right hemifield. We conclude that spatial congruence of crossmodal stimulations may influence activity in brain areas usually considered unimodal.





ERP correlates of crossmodal links between vision, audition, and touch

Martin Eimer

Department of Experimental Psychology, University of Cambridge,

Downing Street, Cambridge CB2 3EB, UK

Recent findings from ERP studies suggest the existence of crossmodal links in spatial attention. When attention is directed to a specific location within one modality, ERPs recorded to stimuli in another irrelevant modality indicate the selective processing of stimuli presented at attended locations. These results were found in a series of new ERP studies investigating crossmodal links in endogenous spatial attention between audition, vision and touch. They provide evidence for the existence of crossmodal links that affect primarily early stages of information processing. However, in contrast to vision and audition, touch may be 'special' in the sense that it can be selectively decoupled from attentional processes within another modality when tactile information can be entirely ignored. In addition, ERP evidence for crossmodal links between touch and vision in exogenous (involuntary) spatial attention will be presented.




How the brain synthesizes information from

different senses to produce adaptive behavior


Barry E. Stein

Department of Neurobiology & Anatomy, Wake Forest University School of Medicine,

Wake Forest University, Winston-Salem, North Carolina, 27157, USA

That sensory cues in one modality affect perception in another has been known for some time, and there are many examples of "intersensory" influences within the broad phenomenon of cross-modal integration. The ability of the CNS to integrate cues from different sensory channels is particularly evident in the facilitated detection of, and reaction to, combinations of concordant cues from different modalities, and in the dramatic perceptual anomalies that can occur when these cues are discordant. A substrate for multisensory integration is provided by the many CNS neurons (e.g., in the superior colliculus) which receive convergent input from multiple sensory modalities. Similarities in the principles by which these neurons integrate multisensory information in different species point to a remarkable conservation in the integrative features of the CNS during vertebrate evolution. In general, profound enhancement or depression in neural activity can be induced in the same neuron, depending on the spatial and temporal relationships among the stimuli presented to it. The specific response product obtained in any given multisensory neuron is predictable based on the features of its various receptive fields. Perhaps most striking, however, is the parallel which has been demonstrated between the properties of multisensory integration at the level of the single neuron in the superior colliculus and at the level of overt attentive and orientation behavior.





Jon Driver

Institute of Cognitive Neuroscience, University College London UK




Bistability and multisensory integration in human postural control

John J. Jeka, Kelvin S Oie (1),

Tim Kiemel (2) & Gregor Schöner (3)

(1) Program In Neuroscience & Cognitive Sciences & Departments of

Kinesiology/Psychology, University of Maryland, College Park, MD 20742, USA

(2) Department of Biology, University of Maryland, College Park, MD, USA

(3) Laboratoire de Neurosciences Cognitives, CNRS, Marseilles, FRANCE

The control of human upright stance is inherently multisensory, requiring fusion of inputs from the visual, vestibular and somatosensory systems to estimate center of mass position in space. In our multisensory experimental paradigm, subjects stand in front of a large screen visual display while lightly touching a contact surface with their right index fingertip. Forces applied to the contact surface are sensory, not mechanically supportive (< 1 Newton). Previous studies showed that when the visual display or the contact surface oscillated, body sway entrained to the periodic sensory input. However, a linear second-order dynamical model was unable to account for the fusion of sensory inputs in an additive fashion.

In the present experiment we systematically manipulate the relative strength of both visual and somatosensory input into the postural control system to examine the nonlinear fusion process. Anti-phase, oscillatory, visual and somatosensory stimuli were presented at a constant frequency (0.2 Hz). The amplitudes of the stimuli were inversely scaled relative to each other in 0.2 cm increments over a range of 0.0-2.0 cm. With increasing visual motion amplitude, postural sway switched from being approximately in-phase with the somatosensory stimulus to being approximately in-phase with the visual stimulus. Switches occurred in the opposite direction with decreasing visual motion amplitude. In some subjects hysteresis was observed, with bistability at intermediate stimulus amplitudes. These switches can be viewed as bifurcations of a dynamical system and may be related to a decision-making process in which inconsistent sensory information is discounted. Supported by NIH grant R29 NS35070-01A2 (John J Jeka, PI).




Integration of visual and proprioceptive information about hand position


Robert van Beers

Institute of Cognitive Neuroscience, University College London, London, WC1N 3AR, UK


In order to plan and execute goal-directed arm movements, the CNS needs to have information about the position of the hand. Two sensory systems provide direct information on the hand's position: vision and proprioception. We studied how the CNS integrates information from these two modalities by performing two psychophysical experiments. In the first experiment we determined the precision of the information from each of these sources in isolation. For both modalities, the precision proved to be spatially non-uniform in the horizontal plane. Proprioception was found to provide more precise information in the radial direction, whereas visual information was more precise in the azimuthal direction. The second experiment addressed the question of how the CNS combines the information from the two sources when they are available simultaneously. The results show not only that the information from both sources is used simultaneously, but also that the two sources contribute with different weights in different directions. These weights proved to be related to the direction-dependent precision found in the first experiment, in such a way that the CNS makes very efficient use of the available information.
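One common way to formalise direction-dependent weighting of this kind is minimum-variance (precision-weighted) fusion of two Gaussian estimates. The sketch below is only an illustration of that textbook rule, not the authors' model: the function name, the covariance values and the treatment of the azimuthal and radial directions as two local orthogonal axes are all assumptions made for the example.

```python
import numpy as np

def combine_estimates(mu_v, cov_v, mu_p, cov_p):
    """Precision-weighted fusion of two 2-D Gaussian position estimates.

    Each cue is weighted by its inverse covariance (its precision), so a cue
    dominates along the direction in which it is more precise.
    """
    prec_v = np.linalg.inv(cov_v)
    prec_p = np.linalg.inv(cov_p)
    cov_c = np.linalg.inv(prec_v + prec_p)          # fused covariance
    mu_c = cov_c @ (prec_v @ mu_v + prec_p @ mu_p)  # fused mean
    return mu_c, cov_c

# Hypothetical values in cm; axis order is (azimuthal, radial).
mu_v = np.array([0.0, 0.0])
cov_v = np.diag([0.25, 1.0])   # vision assumed more precise in azimuth
mu_p = np.array([1.0, 1.0])
cov_p = np.diag([1.0, 0.25])   # proprioception assumed more precise radially

mu_c, cov_c = combine_estimates(mu_v, cov_v, mu_p, cov_p)
print(mu_c)  # fused estimate lies near vision in azimuth, near proprioception radially
```

With these made-up numbers the fused estimate sits close to the visual estimate along the azimuthal axis and close to the proprioceptive estimate along the radial axis, which is the qualitative pattern the abstract describes.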


Near ideal multisensory integration in a simple reaching task


Roland Baddeley & Helen Ingram

Laboratory of Experimental Psychology, University of Sussex, Falmer, UK


Human arm movements are not as reliable as those of robots. Sending the same signal to our muscles after 3 hours of rock climbing will result in a different movement to that after a day at the computer terminal. In day to day life, people naturally adapt the mapping between spatial and motor coordinates. The distorting effects of glasses, muscle tiredness, increased strength from training, the effects of environment (such as water or clothing), the effect of a heavy load or injury; all change the mapping between the motor command and the resultant movement. Therefore the mapping between motor commands and the resultant arm movements needs to be recalibrated constantly using a combination of visual and proprioceptive information. Using a simple reaching task in the presence of noise, analysed using system identification methods, we show that 1) the visual and proprioceptive information is very efficiently integrated (with a Fisher efficiency of above 80%), 2) that in most cases visual feedback dominates proprioceptive feedback, and 3) that the nature of the algorithm subjects use adapts to the nature of the task in a way consistent with an ideal observer. This last characteristic rules out a model based on a Kalman filter with fixed parameters (the model usually used in engineering for integration of the output from multiple sensors). We present an alternative Bayesian model that provides a good account of the observed subjects' behaviour.
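As a rough illustration of what an efficiency figure of this kind expresses, statistical efficiency is conventionally defined as the variance of the best possible estimator divided by the variance of the estimator actually observed. The sketch below assumes two independent Gaussian cues; the function names and numbers are hypothetical, and the abstract does not say how the authors computed their value.

```python
def ideal_variance(var_visual, var_proprio):
    """Variance of the optimal combination of two independent Gaussian cues."""
    return 1.0 / (1.0 / var_visual + 1.0 / var_proprio)

def efficiency(var_visual, var_proprio, var_observed):
    """Ratio of ideal to observed variance (1.0 means ideal performance)."""
    return ideal_variance(var_visual, var_proprio) / var_observed

# Hypothetical variances (arbitrary units), for illustration only.
eff = efficiency(var_visual=1.0, var_proprio=4.0, var_observed=1.0)
print(f"efficiency = {eff:.2f}")
```

In this made-up case the observer's variance equals that of vision alone, yet efficiency is still 0.8, because the ideal combination of the two cues could only reduce the variance to 0.8.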


Sensory integration as revealed by proprioceptive matching in a patient

with unilateral somatosensory impairment following central deafferentation

Roger Newport, Stephen R Jackson, Masud Husain, & John V. Hindle

Human Movement Laboratory, Psychology, University of Wales, Bangor, Gwynedd LL57 2DG, UK

During reaching movements, sensory signals must be transformed into appropriate motor command signals. For movements directed to visually defined targets, this will involve translating visual information signalling the spatial position of the target into a motor plan which specifies the sequence of postural changes required to bring the hand to the target. Understanding the nature of these visuomotor transformations, particularly how different kinds of sensory cues may be combined to produce the motor plan, remains a critical but largely unresolved issue in motor neuroscience. Recent anatomical and neuropsychological evidence suggests that the frame of reference used to guide reaching movements may vary according to whether movements are directed to visually-defined or proprioceptively-defined target locations. In the current study we report a series of experiments which investigate how the accuracy of reaching movements varies, in a patient (CT) recovering from unilateral somatosensory impairment, including tactile extinction, when executing reaches toward visually-defined or proprioceptively-defined targets. A key feature of the reaching task that we used was that the patient was required to reach, using her non-impaired limb, above a table surface to target positions defined proprioceptively by passively placing the patient's impaired hand in position under the table, and thus out of view. Our findings demonstrate the following: (a) when the target location of a reach is defined proprioceptively by a hand which has been passively positioned by the experimenter, access to visual cues significantly increases end-point accuracy of the reach, even though such cues cannot possibly signal the position of the target; (b) our patient shows a limb-specific loss of accuracy such that she only exhibits large spatial inaccuracies when reaching to proprioceptively-defined targets using her non-impaired hand (targets are defined by her impaired hand); (c) these limb-specific errors are substantially reduced when visual cues are made available, even though such cues cannot signal the spatial position of the target hand.

References 1. Rushworth MFS, Nixon PD, and Passingham RE. (1997). Parietal cortex and movement I. Movement selection and reaching. Exp. Brain Research, 117, 292-310. 2. Jackson SR, Newport R, Husain M, Harvey M, & Hindle JV. (in press). Reaching movements may reveal the distorted topography of spatial representations after neglect. Neuropsychologia.




Multisensory integration in retinotopic coordinates

Alex Pouget, J.C. Ducom, J. Torri, & D. Bavelier

Brain and Cognitive Science Department, 402 Meliora Hall,

University of Rochester, Rochester, NY 14627 USA

Human subjects tend to overshoot when pointing at the remembered location of a visual target in the absence of visual feedback from their hand and while maintaining fixation away from the target. The amplitude of this overshoot follows a sigmoidal function of the retinotopic position of the target, suggesting that the position of visual targets is remembered in retinotopic coordinates. Using retinotopic coordinates for spatial memory can be problematic since retinotopic position changes with each saccade. One solution consists of updating the retinotopic location of the remembered target after each saccade. This predicts that if a saccade intervenes between the presentation of the target and the initiation of the hand movement, the overshoot would reflect the retinal position of the target after the eye movement, not before. This is indeed what has been observed by Enright (Vis. Res., 35, 1995) and Henriques et al. (J. Neurosci., 18, 1998). Using a similar experimental paradigm, we have found that retinotopic coordinates appear to be used for auditory, proprioceptive and imaginary targets as well. These results are surprising since one can compute a pointing motor command from any of these modalities without ever using retinotopic coordinates. It is clear, however, that remapping all sensory modalities into a common frame of reference greatly simplifies the problem of multisensory integration and memory.
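The updating scheme described above can be sketched in one dimension: a remembered retinotopic location is shifted by the negative of each saccade vector, so that it always matches the retinal position the target would have under the current gaze direction. The function names and angles below are assumptions chosen for illustration, not values from the study.

```python
def retinal_position(target_deg, gaze_deg):
    """Retinotopic position of a target: its direction relative to gaze."""
    return target_deg - gaze_deg

def update_after_saccade(remembered_retinal_deg, saccade_deg):
    """Remap a remembered retinotopic location across a saccade."""
    return remembered_retinal_deg - saccade_deg

# Hypothetical horizontal angles in degrees (rightward positive).
target = 10.0
gaze_before = 0.0
saccade = 25.0

remembered = retinal_position(target, gaze_before)      # stored before the saccade
remapped = update_after_saccade(remembered, saccade)    # updated after the saccade
print(remembered, remapped)
```

After the saccade the remapped value equals the retinal position computed directly from the new gaze direction, which is why the pointing error would be expected to follow the post-saccadic rather than the pre-saccadic retinal position.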




Visual, tactile, and vestibular information convergence in the monkey parietal cortex


Jean-René Duhamel & Sophie Denève

Institut des Sciences Cognitives, CNRS UPR 9075, 67 boulevard Pinel, 69675 Bron, FRANCE


The parietal lobe contains multiple specialized areas which are involved in different aspects of sensorimotor integration, such as the establishment of reference frames for coding spatial information across sensory modalities, the visual guidance of voluntary movement, and attentional and decisional aspects of stimulus and response selection. In one of its subdivisions, the ventral intraparietal area (VIP), single unit recording experiments conducted in non-human primates have shown that visual activity is modulated by eye position signals, as observed in many other extrastriate visual areas. In contrast to most of these areas, however, the individual visual receptive fields of a population of VIP neurons were organized along a continuum, from eye to head coordinates. That is, some neurons encode the azimuth and/or elevation of a visual stimulus independent of the direction in which the eyes are looking, thus representing spatial locations explicitly in at least a head-centered frame of reference. Although the prevalent sensory modality is visual, area VIP neurons often respond to visual, tactile and vestibular stimulation, with congruent response properties across the three sensory modalities. Neural network modeling suggests that such multisensory convergence maps serve as integrators of sensory input in different coordinate frames, and enable the coding of the location of a stimulus detected by one sensory system using another sensory system as reference.




The nature of responses to auditory and visual stimuli

in macaque posterior parietal cortex


Alexander Grunewald

Division of Biology, California Institute of Technology,

Mail Code 216-76, Pasadena, CA 91125, USA

The lateral intraparietal area (LIP) is involved in sensory-motor processing. Historically it has been considered responsive to visual, but not auditory, stimuli. Recent reports, however, indicate that LIP neurons respond to auditory stimuli in an auditory-saccade task. How do auditory and visual responses in LIP depend on training on, and the performance of, auditory saccades? In the first experiment, LIP fixation responses were recorded at two different times: before and after auditory-saccade training. Before training no neurons responded to auditory stimuli, and half the neurons responded to visual stimuli. After training, 13% of neurons responded to auditory stimuli, while the responsiveness to visual stimuli was unchanged. In the second experiment, recordings were made while monkeys performed either a fixation or a saccade task to auditory and visual stimuli. Blocks with fixation and saccade trials were alternated, and activity in both tasks was recorded from the same neurons. Responses to auditory stimuli in LIP were stronger in saccade trials than in fixation trials. Visual responses were unaffected. These results suggest that responses to auditory and visual stimuli differ strongly in LIP. Auditory responses are dependent on the saccadic significance of auditory stimuli, whereas visual responses appear to be "native" to LIP.



Multisensory integration in space perception


Hans-Otto Karnath

Department of Neurology, University of Tuebingen, Hoppe-Seyler-Strasse 3

D-72076 Tuebingen, GERMANY

Damage to the human parietal cortex leads to disturbances of spatial perception and of motor behaviour. Within the parietal lobe, lesions at different sites induce quite different, characteristic deficits. Patients with inferior (predominantly right) parietal lobe lesions fail to explore the contralesional part of space by eye or limb movements. The observations suggest that a spatial reference frame for exploratory behaviour is disturbed in patients with neglect. Data from these patients' visual search argue that their failure to explore the contralesional side is due to a disturbed input transformation from different modalities leading to a deviation of egocentric space representation. Multimodal neural representation of space in the inferior parietal lobule seems to serve as a matrix for spatial exploration and for orienting in space.





Neuropsychological evidence of an integrated visuotactile

representation of peripersonal space in humans


Current interpretations of extinction suggest that the disorder is due to an unbalanced competition between ipsilesional and contralesional representations of space. The question I will address is whether the competition between left and right representations of space in one sensory modality (i.e., touch) can be modulated by the activation of an intact spatial representation in a different modality (i.e., vision), functionally linked to the damaged representation. It will be shown that, in patients with a right-hemisphere lesion and reliable tactile extinction, a visual stimulus presented near the ipsilesional hand (or face) inhibited or interfered with the processing of a tactile stimulus delivered on the contralesional hand (or face) (cross visual-tactile extinction) to the same extent as did an ipsilesional tactile stimulation (unimodal extinction). In contrast, weak modulatory effects of vision on touch perception were observed when a visual stimulus was presented far from the patient's hand (or face). It will also be discussed whether such cross-modal links between touch and vision in the peripersonal space could be mediated by proprioceptive signals specifying the current position of the body part, or whether they directly reflect an interaction between two sensory modalities, i.e. vision and touch. The results suggest the existence of an integrated system that controls both visual and tactile inputs within peripersonal space and they show that this system is functionally separated from the one that controls visual information in the extrapersonal space. These findings will be explained by referring to the activity of bimodal neurons in premotor and parietal cortex of the macaque, which have tactile receptive fields on the hand (or face), and corresponding visual receptive fields in the space immediately adjacent to the tactile fields.




Tactile perception in neurological patients:

Facilitation and competition from visual stimuli

Chris Rorden

Institute of Cognitive Neuroscience, University of London, London, WC1N 3AR, UK

Medical Research Council, Cognition and Brain Sciences Unit, Cambridge, UK

Two neuropsychological studies are presented which investigate the influence of visual information on tactile perception. The first investigation examined a patient who reported only being able to feel touch when he could see the region being touched. In an experimental setting, we found that he had difficulty detecting a tap accompanied by a salient (but not predictive) light located directly above his concealed hand. However, his performance was dramatically improved if the light was attached to a prosthetic arm situated in line with the patient's arm. This finding demonstrates that crossmodal sensory facilitation does not only depend upon simple spatial proximity. Rather, a simultaneous visual event improves perception of touch specifically when it is attributed to the perceiver's stimulated limb. The second study investigated the effect of visual events on the perception of touch in a patient suffering from extinction. We found that both visual and tactile events occurring near the right hand extinguished the perception of taps to the (contralesional) left hand. However, when the visual stimuli and gaze were shifted toward the impaired side (so that the 'right' visual stimulus was at the location of the left hand, but still in the right field retinotopically) the 'right' visual event no longer caused extinction of stimuli on the left hand. This study suggests that crossmodal extinction occurs after the remapping of perceptual information.




Crossmodal interactions in visual-somatosensory saccade generation

Robin Walker & Richard Amlôt

Department of Psychology, Royal Holloway University of London, Egham, Surrey, TW20 0EX, UK

Behavioural studies of saccades have provided some evidence of multisensory visual-auditory interaction effects (e.g. Frens & Van Opstal, 1998). Such interactions are consistent with neurophysiological evidence that saccade-related neurons in the deep layers of the superior colliculus (SC) respond more vigorously to stimuli presented multimodally than to stimuli presented unimodally (Meredith & Stein, 1986). Similar neural facilitation effects have also been demonstrated with somatosensory stimuli, and although the characteristics of somatosensory saccades have been investigated unimodally (Groh & Sparks, 1996), there have been no multimodal investigations of somatosensory integration.

The present study investigated multimodal visual-somatosensory interaction effects. Saccades were made to either somatosensory or visual targets, and on some trials a distractor from the non-target modality was also presented. Coincident or non-coincident somatosensory distractors did not influence the latency of saccades made to visual targets. However, coincident visual distractors did reduce the latency of saccades to tactile targets, and non-coincident visual distractors increased somatosensory saccade latency. These results may be interpreted in terms of multisensory convergence in the colliculus and dominance for visual stimuli in saccade generation.




The development of temporally based intersensory integration in human infants


David J. Lewkowicz

NYS Institute for Basic Research in Developmental Disabilities,

1050 Forest Hill Rd., Staten Island, NY 10314, USA


Most perceptual events are multimodally represented and temporally distributed. The temporal organization of such events provides a ready-made basis for the integration of the concurrent information in multiple sensory modalities. Whether the integration is based on the perception of amodal invariants or on an active process of integration, it is still not clear how intersensory temporal perceptual mechanisms develop in early infancy. To answer this question, I will first review and synthesize current knowledge about the development of intersensory temporal perception in human infants. I will then present a theoretical model whose principal aim is to account for the developmental emergence of temporally based intersensory integration skills. The model is based on the principles of epigenetic systems theory and proposes that responsiveness to four basic features of multimodal temporal experience, namely temporal synchrony, duration, temporal rate, and rhythm, emerges in a sequential, hierarchical fashion. It postulates (a) that initial developmental limitations make intersensory synchrony the first and foundational basis for the integration of intersensory temporal relations, and (b) that the emergence of responsiveness to the other, increasingly complex, temporal relations occurs in a hierarchical, sequential fashion by building on the previously acquired intersensory temporal processing skills.




Merging neural representations of visual and auditory space during development

Andrew J. King

University Laboratory of Physiology, Parks Road, Oxford OX1 3PT, UK

The superior colliculus in the midbrain contains topographically aligned visual, auditory and somatosensory maps. The registration of these maps facilitates the integration of multisensory signals at the neuronal level and potentially allows any of the sensory cues associated with a target in space to evoke goal-directed orienting behaviours. Because spatial information in these sensory systems is initially encoded using different coordinates, establishing and maintaining map registration in the face of both growth-related changes in the relative geometry and independent movements of the peripheral sense organs is not straightforward. Studies in birds and mammals have shown that sensory experience plays a crucial role in the process of merging the visual and auditory representations during development. In particular, visual signals, which may arise from the superficial layers of the superior colliculus, are used to calibrate the auditory spatial tuning of neurons in the deeper layers, ensuring that the spatial representations of these modalities share the same coordinates. Although the dominant role of vision in stimulus localization within a multisensory environment is also apparent at a behavioural level, recent studies indicate that loss of visual function can lead to compensatory improvements in spatial hearing.



Spatial representation and crossmodal compensation in blind humans

Brigitte Röder (1), Frank Rösler (1),

Wolfgang Teder-Sälejärvi (2), Steven A. Hillyard (2), & Helen J. Neville (3)

1 - Philipps-University Marburg, Germany

2 - University of California, San Diego, USA

3 - University of Oregon, Eugene, USA

Different sensory systems provide both complementary and redundant information. Spatial location is an excellent medium for linking inputs from different modalities. It has been hypothesized that, although most sensory systems provide some spatial information, the visual sense instructs the remaining senses in setting up spatial representations. Here we report that congenitally blind humans had the same reaction time functions as sighted controls in spatial imagery tasks. Furthermore, the blind showed a task-related parietal activation in a mental rotation task, as sighted participants did, suggesting that the same cortical representations and processes were activated. Moreover, using an auditory spatial attention task, we found that blind people localize sounds as well as or even better than sighted controls. Concurrently recorded event-related potential measures indicated a more sharply tuned early spatial selection mechanism in the blind for sound sources which they localized more precisely than sighted participants. These findings demonstrate that spatial concepts develop in the absence of visual input, that is, there is no visual guidance required to build up spatial representations. Furthermore, auditory spatial maps seem to be refined in the absence of visual input, indicating crossmodal compensation in the blind.





The developmental appearance of multisensory integration in the superior

colliculus (SC) is dependent on the maturation of cortical influences

Mark T. Wallace & B. E. Stein

Department of Neurobiology and Anatomy, Wake Forest University

School of Medicine, Winston-Salem, NC 27157, USA

The SC receives convergent visual, auditory and somatosensory inputs, and synthesizes these multisensory inputs in order to guide behavior. At the neuronal level, this synthesis can be manifested as a marked enhancement in responses to crossmodal stimulus combinations. In the adult, this enhancement has been shown to be dependent on inputs from the cortex surrounding the anterior ectosylvian sulcus (AES). In the current study we examined the development of this crossmodal capability and its dependence on cortical input by recording from the SC of cats at different postnatal ages and reversibly deactivating the AES. Three postnatal stages appear to characterize the multisensory development of the SC. In the first (0-12 dpn), multisensory neurons are absent. In the second (12-28 dpn), multisensory neurons appear but are unable to synthesize crossmodal cues, and these neurons fail to receive influences from the AES. In the final stage, beginning at 28 dpn and extending into the third postnatal month, multisensory neurons exhibiting adult-like integration appear in concert with AES corticotectal influences. Thus, the final stage of multisensory SC development is one in which a functional corticotectal link is established between the AES and the SC. The maturation of this influence corresponds closely with the peak period of cortical plasticity, suggesting that the genesis of these corticotectal influences, and hence the onset of SC multisensory integration, occurs only after cortex is capable of exerting experience-dependent control over these SC processes.





What, when and where of crossmodal effects in the perception of affect

Beatrice de Gelder (1, 2), Jean Vroomen (1), & Gilles Pourtois (1, 2)

(1) Psychonomics Laboratory, University of Tilburg, Tilburg, THE NETHERLANDS

(2) Laboratory of Neurophysiology, University of Louvain, Brussels, BELGIUM

Speech communication is based on perceiving the spoken sounds as well as facial gestures. Affect communication represents a very similar case of bimodal input. As a first step to explore the integration of affective information provided simultaneously in the auditory and the visual modality we used a bimodal perception situation modeled after studies of audio-visual speech perception. The evidence suggests that tone of voice biases the subjects’ judgement of facial expression (de Gelder and Vroomen, in press). Further behavioral studies showed that the crossmodal bias effect is not under attentional control. Moreover, in order to obtain a bias effect from the face on the voice it is not necessary that viewers be conscious of the facial expression. A crossmodal bias effect was observed in a visual agnosia patient after loss of overt recognition of facial expressions due to occipito-temporal damage (patient AD) and in a patient with damage to striate cortex who shows blindsight for facial expressions (patient GY) (de Gelder et al., in press; de Gelder et al., submitted). Two major questions concerning these crossmodal effects are the time course and the neuroanatomical implementation. To address the ‘when’ question we recorded electric brain responses to emotional multimodal stimulations (i.e., angry voice associated with angry or sad face) and we showed early perceptual processing (at 178 ms after stimulus onset) of the emotional content of the multimodal stimulus, reflected by a negative electric brain response compatible with the parameters of the MMN (de Gelder et al., 1999). The ‘where’ question was approached in an event-related fMRI study.


1. de Gelder, B., and Vroomen, J. (In press). The perception of emotions by ear and by eye. Cognition and Emotion.

2. de Gelder, B., Pourtois, G., Vroomen, J., and Bachoud-Levi, A.-C. (In press). Covert processing of faces in prosopagnosia is restricted to facial expressions: evidence from crossmodal bias. Brain and Cognition.

3. de Gelder, B., Vroomen, J., Pourtois, G., and Weiskrantz, L. (Submitted). Non-conscious recognition of affect in the absence of striate cortex. Neuroreport.

4. de Gelder, B., Böcker, K.B.E., Tuomainen, J., Hensen, M., & Vroomen, J. (1999). The combined perception of emotion from voice and face: Early interaction revealed by human electric brain responses. Neuroscience Letters, 260, 133-136.



Last-minute cancellations

Intra-modal and crossmodal spatial attention to auditory and visual stimuli


Wolfgang A. Teder-Sälejärvi

Department of Neurosciences, University of California, San Diego, USA

This study investigated crossmodal interactions in spatial attention by means of recording event-related brain potentials (ERPs). Noise bursts and light flashes were presented in random order to both left and right field locations separated by 60° in free-field. One group of subjects was instructed to attend selectively to the noise bursts (attend-auditory group), and a second group attended only to the flashes (attend-visual group). On different runs attention was directed to either the right or left field stimuli of the designated modality. In the attend-auditory group, noise bursts at the attended location elicited a broad, biphasic negativity (Nd) beginning at 70 ms. The crossmodal spatial attention effect on the auditory ERPs in the attend-visual group was very similar in morphology, albeit of smaller amplitude. In the attend-visual group, flashes at the attended location elicited enhanced early (100-200 ms) and late (200-350 ms) ERP components relative to unattended-location flashes. The crossmodal effect in the attend-auditory group included small but significant enhancements of early components of the visual ERPs. It was concluded that spatial attention has a multimodal organization such that the processing of stimuli at attended locations is facilitated at an early, sensory level, even for stimuli of an unattended modality.



The neural processes underlying synesthesia: Data and theory

Peter G. Grossenbacher

Section on Clinical and Experimental Neuropsychology, Laboratory of Brain

and Cognition, National Institute of Mental Health, 15 North Drive MSC 2668,

Building 15-K Rm 105A, Bethesda, MD 20892-2668, USA

Synesthesia is the conscious experience of sensory attributes that are not experienced by most people when receiving comparable stimulation. For example, in colored-hearing synesthesia, the perception of sound induces a phenomenal experience of color. There have been several recent findings that shed considerable light on the nature of synesthesia and its neurocognitive underpinnings. Patterns of familial aggregation indicate a clear genetic component to synesthesia, suggesting a neurobiological basis. A PET study measuring regional cerebral blood flow found activation of visual cortical areas during auditory stimulation in colored-hearing synesthetes but not in non-synesthetes, indicating cortical involvement. Brain electrical activity related to synesthesia-inducing sensory stimulation shows differences between synesthetes and non-synesthetes by 200 ms after stimulus onset. Multiple studies using Stroop-type experiments have found that synesthesia-induced color phenomena automatically interfere with naming the colors of chromatic stimuli. I explain all these findings with a theory that suggests that synesthesia results from disinhibition of feedbackward neural connections that project from multisensory cortical areas to the unisensory cortical areas which mediate the synesthetic phenomena. This theory only posits anatomical connections that have already been shown to exist in primates, and it accounts for drug-induced as well as developmental types of synesthesia.





Crossmodal Attention & Multisensory Integration

Poster Session 6.15-7.30 pm Friday 1st October,

Main Dining Hall, Somerville College, Oxford



Poster Abstracts

(Note that posters should be in place by lunchtime on the 1st October)





(1) Exogenous tactile-visual attention within hemifields:

Increased spatial specificity when arms are visible


Steffan Kennett (1, 2) & Jon Driver (2)

(1) Birkbeck College, University of London, UK

(2) Institute of Cognitive Neuroscience, University College London, UK

Spatially nonpredictive tactile events can lead to a cued-side advantage in subsequent visual judgements. Here we investigate whether this advantage is hemifield wide or more specifically located close to the cued hand within a hemifield. Visual targets were presented in one of four lateral positions and hand position was manipulated between two postures (hands next to "inner" targets vs. "outer" targets). In experiments with small inner/outer separations, within-hemifield effects were found only when hand position was visible: In particular, outer visual targets were best cued by ipsilateral outer tactile events. In a further experiment inner and outer locations were very widely separated. Now, even with hand position unseen, outer visual targets were better cued by outer tactile events. This indicates that proprioception alone can provide crude spatial information for tactile-visual exogenous cueing, while greater precision is obtained when arm position is also seen. Cued visual regions surrounding the stimulated hand may mediate the within-hemifield cueing effects. Analogous findings have been reported from single-unit recording, which has revealed cells with tactile receptive fields on the hand and visual receptive fields that are locked to the current hand position.






(2) The effects of background and sound source location

on auditory facilitation of visual target acquisition


Melanie C. Doyle* & Robert J. Snowden

School of Psychology, Cardiff University, Cardiff, CF1 3YG, UK

*MCD now at Royal Holloway, University of London

Irrelevant sound facilitates visual search for targets located within 15 degrees of fixation (e.g., Perrott, Saberi, Brown & Strybel, 1990). We compared the effects of central and target-congruent sounds on search for targets in empty and cluttered displays. Target and distractor stimuli were red or green, horizontal or vertical rectangles: the target could be differentiated from the distractors by its relatively late onset. Sound was presented at fixation or at the target location, and the onset of the sound was simultaneous with target appearance. Stimulus duration was limited to 150 msec to preclude overt orienting. Observers were required to detect the target, or to discriminate its orientation or conjunction of colour and orientation. Auditory facilitation was influenced by task demands; however, sound source location had no effect upon the magnitude of facilitation observed. Background clutter had little effect upon the pattern of results, which suggests that 'pop-out' had occurred on the basis of temporal factors alone. Sound can facilitate covert orienting to visual targets in spatially limited displays (e.g., Doyle & Snowden, 1998, Perception, 27 supp: 134); however, in the present experiment target salience may have modified auditory facilitation.


Perrott, D. R., Saberi, K., Brown, K., & Strybel, T. Z. (1990). Auditory psychomotor coordination and visual search performance. Perception & Psychophysics, 48, 214-226.




(3) A comparison of visual and auditory

attention in the horizontal plane and in depth

Jason S. Chan, Agavni Petrosyan, & David R. Perrott

Psychoacoustics Lab., Psychology Department, California State University, Los Angeles

5151 State University Drive, Los Angeles, Ca. 90032, USA

Previous studies examining visual and auditory attention used cue and target stimuli that were either contralateral or ipsilateral to the subject. They did not examine the effects of the various degrees between the contralateral and ipsilateral sides. In this study, we examined visual and auditory attention using a visual cue-target, an auditory cue-target, a visual cue and auditory target, and an auditory cue and visual target. In the horizontal plane, the auditory and visual stimuli were each separated by 4.5 degrees. We also examined the same conditions in depth. The cue was presented for one second, followed by an ISI of 100 ms, and then the target was presented either in the same location or in a different location. Results show a latency in reaction time in the visual cue-target condition when they both appeared in the same location. There were no significant findings in the auditory conditions. In depth, there were significant latencies in reaction time when both stimuli were presented in the same field. However, reaction times did not differ significantly between fields when the cue was presented at the nearest or farthest field. Between crossmodal conditions, where the cue was a light and the target was a sound or vice versa, there were some significant results.




(4) Can the ventriloquism illusion facilitate audiovisual speech perception?


Stuart Leech

Department of Experimental Psychology, University of Sussex, Falmer, BN1 9QG

Five experiments attempted to replicate Driver (1996). Driver found that an audiovisual speech stream was better identified when the ventriloquism illusion caused the target auditory speech to be mislocated to a position spatially displaced from distractor auditory speech. Experiment 1 replicated this finding, but the size of the effect was smaller than Driver’s. Experiment 2 failed to replicate the effect using smaller audiovisual spatial separations. Experiment 3 failed to replicate the effect using a constant spatial position for the target and distractor auditory streams. In Experiment 4 subjects again performed the identification task, but were also asked to judge which of 9 possible spatial locations the target auditory speech was coming from. This experiment failed to replicate Driver’s effect, despite the fact that a significant ventriloquism illusion was present, as measured by subjective report. Experiment 5 investigated how identification was affected by audiovisual spatial separation in the absence of distractor speech, finding that, in this case, spatial displacement of the auditory and visual speech stimuli had no significant effect.


(5) The integration of auditory and visual attention in speech recognition

Fussell, C. and Culling, J. F.

School of Psychology, Cardiff University, P.O.Box 901, Cardiff, CF10 3YG, UK

Speech intelligibility was measured in conditions of physical separation between target and masking speech and of perceived separation produced by a ventriloquism effect (Driver, 1996). Speech reception thresholds (SRTs) were measured against interfering speech using an adaptive procedure in four lip-reading conditions. The video monitor, bearing the lip movements for the target speech, was placed in location A. Loudspeakers were located at A, and at B, 45 degrees to the right. Target and interfering speech came either from the same loudspeaker (conditions AA and BB) or from different loudspeakers (AB and BA). The listeners attended to the monitor and repeated the target sentences. SRTs were 10 dB lower in condition AB than in the other conditions. No other differences were significant. The results support the importance of spatial correspondence between auditory and visual information (AB < BA). However, they did not replicate Driver's ventriloquism effect (BB = AA). In the presence of visual lip-reading information, the spatial separation of target and interfering voice had no independent effect (BA = AA).


(6) The perceptual lateralisation of non-speech audiovisual stimuli


Nigel J. Holt

Auditory Lab., Department of Psychology, Early Gate, University of Reading, Reading, UK


A correspondence between visually and auditorily presented simple non-speech sounds was demonstrated using a lateralization paradigm. Mean lateralization judgements of auditory, visual and audio-visual stimuli with spatio-temporally corresponding components did not differ significantly, although standard deviations in lateralizations of audio-visual stimuli with spatio-temporally corresponding components were significantly smaller than those of judgements of auditory or visual stimuli. The assessment of lateralizations of audio-visual stimuli with spatially non-corresponding components followed a spatial threshold task. A dominance of the visual modality was shown (cf. Radeau and Bertelson 1977), but mean standard deviations in lateralizations increased as a function of audio-visual spatial mismatch. Audio-visual temporal mismatch difference limen measurements suggested a 50 ms visual processing lag (cf. Poppel 1988). Lateralizations of audio-visual stimuli with asynchronous modal components showed an increase in standard deviation with asynchrony. Mean lateralizations of audio-visual stimuli with spatio-temporally non-corresponding components showed an influence of the position of the auditory component. The relative influence of the auditory component is discussed in terms of stimulus unpredictability as a result of the simultaneous variation of audio-visual spatial and temporal correspondence, resulting in a weakened unity assumption.



(7) Interactions between the perception of visual and auditory movement

Jane E. Aspell, Bramwell, D.I., Green, G.G.R., & Hurlbert, A.C.

Department of Physiological Sciences, Medical School, Framlington Place,

Newcastle Upon Tyne, NE2 4HH, UK

Purpose: To investigate interactions between the perception of visual and auditory movement.

Methods: Observers viewed random dot kinematograms on a CRT display while listening to sound movement simulated by interaural amplitude modulation (IAM) of a 500 Hz tone over headphones. Each trial began with a "noise" phase, during which the dots moved at random while the sound remained stationary, followed by the "coherent" phase, during which a percentage of randomly selected dots moved together leftwards or rightwards, while the sound moved in the same or opposite direction. Observers maintained visual fixation on a central marker throughout. In the visual task, observers ignored the sound and reported the direction of coherent dot movement. In the auditory task, observers reported sound movement direction. The degree of IAM and the percentage coherence were both varied independently in all combinations from sub- to supra-threshold levels.

Results: The presence of concurrent visual motion enhanced performance on the auditory task, whereas conflicting visual motion degraded performance. Both effects occurred at all levels of sound movement and increased with increasing levels of visual motion coherence. For 2/3 observers, sound movement had analogous effects on the visual task. The effects cannot be explained by probability summation of two independent mechanisms (visual and auditory) signalling motion direction.
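As a rough illustration of the probability-summation benchmark referred to above, the sketch below computes the accuracy expected if independent visual and auditory mechanisms each signalled direction on their own. It assumes a two-alternative direction task and a high-threshold correction for guessing; neither the formulation nor the function names come from the abstract, and the authors may have used a different calculation.

```python
def detect_prob(p_correct, guess=0.5):
    """Convert proportion correct into a detection probability.

    Assumes a high-threshold model with a fixed guessing rate
    (0.5 for a left/right judgement). This is an illustrative choice.
    """
    return max(0.0, (p_correct - guess) / (1.0 - guess))


def probability_summation(p_visual, p_auditory, guess=0.5):
    """Predicted proportion correct for two independent mechanisms.

    The direction is 'detected' if at least one mechanism detects it;
    otherwise the observer guesses.
    """
    d_v = detect_prob(p_visual, guess)
    d_a = detect_prob(p_auditory, guess)
    d_both = 1.0 - (1.0 - d_v) * (1.0 - d_a)
    return guess + (1.0 - guess) * d_both


if __name__ == "__main__":
    # Hypothetical single-modality accuracies, not data from the study.
    print(probability_summation(0.75, 0.75))
```

Under these assumptions, two mechanisms that each give 75% correct alone would be predicted to give 87.5% correct together. Bimodal performance reliably above such a prediction is one kind of evidence against independent mechanisms.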



(8) Audiovisual integration affects perception of apparent motion and momentum


Victor Berrio (1), Juan Lupiáñez (1), Charles Spence (2), & Francisco Martos (1)

(1) Universidad de Granada, SPAIN

(2) University of Oxford, UK

In this study we investigated whether an auditory event affects the representational momentum of two approaching visual objects. Two coloured discs were presented at the top of a computer screen on each trial (one to the left and the other to the right). The two discs (red and blue) started moving to the centre at a constant speed. Before arriving at the centre, the discs disappeared behind a black occluder, continued their motion, and reappeared below the occluder still moving at the same speed. On an unpredictable half of the trials the discs reappeared on the same side as they had approached the occluder (bounce condition), while on the remainder of trials the discs reappeared on the opposite side (cross condition). Participants were to localise one of the discs after it reappeared from behind the occluder, by pressing the corresponding key as fast as possible. On half of the trials a meaningless sound was presented when the two discs were supposedly behind the occluder. Participants were faster at localising the disc in the cross condition, but this momentum effect was eliminated by the presence of the sound. In the first experiment we examined the effect of the size of the approach angle on the ability of the sound to eliminate the momentum effect. In the second experiment we manipulated the speed of the two discs. Momentum increased with speed but was maximally reduced by the sound at medium speed and when the two discs approached diagonally at a large angle. In a third experiment we introduced a change in the colour of the occluder instead of the sound. This stimulus change had no effect on the momentum, thus indicating that the effect observed in the previous experiments is specific to audio-visual integration, in which the sound might be interpreted as a collision between the two discs.


(9) Temporal window for audiovisual interaction

revealed with an ambiguous motion display

Katsumi Watanabe & Shinsuke Shimojo

Division of Biology, California Institute of Technology, Pasadena, CA 91125, USA

Identical visual targets moving across each other with equal, constant speed can be perceived either to bounce off or to stream through each other at the coincidence point. Sound presented at the coincidence enhances the bouncing percept (Sekuler et al., 1997). We explored this bounce-inducing effect to investigate the temporal tuning of audio-visual interactions and the effect of auditory context/saliency in motion-event perception. A single sound or a train of sounds was presented at various timings while observers looked at the streaming/bouncing ambiguous motion. A single sound presented in the range between 250 ms before and 150 ms after the visual coincidence enhanced the bouncing percept. However, this bounce-inducing effect was attenuated when another identical sound preceded the coincidence-synchronized sound by about 250 ms. Additionally, whereas a single sound embedded in a sequence of identical sounds failed to show the bounce-inducing effect, this attenuation effect disappeared (i.e., the bounce-inducing effect recovered) if the intensity or frequency of the coincidence-synchronized sound was different from the repetitive sounds, so that it stood out as an odd-man-out from the auditory context. These results suggest that the audio-visual interaction for the bounce-inducing effect is broadly tuned in time and dependent on the salience of the sound in its auditory context.





(10) Auditory and visual blinks in attention do not span across modalities

Salvador Soto-Faraco (1) & Charles Spence (2)

(1) Dept. de Psicologia Basica, Universitat de Barcelona,

Campus de la Vall d'Hebron, Pg. de la Vall d'Hebron, 171, Barcelona, SPAIN

(2) Department of Experimental Psychology, Oxford University, UK

The attentional blink (AB) is a temporal processing deficit that impairs normal processing for the second of two targets requiring response in a rapid serial visual presentation (RSVP) stream of irrelevant items (e.g., Raymond et al., 1994). The AB phenomenon has been thoroughly demonstrated within vision. However, recent studies of AB using pure auditory or mixed audiovisual presentations have yielded divergent results that, in our view, might be attributed to various methodological confounds. First, visual and auditory stimuli were presented from different spatial locations (e.g., Potter et al., 1998; Jolicoeur, 1998). Second, in most previous studies participants had to perform different tasks on each of the two targets (e.g., Jolicoeur, 1998). These circumstances might have confounded true attentional blinks with reallocation of attention in space or task-switch costs (Potter et al., 1998), making the interpretation of results uncertain. We report a series of experiments in which the task is unspeeded identification of pairs of digits presented among two rapid streams of letter distractors (one visual and one auditory) presented from a common spatial location. When participants had to recall both auditory and visually presented digits, only unimodal but not crossmodal AB was observed. When participants had to attend to only one modality, and ignore the other, again the unimodal blink was observed in both audition and vision, with no trace of crossmodal blink. We conclude that the attentional blink exists for both auditory and visual sensory modalities, but when spatial displacement and task-switch are suppressed, the blink does not span from vision to audition or vice versa.



(11) The role of attention in auditory and visual interaction

Geoff Patching

Department of Psychology, University of York, Heslington, York YO10 5DD, UK

Four experiments examined the speeded classification of bimodal stimuli in the Garner task under conditions of divided and focused attention. A primary aim was to investigate the role of attention in processing concurrently presented auditory and visual signals. Within condition interference (i.e., congruence effects) obtained consistently in tasks demanding either divided or focused attention. In contrast, between conditions interference (i.e., Garner interference) obtained only in tasks demanding divided attention. In all, these data suggest that different forms of attentional involvement underlie Garner interference and congruence effects. Implications for current theoretical frameworks of multidimensional interaction are discussed.




(12) Crossmodal interference of touch on vision:

Spatial coordinates and visual capture

Mark Walton & Charles Spence

Department of Experimental Psychology, Oxford University, UK

The concept of visual dominance (see Klein, 1977; Posner, Nissen, & Klein, 1976) assumes that people's performance is controlled by visual information. We report evidence showing that tactile distractors can capture visual attention on an elevation discrimination task, and that this effect is spatially modulated. Crossmodal interference (difference between congruent and incongruent trials) was significantly larger when visual and tactile stimuli appeared on the same side than on different sides (Experiment 1). A further condition with a crossed-hands posture (Experiment 2) revealed a partial remapping of tactile and visual space, with equal interference effects whether the visual and tactile stimuli appeared on the same or different sides. In the final study, the interaction between vision, touch, and proprioception in determining crossmodal interference was investigated by the addition of a pair of rubber hands placed in line with the real hands and holding the visual targets to induce tactile ventriloquism for the participants' own (unseen) hands. This had the effect of reducing the interference when both stimuli appeared on the same side. The results are related to theories of visual dominance, and to recent neurophysiological and neuropsychological evidence about how we integrate information across different sensory modalities.



(13) Crossmodal links in endogenous spatial attention between audition and touch

Donna M. Lloyd (1), Charles Spence (2), & Francis P. McGlone (1)

(1) Unilever Research Port Sunlight Laboratories, Quarry Road East, Bebington, Wirral, L63

(2) Department of Experimental Psychology, University of Oxford

A series of 3 experiments was designed to investigate crossmodal links in endogenous spatial attention between audition and touch. Experiment 1 demonstrated that when people were informed that targets were more likely on one side, elevation judgments were faster on that side, although it was possible to ‘split’ auditory and tactile attention when targets in the two modalities were expected on opposite sides. Experiment 2 demonstrated that people could shift attention around in either audition or touch independently. Finally, a study with crossed hands suggested that audiotactile links in spatial attention apply to common external locations, rather than simply being determined by which hemisphere information initially projects to. These results will be contrasted with previous findings regarding audiovisual (Spence & Driver, 1996) and visuotactile (Spence, Pavani, & Driver, in press) links in attention, and a possible neurophysiological substrate for the effects will be posited.


Spence, C., & Driver, J. (1996). Audiovisual links in endogenous covert spatial attention. Journal of Experimental Psychology: Human Perception and Performance, 22, 1005-1030.

Spence, C., Pavani, F., & Driver, J. (in press). Crossmodal links in spatial attention between vision and touch: Allocentric coding revealed by crossing the hands. Journal of Experimental Psychology: Human Perception & Performance.




(14) Crossmodal interference between audition and touch

Natasha Merat (1), Francis McGlone (1), & Charles Spence (2)

(1) Unilever Research Port Sunlight Laboratories, UK
(2) Department of Experimental Psychology, Oxford University, UK

Recent studies have demonstrated extensive crossmodal links in spatial attention between vision and audition, and between vision and touch (see Driver & Spence, 1998; Trends in Cog Sci, 2, 254-262). However, to date there has been little research into audiotactile links in spatial attention. We report 3 experiments designed to investigate crossmodal links between audition and touch under conditions of focused attention. Participants made speeded discrimination responses concerning the elevation (up vs. down) of auditory targets presented to their left or right, while ignoring simultaneously presented irrelevant tactile distractors presented to the anterior aspect of the elbow joint or to the thenar eminence of either hand (with the elbows resting on the table and the hands directed upward). When the elevation of the tactile distractor was incompatible with that of the auditory target, performance was both slower and less accurate than when target and distractor were presented from the same elevation. This compatibility effect was greater when target and distractor were presented from the same side than when presented to opposite sides. A similar spatial modulation of crossmodal interference (compatibility effect) was observed when the tactile distractors were presented to the thumb or index finger of either hand; though varying the position of the hands within each hemifield had little effect on performance (Experiment 2). A final study demonstrated that the crossmodal compatibility effects are unaffected by crossing the hands (i.e., by placing the left hand in the right hemispace, and the right hand in the left hemispace), suggesting that these audiotactile links in attention are modulated by the initial hemispheric projections of the target and distractor, rather than their position in external space. Taken together, these results show extensive crossmodal links in focused attention between audition and touch.




(15) Crossmodal comparisons of approximate numerosity

Hilary Barth, Kanwisher, N., & Spelke, E.

Dept. of Brain and Cognitive Sciences, Bldg. NE20-443C, MIT, Cambridge, MA, USA

Studies of nonhuman animals, human infants, and adults provide evidence for the existence of an abstract sense of number. Yet visual judgments of numerosity are greatly affected by certain visual properties of the stimulus set, such as regularity or area. Similarly, auditory numerosity judgments are affected by changes in the location and frequency of stimuli. Are these non-numerical properties used as cues in the formation of an abstract representation of magnitude, or are they themselves the bases of supposed "numerosity" judgments? If the latter is true, then comparing magnitude representations across modes of presentation should cause large performance deficits. Previous research on animals and human infants suggests that there is intermodal transfer of numerosity under some conditions, but these results are equivocal. We assessed adults' ability to compare approximate numerosities from sets of events presented crossmodally. Preliminary results suggest that there is little or no cost of comparing numerosities of visual and auditory sets, nor is there a cost for comparisons across spatial and temporal presentations. Our findings support the existence of an abstract number sense that is not tied to the modality or format of the stimulus.




(16) fMRI investigation of brainstem and cortical regions

activated during an exogenous tactile attention task

F. McGlone (1, 3), S. Francis (2),

D. Lloyd (1), J. Newton (2) & R. Bowtell (2)

(1) Neuroscience Group, Unilever Research, Wirral, L63 3JW, UK

(2) MR Unit, Univ. of Nottingham, NG7 2RD, UK

(3) School of Psychology, Univ. of Wales, Bangor LL57 2DG, UK

Tactile attention can be oriented voluntarily to a stimulated body location, or reflexively captured by an unexpected tactile stimulus, such as a touch on the shoulder. These effects have been studied primarily with behavioural measures of RT, and we have recently established that the inhibition of return effect described by Posner & Cohen (1984) for a visual stimulus, when it is presented after a delay at the same spatial location as a previously unattended cue, is also present in the tactile domain (Lloyd et al., 1999). In this study 6 subjects were imaged (3.0 T EPI/BOLD) during a tactile exogenous orienting task in which uninformative, non-predictive vibratory cues were delivered bilaterally to the feet, followed by targets (SOAs of 200 & 700 msec) at the cued or uncued locations. Stimuli were randomised and RTs collected. Significant activation was observed in the superior colliculus, posterior parietal areas, M1, SMA, insula, cingulate, DLPFC & cerebellum (dentate). A fronto-parietal attentional system similar to that described for vision is outlined, and activation states will be discussed in relation to behavioural measures of inhibition.


(17) Integration of exogenous and endogenous information in the superior colliculus


Thomas P. Trappenberg

RIKEN Brain Science Institute, JAPAN

There is only one location to which we can direct our gaze despite the fact that many potential targets may be present. Selecting the target for a saccade entails the integration of multiple sources of information. We classify such information sources into two broad, conceptually defined classes using the terms exogenous to refer to visual inputs, and endogenous to refer to voluntary inputs which are dependent on instruction. We discuss the role of the superior colliculus as a possible site of the integration of such information. We propose that the integration of such information, leading to the initiation of saccades, is achieved within the superior colliculus through a mechanism of dynamic competition. We outline a simple yet powerful model that is able to represent single cell data and models a variety of well known behavioral effects. Special attention is given to the determination of the interaction structure within the superior colliculus. In addition we outline the mechanisms behind some behavioral effects of saccade initiation in gap/overlap, distractor, target probability and antisaccade paradigms. We further speculate that a similar mechanism might account for some attentional effects through competitive interactions within LIP.




(18) Cognitive evoked potentials to pure vibrotactile stimuli:

Two tactile attention studies

Veena Natarajan & Francis McGlone

Unilever Research Port Sunlight, Quarry Road East, Bebington, Wirral, Merseyside, L63 3JW, UK

Pure vibrotactile stimuli have rarely been used in evoked potential studies because of poor signal-to-noise ratio and practical complexities such as rise and fall times. Attention studies exploring event related potentials to these stimuli have been carried out in this laboratory. Initial experiments used a classic oddball paradigm with frequent and rare vibrotactile stimuli delivered to the thenar of the hands. A large positivity maximal over parietal electrodes was observed to the rare tactile stimuli, suggesting a tactile P300 potential. We have also explored tactile attention using the classical Posner paradigm previously employed in visual attention studies. The studies have used an exogenous attention paradigm with tactile cues and targets at a variety of stimulus onset asynchronies (SOAs). Twelve subjects were asked to respond to target stimuli on the thenars of the left and right hands while ignoring weaker tactile cues at the same locations. Amplitude differences in the potentials evoked after target stimuli were observed depending on the spatial location of the preceding irrelevant cue. The results appear to corroborate behavioral studies on inhibition of return to tactile targets preceded by tactile cues with long SOAs (Lloyd et al., 1999). These data suggest evoked potentials can be specifically correlated with tactile attentional processes, as has been demonstrated for the modalities of vision and audition.


(19) Selective regions of activation to tone modulation and auditory attention

Debbie A. Hall, Akeroyd MA (1), Haggard MP (1),

Summerfield AQ (1), Palmer AR (1), Elliott MR (2), & Bowtell RW (2),

(1) MRC Institute of Hearing Research, University Park, Nottingham, NG7 2RD

(2) Magnetic Resonance Centre, School of Physics and Astronomy, University of Nottingham

Active listening elicits a different sensory response than passive listening. In neuroimaging, this is generally observed as an increase in the magnitude of activation. Sensory activation differences may therefore be masked by the effect of attention. We report a study that measured activation induced by static and modulated tones, whilst controlling attention by using target detection and passive listening tasks. The factorial design enabled us to determine whether the stimulus-induced activation in auditory cortex was independent of the information-processing demands of the task. Listening conditions induced widespread activation in the temporal cortex, including Heschl's gyrus (HG), planum temporale, superior temporal gyrus (STG) and superior temporal sulcus. The primary auditory area is located on HG, while secondary auditory areas surround it. No additional auditory areas were recruited in the analysis of modulated tones compared to static tones. However, modulation increased the amplitude of the response in the STG, anterior to HG. Relative to passive listening, the active task increased the response in the STG, posterior to HG. The active task also recruited a network of regions in the frontal and parietal cortex and sub-cortical areas. These findings indicate that preferential responses to stimulus modulation and the attention-demanding, active listening task involve distinct, non-overlapping areas of the secondary auditory cortex. Thus, in this study, differences in sensory activation were not masked by effects of attention.




(20) Evidence for perceptual asymmetries in audition

Rhodri Cusack & Robert P. Carlyon

MRC-Cognition & Brain Sciences Unit, 15 Chaucer Road, Cambridge, CB1 3NT, UK

Many studies have investigated perceptual asymmetries in visual search tasks, but comparatively little work has been done in audition. Here, we present three experiments suggesting strong perceptual asymmetries in the auditory domain. We show that a target warble (frequency modulated tone) is much easier to detect amongst steady distractors than vice-versa. In one paradigm, all sounds were presented sequentially, and the target was close to threshold. In the second, directly analogous to visual search experiments, many sounds were randomly distributed in both frequency and time, and the target was highly supra-threshold. Little "peripheral" masking was expected in either case. To investigate whether this effect is specific to frequency modulation (perhaps through adaptation or interference in a "modulation filterbank"), we conducted a third experiment in which the target was distinguished by duration. Again, a strong perceptual asymmetry was found, with long targets being much easier to select from short distractors than vice-versa. The most parsimonious explanation for all three experiments is that it is much easier to selectively attend to a target when it either contains a feature not present in the distractors, or when its magnitude on a relevant dimension (e.g. duration) exceeds that of the distracting stimuli.




(21) Crossmodal semantic priming in normals and schizophrenic patients

AS David, SA Surguladze, SL Rossell, S Rabe-Hesketh, & R Campbell

Institute of Psychiatry, De Crespigny Park, London SE5 8AF, UK

Certain schizophrenic symptoms suggest an excessive influence of information from one modality (e.g., visual) on perception in another (e.g., auditory). Twenty schizophrenia patients and 26 age/sex matched normal controls performed a primed lexical decision task in ipsimodal (visual-visual) and crossmodal (auditory-visual) prime-target conditions. Half the prime-target word pairs (30 out of 60) were semantically related. Primes (visual or auditory) were presented for 250 ms, followed after an ISI of 150 ms by a target. Subjects responded with a button press if the target was a real word (but not if it was a non-word). Reaction times for target words preceded by a semantically related prime were significantly faster than those for unrelated targets. This was observed in both ipsimodal and crossmodal conditions, but was more marked in the crossmodal condition. Schizophrenics showed generally increased priming, especially crossmodal (i.e., there was a significant diagnosis x priming x modality interaction), which survived various data transformations. These results suggest that spread of semantic associations across modalities is remarkably unconstrained in general, and in schizophrenia in particular, which could be interpreted as a loss of "informational encapsulation".





(22) Influence of auditory stop-signals on visually guided saccadic eye movements

Hans Colonius & Jale Oezyurt

Universität Oldenburg, Oldenburg, GERMANY

In a stop-signal task subjects are instructed to perform a reaction time (RT) task, but must withhold their response whenever a stop-signal is presented. Stop-signals occur unpredictably at different time points after presentation of the go-signal. Here, the effect of auditory stop-signals on saccades towards a visual target was investigated. RT data for two of the three subjects were found to be in general agreement with the horse-race model of Logan & Cowan (1984) which holds that RTs and stopping performance are exclusively determined by the relative finishing times of hypothetical go- and stop-processes of (stochastically) independent duration. However, small but significant changes in movement parameters such as hypometric saccadic amplitudes and reduced peak velocities in signal-respond trials (i.e., responses in the presence of a stop-signal) suggest a violation of the independence assumption of the model. The spatial position of the auditory stop-signal was varied via a virtual acoustic environment, but it did not show an effect on performance.
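The independent horse-race account described above can be sketched as a small simulation: on a stop trial the response escapes inhibition whenever the go process finishes before the stop process, that is, when the go RT is shorter than the stop-signal delay plus the stop-signal reaction time (SSRT). The Gaussian go-RT distribution, the constant SSRT and all parameter values below are illustrative assumptions and are not taken from the study.

```python
import random
import statistics


def simulate_race(ssd_ms, n_trials=5000, ssrt_ms=120.0,
                  go_mean_ms=250.0, go_sd_ms=40.0, seed=0):
    """Simulate stop trials under an independent race model.

    ssd_ms: stop-signal delay (time from go-signal to stop-signal).
    Returns (P(respond | stop-signal), mean signal-respond RT,
    mean no-signal RT). All parameter defaults are hypothetical.
    """
    rng = random.Random(seed)
    # Go-process finishing times; unaffected by the stop-signal,
    # which is the independence assumption of the model.
    go_rts = [max(0.0, rng.gauss(go_mean_ms, go_sd_ms))
              for _ in range(n_trials)]
    # The stop process finishes at SSD + SSRT (constant SSRT assumed).
    cutoff = ssd_ms + ssrt_ms
    escaped = [rt for rt in go_rts if rt < cutoff]
    p_respond = len(escaped) / n_trials
    mean_escaped = statistics.mean(escaped) if escaped else float("nan")
    return p_respond, mean_escaped, statistics.mean(go_rts)


if __name__ == "__main__":
    for ssd in (50, 100, 150, 200):
        p, rt_sr, rt_go = simulate_race(ssd)
        print(ssd, round(p, 3), round(rt_sr, 1), round(rt_go, 1))
```

In this sketch the probability of responding rises with stop-signal delay, and signal-respond RTs are on average faster than no-signal RTs, because only the fast tail of the go distribution escapes. The model predicts nothing about changes in movement parameters, which is why the altered saccade amplitudes and peak velocities reported above bear on the independence assumption.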


(23) The Attentional Blink and P300

Genevieve McArthur (1,2), Timothy Budd (1), & Pat Michie (1)

(1) Department of Psychology, University of Western Australia, Nedlands, 6907, WESTERN AUSTRALIA
(2) Department of Experimental Psychology, University of Oxford, South Parks Road,



Both the "attentional blink" (AB) and the P300 have been hypothesised to represent a short period of inhibition following attention to a brief visual stimulus. Three experiments were conducted to determine whether there is an association between these two phenomena. Results indicated that (1) the AB and P300 follow a similar time course at the individual and group level, (2) reducing task difficulty has similar effects on AB and P300 magnitude at the group level, and (3) there is no relationship between the magnitude of AB and P300 within observers. These findings suggest a moderate association between the two phenomena, which may mirror transient inhibition of cortical networks to facilitate processing of target events.