A fundamental question in perceptual development is how the brain processes multisensory information during the early stages of development. Although a growing body of evidence shows that modality-specific functional differentiation of cortical regions emerges early, the interplay between sensory inputs from different modalities in the developing brain is not well understood. To study the effects of auditory input during audio-visual processing in 3-month-old infants, we evaluated the spatiotemporal cortical hemodynamic responses of 50 infants while they perceived visual objects with or without accompanying sounds. The responses were measured using 94-channel near-infrared spectroscopy over the occipital, temporal, and frontal cortices. The effects of sound manipulation were pervasive across diverse cortical regions and were specific to each region. Visual stimuli co-occurring with sound induced early-onset activation of the early auditory region, followed by activation of the other regions. Removal of the sound stimulus resulted in focal deactivation in the auditory regions and reduced activation in the early visual region, the association region of the temporal and parietal cortices, and the anterior prefrontal regions, suggesting multisensory interplay. In contrast, equivalent activations were observed in the lateral occipital and lateral prefrontal regions, regardless of sound manipulation. Our findings indicate that auditory input did not generally enhance overall activation in relation to visual perception, but rather induced specific changes in each cortical region. The present study implies that 3-month-old infants may perceive audio-visual multisensory inputs by using a global network of functionally differentiated cortical regions. Hum Brain Mapp, 2013. © 2011 Wiley Periodicals, Inc.