Recent studies have reported that preverbal infants are able to discriminate between numerosities of sets presented within a particular modality. There is still debate, however, over whether they are able to perform intermodal numerosity matching, i.e., to relate numerosities of sets presented in different sensory modalities. The present study investigated auditory–visual intermodal matching of small numerosities in infancy using a violation-of-expectation paradigm. After being familiarized with events in which a few objects successively impacted a surface, 6-month-old infants were alternately presented with two and three tones while the movement of each object remained hidden behind an opaque screen. The screen was then removed to reveal either two or three objects. Results showed that the infants looked significantly longer at the numerically nonequivalent events (the three-tone/two-object and the two-tone/three-object events) than at the numerically equivalent events (the two-tone/two-object and the three-tone/three-object events), irrespective of the rate or duration of the tones presented. These findings suggest that infants are capable of performing intermodal matching of small numerosities and that they might possess abstract representations of numerosity beyond sensory modalities.