All languages employ certain phonetic contrasts when distinguishing words. Infant speech perception is rapidly attuned to these contrasts before many words are learned; thus, phonetic attunement is thought to proceed independently of lexical and referential knowledge. Here, evidence to the contrary is provided. Ninety-eight 9-month-old English-learning infants were trained to perceive a non-native Cantonese tone contrast. Two object–tone audiovisual pairings that highlighted the target contrast were consistently presented (Object A with Tone X; Object B with Tone Y). Tone discrimination was then assessed. Results showed improved tone discrimination if object–tone pairings were perceived as being referential word labels, although this effect was modulated by vocabulary size. Results suggest how lexical and referential knowledge could play a role in phonetic attunement.