The work reported here investigated whether the extent of the McGurk effect differs according to vowel context, and whether it differs when cross-modal vowels are matched or mismatched in Japanese. Two audio-visual experiments were conducted to examine the process of audio-visual phonetic-feature extraction and integration. The first experiment compared the extent of the McGurk effect in Japanese across three vowel contexts. The results indicated that the effect was largest in the /i/ context, moderate in the /a/ context, and almost nonexistent in the /u/ context. This suggests that the occurrence of the McGurk effect depends on the characteristics of the vowels and the visual cues provided by their articulation. The second experiment measured the McGurk effect in Japanese with cross-modal matched and mismatched vowels, and showed that, except with the /u/ sound, the effect was larger when the vowels were matched than when they were mismatched. These results confirmed that the extent of the McGurk effect depends on vowel context, and indicated that auditory information processing prior to phonetic judgment plays an important role in cross-modal feature integration.