Gestures are an important part of interpersonal communication, for example by illustrating physical properties of speech contents (e.g., “the ball is round”). The meaning of these so-called iconic gestures is strongly intertwined with speech. We investigated the neural correlates of the semantic integration of verbal and gestural information. Participants watched short videos of five speech and gesture conditions performed by an actor, including variation of language (familiar German vs. unfamiliar Russian), variation of gesture (iconic vs. unrelated), as well as isolated familiar language, while brain activation was measured using functional magnetic resonance imaging. For familiar speech with either gesture type contrasted with Russian speech-gesture pairs, activation increases were observed at the left temporo-occipital junction. Apart from this shared location, speech with iconic gestures exclusively engaged left occipital areas, whereas speech with unrelated gestures activated bilateral parietal and posterior temporal regions. Our results demonstrate that the processing of speech with speech-related versus speech-unrelated gestures occurs in two distinct but partly overlapping networks. The distinct processing streams (visual versus linguistic/spatial) are interpreted in terms of “auxiliary systems” allowing the integration of speech and gesture in the left temporo-occipital region. Hum Brain Mapp, 2009. © 2009 Wiley-Liss, Inc.