Five speakers of different ages were recorded audiovisually, each uttering one sentence. Stimuli were created in which the auditory and visual information was either coherent (from the same speaker) or incoherent (the audio track of one speaker combined with the video track of another). The subjects' task was to rate the speaker's age based on the whole speaking person, on the voice alone while ignoring the face, or on the face alone while ignoring the voice. The results reveal that subjects integrate both modalities, when available, in all three tasks. In addition, it could be shown that (a) this effect is stronger when the visual information is to be ignored, (b) for coherent stimuli the subjects rely more on the visual information, and (c) the visual modality is more robust than the auditory modality. Overall, the results provide evidence for vision as the leading modality in age perception from audiovisual speech.
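The stimulus design can be illustrated with a minimal sketch. It assumes that every audio track is paired with every video track (a full crossing), which the abstract does not state. The speaker labels `S1` to `S5` are hypothetical placeholders.

```python
from itertools import product

# Hypothetical labels; the abstract only says there were five speakers.
speakers = ["S1", "S2", "S3", "S4", "S5"]

# Assumption: full crossing of audio and video tracks.
# A stimulus is coherent when both tracks come from the same speaker.
stimuli = [
    {"audio": a, "video": v, "coherent": a == v}
    for a, v in product(speakers, repeat=2)
]

coherent = [s for s in stimuli if s["coherent"]]
incoherent = [s for s in stimuli if not s["coherent"]]

print(len(coherent), "coherent,", len(incoherent), "incoherent")
```

Under this assumption, five speakers yield 5 coherent and 20 incoherent stimuli. The actual study may have used only a subset of the incoherent combinations.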