In an attempt to understand and integrate various meanings of the stereotype concept, we conducted a longitudinal study in which we measured subjects' stereotypes of various target groups using multiple measurement techniques: trait ascription (Likert scales), group differentiation (diagnostic ratio), and deviation from group consensus. The measures were compared with regard to (1) their sensitivity to variations over time and to expected differences between social groups, and (2) their associations with degree of group contact and liking. The data suggested that trait ratings were the best-performing measures, in that they were quite effective in capturing cross-sectional effects of group contact and liking and were reliable over time. The diagnostic ratio was less reliable and provided a weaker replication of these effects, whereas the deviation-from-consensus measure was most effective in establishing an important longitudinal effect: movement toward consensus with time. Suggestions for researchers concerning appropriate use of measures and conceptions of stereotyping are provided.