Detecting that two images are different is faster for highly dissimilar images than for highly similar images. Paradoxically, we show that the reverse occurs when people are asked to describe how two images differ, that is, to state a difference between them. Following structure-mapping theory, we propose that this dissociation arises from the multistage nature of the comparison process. Detecting that two images are different can be done in the initial (local-matching) stage, but only for pairs with low overlap; thus, “different” responses are faster for low-similarity than for high-similarity pairs. In contrast, identifying a specific difference generally requires a full structural alignment of the two images, and this alignment process is faster for high-similarity pairs. We describe four experiments that demonstrate this dissociation and show that the results can be simulated using the Structure-Mapping Engine. These results pose a significant challenge for nonstructural accounts of similarity comparison and suggest that structural alignment processes play an important role in visual comparison.