• Software Cartography;
• software visualization;
• program comprehension;
• latent semantic analysis;
• multidimensional scaling;
• software evolution;
• spatial representations of code


Software visualizations can provide a concise overview of a complex software system. Unfortunately, since software has no physical shape, there is no ‘natural’ mapping of software to a two-dimensional space. As a consequence, most visualizations use a layout in which position and distance have no meaning, and the layout typically differs from one visualization to another. We propose an approach to consistent layout for software visualization, called Software Cartography, in which the position of a software artifact reflects its vocabulary, and the distance between two artifacts reflects how similar their vocabularies are. We use Latent Semantic Indexing (LSI) to map software artifacts to a vector space, and then use Multidimensional Scaling (MDS) to map this vector space down to two dimensions. The resulting consistent layout allows us to develop a variety of thematic software maps that express very different aspects of software while making it easy to compare them. The approach is especially suitable for comparing views of evolving software, as the vocabulary of software artifacts tends to be stable over time. We present a prototype implementation of Software Cartography and illustrate its use with practical examples from numerous open-source case studies. Copyright © 2010 John Wiley & Sons, Ltd.
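The two-step layout described above (LSI into a vector space, then MDS down to two dimensions) can be illustrated with a minimal sketch. This is not the prototype implementation: it assumes NumPy is available, treats each artifact as a whitespace-separated bag of identifiers, uses raw term counts without weighting, and substitutes classical (Torgerson) MDS for whatever MDS variant the prototype uses. The function names, the rank `k`, and the four toy "artifacts" are hypothetical and chosen only for illustration.

```python
import numpy as np

def term_document_matrix(docs):
    """Build a raw term-frequency matrix (terms x documents)."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            A[index[w], j] += 1.0
    return A

def lsi_embed(A, k):
    """Represent each document in a k-dimensional latent space (truncated SVD)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (np.diag(s[:k]) @ Vt[:k]).T  # one row per document

def cosine_distances(X):
    """Pairwise cosine distance (1 - cosine similarity) between rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.where(norms == 0, 1.0, norms)
    return np.clip(1.0 - unit @ unit.T, 0.0, None)

def classical_mds(D, dims=2):
    """Classical (Torgerson) MDS: embed a distance matrix into `dims` dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:dims]    # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))

def software_map(docs, k=2):
    """Vocabulary-based 2D layout: LSI embedding followed by MDS."""
    A = term_document_matrix(docs)
    return classical_mds(cosine_distances(lsi_embed(A, k)))

# Hypothetical artifacts: two parsing-related and two UI-related files.
docs = [
    "parser token lexer parse token",
    "lexer token scanner parse",
    "widget button render paint",
    "render paint canvas widget",
]
layout = software_map(docs)  # array of shape (4, 2), one (x, y) per artifact
```

In this toy input the two parsing-related artifacts share no terms with the two UI-related ones, so the sketch places each pair close together and the two pairs far apart. A realistic setting would use far more artifacts, term weighting, and a larger `k`.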