While a variety of quantitative measurements of disparity have been proposed, the most common have been average pairwise character dissimilarity and the total variance (the sum of univariate variances) (see Foote 1997). Ciampaglio et al. (2001) evaluated these and five additional measurements of disparity [total range, mean distance, number of unique pairwise character combinations, principal coordinate analysis (PCO) volume, and participation ratio] for their sensitivity to sample size, number of morphological characters, percentage of missing data and changes in morphospace occupation pattern (see also Foote 1991, 1993b, 1999; Wills et al. 1994; Villier and Eble 2004). The results show that there is no single best estimate of disparity; different measures capture different aspects of disparity. Thus, the appropriate measures will depend, at least in part, on the questions addressed and the samples available. The results of Ciampaglio et al. (2001) parallel those of Foote (1993b) in suggesting that average pairwise dissimilarity, which is relatively insensitive to differences in sample size, is a useful metric for differences between taxa. Unique pairwise character combinations can reveal the amount of space occupied, and a distance metric such as PCO volume or mean pairwise distance can reveal changes in character space occupation. In addition, while some studies have looked at single characters or character suites, the trend has been towards inclusion of a greater range of characters, both because estimates of morphological disparity may vary between characters, and because this allows partitioning of characters with different functional roles (Foote 1994; Wagner 1995; Eble 2000b; Ciampaglio 2004). Such a combination of metrics will provide a more informative analysis of changes in morphological disparity than a single measure alone.
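As an illustration of how three of these measures differ in what they capture, the following minimal sketch computes average pairwise dissimilarity for a discrete character matrix, and total variance and total range for continuous scores. The function names, the toy data and the convention of skipping characters that are missing (coded `None`) in either taxon are assumptions made for the example; they are not taken from Ciampaglio et al. (2001) or Foote (1997).

```python
from itertools import combinations
from statistics import variance


def pairwise_dissimilarity(a, b):
    """Proportion of mismatched characters among those coded in both taxa.

    Missing data are coded as None and skipped (an assumed convention).
    Returns None when the two taxa share no coded characters.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None
    return sum(x != y for x, y in shared) / len(shared)


def average_pairwise_dissimilarity(matrix):
    """Mean dissimilarity over all pairs of taxa that can be compared."""
    values = [pairwise_dissimilarity(a, b) for a, b in combinations(matrix, 2)]
    values = [v for v in values if v is not None]
    return sum(values) / len(values)


def total_variance(scores):
    """Sum of the univariate sample variances of each axis (column)."""
    return sum(variance(column) for column in zip(*scores))


def total_range(scores):
    """Sum of the univariate ranges of each axis (column)."""
    return sum(max(column) - min(column) for column in zip(*scores))


# Hypothetical data: three taxa scored for four discrete characters ...
taxa = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 0, None, 1],
]
# ... and three taxa placed on two continuous axes (e.g. ordination scores).
scores = [
    [0.0, 1.0],
    [2.0, 1.0],
    [4.0, 7.0],
]

print(average_pairwise_dissimilarity(taxa))  # 0.75
print(total_variance(scores))                # 16.0
print(total_range(scores))                   # 10.0
```

Total range depends only on the most extreme forms on each axis, whereas total variance and average pairwise dissimilarity are averages over all taxa, which is one reason the measures can respond differently to sampling.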
Phylogenetic characters provide an alternative approach to analysing disparity, one that is particularly useful in the absence of homologous characters amenable to geometric morphometrics. Wagner (1997) used the average patristic dissimilarity per branch over branch distance as his metric of morphological separation and the median pairwise phenetic dissimilarity among all pairwise comparisons as his metric of disparity. However, Wagner (2000) showed that at least for the sorts of characters useful for phylogenetic analysis, the number of character state spaces is rapidly exhausted, so that further character change is likely to repeat previously achieved character states. In the case of binary characters, after infinite evolution character exhaustion would produce an average similarity of 0·5 (P. Wagner, pers. comm. 2006). With continuous character states, exhaustion is less of a problem. Thus, the potential morphospace as measured with continuous characters should be larger than that for discrete characters, even for the same clade.
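The saturation of binary characters can be illustrated with a small simulation. The sketch below assumes a deliberately simple model that is not Wagner's: two lineages start identical, and at each step one randomly chosen binary character changes state in each lineage. The function name and parameter values are hypothetical. Under these assumptions dissimilarity rises at first and then fluctuates around 0·5, because once characters have changed many times a match and a mismatch are equally likely.

```python
import random


def simulate_binary_divergence(n_chars, n_steps, rng):
    """Track the dissimilarity of two lineages scored for binary characters.

    Both lineages start with all characters in state 0. At every step each
    lineage flips one randomly chosen character (an assumed, simplistic
    model of character change). Returns the proportion of mismatched
    characters after each step.
    """
    a = [0] * n_chars
    b = [0] * n_chars
    trajectory = []
    for _ in range(n_steps):
        for lineage in (a, b):
            i = rng.randrange(n_chars)
            lineage[i] = 1 - lineage[i]
        mismatches = sum(x != y for x, y in zip(a, b))
        trajectory.append(mismatches / n_chars)
    return trajectory


trajectory = simulate_binary_divergence(n_chars=200, n_steps=5000,
                                        rng=random.Random(7))
early = trajectory[0]
late = sum(trajectory[-1000:]) / 1000
print(f"dissimilarity after first step: {early:.3f}")
print(f"mean dissimilarity over last 1000 steps: {late:.3f}")
```

A continuous character simulated in the same way has no such ceiling, which is the sense in which exhaustion is less of a problem for continuous characters.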
Although character-based studies of disparity evaluate changes in form across taxa, Eble (2003) pointed out that developmental morphospaces can also be constructed which more closely illuminate patterns of developmental variation in form. Theoretical morphospaces, the subject of the next section, are one example of a developmental morphospace as they are inherently generative, although often not in a way that can translate directly into how developmental evolution occurs.
Morphospaces and theoretical morphology. Vertebrates with six appendages are a biological impossibility, evidently for developmental reasons. Arthropods clearly lack such inhibitions. (Ironically, either this means that angels are arthropods or it is a biological refutation of the possibility of angels!) But establishing such a ‘geography of possible worlds’ (MacLaurin 2003, p. 463) requires more consideration in other clades. Studies of disparity have largely been based on exemplars of known taxa, facilitating description of changes in the occupation of a morphospace through evolutionary radiations, across time or in response to mass extinctions. Although the distinction has been disputed, most prominently by Michael Foote, such empirical morphospaces are of limited utility in addressing what forms could have evolved but have not, and why.
Empirical morphospaces are defined on the basis of the forms included in the analysis; addition of new forms will change the morphospace, albeit sometimes very subtly. Only with theoretical morphospaces can we distinguish realized from potential forms and explore issues of constraint, as-yet-unexplored possibilities and impermissible forms [Lauder et al. 1995; McGhee 1999; although see Eble 2000b for an interesting discussion of how the distinction between theoretical and empirical morphospaces breaks down with raw (unordinated) morphospaces]. Foote argues that given a skeletal element, for example a long bone, it is possible to map out the range of possible forms and compare these with those that are known (Foote, pers. comm. 2006). I think the difference is that theoretical morphospaces require some growth model to generate the diversity of form, which is what Foote is also doing, at least implicitly, in his example.
That such theoretical morphospaces are easiest to construct for organisms using accretionary or branching growth accounts for the predominance of studies of molluscs, particularly ammonoids. Raup (1966) first developed the concept of theoretical morphospace for logarithmically coiled shells and demonstrated that bivalves, gastropods, ammonoids and brachiopods are limited to particular portions of the morphospace. For bivalves and brachiopods the requirement that the two valves meet imposes an architectural constraint on the regions of morphospace that can be occupied, and in his 1967 paper on ammonoids Raup discussed in detail the reasons behind correlations between different shell coiling parameters. Schindel (1990) subsequently examined the non-orthogonal nature of the Raupian morphospace, which is obscured by the way it was depicted (and frequently reproduced, e.g. Lauder et al. 1995). Thus, at least part of the pattern of morphospace occupation is an artefact of presentation. Thomas and Reif (1993; see also Thomas et al. 2000) developed a more abstract morphospace. Termed the ‘skeleton space’, this allows classification of skeletal features in terms of various aspects of design and growth. Because it is more abstract, it allows a broader range of skeletal types to be considered within the same context. Thus, Thomas et al. (2000) evaluated the occupation of this morphospace by the various organisms of the Middle Cambrian Burgess Shale, concluding that they occupied 80 per cent of the morphospace that has been exploited by all extinct and extant animals.
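To make the generative character of a coiling morphospace concrete, the sketch below traces a logarithmically coiled tube from three parameters conventionally associated with Raup's model: whorl expansion rate W, distance of the generating curve from the coiling axis D, and translation along the axis T. The parameterization is a simplified reading adopted here as an assumption: the generating curve is taken to be a circle, D is treated as the ratio of its inner to its outer margin distance from the axis, and T as the ratio of movement along the axis to movement away from it. The function name and default values are hypothetical, and this is not Raup's original formulation.

```python
import math


def raup_centerline(W, D, T, revolutions=3, steps_per_rev=90, r_outer0=1.0):
    """Trace the centre of a circular generating curve around a coiling axis.

    W: factor by which distances from the axis grow per revolution.
    D: inner-margin distance / outer-margin distance (0 <= D < 1, assumed).
    T: axial movement per unit of movement away from the axis (assumed).
    Returns a list of (x, y, z, tube_radius) tuples.
    """
    centre0 = r_outer0 * (1 + D) / 2
    points = []
    for k in range(revolutions * steps_per_rev + 1):
        turns = k / steps_per_rev            # revolutions completed so far
        theta = 2 * math.pi * turns
        r_outer = r_outer0 * W ** turns      # logarithmic (exponential) growth
        centre = r_outer * (1 + D) / 2       # distance of tube centre from axis
        tube_radius = r_outer * (1 - D) / 2
        x = centre * math.cos(theta)
        y = centre * math.sin(theta)
        z = T * (centre - centre0)           # translation down the axis
        points.append((x, y, z, tube_radius))
    return points


# Hypothetical parameter choices: a planispiral form (T = 0) and a
# high-spired form (T > 0) generated by the same model.
planispiral = raup_centerline(W=2.0, D=0.3, T=0.0)
high_spired = raup_centerline(W=1.5, D=0.1, T=3.0)

r_start = math.hypot(planispiral[0][0], planispiral[0][1])
r_after_one_turn = math.hypot(planispiral[90][0], planispiral[90][1])
print(r_after_one_turn / r_start)  # close to W
```

Every combination of W, D and T yields a form whether or not any organism has realized it, which is what allows occupied and unoccupied regions to be distinguished.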
Patterns of occupation within theoretical morphospaces can reveal much about functional constraints. Viable helically coiled bryozoan colonies are limited to low to medium surface area for functional reasons (McGhee and McKinney 2000). Niklas's theoretical morphospace for plant growth captured plant structure by defining branching and rotation angles around an axis, generated a variety of morphologies, and allowed evaluation of their function in terms of light interception and other variables (Niklas 1986, 2004; see also discussion in McGhee 1999). Theoretical morphospaces have now been explored for a variety of other groups, including corals, a variety of bryozoans, echinoids, graptolites and some fish; McGhee (1999) provides a thorough introduction.
McGhee (1999) argued that it is only within the context of theoretical morphospaces that we can address the issue of forms that could have existed but which have never evolved. Although empirical morphospaces can be useful in this quest, McGhee is largely correct that theoretical morphospaces provide a more useful method. It is thus particularly unfortunate that some groups, particularly arthropods, have so far proved intractable for the construction of morphospaces. Application of fractal growth models (Prusinkiewicz and Lindenmayer 1990; Kaandorp 1994) may provide one avenue to extend theoretical morphospaces.
Theoretical morphospaces are far from a panacea, and for some questions about morphological innovations they can be a trap. MacLaurin (2003) noted that, as with any model, theoretical morphospaces capture only a component of form, and users must be particularly careful not to claim greater generality for the results than is justified by the model. An additional problem is the bounded nature of theoretical morphospaces, which inhibits (or eliminates) the study of significant innovations. These morphospaces may be particularly valuable for studying heterochronic changes, but they downplay heterotypic and other morphological changes (Eble 2000b). Finally, if the dimensionality of the morphospace itself changes during innovation, as may often be the case, then neither empirical nor theoretical morphospaces accurately capture the dynamics of the morphological changes.
Morphospace in a phylogenetic context. Morphospaces show the distribution of forms in space but provide no information about whether two closely aligned forms are phylogenetically related or share a locality as the result of convergence. Resolving this requires mapping the phylogeny of the group within the morphospace (Bookstein et al. 1985; Wagner 1995, 1997; David and Laurin 1996; Foote 1996a; Eble 2000b; Stone 2003). Relatively few studies have included a phylogenetic component. Doing so allows at least partial reconstruction of ancestors, permits testing hypotheses of evolutionary transformation and can aid in identifying evolutionary constraints.
When morphospaces are three-dimensional mathematical spaces, mapping phylogenies faces the difficulty of positioning non-terminal nodes. Because these internal nodes represent successive common ancestors, correct positioning of them is essential for testing hypotheses about the patterns of transformation. In character-based studies of disparity this is relatively easy using maximum-likelihood or similar methods. Stone (2003) proposed use of geometric algorithms for three-dimensional spaces, producing what he termed a ‘cladistic morphospace’. This technique assumes that evolutionary transitions will follow the straightest path; thus, it becomes increasingly likely to produce invalid results the larger the clades for which the taxa serve as exemplars and the sparser the recovered record. In their study of the radiation of iguanid lizards, Harmon et al. (2003) developed an alternative approach to the inclusion of phylogenetic information that obviates the need to map nodes into the morphospace, but which does not yet seem to have been applied in palaeontological studies (discussed further below).
Problems in assessing disparity. As with any technique there are a variety of potential difficulties in assessing changes in morphological disparity. Distinguishing functional explanations for patterns of morphospace occupation from historical and developmental constraints and contingency is not necessarily straightforward. Issues associated with which statistical techniques best sample within-group variance were discussed above, and Raup (1972, 1987) noted that features not captured by the analysis may influence the results. The issue of character exhaustion when discrete characters are used to analyse disparity was also noted above. When comparing the disparity patterns of two or more clades, differences in clade age can become important. As a clade ages we expect morphological disparity to expand; this requires analyses of disparity to control for clade age and to distinguish random walks through morphospace from patterns that require explanation.
Because patterns of morphospace occupation have been interpreted as evidence of particular evolutionary processes, they must be distinguished from random walks (Foote 1996b; Gavrilets 1999; Pie and Weitz 2005). Gavrilets's (1999) simple model of taxonomic diversity and disparity showed that deceleration of morphological disparity during a radiation was an expected consequence of the geometry of morphospace coupled with speciation and extinction. This suggests that changing early maximal disparity may not indicate time-inhomogeneous changes in evolutionary patterns. It is not obvious that this criticism applies to any individual case, however, as it requires a demonstration that the empirical distances between taxa have approached the upper limit (M. Foote, pers. comm. 2006). A different approach was advocated by Pie and Weitz (2005) who used branching random walks as a null model for morphospace occupation, with the goal of stripping out the random component. This approach is derived from the earlier work of Raup and Gould (1974; see also Bookstein 1987). Particularly significant was their observation that on simulated adaptive landscapes, clumping of forms into regions of morphospace occurs even if the landscape is flat and thus requires no special explanation. This irregular occupation of morphospace reflects the likelihood of coupled random walks.
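A branching random walk of the general kind used as a null model can be sketched in a few lines. The version below is an assumption-laden illustration and not the implementation of Pie and Weitz (2005) or Raup and Gould (1974): lineages take unbiased Gaussian steps on a flat two-dimensional landscape, split or go extinct with fixed probabilities, and the clade is conditioned on survival so that disparity (here the sum of univariate variances) stays defined. All function names and parameter values are hypothetical.

```python
import random
from statistics import variance


def total_variance(positions):
    """Sum of univariate sample variances; 0.0 if fewer than two lineages."""
    if len(positions) < 2:
        return 0.0
    return sum(variance(axis) for axis in zip(*positions))


def branching_random_walk(n_steps, p_split, p_extinct, step_sd, rng,
                          n_dims=2, max_lineages=300):
    """Simulate undirected evolution in a flat morphospace.

    Returns (history, lineages): history holds one (richness, disparity)
    pair per time step, and lineages holds the final positions.
    """
    lineages = [[0.0] * n_dims]
    history = []
    for _step in range(n_steps):
        survivors = [pos for pos in lineages if rng.random() >= p_extinct]
        if not survivors:                 # assumed: condition on clade survival
            survivors = [lineages[0]]
        next_gen = []
        for pos in survivors:
            n_daughters = 2 if rng.random() < p_split else 1
            for _daughter in range(n_daughters):
                next_gen.append([x + rng.gauss(0.0, step_sd) for x in pos])
        if len(next_gen) > max_lineages:  # cap richness to bound run time
            next_gen = rng.sample(next_gen, max_lineages)
        lineages = next_gen
        history.append((len(lineages), total_variance(lineages)))
    return history, lineages


history, lineages = branching_random_walk(
    n_steps=60, p_split=0.15, p_extinct=0.10, step_sd=1.0,
    rng=random.Random(42))

for step in (9, 29, 59):
    richness, disparity = history[step]
    print(f"step {step + 1}: {richness} lineages, disparity {disparity:.2f}")
```

Repeating such runs many times gives a distribution of disparity trajectories expected under random walks alone, against which an observed trajectory can be compared. Because close relatives share most of their history, plotting the final positions from a single run will often show clusters even though no region of the space is favoured.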
Clade age can be a troubling issue, but has been relatively infrequently addressed by palaeobiologists. One normally expects that the older of two clades will have greater disparity, simply because it has had a longer interval over which to explore possible morphologies. However, simply comparing two sister clades may be insufficient if the crown groups under study are of different ages. For example, Collar et al. (2005) compared the increases in morphological diversity between two clades of centrarchid fish (sunfish and black basses). After controlling for different ages of the crown groups, the possibility of random walks and considering rates of change, they concluded that: ‘… comparisons of within-group morphological variance can be useful for examination of patterns of diversity at some point in time, [but] variance comparisons may confound two distinctly different causes of trait variance – time and the rate of evolution of the trait’ (p. 1790). Because rates of morphological change are independent of phylogeny and time, Collar et al. concluded that they are a more useful metric than disparity for comparing morphological diversity. This is unlikely to be true for some of the questions of interest to palaeobiologists and other macroevolutionists, but the concerns raised in these studies should be considered.
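The confounding of time and rate can be shown with a toy calculation. The sketch below assumes Brownian motion and a star phylogeny (all lineages diverging independently from a single ancestor), under which the expected trait variance among lineages is the rate multiplied by the elapsed time. These assumptions, and all names and values, are introduced only for illustration and do not reproduce the analysis of Collar et al. (2005), who worked with resolved phylogenies.

```python
import math
import random
from statistics import variance


def simulate_clade(rate, age, n_lineages, rng):
    """Trait values of lineages after Brownian evolution for `age` time units.

    Under the assumed star phylogeny each lineage's displacement from the
    ancestor is normally distributed with variance rate * age.
    """
    sd = math.sqrt(rate * age)
    return [rng.gauss(0.0, sd) for _ in range(n_lineages)]


rng = random.Random(1)
TRUE_RATE = 0.5                      # same rate in both hypothetical clades
OLD_AGE, YOUNG_AGE = 20.0, 5.0       # crown-group ages in arbitrary units

old_clade = simulate_clade(TRUE_RATE, OLD_AGE, 2000, rng)
young_clade = simulate_clade(TRUE_RATE, YOUNG_AGE, 2000, rng)

var_old = variance(old_clade)
var_young = variance(young_clade)
rate_old = var_old / OLD_AGE         # variance rescaled by clade age
rate_young = var_young / YOUNG_AGE

print(f"variance: old {var_old:.2f}, young {var_young:.2f}")
print(f"estimated rate: old {rate_old:.2f}, young {rate_young:.2f}")
```

The older clade shows several times the variance of the younger one although both evolved at the same rate; only after rescaling by age do the two look alike.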
Differences in analytical methods and procedures, including temporal scale, levels of taxonomic refinement and the varying efficacy of corrections for clade age, random walks and other factors, make comparisons between the results of different studies of disparity hazardous (some might say foolhardy). Villier and Eble (2004) used Eble's (2000a) data on Cretaceous spatangoid echinoids to evaluate how different sources of data (homologous landmarks, traditional morphometrics and discrete characters) and analytical methods influenced the analysis of disparity, and also investigated the effects of temporal scale and of using species vs. genera as the units of analysis. In this analysis, the pattern of disparity appears relatively robust to differences in data, analysis and temporal scale. In particular, Villier and Eble's results suggest that sampling may be less of an issue than suggested by the simulations of Ciampaglio et al. (2001).
It is perhaps worth noting that disparity maps in interesting ways onto the morphologies defined by systematists. The first, and probably still the most influential, such study was the work of Tabachnick and Bookstein (1990) on variation in the Miocene foraminifer Globorotalia. They showed that the named species bore little relationship to the relatively continuous distribution of forms at one time, and even more interestingly, that the distribution of forms evolved to a more clustered distribution. By contrast, Courville and Crônier (2005) evaluated the disparity among presumed ‘species’ of the Jurassic ammonoid Kosmoceras and found a better relationship to previously described taxa.
Summary. This review suggests the following conclusions. First, a variety of quantitative techniques for the analysis of disparity are now available for both geometric morphometric data and character-based studies. Use of several methods together seems wise: average pairwise dissimilarity, which is less affected by sample size, is a useful metric for differences between taxa; unique pairwise character combinations can reveal the amount of space occupied; and distance metrics such as PCO volume or mean pairwise distance reveal changes in morphospace occupation. Second, use of discrete characters may create a problem with character exhaustion and could produce underestimates of disparity relative to studies using continuous characters. Third, the use of phylogenetic information in conjunction with disparity studies is the only way to recover step sizes between morphologies and transformation series, although studies without such a framework are useful for many questions. Finally, many of the initial studies of disparity did not consider issues of random walks from an initial condition, differences in clade age, or other confounding difficulties.