Emerging patterns in the comparative analysis of phylogenetic community structure


Steven M. Vamosi, Fax: (403) 289-9311; E-mail: smvamosi@ucalgary.ca


The analysis of the phylogenetic structure of communities can help reveal contemporary ecological interactions, as well as link community ecology with biogeography and the study of character evolution. The number of studies employing this broad approach has increased to the point where comparison of their results can now be used to highlight successes and deficiencies in the approach, and to detect emerging patterns in community organization. We review studies of the phylogenetic structure of communities of different major taxa and trophic levels, across different spatial and phylogenetic scales, and using different metrics and null models. Twenty-three of 39 studies (59%) find evidence for phylogenetic clustering in contemporary communities, but terrestrial and/or plant systems are heavily over-represented among published studies. Experimental investigations, although uncommon at present, hold promise for unravelling mechanisms underlying the phylogenetic community structure patterns observed in community surveys. We discuss the relationship between metrics of phylogenetic clustering and tree balance and explore the various emerging biases in taxonomy and pitfalls of scale. Finally, we look beyond one-dimensional metrics of phylogenetic structure towards multivariate descriptors that better capture the variety of ecological behaviours likely to be exhibited in communities of species with hundreds of millions of years of independent evolution.

‘Melding of concepts of community ecology and macroevolution provides the necessary foundation for exploring the causes of the distributions and abundances of species.’ – McPeek (2007)


Evolutionary ecology is undergoing a period of rapid and exciting development based partly on new computational tools. We now have the ability to integrate analyses of (i) contemporary interactions among co-existing members of ecological communities, (ii) community effects on trait evolution, and (iii) the effects of diversification and trait evolution on how community members interact. Although evolutionary biologists have long realized the difficulties posed by the co-existence of closely related species, they largely focused on pairs of species in the context of evolutionary responses to interspecific competition (e.g. Fjeldså 1983; Schluter & McPhail 1992; Pfennig & Murphy 2000; but see Vamosi 2003 for a consideration of the influence of species pool on divergence of focal species). Conversely, community ecologists explicitly incorporated the complex communities and food webs within which species exist, but for a long time largely ignored phylogenetic relatedness of interacting species (e.g. Paine 1966; Brown 1989; Leibold 1996) — with the notable exception of interest in species-per-genus ratios (Elton 1946; Moreau 1948; Williams 1964; Järvinen 1982). Since an early attempt to perform a phylogenetic analysis of community structure (Webb 2000), and a review of the theoretical and empirical roots of such methods (Webb et al. 2002), the use of molecular phylogenies to investigate patterns in community structure has gone from an incidental application to a burgeoning subdiscipline. At least 42 such articles were published in 2006 and 2007 alone [number obtained by searching the ISI Web of Science Science Citation Index Expanded database with search criteria TS = (‘phylogenetic community structure’) OR TS = (‘phylogenetic structure’ AND communit*) OR TS = (‘phylogenetic ecology’) OR TS = (phylogen* AND ‘community ecology’)]. 
This rapid accumulation of data makes it worthwhile to spotlight what has been achieved and to investigate whether general patterns are beginning to emerge. Our review is complementary to other recent discussions (e.g. Pennington et al. 2006; Johnson & Stinchcombe 2007; Emerson & Gillespie 2008) but differs from these in its focus on phylogenetic community structure as the central topic and in its inclusion of the first meta-analysis of results from published phylogenetic community ecology. We review and discuss: (i) the history and foundations of phylogenetic community structure analyses, (ii) overall trends in empirical studies conducted to date, and (iii) shortcomings and outstanding issues with this approach. We find that some general patterns in phylogenetic community structure are emerging, although a number of issues need increased attention.

Overview of phylogenetic community structure analysis

Because organisms interact via their phenotypes, and because phenotypes are not randomly distributed with respect to phylogeny, we should expect that the phylogenetic composition of a community is partially the product of species interactions. This idea was first explicitly acknowledged by Darwin (1859), and became the basis for studies of the taxonomic structure of communities (Elton 1946; Moreau 1948; Williams 1964), which reasoned that communities with fewer congeners than expected were exhibiting evidence of competition among morphologically and behaviourally similar, related species. While the analytical predictions of some of these studies were in error, as pointed out by Simberloff (1970), the underlying expectation that competitive exclusion should reduce the incidence of close taxonomic or phylogenetic relatives in communities remains fundamental to today's studies.

Studies of this kind rely on tests of species’ distribution against null models, and the assumptions of these null models came under sharp scrutiny in the 1970s and 1980s (Gotelli & Graves 1996). That scrutiny may partially explain a decline in interest in taxonomic community structure during those decades. However, that decline may also reflect dissatisfaction with taxonomies as a proxy for phylogenies, given the commonness of finding nonmonophyletic taxonomic groups and the frequent failure of classical taxonomic groupings to map onto ecologically relevant ones. For example, in plants it is unlikely that clades defined by floral morphological characters should contain species that also have consistent edaphic preferences. Rather, sub- or super-clades might make more coherent ecological units. Similarly, sets of organisms with the same taxonomic rank may vary greatly in stem- and crown-group ages, and thus the potential for ecological diversification: for example, the 25% quantiles of angiosperm family age have been estimated at 25, 55, 66, 86, and 159 million years (Davies et al. 2004). Quantifying phylogenetic structure, especially if branch length data can be incorporated, can avoid these limitations of taxonomy. Note that the arguments here are not about an increased likelihood of finding patterns with multi-node phylogenies per se, but rather against the likelihood of detecting meaningful patterns using traditional classes.

Studies of taxonomic community structure usually quantified hierarchical taxonomic diversity (e.g. species-per-genus ratios), and studies of phylogenetic community structure are analogous, but based instead on a phylogenetic hypothesis. The phylogeny required is constructed either for all the taxa in an appropriately defined regional species pool, or for the total list of all taxa in all sample units. Depending on the study, the phylogeny may be constructed de novo, or by assembly, grafting or subsetting of published phylogenies. This phylogeny provides the phylogenetic distance measures needed for calculation of structure metrics, and thus needs to include branch length information. Sometimes a lack of branch-length data is dealt with by simply setting all branch lengths equal, which makes calculating phylogenetic distance between two taxa a matter of simply counting nodes along the phylogenetic path between the two taxa; however, it is not clear how well node-counting works.
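The contrast between branch-length distances and node counting can be illustrated with a minimal Python sketch. The four-taxon tree, its branch lengths, and the function names are hypothetical and chosen only for illustration. Unit branch lengths stand in for node counting, since the number of branches on a path equals the number of nodes along it plus one.

```python
# Toy rooted chronogram (hypothetical): child -> (parent, branch length in Myr).
TREE = {
    "A": ("n1", 2.0),
    "B": ("n1", 2.0),
    "C": ("n2", 10.0),
    "n1": ("n2", 8.0),
    "D": ("root", 30.0),
    "n2": ("root", 20.0),
}


def path_to_root(tree, node):
    """Return {child: branch_length} for every branch from node up to the root."""
    edges = {}
    while node in tree:
        parent, length = tree[node]
        edges[node] = length
        node = parent
    return edges


def phylo_distance(tree, a, b, use_branch_lengths=True):
    """Sum the branches on the path between a and b.

    With use_branch_lengths=False every branch counts as 1 (equal branch lengths).
    """
    edges_a = path_to_root(tree, a)
    edges_b = path_to_root(tree, b)
    # Branches shared by both root paths lie above the common ancestor
    # and are not part of the path between the two taxa.
    unique = set(edges_a) ^ set(edges_b)
    total = 0.0
    for child in unique:
        length = edges_a.get(child, edges_b.get(child))
        total += length if use_branch_lengths else 1.0
    return total


for pair in (("A", "B"), ("A", "C"), ("A", "D")):
    print(pair, phylo_distance(TREE, *pair), phylo_distance(TREE, *pair, use_branch_lengths=False))
```

On this toy tree the distances A–B, A–C and A–D are 4, 20 and 60 Myr with branch lengths, but 2, 3 and 4 with equal branch lengths, so the deepest split is strongly compressed relative to the shallowest one.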

While a number of different metrics are available (see Box 1), they have a common aim in quantifying phylogenetic diversity, or ‘phylodiversity.’ A sample containing taxa that are (phylogenetically) distantly related has relatively high phylodiversity (or shows phylogenetic evenness), whereas one containing closely related taxa has relatively low phylodiversity (or shows phylogenetic clustering). Phylodiversity can also be measured absolutely, for example in millions of years of independent evolution, from total subtended branch length on a chronogram. Such an absolute calculation of phylodiversity can be made for any set of species, and so does not depend on debatable definitions of local ‘communities’ (species lists) or the regional species pools from which they are drawn.
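As a sketch of the absolute measure described above, total subtended branch length can be computed by collecting every branch that lies between any sampled taxon and the root. The tree and names are hypothetical, and including the branches down to the root of the whole tree is an assumption of this example rather than a convention stated in the text.

```python
# Toy rooted chronogram (hypothetical): child -> (parent, branch length in Myr).
TREE = {
    "A": ("n1", 2.0),
    "B": ("n1", 2.0),
    "C": ("n2", 10.0),
    "n1": ("n2", 8.0),
    "D": ("root", 30.0),
    "n2": ("root", 20.0),
}


def subtended_branch_length(tree, taxa):
    """Total length of all branches connecting the given taxa to the root.

    Each branch is counted once, however many sampled taxa descend from it.
    """
    branches = {}
    for taxon in taxa:
        node = taxon
        while node in tree:
            parent, length = tree[node]
            branches[node] = length  # keyed by child, so shared branches are not double-counted
            node = parent
    return sum(branches.values())


clustered = subtended_branch_length(TREE, ["A", "B"])  # two close relatives
even = subtended_branch_length(TREE, ["A", "D"])       # two distant relatives
print(clustered, even)
```

Here the two close relatives subtend 32 Myr of branch length and the two distant relatives 60 Myr, without any reference to a species pool or null model.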

Null models

While these metrics of phylogenetic structure can be informative in their own right, or be used in comparisons of communities in different situations (e.g. Kelly 1999; Webb et al. 2006), their most frequent use is in the detection of nonrandom community structure. Observed metrics of phylogenetic structure can be assessed against an appropriate null model to determine whether the taxa that occur in a sample have higher or lower than expected phylodiversity, and inferences made about the reasons for nonrandom phylodiversity. This approach is analogous to comparing the observed morphological similarity of taxa in a community to the expected similarity (e.g. Brown 1989), and may employ the methods of null-model analysis of presence–absence matrices (Gotelli 2000). The role of these null models is to randomize the community data so as to remove all (but only) effects of the mechanisms under study (Gotelli & Graves 1996), that is, removing any effect of species identity on composition, and therefore of species phylogenetic relationship. Most studies of phylogenetic community structure to date have generated randomized communities by drawing species at random from an appropriately defined species pool. Where plots, quadrats or captures are replicate samples of a larger community, and thus not independent, more appropriate random samples can be constructed using methods that incorporate the relative incidence of taxa across all samples (Gotelli 2000; see Box 2). Another approach to assessing nonrandomness in phylogenetic community structure is to correlate the co-occurrence probability for all pairs of taxa with their phylogenetic distance, testing the significance of the r² with a standard Mantel test, a quantile regression (Slingsby & Verboom 2006), or some other more appropriate randomization of species composition (Cavender-Bares et al. 2004).
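The simplest of these null models, drawing species at random from the pool, might be sketched as follows. This is an illustration under stated assumptions, not the procedure of any particular study: the six-species pool and its pairwise distances are invented, mean pairwise distance (MPD) is used as the structure metric, and the standardized effect is expressed as NRI = −(MPD_obs − mean MPD_null) / SD of MPD_null, so that positive values indicate clustering.

```python
import random
import statistics
from itertools import combinations

# Hypothetical pool: two clades of three species each.
CLADE = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B", "b3": "B"}
# Invented pairwise distances (Myr): 10 within a clade, 100 between clades.
DIST = {
    frozenset(pair): (10.0 if CLADE[pair[0]] == CLADE[pair[1]] else 100.0)
    for pair in combinations(sorted(CLADE), 2)
}


def mpd(dist, taxa):
    """Mean pairwise phylogenetic distance among the taxa in one sample."""
    pairs = list(combinations(sorted(taxa), 2))
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)


def null_model_test(dist, pool, community, n_draws=999, seed=1):
    """Compare observed MPD with MPD of equal-sized random draws from the pool.

    Species are drawn without replacement and with equal probability,
    i.e. species incidence across samples is ignored.
    """
    rng = random.Random(seed)
    observed = mpd(dist, community)
    null = [mpd(dist, rng.sample(pool, len(community))) for _ in range(n_draws)]
    mean, sd = statistics.mean(null), statistics.stdev(null)
    nri = -(observed - mean) / sd if sd > 0 else float("nan")
    # One-tailed rank-based P for clustering (observed MPD unusually low).
    p_cluster = (1 + sum(m <= observed for m in null)) / (n_draws + 1)
    return observed, nri, p_cluster


observed, nri, p_cluster = null_model_test(DIST, sorted(CLADE), ["a1", "a2", "a3"])
print(observed, round(nri, 2), p_cluster)
```

Only two of the 20 possible three-species draws from this pool come from a single clade, so even a perfectly clustered sample yields a positive NRI but a P-value near 0.1. Weighting the draws by species incidence, as suggested for non-independent samples, would require replacing the call to `rng.sample`.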

Box 2. A checklist for analysis of phylogenetic community structure
We suggest that all investigators ask the following questions of their study. The nature of the issue (e.g. bias, interpretation, or randomization) is indicated where appropriate.
What is the taxonomic scope of the study?
    Are all likely ‘interactors’ included in the pool? If not, will reliable interpretation of biotic causes of phylogenetic structure be possible? (Interpretation)
    Are there any taxa that are very distantly related to the others and whose inclusion may create biases? (Bias; randomization)
    Are all clades expected to be homogeneous with respect to the ecological processes under investigation? If not, are the intended measures of phylogenetic structure sensitive enough to detect contrasting structure in different parts of the tree? (Metrics)
What is the spatial scale of the species pool?
    Is the pool (i) the combined taxon list for all samples, or (ii) an appropriate list of possible taxa from external sources?
    What is the likely magnitude of dispersal limitation among samples? Do you expect the probability of a species occurring in a random sample (i.e. without biotic or substrate effects) to be a function of the incidence of the species in the observed samples? That is, should widespread and/or common taxa be more likely to occur in a randomized sample? (Randomization)
    If the pool spans several biogeographical-scale regions, with some taxa having restricted distributions within the total sampled area, how should taxon lists for random samples be generated? (Randomization)
    If the pool spans several biogeographical-scale regions, how can habitat-driven phylogenetic clustering be differentiated from local speciation? (Interpretation)
    If abundances are being used, is there covariance in abundance and phylogenetic distance? (Bias; randomization)
What is the spatial scale of the sample unit?
    A priori, what are the likely ecological or evolutionary mechanisms that may lead to nonrandom phylogenetic structure of the samples?
    Is a sample homogeneous with respect to the abiotic factors being considered? That is, is it a habitat? (Interpretation)
    Are individuals in each sample actually capable of biotic interaction? (Interpretation)
What are the relevant traits that influence phylogenetic structure? The measurable traits may not be the ecologically relevant ones!
    Could mutualisms be a possible cause of phenotypic attraction? (Interpretation)

The debate over community null models (how to define the community, how to randomize, how to interpret, etc.) continues, and practitioners of phylogenetic community structure analysis should be aware of a number of concerns (also see Box 2). Type I error rates (detection of spurious phylogenetic structure) can be inflated under several circumstances, including when (i) frequencies of species in a pool of samples are not even, and null samples are generated by drawing from the pool species list with replacement (Kembel & Hubbell 2006), and (ii) there are long branches to rare taxa (e.g. angiosperm trees vs. tree ferns; Kembel & Hubbell 2006). Type II error rates can be strongly influenced by the relative sizes of samples and source pools (Kraft et al. 2007; see Species pools and the power to detect patterns), and may be inflated when there is a phylogenetic signal in the abundance structure of the observed community, but a randomization test uses only presences and absences (Kembel & Hubbell 2006; Hardy 2008). Finally, dispersal limitation on both ecological and biogeographical scales may invalidate simple shuffling methods of null sample creation.


Nonrandom phylogenetic community structure can be caused by a variety of mechanisms. Because of their predominance in ecological thought, the roles of competitive exclusion and habitat filtering have received most attention, but other ecological forces such as mutualism or facilitation may also generate nonrandom patterns, and ‘competition,’ as ever, is a simple term that can hide much mechanistic complexity. The action of these mechanisms depends on the phenotypes of organisms, and thus patterns of evolved similarity (conservative or convergent) will influence the resulting phylogenetic community structure. In the simplest possible combination, phylogenetic clustering should result when conserved characters determine habitat filtering, because they influence tolerance of abiotic conditions, and phylogenetic evenness should result if there is either filtering on convergent characters or competitive exclusion of species with similarity in conserved characters (Webb et al. 2002; Cavender-Bares et al. 2004). (See Box 3 and Fig. 1 for an example of phylogenetic clustering and evenness.)
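The simplest set of predictions described above can be written as a small lookup table. The sketch below encodes only the three combinations stated in the text; the dictionary and function names are hypothetical, and the fourth combination (competitive exclusion acting on convergent characters) is deliberately left without a prediction because none is given here.

```python
# Predictions as stated in the text (Webb et al. 2002; Cavender-Bares et al. 2004).
# Keys: (ecological process, evolutionary pattern of the relevant characters).
PREDICTIONS = {
    ("habitat filtering", "conserved"): "clustering",
    ("habitat filtering", "convergent"): "evenness",
    ("competitive exclusion", "conserved"): "evenness",
}


def expected_structure(process, character_evolution):
    """Return the predicted pattern, or None where the text states no prediction."""
    return PREDICTIONS.get((process, character_evolution))


print(expected_structure("habitat filtering", "conserved"))
print(expected_structure("competitive exclusion", "convergent"))
```

The table also makes the inferential problem explicit: evenness appears under two different keys, so observing evenness alone cannot identify the process without information on how the relevant characters evolved.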

Figure 1.

An example of phylogenetic community structure for dytiscid (diving) beetle communities in Alberta lakes. (a) The phylogeny of a partial regional pool of predaceous diving beetles, with the mean adult body length of each species indicated in the right panel. (b) and (c), ‘even’ and ‘clustered’ six-species communities, from two small lakes sampled in 2005 in southern Alberta (S. Vamosi, unpublished data). Colour codes represent different subfamilies (e.g. Dytiscus circumcinctus is in the subfamily Dytiscinae), for which body size is strongly conserved. Community B contains members of all five subfamilies represented, whereas community C contains members of only three subfamilies.

These pattern predictions are, however, very simple and care should be taken not to apply them uncritically to all situations. For example, one could imagine circumstances in which competition might drive clustering rather than evenness: if a phylogenetically conservative trait determines whether species are good competitors vs. fugitives, then a clade of good competitors might be overrepresented in competitive communities, with clustering the result (J. C. Cahill, personal communication, 2008). Furthermore, in any community composed of members of an old, diverse lineage, we expect multiple origins of key ecological characters, likely with variable amounts of character conservatism within descendant clades. Ecological mechanisms such as filtering and competition might themselves apply differently across descendant clades, and thus we might often expect complex patterns of nested phylogenetic evenness and clustering (e.g. Cavender-Bares et al. 2006; Slingsby & Verboom 2006; Swenson et al. 2006). Such patterns might not be well characterized by simple metrics, and as our hypotheses about phylogenetic community structure become more refined, we will increasingly need to use multiple, or multidimensional metrics (see Phylogenetic community structure beyond clustering and evenness below). A visual inspection of subclades showing clustering or evenness (as offered by the ‘nodesig’ algorithm in Phylocom, Webb et al. 2008) is one useful starting point for interpretation of index-based results. We must also strive to quantify ecologically relevant phenotypes, and directly examine the likely history of character evolution. Many studies have assumed that trait and niche conservatism dominate in their study organisms, and, while this may be broadly justified — morphology-based taxonomy has been successful over the centuries because of this fact — the relationship is not universally strong (e.g. Cahill et al. 2008). 
The take-home message is clear: we need to measure traits to be sure.

Finally, researchers must be aware of issues of spatial scale during interpretation (Swenson et al. 2006; see Box 2). Are ecological or evolutionary/biogeographical mechanisms more likely to be producing nonrandom phylogenetic structure, and if the latter, are the null models being employed appropriate?


We found 24 papers that dealt explicitly with determining whether communities tended to display phylogenetic clustering or evenness (Table 1). Accounting for the fact that a number of studies investigated more than one community type, we have data entries for 39 analyses of phylogenetic community structure. Note that these numbers do not exactly align with the number quoted in the Introduction, because here we restricted our attention to original analyses of phylogenetic community structure that contained sufficient raw/summary data to contemplate conducting our own analyses. In most cases, the authors had calculated and reported patterns in community structure, but for one study we re-analysed reported data [Gillespie 2004, for which we calculated NRI (net relatedness index) values from the phylogeny and species occurrences reported in the paper, and used a Wilcoxon rank test to test whether NRI values were significantly different from zero]. We contacted a number of authors for various data, the two most common being mean number of species per plot and size of plot.
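The final step of the re-analysis described above, testing whether a set of per-site NRI values differs from zero, might be sketched as follows. The example assumes a one-sample Wilcoxon signed-rank test with an exact two-sided P-value obtained by enumerating all sign assignments (practical only for roughly 20 sites or fewer). The NRI values are invented for illustration and are not those calculated from Gillespie (2004).

```python
from itertools import product


def wilcoxon_signed_rank(values):
    """Exact two-sided one-sample Wilcoxon signed-rank test against zero.

    Returns (W+, P), where W+ is the sum of ranks of the positive values.
    Zeros are dropped; tied absolute values receive average ranks.
    """
    diffs = [v for v in values if v != 0]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under the null hypothesis each rank is positive or negative with probability 1/2.
    expected = sum(ranks) / 2
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - expected) >= abs(w_plus - expected) - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n


# Invented NRI values for 12 hypothetical sites (positive = clustered).
nri_values = [1.2, 0.8, 2.1, -0.3, 1.5, 0.9, 1.7, 0.4, 2.4, -0.1, 1.1, 0.6]
w_plus, p_value = wilcoxon_signed_rank(nri_values)
print(w_plus, p_value)
```

For these invented values W+ = 75 out of a maximum of 78, and the exact two-sided P-value is 10/4096, so the hypothetical sites would be judged clustered overall.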

Table 1. Summary of studies used for analyses of patterns in phylogenetic community structure
‘Taxon’ | Habitat type | Taxonomic breadth | Taxonomic scale considered? | Root node age (Ma) | Trophic level | Locality | Temp/trop? | Total area (km²) / Plot area (ha) / Total number of species / Mean number of species per site / Number of sites | Clustering/evenness | Reference
Yeast of columnar cacti | Terr | Order | No | 50 | Decomposers | North American deserts | Temp | 518001325.7998 | E (A) | Anderson et al. 2004
Yeast of Opuntia cacti | Terr | Order | No | 50 | Decomposers | North American deserts | Temp | 518001315.8950 | none | Anderson et al. 2004
Bacteria | FW | Domain | Yes | Not reported | Decomposers | Michigan, USA | Temp | n/a0.0003Not reported1065 | C | Horner-Devine & Bohannan 2006
Soil bacteria | Terr | Domain | No | Not reported | Decomposers | Virginia + Delaware, USA | Temp | 5000< 10e–5Not reported44.25 | C (HD) | Horner-Devine & Bohannan 2006
Ammonia-oxidizing bacteria | Terr | Domain | No | Not reported | Decomposers | Costa Rica | Trop | 1610e–8Not reported2514 | C, E, none | Horner-Devine & Bohannan 2006
Ammonia-oxidizing bacteria | SW | Domain | No | Not reported | Decomposers | Maryland, USA | Temp | 6400< 10e–5Not reported20.85 | C | Horner-Devine & Bohannan 2006
Denitrifying bacteria | SW | Domain | No | Not reported | Decomposers | Maryland, USA | Temp | 6400< 10e–5Not reported685 | C (HD) | Horner-Devine & Bohannan 2006
acI lineage of Actinobacteria | FW | Order | No | n/a | Decomposers | Wisconsin, USA | Temp | 1697905.711 [N]4.818 | C | Newton et al. 2007
Various trees | Terr | Class | No | 147 | Primary producers | West Kalimantan, Indonesia | Trop | 1.50.1632461.128 | C | Webb 2000
Oaks | Terr | Genus | No | 22 | Primary producers | Florida, USA | Temp | 15800.1173.474 | E | Cavender-Bares et al. 2004
All seed plants + ferns | Terr | Kingdom | Yes | 400 | Primary producers | Florida, USA | Temp | 110000.114119.955 | C (E on finer taxonomic scale) | Cavender-Bares et al. 2006
All trees/shrubs | Terr | Kingdom | Yes | 260 | Primary producers | Florida, USA | Temp | 1100096.521619.175 | C (E on finer taxonomic scale) | Cavender-Bares et al. 2006
Various herbs, shrubs, and trees – fynbos | Terr | Phylum | No | 160 | Primary producers | South Africa | Temp | 1920000.0115527.416 | C | Procheş et al. 2006
Various herbs, shrubs, and trees – grassland | Terr | Phylum | No | 160 | Primary producers | South Africa | Temp | 1920000.0114825.816 | C | Procheş et al. 2006
Various herbs, shrubs, and trees – karoo | Terr | Phylum | No | 160 | Primary producers | South Africa | Temp | 1920000.017213.316 | C | Procheş et al. 2006
Various herbs, shrubs, and trees – thicket | Terr | Phylum | No | 160 | Primary producers | South Africa | Temp | 1920000.0110624.716 | none | Procheş et al. 2006
Ceanothus shrubs | Terr | Genus | Yes | 65 | Primary producers | California, USA | Temp | 41000010 (?)162.151 | E | Ackerly et al. 2006
Meadow plants | Terr | Phylum | No | 160 | Primary producers | Tadham Moor, Somerset, UK | Temp | 0.220.00014222.9844 | none | Silvertown et al. 2006
Meadow plants | Terr | Phylum | No | 160 | Primary producers | Cricklade, Wiltshire, UK | Temp | 0.440.00013322.8644 | none | Silvertown et al. 2006
Schoenoid sedges | Terr | Tribe | Yes | 31 | Primary producers | Cape Floristic Region, SA | Temp | 197700.005269.2921 | none | Slingsby & Verboom 2006
Tetraria sedges | Terr | Genus | Yes | < 31 | Primary producers | Cape Floristic Region, SA | Temp | 197700.005155.2921 | E | Slingsby & Verboom 2006
Rain forest trees | Terr | Kingdom | No | 360 | Primary producers | Barro Colorado Island, Panama | Trop | 0.50.01, 0.04, 0.25, 131228.4, 63.5, 121.0, 186.75000, 1250, 200, 50 | C, E (varied by habitat) | Kembel & Hubbell 2006
Rain forest trees | Terr | Kingdom | Yes | 300 | Primary producers | Panama | Trop | 14320.041270Not reported30 | C | Swenson et al. 2006
Rain forest trees | Terr | Kingdom | Yes | 300 | Primary producers | Costa Rica | Trop | 511000.042261Not reported30 | C | Swenson et al. 2006
Rain forest trees | Terr | Kingdom | Yes | 300 | Primary producers | Puerto Rico | Trop | 113.30.04281Not reported30 | C | Swenson et al. 2006
Rain forest trees | Terr | Class | No | 147 | Primary producers | West Kalimantan, Indonesia | Trop | 1.50.00364464828 | E | Webb et al. 2006
Canopy trees | Terr | Phylum | No | 144 | Primary producers | Equatorial Guinea | Trop | 6251273Not reported – n/a28 | C | Hardy & Senterre 2007
Coastal vegetation (high fire) | Terr | Kingdom | No | 300 | Primary producers | E Iberian Peninsula, Spain | Temp | 4200590085695 | C | Verdú & Pausas 2007
Montane Mediterranean vegetation (low fire) | Terr | Kingdom | No | 300 | Primary producers | E Iberian Peninsula, Spain | Temp | 7020981571484 | E (marginal) | Verdú & Pausas 2007
Nurse (and associated) species | Terr | Kingdom | No | 300 | Primary producers | Zapotitlán Valley, Mexico | Trop | 32000.41043712 | E | Valiente-Banuet & Verdú 2007
Cape flora (east of division) | Terr | Phylum | No | 195 | Primary producers | Cape Floristic Region, SA | Temp | 4500067500821157.5101 | E | Forest et al. 2007
Cape flora (west of division) | Terr | Phylum | No | 195 | Primary producers | Cape Floristic Region, SA | Temp | 4500067500820197.2100 | C | Forest et al. 2007
Monogenean parasites of roach | FW | Genus | No | n/a | Parasites | Morava River, Czech Republic | Temp | 2008.5e–792.4 [M]195 | C (on three spatial scales) | Mouillot et al. 2005
Tetragnatha spiders (Arachnida: Tetragnathidae) | Terr | Genus | No | 5 | Small predators | Hawaiian Islands | Trop | 1632731500173.112 | C | Gillespie 2004
Diving beetles (Coleoptera: Dytiscidae) | FW | Family | No | 125 | Small predators | Alberta, Canada | Temp | 661850351068.553 | C | Vamosi & Vamosi 2007
Various bony fishes | FW | Infraclass | No | 215 | Predator, algivores, insectivores | E Brazil | Trop | 6600.03258.653 | none | Peres-Neto 2004
Wood warblers | Terr | Family | No | 6 | Insectivores | North America | Temp | 167780002513434.43501 | E | Lovette & Hochachka 2006
Various bony fishes | FW | Infraclass | No | 215 | Predators | Wisconsin, USA | Temp | 1697901585812.338 | C | Helmus et al. 2007a
Sunfishes (Centrarchidae) | FW | Family | No | 29 | Predators | Wisconsin, USA | Temp | 16979066113.2890 | C, E (H) | Helmus et al. 2007b
A, abundant species tended to be genetically distinct from one another; H, repulsion detected after environmental conditions considered; HD, clustering was the most frequent response, but evenness and none also observed; M, estimated from Table 5 in Šimková et al. 2001; N, lineages (rather than species in the strict sense) made up of several polymerase chain reaction clones.

We had three specific goals in our analyses. First, we calculated the relative proportion of clustering vs. evenness in the studies reviewed. Second, with the additional data we collated (Table 1), we explored the relative representation of different taxonomic groups, habitat types, geographical scales, and trophic levels in these studies. Where appropriate, we call attention to biases in data. Finally, we examined associations between reported patterns (clustering, evenness, none) and sizes of local communities and regional pools.


Incidence of phylogenetic clustering

Overall, studies of contemporary communities conducted to date have documented all three possible patterns (clustering, evenness, and not significantly different from random), with each occurring in multiple studies (Table 1). Clustering, however, has been the most commonly observed pattern: 18 analyses found an overall pattern of clustering, with an additional five analyses finding clustering in some habitat types (Horner-Devine & Bohannan 2006; Kembel & Hubbell 2006), at broader taxonomic scales (Cavender-Bares et al. 2006), or before accounting for the effect of environmental variables on co-existence patterns (Helmus et al. 2007b). For an expanded consideration of one study that found clustering overall (Vamosi & Vamosi 2007), see Box 3. While the set of studies available to date is far from representative of the diversity of taxa and habitats on Earth (see next section), it is important to note at this point that studies that find clustering have included multiple taxonomic groups and multiple habitat types.

Taxonomic and habitat representation

With reference to Table 1, one pattern is especially obvious: the majority of analyses (24, or 62%) to date have been conducted on land plants. There are at least four reasons for this uneven taxonomic representation. First, many of the early researchers in this field, including the three authors of Phylocom (Webb et al. 2008), are land-plant community/evolutionary ecologists (e.g. Webb 2000; Ackerly et al. 2006; Kembel & Hubbell 2006). Second, plant communities beg for subtle pattern analysis because of the apparent lack of axes along which plants can partition resources. Third, the diversity of flowering plants has drawn the attention of many molecular systematists, so supertrees are available with largely well-resolved ordinal and family relationships (e.g. Stevens 2001 onwards; Davies et al. 2004; however, the lack of much within-family resolution limits the power of many plant-based analyses and means that many of the results in Table 1 largely reflect patterns at the ‘family-level’ and above). Finally, the demarcation of appropriate sampling scales and habitats is often more obvious, and the sampling of individuals easier, for plant communities than it is for mobile animal species.

The practicality of demarcating and studying well-defined ‘community’ samples leads to another conspicuous pattern in the studies we tabulated: of the 15 analyses not involving land plants, nine (60%, or 23% of the total) were for communities of discretely bounded habitats. Such communities include yeasts in cactus-tissue rots (Anderson et al. 2004), bacteria in experimental freshwater mesocosms (Horner-Devine & Bohannan 2006), Actinobacteria in lakes (Newton et al. 2007), monogenean parasites of river-dwelling roach (Mouillot et al. 2005), predaceous diving beetles in lakes (Vamosi & Vamosi 2007), and bony fish in rivers (Peres-Neto 2004) or lakes (Helmus et al. 2007a, b). Such discrete communities have, of course, a long history of exploitation as model systems, so the existence of this bias in phylogenetic community ecology studies is not a surprise.

Together, these biases create a dramatic and important restriction in study diversity: what we know about phylogenetic community structure applies to a small subset of Earth's ecosystems. For instance, there have been only two studies in marine systems (both of bacteria; Horner-Devine & Bohannan 2006), despite the fact that oceans cover 70% of the Earth's surface. Equally strikingly, among terrestrial studies a study of North American wood warblers (Lovette & Hochachka 2006) is the only one to date not focusing on microbes or plants. This situation is sure to improve simply through accumulation of more studies, but we urge ecologists working in understudied taxa and habitats to begin filling in some of the gaps.

Trophic level biases

As might be expected given the taxonomic biases detailed above, there are striking inequalities in the representation of different trophic levels in our compilation. Primary producers are best represented, with 24 analyses, followed by decomposers (N = 8), predators (including insectivores; N = 6), and parasites (N = 1). This pattern is in striking contrast to that reported by Schluter (2000, p. 144) in his review of studies of ecological character displacement. At the time of his synthesis, there were a total of 60 cases, with carnivores best represented (N = 35), followed by herbivores/granivores/omnivores (N = 14), primary producers (N = 5), and scavengers/detritivores/microbivores (N = 6). These dramatically different trophic level biases are surprising, given that the role of interspecific competition in mediating co-existence of closely related species is surely a common theme in studies of both character displacement and phylogenetic community structure.
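The contrast between the two compilations is easier to see as percentages. The short sketch below uses only the counts quoted in the paragraph above; the variable and function names are hypothetical, and the trophic categories of the two compilations are not strictly equivalent.

```python
# Counts as quoted in the text.
phylo_structure = {
    "primary producers": 24,
    "decomposers": 8,
    "predators (incl. insectivores)": 6,
    "parasites": 1,
}
character_displacement = {  # Schluter (2000)
    "carnivores": 35,
    "herbivores/granivores/omnivores": 14,
    "primary producers": 5,
    "scavengers/detritivores/microbivores": 6,
}


def percentages(counts):
    """Convert raw counts to percentages of the column total, rounded to integers."""
    total = sum(counts.values())
    return {level: round(100 * n / total) for level, n in counts.items()}


print(sum(phylo_structure.values()), percentages(phylo_structure))
print(sum(character_displacement.values()), percentages(character_displacement))
```

Primary producers account for about 62% of the 39 analyses of phylogenetic community structure but only about 8% of the 60 character displacement cases, whereas carnivores make up about 58% of the latter.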

The trophic-level bias is unfortunate for many reasons, not least because it prevents testing a particularly interesting hypothesis: the phylogenetic community structure version of the ‘green world’ hypothesis (Hairston et al. 1960). Discussing evolutionary divergence in response to species interactions, Schluter (2000, p. 156) suggested that, because of alternating control of populations based on their trophic level, one might expect character displacement at the highest trophic level in a community, with mixtures of character displacement and divergence via apparent competition at all lower levels. In the context of phylogenetic structure, we might then expect to find clustering in the species present at the highest trophic level, coupled with evenness in trophic traits but no pattern in antipredator traits. Conversely, at the next trophic level, we might expect to find evenness in antipredator traits and either no pattern or greatly reduced evenness in trophic traits. The relative amounts of evenness in trophic vs. antipredator traits at lower trophic levels will likely be context-specific. For example, if character displacement in trophic traits is associated with habitat shifts (e.g. Schluter & McPhail 1992), and the different habitats contain different predators, this may also produce correlated evenness in antipredator traits (e.g. Vamosi & Schluter 2004). We hope for further investigations of such communities. Again, analyses of the phylogenetic relatedness of species will have to be complemented with an understanding of the phylogenetic distribution and function of traits. Such investigations would complement the recent growth of interest in the role of predators in driving divergence and diversification (e.g. Geffeny et al. 2005; Meyer & Kassen 2007), a topic that has long been overshadowed by a focus on interspecific competition for resources (Vamosi 2005).

Taxonomic and geographical scales

The execution of any ecological study involves decisions about appropriate delineation of the organisms and sites under study. These decisions will be of particular importance for phylogenetic community ecology, because in several ways taxonomic and geographical scales of the study may plausibly have major influences on both expectations and results. Taxonomic and geographical scale are likely to be critically important to at least three issues: (i) the definition of local communities; (ii) how plot and study areas relate to the scales of individual movement, biogeographical range movement, and speciation; and (iii) how the relative sizes of local and species-pool biotas affect the statistical power of community structure studies.

The scales of ‘local communities’ and the ‘Darwin–Hutchinson zone’.  Ecologists have never settled on a single definition of ‘local community’, largely because no single definition would suit all purposes. The studies we review here have (explicitly or implicitly) offered their working definitions in the taxonomic breadth of the species studied and in the geographical scale covered by each replicate plot (Fig. 4). Taxonomic and geographical scale choices must be carefully examined, because interspecific competition is often offered as a mechanism that could produce phylogenetic evenness in community structure. Such a mechanism, however, presumes that communities are studied on a spatial scale allowing for direct interactions between individuals of different species, or at least allowing for such interactions over a few generations of local dispersal. It also presumes a taxonomic scale making competitive interactions likely; despite many specific exceptions (e.g. Brown & Davidson 1977; Schluter 1986; Cramer et al. 2007), competition is typically expected to be strongest between members of closely related groups (e.g. Darwin 1859; Jenssen 1973; Barnes 2003). These considerations suggest particular interest in phylogenetic community structure measured for plots of relatively small size and for clades of fairly closely related species, that is, those falling in the lower left corner of Fig. 4. This might be called the ‘Darwin–Hutchinson zone’ because studies falling in this zone are most directly relevant to the competitive structuring of local communities discussed famously by Darwin (1859) and Hutchinson (e.g. 1959).

Figure 4.

Taxonomic and geographical scales of contemporary studies in phylogenetic community structure. The x-axis refers to the geographical extent of individual study plots, and the y-axis the highest taxonomic level considered in the study. The ‘Darwin–Hutchinson zone’ marks an approximate region of the plane for which we might be particularly interested in community phylogenetic structure: plots small enough for individuals to interact, and species closely enough related that competition is a plausible expectation (see Taxonomic and geographical scales for further discussion).

Of course, defining the boundaries of the Darwin–Hutchinson zone is not easy, and appropriate definitions will obviously depend on the ecology and dispersal abilities of the organisms in question. Many plant ecologists argue that plant communities are likely to be intensively competitive across wide phylogenetic distances (e.g. Cramer et al. 2007; Cahill et al. 2008), because plants tend to compete for relatively few limiting resources (Tilman 1982); however, this proposition is open to debate (Elser et al. 2007). While there are relatively few direct tests, competition-relatedness relationships do exist for at least some plant groups (e.g. Cahill et al. 2008 found a weak but significant positive relationship within monocots but not for eudicots or in a combined data set). In the end, no single delineation of a Darwin–Hutchinson zone will satisfy all ecologists, but strictly for the purpose of discussion, we suggest an arbitrary general definition in Fig. 4: taxonomic scale of family or below, and plot area less than 5 ha. Only four data sets, from three studies, fall into the Darwin–Hutchinson zone by this definition: Mouillot et al.'s (2005) study of parasites on individual roach, Slingsby & Verboom's (2006) study of schoenoid sedges in South African fynbos, and Cavender-Bares et al.'s (2004, 2006) study of Florida oaks. Of these studies, one found phylogenetic clustering (Mouillot et al. 2005), one found phylogenetic evenness (Cavender-Bares et al. 2004), and the third found evenness at the generic scale but not the tribal scale (Slingsby & Verboom 2006). (A more generous delineation of the Darwin–Hutchinson zone for plants might include two additional studies at the class level; of these, one found clustering and one evenness; Webb 2000; Webb et al. 2006). 
This sample size is, unfortunately, much too small for a formal statistical test of the prediction that evenness (as a potential consequence of competition) should be more common inside the Darwin–Hutchinson zone. An alternative approach to such an analysis is to treat taxonomic breadth and plot area as continuous variables: for taxonomic breadth, we gave ‘species’ the value 1, ‘genus’ the value 2, and so on. Using this approach, we find that taxonomic breadth is a significant predictor of study outcome, with evenness more likely on finer taxonomic scales (Table 2a). Surprisingly, plot area is not a significant predictor of study outcome, despite the argument above for the importance of local interspecific interactions. However, we caution that there are few very small plots in the data set.

Table 2.  Logistic regressions of study outcome against (a) plot area and taxonomic scale (evenness vs. other results), and (b) regional species pool richness (clustering vs. other results). Analogous analyses treating outcome as a three-state variable (evenness, clustering, or not different from random) yielded similar results.

Source              d.f.    Test statistic    P
(a) Evenness vs. other results
Taxonomic scale     1       5.90              0.015
Plot area           1       0.67              0.41
Interaction         1       0.72              0.40
(b) Clustering vs. other results
Pool richness       1       1.82              0.40

Another approach to assessing the effect of taxonomic scale on phylogenetic structuring is provided by four studies that have explicitly tested for evenness vs. clustering for taxonomically nested sets of species in the same plots. Three of these were conducted on plant communities (Cavender-Bares et al. 2006; Slingsby & Verboom 2006; Swenson et al. 2006) and revealed similar patterns: as taxonomic scale became finer, there was increasing evidence for evenness. For example, for three community types in Florida, USA, clustering was the dominant pattern observed when all plant species, all angiosperm species or all tree and shrub species were included (Cavender-Bares et al. 2006). However, analyses restricted to Quercus species found weak to significant evenness. Analyses restricted to Pinus or Ilex species did not find evenness, although this may reflect small regional species pools for these genera (N = 6 and 7). In contrast to the plant examples, analyses of bacterial communities in freshwater mesocosms consistently found clustering, whether all bacteria were considered or the three major groups (Alphaproteobacteria, Betaproteobacteria, and Cytophaga-Bacteroides-Flavobacteria) were analysed separately. This may simply mean that these major bacterial groups are still too broad for interspecific competition to be an important force in community organization, but it may also reflect recent in situ ‘biogeographical’ diversification among samples.

Further studies that explicitly address the influence of taxonomic scale on the outcome of community-structure analyses are clearly needed, as are many more studies falling in the Darwin–Hutchinson zone. Unfortunately, analyses of narrowly defined clades tend to be of low power, because in many cases few members of the clade occur in each community. This will likely represent a significant obstacle to progress on this front (Heard & Cox 2007).

Geographical scale, dispersal limitation and speciation.  The fact that few members of a clade occur in any one community indicates that one important aspect missing from most community phylogenetic analyses is an explicit consideration of the role of speciation, local dispersal limitation (see Box 4 and Fig. 3), and biogeography. In studies with very large spatial scales, phylogenetic pattern in local communities can be strongly influenced by the biogeography of speciation and how it interacts with the movement or stability of geographical ranges. On the one hand, close relatives may be unlikely to co-occur if speciation is mostly allopatric and geographical ranges are relatively stable through time (Johnson & Stinchcombe 2007), leading to evenness in local communities. In contrast, frequent sympatric speciation with similarly stable range boundaries could drive a pattern of phylogenetic clustering. An examination of the ranges of sister taxa (e.g. Barraclough & Vogler 2000) could help identify whether apparent signatures of allopatric or sympatric speciation are present, although such enterprises have been controversial for a number of reasons (Losos & Glor 2003; Bolnick & Fitzpatrick 2007).

Figure 3.

The relationship between regional and local species richness in compiled studies of phylogenetic community ecology. Studies falling between the dashed lines have mean local species richness between 30% and 60% of the total regional species richness.

Because in many groups pervasive sympatric speciation appears unlikely (Bolnick & Fitzpatrick 2007), we suspect the possibility of parallel allopatric speciation driving evenness will interest more evolutionary ecologists. However, phylogenetic community analyses often incorporate greater taxonomic scale than just a single genus or family (see Table 1), and in order for allopatric speciation and dispersal limitation to produce the pattern of significant phylogenetic evenness, parallel allopatric speciation must occur in several clades. This condition may be met when a region features discrete geographical barriers such as a mountain range that affect many clades, but we do not know whether to expect this to be common. Future studies should consider controlling for this sort of effect by building range information into randomly constructed communities (i.e. a species can only be randomly re-assigned to a plot that is within its extant range limits). Unfortunately, such methods are not readily available or currently employed (see Box 4).
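The range-constrained randomization suggested above can be sketched in a few lines. The following is only an illustration under stated assumptions, not an existing tool: the function name, the species names, and the plot layout are all hypothetical, and the sketch preserves each species' observed frequency but not plot richness, which a fuller null model might also want to hold constant.

```python
import random

def range_constrained_shuffle(occurrences, ranges, rng):
    """Build one null community matrix (hypothetical helper).

    occurrences: dict species -> number of plots the species occupies
    ranges:      dict species -> list of plot ids inside its range limits
    Returns a dict plot id -> set of species assigned to that plot.
    """
    null = {}
    for species, n_plots in occurrences.items():
        allowed = ranges[species]
        # Keep the observed frequency, but draw only plots within the range.
        for plot in rng.sample(allowed, min(n_plots, len(allowed))):
            null.setdefault(plot, set()).add(species)
    return null

# Hypothetical example: a barrier separates plots 1-3 from plots 4-6.
ranges = {"sp_A": [1, 2, 3], "sp_B": [4, 5, 6], "sp_C": [1, 2, 3, 4, 5, 6]}
occurrences = {"sp_A": 2, "sp_B": 2, "sp_C": 3}
null = range_constrained_shuffle(occurrences, ranges, random.Random(1))
```

Repeating the draw many times and recomputing the chosen structure metric for each null matrix would give an expectation that already accounts for range limits, so that any remaining evenness is less easily attributed to allopatric speciation alone.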

Large-scale range limits (under allopatric speciation) and local-scale dispersal limitation are roughly equivalent in their effects on phylogenetic community structure: both will give the appearance of evenness. Experimental evidence indicates that local dispersal limitation of individuals may be a mechanism underlying the infrequent observation of co-existence of closely related competitors, rather than competitive exclusion (e.g. Hurtt & Pacala 1995; Tofts & Silvertown 2002). Few studies have attempted to grapple with the issue of dispersal limitation as it affects phylogenetic structure, although Kembel & Hubbell (2006) offered some evidence for competition (and thus against the alternative that evenness was generated only by dispersal limitation) by examining the phylogenetic structure of plant communities at different points in succession. One tantalizing, albeit poorly replicated, observation was that older communities displayed more evenness than did younger communities, a result consistent with the idea of decreasing clustering over time, as competition differentially increases the probability of local exclusion for species with close relatives in the community. An alternative approach to the same issue might be available whenever individuals in a plot can be aged, or their longevity measured, as the competition hypothesis suggests increasing evenness in older cohorts of individuals, whereas the dispersal limitation alternative does not. We are aware of only one attempt to conduct such a test, with Webb et al. (2006) finding that nearest taxon index (NTI) decreased and NRI increased in saplings relative to seedlings in a forest in Borneo. So while it seems that, at least in some systems, seeds of closely related species do indeed arrive at the same local sites, the prevalence of the role of dispersal limitation remains largely unknown.

On even larger scales, the biogeography of speciation could also drive clustering in local communities: if a regional phylogeny includes a large geographical area with several distinct diversification locales, and biogeographical ranges are again relatively stable through time, then local communities may include species more closely related than expected by chance under random sampling from the regional phylogeny. A number of studies have reported such an effect, called ‘geographical structure’ by Pennington et al. (2006), although they have generally used metrics other than NRI/NTI (e.g. Gorman 1992; Cadle & Green 1993; Price et al. 2000; Stephens & Wiens 2004). Forest et al. (2007), for example, measured the number and identity of genera in 201 quarter-degree squares for the entire Cape region of South Africa. They were able to compute phylogenetic diversity (PD) for each quarter-degree square and compare it to that expected based solely on taxon richness. When all the quarter-degree squares were pooled, there appeared to be no large-scale pattern of either clustering (less PD than expected) or evenness (more PD than expected). However, further analyses revealed an east–west division in the distribution of PD that broadly corresponded to climate zone, with PD generally higher in the east (Forest et al. 2007, p. 757). In the western part (101 sites) of the Cape, there was evidence for a number of endemic radiations over the past 25 million years or so, resulting in phylogenetic clustering of local communities. Conversely, it was argued that the eastern part (100 sites) of the Cape has been subject to different evolutionary and palaeoclimatic processes, resulting in a prevailing pattern of phylogenetic evenness in local communities. Notably, this region appears to be influenced by another biodiversity hotspot with which it is contiguous, and which is the source of occasional genera of unusual ecotypes (Forest et al. 2007).
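The comparison of observed PD with the PD expected from taxon richness alone can be illustrated with a short sketch. It assumes the common definition of PD as the summed length of all branches connecting a set of taxa to the root; the four-taxon tree, its branch lengths, and the function names are invented for illustration and are not taken from Forest et al. (2007).

```python
import random
import statistics

# Hypothetical rooted tree: node -> (parent, length of branch to parent).
TREE = {
    "A": ("n1", 1.0), "B": ("n1", 1.0),
    "C": ("n2", 2.0), "n1": ("n2", 1.0),
    "D": ("root", 5.0), "n2": ("root", 3.0),
}
TIPS = ["A", "B", "C", "D"]

def faith_pd(taxa, tree):
    """Sum the lengths of all branches linking the taxa to the root."""
    edges = set()
    for node in taxa:
        while node in tree:        # climb until the root, which has no entry
            edges.add(node)
            node = tree[node][0]
    return sum(tree[edge][1] for edge in edges)

def expected_pd(richness, tips, tree, n_draws=999, seed=0):
    """Mean PD of random samples with the same taxon richness."""
    rng = random.Random(seed)
    return statistics.mean(
        faith_pd(rng.sample(tips, richness), tree) for _ in range(n_draws))

observed = faith_pd(["A", "B"], TREE)     # two close relatives
expected = expected_pd(2, TIPS, TREE)     # richness-matched expectation
```

A sample with less PD than the richness-matched expectation (as for the two sister taxa here) would be read as clustered, and one with more PD as even.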

To explicitly examine the effects of regional diversification on the phylogenetic structure of communities and biotas, one can deliberately analyse phylogenetic structure at a very large scale, with the species pool being (say) the continental flora and the ‘local’ sample being a regional flora (Webb et al. 2008; see also Pennington et al. 2006; Heard & Cox 2007). In such an analysis, intracontinental diversification would be observed as several small phylogenetic clusters of taxa in a regional scale sample (high NTI). On the other hand, if the taxa in regional-scale samples are evenly distributed on the pool phylogeny (low NTI), then it might indicate either extensive regional competitive exclusion or the persistent signal of allopatric speciation in all clades. (The latter, though, may be unlikely given repeated mixing of species ranges at continental scales.) Of course, such studies are really asking questions about the origins of diversification, and have moved away from questions about local-scale community structure. The implication for local-scale studies, though, is that the use of a regional pool that is too large compared to the local samples is likely to be a problem for studies that do want to explain local community organization, because phylogenetic clustering actually caused by in-situ diversification might be interpreted as habitat filtering. The vast majority of studies in our compilation, however, have total study ranges < 10⁶ km² (Table 1), so we suspect that such continental-scale biogeographical artefacts are not a common problem with the current phylogenetic community structure literature.
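For readers less familiar with the indices, the logic of NRI and NTI can be sketched as follows. The sketch assumes the usual standardized-effect-size form (observed mean pairwise distance, or mean nearest-taxon distance, compared with random draws from the pool, with the sign chosen so that positive values indicate clustering) and a simple equiprobable-draw null model, which is only one of several in use. The eight-species pool and its distances are hypothetical.

```python
import itertools
import random
import statistics

def mpd(taxa, dist):
    """Mean pairwise phylogenetic distance among the taxa in a sample."""
    return statistics.mean(
        dist[frozenset(pair)] for pair in itertools.combinations(taxa, 2))

def mntd(taxa, dist):
    """Mean distance from each taxon to its nearest relative in the sample."""
    return statistics.mean(
        min(dist[frozenset((a, b))] for b in taxa if b != a) for a in taxa)

def standardized_index(sample, pool, dist, metric, n_null=999, seed=0):
    """-(observed - null mean) / null SD; NRI with mpd, NTI with mntd."""
    rng = random.Random(seed)
    observed = metric(sample, dist)
    null = [metric(rng.sample(pool, len(sample)), dist) for _ in range(n_null)]
    return -(observed - statistics.mean(null)) / statistics.stdev(null)

# Hypothetical pool: two clades ("a" and "b") of four species each.
POOL = ["a1", "a2", "a3", "a4", "b1", "b2", "b3", "b4"]
DIST = {frozenset((x, y)): (2.0 if x[0] == y[0] else 10.0)
        for x, y in itertools.combinations(POOL, 2)}

nri_clustered = standardized_index(["a1", "a2", "a3", "a4"], POOL, DIST, mpd)
nti_clustered = standardized_index(["a1", "a2", "a3", "a4"], POOL, DIST, mntd)
nri_even = standardized_index(["a1", "a2", "b1", "b2"], POOL, DIST, mpd)
```

In this toy pool a sample drawn entirely from one clade yields positive NRI and NTI, whereas a sample split equally between the clades yields a negative NRI.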

Overall, these ideas suggest that (i) a more explicit exploration of geographical scale would be rewarding, and (ii) the creation of the species list for the regional phylogeny, including delineation of the overall spatial extent of the study, should not be arbitrary. For example, in tropical rain forests, direct resource competition among seedlings probably occurs on scales of less than 1 m, pathogen-associated density and phylodiversity dependence occur at larger scales (10 to 1000 m), drought-related habitat differentiation occurs in ridge/valley systems over scales of 50 to 500 m, and for some plant taxa all these scales are linked by vertebrate seed dispersal over scales of 10 m to 5 km (Webb & Peart 2000; Webb et al. 2006). At still larger scales, the extent at which plots may sample different regional diversifications will depend on factors such as range sizes, climatic and geographical restrictions on historical range movement, and again the dispersal biology of the clades in question.

Species pools and the power to detect patterns.  Another consequence of the way plot size and study area are defined is extensive variation in the ratio of local richness to the size of the species pool. A number of studies have called attention to this ratio (Swenson et al. 2006; Kraft et al. 2007; Valiente-Banuet & Verdú 2007), and it has been argued that studies with ‘intermediate’ values will have the greatest statistical power to detect phylogenetic community structure. Based on simulation results, Kraft et al. (2007) concluded that local communities that ranged from ~30–60% of the regional pool would likely afford the greatest power. The probability of type II errors increased both for very small communities and for communities nearly as species rich as the regional pool, because of increased sampling uncertainty in either case. In our survey, local communities with 30–60% of source-pool diversity were uncommon, with only 7 of 30 analyses for which we could obtain both local and regional species numbers falling in this range (Table 1, Fig. 3). Interestingly, the proportion of analyses that found no pattern overall was higher among studies within this range (43%) than among those outside this range (13%). This difference is not significant (Fisher exact test: P = 0.12), but there are very few published ‘no pattern’ cases (N = 6), perhaps due to a ‘file-drawer effect’. Furthermore, the incidence of finding significant patterns was very high for analyses in which the local species pool represented a relatively small fraction (i.e. < 30%) of the regional species pool, with only two (10%) analyses finding no pattern (yeast in Opuntia cacti: Anderson et al. 2004; thicket communities in South Africa: Procheş et al. 2006). The conclusions regarding variation in local community size by Kraft et al. (2007) were based on holding regional pool size constant, which may partially explain the discrepancy between their simulations and empirical findings.
Perhaps unsurprisingly, analyses in which the local species pool represents a large fraction (i.e. > 60%) of the regional species pool are rare (points in the leftmost region of Fig. 3). All three entries in this category were temperate plant communities (coastal and montane Mediterranean vegetation: Verdú & Pausas 2007; meadow plants: Silvertown et al. 2006).
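The Fisher exact test reported above can be reproduced from the counts implied by the text. In the sketch below the 2 × 2 table is an inference rather than a reported table: 43% of the 7 analyses inside the 30–60% window and 13% of the 23 outside it each correspond to 3 ‘no pattern’ results, matching the stated total of N = 6. The function name is hypothetical, and the two-sided P-value is computed in the usual way, by summing the probabilities of all tables no more likely than the observed one.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))

# Rows: inside vs. outside the 30-60% window; columns: no pattern vs. pattern.
p_value = fisher_exact_two_sided(3, 4, 3, 20)
print(round(p_value, 2))  # 0.12
```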

Kraft et al. (2007) also investigated the effects of regional pool size, at given community size, on power to detect patterns in phylogenetic community structure. Using simulations, they generated communities structured either by habitat filtering or by limiting similarity. Based on their results, Kraft et al. (2007) suggested that patterns created by habitat filtering will be easier to detect with larger pool sizes, whereas the opposite pattern is expected for those created by limiting similarity. Insofar as increased phylogenetic scope in a regional pool will increase the probability that distantly related clades will converge on trait values by chance, this could increase the likelihood of competitive exclusion of distantly related species (i.e. phylogeny-based clustering could result from trait-based evenness). Thus, they expected that phylogenetic clustering might be more common in communities with large regional pools, especially tropical forests. In our compilation, we found only a very weak, nonsignificant trend towards such an effect (Table 2b).

Geography and niche ‘traits’: α vs. β niches

Discussions of the role of geography in community structuring are not complete without a consideration of α vs. β niches (Ackerly et al. 2006; Silvertown et al. 2006). Although a precise definition is elusive, ‘α niche’ has been used to denote niche axes that differ among co-occurring species within a habitat (e.g. feeding preference), whereas ‘β niche’ is used for those axes that separate taxa into spatially distinct habitats (e.g. conifer vs. deciduous forest for some bird groups). Some authors have suggested that the α niche shows significant evolutionary lability, whereas β niche tends to be evolutionarily conserved (e.g. Emerson & Gillespie 2008). However, Ackerly et al. (2006) found α niche traits to be conserved, likely contributing to the overall result that Ceanothus communities are composed of species more distantly related than expected by chance. For Ceanothus, they considered the α niche to be the microhabitat, with relevant traits including those involved in fire response or water stress tolerance. They considered the β niche to be the ‘macrohabitat’ (specifically, edaphic conditions, forest type, precipitation and temperature) but noted (Ackerly et al. 2006, p. S51) that ‘ambiguity over the use of the word habitat is unfortunate, as there is a substantive difference between these models in their emphasis on large-scale habitat differences, implying allopatric populations, vs. microhabitat differentiation within local communities.’ Regardless of the lability of traits relevant to either niche class, there are some outstanding issues. For example, how can we assess whether a particular trait is related to a species’ α or β niche a priori (e.g. before testing for evolutionary conservatism)? Although the β niche will typically vary over larger spatial scales, some traits will no doubt be in a gray area between composing part of a species’ α or β niche.
Further examination of these factors is warranted, as direct examination of the clustering or evenness of the niche components as well as that of the species themselves should provide insight into the mechanisms determining phylogenetic community structure (see Sargent & Ackerly 2008). Tools for such analyses are newly available (e.g. the ‘picante’ package for R; Kembel et al. 2008) but, to our knowledge, have yet to be used.

Experimental evidence

Given the many confounding factors potentially affecting interpretation of phylogenetic community structure in natural systems, experimental approaches to examining the causes and consequences of nonrandom structure should be attractive. However, to date, such experiments are rare. Indeed, we are aware of only two relevant studies (Maherali & Klironomos 2007; Cahill et al. 2008).

Maherali & Klironomos (2007) asked whether local interactions could generate phylogenetic evenness in communities of arbuscular mycorrhizal fungi (AMF) associated with roots of the narrowleaf plantain (Plantago lanceolata L.). Spatial resource use for these fungi was a phylogenetically conserved trait, with three families occupying distinct regions of the plantain root. Maherali & Klironomos (2007) constructed eight experimental communities with varying numbers of species from the three families in order to produce three degrees of phylogenetic relatedness (PR). In three high PR replicates, all species were drawn from a single AMF family. Two intermediate PR replicates contained different combinations of species from two AMF families, and three low PR replicates contained species from all three AMF families. After 1 year, low PR replicates retained > 80% of the initial species pool on average, whereas high PR replicates retained < 40% of the initial species richness. These experiments, obviously, rule out dispersal limitation and implicate local interactions (possibly competition between species with high niche overlap) as responsible for generating phylogenetic community structure.

Most recently, Cahill et al. (2008) tested the proposition that competition is strongest between phylogenetically related plant species. This test used a meta-analysis of five experimental studies that measured the relative competitive ability of focal species grown together with various competitors. Although these studies had been conducted without reference to phylogenetic relatedness, Cahill et al. (2008) were able to assemble a relatively well-resolved phylogeny for the 142 species included in the five studies. With all studies pooled, they found only a weak, nonsignificant relationship between the strength of competition and relatedness. They also conducted separate analyses within eudicots and within monocots, and found a significant competition-relatedness relationship in the latter group but not the former. This difference may have arisen from a difference in sampling: among the monocots, a number of confamilial genera (e.g. Carex, Eleocharis, Scirpus) were represented by five or more species, whereas the average relatedness was lower and representation sparser among the eudicots. This approach is a promising one, and it is hoped that further such analyses will shed light on the relative predictive value of phylogenetic relatedness (vs. functional traits) in plants and other taxa.

While these contributions shed light on the extent to which phylogenetic relatedness predicts the ability of species to co-exist, they do not directly address the mechanisms underlying the types of patterns being reported in comparative studies. In addition to manipulating initial combinations of species based on relatedness, we advocate experiments that recreate conditions that have been predicted to lead to clustering vs. evenness. For example, given a group of species with short generation times and phylogenetic conservatism of traits, one could impose two (or more) environmental treatments. A ‘stressful’ treatment (e.g. high soil pH or frequent disturbances) would be predicted to lead to more phylogenetic clustering, while a ‘benign’ treatment (e.g. high resource levels and no disturbances) would be predicted to lead to more evenness. Such an experiment might be most powerful if initial replicate communities resembled the intermediate PR replicates of Maherali & Klironomos (2007), so that deviations from initial conditions were detectable in both directions.

Incorporating abundances in studies of clustering and evenness

The great majority of studies investigating phylogenetic clustering vs. evenness have focused on presence or absence of species in the studied communities. However, presence/absence data are likely to miss a great deal of ecologically interesting pattern. In part, this is because presence/absence data are very sensitive to the chance and perhaps temporary occurrence of a single individual in a habitat or competitive situation that is actually unsuitable. More importantly, the interspecific-competition logic underlying interest in community evenness is based on interactions among individuals that are cumulative in their effects. For instance, the presence of a single individual of competitor A may have negligible effects on the population dynamics of competitor B, even when a dense population of competitor A rapidly drives competitive exclusion. For this reason, the incorporation of species abundances into phylogenetic community structure seems an important goal.

To our knowledge, only three studies have examined phylogenetic clustering vs. evenness in the context of species abundances, perhaps because the dominant software package for such analyses (Phylocom, Webb et al. 2008) has only very recently included an option for abundance weighting. Anderson et al. (2004) asked whether yeast species that were jointly common in cactus rots (rather than simply co-occurring) were phylogenetically clustered or even, finding evenness in rots of columnar cacti, but neither clustering nor evenness in rots in Opuntia cacti. Hardy & Senterre (2007) derived a set of abundance-based phylogenetic diversity metrics based on Simpson diversity and allowing a local-vs.-regional decomposition analogous to population-genetic FST (see also Box 1). They applied their methods to tree diversity data from a rain forest in Equatorial Guinea, finding phylogenetic clustering that was only marginally significant for metrics based on presence/absence data (their ΠST) but highly significant for metrics incorporating abundances (their PST). Finally, Helmus et al. (2007a) offered a different method for incorporating abundances into measures of phylogenetic diversity, and applied their method to fish communities in Wisconsin, USA. Like Hardy & Senterre (2007), Helmus et al. (2007a) found that incorporating abundances changed the strength of measured phylogenetic clustering, but this time a pattern of clustering was stronger when based on presence/absence data [their PSV (phylogenetic species variability)] than when abundances were incorporated [their PSE (phylogenetic species evenness)]. Unfortunately, with just three studies and no two using a common analytical approach, we cannot draw any useful generalizations about the incorporation of abundance data, except that it surely ought to be carried out more often. Abundance data will be equally relevant to other kinds of phylogenetic diversity calculations. For instance, Lozupone et al. (2007) included abundances in calculations of phylogenetic β diversity. However, surprisingly few such efforts have been made. Of course, raw abundances are only one way of expressing the importance of species in local communities, and similar analyses using instead biomass, nutrient uptake rates, or other process-based measures of importance will likely be rewarding.
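To see how abundance weighting can change the apparent structure of a community, consider the following sketch. It uses a generic abundance-weighted mean pairwise distance, in which each species pair is weighted by the product of the two abundances; this is an illustrative choice and not the metric of any of the studies cited above, and the three-species community and its distances are hypothetical.

```python
import itertools

def mpd_presence(taxa, dist):
    """Unweighted mean pairwise distance (presence/absence data)."""
    pairs = list(itertools.combinations(taxa, 2))
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)

def mpd_abundance(abund, dist):
    """Mean pairwise distance with each pair weighted by n_i * n_j."""
    pairs = list(itertools.combinations(abund, 2))
    weights = [abund[a] * abund[b] for a, b in pairs]
    total = sum(w * dist[frozenset(pair)] for w, pair in zip(weights, pairs))
    return total / sum(weights)

# Hypothetical community: two abundant close relatives and one rare outlier.
DIST = {frozenset(("a1", "a2")): 2.0,
        frozenset(("a1", "b1")): 10.0,
        frozenset(("a2", "b1")): 10.0}
ABUND = {"a1": 50, "a2": 50, "b1": 1}

unweighted = mpd_presence(list(ABUND), DIST)
weighted = mpd_abundance(ABUND, DIST)
```

Here a single individual of a distantly related species inflates the presence/absence measure, whereas the weighted measure is dominated by the two abundant close relatives, so the same community looks considerably more clustered once abundances are considered.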

Phylogenetic community structure beyond clustering and evenness

Most of the preceding discussion deals with just one aspect of phylogenetic community structure: whether local communities show evenness or clustering compared to a random-sampling null model. This reflects an emphasis in the literature on this question, with comparatively less effort applied to other ways in which phylogenetic information might illuminate questions about community structure. Such an emphasis is not surprising, for two reasons. First, the evenness/clustering question is a natural refinement of questions that were central to community ecology long before the phylogenetic dimension could be feasibly added (see Overview of phylogenetic community structure analysis). Second, the progress of inquiry in a field is often dependent on the availability of easily used analytical tools — and available software such as Phylocom (Webb et al. 2008) makes analysis of evenness vs. clustering straightforward given appropriate data.

Despite this emphasis, the contribution of the phylogenetic perspective to community ecology has not been limited to evenness vs. clumping, and phylogenetic information can improve our answers to other questions about community structure. For example, Lozupone et al. (2007) incorporated phylogenetic data into calculations of bacterial β diversity, whereas Weiblen et al. (2006) used phylogenetic data to refine estimates of the effect of host-plant relatedness on similarity in herbivore communities. A number of studies have asked whether convergence in local community structure results from convergence in community assembly from a source pool of old and ecomorphologically stable lineages (e.g. Stephens & Wiens 2004; Kozak et al. 2005) or from recent convergent evolution of species to fill similar niches in each community (e.g. Losos et al. 1998; Winemiller 1991).

While the preceding examples involve cases where phylogenetic information enhances our ability to answer a pre-existing question, we suspect that phylogenetic perspectives can also lead to novel ways of thinking about communities. For example, Heard & Cox (2007) introduced the concept of local diversity skewness. Diversity skewness is a measure of unevenness in biodiversity among subclades of a larger clade; on global scales, there is a strong and widespread tendency for some subclades to be much more diverse than others (for instance, beetles vs. mayflies among insects, or passerines vs. penguins among birds; Heard 1992; Mooers & Heard 1997). It is easily quantified using metrics of phylogenetic tree shape, such as tree imbalance (Ic; Fig. 5). Heard & Cox (2007) asked whether diversity skewness might also be a property of local communities or regional biotas, and in parallel with the logic for clustering vs. evenness they compared diversity skewness for the phylogeny represented in a local community to that expected for communities randomly sampled from a regional source pool (Fig. 5). In a local community showing significantly high skewness, diversity is dominated to an unexpected degree by members of one or a few subclades of the larger clade being studied; and in a community showing significantly low skewness, subclades are surprisingly even in their local representation. Local signatures in diversity skewness could arise for either biogeographical or ecological reasons (Heard & Cox 2007). For example, if niche breadth is phylogenetically conserved, then in competitively structured communities species packing during community assembly could favour high local skewness. This effect would arise because lineages with narrow niches could be tightly packed in niche space and thus heavily represented in local communities, while lineages with broad niches would experience competitive exclusion and remain locally depauperate.

Figure 5.

Phylogenetic tree imbalance and local skewness. In this hypothetical example, a clade's global phylogeny (left) includes 10 species, of which 4–5 are represented in each of three local communities. Local diversity skewness is well measured by the imbalance (Heard 1992) of the local phylogenies; using the conventional metric Ic, communities 1 and 2 have high diversity skewness (Ic = 1), while community 3 has low diversity skewness (Ic = 0). Expected local skewness would be calculated by repeatedly sampling species from the global phylogeny and calculating local skewness for each sample.
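The resampling logic described in this caption can be sketched in a few lines of Python. This is a minimal illustration only, not the SkewMatic implementation: the nested-tuple tree encoding, the function names, and the 10-species pool below are all hypothetical (the pool is not the topology drawn in Fig. 5), and the local phylogeny is assumed to be the pool phylogeny pruned to the locally present species. Ic is computed as the summed difference in tip numbers between the two daughters of every internal node, divided by its maximum, (n − 1)(n − 2)/2.

```python
import random

def tips(tree):
    """Return the tip labels of a binary tree written as nested 2-tuples."""
    if not isinstance(tree, tuple):
        return [tree]
    return tips(tree[0]) + tips(tree[1])

def prune(tree, keep):
    """Restrict the tree to the tips in `keep`; None if no tip survives."""
    if not isinstance(tree, tuple):
        return tree if tree in keep else None
    left, right = prune(tree[0], keep), prune(tree[1], keep)
    if left is None:
        return right
    if right is None:
        return left
    return (left, right)

def colless_ic(tree):
    """Normalized Colless imbalance: 0 = fully balanced, 1 = fully pectinate."""
    def walk(node):
        # Returns (number of tips, summed |nL - nR|) for the subtree.
        if not isinstance(node, tuple):
            return 1, 0
        n_left, imb_left = walk(node[0])
        n_right, imb_right = walk(node[1])
        return n_left + n_right, imb_left + imb_right + abs(n_left - n_right)
    n, total = walk(tree)
    if n < 3:
        raise ValueError("Ic is undefined for fewer than three tips")
    return total / ((n - 1) * (n - 2) / 2)

def null_ic(pool_tree, n_local, n_draws=999, seed=0):
    """Ic for communities of n_local species drawn at random from the pool."""
    rng = random.Random(seed)
    pool = tips(pool_tree)
    return [colless_ic(prune(pool_tree, set(rng.sample(pool, n_local))))
            for _ in range(n_draws)]

# Hypothetical 10-species pool phylogeny (an assumption for illustration).
POOL = ((((("A", "B"), "C"), "D"), "E"), ((("F", "G"), ("H", "I")), "J"))

skewed = colless_ic(prune(POOL, {"A", "B", "C", "D"}))    # pectinate local tree
balanced = colless_ic(prune(POOL, {"F", "G", "H", "I"}))  # symmetric local tree
null = null_ic(POOL, n_local=4)
# Fraction of random communities at least as skewed as the observed one.
frac_as_skewed = sum(x >= skewed for x in null) / len(null)
```

Comparing an observed Ic with the distribution returned by `null_ic` parallels the randomization tests used for clustering and evenness; how the pool is defined matters just as much here as it does there.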

Heard & Cox (2007) demonstrated their analysis with a study of African and South American primate assemblages. They found that continental primate faunas had higher diversity skewness than expected for random samples from the global primate phylogeny, but that local primate assemblages did not differ in skewness from the expectation for random samples from continental faunas. No other similar analysis has yet been completed.

Interestingly, although the competitive-exclusion rationales we have offered for local evenness and local diversity skewness are obviously connected, and although the two sets of metrics require essentially the same data to calculate, nothing is known of the covariation (if any) between these aspects of phylogenetic community structure. Now that software is freely available to conduct both sorts of analyses (Phylocom, Webb et al. 2008; SkewMatic, Heard & Cox 2007), we anticipate studies that examine both skewness and clustering and shed light on relationships between them. We do not expect that diversity skewness is the only novel phylogenetic property of communities worthy of study, and as phylogenetic perspectives become more commonplace among community ecologists, we look forward to new and unexpected insights.

Future directions and considerations

In addition to the outstanding issues discussed earlier (notably accounting for geographical scale and species abundance, and extending our concept of phylogenetic community structure beyond evenness and clustering), we would like to highlight three other areas for consideration. The first is a simple data-reporting issue. In our attempt to analyse trends in the phylogenetic community structure literature, we found summary statistics for the following variables to be especially scarce: (estimated) age of most recent common ancestor, total area encompassed by study, (average) plot area, number of plots, total number of species, and mean number of species per plot. Future studies should present such data, perhaps in a summary table or appendix.

Curiously, no studies have simultaneously investigated phylogenetic structuring in members of two or more interacting trophic levels co-existing in the same localities. There remains the possibility that the imprint of trophic cascades, reciprocal co-evolution, or sympatric diversification via host shifts will be detectable in matching or contrasting patterns in linked trophic levels. Weiblen et al. (2006) took a notable step in this direction with a study of host use by herbivores with respect to host plant phylogeny in a rain forest in New Guinea. Herbivore species tended to exhibit phylogenetic clustering with regard to host plant use, such that closely related herbivore species fed on more closely related plants than predicted by chance. However, Weiblen et al. were not able to include phylogenetic structure of the herbivore community in their study, and as they readily admit (Weiblen et al. 2006, p. S70), their set of herbivores and host plants may not even be representative of forests at their study site, let alone plant–insect communities in general. Nevertheless, we hope this study will spur further investigations, perhaps in less species-rich settings (e.g. temperate forests) where more complete sampling might be feasible.

Finally, researchers will have to tackle head-on the joint influences of the mode of speciation and the ecological similarity of the resulting species on patterns in phylogenetic community structure. McPeek (2007), for example, noted that different models of speciation impact how ecologically different sister species will be (see also Mooers et al. 1999; Schluter 2000). Considering the interplay between regional diversification and local community processes is a relatively nascent field of study (McPeek 2007; Ricklefs 2007) worthy of further attention.


While we admit that the taxonomic focus of most studies of phylogenetic community structure is far from representative of the diversity of life on earth, the high frequency of discovery of phylogenetic clustering of local communities against larger species pools suggests at least a common role for filtering of species into local habitats based on conserved ecological characters. However, the complexities of defining communities, both taxonomically and spatially, and of abstracting out patterns, obscure our ability to draw any stronger conclusions. Nevertheless, we have attempted in this review to synthesize and expose key issues that must be addressed in further studies, and we anticipate that as the field develops, new methods, cautions and insights will reveal with increasing clarity the interplay of evolution and ecology.


  • 1

    The phrase ‘Darwin-Hutchinson zone’ is not our invention. It was introduced in this context by Michael Donoghue (unpublished), although the placement of its boundaries in our Fig. 4 is our responsibility.

  • Box 1 Measuring phylogenetic community structure

    Comparison of metrics of phylogenetic community structure.

    | Method | Base measure | Intrasample metric | Intersample metric | Significance testing | Abundance data possible | Software |
    |---|---|---|---|---|---|---|
    | Subtended branch lengths | PD | PD | N/A | Sample/phylogeny randomization (10) | No | ape (R); Phylocom |
    | Mean phylogenetic distance | MPD | NRI | comdist (4) | Sample/phylogeny randomization | No (Yes; 1) | Phylocom (3); picante (R; 8) |
    | Mean nearest taxon phylogenetic distance | MNND | NTI | comdistnn | Sample/phylogeny randomization | No (Yes; 1) | Phylocom (3); picante (R; 8) |
    | Phylogenetic Simpson diversity (2) | D (= MPD) | [inline image] | PST, ΠST | Sample/phylogeny randomization | Yes | (6); Phylocom (5) |
    | Variability in neutral trait (9) | PSV, PSR, PSE | PSV, PSR, PSE | PSV, PSR, PSE | Sample randomization | Yes | Matlab code |
    | Phylogenetic distance vs. co-occurrence correlation (7) | r² | N/A | r² | Sample randomization; Mantel test | Yes (11) | ecophyl (12); ape and vegan (R) |

    Notes: (1) Generalized form since Phylocom version 4.0; (2) Hardy & Senterre 2007; (3) Webb, Ackerly & Kembel 2008; (4) Webb et al. (2008); (5) PST only; (6) Contact Hardy; (7) Steers 2001; Cavender-Bares et al. 2004; (8) Kembel et al. 2008; (9) Helmus et al. 2007a, b; (10) Procheş et al. 2006; (11) depending on measure of co-occurrence used; (12) Cavender-Bares & Lehman 2007.

    Single dimensional indices

    Because adding species into a sample alters the topology of the phylogenetic network joining them, most ‘raw’ metrics are correlated with the taxon richness of a sample. Various standardized metrics have thus been developed to enable comparisons of phylodiversity or phylogenetic structure among samples of different species richness; that is, ‘standardized effect sizes’ (Gotelli & Rohde 2002; Kembel & Hubbell 2006).

    PD: developed originally to quantify the phylogenetic uniqueness of samples of taxa for conservation (e.g. for comparing parks), the phylogenetic diversity (PD) of a sample is the sum of branch lengths of all internodes traversed between the root and all $S_{\mathrm{sample}}$ species on a phylogeny of a larger pool (Faith 1992). With increasing $S_{\mathrm{sample}}$, PD will necessarily increase.

    Standardized PD: the PD metric of a particular community compared to randomized null communities of the same size, as used by Procheş et al. (2006).

    NRI, NTI, etc.

    Mean pairwise distance (MPD): the mean distance between each of the $S_{\mathrm{sample}}$ taxa and every other terminal in the sample. The response of MPD to increasing $S_{\mathrm{sample}}$ depends upon the balance of the tree.

    Mean nearest neighbour distance (MNND; aka MNTD): the mean distance between each of the $S_{\mathrm{sample}}$ taxa and its own most closely related terminal taxon in the sample. With increasing $S_{\mathrm{sample}}$, MNND will decrease, because as more species are included in a local community, most additional species will be a close relative of at least one of the already-sampled species.

    Net relatedness index (NRI): MPD standardized against the expectation for random draws of the same number of species from the same pool phylogeny, with the sign inverted (high NRI is high clustering; Webb 2000; Webb et al. 2002). NRI is closely correlated with standardized PD.

    Nearest taxon index (NTI): similarly standardized MNND. It may or may not be correlated with NRI, depending on the ‘clumping of clustering’: a single cluster of sample taxa on the pool phylogeny should give high NTI and NRI, whereas several clusters evenly distributed around the tree may give a high NTI with low NRI. NRI and NTI are in units of standard deviation, and if the simple null model used to derive these metrics is appropriate, the significance of a pattern is contained in the value of the metrics themselves (< –1.96 is significantly even and > 1.96 is significantly clustered).

    $I_{ST}$, $P_{ST}$, and $\Pi_{ST}$: Hardy & Senterre (2007) developed a set of metrics intended to partition phylogenetic community structure into components of diversity manifest within sites (α diversity) and across sites (β diversity). These metrics are based on computation of Simpson diversity, which can be generalized to include phylogenetic information: $D^{P}_{\cdot} = \sum_i \sum_j \delta_{ij} f_i f_j$, where $f_i$ and $f_j$ are the frequencies of species $i$ and $j$ in the community, and $\delta_{ij}$ is the phylogenetic distance between them. If $\delta_{ij} = 1$ when $i \neq j$ and 0 otherwise, $D^{P}_{\cdot}$ reduces to $D^{I}_{\cdot}$, the standard Simpson diversity based on species identity. These can be calculated within sites (subscript S) or for all sites combined (subscript T). $I_{ST}$ and $P_{ST}$ then express the proportion of identity-based and phylogenetic diversity (respectively) that is expressed among sites, by analogy with $F_{ST}$ from population genetics: $I_{ST} = (D^{I}_{T} - D^{I}_{S})/D^{I}_{T}$ and $P_{ST} = (D^{P}_{T} - D^{P}_{S})/D^{P}_{T}$. Finally, working with species incidence rather than species abundances leads to a metric $\Pi_{ST}$ that is analogous to $P_{ST}$ but ignores relative abundance. $\Pi_{ST}$, while derived differently, has an interpretation similar to that of NRI: $\Pi_{ST} = 0$ in the absence of phylogenetic community structure, $\Pi_{ST} > 0$ with phylogenetic clustering, and $\Pi_{ST} < 0$ with phylogenetic evenness.

    PSV, PSR, and PSE: Helmus et al. (2007a) took a quite different approach to quantifying a community's phylogenetic species diversity, beginning with PSV, or ‘phylogenetic species variability’. PSV measures the expected among-species variance in a hypothetical trait that evolved neutrally along the branches of a community phylogeny, standardized by the largest possible such variance (for a star phylogeny). When some species are more closely related than in a star phylogeny, this variance is reduced, and so independently of any particular trait, PSV is a metric that quantifies phylogenetic relatedness among species in a community. PSV is closely but inversely related to NRI, with phylogenetic clustering producing smaller values of PSV and phylogenetic evenness larger values. Two further metrics can be derived based on PSV. PSR, or ‘phylogenetic species richness’, is simply species richness multiplied by PSV (that is, penalized for close relatedness of component species). PSE, or ‘phylogenetic species evenness’, is a version of PSV that measures expected trait variance across individuals rather than species, and thus incorporates species abundances.

    Correlations of pairwise taxon co-occurrence distance and phylogenetic distance: phylogenetic clustering is evident when closely related taxa co-occur more often than taxa that are less closely related, while phylogenetic evenness is seen in closely related taxa co-occurring less often than less closely related taxa. By correlating pairwise taxon co-occurrence measures (Schoener 1970) with phylogenetic distance (e.g. Steers 2001; Cavender-Bares et al. 2004; Slingsby & Verboom 2006), the sign of this relationship can be assessed.

    Because a phylogeny is a complex structure, a single scalar metric cannot completely characterize all aspects of phylogenetic community structure, and different metrics will be sensitive to different aspects of community structure. Other metrics of phylogenetic diversity (and thus phylogenetic community structure) have also been proposed (Clarke & Warwick 1998; Shimatani 2001; Barker 2002; Ricotta 2004).

    Multidimensional indices

    Randomization tests generally compare the observed value of a unidimensional metric to a null distribution. Perhaps because searching for nonrandom phylogenetic structure in communities has been the goal of most of the studies to date, less consideration has been given to multidimensional metrics that are more informative but harder to interpret. Examples of such metrics would include position in a two-dimensional space of NRI and NTI, which would indicate the ‘clumping of clusters’ (see above), or NRI and diversity skewness (Heard & Cox 2007). The significant over- or under-representation of samples subtending to each node in a phylogeny can be assessed (e.g. using ‘nodesig’ in Phylocom, Webb et al. 2008, or by constrained phylogeny randomization, Hardy & Senterre 2007). This highly multidimensional representation of phylogenetic structure (e.g. a vector of significance values of length n = S − 1, where there are n internal nodes in a fully resolved phylogeny with S species) can be examined to aid in interpretation, and could also be compared directly between places and over time.
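As a concrete illustration of how MPD, MNND, NRI and NTI relate to one another, the following Python sketch computes them from a pairwise phylogenetic distance matrix. It is a minimal sketch, not Phylocom's or picante's implementation: the function names, the eight-taxon pool, and the distances are hypothetical, and the null model is assumed to be the simplest one described above (random draws of the same number of taxa from the pool).

```python
import random
from itertools import combinations
from statistics import mean, pstdev

def mpd(dist, sample):
    """Mean pairwise phylogenetic distance among the sampled taxa."""
    return mean(dist[a][b] for a, b in combinations(sample, 2))

def mnnd(dist, sample):
    """Mean distance from each sampled taxon to its nearest sampled relative."""
    return mean(min(dist[a][b] for b in sample if b != a) for a in sample)

def standardized_index(metric, dist, sample, pool, n_draws=999, seed=0):
    """-(observed - null mean) / null SD; positive values indicate clustering.

    Assumes the null SD is non-zero, i.e. random draws differ in the metric.
    """
    rng = random.Random(seed)
    null = [metric(dist, rng.sample(pool, len(sample))) for _ in range(n_draws)]
    return -(metric(dist, sample) - mean(null)) / pstdev(null)

# Hypothetical pool: two clades of four taxa; distance 2 within a clade,
# 10 between clades (illustrative values only).
POOL = ["a1", "a2", "a3", "a4", "b1", "b2", "b3", "b4"]
DIST = {a: {b: (0 if a == b else 2 if a[0] == b[0] else 10) for b in POOL}
        for a in POOL}

nri_one_clade = standardized_index(mpd, DIST, ["a1", "a2", "a3"], POOL)
nti_one_clade = standardized_index(mnnd, DIST, ["a1", "a2", "a3"], POOL)
nri_two_clades = standardized_index(mpd, DIST, ["a1", "b1"], POOL)
```

With these toy distances, a community drawn from a single clade yields positive NRI and NTI, whereas a community with one member from each clade yields a negative NRI. Passing a different `pool` to `standardized_index` is the code-level equivalent of redefining the regional species pool.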

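The Simpson-based partition in Box 1 can likewise be sketched in Python. This is a simplified plug-in calculation under stated assumptions, not the estimators published by Hardy & Senterre (2007), which may differ in detail (for example in bias corrections and site weighting): within-site diversity is taken as the unweighted mean across sites, total diversity is computed from site-averaged frequencies, and the identity-based version uses a distance of 1 between different species and 0 between a species and itself. All names and data below are hypothetical.

```python
def simpson(freqs, delta):
    """Generalized Simpson diversity: sum_i sum_j delta(i, j) * f_i * f_j."""
    return sum(delta(i, j) * fi * fj
               for i, fi in freqs.items() for j, fj in freqs.items())

def relative(counts):
    """Convert abundance counts to relative frequencies."""
    total = sum(counts.values())
    return {sp: c / total for sp, c in counts.items()}

def among_site_fraction(sites, delta):
    """(D_T - D_S) / D_T: proportion of diversity expressed among sites."""
    site_freqs = [relative(counts) for counts in sites]
    d_s = sum(simpson(f, delta) for f in site_freqs) / len(site_freqs)
    species = {sp for f in site_freqs for sp in f}
    pooled = {sp: sum(f.get(sp, 0.0) for f in site_freqs) / len(site_freqs)
              for sp in species}
    d_t = simpson(pooled, delta)
    return (d_t - d_s) / d_t

def identity(i, j):
    """Identity-based distance: 1 between different species, 0 otherwise."""
    return 0.0 if i == j else 1.0

def phylo(i, j):
    """Hypothetical phylogenetic distance: 2 within a clade, 10 between clades."""
    if i == j:
        return 0.0
    return 2.0 if i[0] == j[0] else 10.0

# Two hypothetical arrangements of the same four species across two sites.
sorted_sites = [{"a1": 5, "a2": 5}, {"b1": 5, "b2": 5}]  # one clade per site
mixed_sites = [{"a1": 5, "b1": 5}, {"a2": 5, "b2": 5}]   # both clades per site

ist_sorted = among_site_fraction(sorted_sites, identity)
pst_sorted = among_site_fraction(sorted_sites, phylo)
ist_mixed = among_site_fraction(mixed_sites, identity)
pst_mixed = among_site_fraction(mixed_sites, phylo)
```

In this toy example the identity-based fraction is the same for both arrangements, because the sites share no species either way. The phylogenetic fraction exceeds it when each site holds a single clade and falls below it when each site mixes clades.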

  • Box 3 Clustering in predaceous diving beetles

    Vamosi & Vamosi (2007) surveyed diving beetle communities in 53 lakes in Alberta, Canada. The 106 species found in the 53 lakes represent over two-thirds of the approximately 147 species recorded in the province [and approximately 84% (21 of 25) of the genera]. Sixteen lakes exhibited significant clustering (i.e. NRI > 1.96) and a further 21 lakes had NRI values > 0, with no lakes exhibiting significant evenness. These data were from historical and museum collections, but contemporary sampling has corroborated this pattern (S. Vamosi, unpublished data). Indeed, the community of six species in Fig. 1(b) is rather exceptional in that it contains members of five subfamilies. However, this comparatively ‘even’ community still contains two congeneric species from the commonly encountered genus Hygrotus (H. impressopunctatus and H. unguicularis). Clustering was observed despite the fact that: (i) all of the species were from one family, (ii) body size, which is correlated with prey size, is strongly conserved in diving beetles (e.g. Fig. 1a), and (iii) there is prior evidence that prey size partitioning is common in aquatic predatory systems (e.g. Travis et al. 1985; Tate & Hershey 2003). The effectiveness of temporal and spatial partitioning of microhabitats in facilitating co-existence of diving beetles has been largely uninvestigated, although recent efforts have revealed that relative abundances of closely related species may change dramatically along fairly minor depth gradients (D. Yee & S. Vamosi, unpublished observations, 2007).

  • Box 4 Dispersal limitation and the ‘true’ regional pool

    Few studies, if any, have attempted to determine whether all species in the regional pool were equally likely to disperse to a site. Here, we examine possible effects of dispersal limitation on the results for dytiscid community structure by re-analyzing data from Vamosi & Vamosi (2007). We did so by decomposing the overall (Alberta) set of sites and source pool into two subregional analyses. For montane lakes, we compared dytiscid communities to a regional pool consisting of all, and only, the species that occur in the mountains (> 1600 m elevation); and for prairie lakes, to a regional pool consisting of all, and only, the species that occur in the prairies (< 1600 m). We found a high degree of overlap between subregional species pools, with many species common to montane and prairie lakes (but see also Vamosi et al. 2007). As a result, in this particular case the narrowing of geographical scale had little effect on the overall results, with a continued strong result of phylogenetic clustering (see Fig. 2). Nevertheless, more explicit examination of the probability that dispersal does indeed link samples together is warranted.

    Figure 2.

    Effects of more ‘localized’ regional species pools on apparent phylogenetic structure in predaceous diving beetle communities. Data points are for 53 Alberta lakes, either from mountain (filled symbols) or prairie (open symbols) regions. The plot shows NRI values calculated with all species included in a single regional pool (horizontal axis) vs. values calculated with separate regional pools for mountain and prairie regions (vertical axis; the dashed line indicates the expectation if dispersal limitation is unimportant in driving apparent structure). In this instance, altering the regional pool composition had no systematic effect on resulting NRI values, likely because of relatively high overlap in species between the two regions.

    On a finer scale, dispersal limitation can pose another problem for analyses of phylogenetic community structure: it creates spatial autocorrelation in species occurrence data, and in turn such spatial autocorrelation can yield inflated type I error rates in tests of clustering or evenness (Helmus et al. 2007a; Hardy 2008). Avoiding this problem is simple in principle: one can constrain the set of study plots to be distant enough to minimize spatial autocorrelation in species occurrences. However, this is not simple in practice, for at least two reasons. First, increasing distances among plots may temper concerns about spatial autocorrelation, but it only intensifies concern about dispersal limitation driving species’ absence from plots. Second, spatial autocorrelation generated because species are ecologically filtered on a spatially structured environmental gradient is actually a signal we want to detect. Further work on the connection between spatial and phylogenetic nonrandomness in community structure is clearly necessary.


We are very grateful to the following for responding to queries about their studies: J. Cavender-Bares, F. Forest, C. Horner-Devine, K. Kozak, M. Helmus, W. Hochachka, S. Kembel, I. Lovette, T. McMahon, P. Peres-Neto, S. Proches, J. Silvertown, and W. Starmer. We thank J. Cahill and four anonymous reviewers for comments on the manuscript. L. Bernatchez kindly offered us the opportunity to write this review on community phylogenetics. The authors acknowledge funding from the Natural Sciences and Engineering Research Council of Canada (S.M.V., S.B.H., and J.C.V.), the US National Science Foundation (grants 0212873, 0408432, 0515520) and the Arnold Arboretum of Harvard University (C.O.W.).

Steven Vamosi is an evolutionary ecologist interested in predator-prey interactions, phylogenetic community structure, and associations between breeding systems and life history/ecological traits. His research focuses on fish, aquatic and terrestrial invertebrates, and Neotropical plants. Stephen Heard is an ecologist and evolutionary biologist interested in ecological controls on speciation rates. His research has included theoretical and empirical studies of phylogenetic tree shape, and field and lab studies of host specialization in phytophagous insects and their parasitoids. Jana Vamosi is interested in how angiosperm species diversity can influence speciation and extinction, especially with regard to pollination dynamics. Cam Webb's research interests range from plant ecology and biogeography to molecular phylogenetics and bioinformatics. He is a pioneer in the study of phylogenetic community structure.