Author for correspondence: R. Toby Pennington: Tel: +44 (0)131 248 2818 Fax: +44 (0)131 248 2901 Email: firstname.lastname@example.org
I. Introduction
II. Methodological issues
III. Insights into processes that give rise to species rich biomes
IV. Future directions: neutral ecological theory, community phylogenetic structure, and processes leading to species accumulation
V. Conclusions
Analytical methods are now available that can date all nodes in a molecular phylogenetic tree with one calibration, and which correct for variable rates of DNA substitution in different lineages. Although these techniques are approximate, they offer a new tool to investigate the historical construction of species-rich biomes. Dated phylogenies of globally distributed plant families often indicate that dispersal, even across oceans, rather than plate tectonics, has generated their wide distributions. By contrast, there are indications that animal lineages have undergone less long distance dispersal. Dating the origin of biome-specific plant groups offers a means of estimating the age of the biomes they characterize. However, rather than a simple emphasis on biome age, we stress the importance of studies that seek to unravel the processes that have led to the accumulation of large numbers of species in some biomes. The synthesis of biological inventory, systematics and evolutionary biology offered by the frameworks of neutral ecological theory and phylogenetic community structure offers a promising route for future work.
I. Introduction
Global biomes are generally defined by the physiognomy of their dominant plant species (e.g. Woodward et al., 2004), but also differ in their species-richness. Explaining these different levels of species diversity is a problem that has long troubled biologists and ecologists. A great deal of ecological research has addressed the problem of how so many species can coexist in species-rich biomes such as tropical rain forest (e.g. Hubbell, 2001; Wright, 2002; Burslem et al., 2005). Until recently, however, perhaps less progress had been made in studies of history that ask how and when large numbers of species accumulated. Despite some compelling critiques (e.g. Nelson & Platnick, 1981), the most widely accepted recent approach to this historical biogeographic problem focused upon determining areas of origin of taxonomic groups. For plants, perhaps the most influential work of this type was that of Raven & Axelrod (1974). They used various lines of evidence to infer areas of origin, including current distribution (with emphasis on areas of maximum diversity and endemism), fossil distribution, degree of specialization of a taxon in a given area, systematic relationships inferred using classical morphological methods and powers of dispersal. Tectonic history played perhaps the predominant role, not surprisingly, given that this work was published soon after the general acceptance of plate tectonic theory. As an example, species-rich plant families such as Leguminosae (Fabaceae) with centres of diversity in Africa and South America were considered to have originated in west Gondwana, with their current distribution explained by tectonic vicariance, even though in the case of legumes, the earliest fossil finds (c. 60 million years (Ma); Herendeen et al., 1992; Lavin et al., 2005) are considerably younger than the split of Africa and South America (c. 100 Ma).
This approach was extended to the study of the history of biomes by authors such as Gentry (1982), who wished to explain why South American rain forests are the world's most species-rich. He used Raven & Axelrod (1974) to assign 145 neotropical plant families a Laurasian or Gondwanan origin, and concluded that the Laurasian contribution to the entire South American flora was < 10%, and that it was even less significant for the lowland rain forest flora. The subsequent inference was that in situ diversification of taxa isolated by the break-up of west Gondwana must have led to today's high diversity of South American species.
This approach, largely emphasizing how a unique tectonic and geographic history has shaped a resident biota, is now under challenge from insights provided by molecular phylogenetics. The estimation of the evolutionary history of plant species using DNA sequence data has been a principal focus of the field of systematics, driven by the desire to produce a more evolutionary taxonomy (e.g. APG, 1998; APG II, 2003). However, the massive accumulation of molecular plant phylogenies over the past two decades is now having a much wider effect on evolutionary biology and ecology (Webb et al., 2002; Davies et al., 2004; Hardy & Linder, 2005). This review focuses on the recent use of molecular phylogenies, particularly those that have been dated, to infer the history of some of the world's most species-rich biomes such as tropical rain forests. Some of the insights gained are startling: that dispersal, often over long distances, has played a possibly predominant role in the assembly of species-rich floras, and that rates of immigration, speciation and extinction may differ between biomes such as rain forest and seasonally dry tropical forests, implying that their histories have been different and perhaps more driven by ecological processes than tectonic history.
II. Methodological issues
Molecular phylogenies are estimated using DNA sequence data based upon the premise that a shared substitution at a given nucleotide site in different species potentially indicates common ancestry. Most phylogenies are estimated using parsimony, maximum likelihood or Bayesian likelihood approaches. These differ in the complexity of the DNA substitution models they assume, with parsimony the simplest and likelihood approaches more complex, and therefore computationally more difficult. The Bayesian likelihood approach has become widespread because it provides a systematic and computationally feasible means of finding a global optimum in the complex likelihood surface typical of parameter-rich nucleotide substitution models. All the methods face the same computational problem: for phylogenetic trees with more than c. 20 taxa, the number of possible resolved topologies is too astronomical for each to be investigated, so for large data matrices the solutions they provide are only approximate.
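To give a sense of the scale of this search problem, the following minimal sketch counts fully resolved (bifurcating) topologies for a given number of labelled taxa. It uses the standard combinatorial results, (2n − 5)!! for unrooted and (2n − 3)!! for rooted trees; the function names are ours and are purely illustrative.

```python
def double_factorial(k: int) -> int:
    """Return k * (k - 2) * (k - 4) * ... down to 1 (1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def n_unrooted_trees(n_taxa: int) -> int:
    """Number of fully resolved unrooted topologies: (2n - 5)!!, for n >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    return double_factorial(2 * n_taxa - 5)


def n_rooted_trees(n_taxa: int) -> int:
    """Number of fully resolved rooted topologies: (2n - 3)!!, for n >= 2."""
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    return double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, n_unrooted_trees(n), n_rooted_trees(n))
```

For 20 taxa the unrooted count already exceeds 10^20, which is why exhaustive evaluation gives way to heuristic searches.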
The lengths of the branches on phylogenetic trees are proportional to the number of substitutions inferred to have occurred along each branch. If the rate of DNA substitution were constant and clock-like over evolutionary time, all extant species in a given group would have accumulated the same number of substitutions (though, of course, across different nucleotide sites) in diverging from their common ancestor. However, substitution is seldom, if ever, clock-like (Li, 1997; Sanderson, 1998) and the result is the accumulation of a variable number of substitutions in different lineages (Fig. 1).
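The clock expectation can be illustrated with a small sketch that sums branch lengths from the root to each tip. Under a strict clock these sums are equal for all extant species; rate variation makes them unequal. The two toy trees and their substitution counts are hypothetical, and the exact-equality check is a simplification: with real, estimated branch lengths a statistical test of the clock would be needed.

```python
def root_to_tip(tree, tips):
    """Sum branch lengths from each tip back to the root.

    `tree` maps each node to (parent, branch_length); the root has no entry.
    """
    distances = {}
    for tip in tips:
        total, node = 0, tip
        while node in tree:
            parent, length = tree[node]
            total += length
            node = parent
        distances[tip] = total
    return distances


def is_clocklike(tree, tips):
    """True if every tip has accumulated the same number of substitutions."""
    values = list(root_to_tip(tree, tips).values())
    return max(values) == min(values)


# Hypothetical substitution counts per branch (illustrative only).
clock_tree = {"X": ("root", 3), "C": ("root", 5), "A": ("X", 2), "B": ("X", 2)}
variable_tree = {"X": ("root", 3), "C": ("root", 5), "A": ("X", 2), "B": ("X", 6)}
tips = ["A", "B", "C"]

if __name__ == "__main__":
    print(root_to_tip(clock_tree, tips), is_clocklike(clock_tree, tips))
    print(root_to_tip(variable_tree, tips), is_clocklike(variable_tree, tips))
```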
A phylogeny that has not been calibrated with an absolute dimension of time still contains information on the relative recency of common ancestry and is useful to investigate how closely ecology or geography correlates with relationships. However, if we can date a phylogeny, it becomes a more powerful tool. We can assign a date to a given bifurcation (node) in a phylogeny using external evidence, such as knowledge of fossils that can provide a minimum age for the common ancestor of a group (Fig. 1). An example is the first appearance of the family Leguminosae in the fossil record at 60 Ma that was used to assign this date to the stem node of the family (Lavin et al., 2005). If DNA substitution were always clock-like, estimating the date of divergence of every node in the phylogenetic tree would be simple, but the problem in the real world of nonclock-like substitution is far more complex. Fortunately, several algorithms for dating phylogenies using one or more calibrations have been proposed that can allow for variation in substitution rate (for excellent, detailed reviews see Renner, 2005; Rutschman, 2006). The most widely used are nonparametric rate smoothing (NPRS; Sanderson, 1997) and penalized likelihood (PL; Sanderson, 2002), both implemented in the software r8s (Sanderson, 2003), and an approach using Bayesian likelihood (Thorne & Kishino, 2002), but many additional approaches are well summarized by Rutschman (2006). NPRS minimizes ancestor–descendant rate changes simultaneously over the entire tree, assuming that rates are correlated from ancestor to descendant. It is a potentially parameter-rich model of nucleotide substitution rate variation, contrasting with a rate-constant model (Langley & Fitch, 1974) that allows only one rate across the entire tree. However, NPRS tends to over-fit the data and thus allows too much rate variation, with a subsequent loss of predictive power (Sanderson, 2002, 2003).
This has the empirical effect of a bias towards older age estimates (Lavin et al., 2005). Penalized likelihood rate smoothing uses an optimal rate smoothing parameter that is selected via cross-validation, which involves pruning terminal branches and predicting the rate along that branch. Penalized likelihood converges on rate constancy if the smoothing parameter is high, or an NPRS model if the smoothing parameter is low. Typically, the optimal smoothing parameter in PL lies intermediate between that of rate constancy and NPRS (Sanderson, 2002). The Bayesian approach to the estimation of rate variation is fully parametric and assumes a model of rate change (e.g. log normal or compound Poisson) that is ad hoc (Sanderson, 2002). According to Sanderson & Kim (2000), fully parametric phylogenetic approaches may be at a disadvantage when the sampling distribution of the data is complex (e.g. the substitution process is not stationary), so nonparametric (and semiparametric) approaches can then have an advantage in phylogenetics.
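The trade-off controlled by the smoothing parameter can be sketched numerically. The example below is a simplified illustration and not the r8s implementation: it assumes that substitutions on each branch are Poisson-distributed with mean rate × duration, holds node times fixed (the real method optimizes times and rates jointly), and uses a roughness penalty made up only of squared ancestor–descendant rate differences, so any further terms of the published penalty are omitted. The four-branch data set is hypothetical.

```python
import math


def poisson_loglik(counts, rates, durations):
    """Log-likelihood of branch substitution counts, each Poisson(rate * duration)."""
    total = 0.0
    for x, r, t in zip(counts, rates, durations):
        mean = r * t
        total += x * math.log(mean) - mean - math.lgamma(x + 1)
    return total


def roughness(rates, parent):
    """Sum of squared rate differences between each branch and its parent branch."""
    return sum((rates[k] - rates[p]) ** 2
               for k, p in enumerate(parent) if p is not None)


def penalized_objective(counts, rates, durations, parent, smoothing):
    """Log-likelihood minus smoothing * roughness (larger is better)."""
    return poisson_loglik(counts, rates, durations) - smoothing * roughness(rates, parent)


# Hypothetical branches: index 0 and 1 descend from the root, 2 and 3 from branch 0.
durations = [10.0, 30.0, 20.0, 20.0]   # Ma, treated as known for this sketch
counts = [20, 30, 80, 10]              # inferred substitutions per branch
parent = [None, None, 0, 0]

free_rates = [x / t for x, t in zip(counts, durations)]   # one rate per branch
clock_rates = [sum(counts) / sum(durations)] * 4          # single rate for the whole tree

if __name__ == "__main__":
    for smoothing in (0.0, 100.0):
        print(smoothing,
              penalized_objective(counts, free_rates, durations, parent, smoothing),
              penalized_objective(counts, clock_rates, durations, parent, smoothing))
```

With no smoothing the branch-specific rates score best, and with heavy smoothing the single-rate solution scores best, mirroring the convergence on NPRS-like and clock-like behaviour described above.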
The critical issue of time calibration (or age constraint) of molecular phylogenies is now receiving the attention that it merits (e.g. Sanderson & Doyle, 2001; Crepet et al., 2004; Magallón, 2004; Near & Sanderson, 2004). Dates can be assigned to nodes using fossil evidence and geological events such as the timing of vicariance of continental landmasses or the emergence of oceanic islands. Fossil calibrations are preferable to geological ones because of the possibility of recent dispersal over oceanic barriers (Renner, 2004; see text below), and studies that rely too heavily on single geological calibrations (e.g. Becerra, 2003, 2005; Morley & Dick, 2003) have been criticized (Renner, 2005) for their circularity and potentially dubious assumptions.
Multiple fossils are desirable because they provide independent calibrations that can be cross-checked (Near & Sanderson, 2004). Ideally, these fossil constraints should be spread across the tree to minimize the problem that date estimates for nodes will be more error-prone the further they are from a calibration point. The incomplete fossil record means that multiple constraints are rarely available. However, single calibrations, which are best if they can be placed at the base of the tree, can still be informative if used and interpreted correctly. For example, the root node can be assigned a fixed age that is deliberately biased to be somewhat old. This could be accomplished, for example, by using the maximum age estimate for a node (e.g. a subclade within the legume family) that was derived from another evolutionary rates study on a more comprehensive data set (e.g. the entire legume clade). Once the root node is assigned a fixed age that is biased old, then ages estimated for all nodes higher in the tree are biased old. This approach is very useful if the ages estimated for some groups are unexpectedly young. For example, Lavin et al. (2004) set out to test the hypothesis of Raven & Axelrod (1974) and Raven & Polhill (1981) that transoceanic disjunctions in legumes are largely the result of tectonic vicariance. Even fixing the root of legumes at 70 Ma (c. 10 Ma older than the oldest reported legume fossil) still estimated transcontinental disjunctions at 8–16 Ma, which is far too young for tectonic explanations.
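The logic of a deliberately old root calibration can be sketched as follows. The example makes the strong simplifying assumption of a strict clock, so that every node age is a fixed fraction of the root age; rate-smoothing methods do not rescale ages exactly linearly. The relative node depths are hypothetical and are not taken from any of the studies cited.

```python
def node_ages(relative_depths, root_age_ma):
    """Scale relative node depths (0 = present, 1 = root) to absolute ages in Ma."""
    return {node: depth * root_age_ma for node, depth in relative_depths.items()}


# Hypothetical relative depths of two transoceanic disjunctions (illustrative only).
relative_depths = {"disjunction_1": 0.15, "disjunction_2": 0.20}

ages_fossil_root = node_ages(relative_depths, 60.0)  # root fixed at the oldest fossil age
ages_old_root = node_ages(relative_depths, 70.0)     # root deliberately biased old

TECTONIC_SPLIT_MA = 100.0  # approximate Africa-South America separation cited in the text

if __name__ == "__main__":
    for node in relative_depths:
        print(node, ages_fossil_root[node], ages_old_root[node],
              ages_old_root[node] < TECTONIC_SPLIT_MA)
```

Because the older root can only push estimates upwards, a disjunction that remains far younger than the tectonic event under the old root is a conservative rejection of vicariance.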
Another possible source of error in the estimation of the ages of crown groups may result from undersampling of taxa in the initial phylogenetic analyses. Linder et al. (2005) gave examples where this might be an issue and a further example is illustrated by Pirie et al. (2005) in Annonaceae. If the basally divergent lineages within a crown group are not sampled then the age of that crown group will be underestimated. However, understanding the effects of sampling few taxa can be used to advantage to test specific hypotheses more powerfully. In the same legume example (Lavin et al., 2005), the phylogeny sampled only 335 of the estimated 20 000 legume species. For two sister clades in this phylogeny separated by an oceanic barrier, it is probable that adding more taxa to the phylogeny would reduce the age estimate for the transoceanic disjunction, because one of the added taxa might be a closer transoceanic relative. Thus, the ages estimated by Lavin et al. (2004) are biased old by taxon sampling (as well as by calibration), adding weight to the conclusion that estimated ages are too young to be explained by tectonics.
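The effect of omitting basally divergent lineages on crown age can be shown with a toy dated tree. In this sketch the crown age of a sample is the age of the most recent common ancestor of the sampled species; the tree, its node ages and the species names are all hypothetical.

```python
def ancestors(node, parent):
    """Return the list of ancestors of `node`, from nearest to the root."""
    path = []
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def crown_age(sampled, parent, age):
    """Age of the most recent common ancestor of the sampled species."""
    if len(sampled) < 2:
        raise ValueError("need at least two sampled species")
    common = set(ancestors(sampled[0], parent))
    for species in sampled[1:]:
        common &= set(ancestors(species, parent))
    # The most recent common ancestor is the youngest shared ancestor.
    return min(age[node] for node in common)


# Hypothetical tree: sp1 is the basally divergent lineage of the crown group.
parent = {"sp1": "n1", "n2": "n1", "sp2": "n2", "n3": "n2", "sp3": "n3", "sp4": "n3"}
age = {"n1": 30.0, "n2": 12.0, "n3": 5.0}  # node ages in Ma

if __name__ == "__main__":
    print(crown_age(["sp1", "sp2", "sp3", "sp4"], parent, age))  # full sampling
    print(crown_age(["sp2", "sp3", "sp4"], parent, age))         # basal lineage missing
```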
III. Insights into processes that give rise to species-rich biomes
Phylogenies, especially when calibrated with a dimension of time, are illuminating our knowledge of biome history at multiple temporal and spatial scales. On a global scale, they are permitting a re-evaluation of the relative roles of plate tectonics and long-distance dispersal in the assembly of continental biota (Crisp et al., 2004; Linder & Hardy, 2004; Pennington & Dick, 2004; Renner, 2004, 2005). On a local and more recent timescale, they give insights into the assembly of communities within species-rich biomes (Webb et al., 2002; Bramley et al., 2004). An exciting future direction of research that promises to link phylogenetics and ecology more closely, and increase our knowledge of biome history, uses the framework of phylogenetic community structure (Webb, 2000) and neutral ecological theory (Hubbell, 2001).
1. Plate tectonics vs dispersal in the assembly of species-rich continental biomes
The tension in biogeography between a school that places great importance on the powers of dispersal of organisms vs another that gives it little credence has a long history (Darwin, 1859; Brown & Lomolino, 1998). The acceptance of tectonic theory, which potentially explained many global distributions with a single mechanism, made dispersal explanations unpopular, especially for continental disjunctions. This view was most clearly expressed in the influential school of cladistic or vicariance biogeography (Nelson & Platnick, 1981; Humphries & Parenti, 1999).
Dated phylogenies are providing evidence that tectonic vicariance is not responsible for many intercontinental disjunctions (Renner, 2004, 2005). For example, the distribution of many plant families (e.g. Lauraceae, Leguminosae, Annonaceae, Burseraceae, Melastomataceae, Malpighiaceae and Moraceae) in both Africa and the Neotropics was widely assumed to be the result of their presence in the supercontinent of west Gondwana before these continents separated c. 100 Ma (e.g. Raven & Axelrod, 1974; Gentry, 1982). However, phylogenies calibrated largely with fossils of all these groups (Chanderbali et al., 2001; Renner et al., 2001; Davis et al., 2002; Richardson et al., 2004; Lavin et al., 2005; Weeks et al., 2005; Zerega et al., 2005) strongly suggest that this was not the mechanism. In these families, most of the continental disjunctions are dated as far too young to make a tectonic explanation plausible. In the cases of Lauraceae, Melastomataceae, Annonaceae, Moraceae, Burseraceae and Malpighiaceae, the high frequency of early Tertiary dates of separation of Old World from neotropical groups, coupled with the presence of fossils of these families in North America or Eurasia, makes migration through ‘boreotropical’ forests (Wolfe, 1975; Lavin & Luckow, 1993), which were present at high latitudes during this period of high global temperature, a more probable explanation. Short-range dispersal via a boreotropical route may have been easier because of the shorter distance between continental landmasses at these latitudes. As outlined above, in the case of Leguminosae, the modal age of transoceanic disjunctions (8–16 Ma; Lavin et al., 2004) implies that the only plausible mechanism is transoceanic long-distance dispersal. This conclusion is supported by many other studies in unrelated groups.
For example, Renner (2004) summarized 11 instances at various taxonomic levels of long-distance, transoceanic dispersal between the Neotropics and Africa, dated between 2 Ma and 11 Ma, and other examples continue to be published (e.g. Plana et al., 2004; Bartish et al., 2005).
Leguminosae dominate the tropical rain forests, tropical dry forests and woody savannahs of both the Neotropics and Africa. In these biomes, legumes are generally the family that provides the greatest number of species, and which is most abundant. Lauraceae are species-rich and ecologically important canopy trees in neotropical rain forests. Melastomataceae and Annonaceae are also species-rich and abundant understorey trees in rain forests worldwide. The importance of dispersal in these families, often by long-distance and over oceanic barriers, implies that this process has been far more important in the assembly of these woody, tropical biomes than was assumed by previous authors (e.g. Gentry, 1982, 1993).
Given that molecular evidence suggests that long-distance dispersal is playing a greater role in establishing current distributions than had been previously assumed, one might expect particular propagule types to correspond with those taxa that have dispersed long distances. Higgins et al. (2003) explored whether dispersal distances, colonization rates and migration rates support the idea that dispersal processes suggested by the morphology of the dispersal unit are responsible for long-distance dispersal. They concluded that there is no apparent correlation between propagule type and long-distance dispersal and that dispersal could often be by some nonstandard means that differs from that suggested by the morphology of the propagule. Dated transoceanic disjunctions in legumes (Lavin et al., 2004, 2005) support this conclusion because they provide little correlation of age of disjunction with fruit type and dispersal mode. For example, dates of transoceanic disjunctions of the tropical dalbergioid legumes Chapmannia (14.2 ± 1.7 Ma), Dalbergia (9.6 ± 2.6 Ma) and Pictetia–Ormocarpum (14.5 ± 2.6 Ma) are similar, and too young to invoke overland migration such as a boreotropical dispersal route. Some of these taxa have samaroid fruits dispersed by wind (e.g. Dalbergia), whereas others passively drop their seeds in indehiscent pods (e.g. Pictetia) or as sections of indehiscent lomented fruits (e.g. Chapmannia).
2. Differences between the biogeographic patterns of plants and animals
One intriguing result that is emerging from these early biogeographic studies is a possible difference in the relative frequency of dispersal in the biogeographic history of animals and plants (Donoghue & Smith, 2004; Pennington & Dick, 2004; Sanmartín & Ronquist, 2004; Corlett & Primack, 2006). In a study of dated phylogenies of 18 plant (angiosperm) and 54 animal (44 insect, four fish, one reptile, one bird, one marsupial, one gastropod, one onychophoran and one arachnid) clades from the southern hemisphere, Sanmartín & Ronquist (2004) discovered that dates for the same geographic divergences were older for animals than for plants. In general, tectonic events better explained the animal patterns and greater amounts of dispersal had to be invoked to explain the plant patterns. Intriguingly, a similar result was reported for dated phylogenies of 66 plant (62 angiosperm, three gymnosperm, one fern) and 39 animal (26 insect, three arachnid, four fish, two bird, two mammal, one nematode, one crustacean) clades containing disjunctions between the temperate forests of North America and Eastern Asia (Sanmartín et al., 2001; Donoghue & Smith, 2004). In this case, it was necessary to invoke more recent over-water dispersal for plants, whereas animal patterns are better explained by overland migration. Similarly, plant phylogenies suggest that many plant lineages arrived in South America during the period that it was isolated as a continental island (100–3 Ma; Pennington & Dick, 2004), whereas the fossil record for vertebrates suggests that this animal group evolved largely in isolation in the same area (Simpson, 1980), with fewer immigrant lineages arriving, including caviomorph rodents, platyrrhine monkeys and lizards (reviewed by Renner, 2004). 
Pennington & Dick (2004) suggested that the capacity of some plant propagules to survive long periods in adverse conditions, coupled with the capacity for vegetative propagation and self-pollination, may all help to explain greater powers of dispersal for plants than animals.
These results contrasting plants and animals are preliminary, and suffer from too narrow a taxonomic sample. For example, Donoghue & Smith (2004) pointed out that their plant dataset contained a preponderance of studies of angiosperms, though long-distance dispersal has also been suggested to be the predominant factor underlying transatlantic disjunctions in pteridophytes (Moran & Smith, 2001). The animal dataset of Sanmartín et al. (2001) was dominated by studies of insects and lacked studies of groups of vertebrates such as amphibians and lizards. Furthermore, the dates in these studies were estimated using a variety of analytical methods and both fossil and biogeographic calibrations. However, if this pattern is general it has considerable implications for models of community assembly over evolutionary time (Donoghue & Smith, 2004), envisaging a resident fauna needing to adapt to a changing flora as new plant taxa arrive.
3. Age of origin of biomes
The oldest model explaining the high species diversity of tropical rain forests invoked accumulation of species over long geological periods because of low extinction rates in the face of relatively constant, benign, environmental conditions (Wallace, 1878; Dobzhansky, 1950; Wiens & Donoghue, 2004). A key test of this model is that tropical rain forests should be geologically old. One means of estimating the age of a biome is by the appearance of biome-specific morphologies and characteristic taxonomic groups in the fossil record. For example, Burnham & Johnson (2004) examined the neotropical plant fossil record for features characteristic of a tropical rain forest such as leaves with drip tips, large fruits and large trunks, and for families also characteristic of neotropical rain forests such as palms, legumes and Sapotaceae. Their conclusion was that there is no strong evidence for neotropical rain forest before the early Tertiary, c. 60 Ma. A similar conclusion was reached for African rain forests by Jacobs (2004). These findings fit the well-known pattern of the Eocene modernization of the terrestrial biota (Wing, 1987). The fossil record demonstrates the early Tertiary as the time during which the most species-rich clades of eudicots and epiphytic ferns underwent most diversification (Magallón et al., 1999; Crepet et al., 2004; Schneider et al., 2004), and implicates the same time for the rise to dominance of groups of animals such as mammals, birds and ants (Springer et al., 2003; Barker et al., 2004; Moreau et al., 2006). This gives a general timeframe within which to understand the processes that have led to the assembly of contemporary biomes.
However, dated plant phylogenies offer another means of verifying the ages of biomes (Becerra, 2005; Davis et al., 2005). Such studies assume that the time of origin of a biome-specific taxonomic group may provide a reasonable proxy for the age of the biome itself. A pioneering study (Davis et al., 2005) used a dated phylogeny of the order Malpighiales (sensu APG II, 2003) in order to estimate the time of origin of tropical rain forests globally. Davis et al. (2005) used parsimony and maximum likelihood and knowledge of the current ecology of families and genera of Malpighiales, many of which are characteristic of tropical rain forest understories, to reconstruct this habitat as the most probable ancestral biome for the order. Given that a dated phylogeny indicates that many lineages in this order diversified in the Cretaceous (all 28 traditionally recognized families within this clade originated well before the Tertiary, mostly during the Albian and Cenomanian, 112–94 Ma), the conclusion was that tropical rain forests must have a far longer history than the fossil record indicates. This receives some support from Cenomanian North American fossil deposits that have been interpreted as evidence of megathermal, closed canopy, multistratal forest (Upchurch & Wolfe, 1987), but there is little, if any, fossil evidence for Cretaceous equatorial tropical rain forests elsewhere (reviewed by Burnham & Johnson, 2004; Jacobs, 2004). Though some of Davis et al.'s (2005) assumptions might be questioned (for example, that a closed canopy tropical forest in the Cretaceous could be considered the same biome as a modern tropical rain forest), this study is a good example of how dated phylogenies have been used to challenge fossil evidence of biome age.
Although most attention has been focused upon tropical rain forests, which are the most species-rich biome globally, other studies have examined the time of origin of other biomes that are significantly diverse. Examples of such studies include dating the origin of seasonally dry tropical forests of Mexico (Becerra, 2005) using the genus Bursera that is very species-rich and characteristic of this biome in this geographic area, and inferences of the time of origin of neotropical savannahs (Pennington et al., 2006), the Cape Floristic Province (Richardson et al., 2001a; Linder & Hardy, 2004) and Californian (Hileman et al., 2001; Calsbeek et al., 2003) and Australian vegetation types (Crisp et al., 2004).
These studies, and associated palaeontological evidence, indicate that many of these biomes have a more recent origin than tropical rain forests. At the most extreme, the few dated phylogenies for plants of the savannah biome (‘cerrado’) that covers 2 million km² of central Brazil suggest an onset of diversification no later than 4 Ma (Pennington et al., 2006). This is consistent with inferences from the fossil record that inflammable C4 grasses that are abundant in this biome, and are the cause of the frequent fires that characterize it ecologically, only rose to dominance in the late Pliocene, c. 4 Ma (Jacobs et al., 1999).
4. Is biome age the real issue? Understanding the accumulation of species diversity
MacArthur & Wilson (1967) remarked that the field of biogeography had scarcely begun to approach the fundamental questions bearing on the causes of diversity, not necessarily because these problems are intractable, but mostly because biogeographical questions were centred on particular taxa, places and times, and not on general processes. With this in mind, we caution against the approach outlined above that simply seeks the maximum age of a given biome by examining the time of diversification of single clades of biome-specific species (Becerra, 2005; Davis et al., 2005; Pennington et al., 2006). This to some extent hearkens back to the search for centres of origin, but now the age rather than the place is being emphasized. Although knowing the maximum age of a biome is an interesting fact, it is surely more critical to understand the general processes that have led to the accumulation of high diversity.
Knowing the age of a biome does, however, give an estimation of the period of time available for the assembly of its endemic species. In seeking to explain the high species diversity of tropical rain forests and of the central Brazilian savannahs, models may have to assume considerably different time periods. In tropical rain forest, a biome that is at least 60 million years old, there is evidence for relatively ancient diversification (e.g. Clusia; Gustafsson & Bittrich, 2003; Acridocarpus; Davis et al., 2002a) and for the persistence over long periods of time (10 Ma or more) of some widespread, abundant species of trees (e.g. Symphonia globulifera; Dick et al., 2003). However, other examples show that all species in some remarkably diverse genera have arisen very recently. For example, all 300 species of the genus Inga (Leguminosae), which is an important element in neotropical rain forests, have arisen in the past two million years (Richardson et al., 2001b; Lavin, 2006). This pattern of recent explosive diversification is likely to have been repeated in the species-rich genera Ocotea (Lauraceae; Chanderbali et al., 2001) and Miconia (Melastomataceae; Renner et al., 2001), though the phylogenies of these groups sampled few species and conclusions must be tentative. Thus, it seems likely, perhaps unsurprisingly, that the explanation of high species diversity in neotropical rain forests lies in a combination of both ancient diversification and the long persistence time of some species, and in recent explosive evolution of other clades. By contrast, if the Brazilian savannah biome has only existed for c. 4 Ma, then relatively recent diversification must have produced the high percentage of endemic plant species found there (c. 35% endemic trees and shrubs; c. 70% endemic herbs and subshrubs; Ratter et al., 2006).
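The scale of such explosive diversification can be gauged with a back-of-the-envelope calculation. The sketch below assumes a pure-birth (no extinction) model, under which the net diversification rate of a crown group is (ln N − ln 2)/t; this estimator is our illustrative choice and is not taken from the studies cited. The inputs are the figures quoted above for Inga (about 300 species in about 2 Ma).

```python
import math


def net_diversification_rate(n_species, age_ma, crown=True):
    """Pure-birth rate estimate in species per species per million years.

    A crown group starts from two lineages, a stem group from one.
    Extinction is assumed to be zero, so this is a deliberately simple sketch.
    """
    start = 2 if crown else 1
    if n_species < start or age_ma <= 0:
        raise ValueError("need n_species >= starting lineages and a positive age")
    return (math.log(n_species) - math.log(start)) / age_ma


inga_rate = net_diversification_rate(300, 2.0)   # figures quoted in the text
doubling_time_ma = math.log(2) / inga_rate       # time for lineage number to double

if __name__ == "__main__":
    print(round(inga_rate, 2), round(doubling_time_ma, 2))
```

Under these assumptions the rate is roughly 2.5 per million years, which corresponds to lineage numbers doubling in well under half a million years.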
5. Ecological phylogenetic structure: diversification within biomes or between biomes?
Phylogenies can also address the question of whether most species in a given biome have evolved in situ within that biome, or whether switching of habitat, or ecological speciation, may have played a predominant role. As an example, the neotropical savannah biome is dominated by woody genera that are also found in rain forest, which might be a reflection of the relative ages of these biomes (of the 121 woody species in 90 genera listed by Bridgewater et al. (2004) as dominating the Brazilian cerrados, 99 species belong to 72 genera that also have species found in rain forests). If all savannah species are most closely related to other savannah species, and all rain forest species most closely related to rain forest species, then evolutionary processes within single biomes must account for the observed species diversity. If rain forest and savannah species are consistently resolved as pairs of sister species, then switching of habitat must have played the predominant role in the generation of diversity. Surprisingly few studies have investigated this tractable problem. In Africa, a phylogeny of the genus Acridocarpus indicated a single lineage (four species sampled) of dry forest taxa that is derived from within a wet forest group (Davis et al., 2002). In the Neotropics, a phylogeny of Ruprechtia (Polygonaceae; Pennington et al., 2004) estimated two invasions of rain forest from dry forest lineages. An excellent study, which was methodologically similar, but confined to habitats within a single biome, examined the evolution of edaphic specialization in Amazonian Burseraceae trees (Fine et al., 2005). In none of these cases was the maximum possible amount of ecological speciation observed, which would be an independent event underlying the origin of each species in the new habitat. Schrire et al.
(2005) analysed a taxon–biome consensus tree for the legume family using standard cladistic biogeographic approaches (DIVA, Component Analysis, Three Area Statements and Brooks Parsimony Analysis) to show that ecological clade transition to new biomes was mainly between the tropical biomes of seasonally dry forests, savannah, and rain forests. There were also geographical transitions between the temperate biomes of the northern and southern hemisphere, but fewer links between tropical and temperate biomes, and no instances of temperate clades giving rise to tropical clades. This pattern for Leguminosae, of diversification strongly constrained by ecology, matches the concept of ‘phylogenetic niche conservatism’ (Harvey & Pagel, 1991; Ricklefs & Latham, 1992; Webb et al., 2002; Wiens, 2004), which suggests that lineages are predisposed to maintain their ancestral ecological predilection.
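The contrast between in situ diversification and repeated habitat switching can be made concrete by counting the minimum number of biome changes on a phylogeny. The sketch below applies Fitch parsimony to a fully bifurcating tree written as nested tuples; it is an illustration of the reasoning only, not the method of any study cited here, and the two four-species trees are hypothetical.

```python
def fitch(tree, biome):
    """Return (possible ancestral biomes, minimum number of biome switches).

    `tree` is a species name (str) or a pair (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {biome[tree]}, 0
    left_set, left_changes = fitch(tree[0], biome)
    right_set, right_changes = fitch(tree[1], biome)
    shared = left_set & right_set
    if shared:
        # Descendants agree on at least one biome: no switch needed at this node.
        return shared, left_changes + right_changes
    # Descendants disagree: one switch is required on a branch below this node.
    return left_set | right_set, left_changes + right_changes + 1


biome = {"s1": "savannah", "s2": "savannah", "r1": "rain forest", "r2": "rain forest"}

conserved_tree = (("s1", "s2"), ("r1", "r2"))    # each biome forms its own clade
switching_tree = (("s1", "r1"), ("s2", "r2"))    # savannah/rain forest sister pairs

if __name__ == "__main__":
    print(fitch(conserved_tree, biome)[1])
    print(fitch(switching_tree, biome)[1])
```

In the second tree the number of switches equals the number of savannah species, which is the maximum possible amount of ecological speciation in the sense used above.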
6. Geographic phylogenetic structure: diversification within single geographic areas, or migration between them?
Using similar logic, phylogenies can be used to investigate the geographic structuring of lineages by examining how closely geographical proximity reflects phylogenetic relatedness both at more restricted spatial scales (e.g. Avise, 2000; Hewitt, 2000), and at the widest global scales (e.g. Irwin, 2002; Lavin et al., 2004; Lavin, 2006). Simply, are all species growing in a given geographic area each other's closest relatives (high geographical structure), or is each species most closely related to another that is endemic to a distant area (low geographical structure)? Intriguingly, there are hints that patterns of geographic structure may differ among biomes (Lavin, 2006). Phylogenies of several genera with species largely endemic to seasonally dry tropical forests of the Neotropics (e.g. Coursetia, Poissonia and Ruprechtia) show a high degree of geographic phylogenetic structure corresponding to the separated areas of this biome (Fig. 2), whereas phylogenetic structure on a similar spatial scale in the phylogeny of the rain forest genus Inga is low. This implies restricted historical movement of species in these groups between areas of seasonally dry tropical forest, and more extensive migration between areas of rain forest. This difference is not an artefact of the age of the clades studied, because the crown groups of Coursetia, Poissonia and Ruprechtia are older (ranging from c. 20–8 Ma; Pennington et al., 2004) and have had more time for dispersal to operate, but show more geographical phylogenetic structure than the younger Inga (crown group aged c. 2 Ma; Lavin, 2006). Rather than any inherent biological differences, this may simply reflect the greater present-day and historical continuity of rain forest in the Neotropics compared with the fragmented distribution of seasonally dry tropical forests (Fig. 2).
Thus, in seasonally dry tropical forests, endemic species produced over time in each of the isolated areas are not replaced by immigrants from elsewhere, resulting in the small clades of species confined to each area (i.e. high geographic structure) seen in genera largely confined to this biome. By contrast, historical migration has been greater among Inga species, and in a given area of rain forest, co-occurring Inga species are unlikely to be each other's closest relatives. This inference of migration rate is supported by the relative numbers of widespread species found in these genera – much greater in Inga (28%), than in Coursetia and Poissonia (2%; Lavin, 2006). These conclusions may not be general if the taxa are not typical of others in the biomes, and many more phylogenetic studies that would permit a meta-analysis are necessary.
IV. Future directions: neutral ecological theory, community phylogenetic structure, and processes leading to species accumulation
The frameworks of neutral ecological theory and community phylogenetic structure (Hubbell, 2001; Webb et al., 2002) may offer another route for testing mechanisms suggested by phylogenies that have led to species diversifications within biomes without needing to resort to multiple phylogenetic studies. These recent developments in ecology offer a new and exciting synthesis of the disciplines of biological inventory, systematics and evolutionary biology.
Neutral ecological theory (Hubbell, 2001) estimates rates of speciation and migration from the zero-sum multinomial distribution of relative species abundance data. The number of species produced per metacommunity generation (Θ) and the number of resident deaths within a local community that are replaced by immigrants from the metacommunity (m) are central to understanding the historical construction of ecosystems (Latimer et al., 2005).
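To illustrate how immigration shapes a local community under zero-sum dynamics, the following is a minimal simulation sketch and not a parameter-estimation method. It makes several simplifying assumptions that are not taken from the text: m is treated as the per-death probability that the replacement is an immigrant, the metacommunity is a fixed pool of equally abundant species (so Θ is not modelled), and the function name, community size and pool size are hypothetical.

```python
import random


def simulate_local_community(m, size=100, pool_species=50, steps=20000, seed=0):
    """Zero-sum ecological drift in one local community; returns species ids.

    m:            assumed probability that a death is replaced by an immigrant.
    size:         number of individuals, held constant (the zero-sum rule).
    pool_species: number of equally abundant metacommunity species (assumption).
    """
    rng = random.Random(seed)
    pool = list(range(pool_species))
    # Start from a random draw of individuals from the metacommunity.
    community = [rng.choice(pool) for _ in range(size)]
    for _ in range(steps):
        dead = rng.randrange(size)  # one resident dies
        if rng.random() < m:
            # Replacement by an immigrant from the metacommunity.
            community[dead] = rng.choice(pool)
        else:
            # Replacement by offspring of a random resident (for simplicity
            # the parent may be the individual that has just died).
            community[dead] = community[rng.randrange(size)]
    return community


def richness(community):
    """Number of distinct species present."""
    return len(set(community))
```

Running the simulation with m = 0 lets drift remove species without replacement, whereas a high m keeps local richness close to that of the source pool. This is the qualitative contrast between dispersal-limited and well-connected communities discussed in this section.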
Although it is not within the scope of this review to argue the pros and cons of neutral ecological theory, the large majority of critiques overlook its potential for the purposes of understanding patterns of biodiversity, biogeography, and especially phylogeny. Critics of neutral ecological theory rarely identify when and why it might be important to estimate the parameters Θ and m from the zero-sum multinomial distribution, or relate this model of community abundance patterns to phylogeny (e.g. Harpole & Tilman, 2006). Biogeographers working with clades or communities of hundreds to thousands of species that coexist in the same biome may not have niche assembly as a primary research objective. Instead, they may be more interested in whether fragmentation, position, or size of a geographic region causes dispersal limitation and thus the evolution of endemics, high beta diversity, or patterns of local commonness combined with global rarity. Fortunately for biogeographers, the implementation of neutral ecological models continues to be advanced through user-friendly software using more efficient and exact methods of parameter estimation (e.g. Latimer et al., 2005; Chave & Jabot, 2006; Etienne et al., 2006; Hankin, 2006).
As an example, phylogenetic patterns for clades mostly confined to neotropical seasonally dry tropical forests suggest that this biome is dispersal limited (Lavin, 2006; see earlier). This is consistent with island biogeographic theory because neotropical seasonally dry forests are more patchy in distribution and occupy a total area that is smaller than neotropical rain forests (Fig. 2). Extending these observations, seasonally dry tropical forests should be globally more dispersal limited than tropical rain forests because, as in the Neotropics, their distribution is always more scattered and their overall area smaller (Lavin et al., 2004; Schrire et al., 2005). This prediction can be tested with both phylogenetics and community data, particularly relative species abundance data from these ecosystems.
Lavin et al. (2004) found that geographically structured phylogenies in intercontinentally distributed legume groups were mostly found in seasonally dry tropical forests or thorn scrub (e.g. a clade endemic to the dry forests of Mesoamerica and its sister from the Somalia–Masai region of the Horn of Africa, such as Chapmannia). Clades confined to tropical wet forests or tropical savannahs showed little evidence of this sort of structure and, indeed, it is species from these wet forest clades that are more frequently pantropically distributed. The inference is that dispersal limitation of the seasonally dry tropical forests and thorn scrub allows the accumulation of related allopatric species within the separate areas of this biome (Fig. 2). Tropical wet forests and savannahs appear less dispersal limited and thus related species are predicted to be more widely separated geographically and more individual species should have broad, or even pantropical distributions. This view of tropical wet forests based upon legumes is in agreement with Pitman et al. (1999), who found that few taxa of an 825-species sample of the tree flora in western Amazonian Peru are locally endemic. Many of the most common species have broad distributions, extending to Ecuador, causing beta diversity to be low (Pitman et al., 2001).
The above hypothesis suggesting greater dispersal limitation of seasonally dry tropical forests might be tested by comparing numbers of museum collections as proxies for relative species abundances among clades confined to this biome vs the wet forest or savannah biome. If dispersal limitation of the seasonally dry forests allows local endemics to drift ecologically to high frequency (because they are not contending with many immigrants), then such species ought to be well represented in herbaria or museums because they are so locally common and easy to locate, despite their restricted global distributions. If high immigration rates into local wet forest communities keep regional endemics at low abundances, then such species, even if broadly distributed, should not be well represented in herbaria or museums because they are locally rare, occurring at very low frequencies. Preliminary studies of the relative abundances of herbarium collections for rain forest clades (e.g. the legume genus Inga) and seasonally dry tropical forest clades (e.g. the legume tribe Robinieae) reveal that the relative species abundance distribution has a long tail of rare species for the rain forest clades, which is lacking in this distribution for dry forest clades (M. Lavin et al. unpublished). This results in estimates of high rates of immigration for rain forest clades and low rates for dry forest clades, as is predicted from the geographic phylogenetic structure.
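The comparison of collection numbers described above amounts to summarizing a relative species abundance distribution and asking how heavy its tail of rare species is. The following sketch shows one simple way to do this, using a simplified log2 binning in the spirit of Preston's octaves. The function names and the rarity threshold are hypothetical, and any counts passed in would be the user's own data; none are taken from the studies cited.

```python
from collections import Counter


def abundance_octaves(collection_counts):
    """Bin species into log2 abundance classes.

    Octave k holds species with between 2**k and 2**(k + 1) - 1 collections.
    Species with zero collections are ignored.
    """
    return Counter(n.bit_length() - 1 for n in collection_counts if n > 0)


def rare_fraction(collection_counts, threshold=2):
    """Fraction of species with at most `threshold` collections.

    The threshold is an arbitrary illustrative choice, not a published cut-off.
    """
    counts = [n for n in collection_counts if n > 0]
    return sum(1 for n in counts if n <= threshold) / len(counts)
```

A clade with a long tail of rare species would show many species in the lowest octaves and a high `rare_fraction`, whereas a clade lacking that tail would show few.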
Another way of testing the influence of dispersal limitation on patterns of local endemism is to correlate geographical distance with community similarity. A stronger negative relationship is predicted among seasonally dry tropical forest sites as opposed to those of rain forests because of dispersal limitation of the former that has led to the potentially greater beta-diversity of neotropical seasonally dry tropical forests (Pennington et al., 2006). Related to this are species–area curves, which are expected to show steeper slopes (higher z-values) for seasonally dry tropical forest inventory data compared with wet forest data.
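Both tests described above reduce to fitting a slope. The following sketch assumes the power-law species–area relationship S = cA^z fitted on log-transformed data, Jaccard similarity as the measure of community similarity, and site positions along a single transect. These are illustrative choices not specified in the text, and all function names are hypothetical.

```python
import math
from itertools import combinations


def least_squares(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_species_area(areas, species_counts):
    """Fit S = c * A**z by regressing log S on log A; returns (c, z)."""
    log_c, z = least_squares(
        [math.log(a) for a in areas],
        [math.log(s) for s in species_counts],
    )
    return math.exp(log_c), z


def jaccard(site_a, site_b):
    """Shared species divided by total species across two sites."""
    a, b = set(site_a), set(site_b)
    return len(a & b) / len(a | b)


def distance_decay_slope(sites):
    """Slope of Jaccard similarity against distance over all site pairs.

    sites: list of (position_km, species_set) along one transect (assumption).
    A more negative slope indicates faster turnover with distance.
    """
    distances, similarities = [], []
    for (pos_a, spp_a), (pos_b, spp_b) in combinations(sites, 2):
        distances.append(abs(pos_a - pos_b))
        similarities.append(jaccard(spp_a, spp_b))
    return least_squares(distances, similarities)[1]
```

Under the prediction in the text, inventory data from seasonally dry tropical forests would give a larger z from `fit_species_area` and a more negative value from `distance_decay_slope` than comparable rain forest data.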
A final way of testing the hypothesis that seasonally dry tropical forests are more dispersal-limited is to measure their community phylogenetic structure (Webb, 2000), and contrast it with that of rain forests. This is phylogenetic structure on a much smaller spatial scale than the regional scale considered under ‘geographic phylogenetic structure’ above. Local forest communities (e.g. one to a few hectares) are assembled from a larger local species pool (Webb et al., 2002). Studies of community phylogenetic structure ask whether the distribution of species in a local community is nonrandom with respect to a phylogeny of the local species pool (Webb et al., 2002). This ‘local species pool’ phylogeny is an estimate of the relationships of all the species in the local species pool derived using ‘supertree’ techniques (Webb, 2000) that link phylogenies from separate studies. Webb (2000) and Webb et al. (2002) defined the scale of a local species pool as several hectares to a few square kilometres, but for our purpose of contrasting rain forests and seasonally dry tropical forests it may be useful to consider a broader local species pool of the size of one of the isolated areas of seasonally dry forest shown in Fig. 2. In this case, community phylogenetic structure might be expected to be high within rain forest plots simply because they would be more likely to have coexisting members of the same niche-conservative clade that have arrived from disparate locations. In comparison, in the more dispersal-limited seasonally dry tropical forests, fewer coexisting, closely related species should be found within plots, and therefore low community phylogenetic structure is predicted. Even if sister species produced allopatrically in a single separate regional area of seasonally dry forest (Fig. 2) migrate and come into secondary sympatric contact in a local community, we suggest that dispersal between the isolated patches of these dry forests may be insufficient to produce levels of sympatry of congeneric species produced by higher migration rates across the rain forest biome. There are some hints from taxonomic and forest inventory studies to support this. For example, Enquist et al. (2002) analysed taxonomic diversity of woody plant communities from around the world based upon 227 samples of 0.1 ha and showed that for a given species diversity, dry sites have proportionately greater numbers of genera and families than mesic sites. This lower species to genus ratio implies fewer sympatric congeneric species at plot scale, which matches our own experience of some legume groups. Species of Coursetia and Poissonia tend to be allopatric within the same regional area of seasonally dry forest such as the Interandean valleys of Peru (Lavin, 1988) and so are not found together in the same inventory plots (1 ha or less; R.T. Pennington, unpublished), whereas up to 19 Inga species co-occur in one hectare of Amazonian rain forest (Valencia et al., 1994).
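The question of whether species in a plot are nonrandom with respect to the phylogeny of the species pool can be sketched as a comparison of observed mean pairwise phylogenetic distance against random draws from the pool, standardized in the style of a net relatedness index. This is an illustrative sketch under stated assumptions and not a reimplementation of any published software. The function names, the toy distance function (a stand-in for distances read from a supertree) and the species labels are hypothetical.

```python
import random
import statistics
from itertools import combinations


def mean_pairwise_distance(species, distance):
    """Mean phylogenetic distance over all pairs of species in a sample."""
    pairs = list(combinations(species, 2))
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


def net_relatedness_index(community, pool, distance, draws=999, seed=0):
    """Standardized deviation of observed mean pairwise distance from a null.

    The null model draws the same number of species at random from the pool.
    Positive values: co-occurring species are more closely related than
    expected (high community phylogenetic structure). Negative values:
    co-occurring species are more distantly related than expected.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(community, distance)
    null = [
        mean_pairwise_distance(rng.sample(pool, len(community)), distance)
        for _ in range(draws)
    ]
    return -(observed - statistics.mean(null)) / statistics.stdev(null)


def toy_distance(a, b):
    """Toy stand-in for supertree distances: the first letter is the genus."""
    return 2.0 if a[0] == b[0] else 10.0


# Hypothetical pool: four genera (A to D) with three species each.
POOL = [genus + str(i) for genus in "ABCD" for i in (1, 2, 3)]
```

With this toy pool, a plot of three congeners such as `["A1", "A2", "A3"]` gives a positive index, the pattern predicted here for rain forest plots, whereas a plot with one species from each of three genera gives a negative index, the pattern predicted for seasonally dry tropical forest plots.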
Rather than solely using dated phylogenies to attempt to infer the age of biomes, the distribution and modalities of ages of crown clades that show niche conservatism (e.g. seasonally dry tropical forest vs wet forest or savannah clades) or geographic restriction (continental island crown clades vs oceanic island crown clades) would give a better glimpse of the general processes that cause patterns of biome diversity. Older age distributions of ecologically or geographically confined crown clades might identify less dispersal-prone systems. For example, one might predict that Madagascan crown clades would be young, despite the island's age, because it is large and near the huge African mainland source area. Hawaiian crown clades would be expected to be on average older, despite the younger age of the archipelago, because the islands are of smaller size and distant from the mainland. Similarly, on a global scale, isolated patches of seasonally dry tropical forest might be predicted to harbour older crown clades, whereas more extensive and contiguous rain forests should average younger-aged crown clades because high immigration rates would put diversifying residents at risk of extinction. The meta-analyses necessary to test these predictions will require gathering and dating more phylogenies.
Such phylogenetic work will be time-consuming and expensive, not least because of the logistic difficulties of locating and collecting plants in tropical biomes. However, some of the mechanisms suggested by phylogenetic studies to have influenced species diversification and accumulation in biomes, such as dispersal limitation, may be testable using neutral ecological theory and by examining community phylogenetic structure. Studies with these theoretical frameworks have the advantage of potentially using existing ecological inventory data and museum collections as sources of species abundance data from which diversity and migration estimates can be derived with neutral, or more elaborated, ecological models.
We thank David Ackerly, Cam Webb and three anonymous reviewers for very constructive comments, José Saito for help with Fig. 2, and Jim Ratter for his encyclopaedic knowledge of the flora of the Brazilian cerrado.