Aim Because of their broad distribution in geographical and ecological dimensions, seaweeds (marine macroalgae) offer great potential as models for marine biogeographical inquiry and exploration of the interface between macroecology and macroevolution. This study aims to characterize evolutionary niche dynamics in the common green seaweed genus Halimeda, use the observed insights to gain understanding of the biogeographical history of the genus and predict habitats that can be targeted for the discovery of species of special biogeographical interest.
Location Tropical and subtropical coastal waters.
Methods The evolutionary history of the genus is characterized using molecular phylogenetics and relaxed molecular clock analysis. Niche modelling is carried out with maximum entropy techniques and uses macroecological data derived from global satellite imagery. Evolutionary niche dynamics are inferred through application of ancestral character state estimation.
Results A nearly comprehensive molecular phylogeny of the genus was inferred from a six-locus dataset. Macroecological niche models showed that species distribution ranges are considerably smaller than their potential ranges. We show strong phylogenetic signal in various macroecological niche features.
Main conclusions The evolution of Halimeda is characterized by conservatism for tropical, nutrient-depleted habitats, yet one section of the genus managed to invade colder habitats multiple times independently. Niche models indicate that the restricted geographical ranges of Halimeda species are not due to habitat unsuitability, strengthening the case for dispersal limitation. Niche models identified hotspots of habitat suitability of Caribbean species in the eastern Pacific Ocean. We propose that these hotspots be targeted for discovery of new species separated from their Caribbean siblings since the Pliocene rise of the Central American Isthmus.
Various interacting features influence the distribution of a species. The niche of a species is commonly defined as the set of biotic and abiotic conditions in which it is able to persist and maintain stable population sizes (Hutchinson, 1957). Further distinction is made between a species’ fundamental niche, which consists of the set of all conditions that allow for its long-term survival, and the realized niche, which is a subset of the fundamental niche that a species actually occupies. Species tolerances are determined by their morphological, reproductive and physiological traits, which are in turn susceptible to evolutionary forces. Hence, niche characteristics can be interpreted as evolutionary phenomena. Understanding niche evolution yields valuable insights into biogeography, biodiversity patterns and conservation biology (Wiens & Graham, 2005; Rissler et al., 2006; Wiens et al., 2007).
The niche concept provides a conceptual framework to predict geographical distributions of species. Niche models establish the macroecological preferences of a given species based on observed distribution records and a set of macroecological variables, and these preferences can subsequently be used to predict geographical areas with suitable habitat for the species (e.g. Guisan & Thuiller, 2005; Raxworthy et al., 2007; Rissler & Apodaca, 2007). The availability of macroecological data, either in the form of remotely sensed or interpolated measurement data, is increasing and has already provided many biological studies with environmental information (Kozak et al., 2008). To date, most ecological niche modelling studies have focused on terrestrial organisms. A notable exception is the study by Graham et al. (2007), which used a synthetic oceanographic and ecophysiological model to identify known kelp populations and predict the existence of undiscovered kelp habitats in deep tropical waters.
Integration of niche models, macroecological data and phylogenetic information yields information on niche shifts and insights into the evolution of environmental preferences across phylogenetic trees. So far, evolutionary niche dynamics have been studied almost exclusively in terrestrial organisms (e.g. Graham et al., 2004; Knouft et al., 2006; Yesson & Culham, 2006) and little information is available on niche evolution of the organisms inhabiting the world's oceans. Seaweeds appear to be an excellent model system for studying the evolutionary dynamics of the macroecological niche in coastal marine organisms. Individual seaweed specimens are fixed in one location, yielding a direct link to georeferenced macroecological data. As a whole, seaweeds occur in a wide range of coastal habitats and many seaweed genera or families have a world-wide distribution, resulting in sufficient variability in macroecological dimensions and biogeographical patterns. Evolutionary relationships between and within seaweed genera are being characterized in increasing detail as a result of molecular phylogenetic research during the past few decades. Finally, seaweeds are straightforward to collect and process, making them easy targets for this kind of research.
The marine green algal genus Halimeda is among the better-studied seaweeds from a phylogenetic perspective and is therefore an obvious candidate for studies of niche evolution and biogeography. Halimeda consists of segmented, calcified thalli and abounds on and around coral reefs and in lagoons throughout the tropics and subtropics up to depths in excess of 150 m (Hillis-Colinvaux, 1980). Halimeda species are important primary producers and provide food and habitat for small animals and epiflora (Jensen et al., 1985; Naim, 1988). After the algae reproduce, they die and their calcified segments are shed. Halimeda segments account for up to 90% of tropical beach sand and carbonate rock of tropical reefs (e.g. Drew, 1983; Freile et al., 1995). The biogeography of Halimeda has been described in some detail. All but one species are restricted to a single ocean basin (Indo-Pacific or Atlantic), and biogeography has a strong phylogenetic imprint: each of the five sections of the genus is separated into Atlantic and Indo-Pacific sublineages, suggestive of a strong vicariance event. Even though the species distribution ranges and the historical biogeographical patterns have been identified, questions about what causes them remain (Kooistra et al., 2002; Verbruggen et al., 2005b). Are species restricted to one ocean basin because of habitat unsuitability in the other basin or should the limited distribution ranges be attributed to dispersal limitation? It is also not known with certainty which vicariance event may be responsible for the phylogenetic separation of Indo-Pacific and Atlantic lineages. So far, two geological events have been implicated: the Miocene closure of the Tethys Seaway in the Middle East and the Pliocene shoaling of the Central American Isthmus (Kooistra et al., 2002; Verbruggen et al., 2005b).
The first goal of the present study is to investigate the evolutionary niche dynamics of the seaweed genus Halimeda, focusing on niche dimensions relevant to global geographical distributions rather than local distributional issues such as microhabitat preferences. The second goal is to investigate two aspects of the biogeography of the genus: why are species restricted to a single ocean basin and what caused the historical biogeographical splits. Our approach consists of a combination of molecular phylogenetics, niche modelling, optimization of models of macroecological trait evolution, and ancestral state estimation.
MATERIALS AND METHODS
Species delimitation was based on a combination of DNA sequence data and morphological knowledge, with molecular data serving as the primary source of information used to define species boundaries and morphological species boundaries being assessed secondarily, using the species groups determined with DNA data. We used this approach because traditional morphological species definitions are often inaccurate in seaweeds due to morphological plasticity, convergence and cryptic speciation (e.g. Saunders & Lehmkuhl, 2005). The proposed approach has previously been applied to define species boundaries more accurately (Verbruggen et al., 2005a).
The DNA datasets initiated by Kooistra et al. (2002) and Verbruggen (2005) were extended for this study using previously described protocols (Verbruggen, 2005), resulting in 264 UCP7 sequences, 337 ITS sequences and 106 tufA sequences belonging to a total of 444 specimens. These three datasets were subjected to neighbour joining analysis to detect species-level clusters. Using this approach, the sequenced specimens were attributed to 52 Halimeda species. If easily recognizable combinations of morphological features could be identified for species by studying the sequenced specimens, these features were used for identification of additional collections from various herbaria (BISH, Bishop Museum; GENT, Ghent University; L, Nationaal Herbarium Nederland, Leiden University branch; PC, Muséum National d’Histoire Naturelle; UPF, Université de Polynésie Française; US, Smithsonian Institution) that were not suitable for sequencing (see the Index Herbariorum website (http://sweetgum.nybg.org/ih/) for further herbarium details).
Preprocessing observation data
Recent collections had accurate coordinates recorded with a global positioning device. Older collections with detailed locality information were georeferenced (latitude and longitude) using Google Earth (http://earth.google.com). Points that fell on land when plotted on coarse-resolution environmental grids were manually moved to the adjacent coastal waters using IDRISI Andes (http://www.clarklabs.org/). Data were examined for georeferencing errors by checking for geographical outliers with visual and overlay methods in ArcGIS (http://www.esri.com/). Errors were identified by creating an overlay between the point locality layer and a maritime boundaries layer (exclusive economic zones and coastlines) provided by the Flanders Marine Institute (http://www.vliz.be/). Any mismatch between these layers was indicative of a potential georeferencing error and outlying points were removed if their origin could not be confirmed.
The evolutionary history underlying the 52 species of Halimeda included in the study was inferred from a multilocus DNA dataset using Bayesian phylogenetic inference (Holder & Lewis, 2003). Bayesian phylogenetic inference techniques make explicit use of models of sequence evolution, an approach that has been shown to outperform methods that do not assume such models (Swofford et al., 2001). Sequence data from four chloroplast loci (rbcL, tufA, UCP3, UCP7) and two nuclear markers (SSU nrDNA, ITS region) were obtained following previously described protocols (Famà et al., 2002; Kooistra et al., 2002; Provan et al., 2004; Lam & Zechman, 2006) or from previously published studies (Kooistra et al., 2002; Verbruggen et al., 2005a,b). Individual loci were aligned by eye and ambiguous regions were removed. Data for a few loci were missing mainly for recently discovered species but the concatenated data matrix was 90% filled. The final alignment can be obtained from http://www.phycoweb.net/ and http://www.treebase.org/. All new sequences generated in this study have been submitted to GenBank (accession numbers FJ624485–FJ624863).
In order to identify a suitable model of sequence evolution for our dataset, we used model selection procedures based on the second-order Akaike information criterion (AICc) (Sullivan, 2005). The phylogenetic analysis was carried out with the model of sequence evolution that yielded the lowest AICc score. This model contained 14 partitions: SSU nrDNA, the ITS region and three codon positions per protein-coding gene. The GTR + Γ8 substitution models yielded the best fit to the data for all partitions. Bayesian phylogenetic inference was carried out with MrBayes v.3.1.2 (Ronquist & Huelsenbeck, 2003). Five runs of four incrementally heated chains were run for 10 million generations using default priors and chain temperature settings. Convergence of the Markov chain Monte Carlo (MCMC) runs was assessed with Tracer v.1.4 (Rambaut & Drummond, 2007). An appropriate burn-in was determined with the automated method proposed by Beiko et al. (2006) and a majority rule consensus tree was built from the post-burn-in trees. The tree was rooted at the point where root-to-tip path length variance was minimal.
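The model selection criterion named above can be sketched in a few lines. This is only an illustration of the AICc calculation: the function names, the two candidate labels and their log-likelihoods and parameter counts are hypothetical, and using the alignment length as the sample size is an assumed convention rather than something stated in the text.

```python
def aicc(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Second-order Akaike information criterion (AICc)."""
    if n_samples - n_params - 1 <= 0:
        raise ValueError("AICc is undefined when n_samples <= n_params + 1")
    aic = -2.0 * log_likelihood + 2.0 * n_params
    # Small-sample correction term that distinguishes AICc from AIC.
    correction = 2.0 * n_params * (n_params + 1) / (n_samples - n_params - 1)
    return aic + correction


def best_model(candidates: dict[str, tuple[float, int]], n_samples: int) -> str:
    """Return the name of the candidate with the lowest AICc.

    candidates maps a model name to (log-likelihood, number of free parameters).
    """
    return min(candidates, key=lambda name: aicc(*candidates[name], n_samples))


# Hypothetical log-likelihoods and parameter counts, for illustration only.
candidates = {
    "unpartitioned": (-52000.0, 10),
    "partitioned": (-50500.0, 140),
}
print(best_model(candidates, n_samples=4965))
```

The correction term grows with the number of parameters, so a heavily partitioned model is preferred only when its gain in likelihood outweighs the added parameters.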
The branch lengths of the obtained consensus phylogram are proportional to the estimated amount of molecular evolution occurring on the branches. In order to model character evolution, in our case evolutionary niche dynamics, branch lengths should be proportional to evolutionary time rather than amounts of molecular evolution. To obtain a chronogram (i.e. a phylogram with branch lengths proportional to evolutionary time), penalized likelihood rate smoothing (Sanderson, 2002) was carried out on the consensus tree with r8s (Sanderson, 2003), using both the additive and the log-additive penalty settings. The root of the phylogeny was assigned an age of 147 Ma, following the molecular clock result from Verbruggen et al. (2009). We refer to the latter paper for details regarding the dating of the phylogeny.
Macroecological variables were selected to represent the major environmental dimensions assumed to influence seaweed distribution at a global scale and subject to data availability (Lüning, 1990) (Table 1). The base macroecological data included geophysical, biotic and climate variables derived from level-3 preprocessed satellite data (Aqua-MODIS and SeaWiFS) available at OceanColor Web (http://oceancolor.gsfc.nasa.gov/). We downloaded grids representing monthly averages at a 5 arcmin (≈ 9.2 km) spatial resolution. These geometrically corrected images are two-dimensional arrays with an equidistant cylindrical (plate carrée) projection of the globe. Yearly minimum, maximum and average values were calculated from the monthly averages with MATLAB (http://www.mathworks.com/). To achieve this, average monthly images were generated by averaging images of the same month across years (e.g. average sea surface temperature (SST) of July from 2003 to 2007). Subsequently, yearly minimum and maximum images were composed by selecting the minimum and maximum pixels from these monthly averages. Finally, yearly average images were created by taking the mean value for every grid cell of the monthly averages. All images were cropped to the latitudinal range 50° N–40° S, which includes the highest latitudes at which Halimeda can be found.
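The compositing steps described above can be sketched as follows. This is a plain-Python illustration on nested lists, not the MATLAB code used in the study; the function names and the data layout (a dictionary from year to twelve monthly grids) are assumptions, and missing pixels such as land or cloud cover are not handled.

```python
from statistics import mean
from typing import Callable

Grid = list[list[float]]


def combine(grids: list[Grid], func: Callable[[list[float]], float]) -> Grid:
    """Apply func pixel by pixel across a stack of equally sized grids."""
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    return [
        [func([g[r][c] for g in grids]) for c in range(n_cols)]
        for r in range(n_rows)
    ]


def yearly_composites(monthly_by_year: dict[int, list[Grid]]) -> dict[str, Grid]:
    """monthly_by_year maps a year to its 12 monthly grids (January first)."""
    years = sorted(monthly_by_year)
    # Step 1: average images of the same calendar month across years.
    climatology = [
        combine([monthly_by_year[y][m] for y in years], mean) for m in range(12)
    ]
    # Steps 2 and 3: per-pixel minimum, maximum and mean of the 12 monthly averages.
    return {
        "min": combine(climatology, min),
        "max": combine(climatology, max),
        "mean": combine(climatology, mean),
    }
```

Because the minimum and maximum are taken from multi-year monthly averages rather than from individual images, a single anomalous month has limited influence on the resulting extremes.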
Table 1. Geophysical parameters included in the macroecological dataset.
Parameter (abbreviation): derived variables
Sea surface temperature (SST): max, min, average (day and night)
Diffuse attenuation (DA): max, min, average
Calcite concentration (Ca)
Chlorophyll A (CHLO): max, min, average
Photosynthetically active radiation (PAR): max, min, average
Evolutionary analysis of niche characteristics
The evolutionary dynamics of niche features were studied by inferring their patterns of change along the chronogram in a maximum likelihood (ML) framework. The macroecological niche features included in our study are continuous variables and we inferred their evolution with common models of continuous trait evolution. Brownian motion models, also known as constant-variance random walk models, assume that traits vary naturally along a continuous scale and that variation is accumulated proportionally to evolutionary time, as measured by the branch lengths in a chronogram (Martins & Hansen, 1997; Pagel, 1999). Two branch length scaling parameters (lambda and kappa) were used to extend this model and better describe the mode and tempo of trait evolution (Pagel, 1999). Lambda (λ) transformations measure the amount of phylogenetic signal present in a continuous character. The transformation consists of multiplying all internal branch lengths of the tree by λ, leaving tip branches their original length. When the ML estimate of λ is close to 1, the internal branches retain their original length, indicating strong phylogenetic signal in the trait. If λ approaches 0, the evolution of the trait is virtually independent of phylogeny. Kappa (κ) transformations measure the degree of punctuational versus gradual evolution of characters on a phylogeny, by raising all branch lengths to the power κ. If the ML estimate of κ is close to 0, all branch lengths approach unity, and path lengths become proportional to the number of lineage splitting events, suggesting that the evolution of the trait approximates punctuated evolution associated with speciation events. If the ML estimate of κ is close to 1, branch lengths remain unchanged, indicating that the amount of change in the character is proportional to evolutionary time. In other words, κ values close to 1 indicate gradual evolution.
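The effect of the κ transformation on path lengths can be illustrated with a toy tree. The sketch below is not the geiger implementation; the tree, the node names and the function names are hypothetical, and the tree is stored simply as a child-to-parent map with one branch length per non-root node.

```python
def kappa_transform(lengths: dict[str, float], kappa: float) -> dict[str, float]:
    """Raise every branch length to the power kappa."""
    return {node: length ** kappa for node, length in lengths.items()}


def root_to_tip(parent: dict[str, str], lengths: dict[str, float], tip: str) -> float:
    """Sum the branch lengths on the path from a tip up to the root."""
    total = 0.0
    node = tip
    while node in parent:  # the root has no parent entry
        total += lengths[node]
        node = parent[node]
    return total


# Hypothetical chronogram: ((t1:6, t2:6)A:4, t3:10)root
parent = {"A": "root", "t1": "A", "t2": "A", "t3": "root"}
lengths = {"A": 4.0, "t1": 6.0, "t2": 6.0, "t3": 10.0}

gradual = kappa_transform(lengths, 1.0)       # branch lengths unchanged
punctuated = kappa_transform(lengths, 0.0)    # every branch length becomes 1

print(root_to_tip(parent, gradual, "t1"), root_to_tip(parent, punctuated, "t1"))
```

With κ = 0, the path to t1 has length 2 (two branches, hence two lineage splits from the root) and the path to t3 has length 1, even though both tips are equally old in the original chronogram.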
In order to fit the models above and infer changes of the macroecological niche along the species phylogeny, a species × variables matrix had to be constructed. To achieve this, the values of the macroecological data layers were extracted for each sample locality. For each species, the minimum, maximum and average of each macroecological parameter were stored in the species × variables matrix. To reduce the influence of geographical sampling bias on the average values, they were calculated by weighted averaging. The Euclidean distance from the sample location to the centre of gravity for the species in question was used as the sample weight. The centre of gravity for the species was determined by averaging the three-dimensional Cartesian coordinates of all sample locations for that species.
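A minimal sketch of the distance-weighted averaging described above is given below. It assumes a unit sphere for the conversion to Cartesian coordinates and falls back to the plain mean when all samples coincide; both choices, like the function names and the example records, are assumptions made for illustration.

```python
import math


def to_cartesian(lat_deg: float, lon_deg: float) -> tuple[float, float, float]:
    """Convert latitude and longitude (degrees) to a point on the unit sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def distance_weighted_mean(records: list[tuple[float, float, float]]) -> float:
    """records holds (latitude, longitude, environmental value) per sample."""
    points = [to_cartesian(lat, lon) for lat, lon, _ in records]
    n = len(points)
    # Centre of gravity: mean of the three-dimensional Cartesian coordinates.
    centre = tuple(sum(p[i] for p in points) / n for i in range(3))
    # Sample weight: Euclidean distance from the sample to the centre of gravity.
    weights = [math.dist(p, centre) for p in points]
    total = sum(weights)
    if total < 1e-12:  # all samples at the same place: use the plain mean
        return sum(value for _, _, value in records) / n
    return sum(w * value for w, (_, _, value) in zip(weights, records)) / total


# Two clustered samples and one distant sample (hypothetical values).
records = [(0.0, 0.0, 20.0), (0.0, 0.0, 20.0), (0.0, 90.0, 30.0)]
print(distance_weighted_mean(records))
```

In this example the two co-located samples each receive half the weight of the distant one, so the densely sampled locality no longer dominates the species average.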
The models of continuous trait evolution listed above were optimized along the phylogenetic tree for the minimum, average and maximum values of a selection of niche variables using the ML optimization of the geiger package (Harmon et al., 2008). Ancestral character values for macroecological niche features were estimated by ML inference (Schluter et al., 1997) with the ape package (Paradis et al., 2004). Resulting ancestral state values were plotted on the phylogeny with TreeGradients v1.03 (Verbruggen, 2008).
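To make the ancestral state estimation more concrete, the sketch below computes the Brownian motion estimate of the trait value at the root of a small tree using the pruning scheme known from independent contrasts. It is an illustration only and not the ape implementation, which also returns estimates and confidence intervals for all internal nodes. The nested-list tree format, the function name and the trait values are hypothetical; the tree must be strictly bifurcating with non-zero branch lengths.

```python
def prune(node):
    """Return (estimate, extra_length) for the subtree rooted at node.

    A tip is a number (its trait value). An internal node is a list of two
    (child, branch_length) pairs. extra_length is the amount by which the
    branch above the node is lengthened to account for estimation error.
    """
    if isinstance(node, (int, float)):
        return float(node), 0.0
    (left, v1), (right, v2) = node
    x1, e1 = prune(left)
    x2, e2 = prune(right)
    v1 += e1
    v2 += e2
    # Weighted average of the two daughters, with weights 1/branch length.
    estimate = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)
    return estimate, v1 * v2 / (v1 + v2)


# Hypothetical chronogram ((A:1, B:1):1, C:2) with average SST values per species.
tree = [([(26.0, 1.0), (28.0, 1.0)], 1.0), (24.0, 2.0)]
root_value, _ = prune(tree)
print(root_value)
```

Species joined to the root by short branches pull the estimate more strongly than distant ones, which is why the result here lies between the sister pair (mean 27) and the single species C (24) without being their simple average.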
Niche modelling procedure
The macroecological niches of species were modelled with Maxent, a presence-only niche modelling technique based on the maximum entropy principle (Phillips et al., 2006). We used a presence-only technique because only specimen collection data are available and absence data cannot be reliably obtained for seaweed species on a global scale. Maxent has shown remarkably good performance in a comparative study of presence-only niche modelling techniques (Elith et al., 2006). It estimates the probability distribution of maximum entropy (i.e. that is most spread out, or closest to uniform) of each macroecological variable across the study area. This distribution is calculated with the constraint that the expected value of each macroecological variable under the estimated distribution matches the empirical average generated from macroecological values associated with species occurrence data. The model output consists of a spatially explicit probability surface that represents an ecological niche (habitat suitability) translated from macroecological space into geographical space. The output grid is in the logistic format, where each pixel value represents the estimated probability that the species can be present at that pixel (Phillips & Dudík, 2008).
To avoid using redundant and correlated macroecological layers for niche modelling, an unstandardized principal component analysis was performed on the original variables in IDRISI Andes. The first, second and third principal component grids, which together accounted for 98.82% of the overall variance in the original variables, were exported for subsequent use in Maxent.
Global species niches were modelled for all Halimeda species for which more than 10 distribution records were available, while excluding species with distribution records suffering from high spatial autocorrelation. Additionally, a single regional model was generated using pooled occurrence data of six Caribbean species (Halimeda goreauii, Halimeda simulans, Halimeda incrassata, Halimeda monile, Halimeda discoidea.atl and Halimeda tuna.car).
The Maxent algorithm was run with default parameters (convergence threshold = 10⁻⁵, maximum iterations = 500, regularization multiplier = 1, maximum number of background points = 10,000, and use of linear, quadratic, product and hinge features). Models were created using 80% of the localities for model training and 20% for model testing.
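The 80/20 partition of localities can be sketched as a simple random split. In practice such a split is typically requested from the modelling software itself; the function below, its name, its seed argument and the example localities are hypothetical and serve only to show the principle.

```python
import random


def split_localities(localities, test_fraction=0.2, seed=0):
    """Randomly split occurrence localities into training and test sets."""
    shuffled = list(localities)
    # A dedicated generator with a fixed seed makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical (latitude, longitude) records.
localities = [(float(i), float(-i)) for i in range(10)]
train, test = split_localities(localities)
print(len(train), len(test))
```

Fixing the seed keeps the evaluation repeatable; repeating the split with several seeds would give an impression of how sensitive the test statistics are to the particular partition.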
Statistical evaluation of the models was based on threshold-independent receiver operating characteristic (ROC) analysis (Phillips et al., 2006). For presence-only modelling, the ROC curve is a plot of sensitivity (proportion of correctly predicted presences) against the fractional area predicted present. The area under the ROC curve (AUC) is subsequently compared with the area under the null expectations line connecting the origin and (1,1), thus providing a measure of predictive model performance. An AUC approximating 1 would mean optimal discrimination of suitable versus unsuitable sites, whereas an AUC between 0 and 0.5 is indicative of predictions no better than random. Additionally, we used a modified AUC based on partial ROC curves as proposed by Peterson et al. (2008). This approach accounts for a user-defined maximum acceptable omission error, which we set at 0.1, and takes only the range of acceptable models in terms of omission error into account. The partial AUC is then expressed as a ratio to the partial area under the null expectations line. Hence, the AUC ratio equals 1 for models performing no better than random, and increases with improving model accuracy. All partial AUC calculations were performed in the R statistical computing environment (R Development Core Team, 2008).
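The evaluation statistics can be sketched as follows. This is one simple reading of the partial ROC idea, not the R code used in the study: the curve is built from suitability scores at test presences and at background pixels, the partial area is taken over the part of the curve where sensitivity is at least 1 minus the maximum omission error, and it is divided by the area under the null line over the same range of the horizontal axis. All function names and example scores are hypothetical.

```python
def roc_points(presence_scores, background_scores):
    """Points (fractional area predicted present, sensitivity) for every threshold."""
    thresholds = sorted(set(presence_scores) | set(background_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        area = sum(s >= t for s in background_scores) / len(background_scores)
        sensitivity = sum(s >= t for s in presence_scores) / len(presence_scores)
        points.append((area, sensitivity))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as sorted (x, y) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0 for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def partial_auc_ratio(points, max_omission=0.1):
    """Ratio of the partial model AUC to the partial area under the null line y = x."""
    min_sensitivity = 1.0 - max_omission
    kept = [(x, y) for x, y in points if y >= min_sensitivity - 1e-12]
    x_start = kept[0][0]
    null_area = (1.0 - x_start ** 2) / 2.0  # area under y = x from x_start to 1
    if null_area == 0.0:
        raise ValueError("no part of the curve meets the omission criterion below x = 1")
    return trapezoid_area(kept) / null_area


# A model that separates presences from background perfectly (hypothetical scores).
perfect = roc_points([0.9, 0.9, 0.9], [0.1, 0.1, 0.1, 0.1])
print(trapezoid_area(perfect), partial_auc_ratio(perfect))
```

Under this reading the ratio is 1 when presences score no differently from the background and approaches 2 for a model that captures all presences while predicting almost no area as suitable.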
Species delimitation and phylogeny
Neighbour joining analysis of the UCP7, ITS and tufA sequence alignments revealed 52 clusters with low sequence divergence within clusters and relatively high divergence between clusters, as is typically found at the species boundary (Hebert et al., 2004; Verbruggen et al., 2005a). Not all clusters corresponded to described, named species. The undescribed clusters represent cryptic or pseudo-cryptic species (Kooistra et al., 2002; Verbruggen et al., 2005a,b). The clusters inferred from DNA data formed the basis of the species definitions used in the remainder of the paper. After the addition of morphologically identified herbarium specimens, the database consisted of 1080 samples from 538 unique localities. Analysis of the concatenated alignment of rbcL, tufA, UCP3, UCP7, 18S and ITS sequences (4965 characters) yielded a well-resolved species phylogeny in which five lineages, corresponding to the five sections of the genus, could be recognized (Fig. 1).
Evolution of niche characteristics
A few niche features contained considerable amounts of phylogenetic signal, as indicated by the high ML estimates of λ values using Pagel's lambda branch length modifier (Table 2). A general observation was that average trait values contained more phylogenetic signal than minimum and maximum trait values (e.g. average temperature, not minimum or maximum temperature). High κ values for the average trait values indicate that change of these traits is proportional to evolutionary time; in other words, change is gradual (Table 2). Some traits that also contained phylogenetic signal were not included in the table because of significant correlation with the listed variables. This is the case for photosynthetically active radiation, which is correlated with SST, and diffuse attenuation, which is correlated with chlorophyll values (caused by phytoplankton).
Table 2. Optimum values of the branch length scaling parameters λ and κ used to test the mode and tempo of evolution of niche features. The niche traits are sea surface temperature (SST) and chlorophyll A (CHLO) values. The high optimal λ values inferred for average trait values indicate strong phylogenetic signal in these traits whereas the low λ values obtained for the minimum and maximum traits suggest a lack of phylogenetic signal. The relatively high optimum values for κ for average SST values suggest that evolution of this niche feature was more or less gradual (proportional to time). The lower value for average CHLO suggests that there is a non-negligible punctuated component to the evolution of nutrient preferences. The first two columns used the tree smoothed with the additive penalty; the last two columns used the tree smoothed with the log-additive penalty. The κ parameter was not optimized when there was poor phylogenetic signal in the data (low λ).
Figure 2 illustrates the estimated evolutionary patterns of average annual temperature and chlorophyll values. Estimated ancestral trait values are shown at the internal nodes and visualized using a colour gradient. An average annual temperature of 27.4 °C (95% confidence interval, 25.6–29.2) is inferred at the basal split (Fig. 2a), indicating a tropical origin for the genus. The tree clearly shows that evolution along the SST niche dimension is not homogeneous throughout the tree. Whereas the sections Rhipsalis, Micronesicae, Pseudo-opuntia and Opuntia barely deviate from typical tropical temperatures, evolution along the temperature axis has been common in section Halimeda. More specifically, the lineages leading to H. tuna.med, Halimeda cuneata.africa.1, H. cuneata.africa.2 and H. cuneata.australia have evolved a preference for colder water. Chlorophyll values were mapped onto the phylogeny as a proxy for nutrient preferences (Fig. 2b). Deviations from the average (low) nutrient preference values are present in Halimeda section Halimeda (H. cuneata.brazil, H. cuneata.africa.1, H. cuneata.africa.2, H. cuneata.australia, H. cuneata.arabia and Halimeda magnicuneata) and in Halimeda section Rhipsalis (H. incrassata).
Niche models at the global scale
Niche models indicating the areas where macroecological conditions are suitable for species to occur were generated for all species (Figs 3 & S1). The average AUC across all models with 20% test localities was 0.917 (SD = 0.046) for the training data and 0.906 (SD = 0.054) for the test data. The corresponding average AUC ratios were 1.576 (SD = 0.209) for the training data and 1.615 (SD = 0.234) for the test data. The high AUC values and ratios indicate that the most essential macroecological variables determining species distributions were accounted for in the dataset. The high scores for the test data indicate adequate model performance rather than overfitting of the model on the training data. The predicted distributions are clearly broader than the known species distributions. For example, the distribution model of the exclusively Indo-Pacific species H. borneensis (Fig. 3a) predicts habitat suitability in parts of the Atlantic Ocean. Similarly, the model of the Caribbean species Halimeda simulans (Fig. 3b) predicts habitat suitability in parts of the Indo-Pacific basin. In general, there was a stronger tendency of predicting Atlantic species into the Indo-Pacific than vice versa.
Niche model at the regional scale
The model predicting suitable habitat for a suite of six Caribbean species is shown in Fig. 3(c) (AUC ratio = 1.783). Potentially suitable habitats of these Caribbean species in the eastern Pacific are mainly predicted along the southern coast of Panama, the western coast of Colombia and in the Galapagos Islands (Fig. 3c–e).
The obtained results invite discussion about several issues related to the macroecological niche of seaweeds, how it evolves and how it relates to patterns of biogeography.
Modelling seaweed distributions
Niche modelling versus previous approaches
Our niche models indicate areas where the macroecological conditions are likely to be suitable for various Halimeda species to establish populations. They reflect the marked tropical nature of most species and show that many species occupy only part of the potentially suitable habitat (see below). Previous knowledge about the macroecological niche of seaweeds mainly stemmed from comparing distribution ranges with isotherms (isotherm fitting), studying survival and growth under various culture conditions or a combination of both approaches (e.g. van den Hoek, 1982). These approaches and the niche modelling approach presented here differ from each other in a number of aspects. Whereas the fundamental niche is investigated with in vitro studies of survival and growth, the realized niche is central in modelling techniques and isotherm fitting. A fundamental difference between niche modelling and both the other approaches is that the former yields probabilistic output whereas the latter usually propose hard thresholds. The ease with which a niche modelling study can be carried out has benefits as well as drawbacks. The advantage is obvious when targeting species that are difficult to grow in culture. A disadvantage of niche modelling is that the choice of a specific niche modelling algorithm and the parameter settings may influence niche predictions and predictive model performance (Elith et al., 2006; Peterson et al., 2008). The maximum entropy method with ROC modifications appeared to be the most suitable option for our goals. All methods share the drawback of being sensitive to specimen sampling. In this respect, the absolute number of samples is likely to be of inferior importance compared with the spread of samples across relevant macroecological dimensions (Pearson, 2007).
An additional concern about the application of niche models in seaweed research is the ease with which heterogeneous distribution records can be used to generate models. As mentioned earlier, morphological species delimitation is troublesome in algae and, as a consequence, published species occurrence records based on morphological identifications are not always meaningful. We have taken great caution to avoid identification errors through DNA-guided species delimitation.
Macroevolution of the macroecological niche
Evolutionary processes are influenced by environmental variation in space and time (Kozak et al., 2008). Many studies taking a niche modelling approach to the study of environmental variation in a phylogenetic framework have shown strong heritability of macroecological preferences (e.g. Martínez-Meyer & Peterson, 2006; Yesson & Culham, 2006). To our knowledge, these studies have all focused on terrestrial organisms. The evolutionary dynamics of the niche of seaweeds have hardly been studied in the past. Breeman et al. (2002) investigated the evolution of temperature responses in the seaweed genus Cladophora. Their approach consisted of measuring cold tolerance, heat tolerance and growth of various culture strains at different temperature regimes. The response variables (tissue damage and growth rates) were interpreted along a phylogenetic tree, leading to the conclusion that the two main lineages of the Cladophora vagabunda complex had divergent cold tolerances. Although the experimental data from this study differ from ours as discussed above, the approach taken to infer niche dynamics in both studies is not fundamentally different. However, thanks to the advances in models describing the evolution of continuous characters that have taken place since the publication of Breeman et al. (2002) and their implementation in user-friendly packages for the R statistical computing environment (Paradis et al., 2004; Harmon et al., 2008), much more detailed inferences can now be made.
Our study shows that the macroecological niche in the seaweed genus Halimeda has a strong phylogenetic imprint and that niches appear to change gradually with time. The results clearly indicate the phylogenetic heritability of macroecological preferences: four out of five sections (Rhipsalis, Micronesicae, Opuntia and Pseudo-opuntia) demonstrate conserved preference for high temperatures and low nutrient levels, confirming the association of these sections with tropical coral reefs and shallow lagoons (Fig. 2). Adaptation to colder and more nutrient-rich water only occurred in section Halimeda. Remarkably, the transition into colder water seems to have taken place four times independently (in H. tuna.med, H. cuneata.africa.1, H. cuneata.africa.2 and H. cuneata.australia). The species H. tuna.med is the only one inhabiting the Mediterranean Sea and can maintain populations at sites with yearly sea surface temperature minima around 10 °C. The species H. cuneata.africa.1 and H. cuneata.africa.2 occur in south-east Africa. H. cuneata.australia is found along the shores of south-western Australia. Chlorophyll values, used as a proxy for the trophic status of the surface water (Duan et al., 2007), are above average for certain species in section Halimeda, often the subtropical species. It is known that nutrient levels increase with latitude in the latitudinal range studied here (Sasai et al., 2007). Halimeda cuneata.brazil occurred in waters with high average chlorophyll values due to an overall high concentration along the Brazilian coast. The high average chlorophyll value of waters in which H. incrassata was recorded is largely due to an outlier observation in Florida.
Sources of uncertainty
Our study of evolutionary niche dynamics involves several successive analyses, hence a discussion of the potential sources of uncertainty affecting the final result is in order. The first source of uncertainty is in the species phylogeny. A lack of support for phylogenetic relationships will have direct repercussions on the accuracy of downstream analyses. In our study, the use of a multilocus alignment yielded very high statistical support for the great majority of branches in the tree. Therefore we have used the tree resulting from the Bayesian analysis (Fig. 1) in subsequent analyses as if it were known without uncertainty. Second, inferences of trait evolution also depend on branch lengths, which are affected by two potential sources of uncertainty: branch length estimation error in the phylogenetic analysis and error from the rate smoothing process that transforms the phylogram into a chronogram. Rate smoothing in particular can lead to variation in branch lengths if different settings are used. We followed the recommendations in the manual of the r8s program. Third, the values used as character states of the terminal taxa influence the results. We used distance-weighted averages as fixed character states for the terminal taxa, whereas in reality there is variation around the average. Taking this variation into account is expected to broaden confidence intervals on inferred ancestral states (Martins & Hansen, 1997). A fourth source of error could result from the inability of Brownian motion models to capture the complexity of historical forces affecting niche evolution, a source of error inherent in using simple models to describe a more complex reality. The last element of uncertainty lies in the ancestral character estimation, which infers values for ancestral taxa based on values of recent taxa. These analyses, however, report the 95% confidence intervals around the inferred value. If a character evolves fast, this will be reflected in broader confidence intervals on ancestral character states (Martins, 1999). We have not attempted to quantify the accumulation of uncertainty throughout our sequence of analyses due to practical limitations, but the reader should be aware of the assumptions that were made.
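The link between evolutionary rate and the width of the confidence interval on an ancestral state can be illustrated with a minimal Python sketch. It is not the analysis performed in this study; it applies Felsenstein's independent-contrasts pruning to a hypothetical three-taxon chronogram with made-up tip values, assuming a fully bifurcating tree, Brownian motion and a normal approximation for the interval. The names `prune`, `root_state_ci` and `TOY_TREE` are illustrative only.

```python
import math

# Hypothetical toy chronogram. A tip is a float (trait value, e.g. mean SST);
# an internal node is a tuple (left, left_branch_len, right, right_branch_len).
TOY_TREE = ((28.0, 1.0, 26.0, 1.0), 1.0, 24.0, 2.0)


def prune(node):
    """Independent-contrasts pruning under Brownian motion.

    Returns (estimate, extra_length, contrasts), where extra_length is the
    branch length added above the node to account for estimation error and
    contrasts is the list of standardized contrasts in the subtree.
    """
    if not isinstance(node, tuple):
        return float(node), 0.0, []
    left, v_left, right, v_right = node
    x_left, e_left, c_left = prune(left)
    x_right, e_right, c_right = prune(right)
    v_left += e_left      # lengthen branches by the error of the child estimates
    v_right += e_right
    contrast = (x_left - x_right) / math.sqrt(v_left + v_right)
    # Weighted average of the two children, weights inverse to branch length
    estimate = (x_left / v_left + x_right / v_right) / (1 / v_left + 1 / v_right)
    extra = v_left * v_right / (v_left + v_right)
    return estimate, extra, c_left + c_right + [contrast]


def root_state_ci(tree, z=1.96):
    """Root state estimate with an approximate 95% confidence interval."""
    estimate, extra, contrasts = prune(tree)
    # Rate estimate: mean squared standardized contrast
    sigma2 = sum(c * c for c in contrasts) / len(contrasts)
    half_width = z * math.sqrt(sigma2 * extra)
    return estimate, estimate - half_width, estimate + half_width


if __name__ == "__main__":
    est, lo, hi = root_state_ci(TOY_TREE)
    print(f"root estimate {est:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

In this sketch the half-width of the interval is proportional to the square root of the estimated rate, so a character whose tip values are spread twice as widely over the same tree yields an interval twice as wide.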
Despite the relatively high levels of uncertainty usually associated with ancestral state estimation of continuous characters (Schluter et al., 1997), the observed conservatism for environmental preferences yields a relatively narrow 95% confidence interval for the average SST characterizing the habitat of the most recent common ancestor of extant Halimeda species (25.6–29.2 °C). The ML estimate of 27.4 °C appears to be in agreement with the tropical Tethyan origin of Halimeda that was previously derived from the fossil record. The earliest known fossil that is considered to belong to the genus is Halimeda soltanensis from the Upper Permian (± 250–270 Ma) of Djebel Tebaga in South Tunisia (Poncet, 1989), which was at that time located at a low latitude along the western shore of the Tethys Ocean (Smith et al., 1994). A more diverse assemblage of species with a markedly tropical distribution had evolved by the Upper Cretaceous (± 100–65 Ma) (Dragastan & Herbig, 2007). The invasion of Halimeda into higher latitudes has not been documented in the fossil record. Our chronogram suggests that the invasion occurred during late Palaeogene and Neogene times, a period characterized by global cooling (Zachos et al., 2001). This finding confirms earlier hypotheses that at least parts of the warm-temperate seaweed floras originated from tropical ancestry during this period of globally decreasing temperatures (van den Hoek, 1984; Lüning, 1990).
Halimeda species have previously been shown to be geographically restricted to either the Atlantic Ocean or the Indo-Pacific basin (Kooistra et al., 2002; Verbruggen et al., 2005a,b). One could ask whether the absence of Atlantic species in the Indo-Pacific (and vice versa) is a consequence of dispersal limitation or if habitat differences may be responsible for the limited distributions. The niche model of the Indo-Pacific species H. borneensis clearly indicates that some parts of the Caribbean Sea would be suitable habitat (Fig. 3a) and the niche model of the Atlantic species H. simulans suggests that it could survive in large parts of the Indo-Pacific tropics (Fig. 3b). Similar patterns were observed for other species (Fig. S1). So, unless Halimeda species are limited by habitat differences between the Atlantic and Indo-Pacific basins that are not represented in our macroecological data, it can be concluded that dispersal limitation is the most likely explanation for the strong separation of Atlantic and Indo-Pacific species. Dispersal limitation of benthic tropical marine organisms between oceans is not uncommon (Lessios et al., 2001; Teske et al., 2007) and can be explained by the north–south orientation of the African and American continents, prohibiting marine dispersal between the Atlantic and Indo-Pacific basins through tropical waters. Halimeda opuntia is the only species that occurs in both ocean basins. It is part of a clade of Indo-Pacific species, indicating that it originated in the Indo-Pacific basin and subsequently dispersed to the Atlantic Ocean and spread throughout its tropical regions. It was previously suggested that H. opuntia was introduced to the Atlantic Ocean by early inter-oceanic shipping (Kooistra & Verbruggen, 2005). 
If this scenario is correct, our model's prediction that parts of the tropical Atlantic Ocean form suitable habitat for Indo-Pacific species and the conclusion of dispersal limitation between ocean basins would be confirmed.
Geographical distribution patterns show a clear phylogenetic signal: each section separates largely into an Atlantic and an Indo-Pacific lineage (Fig. 1), confirming previous observations (Kooistra et al., 2002; Verbruggen et al., 2005b). This pattern indicates ancient lineage splitting through vicariance and subsequent diversification within the Atlantic and Indo-Pacific basins. A number of geological events are commonly invoked to explain sister relationships between strictly Atlantic and strictly Indo-Pacific lineages. The first is the spreading of the Atlantic Ocean, which started during the Jurassic (± 170–160 Ma) (Smith et al., 1994). The second is the collision of the African and Eurasian plates in the Middle East during the Miocene (± 15–12 Ma) (Rögl & Steininger, 1984). The third event is the closure of the Central American Seaway in the Pliocene (± 3 Ma) (Coates & Obando, 1996). Different events have been hypothesized to be at the basis of the geographical splits in Halimeda but results have remained inconclusive (Kooistra et al., 2002; Verbruggen et al., 2005b). Our chronogram suggests that the splits between Atlantic and Indo-Pacific lineages originated at various times during the Palaeogene (65–25 Ma). In other words, the time frame of initial divergence does not correspond closely with any of these three geological events. During the Palaeogene, however, an important oceanographic event that may have limited dispersal between the Atlantic and Indo-Pacific ocean basins took place: the circum-equatorial current that homogenized the tropical marine biome during the Cretaceous was deflected to the south of Africa (Lawver & Gahagan, 2003). This result suggests that geological barriers may not be the initial cause of divergence between populations but instead act as barrier reinforcements after divergence has been initiated by oceanographic events.
A similar conclusion was reached in molecular and paleontological studies of species across the Central American Isthmus (e.g. Collins et al., 1996; Knowlton & Weigt, 1998). The generality of this pattern requires additional study. For some organisms at least, divergence times between Atlantic and Indo-Pacific lineages obtained with a molecular clock match more closely with the timing of the collision of the African and Eurasian plates in the Middle East (e.g. Teske et al., 2007).
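The mismatch between the Palaeogene divergence window and the three candidate geological events can be checked with simple interval arithmetic. The sketch below uses only the approximate ages quoted in this section; the function names, and the treatment of the Pliocene closure as a single point at 3 Ma, are illustrative assumptions.

```python
# Ages in Ma, given as (older, younger) bounds as quoted in the text.
SPLIT_WINDOW = (65.0, 25.0)  # Palaeogene splits suggested by the chronogram
EVENTS = {
    "Spreading of the Atlantic Ocean (Jurassic)": (170.0, 160.0),
    "Africa-Eurasia collision (Miocene)": (15.0, 12.0),
    "Closure of the Central American Seaway (Pliocene)": (3.0, 3.0),
}


def overlaps(a, b):
    """True if two age intervals share any point in time."""
    return min(a) <= max(b) and min(b) <= max(a)


def gap(a, b):
    """Time in Myr separating two intervals (0 if they overlap)."""
    return max(0.0, max(min(a), min(b)) - min(max(a), max(b)))


for name, interval in EVENTS.items():
    print(f"{name}: overlap={overlaps(SPLIT_WINDOW, interval)}, "
          f"gap={gap(SPLIT_WINDOW, interval):.0f} Myr")
```

Under these bounds none of the three events falls within the divergence window; the nearest, the Miocene plate collision, postdates the youngest splits by about 10 Myr.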
Regional biogeography of tropical America
As an alternative to the molecular clock, one would also be able to infer which geological events were involved in species partitioning between the Atlantic and Indo-Pacific through a thorough study of eastern Pacific Halimeda species. The Caribbean and eastern Pacific formed a single tropical marine biota that was separated by the shoaling of the Central American Isthmus during the Pliocene, resulting in the formation of many trans-isthmian sister species (Knowlton & Weigt, 1998). The emergence of a land bridge has been dated at approximately 3 Ma (Coates & Obando, 1996). The presence of trans-isthmian species pairs with a distribution limited to the tropical Americas (i.e. not in the wider Indo-Pacific) can be taken as evidence for vicariance across the Central American Isthmus.
Only H. discoidea has been reported from the eastern Pacific and, curiously, molecular analyses have shown these populations not to be related to the Caribbean species H. discoidea.atl as one may expect but to the Indo-Pacific species H. discoidea.ip (Verbruggen et al., 2005b). So either Halimeda does not have trans-isthmian species pairs in the tropical Americas or they have not been discovered yet. The seaweed flora of the tropical East Pacific Ocean has not been studied in great detail, and recent inventories have yielded many new discoveries (Wysor, 2004). We aimed to facilitate the discovery of trans-isthmian sister pairs by identifying geographical regions in the East Pacific Ocean that are hotspots of habitat suitability for Caribbean species. The niche model of pooled distribution data of six Caribbean species predicted parts of the East Pacific Ocean as suitable habitat (Fig. 3c) and identified three hotspots of habitat suitability: the western Galapagos Islands (Fig. 3e), the west coast of Colombia and parts of the south coast of Panama (Fig. 3d). We suggest that these areas should be targeted in future research expeditions aiming to discover trans-isthmian species pairs. The utility of ecological niche models to guide discovery has already been documented. Unexplored deep-water kelp forests were recently found in the Galapagos Archipelago based on predictions of a synthetic oceanographic and ecophysiological model (Graham et al., 2007). Similarly, expeditions directed by niche models of chameleons led to the discovery of additional populations of known species and several species new to science (Raxworthy et al., 2003). It should be noted that the niche model presented here predicts habitat suitability only as a function of the macroecological variables included in the dataset. It is beyond doubt that factors not included in our dataset (e.g. microhabitat characteristics, tidal amplitudes, grazing pressure and other biotic interactions) affect the actual distribution of species. If such data were available, they could be used to create a more specific model and would probably result in smaller hotspots, allowing even more targeted expeditions.
We thank W. Willems for providing the R script to calculate partial AUC values and for discussion of techniques. We are grateful to M. Accioly, K. Arano, M. Bandeira-Pedrosa, C. Battelli, B. Brooks, K. Clifton, M. Coffroth, P. Colinvaux, R. Collin, E. Coppejans, O. Dargent, Y. de Jong, G. De Smedt, E. Demeulenare, R. Diaz, E. Drew, S. Fredericq, C. Galanza, S. Guimaraes, F. Gurgel, O. Gussmann, R. Haroun, I. Hendriks, J. Hernandez, L. Hillis, J. Huisman, M. Kaufmann, L. Kirkendale, L. Liao, D. Littler, M. Littler, G. Llewellyn, P. Marshall, J. Maté, A. Maypo, A. N’Yeurt, D. Olandesca, C. Ortuno, K. Page, F. Parrish, C. Payri, G. Procaccini, W. Prud’homme van Reine, L. Raymundo, T. Schils, E. Tronchin, M. Van Veghel, P. Vroom, S. Williams, S. Wilson, B. Wysor and J. Zuccarello for providing specimens. Funding was provided by the Research Foundation – Flanders (research grant G.0142.05 and post-doctoral fellowships to H.V. and F.L.) and IWT (doctoral fellowship to L.T.). We thank two anonymous referees for their constructive comments on a previous version of the manuscript.
The Phycology Research Group at Ghent University (Belgium) has a broad interest in seaweed evolution. Research focuses on the exploration of patterns of algal diversification and answering specific questions about seaweed evolution through integrative research. Our research focus includes molecular phylogenetics, reproductive biology, bacterial–algal interactions, remote sensing, biogeographical inquiry and genomics. Through the integration of phylogenetic techniques and niche modelling in a GIS framework, the research group aims to study evolutionary dynamics of the macroecological niche of seaweeds. URL: http://www.phycology.ugent.be/.