Evolutionary history of the butterflyfishes (f: Chaetodontidae) and the rise of coral feeding fishes



    1. School of Marine and Tropical Biology, James Cook University, Townsville, Qld, Australia
    2. Australian Research Council Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, Qld, Australia

    1. School of Marine and Tropical Biology, James Cook University, Townsville, Qld, Australia
    2. Molecular Evolution and Ecology Laboratory, James Cook University, Townsville, Qld, Australia
  • P. F. COWMAN,

    1. School of Marine and Tropical Biology, James Cook University, Townsville, Qld, Australia
    2. Australian Research Council Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, Qld, Australia

    1. Australian Research Council Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, Qld, Australia
  • N. KONOW,

    1. School of Marine and Tropical Biology, James Cook University, Townsville, Qld, Australia
    2. Ecology and Evolutionary Biology, Brown University, Providence, RI, USA

    1. School of Marine and Tropical Biology, James Cook University, Townsville, Qld, Australia
    2. Molecular Evolution and Ecology Laboratory, James Cook University, Townsville, Qld, Australia

David R. Bellwood, School of Marine and Tropical Biology, James Cook University, Townsville, Qld 4811, Australia.
Tel.: +61 7 4781 4447, fax: +61 7 4725 1570; e-mail: david.bellwood@jcu.edu.au


Of the 5000 fish species on coral reefs, corals dominate the diet of just 41 species. Most (61%) belong to a single family, the butterflyfishes (Chaetodontidae). We examine the evolutionary origins of chaetodontid corallivory using a new molecular phylogeny incorporating all 11 genera. A 1759-bp sequence of nuclear (S7I1 and ETS2) and mitochondrial (cytochrome b) data yielded a fully resolved tree with strong support for all major nodes. A chronogram, constructed using Bayesian inference with multiple parametric priors, and recent ecological data reveal that corallivory has arisen at least five times over a period of 12 Ma, from 15.7 to 3 Ma. A move onto coral reefs in the Miocene foreshadowed rapid cladogenesis within Chaetodon and the origins of corallivory, coinciding with a global reorganization of coral reefs and the expansion of fast-growing corals. This historical association underpins the sensitivity of specific butterflyfish clades to global coral decline.


Coral reef fishes are a highly diverse group, with an evolutionary history extending back more than 50 Myr (Bellwood & Wainwright, 2002). From the fossil record, it appears that scleractinian-dominated coral reefs and modern coral reef fish families first appeared and then diversified at approximately the same time, in the early Cenozoic (Bellwood, 1996; Bellwood & Wainwright, 2002; Wallace & Rosen, 2006). This suggests that the origins of modern coral reefs and their associated fish families may be closely linked. However, it is remarkable that of the 5000 or more fish species recorded from coral reefs today only 128 eat corals (Cole et al., 2008; Rotjan & Lewis, 2008) and just 41 are believed to feed directly on scleractinian corals as their primary source of nutrition. Moreover, 61% (25 of 41) belong to a single family, the butterflyfishes (f. Chaetodontidae); of the remainder most (eight) are in the Labridae. Why so few species have been able to exploit such a widespread resource remains a mystery. It also highlights the exceptional abilities of the few corallivores that have managed to subsist on corals, and the remarkable status of butterflyfishes. Despite being one of the most intensively studied families of reef fishes, the evolution of this highly specialized feeding mode remains poorly understood. In particular, how many times has corallivory arisen within the group and when did this unusual feeding mode first arise? Did corallivory arise along with major coral groups in the early Eocene?

Butterflyfishes are conspicuous and iconic inhabitants of coral reef environments. The family contains over 130 species with representatives in all coral reef regions (Allen et al., 1998; Kuiter, 2002). Their colourful patterns, and ease of identification and observation have ensured that the behavioural, ecological, morphological and biogeographic characteristics of butterflyfishes have been extensively studied (e.g. Motta, 1988; Ferry-Graham et al., 2001a; Findley & Findley, 2001; Pratchett, 2005). Indeed, they have been regularly identified as indicator species of reef health (Reese, 1975; Roberts et al., 1988; Kulbicki et al., 2005). It is this close association with corals and coral reefs that stands as one of the most important features of this family. With almost a quarter of the species feeding on corals, they have what is arguably the closest association of any fish group with coral reefs. The key to understanding the history of this relationship, however, is to obtain well-supported phylogenies based on multiple genes, and to use robust molecular dating methodologies, informed by reliable fossil data, to provide a temporal framework in which to interpret recent ecological evidence.

Fossil evidence of increasing reef–fish interactions points to a major change between the late Mesozoic and the beginning of the Tertiary (Bellwood, 2003). Although there is a diverse range of acanthomorph fishes in the late Cretaceous, the earliest evidence of the vast majority of extant reef fish families, for which there is a fossil record, is from the Eocene. Most of these families are first recorded from the 50-Myr-old deposits of Monte Bolca, in northern Italy (e.g. Blot, 1980; Bellwood, 1996; although it should be noted that a few molecular studies have suggested that some lineages may predate the Cretaceous/Tertiary (K/T) boundary, e.g. Streelman et al., 2002; Alfaro et al., 2007; Azuma et al., 2008). The Monte Bolca deposits mark the first modern coral reef fish assemblage in terms of both its taxonomic composition and the functional attributes of the component taxa (Bellwood, 1996), and it is here that we observe the first evidence of increased interactions between fishes and the benthos, with the appearance of several lineages of fishes that were almost certainly grazing herbivores (Bellwood, 2003). Although there have been many Eocene fossils ascribed to the Chaetodontidae, a recent evaluation of this material has rejected all of these taxa; there is no reliable record of the family from the Eocene (Bannikov, 2004). The oldest reliable fossil evidence for the Chaetodontidae is of Miocene age (Carnevale, 2006). However, with a robust molecular phylogeny we can build on the fossil record and place recent ecological advances in an evolutionary framework.

Recent ecological research has provided a new perspective on the nature of corallivory in reef fishes, especially butterflyfishes. Building on existing information (e.g. Birkeland & Neudecker, 1981; Blum, 1988; Motta, 1988) recent studies have provided a detailed understanding of the family in terms of feeding morphology and kinematics (Ferry-Graham et al., 2001a,b; Konow et al., 2008; Konow & Ferry-Graham, in press), feeding strategies and behavioural interactions (Zekeria et al., 2002; Gregson et al., 2008) and, most importantly, the nature and extent of corallivory (Pratchett et al., 2004; Berumen et al., 2005; Pratchett, 2005, 2007; Cole et al., 2008; Rotjan & Lewis, 2008). It is now known that butterflyfishes exhibit considerable diversity in the nature of corallivory, underpinned by both ecomorphological and behavioural variation. This includes at least two different modes of coral feeding, exploitation of both soft and hard corals, and a spectrum of coral feeding specializations ranging from highly specialized obligate coral feeders that primarily target just one coral species to facultative and generalist coral feeders that can feed on a wide range of coral species (Pratchett et al., 2004; Berumen et al., 2005).

The evolutionary history of these traits is poorly understood. Existing morphological and molecular phylogenies have produced discordant tree topologies, opening questions about the nature of character evolution. Most molecular analyses of the family have focused on relationships between species pairs or species complexes (e.g. McMillan & Palumbi, 1995; Hsu et al., 2007) or have included butterflyfishes as part of broader studies of putative sister taxa (Bellwood et al., 2004). The most comprehensive analysis to date using 3332 bp of mitochondrial and nuclear DNA yielded a well-supported phylogeny, which differed significantly from all previous topologies, and underpinned a thorough evaluation of the systematics and taxonomy of the family (Fessler & Westneat, 2007).

The evolutionary and ecological ramifications of these relationships for corallivory, however, have yet to be fully explored. We address four critical questions: (1) How many times has corallivory arisen? (2) When did it first arise? (3) Did corallivory and/or a move onto coral reefs underpin diversification within the family? and (4) What are the broader implications for the evolution of coral reefs? In the present study, we use a comprehensive new molecular phylogeny to explore the evolutionary history of the family. For the first time, we use molecular methods to examine the relationships between all 10 genera and all respective subgenera (with the exception of Roa Roa sensu, Blum, 1988; = Roa, Kuiter, 2002). We examined 56 species using two nuclear markers (ETS2 and S7I1) and one mitochondrial marker (cyt b); the nDNA markers have not been used previously in this family. We also apply, for the first time, current Bayesian analyses for chronogram construction with multiple fossil calibrations. Using this chronogram, ecological evidence and diversification statistics, we explore the evolutionary origins of corallivory and the timing of this highly unusual fish–coral interaction. We then consider the basis for these trophic innovations in the context of the evolution of coral reefs.

Materials and methods

Taxon sampling

In total, 56 butterflyfish species were examined, with multiple representatives of all 11 identified butterflyfish genera and 12 subgenera (authorities given in Allen et al., 1998). Specimens were obtained using spears or nets with additional material obtained from the ornamental fish trade. A further eight species from the Pomacanthidae, Ephippidae, Kyphosidae and Scatophagidae were included as putative outgroup species. Based on Smith & Wheeler (2006), the Pomacanthidae then the Ephippidae are the putative successive sister taxa to the Chaetodontidae; the more distant kyphosid and microcanthid species were used to root the entire phylogeny. Our taxon sampling specifically included 18 of the 26 species known to be obligate coral feeders (Appendix S1; Table 1). Full taxon sampling in several lineages (Forcipiger, Chelmon and Chelmonops) also enables us to explore the temporal and biogeographic patterns of species origins.

Table 1.   Departure of chaetodontid lineages from the global diversification rate estimated for the family Chaetodontidae.

| Clade name | Age | Total | ∋ = 0 | ∋ = 0.3 | ∋ = 0.5 | ∋ = 0.6 | ∋ = 0.8 | ∋ = 0.9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Crown group (CT) | 32.8 | 130 (56) | rG = 0.1274 | rG = 0.1246 | rG = 0.1188 | rG = 0.1141 | rG = 0.0971 | rG = 0.0787 |
| C2 + C3 + C4 | 16.6 | 89 (38) | 1.61E-04* | 1.26E-03* | 3.90E-03 | 6.71E-03 | 2.21E-02 | 5.00E-02 |
| C3 + C4 | 15.7 | 52 (24) | 5.44E-03 | 1.69E-02 | 3.14E-02 | 4.28E-02 | 8.81E-02 | 1.49E-01 |

  1. Bold P-values highlight significantly higher species richness in subtending clade than expected under the global rate of cladogenesis (*significance after Bonferroni correction). ∋ is the extinction rate, rG is the estimated global Chaetodontidae speciation rate conditional on the extinction rate. Clade names and ages are taken from the node labels and mean node heights of Fig. 1 and Table 1 in Appendix S2.

Laboratory procedures

Total DNA was extracted from tissues using standard salt-chloroform and proteinase K digestion extraction procedures (Sambrook & Russell, 2001). Two nuclear genes, ETS2 (ETS is a transcription factor important in cell proliferation; Dwyer et al., 2007; Lyons et al., 1997) and S7 Intron 1 (S7 is a ribosomal protein required for assembling 16S rRNA; Maguire & Zimmermann, 2001; Chow & Hazama, 1998), and the mitochondrial protein-coding region cytochrome b (which participates in electron transport; Kocher et al., 1989; Irwin et al., 1991; McMillan & Palumbi, 1995) were used to explore the evolutionary relationships among the butterflyfishes (Appendix S1; Table 2). On average, two specimens were sequenced for each species. Each 20 μL polymerase chain reaction (PCR) volume contained 2.5 mM Tris–Cl (pH 8.7), 5 mM KCl(NH4)2SO4, 200 μM each dNTP, MgCl2 ranging from 1.5 to 4 mM, 10 μM each primer, 1 U of Taq Polymerase (Qiagen, Doncaster, Victoria, Australia) and 10 ng template DNA. Amplifications followed the same basic cycling protocol: an initial denaturing step of 2 min at 94 °C, followed by 35 cycles. The first five cycles consisted of 30 s at 94 °C, 30 s at the primer-specific annealing temperature (Ta) (Appendix S1; Table 2) and a 1 min 30 s extension at 72 °C; the remaining 30 cycles were performed in the same way, but with annealing at Ta − 2 °C. PCR products were purified by isopropanol precipitation (cyt b and S7I1) or by gel-purification on 2% agarose gels where two bands appeared routinely (ETS2); this was also the case for S7I1 amplified fragments of some species. A 500-bp fragment was retained for ETS2 whilst a 700-bp fragment was retained for S7I1. Gel-excised fragments were purified in a column following the manufacturer’s protocols (Qiagen). Purified templates were quantified by UV-Vis absorbance (ND-1000 Spectrophotometer, NanoDrop®, Wilmington, NC, USA) and sent to Macrogen Inc. (Seoul, South Korea) for direct sequencing in both directions.

Analytical procedures

Data compilation

The consensus sequence of the multiple specimens sequenced was used to represent each taxon. Sequences were edited using Sequencher 4.5 (Gene Codes Corporation, Ann Arbor, MI, USA), automatically aligned using ClustalX (Thompson et al., 1997) and finally manually corrected using Se-Al version 2.0, available at http://evolve.zoo.ox.ac.uk (Rambaut, 1996). Sequences from this study are available from GenBank (accession numbers ETS2: FJ167730–FJ167792; S7I1: FJ167793–FJ167846, FJ167848–FJ167856; and cyt b: FJ167682–FJ167709, FJ167711–FJ167719, FJ167721–FJ167729). Several cytochrome b sequences were taken from GenBank (Chaetodon: C. argentatus AF108580, C. citrinellus AF108585, C. kleinii AF108591, C. lineolatus AF108593, C. lunula AF108594, C. meyeri AF108597, C. miliaris U23606, C. multicinctus U23588, C. ornatissimus AF108600, C. plebius AF108602, C. quadrimaculatus AJ748302, C. unimaculatus AJ748304; Chelmon rostratus AF108612, Coradion altivelis AF108613, Coradion chrysozonus AF108614, Forcipiger flavissimus AF108615, Hemitaurichthys polylepis AF108616, Heniochus acuminatus AF108618 and Parachaetodon ocellatus AF108622, as per Littlewood et al., 2004; McMillan & Palumbi, 1995; Nelson et al., unpublished GenBank submission). Each locus was first examined for saturation using DAMBE version 5.010 (Xia & Xie, 2001). This method employs an entropy-based index of substitution saturation (Iss); if Iss is significantly larger than the critical Iss (i.e. Iss.c), sequences have experienced substitution saturation (Xia et al., 2003). Prior to concatenating gene regions, each gene was partitioned based on its function or structure. Coding genes [cyt b and a short exon region (99 bp) of ETS2] were partitioned according to codon positions (1st and 2nd combined as a conserved region, and the 3rd codon position separately). Nuclear introns (ETS2 and S7I1) were partitioned into putative stem (conserved) and loop (hypervariable) regions. Eight separate gene partitions were identified in total.
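The codon-position partitioning of the coding regions can be illustrated with a minimal sketch. It assumes an in-frame coding sequence held as a plain string; the function name `split_codon_positions` and the `frame_offset` parameter are hypothetical and do not correspond to any program used in this study.

```python
def split_codon_positions(coding_seq, frame_offset=0):
    """Split a coding sequence into a conserved partition (1st + 2nd codon
    positions) and a variable partition (3rd codon positions).

    frame_offset is the number of leading bases to skip so that the first
    retained base is a 1st codon position (an assumption of this sketch).
    """
    conserved = []
    third = []
    for i, base in enumerate(coding_seq[frame_offset:]):
        if i % 3 == 2:          # every third base is a 3rd codon position
            third.append(base)
        else:                   # 1st and 2nd positions are pooled
            conserved.append(base)
    return "".join(conserved), "".join(third)


conserved, third = split_codon_positions("ATGGCC")
print(conserved, third)
```

Applied to every sequence in an alignment, the two returned strings would form the "conserved" and "3rd codon" partitions that are later assigned separate substitution models.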

Phylogenetic analyses

Maximum parsimony (MP) analyses were implemented in PAUP* 4.0b10 (Swofford, 1998) using heuristic search methods with 1000 pseudo-replicate bootstraps, tree-bisection-reconnection branch swapping and random addition of taxa. Two separate heuristic MP runs were performed. In the first, all sites were treated equally; in the second, sites were weighted according to gene partitions [3rd codon, loop regions = 1; conserved (1st and 2nd codon) and stem regions = 2]. A 50% majority rule consensus tree was generated from all shortest trees obtained. Bayesian inference (BI) analyses were implemented in MrBayes version 3.1.2 (Huelsenbeck & Ronquist, 2001) using James Cook University’s HPC GridSphere system (https://ngportal.hpc.jcu.edu.au/gridsphere/). The analysis of the combined data used a partition mix model method (pMM) according to gene partitions, with locus-specific substitution models selected using MrModeltest version 2.2 (Nylander, 2004) under the Akaike information criterion (AIC) (Nylander et al., 2004). Two Bayesian pMM analyses were performed using Markov chain Monte Carlo (MCMC) simulations with four chains of 2 000 000 generations each, sampling trees every 100 generations. Stationarity was reached after 10 000 generations and a 50% majority rule consensus tree was computed using the best 16 000 post-burn-in trees from each run. Six putative sister taxa were included in the analyses: three pomacanthids (Pomacanthus annularis, P. rhomboides and P. sexstriatus), an ephippid (Platax orbicularis) and two scats (Scatophagus argus and Selenotoca multifasciata). In addition, two distant outgroups, a kyphosid (Kyphosus vaigiensis) and a microcanthid (Tilodon sexfasciatus), were used to root the resulting trees. The single best tree was selected for molecular dating.

Maximum likelihood (ML) analysis was performed using GARLI version 0.95 (Zwickl, 2006). Ten independent runs were performed using the best substitution model (as per AIC) for the combined data (not partitioned), selected with Modeltest version 3.7 (Posada & Crandall, 1998). The best trees from the individual runs were compared to ensure that they did not differ in topology and that the ML search was not arriving in a suboptimal area of tree space. In addition, an ML analysis with 100 bootstrap replicates was performed to show support for individual clades in the tree.

Molecular dating

Age estimation of the chaetodontid lineages was performed in the program BEAST v1.4.8 (Drummond & Rambaut, 2007). BEAST implements BI and an MCMC analysis to simultaneously estimate branch lengths, topology, substitution model parameters and dates based on fossil calibrations. It also does not assume that substitution rates are autocorrelated across lineages, allowing the user to estimate rates independently from an uncorrelated exponential distribution or lognormal distribution (UCLD). Many empirical data sets have been shown not to demonstrate autocorrelation of rates and times (Drummond et al., 2006; Alfaro et al., 2007; Brown et al., 2008). An initial ultrametric tree was constructed in r8s 1.71 (Sanderson, 2004) from the topology and branch lengths of the best Bayesian tree recovered from phylogenetic analyses, using a penalized likelihood (PL) method (Sanderson, 2002). This topology was used for calibration purposes by using parametric priors implemented in BEAST to make assumptions a priori based on fossil and biogeographic data (see below).

The vast majority of reported fossil chaetodontids are demonstrably erroneous (Bannikov, 2004), a pattern common in many other reef groups (e.g. scarids, Bellwood & Schultz, 1991; pomacentrids, Bellwood & Sorbini, 1996). Fossil selection is critical, and fossils were only used if placed in a family based on reliable morphological criteria. Fossil calibrations were therefore restricted to fossils from two families recorded from the Eocene (50 Ma) deposits of Monte Bolca: Eoplatax papilio (Ephippidae) (Blot, 1969) and Eoscatophagus frontalis (Scatophagidae) (Tyler & Sorbini, 1999). With both putative ephippid and scatophagid fossils having a minimum age of 50 Ma, an exponential prior was placed on the Platax node (PL) with a hard lower bound age of 50 Ma and a 95% soft upper bound of 65 Ma. The exponential prior reflects the decreasing probability of a lineage being older than its oldest fossil (Yang & Rannala, 2006; Ho, 2007). The soft upper bound of 65 Ma represents the transition of fish faunas at the K/T boundary (following Bellwood & Wainwright, 2002; Bellwood et al., 2004; Fessler & Westneat, 2007), beyond which there is no fossil record of modern reef fish families.
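As a worked illustration of how such a calibration can be parameterized, the sketch below derives the mean of an offset exponential distribution whose 95% quantile falls at the soft upper bound. The mean actually entered in BEAST is not reported here, so the resulting value (about 5 Myr) is an assumption of this sketch, and the function names are hypothetical.

```python
import math


def exponential_prior_mean(hard_min, soft_max, quantile=0.95):
    """Mean of an exponential prior, offset at hard_min, that places
    `quantile` of its probability mass below soft_max."""
    return (soft_max - hard_min) / -math.log(1.0 - quantile)


def prior_cdf(age, hard_min, mean):
    """Cumulative probability that the node is no older than `age`."""
    if age <= hard_min:
        return 0.0  # ages younger than the fossil are excluded (hard bound)
    return 1.0 - math.exp(-(age - hard_min) / mean)


mean = exponential_prior_mean(hard_min=50.0, soft_max=65.0)
print(round(mean, 3))                          # mean of the offset exponential, in Myr
print(round(prior_cdf(65.0, 50.0, mean), 3))   # probability mass below 65 Ma
```

Under this parameterization, ages below 50 Ma have zero prior probability, while ages beyond 65 Ma remain possible but receive only 5% of the prior mass, which is what makes the upper bound "soft".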

BEAST MCMC runs of 10 × 10⁶ generations were performed assuming the UCLD model with eight unlinked data partitions and unlinked substitution models specified by MrModeltest v2.2 (Nylander, 2004). Ten independent analyses were run, sampling every 500th generation. Resulting log files were examined using Tracer v1.4 (Rambaut & Drummond, 2007) to ensure all analyses were converging on the same area in tree space. Tree files (minus approx. 10% burn-in) were then combined using LogCombiner (Rambaut & Drummond, 2007) and compiled into a maximum clade credibility chronogram to display mean node ages and highest posterior density (HPD) intervals at 95% (upper and lower) for each node.
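The numbers of trees retained from the MCMC analyses follow directly from the chain length, sampling interval and burn-in. The sketch below is a simple bookkeeping check; the function name is hypothetical, the initial state of each chain is ignored, and the 20% burn-in used for the MrBayes runs is inferred from the 16 000 retained trees rather than stated explicitly.

```python
def retained_trees(generations, sample_every, burnin_fraction, runs=1):
    """Number of post-burn-in trees pooled across independent runs."""
    per_run = generations // sample_every            # trees sampled per run
    burnin = int(round(per_run * burnin_fraction))   # trees discarded per run
    return (per_run - burnin) * runs


# MrBayes: 2 000 000 generations, sampled every 100, two runs
mrbayes_total = retained_trees(2_000_000, 100, 0.20, runs=2)

# BEAST: 10 x 10^6 generations, sampled every 500, ten runs, approx. 10% burn-in
beast_total = retained_trees(10_000_000, 500, 0.10, runs=10)

print(mrbayes_total, beast_total)
```

These totals correspond to the 32 000 trees summarized in the Bayesian consensus and the 180 000 trees combined for the chronogram.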

Optimizing ecological traits

Two ecological traits (corallivory and habitat use) were mapped onto the best phylogenetic tree, which was the basis for dating diversification in the chaetodontids, using Mesquite v 2.6 (Maddison & Maddison, 2007). Habitat use traits were scored as 0 = not on reefs, 1 = rocky reefs and 2 = coral reefs. Diet traits were scored as 0 = noncorallivore, 1 = omnivores (that have < 1% coral in the diet and feed on other invertebrates and/or algae), 2 = facultative or occasional coral feeders [which include some (1–80%) hard coral in the diet] and 3 = corallivores (in which the diet is dominated by hard or soft corals, i.e. > 80%). Ecological character states were drawn from the published literature (Appendix S1; Table 1). Within the corallivores, species are further identified as obligate hard or soft corallivores when they feed exclusively on a specific coral type.
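The diet coding scheme can be expressed as a small lookup, sketched below. The thresholds are those given above; treating a diet with exactly 0% coral as state 0 (noncorallivore) is an assumption of this sketch, because the text does not state how states 0 and 1 were separated, and the function name is hypothetical.

```python
HABITAT_STATES = {0: "not on reefs", 1: "rocky reefs", 2: "coral reefs"}

DIET_STATES = {
    0: "noncorallivore",
    1: "omnivore (< 1% coral)",
    2: "facultative coral feeder (1-80% coral)",
    3: "corallivore (> 80% coral)",
}


def diet_state(coral_percent):
    """Map the percentage of coral in the diet to a discrete character state."""
    if not 0.0 <= coral_percent <= 100.0:
        raise ValueError("coral_percent must lie between 0 and 100")
    if coral_percent == 0.0:
        return 0   # assumption: no coral at all is scored as noncorallivore
    if coral_percent < 1.0:
        return 1
    if coral_percent <= 80.0:
        return 2
    return 3


for pct in (0.0, 0.5, 40.0, 95.0):
    print(pct, DIET_STATES[diet_state(pct)])
```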

Diversification rates

All diversification statistics were performed in R version 2.7.2 (http://www.R-project.org) (Ihaka & Gentleman, 1996) using functions written for GEIGER (Harmon et al., 2008), LASER (Rabosky, 2006) and associated packages. The constant rates (CR) test of Pybus & Harvey (2000) was used to investigate the rates of cladogenesis of the chaetodontid crown group. This test estimates the gamma statistic of the BEAST-generated chronogram. Significantly negative gamma values (< −1.645, one-tailed test) indicate a decrease in the rates of cladogenesis over time. This implies that internal nodes of the tree are distributed closer to the root than would be expected under a Yule (pure birth) process. To account for incomplete taxon sampling (which increases Type 1 error of the CR test; Pybus & Harvey, 2000), a Monte Carlo CR (MCCR) test (Pybus & Harvey, 2000) was used to compare the observed gamma to the null distribution created from 10 000 randomly subsampled, simulated (full) topologies under a Yule process. The relative cladogenesis statistic (Nee et al., 1992) was used to identify lineages with significantly faster or slower rates of cladogenesis. These methods have previously been used to investigate diversification rates in tetraodontiform lineages (Alfaro et al., 2007).
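For readers unfamiliar with the gamma statistic, the following is a minimal sketch of its calculation from the internal node ages of a fully bifurcating ultrametric tree, following the formula of Pybus & Harvey (2000). It is an illustration only, not the R code used in this study; the function name and the toy node ages are hypothetical.

```python
import math


def gamma_statistic(node_ages):
    """Gamma statistic for a bifurcating ultrametric tree.

    node_ages: ages (time before present) of all internal nodes,
    including the root; a tree with n tips has n - 1 such ages.
    """
    ages = sorted(node_ages, reverse=True)      # root first
    n_tips = len(ages) + 1
    if n_tips < 3:
        raise ValueError("need at least three tips")
    bounds = ages + [0.0]                       # append the present
    # weighted[k] = (number of lineages) * (duration of that interval);
    # the interval starting at the root has 2 lineages, the next 3, etc.
    weighted = [(k + 2) * (bounds[k] - bounds[k + 1]) for k in range(n_tips - 1)]
    total = sum(weighted)
    running = 0.0
    double_sum = 0.0
    for w in weighted[:-1]:                     # intervals with 2 .. n-1 lineages
        running += w
        double_sum += running
    numerator = double_sum / (n_tips - 2) - total / 2.0
    denominator = total * math.sqrt(1.0 / (12.0 * (n_tips - 2)))
    return numerator / denominator


early_burst = gamma_statistic([10, 9, 8, 7])    # nodes clustered near the root
late_burst = gamma_statistic([10, 3, 2, 1])     # nodes clustered near the tips
print(early_burst < -1.645, late_burst > 0)
```

In the toy examples, nodes clustered near the root give a strongly negative gamma (a slowdown), whereas nodes clustered near the tips give a positive value.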

Methods implemented in the tetraodontiform study (Alfaro et al., 2007) were used to calculate the global diversification rate (rG) of the chaetodontids across extinction rates (∋) in increments of 0.1 from 0 to 0.9 (see Magallon & Sanderson, 2001). Using functions in GEIGER (based on the method of moments estimator of Magallon & Sanderson, 2001), the probabilities of the observed species richness in each of the major chaetodontid lineages were calculated using crown group ages and the global estimates of diversification rate (rG) for each increment of extinction. In the case of the C. robustus lineage, no other reported taxa in this clade were included in this study and thus the stem group age estimator was used (equation 10a, Magallon & Sanderson, 2001; see Alfaro et al., 2007) to calculate the above probability.
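The crown-group method of moments estimator can be sketched as follows. This is an illustrative reimplementation of the crown-age estimator of Magallon & Sanderson (2001), not the GEIGER code used in the analysis; the function name is hypothetical, and the inputs (130 species, crown age 32.8 Ma) are taken from Table 1.

```python
import math


def crown_diversification_rate(n_species, crown_age, eps):
    """Net diversification rate for a crown group of n_species and age
    crown_age, conditional on a relative extinction rate eps (0 <= eps < 1)."""
    n = float(n_species)
    inner = (
        0.5 * n * (1.0 - eps ** 2)
        + 2.0 * eps
        + 0.5 * (1.0 - eps) * math.sqrt(n * (n * eps ** 2 - 8.0 * eps + 2.0 * n * eps + n))
    )
    return (math.log(inner) - math.log(2.0)) / crown_age


for eps in (0.0, 0.3, 0.5, 0.6, 0.8, 0.9):
    print(eps, round(crown_diversification_rate(130, 32.8, eps), 4))
```

With no extinction the expression reduces to [ln(n) − ln(2)]/t, and the estimates decline as the assumed extinction rate rises. The values obtained should lie close to the rG row of Table 1, with small differences expected because the crown age is rounded to one decimal place.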


Sequence variability

We examined 1759 bp of sequence, of which approximately 50% was parsimony-informative. The two nuclear markers, ETS2 and S7I1, had 647 and 655 bp respectively, and cytochrome b contributed a further 426 bp (47%, 65% and 47% parsimony-informative sites respectively). None of the individual gene regions was saturated (Iss 0.43 < Iss.c 0.8, Iss 0.3 < Iss.c 0.8 and Iss 0.3 < Iss.c 0.78 respectively), nor were the concatenated data (Iss 0.4 < Iss.c 0.8).

Model selection

The gene-specific models (AIC) for each of the eight gene partitions used for Bayesian analysis were as follows. The cyt b conserved region (gene 1, 1st and 2nd codons) required a GTR + G model (γ = 0.2397) with Nst = 6, and its 3rd codon region (gene 2) a GTR + I + G model (invariable sites = 0.02, γ = 3.9890) with Nst = 6. The ETS2 coding region required a K80 model with Nst = 2 (ti/tv ratio = 2.5419) for its conserved region (gene 3) and a HKY model with Nst = 2 (ti/tv ratio = 1.5633) for its variable region (gene 4, 3rd codon). Both the ETS2 stem (gene 5) and loop (gene 6) regions required a GTR + G model (γ = 0.9426 and 1.2595 respectively) with Nst = 6. The S7I1 stem region (gene 7) required a HKY + G model (γ = 1.5603) with Nst = 2 (ti/tv ratio = 1.3058) and its loop region (gene 8) a GTR + G model (γ = 5.4651) with Nst = 6. Model selection for pMM BI requires only the general ‘form’ of the model (Nylander, 2004), as the Markov chain integrates over uncertainty in the parameter values. Therefore, seven of the eight gene partitions had a base frequency prior = dirichlet (1,1,1,1) (i.e. unequal), while the base frequency of the remaining partition (gene 3, ETS2 coding, conserved region) was set to fixed (equal).
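To show how the general form of each partition model might be passed to MrBayes, the sketch below generates `lset` and `prset` statements from a table of the eight partitions. It is not the command file used in this study: the mapping of each model to `nst` and `rates` settings follows standard MrBayes usage, and the partition numbering and the helper function name are assumptions made for illustration.

```python
# (partition, description, nst, among-site rates, equal base frequencies?)
PARTITION_MODELS = [
    (1, "cyt b, 1st + 2nd codon (GTR + G)",   6, "gamma",    False),
    (2, "cyt b, 3rd codon (GTR + I + G)",     6, "invgamma", False),
    (3, "ETS2 exon, conserved (K80)",         2, "equal",    True),
    (4, "ETS2 exon, 3rd codon (HKY)",         2, "equal",    False),
    (5, "ETS2 intron, stem (GTR + G)",        6, "gamma",    False),
    (6, "ETS2 intron, loop (GTR + G)",        6, "gamma",    False),
    (7, "S7I1 intron, stem (HKY + G)",        2, "gamma",    False),
    (8, "S7I1 intron, loop (GTR + G)",        6, "gamma",    False),
]


def mrbayes_model_lines(models):
    """Build one lset and one prset statement per partition."""
    lines = []
    for part, _description, nst, rates, equal_freqs in models:
        lines.append(f"lset applyto=({part}) nst={nst} rates={rates};")
        freq_prior = "fixed(equal)" if equal_freqs else "dirichlet(1,1,1,1)"
        lines.append(f"prset applyto=({part}) statefreqpr={freq_prior};")
    return lines


print("\n".join(mrbayes_model_lines(PARTITION_MODELS)))
```

Only the number of substitution types, the form of among-site rate variation and the base frequency prior are specified; the parameter values themselves (e.g. γ and ti/tv) are left for the Markov chain to estimate.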

Maximum likelihood analysis (GARLI) required an overall GTR + G model (γ = 0.5120) for all regions, substitution Nst = 6 with substitution rates fixed (0.7160, 2.3497, 0.8239, 0.6913 and 3.8464), and base frequencies fixed (0.26, 0.2336, 0.2115 and 0.2949).

Tree inference

Stationarity of the Bayesian analyses was reached after far fewer than 10 000 generations in both runs (visualized in Tracer version 1.4; Rambaut & Drummond, 2007), and the 50% majority rule consensus tree topology was no different from the best trees of each run (lnL = −22 255.116 in the 1st run and lnL = −22 254.149 in the 2nd run), with very high posterior probabilities (Fig. 1). Both MP and ML analyses inferred the same tree topology as the Bayesian analysis. We therefore included only the support for each retrieved node (Fig. 1). Four major clades of Chaetodon were retrieved, closely resembling the four clades retrieved in a previous molecular study by Fessler & Westneat (2007). Although the two studies did not analyse identical sets of species, the placement of species common to both was identical, despite the use of different loci. For clarity, in Chaetodon we follow the four clades of Fessler & Westneat (2007). Old taxonomic groupings were found to be of limited utility (only one, Rabdophorus in clade 4, remains intact and retains its traditional boundaries) and we will not consider them further within Chaetodon. Fessler & Westneat (2007) provide a thorough evaluation of the taxonomy of the family. The only additional detail from our study is that Parachaetodon would make Chaetodon (and the subgenus Discochaetodon) paraphyletic and Parachaetodon is probably best placed within Chaetodon (as a junior synonym).

Figure 1.

 Inferred phylogeny of the butterflyfish and bannerfish (f. Chaetodontidae), based on 56 species with representatives from all 11 genera and 12 subgenera, obtained by Bayesian, maximum parsimony (MP) and maximum likelihood (ML) analyses for three loci (ETS2, S7I1 and cyt b). The topology shows the best Bayesian tree with posterior probabilities (consensus of 32 000 trees) and bootstrap support (> 50%) of MP and ML (1000 and 100 bootstrap replicates respectively). (*) 100% support. (--) no bootstrap support. The tree was rooted with Kyphosus vaigiensis and Tilodon sexfasciatus.

Molecular dating

The best Bayesian topology and branch lengths recovered from the phylogenetic analyses (Fig. 1) were used as the initial starting tree, with an exponential prior used to calibrate the PL node (see Materials and methods). BEAST log files analysed in Tracer showed convergence between independent runs in tree space. High effective sample size scores of individual parameters indicated valid estimates based on independent samples from the posterior distribution of the MCMC. A maximum clade credibility chronogram was compiled in TreeAnnotator from 180 000 post-burn-in trees (9 × 10⁷ generations from 10 BEAST MCMC runs). The chronogram displays the mean node heights estimated at each node by the BEAST MCMC, with bars representing 95% HPD (Appendix S2, Fig. 1). The family Chaetodontidae dates back to the early Eocene, where it split from pomacanthids with a mean age of the most recent common ancestor (MRCA) of 50.1 Ma (41.5–60.7, 95% HPD). Estimated ages indicate the origin of the butterflyfish and bannerfish clades at a mean age of 32.8 Ma (24.9–40.9, 95% HPD), after which they rapidly diversified, with the four major Chaetodon lineages in place by the mid Miocene (Appendix S2; Table 1). Also during the early Miocene we see the origins of the three major divisions within the bannerfish clade.

Optimization of ecological traits

Based on the species examined it is clear that corallivory has arisen on at least five separate occasions (Fig. 2). Corallivory has been reported in 25 chaetodontid species (Appendix S1; Table 1). All are in the reef-butterflyfishes clade and are restricted to a single monophyletic genus, Chaetodon. Of these 25 corallivores, 17 are included in the current phylogeny. The remaining eight species are easily included in the four main Chaetodon clades based on previous phylogenetic and taxonomic evidence (Appendix S1; Table 1).

Figure 2.

 A chronogram of the Chaetodontidae with optimized trophic modes reveals five independent origins of corallivory over the last 15.7–3.2 Ma. Red/dotted branches indicate obligate hard coral feeders and blue/dashed branches obligate soft coral feeders. The estimated ages are in Ma (see Fig. 1 and Table 1 in Appendix S2 for confidence intervals of the MRCA age estimates). The butterflyfish illustrations exemplify some of the corallivores in each of the independent clades in which corallivory has arisen (images from Kuiter, 2002).

Chaetodon clade 1 contains only three species, all of which are restricted to West African coastal waters, with no record of coral feeding. Chaetodon clade 2 (37 species) contains three distinct lineages of coral feeders (estimated MRCA to omnivorous sister taxa in parentheses): C. quadrimaculatus (3.2 Ma), the C. multicinctus clade (4.9 Ma) and the C. unimaculatus-interruptus clade (4.3 Ma). The first two lineages are hard coral feeders and probably have an obligate dependence on corals. Chaetodon unimaculatus and C. interruptus feed on soft and hard corals. Chaetodon clade 3 is predominantly corallivorous, with 19 of the 21 species being obligate corallivores. Of the remaining species, the diet of C. tricinctus is unknown, leaving one noncorallivore, Parachaetodon ocellatus. The chronogram places the origins of corallivory at about 15.7 Ma. Of the 31 species in Chaetodon clade 4 only two are corallivores: C. melannotus and C. ocellicaudus. These sister species are both obligate soft coral feeders (Appendix S1; Table 1). They separated from their omnivorous sister at about 9.8 Ma.

Diversification rates

The relative cladogenesis statistics identified the Chaetodon lineage as having a significantly different rate of cladogenesis than its sister lineage. Both CR and MCCR tests showed no evidence for a slowdown in the rate of cladogenesis through time for the family Chaetodontidae (γ = −1.248, MCCR adjusted P = 0.55). As noted by Magallon & Sanderson (2001) the estimates of rG decreased with increasing extinction rates (Table 1). The Chaetodon clade (CH) with all subtending lineages showed significantly higher species diversity than expected given the global diversification rate up to 90% extinction rate (adjusted P = 0.03; ε = 0.9). Clades 2 and 4 (C2 and C4) showed significantly higher species diversity than expected (up to 80% extinction); however, the corallivorous clade 3 (C3) is not significantly more diverse than expected given the crown diversification rate even in the absence of extinction. The Prognathodes lineage also shows significantly higher diversification for up to 30% extinction (P = 0.044). If using a Bonferroni correction (adjusted P = 0.0038) the Chaetodon (CH and C2 + 3 + 4) clade still remains significant at low extinction rates.
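The dependence of the rate estimate on the assumed extinction level can be illustrated with a short sketch of the crown-group estimator of Magallon & Sanderson (2001). This is not the authors' analysis script: the function name, the species count and the crown age below are hypothetical values chosen only to show the behaviour, and eps denotes the relative extinction fraction (extinction rate divided by speciation rate).

```python
import math


def crown_rate(n_species, crown_age, eps):
    """Net diversification rate for a crown group with n_species extant
    species and the given crown age, under relative extinction eps
    (crown-group method-of-moments estimator, Magallon & Sanderson 2001)."""
    if n_species < 2:
        raise ValueError("a crown group needs at least two species")
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must lie in [0, 1)")
    n = float(n_species)
    inner = (
        0.5 * n * (1.0 - eps ** 2)
        + 2.0 * eps
        + 0.5 * (1.0 - eps)
        * math.sqrt(n * (n * eps ** 2 - 8.0 * eps + 2.0 * n * eps + n))
    )
    return (math.log(inner) - math.log(2.0)) / crown_age


# Hypothetical inputs for illustration only (not values from this study).
N_SPECIES = 100
CROWN_AGE_MA = 30.0

for eps in (0.0, 0.3, 0.5, 0.8, 0.9):
    rate = crown_rate(N_SPECIES, CROWN_AGE_MA, eps)
    print(f"eps = {eps:.1f}  r = {rate:.4f} per Myr")
```

With eps = 0 the expression reduces to (ln n − ln 2)/t, and the estimate falls as eps rises, which is the pattern described for Table 1.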


Systematics of the Chaetodontidae

We present a comprehensive evaluation of the Chaetodontidae, with representatives from all described genera and currently recognized subgenera. There was an extremely high degree of congruence among gene regions and among methods (Likelihood, Parsimony and Bayesian). Using independent models for each gene partition the resultant phylogeny had strong support for all major nodes. In all analyses, the phylogeny strongly supports the monophyly of the family with a basal split into two clades: the long-snouted bannerfishes and the reef butterflyfishes.

Our phylogeny identified three major divisions in the long-snouted bannerfish clade and four divisions in the reef butterflyfish clade. A comparable pattern, for nine of the 10 genera, was reported by Fessler & Westneat (2007). Our data support this earlier study in placing Amphichaetodon within the bannerfish clade, rather than as a sister taxon to all remaining species in the family, as suggested by most morphological phylogenies (Smith et al., 2003). The degree of agreement between the two molecular studies is noteworthy. Despite using different markers and different representative species, the topologies of the resultant trees were almost identical. This provides excellent independent corroboration of our tree. With a robust, well-supported phylogenetic reconstruction for this family we are now able to explore the evolutionary history of corallivory.

Divergence times within the Chaetodontidae

Given a well-supported cladogram, with independent support for the topology, we endeavoured to provide robust molecular age estimates within a chronogram. The use of the program BEAST allowed more precise age calibrations than previous approaches. Furthermore, the use of exponential priors accommodates both the influence of the faunal transition at the K/T boundary and the stronger influence of the 50 Ma calibration based on the fossil ephippid (Eoplatax). Our age estimates in the resultant chronogram are supported by several independent lines of evidence.

Firstly, our estimated ages agree well with the available fossil record. We used the two best fossil dates for calibration [i.e. (1) the K/T boundary, marking the transition between Mesozoic and Cenozoic faunas (Patterson, 1993; Bellwood & Wainwright, 2002) and (2) the calibration at 50 Ma, marking the earliest fossil record of numerous reef fish families (Bellwood, 1996)]. However, there is a third piece of fossil evidence: a fully articulated Miocene fossil chaetodontid (Carnevale, 2006). It is morphologically extremely similar to extant taxa in clade 4, and at 7 Ma old lies shortly after the age estimates for this clade, which has a mean age of 11.3 Ma and a 95% HPD of 7.9–15.2 Ma.

Secondly, as an independent check, we can compare our estimated ages with major biogeographic events. These again compare favourably. Firstly, the terminal Tethyan event (TTE) marking the final closure of the Red Sea land bridge is dated between 12 and 18 Ma (Steininger & Rögl, 1984). These ages approximate the minimum age of the initial division between the major clades within Chaetodon at 17.8 Ma (13.3–23.2 Ma, 95% HPD). Of the major clades, two (2 and 4) lie on either side of the land bridge, whereas clade 1 is restricted to the Atlantic and clade 3 is restricted to the Indo-Pacific. Secondly, the ages of lineages that appear to have been separated by the rising of the Isthmus of Panama (IOP), i.e. Chaetodon humeralis–C. ocellatus at 3.4 Ma (1.8–5.4 Ma, 95% HPD), are again extremely close to the estimated final closure of this land bridge at 3.1 Ma (Coates et al., 1992), with the 95% density distribution encompassing the geological dates (Lessios, 2008). Finally, divisions between the Indian Ocean and Pacific Ocean pairs (e.g. C. unimaculatus–interruptus at 1.3 Ma and C. trifasciatus at 2.4 Ma) closely match the estimated ages of other Indian Ocean–Pacific Ocean divisions (McCafferty et al., 2002; Read et al., 2006). The separation of C. sedentarius and C. sanctaehelenae from their closest known sister lineage in the Indo-Pacific may be a further example of an invasion of the Atlantic via the Cape of Good Hope (reviewed in Floeter et al., 2008). Based on the first two biogeographic divisions (TTE and IOP), our estimated ages with HPD intervals closely approximate these two well-dated biogeographic events.
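The two well-dated biogeographic comparisons amount to simple interval checks, which can be written out as a minimal sketch. The helper names are hypothetical; the numbers are the ages and 95% HPD intervals quoted in the paragraph above.

```python
def contains(interval, value):
    """True if value lies within the closed interval (low, high)."""
    low, high = interval
    return low <= value <= high


def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Ages in Ma, taken from the text above.
chaetodon_split_hpd = (13.3, 23.2)  # initial division within Chaetodon, 95% HPD
tte_closure = (12.0, 18.0)          # terminal Tethyan event (Steininger & Rögl, 1984)
iop_pair_hpd = (1.8, 5.4)           # C. humeralis and C. ocellatus split, 95% HPD
iop_closure = 3.1                   # Isthmus of Panama closure (Coates et al., 1992)

print("TTE range overlaps HPD:", overlaps(chaetodon_split_hpd, tte_closure))
print("IOP date inside HPD:", contains(iop_pair_hpd, iop_closure))
```

Both checks evaluate to True for the quoted values; a formal test of concordance would require the full posterior distributions rather than interval endpoints.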

Finally, a comparison of our age estimates with those in previous studies, using a range of calibration methods, suggests that the estimated ages of our terminal taxa are comparable with those of other reef fishes (e.g. Fauvelot et al., 2003; Bernardi et al., 2004; Klanten et al., 2004; Barber & Bellwood, 2005; Read et al., 2006; Cowman et al., 2009) and other reef organisms (Palumbi et al., 1997; Lessios et al., 1999; Renema et al., 2008). The closest study to the present work is by Fessler & Westneat (2007), which yielded a very similar phylogeny and broadly comparable ages, even though they used a single model in tree construction and an additive PL method for age estimation, while we used a partitioned mixed model and Bayesian MCMC analyses with informative prior calibrations. These differences will not necessarily change the tree topology but can change relative branch lengths, while the BEAST analyses take into account uncertainty in topology, sequence dataset and model parameters. Overall, fossil, biogeographic and comparative data provide strong support for our chronogram. This provides a relatively robust platform for evaluating the evolution of corallivory on coral reefs.

Evolutionary and biogeographic patterns within the Chaetodontidae

In the Chaetodontidae, a move onto reefs was associated with a significant increase in species richness. Interestingly, there was no increase associated with a switch to corallivory and the exploitation of this widely available reef resource. The Chaetodontidae can be effectively divided into two ecologically and morphologically distinct clades that should be represented as sub-families: the bannerfishes and the butterflyfishes. The bannerfish clade is characterized by a distinctive long-snout morphology and it is within this clade that we see a novel suspensorial protrusion mechanism (Ferry-Graham et al., 2001a). Despite the morphological variation and innovation within the bannerfish clade, however, the standing species richness of the bannerfish lineage is not significantly different from that expected given the global rate of cladogenesis (even at high extinction rates) for the crown Chaetodontidae (Table 1). Biogeographically, the bannerfish clade has close links with Australia and temperate or subtropical waters, and a subtropical Australian origin for this clade remains a distinct possibility. Although habitat optimization in the bannerfishes is uncertain (Fig. S1), three of the eight lineages are found on temperate or subtropical rocky reefs and many species in the other lineages are found in rocky or coastal waters, supporting these temperate associations.

In contrast to the bannerfishes, the butterflyfishes exhibit limited morphological variation. Indeed, they appear to be relatively uniform (Motta, 1988) with relatively simple oral jaw mechanics and kinematics (Ferry-Graham et al., 2001a). Only with respect to intramandibular flexion does there appear to be any clear morphological variation (Konow et al., 2008). Depending on the definition of a coral reef and a coral reef fish (cf. Bellwood & Wainwright, 2002), it appears that there have been multiple invasions of coral reefs by chaetodontids. The butterflyfish clade contains 103 species, approximately 80% of species within the family, and is strongly associated with coral reefs. As in parrotfishes (Streelman et al., 2002), wrasses (Westneat & Alfaro, 2005) and tetraodontoids (Alfaro et al., 2007), the reef-dwelling clades are exceptionally species rich. The Chaetodon clade, in particular, exhibits far higher numbers of species than expected even at high extinction rates (P = 0.03, ε = 0.8). When considering just the reef-based clades 2, 3 and 4 the difference from expected is even more significant (P = 0.021, ε = 0.8). It thus appears that a move to reefs did indeed underpin diversification in Chaetodon, as previously reported in the Tetraodontiformes (Alfaro et al., 2007). This pattern may be expected in a number of reef fish groups (Bellwood & Wainwright, 2002). However, it is noteworthy that clade 3 does not demonstrate higher species richness than expected, even though this obligate reef fish clade contains the largest number of corallivores found in any teleost taxon. It appears that a move onto reefs, not a switch to corallivory, underpinned diversification within the family.

The rise of corallivory

The Chaetodontidae contains more corallivores than any other fish family; however, this did not arise as a result of a single exceptional event. Corallivory has arisen at least five times, with representatives in almost every major butterflyfish clade. Furthermore, it appears to have arisen relatively recently (15.7–3.2 Ma) and in a number of markedly different ways.

The oldest estimated record of corallivory is in Chaetodon clade 3 at 15.7 Ma (the MRCA with an omnivorous sister lineage; Fig. S2). Of the 13 species examined in this clade, 12 are corallivorous (the exception is Parachaetodon ocellatus). Nine additional species can be placed in this clade based on phylogenetic (austriacus, larvatus, octofasciatus, speculum and zanzibarensis) and taxonomic (melapterus, lunulatus, andamanensis and triangulum) evidence (Fessler & Westneat, 2007; Hsu et al., 2007). All these taxa are obligate corallivores. This is the oldest record of corallivory in the family and it is in clade 3 that we see the strongest reef associations and the tightest links between fishes and corals. Several species feed on just one or two coral species and may be incapable of switching prey species (Berumen & Pratchett, 2008), while others specialize by ingesting specific parts of the coral or just mucus (Cole et al., 2008). These species have relatively long intestines and appear to represent an extreme level of coral feeding specialization (Elliott & Bellwood, 2003; Konow & Ferry-Graham, in press). Given this level of specialization, it is no surprise that species within this clade exhibit the most extreme negative response to the decline in coral cover as a result of anthropogenic disturbances and climate change (Pratchett et al., 2006, 2008; Wilson et al., 2006).

Given this long association with corallivory, the monotypic Parachaetodon was a striking inclusion in clade 3. Parachaetodon ocellatus is not a corallivore and often lives in sheltered sediment rich areas (Allen et al., 1998). Given its position in the tree, this appears to be the first recorded reversal from corallivory to omnivory. The evolutionary scenario that may have triggered such a change is unclear. The explanation may be biogeographic, with a dietary switch following the loss of corals in an isolated marine basin.

The second oldest record of corallivory is in Chaetodon clade 4 at about 9.8 Ma, in C. melannotus and its sister species C. ocellicaudus (cf. Fessler & Westneat, 2007; Hsu et al., 2007). These species are again strongly reef associated and highly specialized obligate coral feeders. However, these taxa are restricted exclusively to soft corals. Their relationship with other members of the clade is not well resolved and a sister group relationship with the omnivore C. selene suggested by Fessler & Westneat (2007) would imply that the origins of corallivory in the melannotus–ocellicaudus clade are younger than our estimate. Nevertheless, this represents an independent, and highly distinctive, obligate soft coral feeding lineage.

The most recent examples of corallivory are found in Chaetodon clade 2. This clade contains a large number of species that occasionally graze on live corals, but only four obligate corallivores. Here, corallivory arose as a result of three independent events: the C. multicinctus clade (inc. pelewensis and punctatofasciatus) at about 4.9 Ma, C. quadrimaculatus at about 3.2 Ma and the C. unimaculatus–interruptus clade at about 4.3 Ma. These ages are not well established as incomplete taxon sampling precludes robust estimates. Nevertheless, all three stand as relatively recent independent events, a pattern that is unlikely to be altered by further taxon sampling. There are two different feeding modes. The first two lineages contain obligate corallivores and in both cases the preferred coral prey appear to be Pocillopora spp. (Berumen & Pratchett, 2006). The latter clade consists of two sister taxa that feed exclusively on corals: C. unimaculatus in the Pacific and C. interruptus in the Indian Ocean. Chaetodon unimaculatus appears to be unique in that it feeds on both soft and hard corals (hard in French Polynesia and Hawaii vs. soft on the GBR and in Guam; Motta, 1988; Wylie & Paul, 1989; Konow & Ferry-Graham, in press; M.S. Pratchett, unpublished). It is also the only butterflyfish to take large bites from corals that remove both the polyp and the surrounding tissues. In this, the bite is more reminiscent of excavating parrotfishes, which leave distinctive scars at the feeding site (Bellwood & Choat, 1990). This robust feeding mode is reflected by an unusually robust jaw morphology in this lineage (Motta, 1988; Konow et al., 2008).

Despite the clear patterns, care is needed when interpreting evolutionary history from phylogenies. The ages of origination refer to the approximate ages at which extant lineages are hypothesized to have commenced corallivory. The ages of these taxa are comparable with those recorded from other reef fish families such as the Pomacentridae (e.g. McCafferty et al., 2002), Labridae (e.g. Read et al., 2006; Cowman et al., 2009) and Acanthuridae (Klanten et al., 2004). Yet in each of these three families the Eocene fossil record yields several extinct fossil taxa that are the functional equivalents of extant taxa (Bannikov & Sorbini, 1990; Bellwood & Sorbini, 1996; Tyler & Sorbini, 1999). One cannot, therefore, discount the possibility that corallivory predated the origins of extant lineages and our minimum age estimates. However, our estimates do provide a clear indication of the minimum age of this feeding mode and evidence of an increasing diversity of corallivores, in terms of both feeding modes and number of lineages, during the Miocene and Pliocene (15.7–3 Ma).

Corallivory and its implications for reef–fish interactions and the evolution of coral reefs

Our chronogram clearly suggests that corallivory did not arise with the origins of the major coral groups in the Eocene. Rather, it ties in with a major expansion and reorganization of reefs in the Miocene, and coincides with the initial formation of the biodiversity hotspot in the Indo-Australian Archipelago.

Even given that our estimates are minimum ages, 15.7–3 Ma still represents a relatively recent origination for such a derived feeding mode as corallivory. Scleractinian corals have been a significant component of shallow carbonate reefs since the early Tertiary, with most of the major Acropora clades (the coral genus targeted by most modern corallivores) already represented in the Eocene at 49–37 Ma (Wallace & Rosen, 2006). In contrast, other major coral reef benthic feeding modes, e.g. grazing herbivory and crushing with pharyngeal jaws, have been present for at least 50 Ma (Bellwood & Sorbini, 1996; Bellwood, 1999, 2003; Cowman et al., 2009). The problem of minimum estimates notwithstanding, this relatively recent rise of corallivory raises two questions: are chaetodontids one of the most recent taxa to switch to corallivory and does this switch reflect a broader change in the nature of reef–fish interactions?

In terms of the evolution of corallivory, the evidence is scarce but all the indications are that the timing of corallivory in chaetodontids is comparable to that of the only other major group with significant numbers of corallivores, the labrids. Based on the most recent labrid phylogeny (Cowman et al., 2009) corallivory appears to be derived, to have arisen only once (in the Labropsis–Labrichthys clade) and to have arisen relatively recently, although considerably earlier than in the Chaetodontidae (at 29 Ma). In the parrotfishes (i.e. Bolbometopon muricatum and Sparisoma viride) coral feeding probably arose prior to the late Miocene (12 and 10 Ma respectively; Robertson et al., 2006; Cowman et al., 2009). Overall, it appears that the chaetodontids are only exceptional in terms of the number of corallivorous species within the family. Their dietary shift appears to have coincided with a general rise in corallivory in a range of reef fish families.

In terms of the broader changes in the nature of reef–fish interactions, the rise of corallivory in the Miocene is consistent with several other lines of evidence. We see a progressive increase in detritivory in the Miocene (Harmelin-Vivien, 2002) and a number of novel specialist groups e.g. specialist foraminifera feeders and fish cleaners (Macropharyngodon; Read et al., 2006; Cowman et al., 2009). The origins of corallivory, therefore, fit in a broader context in which the Miocene exhibits a new level of reef–fish interactions with more specialized reef-associated taxa. This may be associated with the rise of Acropora and Pocillopora as the dominant coral groups during this period (Johnson et al., 2008; B. Rosen, personal communication). The vast majority of corallivores and all obligate specialists feed only on these coral genera.

The number of independent origins of corallivory and the lack of morphological modifications to the feeding apparatus suggest that there are few morphological restrictions to corallivory, although the elongation of the intestine suggests that the difficulty, if any, may lie in processing rather than procuring coral tissues. Extant corallivores are often highly selective feeders, exploiting specific coral species or even specific sites on a coral (e.g. damaged tissues) (McIlwain & Jones, 1997). The rise of corallivory may therefore have been dependent on corals reaching sufficient densities to permit the selective feeding necessary to adequately process the coral tissues, with increased access to Acropora and Pocillopora colonies triggering the expansion of corallivory in the Miocene. The rapid expansion of coral-bearing carbonate platforms in the Indo-Australian Archipelago in the early-mid Miocene (Wilson, 2008) may therefore have acted as the trigger for not only the rapid expansion of numerous fish groups but also the origins of trophic novelty, including corallivory (Renema et al., 2008; Cowman et al., 2009). In this context, it is interesting to note that on modern coral reefs the number of corallivores declines swiftly in response to the loss of coral cover (Pratchett et al., 2006, 2008).

Coral reefs have been exposed to escalating predation pressure for millennia (Vermeij, 1977; Bellwood, 2003). For corals, predation by fishes certainly appears to have increased over the last 15 Ma. We now have, for the first time, an understanding of the origins of corallivory in fishes. Of all corallivorous fishes 61% (25 of 41 species) are found within a single family, the Chaetodontidae. Yet, surprisingly, within this family this derived feeding mode has arisen at least five times over the last 15.7–3 Ma, with specialists on both soft and hard corals. This unusual feeding mode appears to reflect an exceptionally close association between this family and coral reefs. An understanding of this history offers a new perspective on the nature of the relationships between fishes and coral reefs in a changing world.


The authors wish to thank C. Fulton, A. Hoey, P.C. Wainwright, F. Walsh and P. Wirtz for tissue samples; S. Wismer for graphics assistance; the staff of Carrie Bow Cay, Orpheus Island, Lizard Island and Moorea CRIOBE Research Stations for invaluable field support, colleagues in the Centre of Excellence for Coral Reef Studies for helpful discussions and several reviewers for constructive comments. This work was financially supported by James Cook University and the Australian Research Council.