Tooth and cranial disparity in the fossil relatives of Sphenodon (Rhynchocephalia) dispute the persistent ‘living fossil’ label


Correspondence: Carlo Meloro, Hull York Medical School, The University of Hull, Loxley Building, Cottingham Road, Hull HU6 7RX, UK.

Tel.: +44 0 190 432 1737; fax: +44 0 190 432 1695; e-mail:


The tuatara (Sphenodon punctatus) is the only living representative of Rhynchocephalia, a group of small vertebrates that originated about 250 million years ago. The tuatara has been referred to as a living fossil; however, the group to which it belongs included a much greater diversity of forms in the Mesozoic. We explore the morphological diversity of Rhynchocephalia and stem lepidosaur relatives (Sphenodon plus 13 fossil relatives) by employing a combination of geometric morphometrics and comparative methods. Geometric morphometrics is used to explore cranium size and shape at an interspecific scale, while comparative methods are employed to test the association between skull shape and size and tooth number after taking phylogeny into account. Two phylogenetic topologies have been considered to generate a phylomorphospace and quantify the phylogenetic signal in skull shape data, the ancestral state reconstruction as well as morphological disparity using disparity through time plots (DTT). Rhynchocephalia exhibit a significant phylogenetic signal in skull shape that compares well with that computed for other extinct vertebrate groups. A consistent form of allometry has little impact on skull shape evolution while the number of teeth significantly correlates with skull shape even after taking phylogeny into account. The ancestral state reconstruction demonstrates a dramatic shape difference between the skull of Sphenodon and its much larger Cretaceous relative Priosphenodon. Additionally, DTT demonstrates that skull shape disparity is higher between rather than within clades while the opposite applies to skull size and number of teeth. These results were not altered by the use of competing phylogenetic hypotheses. Rhynchocephalia evolved as a morphologically diverse group with a dramatic radiation in the Late Triassic and Early Jurassic about 200 million years ago.
Differences in size are not marked between species whereas changes in number of teeth are associated with co-ordinated shape changes in the skull to accommodate larger masticatory muscles. These results show that the tuatara is not the product of evolutionary stasis but that it represents the only survivor of a diverse Mesozoic radiation whose subsequent decline remains to be explained.


Today, the reptile group Rhynchocephalia is represented by a single species, the tuatara of New Zealand: Sphenodon punctatus (Günther, 1867; Jones et al., 2009a, 2011; Hay et al., 2010). This taxon is mainly restricted to islands off the coast of New Zealand where it eats a range of invertebrate and vertebrate prey (Walls, 1981). Because of its relatively slow breeding capacity and apparent predation by invasive rodents, it has been subject to conservation efforts (Asimov, 1990; Cree et al., 1991; Cree, 1994; Towns et al., 2001, 2007; Nelson et al., 2002). Due to its apparently marginal existence and unusual anatomy, Sphenodon was also historically labelled as a ‘living fossil’ and an exemplar of its Mesozoic relatives (Bogert, 1953; Burton, 1956; Robb, 1977; Asimov, 1990). More recently, Alibardi & Toni (2006, p. 801) describe this taxon as being an ‘ancient reptilian species’ and having ‘anatomical traits unchanged in the last 220 milions[sic] years’. Subramanian et al. (2009, p. 17) state ‘Sphenodon are a relict taxon with little skeletal change from Cretaceous relatives’. Similarly, with respect to the mode of feeding, Reilly & McBrayer (2007, p. 317) base the ‘ancestral condition’ for Squamata on Sphenodon. These statements and similar ones elsewhere are often made without a single citation to any primary palaeontological literature to support them. Another study (Alfaro et al., 2009) uses Sphenodon as a lone taxon to infer the long-term diversity and evolutionary history of its entire lineage.

There is in fact little evidence to suggest that Sphenodon is an ancient species unchanged since the Mesozoic, is representative of the ancestral lepidosaur or on its own reflects the diversity of Mesozoic Rhynchocephalia (Whiteside, 1986; Reynoso, 2000; Jones, 2006a, 2008, 2009; Evans & Jones, 2010). Work over the last 30 years has contributed to the naming of almost 50 fossil species and has shown that all well-known Mesozoic Rhynchocephalia differ from Sphenodon with respect to feeding apparatus, skull structure, and body proportions (Whiteside, 1986; Fraser & Benton, 1989; Reynoso, 2000; Evans et al., 2001; Apesteguía & Novas, 2003; Evans, 2003; Jones, 2004, 2006a, 2008, 2009; Evans & Jones, 2010; Jones et al., 2012). This is particularly true of the most plesiomorphic Rhynchocephalia, Gephyrosaurus and Diphydontosaurus, which unlike Sphenodon lack a complete lower temporal bar (Evans, 1980; Fraser & Walkden, 1983; Whiteside, 1986; Jones, 2008). There are a few Mesozoic taxa that possess very similar dentition to the tuatara, but these species are only known from partial jaws (Evans, 1992; Reynoso, 1996; Jones et al., 2009b, 2012; Apesteguía & Jones, 2012).

Fossil Rhynchocephalia are particularly diverse in terms of feeding apparatus and skull shape (Jones, 2008). Their teeth vary in number, size, arrangement, stoutness and enamel texture and may possess blades of varying size and orientation (Throckmorton et al., 1981; Fraser & Walkden, 1983; Fraser, 1986; Evans et al., 2001; Jones, 2006a, 2009; Jones et al., 2011). As teeth are effective tools for food processing (Evans & Sanson, 1998, 2003), this diversity may reflect diversity in diet (Fraser & Walkden, 1983; Jones, 2006a, 2009). Skull shape also varies substantially between the species (Jones, 2008; Evans & Jones, 2010). This variation is not obviously linked to size or allometry but aquatic taxa exhibit streamlining, and terrestrial taxa show variation in the size of their orbit, temporal region, jaw joint position and snout shape (Jones, 2004, 2008). As Jones (2008) showed, taxa with similar teeth possessed similar skull shapes, for example, taxa with a high number of small teeth possess short temporal regions and long snouts whereas those with a small number of stout teeth possess long temporal regions with short rounded snouts.

Here we aim to explicitly test the effect of allometry and dentition on skull shape evolution of Rhynchocephalia by employing geometric morphometrics and comparative methods.

Geometric morphometrics provides a means of measuring the diversity in the shapes of complex structures (Dryden & Mardia, 1998; O'Higgins & Jones, 1998; Adams et al., 2004; Zelditch et al., 2004). It has therefore become an increasingly important technique for studies of skull shape evolution in a broad range of extinct vertebrates (cf. Jones, 2004, 2008; Stayton & Ruta, 2006; Pierce et al., 2009; Young et al., 2010, 2011; Brusatte et al., 2012; Bhullar et al., 2012; Foth & Rauhut, in press; Foth et al., 2012). Many of these studies (Stayton & Ruta, 2006; Pierce et al., 2009; Meloro, 2011; Brusatte et al., 2012; Foth et al., 2012) used specific measures of morphological disparity (Foote, 1992, 1993, 1997; Wills et al., 1994; Ciampaglio et al., 2001) to quantify morphological diversity during different time bins that may span several million years. However, very few have considered the technique promoted by Harmon et al. (2003), which involves a time-calibrated branching pattern of a phylogenetic hypothesis (cf. Burbrink & Pyron, 2009; Meloro & Raia, 2010; Slater et al., 2010).

We use the two alternative phylogenetic topologies of Rhynchocephalia (Reynoso, 1996; Apesteguía & Novas, 2003) to specifically address the following questions: does skull shape exhibit a phylogenetic signal (sensu Blomberg et al., 2003; i.e. the tendency for evolutionarily related organisms to resemble each other) in Rhynchocephalia? To what extent does size or tooth number correlate with rhynchocephalian skull shape evolution? What was the evolutionary tempo and mode of skull form (= size and shape) and dentition in Rhynchocephalia? We expect a relatively high phylogenetic signal in skull shape data because both rhynchocephalian phylogenies used here are based on morphological characters (cf. Foth et al., 2012). Additionally, amongst Rhynchocephalia, skull shape is supposed to be strongly influenced by the number of teeth (less so by size, cf. Jones, 2008); therefore, we test whether this association still applies after phylogeny is taken into account as a potential source of error (Garland et al., 2005). By employing plots of disparity through time (Harmon et al., 2003), we provide for the first time a measure of the tempo and mode of evolution in cranium size, shape and number of teeth independently.

Materials and methods


Our sample comprised two-dimensional digital pictures of crania in lateral view as presented in Jones (2008; Table 2), after selecting only adult specimens of Sphenodon (size category 3 sensu Jones, 2008) and adding three additional fossil taxa: Clevosaurus brasiliensis, from the Late Triassic of Brazil (Bonaparte & Sues, 2006), Sophineta from the Early Triassic of Poland (Evans & Borsuk-Białynicka, 2009) and Marmoretta from the Middle Jurassic of Europe (Waldman & Evans, 1994). Clevosaurus brasiliensis represents a new species of ‘clevosaur’ (Bonaparte & Sues, 2006) whereas Sophineta and Marmoretta along with Kuehneosaurus represent stem lepidosaurs and serve as outgroup comparisons (Robinson, 1962; Waldman & Evans, 1994; Evans & Borsuk-Białynicka, 2009).

Our total sample of digital images included 13 fossil taxa (Priosphenodon, Clevosaurus bairdi, Clevosaurus hudsoni, Clevosaurus brasiliensis, Brachyrhinodon, Pleurosaurus, Palaeopleurosaurus, Planocephalosaurus, Diphydontosaurus, Gephyrosaurus, Sophineta, Marmoretta and Kuehneosaurus) and 26 specimens of adult Sphenodon (whose shape configurations were averaged). Sphenodon is known to show sexual dimorphism (Herrel et al., 2009), but available skull material frequently lacks sex data. Also, although a second species of Sphenodon has previously been recognized, its separate identity is not supported by genetic data (Hay et al., 2010).

Unfortunately, adequate data on cranial shape are only available for about 20% of the approximately 50 named rhynchocephalian species. As for pterosaurs (Foth et al., 2012), the majority of lepidosaur fossil material is crushed or incomplete (Evans, 2003). However, most major groups of Rhynchocephalia are represented in our analysis. No ‘sapheosaurs’ sensu Reynoso (2000) were included because although they may be represented by almost complete skeletons (Cocude-Michel, 1963; Reynoso, 2000) the lateral aspects of their skulls are invariably damaged or obscured from view. Material from the Early Jurassic of China referred to as Clevosaurus sp. is also not sufficiently well preserved to be included here (Jones, 2006b, 2008). Our sample also includes taxa from all periods of the Mesozoic: Triassic, Jurassic and Cretaceous. The modern Sphenodon represents the only Cenozoic taxon because although rhynchocephalian material was recently described from the Miocene of New Zealand, it is based on partial dentaries (Jones et al., 2009b). No squamates are included in this analysis because there are no suitable Triassic or Jurassic fossil squamates known from adequate skull material. The earliest valid squamate material is Jurassic in age but mainly comprises teeth and partial jaws (Evans, 1998, 2003; Evans et al., 2002). Moreover, modern squamate taxa are highly diverse (Stayton, 2005; Evans, 2008), and it is unclear as to which (if any) most closely resemble the stem squamate condition (Evans, 2008).

Geometric morphometrics

Spatial coordinates of two-dimensional landmarks were digitized using tpsdig 2 v. 2.09 (Rohlf, 2006a) to describe the main cranial features as identified in Jones (2008, Fig. 1). To provide a comprehensive data set for macroevolutionary analyses, landmark coordinates of species represented by multiple specimens were averaged by applying Generalized Procrustes Analysis (GPA, Rohlf & Slice, 1990). GPA removes differences between original landmark coordinates by applying rotation, translation and scaling to a common reference computed using a generalized least squares algorithm. The common reference (= consensus) represents the averaged landmark configuration on which all the specimens are superimposed. After GPA, the centroid size of each configuration (the square root of the summed squared distances between each landmark and the centre of gravity of the configuration; Bookstein, 1989) is scaled to unity so that the new sets of coordinates (Procrustes coordinates) differ only in shape (Zelditch et al., 2004). After computing GPA for species with multiple specimens (Sphenodon), the consensus configuration was included in a new data set of 14 specimens to generate a new GPA (cf. Meloro et al., 2008).
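The superimposition described above can be illustrated with a minimal sketch for two-dimensional landmarks. It is not the tpsdig or tpsrelw implementation: the function names (`centroid_size`, `gpa`), the use of complex numbers to represent landmarks, the convergence tolerance and the toy triangle are all assumptions made for illustration, and reflections are not handled.

```python
import math

def centroid_size(config):
    """Square root of the summed squared distances of landmarks from their centroid."""
    n = len(config)
    cx = sum(x for x, _ in config) / n
    cy = sum(y for _, y in config) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in config))

def _preshape(config):
    """Translate to the origin and scale to unit centroid size (landmarks as complex numbers)."""
    z = [complex(x, y) for x, y in config]
    centroid = sum(z) / len(z)
    z = [p - centroid for p in z]
    size = math.sqrt(sum(abs(p) ** 2 for p in z))
    return [p / size for p in z]

def _rotate_onto(z, ref):
    """Rotate z so that its summed squared distance to ref is minimal."""
    s = sum(p * q.conjugate() for p, q in zip(z, ref))
    if abs(s) == 0:
        return list(z)
    rotation = s.conjugate() / abs(s)
    return [p * rotation for p in z]

def gpa(configs, tol=1e-12, max_iter=100):
    """Iteratively superimpose all configurations on their mean (the consensus)."""
    shapes = [_preshape(c) for c in configs]
    consensus = shapes[0]
    for _ in range(max_iter):
        shapes = [_rotate_onto(z, consensus) for z in shapes]
        mean = [sum(col) / len(shapes) for col in zip(*shapes)]
        norm = math.sqrt(sum(abs(p) ** 2 for p in mean))
        mean = [p / norm for p in mean]  # keep the consensus at unit centroid size
        change = sum(abs(a - b) ** 2 for a, b in zip(mean, consensus))
        consensus = mean
        if change < tol:
            break
    shapes = [_rotate_onto(z, consensus) for z in shapes]
    to_xy = lambda z: [(p.real, p.imag) for p in z]
    return [to_xy(z) for z in shapes], to_xy(consensus)

# Hypothetical example: a triangle and a rotated, enlarged, shifted copy of it.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
angle, scale, shift = math.radians(30), 2.5, (10.0, -7.0)
moved = [(scale * (x * math.cos(angle) - y * math.sin(angle)) + shift[0],
          scale * (x * math.sin(angle) + y * math.cos(angle)) + shift[1])
         for x, y in triangle]
aligned, consensus = gpa([triangle, moved])
```

Because the second configuration differs from the first only in position, orientation and size, the two aligned configurations coincide after superimposition, which is the sense in which Procrustes coordinates "differ only in shape".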

Figure 1.

The skull of an adult Sphenodon specimen in lateral view (NMNZ RE 382 from the collections of Museum of New Zealand Te Papa Tongarewa, Wellington). (a) anatomical features. (b) equivalent landmarks used. FR, frontal; JUG, jugal; ltb, lower temporal bar; ltf, lower temporal fenestra; NA, nasal; OP, opisthotic; orb, orbit; PAR, parietal; PMX, premaxilla; POFR, postfrontal; PORB, postorbital; PRFR, prefrontal; qu.cot, quadrate cotyle; SQ, squamosal; utb, upper temporal bar; utf, upper temporal fenestra. Scale bar = 10 mm.

The Procrustes coordinates were decomposed into affine and nonaffine components that quantify shape differences under the theory of the thin plate spline. The consensus is assumed to lie on an infinite, undeformed thin metal plate. Affine (Uniform component = Uni) and nonaffine (Partial Warp = PW) transformations quantify the amount of energy (bending energy matrix) necessary to deform the thin metal plate from the consensus to match the shape of each specimen.

A Principal Component Analysis (PCA) was applied to PWs and Uni using the software tpsrelw (Rohlf, 2008). This procedure is exactly equivalent to performing PCA on Procrustes coordinates and is named Relative Warp (RW) analysis because PCA can be employed on all shape variables (both affine and nonaffine, in this case) or only on a portion of them (e.g. only PWs).

We also tested for association between cranium shape and other factors (as previously identified in Jones, 2008), employing Partial Least Squares (PLS; Rohlf & Corti, 2000). PLS is an exploratory technique that allows testing for association between two multivariate blocks of variables. In our case, we test the association between size (one variable = log maximum cranium length) and cranium shape data (51 pair of PWs and 2 Uni) as well as between dentition (one block with four variables = minimum and maximum number of adult lower marginal teeth, minimum and maximum number of adult upper marginal teeth, Table 1) and cranium shape.

Table 1. Values based on Robinson (1962, 1976), Evans (1980), Fraser (1982, 1988), Fraser & Walkden (1983, 1984), Whiteside (1986), Fraser & Benton (1989), Carroll and Wild (2005), Sues et al. (1994), Waldman & Evans (1994), Apesteguía & Novas (2003), Bonaparte & Sues (2006), Evans & Borsuk-Białynicka (2009) and Jones M.E.H. personal observations

| Taxon | Broad affinity | Skull length | Min no. of adult lower marginal teeth | Max no. of adult lower marginal teeth | Min no. of adult upper marginal teeth | Max no. of adult upper marginal teeth |
| --- | --- | --- | --- | --- | --- | --- |
| Kuehneosaurus | Outgroup | 40 | 30 | 30 | 25 | 25 |
| Marmoretta | Outgroup | 21 | 30 | 35 | 35 | 40 |
| Sophineta | Outgroup | 9.5 | 27 | 27 | 22 | 22 |
| Gephyrosaurus | Rhynchocephalia | 30 | 30 | 35 | 35 | 40 |
| Diphydontosaurus | Rhynchocephalia | 15 | 20 | 25 | 30 | 35 |
| Planocephalosaurus | Rhynchocephalia | 20 | 12 | 12 | 16 | 18 |
| Clevosaurus bairdi | Rhynchocephalia | 22 | 3 | 4 | 4 | 5 |
| C. hudsoni | Rhynchocephalia | 39 | 2 | 7 | 2 | 7 |
| C. brasiliensis | Rhynchocephalia | 21 | 2? | 4? | 3? | 4? |
| Brachyrhinodon | Rhynchocephalia | 23 | 2 | 4 | 4 | 7 |
| Palaeopleurosaurus | Rhynchocephalia | 57 | 8 | 8 | 9 | 9 |
| Pleurosaurus | Rhynchocephalia | 82 | 13 | 13 | 19 | 19 |
| Sphenodon | Rhynchocephalia | 58 | 12 | 15 | 15 | 19 |
| Priosphenodon | Rhynchocephalia | 97 | 16 | 16 | 26 | 26 |

Partial Least Squares maximizes the degree of covariation between two blocks of variables by extracting latent vectors through Singular Value Decomposition (SVD) of the matrix of correlations between the two blocks. The vectors obtained by SVD are called Singular Warps (SW), and they are interpreted in pairs (one for each block). Correlation between pairs of SW was tested using the nonparametric linear correlation r with 10 000 randomizations, employing the software tpspls (Rohlf, 2006b).
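A minimal sketch of the size versus shape case may help to make the procedure concrete. When one block holds a single variable, the SVD reduces to normalizing the vector of covariances between that variable and each shape variable, so no matrix library is needed. This is not the tpspls implementation: the function names, the use of covariances rather than correlations, the number of permutations and the seed are assumptions chosen for illustration.

```python
import math
import random

def _centre(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

def _pearson(a, b):
    a, b = _centre(a), _centre(b)
    denom = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / denom if denom else 0.0

def pls_one_variable(size, shape):
    """size: n floats; shape: n rows of p shape variables.
    Returns (r, shape axis, shape scores) for the single pair of vectors."""
    s = _centre(size)
    cols = [_centre(col) for col in zip(*shape)]
    # Cross-covariance between size and every shape variable (a 1 x p matrix).
    cov = [sum(x * y for x, y in zip(s, col)) / (len(s) - 1) for col in cols]
    norm = math.sqrt(sum(c * c for c in cov))
    axis = [c / norm for c in cov] if norm else cov  # singular vector of the shape block
    scores = [sum(w * col[i] for w, col in zip(axis, cols)) for i in range(len(s))]
    return _pearson(size, scores), axis, scores

def permutation_p(size, shape, n_perm=999, seed=0):
    """Share of random re-pairings of size and shape with r at least as high as observed."""
    rng = random.Random(seed)
    observed = pls_one_variable(size, shape)[0]
    shuffled = list(size)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pls_one_variable(shuffled, shape)[0] >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical data in which shape changes in exact proportion to size.
toy_size = [1.0, 2.0, 3.5, 4.0, 6.0, 7.5, 8.0, 10.0]
toy_shape = [(0.3 * s, -0.1 * s, 0.05 * s) for s in toy_size]
r_obs, p_value = permutation_p(toy_size, toy_shape)
```

Because the shape axis is re-estimated for every permutation, r is never negative, which is why a high observed r can still be nonsignificant in a small sample.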

Although size here is represented by only one variable, we preferred to apply PLS, which, compared with standard multivariate allometry, places no restriction on sample size or on multivariate data sets. We aim to extract comparable vectors for size–shape data and dentition–shape data so as not to alter biological interpretation of the results.

Comparative methods

As our morphological data represent interspecific variation of Rhynchocephalia, they are biased by the lack of independence due to common ancestry (Garland et al., 2005). Furthermore, the phylogenetic relationships of extinct Rhynchocephalia are based entirely on morphological characters (Reynoso, 1996; Apesteguía & Novas, 2003). Precise relationships within Rhynchocephalia remain problematic particularly for taxa known from incomplete or juvenile individuals (Reynoso, 2005), but with respect to the taxa discussed here, there is a general consensus of relationships. Kuehneosaurus, Marmoretta and Sophineta represent successive outgroups to Rhynchocephalia (Evans & Borsuk-Białynicka, 2009). Within Rhynchocephalia, Gephyrosaurus, Diphydontosaurus and Planocephalosaurus represent successive sister taxa to a clade including ‘clevosaurs’ (Brachyrhinodon, Clevosaurus), eilenodontines (Priosphenodon), pleurosaurs (Palaeopleurosaurus, Pleurosaurus) and sphenodontines (Sphenodon; Reynoso, 1996; Reynoso & Clark, 1998; Evans et al., 2001; Apesteguía & Novas, 2003). A sister taxon relationship between eilenodontines and sphenodontines seems to be well supported (Reynoso, 1996; Apesteguía & Novas, 2003). However, phylogenetic hypotheses differ with respect to whether it is ‘clevosaurs’ (Reynoso, 1996; Reynoso & Clark, 1998) or pleurosaurs (Apesteguía & Novas, 2003) that are more closely related to this grouping of eilenodontines + sphenodontines. The intrarelationships of clevosaurs remain poorly understood (Jones, 2006b); therefore, we treat the three species used here (bairdi, brasiliensis, hudsoni) as forming an unresolved polytomy.

The alternative topologies of Reynoso (1996) and Apesteguía & Novas (2003) were plotted against time according to the currently known stratigraphic range of each taxon (Table 2). To minimize the effect of ghost lineages in taxa with similar stratigraphic ranges, a minimum of 1 million years was added to separate sister taxa that occurred in the same time interval. Tree topologies were manually written in nexus file format and branch lengths assigned as time of divergence in millions of years (based on the first and last occurrence of each taxon). There are no complete character state data for the taxa included in our data set, and for this reason, we could not implement other methods to obtain a more accurate time resolution of our topologies (cf. Ruta et al., 2006; Brusatte et al., 2008). We combined our time-calibrated topologies with data from the cranium shape morphospace (cf. Figueirido et al., 2010; Brusatte et al., 2012; Foth et al., 2012). Shape coordinates of the hypothetical ancestor for each node (the node values defined as Heritable Taxonomic Unit, HTU) were estimated using the squared change parsimony method (Maddison, 1991). With the software tpstree 1.21 (Rohlf, 2007), the HTU shape is estimated using a weighted mean of the Partial Warps for each Operational Taxonomic Unit (OTU) and the nodes directly connected to them. We saved the estimated shape coordinates of the HTU and added them to the original OTU shape coordinates. Thus, we re-performed GPA including 14 OTU + 12 node estimates, and Relative Warp analysis was repeated to describe the empirical phylomorphospace.
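The weighted-mean estimate of node values can be sketched as follows. Under squared change parsimony, each internal node is the mean of its directly connected neighbours, and the sketch assumes that the weights are the inverse of the connecting branch lengths; it then repeats the averaging until the node values settle. The function name, the edge-list tree format and the three-taxon example are hypothetical, and this is not the tpstree implementation.

```python
def reconstruct_nodes(edges, tips, n_sweeps=500):
    """edges: list of (node_a, node_b, branch_length); tips: {name: [values]}.
    Returns {internal_node: [values]} minimising the summed squared change
    per unit branch length over the whole tree."""
    neighbours = {}
    for a, b, length in edges:
        # Branch lengths must be positive; shorter branches get larger weights.
        neighbours.setdefault(a, []).append((b, 1.0 / length))
        neighbours.setdefault(b, []).append((a, 1.0 / length))
    dim = len(next(iter(tips.values())))
    values = {name: list(v) for name, v in tips.items()}
    internal = [n for n in neighbours if n not in tips]
    grand_mean = [sum(v[k] for v in tips.values()) / len(tips) for k in range(dim)]
    for n in internal:
        values[n] = list(grand_mean)  # starting guess
    for _ in range(n_sweeps):
        for n in internal:
            total = sum(w for _, w in neighbours[n])
            values[n] = [sum(w * values[m][k] for m, w in neighbours[n]) / total
                         for k in range(dim)]
    return {n: values[n] for n in internal}

# Hypothetical tree ((A, B) n1, C) root with unit branch lengths and one trait.
toy_edges = [("root", "n1", 1.0), ("root", "C", 1.0), ("n1", "A", 1.0), ("n1", "B", 1.0)]
toy_tips = {"A": [0.0], "B": [0.0], "C": [3.0]}
nodes = reconstruct_nodes(toy_edges, toy_tips)
```

In this toy case the stationarity conditions are n1 = (A + B + root) / 3 and root = (n1 + C) / 2, which give n1 = 0.6 and root = 1.8. The same routine applies unchanged to multivariate shape variables, since each variable is averaged separately.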

Table 2. First and last occurrences based on Whiteside (1986), Fraser & Benton (1989), Carroll and Wild (1994), Evans & Kermack (1994), Fraser (1994), Sues et al. (1994), Waldman & Evans (1994), Martin & Krebs (2000), Apesteguía & Novas (2003), Worthy & Grant-Mackie (2003), Heckert (2004), Bonaparte & Sues (2006), Evans & Borsuk-Białynicka (2009) and Jones et al. (2009b)

| Taxon | Earliest known occurrence | Stage | Age (Ma) | Latest known representative | Stage | Age (Ma) | Known range |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Kuehneosaurus | Cromhall, Emborough, Batscombe, UK | Rhaetian | 200 | n/a | n/a | n/a | 1 |
| Marmoretta | Kirtlington, UK | Late Callovian | 165 | Guimarota, Portugal | Oxfordian/Kimmeridgian | 155 | 10 |
| Sophineta | Czatkowice, Poland | Olenekian | 248 | n/a | n/a | | 1 |
| Gephyrosaurus | St Brides, UK | Rhaetian | 198 | n/a | n/a | n/a | 1 |
| Diphydontosaurus | Chinle specimen, USA | Late Carnian | 218 | Tytherington, UK | Rhaetian | 200 | 18 |
| Planocephalosaurus | Chinle specimen, USA | Late Carnian | 218 | Tytherington, UK | Rhaetian | 200 | 18 |
| Clevosaurus hudsoni | Cromhall Quarry, UK | Rhaetian | 200 | n/a | n/a | n/a | 1 |
| Clevosaurus bairdi | McCoy Brook Fm, Canada | Hettangian | 198 | n/a | n/a | n/a | 1 |
| Clevosaurus brasiliensis | Caturrita Formation, Brazil specimen | Late Carnian – early Norian | 218 | n/a | n/a | n/a | 1 |
| Brachyrhinodon | Elgin, Scotland | Carnian | 225.5 | n/a | n/a | n/a | 1 |
| Palaeopleurosaurus | Posidonienschiefer, Germany | Toarcian | 179.5 | n/a | n/a | n/a | 1 |
| Pleurosaurus | Posidonienschiefer, Germany | Toarcian | 179.5 | Canjeurs, France | Early Tithonian | 147 | 32.5 |
| Sphenodon | Old Rifle Butts, New Zealand | Latest Pleistocene | 0.3 | Extant | Holocene | 0 | 1 |
| Priosphenodon | Candeleros Formation, Argentina | Turonian | 94 | n/a | n/a | n/a | 1 |

The same procedure is implemented in the software morphoj (Klingenberg, 2011), which we used to quantify the phylogenetic signal in the cranium shape data (Klingenberg & Gidaszewski, 2010). By randomizing the order of tip data (in our case, shape coordinates of the 14 OTU), it is possible to re-compute ancestral shapes. The sum of Procrustes distances between OTUs and estimated HTU under random models is computed and described as tree length. The observed tree length (based on the correct topology) is then compared with random tree length values: if the observed tree length is shorter than most of those obtained by randomization, a phylogenetic signal is present. P values obtained with this procedure were used to detect phylogenetic signal in cranium shape data, and tree length values of both topologies were compared to identify the degree of homoplasy (Klingenberg & Gidaszewski, 2010).
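The permutation logic can be sketched in a few lines. The sketch follows the convention of Klingenberg & Gidaszewski (2010), in which a signal is indicated when few random re-assignments of tip data yield a tree as short as the observed one. It is not the morphoj implementation: tree length is taken here as the summed squared change per unit branch length, and the function names, edge-list tree format, number of permutations and toy data are assumptions for illustration.

```python
import random

def tree_length(edges, tips, n_sweeps=200):
    """Summed squared change per unit branch length after estimating
    internal nodes as weighted means of their neighbours."""
    neighbours = {}
    for a, b, length in edges:
        neighbours.setdefault(a, []).append((b, 1.0 / length))
        neighbours.setdefault(b, []).append((a, 1.0 / length))
    dim = len(next(iter(tips.values())))
    values = {n: list(v) for n, v in tips.items()}
    internal = [n for n in neighbours if n not in tips]
    for n in internal:
        values[n] = [0.0] * dim
    for _ in range(n_sweeps):
        for n in internal:
            total = sum(w for _, w in neighbours[n])
            values[n] = [sum(w * values[m][k] for m, w in neighbours[n]) / total
                         for k in range(dim)]
    return sum(sum((x - y) ** 2 for x, y in zip(values[a], values[b])) / length
               for a, b, length in edges)

def signal_test(edges, tips, n_perm=999, seed=0):
    """Return (observed tree length, P value) from random re-assignment of tip data."""
    rng = random.Random(seed)
    names = sorted(tips)
    observed = tree_length(edges, tips)
    hits = 0
    for _ in range(n_perm):
        shuffled = names[:]
        rng.shuffle(shuffled)
        permuted = {name: tips[src] for name, src in zip(names, shuffled)}
        # Count permutations that are at least as short as the observed tree.
        if tree_length(edges, permuted) <= observed + 1e-9:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical tree: two three-taxon clades whose members resemble each other.
clade_edges = [("root", "n1", 1.0), ("root", "n2", 1.0),
               ("n1", "A", 1.0), ("n1", "B", 1.0), ("n1", "C", 1.0),
               ("n2", "D", 1.0), ("n2", "E", 1.0), ("n2", "F", 1.0)]
clade_tips = {"A": [0.0], "B": [0.1], "C": [0.2], "D": [5.0], "E": [5.1], "F": [5.2]}
obs_length, p_signal = signal_test(clade_edges, clade_tips)
```

With only six tips in two symmetric clades, about one permutation in ten restores the original grouping, so the P value cannot fall much below 0.1; this illustrates how small samples limit the power of the test.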

Once the phylogenetic signal had been detected, Phylogenetic Generalized Least Squares (PGLS, Rohlf, 2001, 2006c) was applied to the PLS models to validate the association of cranium shape with size or dentition (cf. Meloro et al., 2011; Meloro, 2012; Piras et al., 2012). PGLS allows phylogenetic information (= OTU covariance matrix) to be incorporated as the error term in ordinary least squares models. The OTU covariance matrix was obtained for both topologies with the software ntsyspc 2.21c and added as the error term to validate the association between the significant Singular Warp vector of cranium shape (arbitrarily set as dependent) and that of size or dentition.
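A minimal sketch of a generalized least squares fit with a phylogenetic error covariance is given below for a single predictor. It solves the usual estimator b = (X' C^-1 X)^-1 X' C^-1 y, where C is the OTU covariance matrix. It is not the ntsyspc implementation: the function names and the three-taxon covariance matrix (shared branch lengths for a hypothetical tree ((A:1, B:1):1, C:2)) are assumptions for illustration.

```python
def _solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

def pgls(x, y, cov):
    """Return (intercept, slope) of y ~ x with error covariance matrix cov."""
    ones = [1.0] * len(x)
    ci_1 = _solve(cov, ones)  # C^-1 applied to the intercept column
    ci_x = _solve(cov, x)     # C^-1 applied to the predictor
    ci_y = _solve(cov, y)     # C^-1 applied to the response
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    a11, a12, a22 = dot(ones, ci_1), dot(ones, ci_x), dot(x, ci_x)
    b1, b2 = dot(ones, ci_y), dot(x, ci_y)
    det = a11 * a22 - a12 * a12
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

# Hypothetical covariance matrix: entries are branch lengths shared from the root.
phylo_cov = [[2.0, 1.0, 0.0],
             [1.0, 2.0, 0.0],
             [0.0, 0.0, 2.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intercept_ols, slope_ols = pgls([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], identity)
intercept_gls, slope_gls = pgls([0.0, 1.0, 2.0], [1.0, 3.0, 5.0], phylo_cov)
```

With an identity covariance matrix the estimator reduces to ordinary least squares, which makes the role of the phylogenetic term explicit: it only changes the fit when related taxa share covariance.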

Evolutionary tempo and mode

If phylomorphospace provides a means of detecting phylogenetic signal in cranial shape data, the plot of cranial shape deformation onto a phylogenetic tree is primarily an instrument to explore evolutionary shape transformations and infer mode of evolution (Rohlf, 2002). We used tpstree 1.21 (Rohlf, 2007) to visualize shape deformation as estimated at the nodes of each phylogenetic topology. To quantitatively explore evolutionary tempo and mode, we applied the procedure of Harmon et al. (2003) to generate disparity through time plots (DTT) for cranium shape, size and dentition data. Morphological disparity (MD) is a robust measure of morphological variance, and Harmon et al. (2003) introduced a way to detect changes in MD along a phylogenetic topology: at each tree node, mean subclade relative disparity is computed as the mean of the ratios of each subclade disparity divided by the disparity of the entire clade. This procedure is repeated for every node between the tips and the root of the tree, and the DTT is obtained by plotting node time (X axis) against relative disparity value. Values of relative disparity close to 0.0 mean that subclades contain only a small proportion of the total variation (i.e. little overlap occurs in the empirical morphospace between different subclades), whereas relative disparity values close to 1.0 indicate that there is extensive morphological overlap between subclades.
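The relative subclade disparity at a single node can be sketched as follows. The sketch assumes that disparity is measured as the average squared Euclidean distance among all pairs of taxa, which is one common choice and not necessarily the one used in our analyses; the function names and the four-taxon data are hypothetical.

```python
from itertools import combinations

def disparity(rows):
    """Average squared Euclidean distance among all pairs of rows."""
    pairs = list(combinations(rows, 2))
    if not pairs:
        return 0.0
    return sum(sum((a - b) ** 2 for a, b in zip(p, q)) for p, q in pairs) / len(pairs)

def relative_subclade_disparity(data, subclades):
    """Mean of (subclade disparity / whole-clade disparity) over the given subclades."""
    total = disparity(list(data.values()))
    ratios = [disparity([data[t] for t in clade]) / total for clade in subclades]
    return sum(ratios) / len(ratios)

# Hypothetical one-variable data for four taxa.
toy = {"A": [0.0], "B": [0.0], "C": [10.0], "D": [10.0]}
distinct = relative_subclade_disparity(toy, [["A", "B"], ["C", "D"]])
overlapping = relative_subclade_disparity(toy, [["A", "C"], ["B", "D"]])
```

When each subclade occupies its own region of morphospace (A with B, C with D), the value is 0.0; when each subclade spans the whole range (A with C, B with D), the value is high, here 1.5, because the average pairwise distance within a subclade can exceed that of the whole clade.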

To detect departures of the observed rate of evolution from the theoretical model of random evolution (Brownian motion), the observed line of disparity through time is plotted against the theoretical line obtained from Brownian motion model simulations (Harmon et al., 2003). The difference in area under the lines of observed and theoretical disparity is then computed so that positive values indicate that relative disparity in the data is high, whereas negative values indicate that it is low. A high relative disparity is associated with significant morphological overlap between subclades, while a negative area suggests a low degree of overlap. This procedure was repeated for the three structures analysed (cranium shape, size, and dentition) and for both phylogenetic topologies. As both topologies included a three-way polytomy with respect to the species of Clevosaurus, we employed the R command multi2di to resolve it randomly. Meloro & Raia (2010) have recently demonstrated that this procedure can be applied to the study of fossil organisms, and that it provides robust results even when phylogenetic topologies are uncertain. The DTT and area differences were computed using the package Geiger (Harmon et al., 2008).
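The area difference between the two curves can be sketched with the trapezoid rule. This is an illustration only and not the Geiger implementation: the function name is hypothetical, the expected curve is assumed to be supplied by the caller (for example, as an average over Brownian motion simulations), and the three-point curves are invented.

```python
def area_between(times, observed, expected):
    """Trapezoid-rule area of (observed - expected) relative disparity over time."""
    diff = [o - e for o, e in zip(observed, expected)]
    return sum((t1 - t0) * (d0 + d1) / 2.0
               for t0, t1, d0, d1 in zip(times, times[1:], diff, diff[1:]))

# Hypothetical curves sampled at three relative times.
rel_time = [0.0, 0.5, 1.0]
expected_curve = [1.0, 0.5, 0.0]
higher = area_between(rel_time, [1.0, 0.8, 0.0], expected_curve)
lower = area_between(rel_time, [1.0, 0.2, 0.0], expected_curve)
```

A curve that lies above the expectation yields a positive area (here 0.15) and one that lies below it yields a negative area (here -0.15), matching the sign convention described above.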



The Relative Warp analyses of the 14 OTU and 12 HTU using the topology of Reynoso (1996) show a high degree of differentiation in cranium shape between subclades of Rhynchocephalia. Such differentiation is apparent on RW1 and RW2, which explain 31.13% and 18.73% of the total variance, respectively (Fig. 2a). Clevosaurs, Sphenodon and Priosphenodon have negative scores for RW1 whereas the basal taxa, pleurosaurs and the outgroup taxa (Sophineta, Marmoretta and Kuehneosaurus) show positive scores for RW1. The estimated HTU all plot within the morphological variation of their respective descendants. The eilenodontine Priosphenodon is relatively isolated with negative RW1 scores and positive RW2 scores.

Figure 2.

Phylomorphospace identified by the first and the second Relative Warp of skull shape data. (a) Topology of Reynoso (1996) is plotted into the RW while that of Apesteguía & Novas (2003) is plotted in (b). Skull shape deformations from the negative to the positive RW scores are shown using thin plate spline.

The group dichotomy displayed on RW1 is clearly associated with variation in the relative proportions of the maxilla, orbit and temporal fenestra. Crania at negative RW1 are relatively robust with a short maxilla and a large area of the lower temporal fenestra, whereas crania with high RW1 scores are characterized by a relatively long rostrum and a short temporal area. Relative Warp 2 differentiates eilenodontines and pleurosaurs from everything else according to differences in the dorsoventral proportion of the skull. Taxa at the positive scores such as pleurosaurs have relatively shallow skulls whereas those with negative scores such as Sphenodon have tall skulls (Fig. 2a). Connecting HTU to their respective OTU suggests a general phylogenetic trend towards skulls with negative scores along RW1: robust skulls with larger postorbital areas and short snouts (e.g. clevosaurs and sphenodontines). However, something of a reverse trend occurs with respect to the aquatic pleurosaur taxa, which share skull shape similarities with the basal rhynchocephalians and outgroup taxa. When the analysis is repeated for the topology of Apesteguía & Novas (2003), there is a similar pattern (Fig. 2b). The only obvious difference is the skull shape calculated for the common ancestor of the lineage leading to Priosphenodon + Sphenodon. With clevosaurs as the sister taxon (Reynoso, 1996), the HTU has a rounded snout and almost complete lower temporal bar, whereas with pleurosaurs as the sister taxon, the HTU has a longer snout, larger nares and less complete lower temporal bar.

A phylogenetic signal in skull shape is supported for both tree topologies (P = 0.044 after 10 000 randomizations using the topology of Reynoso (1996) or P = 0.0021 for that of Apesteguía & Novas, 2003). There is very little difference in the computed tree length for the two topologies, being 0.383 for Reynoso (1996) and 0.379 for Apesteguía & Novas (2003).

Skull shape, size and dentition

Partial Least Squares analysis detects one pair of Singular Warp vectors when correlating skull length = size (only 1 variable) vs. shape. The correlation between these two vectors is high (r = 0.748; Fig. 3), but it is not significantly different from the random expectation (in 9999 randomizations, 36% of cases yield r values higher than this). Small skull sizes are generally associated with a shorter rostrum and a larger orbit whereas the opposite is true of larger forms (Fig. 3). When using multivariate regression (size vs. shape), size explains only 9% of total shape variance, and the Goodall F test still implies nonsignificance. This mainly reflects the superficial similarities between the two largest skulls, Pleurosaurus and Priosphenodon, related to relative orbit size. Size does not associate with any of the Relative Warps extracted for a subset of 14 OTU only (all correlation P values > 0.1 except for PC9, P = 0.053). The PGLS confirms a nonsignificant association between skull size and shape along PLS vectors (with both topologies P > 0.1).

Figure 3.

Plot of the first Singular Warp extracted for association between size and shape. Deformation grids represent thin plate spline related to negative or positive scores of SW1 shape.

When using tooth number as a possible correlate of skull shape, PLS extracts four pairs of Singular Warps. The first pair explains the highest percentage of covariation (98.12%) and exhibits the highest correlation (r = 0.83; Fig. 4). Nonparametric tests support significance even though, in 9999 random simulations, a small portion (6.96%) showed higher r values (due to the small sample size). Subclade clustering still occurs in the PLS dentition–skull morphospace although the distribution of the basal Rhynchocephalia overlaps with both the aquatic pleurosaurs and outgroup taxa. The most negative SW1 shape and tooth scores are those of clevosaurs, which possess a relatively low number of teeth and a relatively stocky skull with a short rostrum, a small orbit and a large lower temporal fenestra. The most positive SW1 shape and teeth scores are found in the outgroup and ‘basal’ taxa. These share a high number of teeth and a skull with a relatively long rostrum, large orbit and incomplete lower temporal bar.

Figure 4.

Plot of the first Singular Warp extracted for association between dentition (4 variables) and shape. Deformation grids represent thin plate spline related to negative (low teeth number) or positive (high teeth number) scores of SW1 shape.

As the phylogenetic signal is mixed in this SW plot, it is not unexpected that the association along the first pair of SW is still significant when phylogeny is taken into account. Moreover, the association between SW1 teeth and SW1 shape is actually stronger. Reynoso (1996): PGLS: r = 0.864, F1,12 = 35.32, P < 0.0001 and Apesteguía & Novas (2003): PGLS: r = 0.885, F1,12 = 43.65, P < 0.0001. No correlation exists between number of teeth and skull length.

Tempo and mode of evolution in Rhynchocephalia

The phylogenetic mapping of skull shape shows evolutionary transformation that confirms a strong phylogenetic signal in the data. Gradual shape transformation occurs between the more ancient nodes and the group of Clevosaurus so that the least deformed configuration is associated with the ancestry of Gephyrosaurus + all other Rhynchocephalia. The ancestor of Clevosaurus is calculated to be Sphenodon-like in some aspects whereas the evolutionary transformation into pleurosaurs retains a shallow skull with a long rostrum. The greatest difference between the alternative topologies emerges in the estimated shape of the common ancestor of Priosphenodon and Sphenodon (Fig. 5). This largely reflects differences in the temporal region between the two taxa that have a profound impact on the estimate of their common ancestor (Fig. 5).

Figure 5.

Phylogenetic mapping of skull shape onto the two topological hypotheses for Rhynchocephalia: (a) Reynoso (1996); (b) Apesteguía & Novas (2003). Deformation grids represent the major transformation between the consensus (no deformation) and the Hypothetical Taxonomic Unit (HTU) target estimated using the squared change parsimony method.
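The idea behind squared change parsimony can be illustrated with a small sketch that assigns values to internal nodes so that the sum of squared changes along branches, each divided by its branch length, is minimal. The branch-length weighting, the function name and the toy tree are assumptions made for illustration; the mapping software used here may apply a different variant, and real shape data would be treated one coordinate at a time.

```python
import numpy as np

def squared_change_parsimony(edges, tip_values):
    """Internal-node values minimising sum((child - parent)**2 / length)."""
    nodes = sorted({n for p, c, _ in edges for n in (p, c)})
    idx = {n: i for i, n in enumerate(nodes)}
    L = np.zeros((len(nodes), len(nodes)))       # weighted graph Laplacian of the tree
    for parent, child, length in edges:
        w = 1.0 / length
        i, j = idx[parent], idx[child]
        L[i, i] += w
        L[j, j] += w
        L[i, j] -= w
        L[j, i] -= w
    tips = [n for n in nodes if n in tip_values]
    internal = [n for n in nodes if n not in tip_values]
    I = [idx[n] for n in internal]
    T = [idx[n] for n in tips]
    v_tips = np.array([tip_values[n] for n in tips], dtype=float)
    # Setting the gradient of the quadratic cost to zero gives a linear system.
    v_int = np.linalg.solve(L[np.ix_(I, I)], -L[np.ix_(I, T)] @ v_tips)
    return {n: float(v) for n, v in zip(internal, v_int)}

# Hypothetical three-taxon tree: (parent, child, branch length).
toy_edges = [("root", "n1", 1.0), ("n1", "A", 1.0),
             ("n1", "B", 1.0), ("root", "C", 2.0)]
toy_tips = {"A": 0.0, "B": 0.0, "C": 3.0}        # made-up trait values

ancestors = squared_change_parsimony(toy_edges, toy_tips)
```

In this toy case the root estimate is 9/7 and the estimate for node n1 is 3/7, which shows how a single divergent tip (here C) pulls the reconstruction of a shared ancestor towards itself.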

Disparity through time plots reveal different trends for skull shape compared with skull size and dentition (Fig. 6). Relative disparity of cranium shape is always higher than that simulated under a Brownian motion model of evolution, as also shown by the area difference, which is positive in all cases (Table 3). In contrast to the results for cranium shape, the relative disparity of size and dentition shows a dramatic decrease at the origin of Rhynchocephalia. Area differences are always negative with respect to Brownian motion (Table 3), suggesting a lower and more conservative diversification rate in skull size and dentition during evolutionary history. These results apply irrespective of the phylogenetic topology used.

Figure 6.

Disparity through time plots generated independently for skull shape, size and dentition using the topology of Reynoso, 1996 (right side), or Apesteguía & Novas, 2003 (left). The X axis shows relative time (from 0 to 3.12) and the Y axis shows relative disparity. The dashed line represents the disparity computed under the Brownian simulation model.

Table 3. Differences in area between curves of relative observed and expected (under Brownian motion evolution model) disparity. The difference is computed considering all points that generate the curves, only 2/3 of the curve or only the mean points generating the curve.
Area difference | Skull shape | Skull size | Dentition
Reynoso (1996)
Apesteguía & Novas (2003)
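The area difference summarised in Table 3 can be sketched as the trapezoidal area under the observed disparity curve minus the area under the curve expected under Brownian motion, optionally restricted to an initial fraction of the time axis. The function names and curve values below are hypothetical, and treating "2/3 of the curve" as the first two-thirds of relative time is an assumption.

```python
import numpy as np

def curve_area(t, y):
    """Trapezoidal area under a curve sampled at times t."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def area_difference(t, observed, expected, fraction=1.0):
    """Observed minus expected area over the first `fraction` of the time span."""
    t = np.asarray(t, dtype=float)
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    cutoff = t[0] + fraction * (t[-1] - t[0])
    keep = t <= cutoff + 1e-12                   # tolerance for floating-point time stamps
    return curve_area(t[keep], observed[keep]) - curve_area(t[keep], expected[keep])

# Made-up curves: constant observed disparity, linearly declining expectation.
t_rel = np.linspace(0.0, 1.0, 4)
obs_curve = np.ones(4)
exp_curve = 1.0 - t_rel

full_diff = area_difference(t_rel, obs_curve, exp_curve)               # 0.5
partial_diff = area_difference(t_rel, obs_curve, exp_curve, 2.0 / 3.0) # 2/9
```

A positive value means the observed curve lies above the Brownian expectation on average, as described here for skull shape; a negative value corresponds to the pattern described for skull size and dentition.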


Major patterns in rhynchocephalian cranial shape

Our phylomorphospace confirms the results of Jones (2008) with respect to variation in rhynchocephalian skull shape. The majority of variation involves differences in relative orbit size, nares size, nares orientation, skull proportions, temporal structure and position of the jaw joint. There is strong phylogenetic signal with subclades being distinct from one another. For example, all four clevosaurs plot near each other with respect to other taxa such as the pleurosaurs. All three outgroup taxa (stem lepidosaurs) plot near one another despite representing animals of different palaeoecological roles: Sophineta being a small scansorial animal (Evans & Borsuk-Białynicka, 2009), Marmoretta being an aquatic predator (Waldman & Evans, 1994) and Kuehneosaurus being a parachuter/glider (Robinson, 1962; Stein et al., 2008). However, in the sample as a whole, evolutionary paths are not very parsimonious. This is related to convergence, for example, between the pleurosaurs and stem lepidosaurs, which both possess a long snout and short temporal region. In pleurosaurs, it is presumably partly linked to the benefits of having a streamlined skull for life in an aquatic environment (Taylor, 1987; Jones, 2008). However, it is also probably related to permitting a large gape and rapid jaw closure with less emphasis on bite force (Reynoso, 2005; Jones, 2008). Sphenodon and clevosaurs also show similarities in sharing a short snout and large lower temporal fenestra. This configuration probably reflects the use of their jaws for forceful biting as suggested by their stout teeth (Jones, 2008, 2009). The comparative isolation of Priosphenodon (Apesteguía & Novas, 2003) is due to its derived temporal region, relatively small orbit and relatively shallow skull profile.
It is likely that the apparent morphological gap between Priosphenodon and other rhynchocephalians will be reduced by improvements in the fossil record, for example, with discovery of complete skull material of the more ancient eilenodontines Eilenodon and Toxolophosaurus from Late Jurassic and Early Cretaceous of the USA, respectively (Rasmussen & Callison, 1981; Throckmorton et al., 1981). A well-preserved adult skull of Opisthias from the Late Jurassic of North America (Kirkland, 2006) may also help as it possibly represents the sister taxon to eilenodontines (Apesteguía & Novas, 2003) and its dentition shares features with both eilenodontines and Sphenodon (Throckmorton et al., 1981; Jones, 2009).

Phylogeny and cranial shape

The phylogenetic signal is always present and significant in the skull shape of Rhynchocephalia. The topology of Reynoso (1996) suggests a slightly higher degree of homoplasy than that of Apesteguía & Novas (2003). Thus, the more recent topology is possibly more consistent with a parsimonious evolution of skull shape. However, all the analyses show that the use of two alternative topologies does not alter the results and conclusions of the study. This has already been tested extensively in mammals (Carnivora, Meloro et al., 2008; Meloro & Raia, 2010) and suggests that fossil topologies should always be incorporated even when they are not well established.

Recently, much has been made of the association between skull shape and phylogeny (Daza et al., 2009; Figueirido et al., 2010; Brusatte et al., 2012; Foth et al., 2012). It is not at all surprising that phylogeny is important particularly when dealing with fossil taxa whose phylogenetic hypotheses are dependent on morphological data such as skull structure. However, we note that significant phylogenetic signal has been detected also in the skull shape of different populations of smooth newts (Mesotriton alpestris, Salamandridae) from the Balkan Peninsula based on molecular topology (Ivanović et al., 2011). It is possible that the vertebrate skull carries a significant phylogenetic signal in many groups of vertebrates due to its complexity and the influence of many developmental pathways to its variation (cf. Caumul & Polly, 2005; Cardini & Elton, 2008). A significant phylogenetic signal at different evolutionary scales (e.g. from populations to orders) does not imply any particular causal mechanisms (Losos, 2011a,b) but simply suggests that comparative methods must be broadly applied because data interdependency in biological traits is very common in neontological and paleontological data sets (Blomberg et al., 2003). Further studies are required to identify if phylogenetic signal is more dependent on the scale of taxonomic coverage (e.g. data belonging to members of different families) or on the use of molecular or morphological topologies.

Size and tooth number

In Rhynchocephalia, tooth number can explain variation in skull shape to a greater extent than skull size, as shown by the analysis using Partial Least Squares. Skull shape in Rhynchocephalia does vary with size but not in a consistent fashion. In many analyses, skull size is found to promote morphological convergence in the shape morphospace (Brusatte et al., 2012), particularly when relatively few landmarks are used to describe the skull, the sample shows little disparity, or the sample includes examples spanning a very large size range. However, we confirm the results of Jones (2008) that size is not strongly correlated with skull shape in Rhynchocephalia as a whole: Sphenodon, Palaeopleurosaurus, Pleurosaurus and Priosphenodon have large skulls (57–97 mm) but show substantial differences from one another with respect to snout shape and structure of the temporal region. Similarly, clevosaurs and the basal taxa overlap in size range (21–39 mm vs. 15–30 mm) but differ in aspects of snout shape, temporal structure and robusticity.

The explanatory power of tooth number suggests that skull shape in Rhynchocephalia could have been highly influenced by the developmental pathways of teeth and/or associated mode of feeding (Jones, 2008). Rhynchocephalia do possess unusual teeth in that replacement is either slow or entirely absent (Robinson, 1976; Evans, 1980, 1985; Fraser, 1986; Whiteside, 1986). New teeth are added to the jaw bone as it grows. Therefore, within species, tooth number is linked to skull size. However, this is not necessarily true between species for Rhynchocephalia as a whole. Variation in tooth number is partly linked to variation in tooth size and shape (Jones, 2008, 2009). The jaws of Gephyrosaurus can accommodate a large number of small conical or columnar teeth whereas those of clevosaurs support a small number of stout elongated teeth bearing bladed edges (Evans, 1980; Fraser, 1988; Jones, 2008, 2009). Differences in tooth shape have clear biomechanical implications for feeding (Evans & Sanson, 1998, 2003; Freeman & Lemen, 2006; Jones, 2006a,b, 2009; Meloro et al., 2008; Anderson, 2009; Meloro & Raia, 2010), but tooth number is also important (Lucas & Luke, 1984). A large number of small teeth may permit a large amount of food to be reduced per jaw movement, but intuitively a skull with a longer tooth row may restrict the length of the adductor chamber. Conversely, a small number of teeth will likely reduce the surface area of initial contact with food items, thus maximizing the loading from bite forces (Jones, 2008, 2009). A trend towards a reduction in number of teeth seems to follow phylogeny in Rhynchocephalia. High numbers of teeth tend to be found in the outgroup or basal forms (Fig. 4), but the PGLS clearly suggests that changes in number of teeth are significantly associated with changes in skull shape.
Triassic and Jurassic rhynchocephalians with a high tooth number also exhibit a large orbit, an incomplete temporal bar and a pointed snout compared with taxa with a smaller number of teeth (e.g. Sphenodon but especially clevosaurs) that possess a more robust skull with greater room for the adductor muscles (Jones, 2008).

The evolution of skull shape in rhynchocephalians

Cranial shape seems to have evolved relatively slowly until the emergence of Gephyrosaurus. Between Sophineta and Gephyrosaurus, there is a reduction in lacrimal size, expansion of the maxillary facial process and a posterior extension of the jugal bone. The inferred common ancestor of Rhynchocephalia did not possess a complete lower temporal bar. The evolution of clevosaurs is characterized by a shortening of the snout, so that the nares become more upright, and an elongation of the posterior jugal process. The evolution of pleurosaurs by contrast involves elongation of the snout (so that the nares become less upright) and reduction of the posterior jugal process. The anteroventral border of the squamosal is much more embayed in clevosaurs. Both topologies show the great deformations occurring between Priosphenodon and Sphenodon while the relative position of more internal nodes has little impact on the associated ancestral shape reconstruction. This observation supports the relatively small impact of different topologies in understanding the macroevolutionary history of clades at large scale (cf. Meloro et al., 2008; Meloro & Raia, 2010). DTT confirms this: trends are almost identical even when the phylogenetic relationships differ. Skull shape appears to be more strongly influenced by species ecology than size or dentition. In fact, skull shape evolved generating a very high diversification of adaptation that includes repeated evolution of long snouted forms. This explains similarities between plesiomorphic forms and the more derived pleurosaurs. In Marmoretta and the pleurosaurs, this is associated with an aquatic lifestyle (Taylor, 1987; Jones, 2008), but in other taxa such as Gephyrosaurus, this may relate to carnivory and rapid jaw closure (Evans, 1980; Metzger & Herrel, 2005; Jones, 2008).

Like other eilenodontines, Priosphenodon possesses a dense battery of wide teeth with thickened enamel that would be suited to an herbivorous diet (Throckmorton et al., 1981; Apesteguía & Novas, 2003; Jones, 2008, 2009). However, this reptile's cranium does not show the features typically associated with herbivory in squamates: a deep temporal region and a short snout (Metzger & Herrel, 2005; Stayton, 2006). Interestingly, it is clevosaurs that most obviously possess these traits. Although both partial herbivory (Fraser & Walkden, 1983) and facultative herbivory (Fraser, 1985) have been suggested as dietary modes of Clevosaurus, its teeth resemble mammalian carnassials and appear suited to carnivory (Jones, 2008, 2009).

Skull shape DTT shows a dramatic increase after relative time 0.5, which corresponds generally to the node at 200 million years ago. This time represents the emergence of clevosaurs, and it is likely that such a high peak in disparity is due to the large difference between taxa with (Clevosaurus) and without (Gephyrosaurus) a complete temporal bar. There are also major differences in the relative orbit size, temporal fenestra and snout, including the construction of the maxilla and premaxilla. Such high disparity is diluted when all taxa are considered (relative time 0.0) because the number of species with no temporal bar is much higher. The simulated Brownian motion DTT also follows a similar trend, although the observed disparity through time is much higher.

Compared to skull shape, the evolution of skull size and dentition is more conservative within subclades, and they suggest that species did not overlap in time in these traits. In particular, we note a dramatic drop in the disparity of number of teeth before relative time 0.5, while skull size shows another peak in disparity before declining. Size and dentition evolved differently, so that successive peaks in disparity (e.g. size at 1.5, roughly 130 Ma; dentition at 0.7, c. 195 Ma) are also shifted. The nonassociation between skull size and number of teeth is then explained by the passage of time. We also note a strong departure of disparity through time from the trend expected under Brownian motion, especially in tooth number and skull size contra skull shape (Fig. 6). This suggests that the same morphological structure might be interpreted differently depending on the trait analysed (cf. Meloro & Raia, 2010).


New comparative methods combined with Geometric Morphometrics show that for Rhynchocephalia, differences in tooth number can account for a greater amount of cranial shape variation than can cranial size. The phylomorphospace highlights the fact that Rhynchocephalia were once a highly diverse group and Sphenodon is not the product of evolutionary stasis. Similarly, changes in disparity over time show that the evolution of this group was not simple. Skull shape evolved at a faster rate (high levels of averaged subclade disparity) than size and tooth number (low levels of subclade disparity). This allowed Rhynchocephalia to quickly occupy different ecomorphological niches that were replaced through time by similar skull shape morphotypes (e.g. the aquatic taxa). Conversely, skull size and tooth number evolved slowly and showed uncorrelated shifts of disparity through time (they evolved at different rates).

It remains unclear as to why Rhynchocephalia, which was once such a diverse group, is now so reduced in number. Representatives seem to have disappeared from east Asia sometime after the Early Jurassic and became restricted to southern continents by the Late Cretaceous (Evans et al., 2001; Apesteguía & Novas, 2003; Jones, 2006b; Jones et al., 2009b; Apesteguía & Jones, 2012). Competition with emerging derived lizards has been suggested (Milner et al., 2000; Apesteguía & Rougier, 2007), but given the degree of oral food processing employed by members of this group (Jones et al., 2012), competition with mammals (Bogert, 1953) and small dinosaurs should also be considered. Distinguishing between competitive and opportunistic replacement remains challenging (Benton, 1987; Jones, 2006b), and differences in preferred environment are difficult to rule out (Evans, 1995). The circumstances of rhynchocephalian geographic contraction may also have differed between northern and southern continents (Apesteguía & Novas, 2003; Apesteguía & Jones, 2012), and the palaeoecological and morphological diversity now apparent in Rhynchocephalia (Jones, 2006a, 2008; Evans & Jones, 2010; Jones & Lappin, 2009) might suggest that different lineages went extinct for different reasons. As Sphenodon is now the sole representative of Rhynchocephalia, one of the six major amniote clades (Squamata, Rhynchocephalia, Testudinata, Crocodylia, Aves and Mammalia), its conservation is of prime importance to amniote biodiversity.


We are grateful to staff at the Grant Museum of Zoology, UCL, UK; University of Bristol, UK; the Natural History Museum, UK, and Cambridge University Museum of Zoology, UK, and the Museum of New Zealand Te Papa Tongarewa, Wellington, New Zealand for allowing access to specimens. We thank Steve Brusatte and Juan Diego Daza for constructive reviews that considerably improved the quality of this manuscript.