Geometric morphometric analysis of craniofacial variation, ontogeny and modularity in a cross-sectional sample of modern humans

Authors


Correspondence

Hans L. L. Wellens, Researcher, Department of Orthodontics and Craniofacial Biology, Radboud University Nijmegen Medical Centre, 309 Tandheelkunde, P.O. Box 9101, 6500 HB Nijmegen, The Netherlands. T/F: + 32 50 396836; E: wellens.hans@telenet.be

Abstract

This investigation aimed to quantify craniofacial variation in a sample of modern humans. In all, 178 consecutive orthodontic patients were collected, of which 79 were male (mean age 13.3, SD 3.7, range 7.5–40.8) and 99 were female (mean age 12.3, SD 1.9, range 8.7–19.1). The male and female subgroups were tested for differences in mean shapes and ontogenetic trajectories, and shape variability was characterized using principal component analysis. The hypothesis of modularity was tested for six different modularity scenarios. The results showed that there were subtle but significant differences in the male and female Procrustes mean shapes. Males were significantly larger. Mild sexual ontogenetic allometric divergence was noted. Principal component analysis indicated that, of the four retained biologically interpretable components, the two most important sources of variability were (i) vertical shape variation (i.e. dolichofacial vs. brachyfacial growth patterns) and (ii) sagittal relationships (maxillary prognathism vs. mandibular retrognathism, and vice versa). The mandible and maxilla were found to constitute one module, independent of the skull base. Additionally, we were able to confirm the presence of an anterior and posterior craniofacial columnar module, separated by the pterygomaxillary plane, as proposed by Enlow. These modules can be further subdivided into four sub-modules, involving the posterior skull base, the ethmomaxillary complex, a pharyngeal module, and the anterior part of the jaws.

Introduction

Human craniofacial growth and the morphological variance that comes with it have been the subject of long-standing interest. The functional matrix hypothesis (FMH) formulated by Moss (1962, 1997) suggests that craniofacial skeletal growth is directed mainly by the operational and spatial demands of developing neighboring ‘functional volumes’: skeletal muscles and multiple other tissues and organs, such as the brain, eyes, nasopharynx, masticatory system and even sinuses (Moss, 1962). According to this hypothesis, the craniofacial skeleton is therefore literally molded into shape, through time, by developing adjacent organs (Moss, 1962, 1997). The skeletal muscles represent the so-called periosteal matrix, while all other tissues and organs combined constitute the capsular matrix (Moss, 1962).

Although the basic principles of the FMH (Moss, 1962) are fairly widely accepted, there is some discussion regarding the amount to which basicranial growth might be under epigenetic (i.e. non-genetic) control, in addition to being molded into shape by developing surrounding tissues. Since the basicranium, by way of its central position in the skull, divides and connects the neuro- and viscerocranium, it might to some extent influence facial growth (Bastir & Rosas, 2006; Gkantidis & Halazonetis, 2011; Lieberman et al. 2000b, 2008). Lieberman & McCarthy (1999) and Lieberman et al. (2008) pointed out that basicranial growth occurs mainly through endochondral ossification at the synchondroses, thus allowing for cranial base flexion or extension. This occurs either through differing depositional or resorptive growth fields on either side (anteroposteriorly) of the synchondroses (i.e. drift) (Giles et al. 1981; Enlow & Hans, 1996; Lieberman & McCarthy, 1999) or through differential chondrogenic activity at the upper vs. lower margins (Giles et al. 1981; Lieberman & McCarthy, 1999). Contrary to the intramembranous ossification around many of the various organs constituting the capsular matrix, basicranial growth might therefore be under more intrinsic control (Jeffery & Spoor, 2002; Lieberman et al. 2008). Additionally, the mid-sphenoidal synchondrosis ossifies prior to birth, whereas the spheno-ethmoidal one usually does not fuse before 6 years of age (Lieberman & McCarthy, 1999). The spheno-occipital synchondrosis remains active up to approximately 12 years of age (Lieberman & McCarthy, 1999). The midline basicranium therefore reaches adult shape at about 7–8 years, contrary to the lateral cranial base (at about 11–12 years). Both structures attain their adult shape before the neurocranium and face (at 15–16 years) (Bastir et al. 2006), possibly constraining the growth and/or position of the latter structures (Lieberman et al. 2008).

From a general point of view, growing and developing organs should not impinge on one another. As a result, all organs must grow/develop in a more or less coordinated way. The two closely related biological concepts of morphological integration and modularity have the potential to explain the aforementioned notion of coordination/balance of craniofacial growth (Moss, 1962, 1997; Lieberman et al. 2000a,b, 2008; Klingenberg et al. 2003; Bastir & Rosas, 2005; Mitteroecker, 2007; Bastir, 2008; Klingenberg, 2008; Mitteroecker & Bookstein, 2008; Klingenberg, 2009). Growing organs and their surrounding skeletal structures share abundant and strong interactions which can be anatomic, developmental, functional or genetic in nature (Bastir & Rosas, 2005; Mitteroecker, 2007; Bastir, 2008; Klingenberg, 2008, 2009; Mitteroecker & Bookstein, 2008). As such, they form morphologically tightly integrated organismal units, which are referred to as modules. The latter are usually defined as serving a common functional goal (Mitteroecker, 2007; Mitteroecker & Bookstein, 2008), being tightly integrated internally, while at the same time being relatively independent from other such units, with which they interact and from which they can be delineated clearly (Mitteroecker, 2007; Klingenberg, 2008, 2009; Mitteroecker & Bookstein, 2008). Morphological integration therefore refers to a high degree of structural interactivity, leading to tightly coordinated morphological development of the structures involved (e.g. strong covariation). Modularity, on the other hand, implies a relative independence thereof, due to a much lower degree of interactivity in terms of frequency and strength.

As pointed out by Bastir & Rosas (2005), the functional volumes and their associated skeletal structures in the FMH (Moss, 1962) are an example of morphological integration. On the other hand, Enlow's counterpart analysis (Enlow et al. 1969; Enlow, 1990; Enlow & Hans, 1996) can be regarded as an example of modularity: the growth counterparts are hypothesized to subdivide the skull into various relatively independent modules, both sagittally and vertically. It is important to note that modularity is not limited to a single developmental organizational level: what appears as a single module at a given level of complexity can represent multiple modules seen from the next, lower level of complexity (Bastir, 2008) (e.g. when inspecting specific substructures at a higher resolution). To shed some light on specific craniofacial mechanisms of morphological integration, it is essential to first identify and delimit modules (Klingenberg et al. 2003; Klingenberg, 2008, 2009).

As stated above, the counterpart analysis (Enlow et al. 1969; Enlow, 1990; Enlow & Hans, 1996) divides the face into an anterior and posterior module, separated by the posterior maxillary plane. The anterior module, referred to as the nasomaxillary complex, has been identified as a tightly integrated facial block, together with the orbits (Enlow et al. 1969, 1990, 1996; Lieberman et al. 2000b; McCarthy & Lieberman, 2001). More specifically, the posterior maxillary plane has been found to maintain a 90º relationship to the neutral horizontal axis (NHA) in primates and humans (amongst others) (Enlow et al. 1969, 1990, 1996; Lieberman et al. 2000a,b; McCarthy & Lieberman, 2001). The NHA is defined anteriorly by the midpoint between the upper and lower orbital rims, and posteriorly by the midpoint between the superior orbital fissures and the lower border of the optic foramen (McCarthy & Lieberman, 2001). Biegert (1957) found that in non-human primates, basicranial flexion decreased as facial size increased. Combined with the fact that in humans, the much more rapid increase in brain size relative to facial size was accompanied by an increase in basicranial flexion, this led him to formulate the ‘bi-directional hypothesis’, which states that an increase in facial size relative to brain size is associated with a reduction in basicranial flexion (Biegert, 1957; Lieberman et al. 2008; Bastir et al. 2010). A strong morphological integration has also been reported between the bilateral middle cranial fossa and the width of the mandibular ramus (Bastir & Rosas, 2004; Lieberman et al. 2008), which in turn was found to be significantly less correlated with the midline cranial base (Bastir & Rosas, 2004). Bastir & Rosas (2005) concluded that the ethmomaxillary complex is tightly integrated with the mandible in modern humans. 
Intriguingly, many studies report poor morphological correlations between the midline cranial base and various facial variables (facial breadth and height, vertical facial pattern, and mandibulo–maxillary relationships) (Lieberman et al. 2000b; Bastir & Rosas, 2006; Polat & Kaya, 2007; Proff et al. 2008). Based upon the 90º relationship of the posterior maxillary plane to NHA reported by Enlow et al. (1969, 1990, 1996), some studies have suggested that midline cranial base flexure could be developmentally limited for functional reasons, resulting in pharyngeal airway patency (Ross & Henneberg, 1995; McCarthy & Lieberman, 2001), although a definitive confirmation of this hypothesis has yet to be provided (Ross et al. 2004). Additionally, morphological correlations might change with growth and development (Arthur, 2002; Gkantidis & Halazonetis, 2011). Indeed, the midline cranial base was found to be slightly better correlated with the face compared with the lateral cranial base in children (Gkantidis & Halazonetis, 2011). The latter, however, retained and even strengthened its facial correlation during growth, contrary to the midline cranial base (Gkantidis & Halazonetis, 2011).

The aims of this study were threefold:

  1. To evaluate craniofacial shape variance in a sample of modern humans (orthodontic patients) by applying principal component analysis.
  2. To test the male and female subgroups for differences in mean shapes and ontogenetic trajectories.
  3. To test six different hypotheses of modularity by applying the methodology proposed by Klingenberg (2009). Four scenarios involving two modules were considered, one involving three, and one involving four separate modules.

Materials and methods

Lateral cephalometric radiographs of patients aged between 8 and 20 years, treated between April 2006 and May 2009 were collected from the records of the first author's private orthodontic practice. Additional inclusion criteria included the availability of good quality lateral cephalograms, the absence of craniofacial deformities, and that patients could only appear in the sample once. The resulting experimental group consisted of 178 patients, 79 of whom were male (mean age 13.3 years, SD 3.7 years, range 7.5–40.8 years) and 99 female (mean age 12.3 years, SD 1.9 years, range 8.7–19.1 years).

All radiographs were taken with the same machine by a trained operator (H.L.L.W.), using a standardized technique. The lateral cephalograms were traced, by the same author, on a light box in a darkened room, using matte acetate tracing paper and a sharpened pencil. The landmarks used in the current project are shown in Fig. 1.

Figure 1.

Landmark definitions. Point S, midpoint of the pituitary fossa of the sphenoid bone; Point N, most anterior point of the frontonasal suture; Porion, highest point of the meatus acousticus externus; Orbitale, lowest point on the averaged left and right inferior margin of the orbit; Articulare, intersection between the posterior border of the mandible, with the inferior outline of the cranial base; Posterior nasal spine, the most posterior point in the median plane on the bony hard palate; Anterior nasal spine, the tip of the median anterior process of the maxilla; Basion, lowest point on the anterior margin of the foramen magnum, in the midsagittal plane; Point A, deepest point on the anterior surface of the maxilla between ANS and Prosthion; Point B, deepest point on the anterior surface of the mandibular symphysis between Infradentale and Pogonion; Pogonion, most anterior point of the mandibular symphysis; Gnathion, most anterior and inferior point on the contour of the mandibular symphysis, constructed by bisecting the angle formed by the mandibular plane and N-Pogonion line; Menton, most inferior point of the mandibular symphysis; Gonion, most posterior and inferior point of the angulus mandibulae, determined by bisecting the angle formed by the tangent to the posterior border of the mandible, and the mandibular plane; Spheno-ethmoidale, intersection between the anterior border of corpus of the Os sphenoidale with the inferior border of the Os ethmoidale.

The finished tracing was placed approximately in the middle of the scanning surface of a desktop scanner (Scanjet 8200; Hewlett Packard, Palo Alto, CA, USA). The resulting image file was imported into a software program (DigitizeIt 1.5.7, I. Bormann; Bormisoft, Braunschweig, Germany) to record the landmarks' coordinates using three calibration points, located on a transparent calibration sheet, which was included in the scan. The recorded coordinates were then grouped in Excel (2010; Microsoft Corporation, Redmond, WA, USA) for subsequent analysis in R (http://www.r-project.org), MorphoJ (Klingenberg, 2011) or Viewbox (dHal Software, Kifissia, Greece). Since radiographic magnification was the same for all lateral cephalograms, it was not accounted for.

Statistical analysis

All statistical analyses except the modularity test were programmed in R by the first author, and confirmed using Viewbox by the third author. To determine whether there were significant shape differences between male and female subjects, a permutation test was designed: the translated, scaled and rotated coordinates resulting from a pooled generalized Procrustes fit were used to calculate the Procrustes distance between the two groups' average configurations. Next, 1000 group pairs of the same size as the original male and female groups were created by randomly allocating the Procrustes coordinates to either group of each pair, without replacement. The number of group pairs exhibiting a larger Procrustes distance than the one between the original two groups, divided by 1000, served as the P-value for the significance of the findings. A similar permutation test was used to evaluate potential size differences by randomly permuting the log(centroid size) values (the natural logarithm of centroid size), without replacement. Finally, the male–female mean shape differences were revisited by rerunning the first permutation test while controlling for the effects of allometry. For this purpose, the residuals resulting from the pooled within-group regression of shape over centroid size were used.
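
The permutation scheme described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' R code: it assumes the configurations have already been superimposed by a pooled Procrustes fit, approximates the Procrustes distance between the two mean shapes by the Euclidean distance between the aligned mean configurations, and uses hypothetical function names and toy data throughout.

```python
import math
import random


def mean_shape(shapes):
    """Landmark-wise mean of already superimposed configurations."""
    n = len(shapes)
    return [(sum(s[i][0] for s in shapes) / n, sum(s[i][1] for s in shapes) / n)
            for i in range(len(shapes[0]))]


def mean_shape_distance(group_a, group_b):
    """Euclidean distance between the two group mean configurations."""
    mean_a, mean_b = mean_shape(group_a), mean_shape(group_b)
    return math.sqrt(sum((ax - bx) ** 2 + (ay - by) ** 2
                         for (ax, ay), (bx, by) in zip(mean_a, mean_b)))


def mean_difference(values_a, values_b):
    """Absolute difference in group means, e.g. for log(centroid size)."""
    return abs(sum(values_a) / len(values_a) - sum(values_b) / len(values_b))


def permutation_p_value(group_a, group_b, statistic, n_perm=1000, seed=0):
    """Share of random reallocations whose statistic exceeds the observed one."""
    rng = random.Random(seed)
    observed = statistic(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    larger = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reallocation without replacement
        if statistic(pooled[:n_a], pooled[n_a:]) > observed:
            larger += 1
    return larger / n_perm


def toy_group(shift, n, rng):
    """Hypothetical three-landmark shapes; the third landmark is shifted in x."""
    base = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
    return [[(x + rng.gauss(0, 0.01) + (shift if i == 2 else 0.0),
              y + rng.gauss(0, 0.01)) for i, (x, y) in enumerate(base)]
            for _ in range(n)]


toy_rng = random.Random(42)
males = toy_group(0.0, 12, toy_rng)
females = toy_group(0.2, 10, toy_rng)
p_shape = permutation_p_value(males, females, mean_shape_distance)
print(p_shape)
```

Passing the test statistic as a function lets the same routine serve both the shape comparison (`mean_shape_distance`) and the size comparison (`mean_difference` applied to log centroid sizes).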

In view of the relatively large age range in the experimental sample, it was deemed necessary to ascertain whether there were differences in the male and female ontogenetic shape trajectories: does craniofacial shape vary as a function of growth and development? If so, is this variation similar for male and female patients? Size, in this context, is used as an (admittedly poor) proxy for development. The approach proposed by Mitteroecker et al. (2004) was applied to the n × m matrix of stereometrically projected Procrustes coordinates X, whereby n is the number of rows, and m the number of columns. The vector of the ‘common allometric component’ (CAC), the component of shape change which is most closely aligned with size, was calculated as $a = X^{t}s/(s^{t}s)$, whereby s is a column matrix containing the logarithm of centroid size, and normalized as $a' = a/\sqrt{a^{t}a}$. The CAC could then be visualized relative to log(centroid size). The first residual component was calculated by first projecting out the CAC: $W = X(I - a'a'^{t})$, and then performing a singular value decomposition of $W^{t}W$ into $VD_{w}V^{t}$. The columns of V are the residual components, with scores XV. Additionally, the n × m matrix of stereometrically projected Procrustes coordinates X was augmented with s. Principal component analysis was applied to the resulting matrix, which allowed the plotting of the first PCs of the resulting form space in a three-dimensional plot.
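
The CAC computation lends itself to a short numerical sketch. The example below assumes the regression form a = XᵗS/(sᵗs) for the allometric vector, with both the shape matrix X and the log(centroid size) vector s mean-centred beforehand; these are assumptions of the sketch, as are all function names and the toy data. The final singular value decomposition of the residual matrix is omitted.

```python
import math


def centre_columns(matrix):
    """Subtract the column means from every row."""
    n = len(matrix)
    means = [sum(row[j] for row in matrix) / n for j in range(len(matrix[0]))]
    return [[row[j] - means[j] for j in range(len(means))] for row in matrix]


def common_allometric_component(X, s):
    """Unit-length CAC: a = X^t s / (s^t s), then a' = a / sqrt(a^t a)."""
    sts = sum(v * v for v in s)
    a = [sum(row[j] * si for row, si in zip(X, s)) / sts
         for j in range(len(X[0]))]
    norm = math.sqrt(sum(v * v for v in a))
    return [v / norm for v in a]


def project_out(X, a_unit):
    """W = X (I - a' a'^t): strip the CAC direction from every specimen."""
    W = []
    for row in X:
        score = sum(r * a for r, a in zip(row, a_unit))
        W.append([r - score * a for r, a in zip(row, a_unit)])
    return W


# Hypothetical toy data: 4 specimens, 3 shape variables, log(centroid size).
raw_shapes = [[2.0, -1.1, 0.3], [4.1, -2.0, 0.1],
              [5.9, -2.9, 0.2], [8.0, -4.0, 0.4]]
log_cs = [1.0, 2.0, 3.0, 4.0]

X = centre_columns(raw_shapes)
s = [row[0] for row in centre_columns([[v] for v in log_cs])]
cac = common_allometric_component(X, s)
cac_scores = [sum(r * a for r, a in zip(row, cac)) for row in X]
W = project_out(X, cac)  # residual matrix, orthogonal to the CAC
```

The `cac_scores` are what would be plotted against log(centroid size), while the rows of `W` carry only the shape variation left after removing the CAC direction.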

The craniofacial variance of the experimental sample was further scrutinized by applying principal component analysis to the covariance matrix of the pooled generalized partial Procrustes coordinates (Zelditch et al. 2004). The number of biologically interpretable (i.e. non-trivial) PCs was determined using the ‘Random average under permutation’ rule, as outlined by Peres-Neto et al. (2005): The variables in the data matrix were randomized within variables 1000 times, and a PCA was performed on each reshuffled data matrix. The average eigenvalues were then calculated and compared with the ones obtained. If the observed exceeded the average random value, that axis was perceived as non-trivial. The percentage of variation explained by each of the non-trivial PCs was also calculated.
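
The stopping rule can be illustrated with a self-contained sketch. It is an interpretation of the description above, not the authors' implementation: leading axes are counted as non-trivial until the first observed eigenvalue fails to exceed its permutation average, eigenvalues are obtained with a simple Jacobi rotation routine to avoid external dependencies, and the function names and toy data are hypothetical.

```python
import math
import random


def covariance_matrix(data):
    n, m = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(m)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
             for j in range(m)] for i in range(m)]


def symmetric_eigenvalues(matrix, sweeps=100, tol=1e-20):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, descending."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def retained_components(data, n_perm=1000, seed=0):
    """Count leading PCs whose eigenvalue exceeds the permutation average."""
    rng = random.Random(seed)
    observed = symmetric_eigenvalues(covariance_matrix(data))
    columns = [[row[j] for row in data] for j in range(len(data[0]))]
    totals = [0.0] * len(columns)
    for _ in range(n_perm):
        # Shuffle each variable independently to destroy the covariation.
        shuffled = [rng.sample(col, len(col)) for col in columns]
        permuted = [list(row) for row in zip(*shuffled)]
        for j, ev in enumerate(symmetric_eigenvalues(covariance_matrix(permuted))):
            totals[j] += ev
    averages = [t / n_perm for t in totals]
    retained = 0
    for obs, avg in zip(observed, averages):
        if obs <= avg:
            break
        retained += 1
    return retained, observed, averages


# Hypothetical toy data: four variables driven by one shared factor.
toy_rng = random.Random(1)
toy = []
for _ in range(30):
    factor = toy_rng.gauss(0, 1)
    toy.append([factor + toy_rng.gauss(0, 0.1) for _ in range(4)])

retained, observed, averages = retained_components(toy, n_perm=200)
print(retained)
```

Because shuffling within variables leaves each variance untouched, only the covariance structure is randomized, which is what makes the permutation averages a suitable baseline for the observed eigenvalues.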

Next, the hypothesis of modularity was tested with the MorphoJ software package (Klingenberg, 2011), using the methodology proposed by Klingenberg (2009). Four scenarios involving two modules were considered, one with three modules, and one involving four modules. The same, pruned adjacency graph (Fig. 2b) was used for all scenarios, constructed beforehand using Delaunay triangulation (Delaunay, 1934), whereby connections that did not pass over continuous (skeletal) tissue were omitted (red lines in Fig. 2a) and, where needed, an additional diagonal was added to quadrilaterals (Klingenberg, 2009) (red lines in Fig. 2b).

Figure 2.

Adjacency graphs, constructed using Delaunay triangulation (Delaunay, 1934). The original adjacency graph is depicted in (a). The red lines in (a) indicate connections that were removed in (b) because they did not pass over continuous (skeletal) tissue (Klingenberg, 2009). The red lines in (b) represent diagonals that were added to selected quadrilaterals from (a) (Basion to Sella, Articulare to Spheno-ethmoidale, Gonion to Point A, and PNS to ANS) (Klingenberg, 2009).
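
How an adjacency graph can enforce spatial contiguity is easy to show in code. The sketch below uses a small, hypothetical edge list (not the graph of Fig. 2), prunes one connection and adds one diagonal, and then checks with a breadth-first search whether a candidate module forms a connected set of landmarks. All names are illustrative.

```python
from collections import deque


def build_adjacency(edges):
    """Turn a collection of undirected edges into a neighbour lookup."""
    graph = {}
    for edge in edges:
        a, b = tuple(edge)
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def is_contiguous(module, graph):
    """True if all landmarks in the module are linked through module members."""
    module = set(module)
    if not module:
        return False
    start = next(iter(module))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in module and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == module


# Hypothetical toy triangulation over six landmarks.
original_edges = {frozenset(e) for e in [
    ("S", "N"), ("S", "Ar"), ("Ar", "Ba"), ("Ba", "Go"),
    ("Ar", "Go"), ("Go", "Me"), ("N", "Me"),
]}

# Prune a connection that does not pass over skeletal tissue, add a diagonal.
pruned_edges = (original_edges - {frozenset(("Ba", "Go"))}) | {frozenset(("Ba", "S"))}

original_graph = build_adjacency(original_edges)
pruned_graph = build_adjacency(pruned_edges)
```

In this toy graph, Basion and Gonion form a contiguous pair before pruning but not afterwards, so a random partition containing only those two landmarks as a module would be rejected as spatially discontinuous.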

Figure 3a depicts the structural subdivision, as used in the modularity scenarios involving two or three modules. In case of two modules, two of the three substructures were combined into one. Figure 3b represents the subdivision used when testing the counterpart principle, while Fig. 3c visualizes the location of the subdivisions in the four-module scenario.

Figure 3.

Subdivisions used during modularity hypothesis testing. The subdivisions in (a) were used either separately or combined in modularity scenarios 1, 2, 3 and 5. (b) The subdivisions employed when testing for the counterpart principle (modularity Scenario 4). (c) The four sub-modules which were considered in modularity Scenario 6. (Scenario 1) The presence of two modules (a): the skull base (in red) and maxilla (in green), vs. the mandible (in blue). (Scenario 2) The presence of two modules (a): the skull base and mandible (in red and blue, respectively) vs. the maxilla (in green). (Scenario 3) The presence of two modules (a): the skull base (in red) vs. the mandible and maxilla (in blue and green, respectively). (Scenario 4) The counterpart principle (b): the anterior vs. posterior module (in red and blue, respectively). (Scenario 5) The presence of three different modules (a): the skull base (in red), mandible (in blue) and maxilla (in green). (Scenario 6) Combining Scenarios 3 and 4: the four-module scenario (c). Note: The colors were used only to discriminate the various modules and are therefore not necessarily structurally consistent throughout the various scenarios depicted.

To correct for the effects of allometry, the modularity test was performed using the residuals resulting from a pooled within-group regression of shape over centroid size. The (multi-)RV coefficient (Klingenberg, 2009) could then be calculated for each scenario, to be compared with the corresponding value of randomly generated alternative subdivisions into two to four spatially continuous modules. These modules would contain the same number of points as the corresponding modules in the proposed subdivision, the adjacency graph serving as an algorithmic tool to assure spatial continuity in these alternative modules. Each round of GPA superimpositions was performed using the simultaneous-fit approach (i.e. the superimpositions were performed while maintaining relative size among the modules). The multi-RV coefficient was used when testing for the presence of three or four modules, whereas the original RV coefficient was employed when only two modules were involved. All possible continuous alternative subdivisions were generated for comparison with the respective proposed modularity scenarios. The number of alternative subdivisions exhibiting a (multi-)RV coefficient lower than the proposed one, divided by the total number of alternative subdivisions, was recorded as the P-value for the significance of the finding.
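
The quantities involved can be sketched as follows. The code assumes Escoufier's definition of the RV coefficient, RV = trace(S12 S21) / sqrt(trace(S1 S1) trace(S2 S2)), computed from the covariance blocks of two sets of variables, and takes the multi-set version to be the mean of all pairwise RV coefficients, which is our reading of Klingenberg (2009). It is not MorphoJ's implementation; function names and toy data are hypothetical.

```python
import math
from itertools import combinations


def centre(block):
    """Subtract the column means from every row of a data block."""
    n = len(block)
    means = [sum(row[j] for row in block) / n for j in range(len(block[0]))]
    return [[row[j] - means[j] for j in range(len(means))] for row in block]


def covariance(A, B):
    """Cross-covariance matrix A^t B / (n - 1) of two centred blocks."""
    n = len(A)
    return [[sum(A[k][i] * B[k][j] for k in range(n)) / (n - 1)
             for j in range(len(B[0]))] for i in range(len(A[0]))]


def sum_of_squares(matrix):
    return sum(v * v for row in matrix for v in row)


def rv_coefficient(block1, block2):
    """Escoufier's RV coefficient between two blocks of variables."""
    X, Y = centre(block1), centre(block2)
    numerator = sum_of_squares(covariance(X, Y))  # trace(S12 S21)
    denominator = math.sqrt(sum_of_squares(covariance(X, X)) *
                            sum_of_squares(covariance(Y, Y)))
    return numerator / denominator


def multi_rv(blocks):
    """Mean of all pairwise RV coefficients (assumed multi-set definition)."""
    pairs = list(combinations(blocks, 2))
    return sum(rv_coefficient(a, b) for a, b in pairs) / len(pairs)


def modularity_p_value(n_lower, n_alternatives):
    """Share of alternative subdivisions with a lower (multi-)RV coefficient."""
    return n_lower / n_alternatives


# Hypothetical toy blocks: block_b is an exact rescaling of block_a.
block_a = [[1.0, 0.5], [2.0, 1.5], [3.0, 1.0], [4.0, 3.0]]
block_b = [[2.0 * v for v in row] for row in block_a]
u = [[1.0], [-1.0], [1.0], [-1.0]]
v = [[1.0], [1.0], [-1.0], [-1.0]]
```

A low RV coefficient relative to the alternative subdivisions indicates weak covariation between the proposed modules. Applied to the counts reported for the skull base vs. mandibulomaxillary scenario, `modularity_p_value(8, 265)` gives approximately 0.030.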

Error analysis

The digitizing procedure was repeated by the same author (H.L.L.W.) for 15 randomly selected cases, at least 2 weeks apart. Statistical significance was determined using a Procrustes analysis of variance.

Results

The error analysis did not reveal any statistically significant differences between the first and second digitizing round (P = 0.175) (Table 1).

Table 1. Summary of the ANOVA results for the error experiment

Centroid size
Effect        SS           MS            df     F       P
Individual    0.120454     0.120454      1      0       0.9706
Residual      2.439        87.120836     28

Shape
Effect        SS           MS            df     F       Pillai trace    P
Individual    0.0004516    1.73693E-05   26     0.13    0.97            0.1753
Residual      0.100258     0.000137717   728

The male and female average configurations, calculated from a pooled generalized Procrustes fit, are depicted in Fig. 4. Very subtle differences can be observed, mainly at the Articulare and spheno-ethmoidale landmarks. These shape differences were not significant, as indicated by the permutation test P-value of 0.33. Evidence of sexual dimorphism in size was found: the male patients' centroid sizes were found to be significantly larger (P < 0.001). None of the second permutation test's resampled datasets were found to have a larger difference in log(centroid size) values.

Figure 4.

Mean configurations resulting from the pooled generalized Procrustes analysis. Subtle differences can be observed at the Articulare and Spheno-ethmoidale landmarks.

Rerunning the permutation test while controlling for the effects of allometry revealed modest, but highly significant, male–female mean shape differences (P < 0.001, Fig. 5a). These were exaggerated three times for visualization purposes in Fig. 5b. Apart from obvious differences at the Articulare and spheno-ethmoidale landmarks, it appears females were slightly more orthognathic and dolichofacial.

Figure 5.

(a) The male–female mean shape differences when controlling for the effects of allometry. Apart from obvious differences at the Articulare and Spheno-ethmoidale landmarks, it appears females were slightly more orthognathic and dolichofacial. (b) The differences exaggerated three times.

Two approaches were employed to assess possible divergences in the male and female ontogenetic allometric signals. The first method involved calculating the common allometric component (CAC: that component of shape change which most closely aligns with growth and development), as well as the residual shape components (RSC) (Mitteroecker et al. 2004). The first RSC, plotted relative to the CAC in Fig. 6, suggests that both sexes go through very similar ontogenetic shape changes: the shape trajectories are very similar, if not identical.

Figure 6.

Two-dimensional projection of the rotated 3D scatterplot, representing the common allometric component (CAC, y-axis) vs. the first residual shape component (RSC1, x-axis). The sexual ontogenetic allometric trajectories seem to largely coincide.

When plotting the CAC vs. log(centroid size) in Fig. 7, a divergence in the male and female ontogenetic allometric signals could be observed. The statistical summary for the linear regression (Table 2) indicated that the slope for the male subgroup was not significant, probably due to the presence of several outliers in the male points cluster.

Table 2. Summary of the linear regression results for the common allometric component (CAC) scores vs. log(centroid size). The male regression line slope was not significant due to the presence of several outliers

              Estimate    SE          t-value     P-value
Male group
Intercept     0.9999      6.24E-05    16016.57    < 0.001
Slope         1.81E-05    1.16E-05    0.95        0.346
Female group
Intercept     0.9998      4.10E-05    24417.46    < 0.001
Slope         4.30E-05    7.67E-06    5.61        < 0.001

Figure 7.

Scatterplot of the common allometric component (CAC, y-axis) vs. the natural logarithm of centroid size (x-axis). The male regression line was non-significant (Table 2) due to the presence of several outliers.

Upon removing the four most extreme male outliers, the recalculated slope was found to be significant (Table 3 and Fig. 8), but the associated regression line still diverged from the female one.

Table 3. Summary of the linear regression results for the common allometric component (CAC) scores vs. log(centroid size), after removing the four most extreme male outliers. The slopes of both regression lines were now significant

              Estimate    SE          t-value     P-value
Male group
Intercept     0.9999      4.79E-05    20859.23    < 0.001
Slope         1.81E-05    8.90E-06    2.04        0.0454*
Female group
Intercept     0.9998      4.10E-05    24417.46    < 0.001
Slope         4.30E-05    7.67E-06    5.61        < 0.001

Significance: *P < 0.05.

Figure 8.

Scatterplot of the common allometric component (CAC, y-axis) vs. the natural logarithm of centroid size (x-axis), with the four most extreme male outliers removed. The slope of the male regression line was now significant (Table 3).

Intriguingly, the CAC was found to represent a vector of pure size change, with no clearly identifiable shape change (Fig. 9).

Figure 9.

The common allometric component (CAC), visualized using the male Procrustes mean shape, with representations plus and minus 0.1 along the component axis. The CAC seems to represent a vector of almost pure size change, with little to no observable concomitant shape change.

The second method involved augmenting the matrix of Procrustes coordinates with a column matrix holding the log(centroid size) values, and subsequently performing a principal component analysis of the resulting matrix, pre-multiplied by its transpose. This allowed the first three PCs of the resulting Procrustes form space to be plotted in a three-dimensional plot (Fig. 10a,b). The resulting point scatter was quite spherical, with broad regions of overlap between the male and female point clusters. In the PC1–PC2 view of the resulting 3D plot (Fig. 10a), there was a clearly discernible divergence in the male and female ontogenetic allometric signals. In smaller individuals, females tended to have higher PC2 scores in comparison with males, and vice versa for larger individuals. In contrast, the PC1–PC3 view revealed almost parallel trajectories (Fig. 10b). A bootstrap test was designed to confirm these visual impressions (10 000 iterations, with replacement). In the PC1–PC2 view, the angle between the male and female trajectories (0.257 radians) was found to be significant (P-value: 0.027; 95% confidence interval: −0.249 to 0.259 radians), contrary to the PC1–PC3 view (P-value: 0.406; 95% confidence interval: −0.164 to 0.169 radians).

Figure 10.

Three-dimensional plot of the first three principal components in form space. The red spheres represent males, the blue spheres, females. The black line indicates the direction of pure size change (Mitteroecker et al. 2004), which should be more or less parallel to the first PC. The red and blue lines represent respectively the three-dimensional male and female ontogenetic trajectories.

In view of the subtlety of the male–female average shape differences and the rather spherical nature of the above point clouds, we opted nevertheless to pool males and females for the principal component analysis, performed in shape space. Based upon Peres-Neto's ‘Random average under permutation’ stopping rule (Peres-Neto et al. 2005), the first four principal components were found to be biologically interpretable (P < 0.001). These are depicted in Fig. 11(a–d). Together, the four PCs account for 59.45% of the total variance in the sample, ranging from 29.56 (first PC) to 6.47% (fourth PC). Their biological interpretation is provided in the Discussion section.

Figure 11.

Visualization of the four retained biologically interpretable principal components, in shape space. The black wireframes represent the positive deformation of the male Procrustes mean shape along the PC axis. The red wireframes are the negative deformations. The percentage of variability explained by PC 1 through 4 is 29.7, 15.9, 7.68, and 6.5%, respectively.

Although there were no significant differences in mean shape between males and females, static allometry could still influence the modularity hypothesis test (Klingenberg, 2009). Since the male and female patients clearly differed in size, the modularity test was performed using the residuals of a pooled within-group regression of shape over size. The original adjacency graph, constructed using Delaunay triangulation, is visualized in Fig. 2a and the corrected adjacency graph, used for all modularity scenarios, in Fig. 2b. The connections between Basion and Gonion, ANS and Pogonion, as well as between ANS and Point B were removed, and four diagonals were added: Articulare to Spheno-ethmoidale, Gonion to Subspinale (Point A), Basion to Point S and ANS to PNS. The subdivisions associated with the modularity scenarios are depicted in Fig. 3(a–c).

The results of testing the hypothesis of modularity are listed in Table 4. The RV coefficient for the subdivision into two modules, the skull base vs. the mandibulomaxillary complex, proved significant (Fig. 3a; the mandible and maxilla were combined into one structure for testing) (P < 0.05, Table 4). The same holds true for the subdivision representing the counterpart principle (Fig. 3b) (P = 0.049, Table 4). The modularity scenario involving four modules proved significant as well (Fig. 3c) (P < 0.05). The modularity hypothesis was rejected when considering the skull base, mandible and maxilla separately, as well as when combining the skull base with the maxilla, or with the mandible (Table 4).

Table 4. Results for the modularity hypothesis test using the (multi-)RV coefficient. The counterpart modularity scenario is visualized in Fig. 3b and the scenario involving four modules in Fig. 3c

No. of modules    Skeletal parts in modules    (Multi-)RV coef.    No. of alt. subdivisions    No. of alt. subdivisions with lower (multi-)RV    P-value
2                 SkB + Mx, Mnd                0.666846            265                         224                                               0.845
2                 SkB + Mnd, Mx                0.458984            61                          27                                                0.443
2                 SkB, Mx + Mnd                0.495572            265                         8                                                 0.030*
2                 Counterparts                 0.537199            305                         15                                                0.049*
3                 SkB, Mx, Mnd                 0.445382            545                         88                                                0.162
4                 Four modules                 0.365604            608                         23                                                0.038*

Mnd, mandible; Mx, maxilla; SkB, skull base.
Significance: *P < 0.05.

Discussion

One of the aims of this study was to characterize craniofacial variation in a large, preferably unselected, sample of orthodontic patients. To sample realistically the highly variable contemporary (orthodontic patient) population, inclusion criteria need to be limited in scope and number. This might in turn lead to differences in the age and sex distribution of the experimental sample. It is important to consider the relevance of any such differences to the planned principal component analysis or the modularity hypothesis test. While the first permutation test found no statistically significant differences in the male and female Procrustes mean shapes (Fig. 4), rerunning the permutation test while controlling for the effects of allometry revealed highly significant, albeit surprisingly subtle, differences (Fig. 5a). In view of the rather liberal inclusion criteria used and the heterogeneous nature of the resulting experimental group, the similarity between the mean configurations is striking.

The subtle nature of the male–female mean shape differences in the current study was an unexpected finding, since marked sexual dimorphism has frequently been reported, both in size and shape (Ursi et al. 1993; Humphrey, 1998; Rosas & Bastir, 2002; Bulygina et al. 2006). Bulygina et al. (2006) reported male–female size differences in the anterior part of the neurocranium already 1 year after birth (or earlier) which remained constant during growth, confirming earlier results by Ursi et al. (1993), who found the anterior cranial base to be larger in males from 6 years of age. In the early stages of growth, males tended to exhibit a more profound cranial base flexion, relatively smaller faces, and larger frontal bones (Bulygina et al. 2006). In the subsequent years this reversed, until at 6–12 years of age, the midline shapes of both sexes were very similar (Bulygina et al. 2006). The maxillary and mandibular position seemed to be dimorphic at any age, while their effective lengths did not exhibit male–female differences until about 9 years of age (Ursi et al. 1993).

With regard to adults, most authors seem to agree on the presence of facial dimorphism, mainly as a consequence of male hypermorphosis (Enlow, 1990; Ursi et al. 1993; Humphrey, 1998; Rosas & Bastir, 2002; Bulygina et al. 2006): the female growth spurt slows down at about 13 years of age (Enlow, 1990; Bulygina et al. 2006) while male pubertal growth peaks at 15 years of age (Dean et al. 2000), an age at which female growth is usually complete (Bulygina et al. 2006). Since the cranial base matures completely at about 11–12 years of age (Bastir et al. 2006), remaining craniofacial growth is spatially and functionally limited to the masticatory and facial structures (Enlow, 1990; Humphrey, 1998; Lieberman et al. 2000b, 2008; Bastir & Rosas, 2006; Gkantidis & Halazonetis, 2011).

In view of the large age range of the experimental sample, it was deemed interesting to further scrutinize craniofacial variance with regard to ontogenetic allometric differences. Apart from purely allometric shape changes occurring in the pooled sample (in the case of more or less parallel ontogenetic trajectories), the ontogenetic trajectories of males and females might also diverge (Mitteroecker et al. 2004). Both scenarios might necessitate a separate analysis of younger vs. older patients and/or males vs. females, with regard to the PCA and the modularity hypothesis test.

As suggested by Mitteroecker et al. (2004), the first three components of the data decomposition were visualized simultaneously. These were further analyzed by providing two-dimensional projections of the most relevant rotations of the 3D plot. Figure 6 shows a scatter plot of the common allometric component (CAC) vs. the first residual shape component (RSC1), illustrating ontogenetic shape changes. These components are the first and second PC, resulting from the principal component analysis, performed in shape space. The male and female trajectories are remarkably similar: if not identical, they are virtually parallel. This would seem to indicate that within the growth period studied, developing males and females go through almost identical shape changes. These findings align with those of Viðarsdóttir et al. (2002), who found no discernible divergence in the male and female ontogenetic shape trajectories in any of the 10 populations they studied (at least for those that contained enough sexed males and females to draw this conclusion). It should be pointed out that the latter study, as well as the current one, was cross-sectional in nature, which would seem to limit the potential to pick up subtle ontogenetic shape trajectory variations. This might explain why Bulygina et al. (2006), using the longitudinal Denver Growth Study data, were able to demonstrate that the sexual ontogenetic shape trajectories are roughly parallel until the beginning of puberty, but differ in direction thereafter.

The plot of the CAC vs. log(centroid size) in Fig. 7 to visualize the allometric growth trajectories seems to confirm the common notion that males and females go through craniofacial shape changes at different rates. The female regression line is considerably steeper than the male one. Since the male regression line was found to be non-significant (Table 2), the four most extreme male outliers were removed and the regression line recalculated (Fig. 8). Although the male slope was now found to be significant (Table 3), the still steeper female slope confirmed that females on average tend to reach their adult size/shape earlier, whereas the male growth spurt, apart from exhibiting a delayed onset relative to females (Enlow, 1990), takes much longer to complete, especially in the mandibular and maxillary region (Mitani & Sato, 1992; Bastir et al. 2006). Performing a linear regression of the CAC on log(centroid size) was considered appropriate here, since the methodology proposed by Mitteroecker et al. (2004) specifically separates shape changes associated with allometry (CAC) from those that are not (RSCs). Even if one does not accept the use of regression lines in this scenario due to the multivariate nature of shape or the somewhat spherical nature of the point scatter, Figs 7 and 8 clearly contain evidence of allometric differences, the female scatter being located more to the top left of the graph.
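The comparison of the male and female regression lines can be illustrated with a minimal sketch of an ordinary least-squares fit of CAC scores on log(centroid size), performed separately per sex. The function name `ols_line` and all numerical values below are hypothetical placeholders chosen for illustration; they are not data from this study, and the sketch does not reproduce the significance tests reported in Tables 2 and 3.

```python
from statistics import mean

def ols_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical toy values (assumption): log(centroid size) and CAC scores per sex.
log_cs_f = [5.00, 5.05, 5.10, 5.15]
cac_f = [-0.020, -0.005, 0.012, 0.025]
log_cs_m = [5.05, 5.10, 5.20, 5.30]
cac_m = [-0.015, -0.010, 0.000, 0.010]

slope_f, _ = ols_line(log_cs_f, cac_f)
slope_m, _ = ols_line(log_cs_m, cac_m)

# A steeper female slope means more allometric shape change per unit of log size.
print(f"female slope = {slope_f:.3f}, male slope = {slope_m:.3f}")
```

In this toy example the female slope exceeds the male slope, which is the pattern described for Figs 7 and 8; with real data the slopes would of course need to be accompanied by a test of their difference.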

Intriguingly, the CAC seems to represent a vector of almost pure size change, with little or no identifiable accompanying shape change (Fig. 9). This might again be explained by the fact that the bulk of the experimental sample fell within the age range for which Bulygina et al. (2006) demonstrated a surprising similarity in the midline cranial shape (8–12 years). Rosas & Bastir (2002) studied allometry and sexual dimorphism in two groups of 55 adult males and females. In terms of male–female shape differences, they found males to exhibit a relative forward angulation of the nasal bones, with a more pronounced glabella. The latter finding has also been reported by Bulygina et al. (2006). Furthermore, a downward rotation of the anterior nasal floor was noted, as well as a more retro-positioned symphysis, leading to a more pronounced chin in comparison with females, who were more protrusive at the dento-alveolar level (Rosas & Bastir, 2002). Increased antegonial notching, as well as an antero-inferior displacement of the gonial angle and a more forward condylar position were also reported (Rosas & Bastir, 2002). With respect to allometric shape differences, smaller individuals exhibited a vertical decrease in the maxillary alveolar process and mandibular ramus, leading to a more retrognathic profile (Rosas & Bastir, 2002). The glabellar region moved slightly back and the occipital clivus was displaced downward, relative to large individuals (Rosas & Bastir, 2002).

As explained by Mitteroecker et al. (2004), the Procrustes form space can also be produced by performing a PCA on the matrix of shape coordinates, augmented with a column vector holding the log(centroid size) values, thus allowing visualization of the first three principal components in a 3D plot (Fig. 10a,b). In this scenario, log(centroid size) is part of the eigenanalysis, and usually largely dominates the first PC (Mitteroecker et al. 2004). This can be readily observed in Fig. 10(a,b), which in the PC1–PC2 view (top graph) shows a clearly discernible divergence in the male and female ontogenetic allometric signals, contrary to almost parallel trajectories in the PC1–PC3 view (bottom graph). Although a bootstrap permutation test confirmed these findings, the spherical nature of the male and female point clouds and their poor separation seem to limit the conclusions that can be drawn from them. As evident from both views in Fig. 10, the most important male–female separating characteristic remains dimorphism in size.
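The construction of Procrustes form space described above can be sketched as follows, assuming the shape coordinates are already available as a specimens-by-coordinates array. The function name `form_space_pca` and the random input are illustrative assumptions only; the random numbers stand in for real Procrustes coordinates and centroid sizes and carry no biological meaning.

```python
import numpy as np

def form_space_pca(shape_coords, centroid_size):
    """PCA of shape coordinates augmented with a log(centroid size) column."""
    # Append log(centroid size) as one extra variable (form space).
    X = np.column_stack([shape_coords, np.log(centroid_size)])
    Xc = X - X.mean(axis=0)
    # SVD of the centred data: rows of vt are the principal axes.
    u, s, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = u * s
    explained = s**2 / np.sum(s**2)
    return scores, vt, explained

# Placeholder data (assumption): 30 specimens, 3 two-dimensional landmarks.
rng = np.random.default_rng(0)
shape = rng.normal(scale=0.01, size=(30, 6))
cs = rng.uniform(100.0, 200.0, size=30)

scores, axes, explained = form_space_pca(shape, cs)
# Loading of log(centroid size) on PC1; close to 1 in absolute value when size dominates.
print(abs(axes[0, -1]), explained[0])
```

Because shape coordinates vary on a much smaller numerical scale than log(centroid size), the size column tends to dominate the first axis, which is consistent with the behaviour of PC1 noted in the text.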

The lack of a clear correlation between the CAC and log(centroid size) in Figs 7 and 8 may be explained, in part, by the cross-sectional nature of the experimental sample. Since the various stages of craniofacial development are represented by different individuals, additional between-individual variance is introduced at each age (Bulygina et al. 2006; Polanski, 2011). Additionally, log(centroid size) might be considered to be a poor proxy for development. It is therefore perfectly conceivable that small, precocious individuals as well as larger ones with a somewhat delayed craniofacial development were misclassified in Figs 7 and 8. Size and dental eruption staging are quite often the only measures available for estimating age in anthropological specimens; limitations that do not apply to the current experimental sample. However, several studies in modern Homo considering calendar age, hand-wrist radiographs, standing height or dental maturation have alluded to the poor correlation between the latter and craniofacial development. Additionally, it has been suggested that the growth spurt in standing height does not always coincide perfectly with that of the cranium, or that the latter growth spurt might be modular in nature, affecting some structures earlier or later than others (Mitani & Sato, 1992). Although hand-wrist radiographs and standing height measurements were not available for the current sample, the cervical vertebrae maturation index (CVM) might have been used. This index attempts to quantify the remaining adolescent growth using the morphology of the first four cervical vertebrae, and has been suggested as an alternative to the use of hand-wrist radiographs. However, some recent publications have questioned the reproducibility of the CVM technique (Chatzigianni & Halazonetis, 2009; Fudalej & Bollen, 2010; Nestman et al. 2011).

The results for the (pooled) principal component analysis are depicted in Fig. 11(a-d). The number of biologically interpretable PCs was determined using the ‘Random average under permutation’ stopping rule, proposed by Peres-Neto et al. (2005). This more robust approach was preferred over the use of the ‘5% of explained variation’ rule of thumb or the screeplot approach (Zelditch et al. 2004). The 5% rule is quite arbitrary, whereas screeplots do not necessarily exhibit a readily distinguishable abrupt change in curvature (Zelditch et al. 2004). The latter is required to make a clear-cut decision on which PCs precede this point and can therefore be regarded as being biologically pertinent.
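The logic of a permutation-based stopping rule can be sketched as below. This is a simplified illustration under stated assumptions, not the exact procedure of Peres-Neto et al. (2005) or the implementation used here: values are shuffled independently within each variable, eigenvalues of the covariance matrix are averaged over permutations, and leading components are retained as long as the observed eigenvalue exceeds the random average. The function names, the number of permutations and the toy data are all hypothetical.

```python
import numpy as np

def eigenvalues(X):
    """Eigenvalues of the covariance matrix of X, in descending order."""
    return np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]

def n_components_avg_rnd(X, n_perm=199, seed=0):
    """Count leading PCs whose eigenvalue exceeds the mean eigenvalue under permutation."""
    rng = np.random.default_rng(seed)
    observed = eigenvalues(X)
    random_sum = np.zeros_like(observed)
    for _ in range(n_perm):
        # Shuffle each column independently to destroy covariation among variables.
        shuffled = np.column_stack([rng.permutation(col) for col in X.T])
        random_sum += eigenvalues(shuffled)
    exceeds = observed > random_sum / n_perm
    # Stop at the first component that fails the comparison.
    return len(exceeds) if exceeds.all() else int(np.argmin(exceeds))

# Toy data (assumption): five variables driven by one shared factor plus small noise.
rng = np.random.default_rng(1)
factor = rng.normal(size=(100, 1))
X = factor @ np.ones((1, 5)) + rng.normal(scale=0.1, size=(100, 5))
print(n_components_avg_rnd(X))
```

With a single shared factor, only the first component is expected to exceed its random counterpart, since shuffling preserves each variable's variance but removes the covariance that inflates the leading eigenvalue.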

According to the ‘Random average under permutation’ rule, the first four PCs are biologically interpretable (P < 0.001). It should be noted that although the first principal component (i.e. the vector of maximal variance) could be argued to have a clear biological justification, the subsequent ones are constrained to be orthogonal. As such, their individual biological interpretation requires some caution. The first PC (Fig. 11a) seems to deal mainly with vertical effects: hyperdivergency (in red) vs. hypodivergency (in black), with the accompanying decrease or increase in relative facial depth, respectively. Some rotation of the skull base can also be observed. The second PC (Fig. 11b) represents Class II vs. Class III skeletal patterns, with mandibular retrognathism and maxillary prognathism in black, and the exactly opposite arrangement in blue. Equally intriguing is the skull base rotation which can be found in both PC1 and PC2, and which seems to confirm the conclusion of Kuroe et al. (2004) that skull base rotation might be more important than skull base flexure in reference to different growth patterns. Additionally, the vertical facial patterns in the first PC seem to be associated with pure skull base rotation, whereas the second PC suggests sagittal discrepancy to be correlated with the length of the anterior and posterior skull base as well, which confirms the results from Kerr & Adams (1988), and Kuroe et al. (2004). These first two PCs also compare very favorably to the results of Halazonetis (2004), who reported strikingly similar morphological associations. For the second PC, however, he reported a relative superior/inferior positioning of the skull base, as opposed to the currently found combined skull base rotation and lengthening/shortening. The third PC (Fig. 11c) seems to pertain to the length of the mandibular ramus, represented by the vertical position of the Gonion landmark, as well as a rotation of the posterior skull base. The third PC skull base variation is somewhat more pronounced than that found by Halazonetis (2004), whereas a very slight vertical maxillary displacement in his study was found to be a slight maxillary rotation in the current one. Total relative skull base rotation around Sella as well as maxillary length seem to be the next most important areas of variation, as evidenced by the fourth PC (Fig. 11d).

The original and corrected adjacency graphs are depicted in Fig. 2a and 2b, respectively. The latter graph is missing the connections between Basion and Gonion, as well as those between ANS and Pogonion, and between ANS and Point B. These were removed since they were located ‘outside of the skeletal structures of interest’, as recommended by Klingenberg (2009), who also proposed removing connections between structures that do not typically interact, and adding a second diagonal to quadrilaterals, where needed (Basion to Sella, Articulare to Spheno-ethmoidale, Gonion to Point A, and PNS to ANS in Fig. 2b). Although personal considerations and/or preferences might come into play here, pilot studies indicated these modifications did not influence the result of the modularity hypothesis test; however, static allometry might (Klingenberg, 2008, 2009). As pointed out by Goswami & Polly (2010), Procrustes analysis standardizes size but does not remove the component of shape which is correlated with size, potentially creating an appearance of complete integration, and masking modularity. Since the second permutation test indicated that the male and female subgroups were significantly different in size, we opted to test the hypothesis of modularity using the shape coordinates resulting from a pooled within-group regression of shape over centroid size.
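The idea behind a pooled within-group regression can be sketched as follows: size and shape variables are centred on their own group (sex) means before the slopes are estimated, so that between-group differences in mean size do not contaminate the allometric estimate. The function names are hypothetical, the data are toy values, and the way residuals are formed here (subtracting the fitted allometric effect relative to the grand mean size) is one possible convention stated as an assumption, not necessarily the one used by the software employed in this study.

```python
from collections import defaultdict

def pooled_within_group_slopes(size, shapes, groups):
    """Per-coordinate slopes estimated from group-centred size and shape values."""
    members_by_group = defaultdict(list)
    for i, g in enumerate(groups):
        members_by_group[g].append(i)
    n_coords = len(shapes[0])
    sxx = 0.0
    sxy = [0.0] * n_coords
    for members in members_by_group.values():
        x_bar = sum(size[i] for i in members) / len(members)
        y_bar = [sum(shapes[i][j] for i in members) / len(members)
                 for j in range(n_coords)]
        for i in members:
            dx = size[i] - x_bar
            sxx += dx * dx
            for j in range(n_coords):
                sxy[j] += dx * (shapes[i][j] - y_bar[j])
    return [s / sxx for s in sxy]

def size_corrected(size, shapes, groups):
    """Remove the pooled within-group allometric effect (assumed convention)."""
    slopes = pooled_within_group_slopes(size, shapes, groups)
    grand_mean = sum(size) / len(size)
    return [[y - b * (x - grand_mean) for y, b in zip(row, slopes)]
            for x, row in zip(size, shapes)]

# Toy data (assumption): both groups share a slope of 2 in the first coordinate.
size = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
groups = ["F", "F", "F", "M", "M", "M"]
shapes = [[2.0, 1.0], [4.0, 1.0], [6.0, 1.0], [18.0, 1.0], [20.0, 1.0], [22.0, 1.0]]

print(pooled_within_group_slopes(size, shapes, groups))
print(size_corrected(size, shapes, groups))
```

In the toy example the common within-group slope is recovered despite the large offset between groups, whereas a single regression over the pooled, uncentred data would mix the group difference into the slope.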

Of the modularity scenarios involving the skull base, mandible and maxilla, either separately or combined (modularity scenarios 1, 2, 3 and 5), only the hypothesis involving the skull base vs. the mandibulomaxillary complex turned out to be significant (Table 4). In all, 265 possible alternative (continuous) subdivisions were tested, of which only eight (3.0%) exhibited a lower RV coefficient than the corresponding value of the original subdivision. Hence the latter's RV coefficient is located far enough in the left tail of the distribution to be considered significant. This would seem to confirm the frequently reported poor correlation between the midline cranial base and the face (Lieberman et al. 2000b; Bastir & Rosas, 2006; Polat & Kaya, 2007; Proff et al. 2008). Indeed, Gkantidis & Halazonetis (2011) found the correlation between the midline cranial base and the face to decrease into adulthood, contrary to the lateral cranial base. The subdivision into an anterior and posterior module (Fig. 3b) mimicking the counterpart principle (Enlow et al. 1969; Enlow, 1990; Enlow & Hans, 1996), was significant as well (P = 0.049, Table 4). This provides additional evidence, albeit not exceptionally strong, for the presence of an anterior and posterior craniofacial column, which could be regarded as two vertically oriented modules, each consisting of highly integrated sub-modules: the anterior column consisting of the anterior skull base, the ethmomaxillary complex and the mandibular corpus, and the posterior column of the posterior skull base and the mandibular ramus (Enlow et al. 1969; Enlow, 1990; Enlow & Hans, 1996). The presence of sub-modules within the previously outlined framework was confirmed using the sixth modularity scenario (Fig. 3c), which considered four modules (P = 0.038, Table 4). Interestingly, one of the sub-modules involves two mandibular ramal landmarks and a maxillary landmark, which could be regarded as delimiting pharyngeal space. These were previously found to predict vocal tract dimensions independently of cranial base flexion in a longitudinal sample of Homo sapiens (Lieberman & McCarthy, 1999). Although some evidence has been found that skull base angulation may be constrained by pharyngeal restructuring prenatally (Jeffery, 2005), the latter structure's growth peaks only after the ossification of most sphenoidal synchondroses, and is therefore correlated more strongly with mandibular and maxillary landmarks (Lieberman & McCarthy, 1998). Dento-alveolar modularity was not considered in this study. Since this was a two-dimensional landmark-based investigation, although the anterior limits of the dento-alveolar regions could be pinpointed with relative accuracy, the posterior limits were often very hard to locate reliably due to cephalometric superimposition and/or differential enlargement of the bilateral landmarks. Also, due to the heterogeneous nature of the experimental sample, it was difficult to select landmarks which would consistently define these posterior limits of the dentition: some patients did not have erupted permanent second molars, whereas in others, the third molars were in place.
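The test logic discussed above can be sketched in two steps: computing the RV coefficient between two blocks of variables, and expressing the P-value as the proportion of alternative subdivisions with a lower RV than the hypothesised one. The sketch uses the standard two-block definition of the RV coefficient, RV = trace(S12 S21) / sqrt(trace(S11 S11) trace(S22 S22)); the function names are hypothetical, the enumeration of spatially contiguous alternative partitions is not shown, and the alternative RV values in the example are placeholders. Only the counts (8 of 265) and the hypothesised RV are taken from Table 4.

```python
from math import sqrt

def covariance(a, b):
    """Cross-covariance matrix between blocks a and b (rows are specimens)."""
    n = len(a)
    mean_a = [sum(col) / n for col in zip(*a)]
    mean_b = [sum(col) / n for col in zip(*b)]
    return [[sum((ra[j] - mean_a[j]) * (rb[k] - mean_b[k])
                 for ra, rb in zip(a, b)) / (n - 1)
             for k in range(len(mean_b))]
            for j in range(len(mean_a))]

def sq_norm(matrix):
    """Sum of squared entries; equals trace(S S') for a covariance block S."""
    return sum(v * v for row in matrix for v in row)

def rv_coefficient(a, b):
    """RV coefficient between two blocks of variables measured on the same specimens."""
    return sq_norm(covariance(a, b)) / sqrt(
        sq_norm(covariance(a, a)) * sq_norm(covariance(b, b)))

def proportion_lower(rv_hypothesis, rv_alternatives):
    """Proportion of alternative subdivisions with an RV below the hypothesised one."""
    return sum(rv < rv_hypothesis for rv in rv_alternatives) / len(rv_alternatives)

# Perfectly covarying blocks give RV = 1; uncorrelated blocks give RV = 0.
print(rv_coefficient([[1.0], [2.0], [3.0]], [[2.0], [4.0], [6.0]]))
print(rv_coefficient([[-1.0], [0.0], [1.0]], [[1.0], [-2.0], [1.0]]))

# Placeholder alternatives (assumption): 8 lower and 257 higher RV values.
alternatives = [0.4] * 8 + [0.6] * 257
print(round(proportion_lower(0.495572, alternatives), 3))
```

A low proportion indicates that the hypothesised subdivision shows weaker covariation between its modules than most alternative partitions of the same landmarks, which is what the modularity hypothesis predicts.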

Conclusion

Within the age period studied and the limitations of this cross-sectional study, we found subtle but significant differences in the male and female Procrustes mean shapes. Males tended to be larger. Additionally, mild sexual ontogenetic allometric divergence was found. The principal component analysis retained four biologically interpretable components, the first two of which relate to vertical growth patterns (dolichofacial vs. brachyfacial) and sagittal skeletal relationships (maxillary prognathism vs. mandibular retrognathism, and vice versa), respectively. The mandible and maxilla were found to constitute one module, independent of the skull base. We were also able to provide evidence for the counterpart principle, in the form of an anterior and posterior module, separated by the pterygomaxillary plane, which could be further subdivided into four separate modules involving the posterior skull base, the ethmomaxillary complex, a pharyngeal module, and the anterior part of the jaws.

Author contributions

H.L.L.W. performed the data acquisition and drafted the manuscript. H.L.L.W. and D.J.H. jointly designed this study and completed the data analysis and interpretation. A.M.K.-J. and D.J.H. critically revised and approved the finished manuscript.
