Many ecological and evolutionary studies seek to explain patterns of shape variation and its covariation with other variables. Geometric morphometrics is often used for this purpose, where a set of shape variables is obtained from landmark coordinates following a Procrustes superimposition.
We introduce geomorph: a software package for performing geometric morphometric shape analysis in the R statistical computing environment.
Geomorph provides routines for all stages of landmark-based geometric morphometric analyses in two and three dimensions. It is an open-source package to read, manipulate, and digitize landmark data; generate shape variables via Procrustes analysis for points, curves and surfaces; perform statistical analyses of shape variation and covariation; and provide graphical depictions of shapes and patterns of shape variation. An important contribution of geomorph is the ability to perform Procrustes superimposition on landmark points, as well as semilandmarks from curves and surfaces.
A wide range of statistical methods germane to testing ecological and evolutionary hypotheses of shape variation are provided. These include standard multivariate methods such as principal components analysis, and approaches for multivariate regression and group comparison. Methods for more specialized analyses, such as for assessing shape allometry, comparing shape trajectories, examining morphological integration, and for assessing phylogenetic signal, are also included.
Several functions are provided to graphically visualize results, including routines for examining variation in shape space, visualizing allometric trajectories, comparing specific shapes to one another and for plotting phylogenetic changes in morphospace.
Finally, geomorph helps to make advanced geometric morphometric analyses available through the R statistical computing platform.
The comparison of anatomical features of organisms, and understanding how variation in those features associates with variation in other traits, has long been of interest to ecologists and evolutionary biologists. In recent years, the quantitative study of anatomical form has matured into the field of morphometrics: the study of shape variation and its covariation with other variables (Bookstein 1991; Rohlf & Marcus 1993; Adams, Rohlf & Slice 2004; Zelditch et al. 2004; Adams, Rohlf & Slice 2013). One common approach to shape analysis, geometric morphometrics (GM), utilizes the coordinates of landmarks to record the relative positions of morphological points, boundary curves and surfaces as the basis of shape quantification. Geometric morphometric shape analyses are typically accomplished through a series of steps that can be called the Procrustes paradigm (Adams, Rohlf & Slice 2013). First, a set of two- or three-dimensional landmark coordinates are obtained on each specimen, which record the relative positions of anatomically-corresponding (or homologous) locations. Next, a generalized Procrustes analysis (GPA: Gower 1975; Rohlf & Slice 1990) is used to superimpose the specimens to a common coordinate system by holding constant variation in their position, size and orientation (an additional step is included to standardize points on curves and surfaces: Bookstein et al. 1999; Gunz, Mitteroecker & Bookstein 2005). From the Procrustes-aligned coordinates, a set of shape variables is obtained (Bookstein 1991; Dryden & Mardia 1998; Rohlf 1999), which can be used in multivariate statistical analyses to address a wide range of biological questions. Finally, graphical methods are used to visualize patterns of shape variation and facilitate descriptions of shape changes.
Because geometric morphometric methods provide a more comprehensive quantification of biological shape as compared to alternative approaches, their use in ecological and evolutionary studies has increased dramatically in recent years. For instance, geometric morphometric methods are now commonly used in studies of evolutionary quantitative genetics (Klingenberg, Debat & Roff 2010; Adams 2011; Martínez-Abadías et al. 2012), to reveal phenotypical changes associated with species interactions (Adams 2004; Langerhans et al. 2004; Adams, West & Collyer 2007), to describe patterns of fluctuating and directional asymmetry (Klingenberg, Barluenga & Meyer 2002; Schaefer et al. 2006), to identify convergent and parallel evolution (Stayton 2006; Adams 2010; Adams & Nistri 2010; Piras et al. 2010), to discover phylogenetic and macroevolutionary trends (Sidlauskas 2008; Klingenberg & Gidaszewski 2010; Monteiro & Nogueira 2011) and to reveal ontogenetic patterns in human evolution (Bookstein et al. 2003; Mitteroecker et al. 2004; Mitteroecker & Bookstein 2008), among other applications. Consequently, several software packages are now available for applying geometric morphometrics to particular problems. However, freely available software implementing all of the steps of the Procrustes paradigm in a single computer package, including the digitization of specimens and the analysis of both fixed landmarks and sliding semilandmarks in two- and three-dimensions, is generally lacking.
The purpose of geomorph is to fill this gap. Geomorph (Adams & Otárola-Castillo 2012) is a freely available software package for performing geometric morphometric shape analysis in the R statistical computing environment. It can be installed from the Comprehensive R Archive Network, CRAN. In geomorph, routines for all stages of landmark-based geometric morphometric analyses are provided, including: digitizing landmarks on two- and three-dimensional objects; reading and manipulating landmark data files; generating shape variables via Procrustes analysis for points, curves and surfaces; performing statistical analyses of shape variation and covariation; and providing graphical depictions of shapes and patterns of shape variation. A variety of statistical methods for shape analyses germane to ecological and evolutionary studies are included. Geomorph extends the capabilities of landmark-based shape analysis in R over prior packages and routines (e.g. the ‘shapes’ package: Dryden 2012 and Morphometrics With R: Claude 2008), by incorporating both semilandmark methods and the digitization of specimens directly within R. However, because geomorph utilizes previously developed data structures implemented in these packages, one may combine functions across packages to expand the breadth of shape analyses available within the R computing environment. Below we describe some of the major features of geomorph to demonstrate its functionality.
The geomorph package is written in the R scientific computing language (R Development Core Team 2012). The functions in geomorph are designed to enhance all aspects of a landmark-based geometric morphometric shape analysis. Currently implemented functions are listed in Table 1, which is arranged by workflow. In the coming years, we will incorporate additional functions to further expand the utility of geomorph.
Table 1. Major functions of the geomorph package
Data Input, Data Collection, and Data Preparation Functions
Convert landmark data matrix into array (p × k × N)
Build 3D surface template
Select points to ‘slide’ along two-dimensional curves
Select points to ‘slide’ along three-dimensional curves
Digitize fixed 3D landmarks only
Digitize 2D landmarks
Digitize 3D fixed landmarks and surface semilandmarks
Edit 3D template
Estimate locations of missing landmarks using the thin-plate spline
Read landmark data from ply files
Read landmark data from vrml files
Read landmark data from nts file
Read landmark data from tps file
Read 3D landmark data from Morphologika file
Read landmark data from multiple nts files
Convert (p × k × N) data array into 2D data matrix
Data Analysis Functions
Compare modular signal to alternative landmark subsets
Generalized Procrustes analysis of points, curves, and surfaces
Quantify morphological integration between two modules
Estimate mean shape for a set of aligned specimens
Assessing phylogenetic signal in morphometric data
Procrustes ANOVA/regression for shape data
Quantify and compare shape change trajectories
Plots and Graphical Functions
Plot allometric patterns in landmark data
Plot landmark coordinates for all specimens
Plot phylogenetic tree and specimens in tangent space
Plot shape differences between a reference and target specimen
Plot 3D specimen, fixed landmarks and surface semilandmarks
Plot specimens in tangent space
Data input and digitizing
Previously digitized landmark data stored as text files in the *.tps or *.nts formats can be read into geomorph using readland.tps, readland.nts, or readmulti.nts. The resulting data are then stored in R as a three-dimensional array for subsequent morphometric analyses (p landmarks × k dimensions × N specimens). Previously digitized data stored in other formats (such as those used by MorphoJ: Klingenberg 2011 and Morphologika: O'Higgins & Jones 1998) may be read using read.morphologika and the base functions of R (e.g. read.csv, read.table). A data matrix read in this way can then be converted to a three-dimensional array for use in geomorph using the function arrayspecs.
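The array layout described above can be illustrated with a short base-R sketch that converts a specimens-by-variables matrix into a p × k × N array and back. It assumes that each row holds one specimen with coordinates ordered x1, y1, x2, y2, and so on; the helper names matrix_to_array and array_to_matrix are hypothetical and are not the geomorph functions themselves.

```r
# Hypothetical helpers (not part of geomorph): convert between an
# N x (p*k) matrix and a p x k x N array.
# Assumes each row is one specimen ordered x1, y1, x2, y2, ...
matrix_to_array <- function(M, p, k) {
  stopifnot(ncol(M) == p * k)
  # t(M) puts one specimen per column; fill as k x p x N, then swap to p x k x N
  aperm(array(t(M), dim = c(k, p, nrow(M))), c(2, 1, 3))
}

array_to_matrix <- function(A) {
  # flatten each p x k specimen row-wise (x1, y1, x2, y2, ...)
  t(apply(A, 3, function(spec) as.vector(t(spec))))
}

# Toy data: 3 specimens, 4 landmarks, 2 dimensions
M <- matrix(as.numeric(1:24), nrow = 3, byrow = TRUE)
A <- matrix_to_array(M, p = 4, k = 2)
dim(A)                       # landmarks, dimensions, specimens
A[, , 1]                     # first specimen, one row per landmark
M_back <- array_to_matrix(A) # round trip back to specimens by variables
```

If a file stores coordinates in a different order (for example all x values followed by all y values), the reshaping step would need to be adjusted accordingly.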
A major feature of geomorph is the ability to digitize landmark data directly from two- and three-dimensional images in R (Fig. 1). Digitizing two-dimensional landmarks from a *.jpeg image is accomplished using the function digitize2d. For three-dimensional data, digital surface images in the form of *.ply or *.vrml files may be read into geomorph using read.ply and read.vrml, respectively. From these, the coordinates of points, curves and surfaces may be digitized using one or more functions in geomorph (for a list of functions, see Table 1). When semilandmarks on surfaces are desired, these points are digitized using a template (following Gunz, Mitteroecker & Bookstein 2005). Here, a set of fixed landmarks is first digitized on a specimen; then a series of equally spaced points is identified mathematically and treated as semilandmarks representing the shape of the surface (using the function buildtemplate). The remaining specimens are then digitized by matching this template to their surface scans using the function digitsurface, which allows for a one-to-one correspondence between semilandmarks across specimens. Finally, because geometric morphometric analyses require that all specimens have the same set of landmarks, incomplete specimens require some additional treatment. For incomplete specimens, the function estimate.missing can be used to estimate the location of missing landmarks. This function implements thin-plate spline interpolation (following Gunz et al. 2009), which maps the locations of landmarks on a complete specimen to their corresponding locations on the specimen with missing landmarks. This approach is particularly useful for biological disciplines where missing landmarks and partial specimens are common, such as palaeontology, archaeology and biological anthropology.
Procrustes superimposition and data analysis
The workhorse of geometric morphometrics is GPA, which superimposes specimens to a common coordinate system by holding constant variation in their position, size, and orientation. In geomorph, GPA is accomplished using the function gpagen. Importantly, gpagen is a general function that can be used to superimpose sets of landmarks as well as semilandmarks, the latter of which can be used to capture the shape of boundary curves and of surfaces. Here, an additional step is incorporated into the Procrustes algorithm in which semilandmarks on curves are slid along their tangent vectors, and semilandmarks on surfaces are slid within their tangent planes, until their positions minimize the shape difference between specimens (based on either Procrustes distance [the default] or bending energy: Bookstein 1997; Bookstein et al. 1999; Gunz, Mitteroecker & Bookstein 2005; Rohlf 2010). Aligned specimens can then be projected into a linear tangent space for subsequent statistical analysis using the option Proj = TRUE.
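To make the three operations of GPA concrete, the following base-R sketch removes position, size and orientation from a set of fixed landmarks. It is a simplified illustration only: it does not slide semilandmarks or project into tangent space, the function names rotate_to and gpa_sketch are hypothetical, and it should not be read as the algorithm used inside gpagen.

```r
# Minimal GPA sketch for fixed landmarks only (no semilandmark sliding).
# Function names are hypothetical; this is not the gpagen implementation.

# Rotate centred configuration X onto reference Y (both p x k), no reflection
rotate_to <- function(X, Y) {
  s <- svd(crossprod(X, Y))                      # SVD of t(X) %*% Y
  d <- sign(det(s$u %*% t(s$v)))                 # -1 signals a reflection
  X %*% s$u %*% diag(c(rep(1, ncol(X) - 1), d)) %*% t(s$v)
}

gpa_sketch <- function(A, tol = 1e-12, max_iter = 100) {
  N <- dim(A)[3]
  for (i in seq_len(N)) {
    X <- sweep(A[, , i], 2, colMeans(A[, , i]))  # remove position
    A[, , i] <- X / sqrt(sum(X^2))               # scale to unit centroid size
  }
  ref <- A[, , 1]
  for (iter in seq_len(max_iter)) {
    for (i in seq_len(N)) A[, , i] <- rotate_to(A[, , i], ref)  # remove orientation
    new_ref <- apply(A, c(1, 2), mean)           # update the consensus
    new_ref <- new_ref / sqrt(sum(new_ref^2))
    converged <- sum((new_ref - ref)^2) < tol
    ref <- new_ref
    if (converged) break
  }
  list(coords = A, consensus = ref)
}

# Toy check: one shape, randomly rotated, scaled and shifted
set.seed(1)
base <- cbind(c(0, 1, 1, 0, 0.5), c(0, 0, 1, 1, 1.5))
A <- array(NA_real_, dim = c(5, 2, 6))
for (i in 1:6) {
  th <- runif(1, 0, 2 * pi)
  R <- matrix(c(cos(th), sin(th), -sin(th), cos(th)), 2, 2)
  A[, , i] <- sweep(runif(1, 0.5, 3) * base %*% R, 2, rnorm(2, sd = 5), "+")
}
fit <- gpa_sketch(A)
max(abs(fit$coords - fit$coords[, , rep(1, 6)]))  # near zero: all copies coincide
```

Because the toy specimens differ only in position, size and orientation, the aligned configurations coincide; with real data the remaining differences are the shape variation of interest.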
A number of functions in geomorph provide statistical assessment of patterns of shape variation. For instance, plotTangentSpace performs a principal components analysis of the shape data and provides a graphical view of the resulting scatter. Hypothesis testing for ANOVA and regression models is accomplished in procD.lm, which uses the Procrustes distances among specimens to quantify explained and unexplained components of shape variation, which are statistically evaluated via permutation (Goodall 1991). Comparisons of trajectories of shape change or motion paths may be obtained using trajectory.analysis (Adams & Cerney 2007; Adams & Collyer 2009), and patterns of morphological integration explored using the functions morphol.integr and compare.modular.partitions. Recent methods for assessing phylogenetic signal in morphometric data (Klingenberg & Gidaszewski 2010) are implemented in physignal. Finally, because R is a statistical computing language, the function two.d.array can be used to reformat the shape data to a matrix of specimens by variables, so that other statistical routines in R may be utilized.
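The logic of a permutation test on Procrustes sums of squares can be sketched in base R for the simplest case of a single grouping factor. The sketch assumes that the data are already aligned and stored as a specimens-by-variables matrix, and that sums of squared coordinate differences stand in for squared Procrustes distances; the helper procrustes_anova_sketch is hypothetical and is not the procD.lm implementation.

```r
# Sketch of a one-factor permutation test on Procrustes-type sums of squares.
# Y: N x (p*k) matrix of aligned coordinates; group: labels of length N.
# Hypothetical helper, not the procD.lm implementation.
procrustes_anova_sketch <- function(Y, group, iter = 199) {
  group <- as.factor(group)
  N <- nrow(Y)
  g <- nlevels(group)
  ss_total <- sum(sweep(Y, 2, colMeans(Y))^2)     # squared distances to grand mean
  f_stat <- function(grp) {
    ss_within <- sum(sapply(levels(grp), function(l) {
      Yg <- Y[grp == l, , drop = FALSE]
      sum(sweep(Yg, 2, colMeans(Yg))^2)           # squared distances to group mean
    }))
    ss_between <- ss_total - ss_within            # explained component
    (ss_between / (g - 1)) / (ss_within / (N - g))
  }
  f_obs <- f_stat(group)
  f_perm <- replicate(iter, f_stat(sample(group)))  # shuffle group labels
  list(F = f_obs, p.value = (sum(f_perm >= f_obs) + 1) / (iter + 1))
}

# Simulated example: two groups with clearly different means
set.seed(42)
Y <- rbind(matrix(rnorm(10 * 6, mean = 0,    sd = 0.01), 10),
           matrix(rnorm(10 * 6, mean = 0.05, sd = 0.01), 10))
group <- rep(c("A", "B"), each = 10)
res <- procrustes_anova_sketch(Y, group)
res$p.value   # small when the groups differ
```

Factorial designs and regression models require a more general partitioning of the sums of squares than this one-factor sketch provides.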
Graphics and visualization
Geomorph provides numerous functions for visualizing the results of shape analyses for both two-dimensional and three-dimensional data. For instance, one can visualize the shape changes between two specimens, as either landmark displacements or thin-plate spline deformation grids, with the function plotRefToTarget. Evolutionary changes in shape along a phylogeny can be viewed in shape space using plotGMPhyloMorphoSpace, and changes as a function of size (i.e. allometric trajectories) may be viewed using the function plotAllometry. Other visualization options are described in Table 1.
Here we provide several illustrative examples to demonstrate the use of geomorph. All examples can be reproduced using the data that come with the package. First, we load geomorph in R:
Next, we call the data set ‘plethodon’, and plot the original landmark data (Fig. 2a):
We then perform a Procrustes superimposition and plot the aligned landmark coordinates. In this case, the resulting shape variables are stored as a data frame ‘Y.gpa’ and graphical links between landmarks (included in the data set ‘plethodon’) are plotted to aid visual interpretation (Fig. 2b).
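The calls for these first steps might look like the sketch below. It assumes that geomorph is installed and that the plotting function for all specimens is named plotAllSpecimens with a links argument; the elements land and links of the ‘plethodon’ data set and the coords element of the gpagen output are likewise assumptions that may differ between package versions. The requireNamespace guard is added here only so that the sketch runs harmlessly when the package is absent.

```r
# Sketch of the first example steps; assumes geomorph is installed.
# Element and argument names may differ between geomorph versions.
if (requireNamespace("geomorph", quietly = TRUE)) {
  library(geomorph)
  data(plethodon)                          # example data shipped with the package

  plotAllSpecimens(plethodon$land)         # raw landmark coordinates

  Y.gpa <- gpagen(plethodon$land)          # Procrustes superimposition
  plotAllSpecimens(Y.gpa$coords,           # aligned coordinates with
                   links = plethodon$links)  # links drawn between landmarks
}
```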
Next, we will perform several additional analyses that relate to particular biological hypotheses. First, to visualize patterns of shape variation in shape space we perform a principal components analysis:
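The computation behind such a plot can be sketched in base R with prcomp. The sketch uses simulated stand-in data rather than the ‘plethodon’ data set, so it illustrates the steps only and does not reproduce the results reported here; the object names are hypothetical.

```r
# Base-R sketch of a PCA on shape variables (simulated stand-in data).
set.seed(7)
N <- 40
# Hypothetical N x (p*k) matrix of aligned coordinates; two groups differ in mean
Y <- matrix(rnorm(N * 8, sd = 0.01), N, 8)
Y[1:20, 1:2] <- Y[1:20, 1:2] + 0.05

pca <- prcomp(Y)                               # centres the data, no rescaling
var_explained <- pca$sdev^2 / sum(pca$sdev^2)  # proportion per component
round(cumsum(var_explained)[2], 2)             # share captured by PC1 and PC2
plot(pca$x[, 1:2], asp = 1, xlab = "PC1", ylab = "PC2")
```

Keeping the aspect ratio at 1 preserves the relative distances among specimens in the plotted plane.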
The resulting plot of the first two dimensions of tangent space explains 67% of the total shape variation, and reveals several distinct clusters of specimens, implying that shape differences may be present (Fig. 3a). Indeed, for this data example, specimens represent two species in two distinct environments and a statistical evaluation using MANOVA reveals significant shape differences between species, between sites and in the interaction between the two factors:
 "No specimen names in response matrix. Assuming specimens in same order."
The shape differences between group means can be visualized graphically, by obtaining the average landmark coordinates for each group and the overall mean, and plotting the differences as thin-plate spline transformation grids:
Here, the shape differences have been amplified by a factor of two to aid in the description of shape differences and facilitate biological interpretation (Fig. 3b).
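The arithmetic behind group means and amplified differences can be sketched in base R. The sketch uses a simulated array of aligned coordinates, and the names mean_shape, gp_mn and target_mag are hypothetical; it shows only how a target shape might be exaggerated relative to a reference, not how the deformation grids are drawn.

```r
# Sketch: group mean shapes and an exaggerated target for plotting.
# A is a hypothetical p x k x N array of aligned coordinates.
set.seed(3)
p <- 6; k <- 2; N <- 12
A <- array(rnorm(p * k * N, sd = 0.01), dim = c(p, k, N))
group <- rep(c("g1", "g2"), each = 6)
A[1, 1, group == "g2"] <- A[1, 1, group == "g2"] + 0.1  # shift one landmark

mean_shape <- function(arr) apply(arr, c(1, 2), mean)   # average over specimens
ref   <- mean_shape(A)                                  # overall mean
gp_mn <- lapply(split(seq_len(N), group),
                function(idx) mean_shape(A[, , idx, drop = FALSE]))

mag <- 2                                                # amplification factor
target_mag <- ref + mag * (gp_mn$g2 - ref)              # exaggerate the difference
```

With mag equal to 1 the target is the unmodified group mean; larger values stretch the displacement of every landmark away from the reference by the same factor.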
Multivariate patterns of allometry can also be visualized, using one of three visualization options (Mitteroecker et al. 2004; Drake & Klingenberg 2008; Adams & Nistri 2010). The example data set ‘rats’ can be used to illustrate the approach (Fig. 3c):
 "No specimen names in response matrix. Assuming specimens in same order."
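One of these visualization options, the regression score of Drake & Klingenberg (2008), can be sketched in base R as the projection of the shape data onto the vector of regression coefficients of shape on size. The sketch uses simulated data rather than the ‘rats’ data set, and all object names are hypothetical.

```r
# Sketch of a regression-score view of allometry (after Drake & Klingenberg 2008).
set.seed(11)
N <- 30
logCS <- log(runif(N, 5, 50))                      # hypothetical log centroid sizes
b_true <- c(0.02, -0.01, 0.015, 0, 0.005, -0.02)   # simulated allometric vector
Y <- outer(logCS - mean(logCS), b_true) +
  matrix(rnorm(N * 6, sd = 0.002), N, 6)           # shape variables with noise

fit <- lm(Y ~ logCS)                               # multivariate regression on size
B <- coef(fit)["logCS", ]                          # one slope per shape variable
Yc <- sweep(Y, 2, colMeans(Y))                     # centred shape variables
reg_score <- as.vector(Yc %*% B / sqrt(sum(B^2)))  # projection onto the slope vector

plot(logCS, reg_score, xlab = "log(centroid size)", ylab = "Regression score")
```

The score condenses the multivariate shape response into a single axis, so that the allometric trend can be read from an ordinary scatterplot against size.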
As a final example, one may combine phylogenetic data with shape data to estimate the degree of phylogenetic signal in shape. This function provides a view of shape space with the phylogeny superimposed (Fig. 3d):
Additional examples illustrating how to implement other functions are found in the geomorph help files.
The past several decades have seen a major increase in the development and use of landmark-based geometric morphometric methods in ecological and evolutionary studies. During this same time, the R statistical language has become the standard platform for statistical and computational analyses in many biological disciplines. The package geomorph leverages both of these advances by providing comprehensive software for performing the latest morphometric shape analyses in R. Users may read, manipulate and digitize two- and three-dimensional landmark data within R, generate shape variables from landmark and semilandmark data, perform statistical analyses to address ecological and evolutionary hypotheses and obtain graphical depictions of shapes and patterns of shape variation. Importantly, geomorph can perform Procrustes superimposition on landmark points, and semilandmarks on curves and surfaces in two or three dimensions, providing a more comprehensive quantification and analysis of biological shape variation. Full descriptions of all geomorph functions, as well as examples of their use, are available from the online help files.
Citation of geomorph
Scientists using geomorph in a published paper should cite this article. Users can also cite the geomorph package directly. Citation information can be obtained by typing citation(package = "geomorph") at the command prompt.
This work was sponsored in part by NSF grant DEB-1118884 to DCA, and by a Harvard Fellowship to EOC. Three anonymous reviewers provided valuable comments that greatly improved this work.