Relative tooth size, Bayesian inference, and Homo naledi

Objectives: Size-corrected tooth crown measurements were used to estimate phenetic affinities among Homo naledi ( (cid:1) 335 – 236 ka) and 11 other Plio-Pleistocene and recent species. To assess further their efficacy, and identify dental evolutionary trends, the data were then quantitatively coded for phylogenetic analyses. Results from both methods contribute additional characterization of H. naledi relative to other hominins. Materials and Methods: After division by their geometric mean, scaled mesiodistal and buccolingual dimensions were used in tooth size apportionment analysis to compare H. naledi with Australopithecus africanus , A. afarensis , Paranthropus robustus , P. boisei , H. habilis , H. ergaster , H. erectus , H. heidelbergensis , H. neanderthalensis , H. sapiens , and Pan troglodytes . These data produce equivalently scaled samples unaffected by interspecific size differences. The data were then gap-weighted for Bayesian inference. Results: Congruence in interspecific relationships is evident between methods, and with many inferred from earlier systematic studies. However, the present results place H. naledi as a sister taxon to H. habilis, based on a symplesiomorphic pattern of relative tooth size. In the preferred Bayesian phylogram, H. naledi is nested within a clade comprising all Homo species, but it shares some characteristics with australopiths and, particularly, early Homo . Discussion: Phylogenetic analyses of relative tooth size yield information about evolutionary dental trends not previously reported in H. naledi and the other hominins. Moreover, with an appropriate model these data recovered plausible evolutionary relationships. Together, the findings support recent study suggesting H. naledi origi-nated long before the geological date of the Dinaledi Chamber, from which the specimens under study were recovered.

. In TSA, the unit of study is the complete permanent dentition, rather than individual mesiodistal (MD) and buccolingual (BL) crown dimensions. These lengths and widths are first size corrected (below), to yield equivalently scaled samples, for submission to principal components analysis. Statistically, uncorrelated factor scores are used to place samples on axes of a scatterplot to visualize how crown size is differentially distributed, or apportioned, within the tooth rows. Because TSA is useful for comparing human individuals and groups (Harris, 1997(Harris, , 1998Harris & Bailit, 1988;Harris & Rathbun, 1991;Hemphill et al., 1992;Irish & Hemphill, 2001;Irish & Kenyhercz, 2013;Lukacs & Hemphill, 1993), which on an intraspecific level exhibit minimal variation, the technique was projected to be particularly effective when comparing more discernible interspecific differences of our hominin ancestors. This prediction was proven to be correct. The grouping of species (Irish et al., 2016) included in other, albeit, cladistic studies (Smith & Grine, 2008;Strait et al., 1997;Strait & Grine, 2004) is comparable, as are the affinities of A. sediba (Berger et al., 2010;Dembo et al., 2015Dembo et al., , 2016Irish et al., , 2014. As such, the initial intent here was to simply undertake an equivalent TSA analysis to dentally characterize H. naledi, estimate interspecific relationships, and assess its taxonomic classification. Earlier studies based on characters across the skeleton supported its inclusion in the genus Homo, but as a distinct member Thackeray, 2015;Dembo et al., 2016;Irish et al., 2018; also see Holloway et al., 2018;Davies et al., 2020). A phenetic approach was deemed as most appropriate because continuous odontometric data do not lend themselves well to standard cladistic analyses; that is, they are typically reduced to a few ratios or crown areas qualitatively discretized into two or more states, along with other morphological characters (Berger et al., 2010;Strait & Grine, 2004). Of course, this same strategy applies to all continuous data with traditional phylogenetic analyses (Felsenstein, 2004;Parins-Fukuchi, 2018a, 2018bPimentcl & Riggins, 1987;Poe & Wiens, 2000;Pogue & Mickevich, 1990;Stevens, 1991;Thiele, 1993;Wiens, 1995), when they are not excluded entirely (Garcia-Cruz & Sosa, 2006;Poe & Wiens, 2000).
Yet, the benefits of continuous data, including more objective recording through standardized measurements, among others (below), encourage their use with phylogenetic inference. Recently, applicable models have been employed to analyze such data directly (e.g., Parins-Fukuchi, 2018a, 2018bbelow), but a more established strategy is to apply one of several quantitative coding techniques.
Some of these, to boost phylogenetic signal over qualitative discretizing, can return up to 30 states (Felsenstein, 2004;Garcia-Cruz & Sosa, 2006;Jones & Butler, 2018;Wiens, 2001). In reality, all morphological characters are "fundamentally quantitative," and in the present study they are treated as such (Wiens, 2001:689;Felsenstein, 2004;Schols et al., 2004), through the oft-used gap-weighted coding method (Garcia-Cruz & Sosa, 2006;Goloboff et al., 2006;Schols et al., 2004;Thiele, 1993). Therefore, analyses of H. naledi, nine other African and Eurasian Plio-Pleistocene hominin species, two samples of recent African H. sapiens, and Pan troglodytes proceed as follows. First, TSA analysis was used to estimate interspecific affinities with the continuous, scaled MD and BL dimensions. Second, given the demonstrated utility of this technique, it was decided to investigate further how these scaled data differ and distinguish among species. To do so, gapweighted data were used in Bayesian inference under a Mkv (Lewis, 2001) or "standard discrete (morphology)" model (Huelsenbeck & Ronquist, 2001;Ronquist & Huelsenbeck, 2003;Ronquist et al., 2020:133). The results identify effects of presumed evolutionary trends on relative tooth size across species, beyond that previously reported. Moreover, with the appropriate parameters, these data can also recover plausible phylogenetic relationships. Finally, results from TSA and Bayesian analyses, in reference to prior studies, provide additional morphological characterization of H. naledi relative to the other species.

| The samples and their data sources
The H. naledi sample consists of 122 dental specimens from the Dinaledi Chamber of the Rising Star cave system, with mean MD and BL measurements in the present analyses from Berger et al. (2015).
Mean MD and BL data for the latter two are from Berger et al. (2015), with information provided therein.

| Odontometric measurements
For both H. sapiens samples, crown dimensions were recorded with needlepoint calipers accurate to 0.05 mm following Moorrees and Reed (1964). If antimeric pairs in an individual were present, mean MD and BL measurements were calculated; if only the right or left tooth remained, it was recorded, for up to 16 measurements in each isomere and 32 per dentition. For the other samples, measurements were reviewed for conformity to facilitate data compatibility, though inter-observer error obviously could not be tested.
Commonly in previous hominin studies, notably phylogenetic analyses, the genetic contribution of characters is often assumed but was not, or cannot, be estimated. This is an important point because if at least some characters lack a genetic basis, the results can be misleading (Wiens & Hillis, 1996). In modern humans narrow-sense heritability of MD and BL diameters was found to be high, in some cases h 2 > 0.8 (Alvesalo & Tigerstedt, 1974;Baydaş et al., 2005;Dempsey et al., 1995;Dempsey & Townsend, 2001;Hlusko et al., 2002;Kieser, 1990;Rizk et al., 2008;Townsend et al., 2003;Townsend & Brown, 1980). Recently, an association between body size and BL diameters was indicated (Hlusko et al., 2016), and a study of just MD diameters returned a lower h 2 of 0.51-though reproductive isolation and socioeconomic stress in the population sampled and small samples may have affected the value (per Stojanowski et al., 2017). Heritability of the scaled dental data has also not been assessed directly, but it should parallel the original MD and BL dimensions, given the correlation between datasets in the present study (r = 0.93, p = 0.00). At any rate, the h 2 value in Plio-Pleistocene hominins cannot be known. However, based on the above findings the prospect of relatively high heritability is, at a minimum, encouraging to estimate phenetic affinities and a phylogeny of characters from simple crown measurements readily available in the literature.
The geometric mean (GM) is computed as the nth root of the product for all n dimensions (x) per case. Each dimension is divided by this mean (x/GM) for an average of 1.0 across the sample rows. Scaling "cancels out size differences by giving each [sample] the same average character state or magnitude over all the measurements taken on it" (Corruccini, 1973:747).
Data description was undertaken prior to submitting the correlation matrix of 32 DM_RAW-scaled mean MD and BL measurements to PCA. Group component scores were plotted in three dimensions to visualize phenetic variation using SPSS Ver. 26.0. Ideally, TSA analyses would be conducted with samples divided by sex, although this strategy was not followed in the aforementioned modern human comparisons. The reason is that, while sexual dimorphism may be a factor in absolute crown size differences between males and females in a common population (though see Harris, 2003), relative tooth size within the dentitions is unaffected (Harris & Rathbun, 1991;Hemphill, 2016b;Hemphill et al., 1992). Like heritability, the same cannot be claimed for Plio-Pleistocene species with significant differences in body size between the sexes. Regardless, it is out of necessity, including an inability to determine sex in most hominin specimens, considerable missing data, and a need to maximize sample sizes, that all specimens and individuals were pooled by species for analysis.

| Bayesian phylogenetic inference
Probabilistic or statistical methods to infer phylogenies, including Bayesian inference, are seeing increased use over nonprobabilistic methods like maximum parsimony. The reasons include methodological consistency, the ability to estimate branch lengths and evolutionary rates and, basically, better performance in genetic and morphological cases (Felsenstein, 2004;Lee et al., 2014;Wright and Hillis, 2014;EC.Europa.EU, 2016;Nascimento et al., 2017;Parins-Fukuchi, 2018a, 2018bGuillerme & Brazeau, 2018). Indeed, the ".. . inconsistency of parsimony has been the strongest challenge to its use," although it works well with very large datasets to compare recently derived species (Felsenstein, 2004:121;EC.Europa. EU, 2016). The theories behind, overviews of, and techniques concerning Bayesian inference in parameter estimation are covered thoroughly in the preceding references, and have been discussed in prior hominin studies (Dembo et al., 2016;Mongle et al., 2019). Additional, pertinent information is provided here in describing the analytical progression.
Phylogenies were inferred from quantitatively coded versions of DM_RAW-scaled data with, as noted, a Mkv model. These 32 scaled dimensions were gap-weighted using Thiele's (1993) method in Mor-phoCode 1.1 (Schols et al., 2004). It generates a data matrix, with the order and dispersal of means determined for each morphological character, and then converted to "ordered, multistate characters where the distance between means is represented by the distance between ordered character states" (Thiele, 1993;Wiens, 2001;Schols et al., 2004:2). This matrix of coded scaled data, in Nexus format, was submitted to MrBayes 3.2.7 (Huelsenbeck & Ronquist, 2001;Ronquist et al., 2020;Ronquist & Huelsenbeck, 2003) using the maximum number of states allowed by the program (see below).
Given the vast range of parameters, the aim was to begin simply, with a rooted strict-clock model and default, "so-called flat, uninformative, or vague [prior] distributions" (Felsenstein, 2004;EC.Europa. EU;, 2016;Ronquist et al., 2020:91). The latter are suggested to base the posterior distributions principally on the data-to establish their contribution (Ronquist et al., 2020;though see Felsenstein, 2004;Nascimento et al., 2017). From this, more complex parameters were added in a series of analyses. Of these, two relaxed-clock models representative of this progression are discussed: one basic and the other with many constraints, calibrations, and additional priors. All entail Bayesian molecular clock methods to estimate divergence among taxa (Hedges & Kumar, 2009;Nascimento et al., 2017).
Each model was analyzed using Markov chain Monte Carlo (MCMC) simulation with the Metropolis algorithm (EC.Europa. EU;, 2016;Felsenstein, 2004;Nascimento et al., 2017;Ronquist et al., 2020). Because the dataset is not large MrBayes default run values were used, with an increase in generations if needed. Two concurrent but independent analyses beginning with different random trees were run for 1,000,000 generations, with a sampling frequency of 500 to yield 2000 samples, and diagnostics calculated every 5000 generations. Runs consisted of one cold and three heated chains, with a 25% burn-in of samples from the cold chain so it could settle into its equilibrium distribution. This process allowed expedient calculation of convergence diagnostics to assess if a representative sample of trees resulted from the posterior probability distribution.
Established diagnostics used for the gap-weighted scaled data include: (1) standard deviation of split frequencies ≤0.01, (2) potential scale reduction factor (PSRF) of $1.0 for all parameters, and (3) average effective sample sizes (ESS) of >200 (EC.Europa.EU, 2016; Felsenstein, 2004;Guillerme & Brazeau, 2018;Huelsenbeck & Ronquist, 2001;Nascimento et al., 2017;Ronquist et al., 2020;Ronquist & Huelsenbeck, 2003). If cut-offs were not met, the generation number was increased until minimums were achieved or exceeded, to yield similar trees from the independent runs. Finally, a cladogram with posterior probabilities, a.k.a. clade credibility values, and a phylogram with mean branch lengths were produced. Trees were rendered with FigTree 1.1.4. Related diagnostics include posterior probabilities to determine final tree number, where $1.0 specifies one tree (Ronquist et al., 2020). The three clock analyses described here share several initial priors particular to the quantitatively coded data type [full parameter list in SI Table S2]. For state frequencies a symmetric dirichlet distribution fixed to infinity was used to correspond to the assumption of no transition rate asymmetry across sites (Ronquist et al., 2020). Coding bias was variable and type ordered, as necessary for the gap-weighted continuous data, where it is assumed evolution between states moves through intermediate states (0 < ! 1 < ! 2) (Felsenstein, 2004).
MrBayes can handle 10 character states (0-9) if unordered, but only six if ordered (Ronquist et al., 2020). Therefore, states of 0-5 were calculated for each of the 32 scaled characters in Morphocode 1.1 for the input matrix.
First, for the strict-clock analysis (SI Table S2), a clock parameter was specified for branch lengths type, with a uniform prior and a fixed clock rate. Constant rates of evolution are assumed among taxa, where branch tips are presumed to be the same age (EC.Europa. EU, 2016;Felsenstein, 2004;Pybus, 2006;Ronquist et al., 2020). This approach is preferred for analyzing the same species with similar molecular evolution rates (Felsenstein, 2004), which cannot be assumed for the present hominin taxa.
Second, a basic relaxed-clock analysis (SI Table S2) was conducted. A model of this type is suggested for different species, because it can incorporate a prior distribution of evolutionary rates that vary among taxa and branches of a phylogeny (Felsenstein, 2004;Pybus, 2006). Like the strict-clock a relaxed-clock model is rooted, but information on root position is weaker. Therefore, following standard protocol a tree topology constraint was introduced to exclude Pan and force all other taxa into a monophyletic ingroup. The key change was to "'relax' the strict clock assumption" with an independent gamma rates (IGR) model of continuous uncorrelated variation across lineages (Ronquist et al., 2020:60). A related prior is a standard exponential rate of variance in effective branch lengths over time.
Three other notable changes were made to this model. First, a gamma distribution was substituted for the rates prior (also used by Dembo et al., 2016), to accommodate rate variation across sites (EC. Europa.EU, 2016;Kuhner & Felsenstein, 1994;Ronquist et al., 2020). Second, the clock rate default prior, which measures node age as the number of expected substitutions per site, was swapped for a normal distribution to calibrate the tree in millions of years. A mean of 0.2 and a standard deviation of 0.02 designated that this distribution be truncated at 0, to yield positive values (Ronquist et al., 2020;Ronquist & Huelsenbeck, 2003). Third, a fossilized birth-death (FBD) prior with random sampling was substituted for the uniform branch lengths default (Ronquist et al., 2020). A standard birth-death prior (e.g., Dembo et al., 2016) is often used with dating and root constraints (Nascimento et al., 2017;Ronquist et al., 2020).
However, the FBD is most appropriate for clock trees with calibrated external nodes (i.e., fossils) and if both extinct and extant species are included (Stadler, 2010;Heath et al., 2014;Heath, 2015;Stadler et al., 2018;Zhang et al., 2016;Ronquist et al., 2020). This prior describes the probability of tree and fossil data conditional on a number of birth-death parameters (SI Table S2), including speciation with branching (i.e., birth), extinction (death), and fossil preservation and recovery (sampling).
Finally, all clock models were compared by calculating Bayes factors (B12) from the marginal likelihoods that result when substituting the MCMC with the stepping-stone (ss) method (Xie et al., 2011).
Evolutionary correlation can be investigated a posteriori through the phylogenetic hypotheses (Guillerme & Brazeau, 2018). Otherwise, with exception (below), effects on inference cannot be verified or addressed, especially with fossil taxa (Billet & Bardin, 2019;Felsenstein, 2004;Guillerme & Brazeau, 2018;O'Leary & Geisler, 1999). The same goes for possible homoplasy, another effect assumed inherent with morphological data-that nonetheless is a necessity when analyzing fossils (Kay, 2015;Wiens, 2001). A likely source of evolutionary correlation of particular relevance here is morphological integration (Billet & Bardin, 2019;G omez-Robles & Polly, 2012;Klingenberg, 2014;Strait & Grine, 2004). It cannot be claimed that the DM-scaled dimensions of serially homologous teeth are an exception. Indeed, while incisors may be genetically independent from posterior teeth, at least in baboons (Hlusko & Mahaney, 2009), integration was found to affect crown shape in the postcanine dentition (G omez-Robles & Polly, 2012); certain regions are affected more than others, including the mandibular dentition (relative to the maxillary), molars (vs. premolars), and UM3 and LP3.
To address this lack of independence one suggested strategy is to merge dental observations into a few or one character, that is, "composite coding" (Billet & Bardin, 2019:268); however, doing so risks unverified a priori dismissal of phylogenetic signal (O'Leary & Geisler, 1999). And, given that the most substantial size variance occurs among tooth types (Harris, 2003), it would preclude exploring relative size variation among species in this study via phylogenetic inference.
Whatever the case, prospective issues with the present data are likely mitigated by Bayesian inference, said to be less affected by character correlation than parsimony, notably when comparing relatively few taxa (Guillerme & Brazeau, 2018). Also, this factor may not be as challenging for phylogenetic reconstruction as generally presumed (Adams & Felice, 2014;Parins-Fukuchi, 2018a). Evolutionary correlation is expected in closely related species, and vice versa (Felsenstein, 1985;Lajeunesse & Fox, 2015;Martins & Hansen, 1997). Simply put, the dilemma is that the question of correlation cannot be answered if the true phylogeny is not known, and the phylogeny cannot be reconstructed without knowing the answer (Felsenstein, 2002;P.D. Polly, personal communication, 2021). Therefore, like with any characters-qualitative or quantitative-a pragmatic approach (above) is to consider the plausibility, or the lack thereof, of the resulting phylogenies a posteriori.

| Descriptive and TSA analyses
Mean MD and BL dimensions of H. naledi and the 12 other samples, with total number of teeth from which each were calculated, are provided in Table 1. Maxillary and mandibular crown surface areas (MD x BL) were also determined and plotted (Figures 1 and 2). These are crude estimates of actual areas (Garn et al., 1977;Hemphill, 2016a), but can be useful indicators of absolute dental size variation among species (e.g., Evans et al., 2016;and below (Table 2), yet the absolute dimensions are 11.2 and 15.5 mm ( Table 2). The effect of correction can be visualized by submitting the original and scaled data to UPGMA cluster analysis (Sokal & Sneath, 1963; SI Section S1). The former data yield two major size based clusters (SI Figure S1). The first comprises six 'large'-toothed species near the top of Figures 1 and 2 that, when summing all crown areas by sample, range between 1813 mm 2 for H. ergaster and 2483 total mm 2 for P. boisei. The second cluster, based on the UPGMA results and a "natural" break between Asian and African H. erectus (ergaster) evident in a bar graph of summed sample sizes (SI Figure S2), contains the seven "small"- For TSA analysis the correlation matrix of DM_RAW-scaled data was submitted to PCA. Un-rotated factor scores from the first three components with eigenvalues >1 were used to plot sample variation.
Component loadings, eigenvalues, individual variance, and total variance explained, 90.7%, are listed in Table 3. The loadings are also presented as bar graphs (SI Figures S4-S6) to visualize those of the greatest importance in driving variation on axes of the scatterplot ( Figure 3). By interpreting this output it can be determined how crown size is differentially apportioned or distributed along the maxillary and mandibular tooth rows, to compare variation in interspecific patterning.
As expected, the first component accounts for the most variation, 74.3%, which identifies the primary difference in relative intraspecific tooth size, and the longest branch in the subsequent phylogenies below). Like the first TSA hominin study (Irish et al., 2016), the Paranthropus pattern of megadont posterior and diminutive anterior teeth is evident. Except for the DM-scaled BL dimension of the UM1, 0.478, and scaled MD of the LP3, À0.217 (Table 3; SI Figure S4), strong loadings of 0.541-0.964 indicate relatively large cheek teeth in both isomeres; this influence pushed P. boisei and P. robustus toward the positive end of the PC1-axis (Figure 3). The first exception, the DM-scaled UM1 BL dimension with a lower loading of 0.541 for the scaled MD, marks the extreme M1 < M2 < M3 size progression in this genus. The second exception, the negative loading for DM-scaled LP3 MD, is more a function of the sectorial premolar in Pan near the opposite end of the axis. Otherwise, the latter species' location is driven by loadings of À0.786 to À0.978 for relatively large I1, I2, and C in both maxilla and mandible. The remaining samples plot between these two extremes. The stimulus for this distribution is apparent in Table 2, where except for UM1 and LM1, the Paranthropus taxa have the largest size-corrected posterior diameters, especially compared to Pan. The opposite holds for anterior teeth; Pan has the largest scaled diameters, and P. boisei and P. robustus the smallest.
The second component accounts for 11.6% of the variance, with samples separated by differences in the molar class. As implied on component 1, the maxillary and mandibular first molars are responsible. In Table 3 (and SI Figure S5 (Table 2); this stands in contrast to the australopith samples near the positive end of the PC2-axis. This patterning likely reflects size effects of the inhibitory cascade model discerned in hominins (Evans et al., 2016;and below); yet, a closer inspection of the loadings also suggests shape differences, where samples toward the positive end of the axis have larger scaled MD than BL diameters for M2s and M3s, unlike those at the negative end. A more obvious factor is the loading 0.846 for the DM-scaled MD of the LP3 (above); it drives Pan toward the positive end, and affects H. naledi, with the latter's slightly greater DM-scaled MD (0.99) dimension than BL (0.96) ( Table 2). Again, values are relative to those of the full dentition, as seen by less variation in the actual MD (9.0 mm) and BL (8.8 mm) LP3 dimensions in H. naledi (Table 1). Though sample size must be considered, also note the identical MD and BL dimen-      (0.390) are also involved (see Table 2).
F I G U R E 1 Line plot showing tooth-bytooth trends in absolute occlusal surface areas of the maxillary dentition in mm 2 by sample. Line colors and format apply loosely to genus (e.g., Figures 3, 6, SI Figure S2), but are primarily used to differentiate samples. See text for sample compositions F I G U R E 2 Line plot showing toothby-tooth trends in absolute occlusal surface areas of the mandibular dentition in mm 2 by sample. Line colors and format apply loosely to genus (e.g., Figures 3, 6, SI Figure S2)

| Bayesian phylogenetic inference
The strict-clock cladogram (SI Figure S7) is nearly identical to the abovementioned UPGMA dendrogram (SI Figure S3). This is unsurprising, given the model (as above) assumes constant evolution rates and age among taxa, to yield phylogenies based largely on overall similarity. Again, the Paranthropus sister taxa are an outgroup to two larger hominin clades.  Table S3), a final tree number of 1.0, to provide support for a single tree, and a marginal likelihood of À617.48.
An MCMC run of 2,000,000 generations was necessary for the basic relaxed-clock analysis to achieve a standard deviation of split frequencies <0.01 (SI Table S3  Finally, the dated relaxed-clock analysis using the fossilized birthdeath (FBD) branch lengths and gamma rates priors produced the fully resolved phylogram in Figure 4 (SI Table S3 (Kass & Raftery, 1995;Ronquist et al., 2020), those between this model and the other two (Mod 1 and Mod 2 ) are, respectively, 33.08 and 11.44. This dated relaxed-clock model is deemed the best in terms of the tree topology, in that it provides the most likely a posteriori hypothesis (Kass & Raftery, 1995).
T A B L E 3 Loadings, eigenvalues, and the variance explained for the first three principal components based on size corrected maxillary and mandibular measurements in the 13 samples

| The data
From a practical standpoint, the DM-RAW scaled data hold several advantages over discrete characters generally employed in hominin research (Strait & Grine, 2004;Smith & Grine, 2008;Berger et al., 2010;Dembo et al., 2015Dembo et al., , 2016Mongle et al., 2019; also see coding issues in Scotland et al., 2003). Because they are continuous, means based on multiple specimens are used instead of 'typical' characters to represent a species. With a range of standard statistical methods, missing data may also be estimated (e.g., Kenyhercz & Passalacqua, 2016). An absence of empty cells in the present data matrix undoubtedly is a factor in stronger node support than the prior Bayesian hominin trees (Dembo et al., 2015(Dembo et al., , 2016Mongle et al., 2019). Crown measurements are reasonably straightforward, and data comparatively unbiased among studies. Observer replicability is a factor like all osteometric recording, but subjective interpretation of characters is minimized. Digital 2D and 3D imaging methods are even available to enhance precision (Baab et al., 2012;Bernal, 2007;Braga, 2016 (Bernal, 2007;Hlusko et al., 2002).
Further, beyond simply reflecting relative size, the phylogenetic signal is seemingly sufficient to recover reliable evolutionary relationships (i.e., Figure 4). It could be argued that scaling of data in serially homologous teeth, which act as a unit, make them less subject to homoplasy than other morphological characters; contrarily, perhaps it is independence of the incisors from posterior teeth (Hlusko & Mahaney, 2009) that plays a role. Teeth are certainly less affected by any purported homoiology (Lycett & Collard, 2005;von Cramon-Taubadel, 2009). However, it is more likely related to the relatively few taxa and/or their particular evolutionary pathways. Whatever the explanation(s), the posterior probabilities indicate a single maximum a posteriori tree for each model. Felsenstein (2004: 299) (Wiens, 2001;O'Leary et al., 2013;Parins-Fukuchi, 2018a;and above). Second, character number is limited compared to the 'super-matrices' in some studies (Dembo et al., 2016;Mongle et al., 2019;Strait & Grine, 2004), but this is more of an issue with maximum parsimony (Wiens, 2004;Wiens & Hillis, 1996) than Bayesian inference (Felsenstein, 2004: EC.Europa.EU, 2016also Scotland et al., 2003). Finally, perhaps more problematic, the data are from a single anatomical region-the dental module. Characters from one region may provide a different phylogeny than another (Kay, 2015), so sampling across the skeleton is suggested (Dembo et al., 2016;Kay, 2015). In any case, results dictate the usefulness of data, as expanded upon in the next section.

| The analyses
As expected, TSA results (Table 3, Figure 3) emulate the prior study for the same species (Irish et al., 2016). This technique was designed for continuous odontometric data, so it yields expedient phenetic affinities. However, it is also a useful bridge to phylogenetic inference.
The structure of the first component is highly phylogenetic and identifies the deepest node on a resulting tree, that is, that separating one taxon from all others; the second component separates another taxon from the rest, the third another taxon, and so on P. D. Polly, personal communication, 2021). Component loadings (Table 3) quantify characters responsible for clade formation, while the 3D plot ( Figure 3) illustrates degrees of relationship interpretable using a neighborhood approach (Guttman, 1954;Kruskal & Wish, 1978). As such, the plot provides an indication of likely clades, and samples inclined to shift-so-called wildcard taxa (Nixon & Wheeler, 1992). For example, the potential for both Paranthropus species to form a separate clade with the strict-clock model (SI Figure S7), but nest with other australopiths under the dated relaxedclock model (Figure 4; also SI Figure  In brief, the first statistically uncorrelated variable, a.k.a. component, identifies the differences in anterior and posterior tooth size; the second, size in the molar class; and the third, shape variation in three teeth. So the 32 scaled data were reduced to three characters explaining >90% of the variance. The first two account for 86%, which is unsurprising given that phylogenetic correlation is expected to contribute to high proportional variance on the first few components; this correlation is needed when reconstructing a phylogeny P. D. Polly, personal communication, 2021).
Other patterning is discernable as well, for example, the first component additionally defines UM1 and LP3 shape variation (Table 3; SI Figure S4). The second component reflects influence of the inhibitory cascade (ICM) ( Evans et al., 2016;Kavanagh et al., 2007;Polly, 2007;Halliday & Goswami, 2013;Schroer & Wood, 2015), here in both isomeres. Postulating that the LM2 occlusal surface comprises a third of total molar area (Evans et al., 2016;Kavanagh et al, 2007;Schroer & Wood, 2015) (e.g., Figures 1 and 2), the ICM is considered valuable for phylogenetic reconstruction, with M1 < M2 < M3 plesiomorphic in hominins (Schroer & Wood, 2015). Yet, as above, crown areas are F I G U R E 4 Bayesian inference phylogram from dated relaxed-clock analysis based on gap-weighted, DMscaled data under an MKv model of H. naledi and the 12 comparative samples, with clade credibility values for internal nodes included. Scale in millions of years. This is the preferred topology for this study. See text for details only crude estimates (Garn et al., 1977;Hemphill, 2016a), so more information is represented in this component. Loadings for the DMscaled MD and BL dimensions suggest some shape differences for the M2, M3, and, noticeably, LP3 (Table 3; SI Figure S5). The same goes for the third component. Accounting for only 4.9% of the variance, its moderate loadings (Table 3; SI Figure S6) imply that H. naledi and H. habilis have relatively large UI1, UM1, and LI2 MD diameters. This configuration is confirmed for these and other teeth in both species, but only after comparing the 32 DM-scaled data between sample pairs and plotting quotients (SI Figure S9). This is a key factor in differentiating species (below). As a data reduction technique to visualize phenetic variation, the components account for most, but not all phylogenetic signals. Thus, gap-weighted scaled data in Table 2 were used for the phylogenetic analysis.
Overall tree topology does vary across Bayesian models (SI Figures S7 and S8, Figure 4), but as expanded upon below, uniformity in several clades indicates data-driven results. The same clades are also credible as reported in earlier studies, H. naledi notwithstanding (Berger et al., 2010;Dembo et al., 2015Dembo et al., , 2016Irish et al., , 2014Irish et al., , 2016Irish et al., , 2018Mongle et al., 2019;Smith & Grine, 2008;Strait et al., 1997;Strait & Grine, 2004). Those in the strict-clock cladogram (SI Figure S7) resemble the dendrogram clusters (SI Figure S3) for reasons stated. Node support is high, but a polytomy is present, both Paranthropus taxa form an outgroup, and H. ergaster and H. erectus occupy separate clades. Also, at issue is H. habilis (and H. naledi), in a clade with australopithecines rather than other Homo species. Relative to this, the basic relaxed-clock model is favored by Bayes factors. Its tree has greater node support and all australopiths comprise one clade (SI Figure S8). However, it remains unresolved, some taxa shifted, and the two main clades again divide H. ergaster and H. erectus. These issues might suggest insufficient phylogenetic information (Maddison, 1989;Nixon & Wheeler, 1992;Pol & Escapa, 2009;Purvis & Garland, 1993). Potential factors are model inadequacy, compression of information from the six character-state limit, and/or the data describe nothing beyond the phylogenetic signal of relative tooth size.
The final Bayesian analysis implies that gap-weighted data can recover a plausible phylogenetic hypothesis-under an adequate model. The dated relaxed-clock is strongly favored over the preceding two models. It yielded a fully-resolved tree, very strong node support, and a credible topology (Figure 4). Aside from H. naledi (i.e., Dembo et al., 2016), the calibrated phylogram is congruent with those from prior Bayesian inference (Dembo et al., 2015;Mongle et al., 2019), and the preferred trees from maximum parsimony (Mongle et al., 2019;Smith & Grine, 2008;Strait et al., 1997;Strait & Grine, 2004). With this model any compromised signal from low state number was likely improved by two factors. First, a relaxed-clock is recommended for comparing different species, with a prior distribution of evolutionary rates that vary among the taxa and branches (Felsenstein, 2004;Pybus, 2006). Second, the fossilized birth-death prior promotes unrestricted variation in branch length. To illustrate, an earlier dated relaxed-clock analysis using the default priors of equal rates and uniform branch lengths, yielded a phylogram with highly inaccurate branch variation relative to divergence times (SI Figure S10); clades are identical to those from the basic relaxed clock model, including the same two polytomies (SI Figure S8).
To test further the dated FBD model, Ardipithecus ramidus and A. anamensis were added after the above analyses were completed (not shown). Data are available for these species (details in SI Table S4), but they do not meet the second criterion for original sample selection-multiple MD and BL measurements for all teeth. However, by adding these older species (4.5 and 4.2 Ma FAD, respectively; Du et al., 2020) the branch length to the root was shortened; long branch lengths can unduly bias locations of the remaining taxa (Felsenstein, 2004). The resulting calibrated phylogram, presented in  Figure 2). Second, In H. habilis these teeth are generally reduced but, importantly, in scaled BL size more than MD to result in relatively long, narrow posterior teeth as described here. Additional teeth in the species show similar unequal reduction in scaled size (also PC3 in Figure 3). This pattern is retained in the overall smaller teeth of H. ergaster, but intensified in H. naledi, as detailed below. These trends may be gleaned from Table 2, but are succinctly illustrated by plotting scaled dimensions of the LM2 (Figure 6), that is, the central tooth of the molar ICM (also see plots of between-sample quotients in SI Figure S9, as discussed below). The three African Homo species all lie below the reference line of the LM2 graph, with a long DM-scaled MD dimension relative to BL. The remaining nine samples, on or above this line, have an LM2 ranging from relatively proportional to short and wide in shape.
Numerous diet-related hypotheses have been proposed to explain the postcanine megadontia of Paranthropus (overview in Wood & Patterson, 2020), and the opposite in Homo, though most of the latter consider extra oral processing of food rather than direct consumption (overview in Veneziano et al., 2019). But what explains the shape differences seen in more ancient African Homo versus non-African and recent Homo species-most notably between H. ergaster and H. erectus (before application of the calibrated FBD model)? Homo erectus is characterized by (re)expansion of scaled BL dimensions relative to MD (Table 2), as again visualized using the LM2 (Figure 6). Turning to the preferred calibrated phylogram (Figure 4; also 5), the discussion now focuses on H. naledi. It seems that a common supposition (Greshko, 2017), with minimal published support, is that the species is directly descended from African H. erectus (i.e., H. ergaster). Yet, in the original article, Berger et al. (2015) described only what was considered enough similarities with several  Hawks et al., 2017;Schroeder et al., 2017). Cranially, it is nothing like recent Homo-seen in its endocranial morphology (Holloway et al., 2018) and Australopithecus-like cranial capacity . In the postcrania, Homo-like traits include long tibiae and gracile fibulae, muscle attachments that suggest a striding gate, and modern features in the ankles, feet, and hands.
This mosaic of plesiomorphic, apomorphic, and apparent autapomorphic characters affected the prior attempt at Bayesian inference by Dembo et al. (2016) Despite the method employed, both remain sister taxa with strong node support of >75-90%. The latter value is from the dated relaxedclock phylogram in Figure 4; it reaches 94% in the expanded cali- also have relatively larger DM-scaled MD than BL diameters (SI Figure S9(a)). A pattern contrary to highly derived P. robustus and P. boisei would be expected in H. naledi, namely, larger anterior and smaller posterior teeth. This pattern is evident, but enough scaled dimensions are similar to Paranthropus, notably P. robustus (Table 2), that exceptions occur. The UM1s are equivalent in relative size across these species, as are DM-scaled MD dimensions of the UP3, UM2, LM1, and LM2, and DM-scaled BL equivalents of the UI1, LI1, LC, and LM3 (SI Figure S9(b,c)). Again, as with Pan, the H. naledi teeth are comparatively longer in DM-scaled MD dimensions than, in this instance, the buccolingually larger teeth of Paranthropus.
The apportionment of tooth size in H. naledi is most similar to that of H. habilis and, to a lesser degree, A. africanus and A. afarensis. Other than the general uniformity in DM-scaled anterior and posterior dentition size, all share a strong M1 < M2 < M3 gradient relative to the ICM (Schroer and Wood, 2015). As well, molars and several other teeth are of similar relative size among the species. In fact, in ICM proportions, Evans et al. (2016) found that H. habilis is more like the australopiths than other early Homo species, which would not be unexpected in a putatively basal member of the latter genus. This finding prompted these authors to cite a paper suggesting the taxon could be Australopithecus habilis (Wood & Collard, 1999). In any event, a number of scaled dimensions distinguish H. naledi from the australopithecines. The former has a noticeably smaller LC, but comparatively large scaled UI2, LI1, and LI2 MD dimensions -particularly in contrast to A. africanus (SI Figure S9(d-f)). Though less marked than Pan (see above), the scaled MD dimension of the H. naledi LP3 is also large versus the BL, as indicated by the associated loading in Table 3. Homo naledi can be differentiated from H. habilis on these bases to some extent, but their symplesiomorphies are more obvious. As indicated, they are the only two species with an LP3 that is not wider (BL) than it is long (MD) ( Table 1). In fact, teeth in both species are characterized by large DM-scaled MD dimensions relative to all australopiths (Tables 2 and 3, Figure 3 PC2 and PC3). Beyond the shared molar size gradient, H. naledi and H. habilis have long, narrow posterior teeth noted above (also SI Figure S9(f)), unlike derived recent Homo with mesiodistally reduced premolars and molars (SI Figure S9(g-l)).  (Davies et al., 2020:13196.9). The present results support this inference and others finding links to a common, and early, Homo condition. That said, the phylogenetic place of H. naledi is clearly a work in progress.
More remains are being recovered, but of greater importance to increase understanding is the discovery of specimens older than the age of the Dinaledi chamber; as implied by the above findings, they should be present. As/if more ancient H. naledi remains are found it should be possible, ideally in combination with characters from multiple anatomical regions, to discern just how long this potentially long surviving lineage survived, alongside or in the shadow of several successive hominin species, including H. sapiens.

| SUMMARY AND CONCLUSIONS
The DM_RAW correction from Jungers et al. (1995)  One aim was to provide further morphological characterization of the recently discovered South African hominin. The DM-scaled data were employed in an approach called TSA analysis to assess intersample phenetic affinities (Irish et al., 2016). Then, after quantitative coding, the 32 scaled characters were used in Bayesian inference. Yet, whether 3D ordination or phylogenetic tree, and irrespective of Bayesian priors, the comparability in several key clades implies datadriven results. The results identify effects of presumed evolutionary trends on relative tooth size across species, beyond that previously reported. Then, using the relaxed-clock model to permit variation in evolutionary rates and, importantly, gamma rates and fossilized birthdeath priors for unrestricted branch length variation, the final dated analysis yielded a tree congruent with prior phylogenetic studies.
Under an appropriate Bayesian model the implication is that, beyond reflecting information about relative tooth size, these data can recover plausible evolutionary relationships. Finally, though limited to one anatomical region and lacking proof of independence, the DM-scaled data, with an appropriate model, warrant consideration for future hominin phylogenetic research. This approach would preferably entail combining them with other quantitative and/or more traditional discrete characters from multiple anatomical regions to yield a larger character matrix. Moreover, DMscaled data are candidates for use in their original continuous form (Table 2). Recent advances in probabilistic phylogenetics, notably Bayesian inference, allow use of models, for example, Brownian motion, capable of approximating evolution of continuous morphological characters (Felsenstein, 2004;Parins-Fukuchi, 2018a, 2018b.
Beyond objective data recording as mentioned, continuous data do not require the ordering of states, and should retain phylogenetic information at much higher evolutionary rates than coded characters-qualitative or quantitative-because they do not necessitate compression into a limited number of states (Parins-Fukuchi, 2018a, 2018b. Initial results using this approach with the current dataset (

CONFLICT OF INTEREST
This statement is to certify that the authors have no conflict of interest to declare.

DATA AVAILABILITY STATEMENT
The odontometric data used in the analyses are available in the text (Tables 1-2) and in the supporting information file.