Using national register data to estimate the heritability of periodontitis

Aim: To identify whether periodontal traits derived from electronic dental records are biologically informative and heritable. Materials and methods: The study included 11,974 adult twins (aged 30–92 years) in the Swedish Twin Registry. Periodontal records from dental examinations were retrieved from a national register and used to derive continuous measures of periodontal health.


| INTRODUCTION
Gum inflammation, a common reversible condition, is caused by tooth-associated bacteria which initiate an inflammatory response leading to host defence cell activation. The process may be limited to the gingiva but some people develop an irreversible destruction of the tooth-supporting connective and bone tissue (periodontitis) (Pihlstrom et al., 2005). Despite improved dental care, periodontitis continues to affect a significant proportion of adults and approximately 10% have a severe form (Kassebaum et al., 2014). In some populations, including Sweden and Scotland, periodontitis is a main cause of tooth loss from middle age (Chestnutt et al., 2000; Haworth et al., 2018). The onset, progression and ultimate outcomes of the disease are influenced by the individual's genetic risk alongside other factors including tooth microbiota, immune biology and behaviour, but current knowledge is insufficient to identify people at risk for severe disease.
Heritable contributions to periodontal disease have been explored in animal studies and in twin, family and genome-wide association (GWA) studies in humans. Two recent meta-analyses support a heritable contribution to periodontitis (Nibali et al., 2019; Nibali et al., 2020), albeit with considerable variation between and within studies. Heritable factors explained around 15% of periodontitis variance in family studies, 38% in twin studies and 7% in GWA studies (Nibali et al., 2019) and 43% in experimental animal studies (Nibali et al., 2020). Thirteen twin studies were included, with eight studies using clinically assessed periodontal status in up to 132 twin pairs and five studies using self-reported periodontal status in 2,122 to 10,578 twin pairs (Nibali et al., 2019). The two largest studies involved twins in the Swedish Twin Registry (STR) (Mucci et al., 2005; Mucci et al., 2009) with self-reported proxies for periodontal disease and dichotomous disease classification. To date, there is no large-scale twin study employing clinically assessed periodontal data.
Genetic information can theoretically be used in clinical practice for risk stratification and precise diagnosis; however, this requires knowledge of the specific genetic loci involved and few loci have been identified in GWA studies to date (Shungin et al., 2019; Morelli et al., 2020). This may be because GWA studies need to choose simple phenotypes which can be collected in large studies or harmonized across different studies. While dichotomous disease classification is practical in large studies, results so far suggest this may not be sufficiently biologically informative (Shungin et al., 2019; Morelli et al., 2020) and there is therefore a need to identify other measures which are both practical and biologically meaningful.
While it is difficult to estimate the variation in a trait due to any single genetic variant without committing to GWA studies, understanding the aggregate variance explained by all additive genetic variation may help guide the choice of traits which are most likely to have detectable single-variant association signals.
Quantitative measures may have greater power and capture disease severity which is missed in dichotomous classification.
Principal components (PCs) of periodontal traits (Offenbacher et al., 2016) appear to have stronger genetic association than dichotomous traits but lack clear clinical interpretation. As an alternative approach, number of deepened probing pockets, a proxy for tooth attachment loss, might provide a quantitative measure that captures both presence and severity of disease and has clinical interpretation. However, there is no evidence whether this trait is sufficiently heritable for GWAS. A first aim of this paper was therefore to test heritability of quantitative traits for periodontitis versus dichotomous classification in a large twin study with clinically assessed phenotype data.
A different approach involves refining qualitative measures by including multiple sources of information or by including more groups than a conventional dichotomous measure. For example, the 2017 periodontitis classification suggests a comprehensive classification of multiple disease stages. This requires measurement of markers for both current and previous disease experience including clinical attachment loss, amount and percentage of bone loss, probing depth, angular bone defects and furcation involvement, tooth mobility and previous tooth loss due to periodontitis (Caton et al., 2018). Others suggest inclusion of additional host traits such as inflammatory biomarkers alongside clinical traits. While these protocols are attractive for precise diagnosis in the single patient, there are practical and theoretical problems for genetic epidemiology. At a practical level, complex and detailed examination protocols are difficult to implement in very large studies and some information (e.g. on the reason for tooth loss) may not be present. Theoretically, the inclusion of host traits which are partially heritable in phenotypic definition, including inflammatory biomarkers, can complicate inference. However, if refined subgroups could be identified from data sources in existing dental charting, this would overcome both problems. Therefore, a second aim of this project was to empirically identify cases based on latent structure in the dental charting rather than on pre-defined criteria and evaluate whether these measures are any more heritable than dichotomous classification.
Principal findings: In this cohort, periodontal traits were moderately heritable including rate of change in number of severe periodontal pockets. Around 10% of the study group had a severe form of periodontitis with high risk for future tooth loss.
Practical implications: Periodontitis is a moderately heritable disease, and genetic risk factors appear relevant for disease severity and progression as well as onset.

| Study participants and exclusion criteria
Twins in the STR (https://ki.se/en/research/the-swedish-twin-registry) aged 30 years or older for whom information on periodontal status could be retrieved from the national quality register on caries and periodontitis (SKaPa, www.skapareg.se) were eligible (n = 18,653). Twins were excluded if (a) the zygosity was unknown, (b) data were unavailable for both twins in a pair, (c) either twin in a pair had fewer than four remaining natural teeth or (d) the dental examination dates fell more than ±1 year apart in the twin pair. Where multiple dental visits were available for one or both twins, dental visits which were closest together in time were chosen for the primary analysis. The final cohort contained 11,974 twins (Figure 1).
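The exclusion criteria and visit selection described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `TwinRecord` fields, the function names and the 365-day approximation of "±1 year" are assumptions made for illustration, and criterion (b) is read here as requiring periodontal data for both twins in a pair.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class TwinRecord:
    """Hypothetical per-twin record; field names are illustrative only."""
    zygosity: Optional[str]    # e.g. "MZ" or "DZ"; None if unknown
    exam_date: Optional[date]  # None if no periodontal record was retrieved
    natural_teeth: int         # remaining natural teeth at that examination


def pair_is_eligible(a: TwinRecord, b: TwinRecord, max_gap_days: int = 365) -> bool:
    """Apply exclusion criteria (a)-(d) to one twin pair."""
    # (a) zygosity must be known for the pair
    if a.zygosity is None or b.zygosity is None:
        return False
    # (b) periodontal data must be available for both twins (assumed reading)
    if a.exam_date is None or b.exam_date is None:
        return False
    # (c) each twin needs at least four remaining natural teeth
    if a.natural_teeth < 4 or b.natural_teeth < 4:
        return False
    # (d) examination dates must lie within about one year of each other
    return abs((a.exam_date - b.exam_date).days) <= max_gap_days


def closest_visits(dates_a: List[date], dates_b: List[date]) -> Tuple[date, date]:
    """Pick the pair of visits (one per twin) that are closest together in time."""
    return min(
        ((da, db) for da in dates_a for db in dates_b),
        key=lambda pair: abs((pair[0] - pair[1]).days),
    )
```

In this sketch, `closest_visits` would be applied first to choose one examination per twin, and `pair_is_eligible` would then be applied to the chosen records.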
The STR received ethical approval from the Regional Ethical

| Zygosity ascertainment
Zygosity was determined by a validated test using 46 single-nucleotide polymorphism markers (Magnusson et al., 2013), by an intra-pair similarity algorithm, or by the twins being of opposite sex.
Approximately 12% of all pairs had their zygosity determined by DNA-based tests.

| Retrieval and quality control of periodontal records
Dental examinations in Sweden include probing of periodontal pocket depth and results are stored in electronic records. SKaPa started to compile data in 2008 and since 2010 has compiled and ensured quality control over data from public dental health and some private clinics in Sweden. Information from records where pocket depth had been recorded at either 4 or 6 (mesio-buccal [mb], mesio- and mid-lingual [l]) tooth sites was retrieved from SKaPa using unique person identification numbers. Examinations with two sites recorded per tooth or with the Community Periodontal Index of Treatment Need (CPITN) scores were excluded. Records meeting the inclusion criteria were retrieved for all visits between 2010 and 2019. Probing pocket depth, here used as a proxy for loss of periodontal supportive tissues and periodontal disease stage, was graded as no (<4 mm), mild (4 mm), moderate (5 mm) and severe (≥6 mm) disease. For records where four sites (mesial, distal, buccal and lingual) were measured, the mesial measure was used to impute the missing mb or ml values and the distal measure to impute db or dl.
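As an illustration of the grading thresholds and the four-to-six site imputation described above, a minimal Python sketch follows. The function names and the site keys `b` (mid-buccal) and `l` (mid-lingual) are assumptions for illustration, and depths are assumed to be recorded in whole millimetres.

```python
from typing import Dict


def grade_pocket(depth_mm: float) -> str:
    """Grade one probing pocket depth using the thresholds stated in the text."""
    if depth_mm < 4:
        return "no"
    if depth_mm < 5:   # 4 mm when depths are whole millimetres
        return "mild"
    if depth_mm < 6:   # 5 mm when depths are whole millimetres
        return "moderate"
    return "severe"    # 6 mm or deeper


def expand_four_sites(mesial: float, distal: float,
                      buccal: float, lingual: float) -> Dict[str, float]:
    """Expand a four-site record to six sites.

    The mesial value fills both mesial sites (mb, ml) and the distal value
    fills both distal sites (db, dl); buccal and lingual are kept as the
    mid-surface sites.
    """
    return {
        "mb": mesial, "b": buccal, "db": distal,
        "ml": mesial, "l": lingual, "dl": distal,
    }
```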
For teeth missing due to hypodontia, orthodontic treatment, trauma, caries or failed endodontic treatment, periodontal pockets were imputed as <4 mm. Periodontal pockets of teeth missing due to periodontitis or other reasons were imputed using an age- and tooth-based approach. Values were assigned taking into account tooth type (that is, missing premolars are a common result of orthodontic treatment and not periodontal disease), the number of 5 or 6 mm pockets found for the corresponding tooth elsewhere in the mouth and, where no corresponding tooth was available, the most commonly reported pocket depth at that tooth type in other patients of the same age group.

| Derivation of quantitative periodontal traits
For each twin and for each visit, three quantitative scores were created representing (a) the sum of tooth surfaces with 5 mm pocketing or greater termed "moderate and severe surface sum", (b) the sum of tooth surfaces with 6 mm pocketing or greater termed "severe surface sum" and (c) the number of tooth surfaces which were missing at the time of examination termed "missing surface sum". For twins with two or more visits spanning at least 1 year (n = 11,171; Figure 1), rates of change in "moderate and severe surface sum" and "severe surface sum" over time were estimated using a random intercept, random gradient linear mixed model (detailed description in Appendix S1).
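The three cross-sectional scores can be sketched in Python as below. This is a simplified illustration rather than the study's code: the chart layout (a mapping from tooth number to six depths, or `None` for a missing tooth) is hypothetical, six sites per tooth is assumed, and the imputation of pocket depths for missing teeth and the mixed-model rate of change (Appendix S1) are not reproduced.

```python
from typing import Dict, List, Optional

SITES_PER_TOOTH = 6  # assumption: six probing sites per tooth


def surface_sums(chart: Dict[int, Optional[List[float]]]) -> Dict[str, int]:
    """Compute the three quantitative scores for one examination.

    `chart` maps a tooth number to its six pocket depths in mm,
    or to None when the tooth is missing.
    """
    moderate_and_severe = 0
    severe = 0
    missing = 0
    for depths in chart.values():
        if depths is None:
            # every surface of a missing tooth counts as a missing surface
            missing += SITES_PER_TOOTH
            continue
        moderate_and_severe += sum(1 for d in depths if d >= 5)
        severe += sum(1 for d in depths if d >= 6)
    return {
        "moderate_and_severe_surface_sum": moderate_and_severe,
        "severe_surface_sum": severe,
        "missing_surface_sum": missing,
    }
```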

| Derivation of qualitative periodontal traits
Qualitative traits were defined using a conventional system (with a prior definition) and a latent class approach to define categories with empirical support in the data. The dichotomous classification of periodontitis was scored as disease-free if all pockets were ≤4 mm and disease-affected if at least one pocket was ≥5 mm. Latent class validation was then performed using an observational, longitudinal analysis of tooth loss, with the rationale that class allocation at the first visit should predict tooth loss at subsequent visits if the latent classes captured biologically informative differences in periodontal health. The analysis was restricted to a subset of participants who had dental data on at least two occasions one or more years apart. Latent classes were re-defined in a model based only on data from the first dental visit available, and hazard and incidence risk for subsequent tooth loss during follow-up were estimated (Appendix S2).
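The dichotomous rule stated above is simple enough to express directly. The sketch below is illustrative only; the function name is hypothetical, and depths are assumed to be whole millimetres so that no value falls between 4 and 5 mm. The latent class model and the tooth-loss validation (Appendix S2) are not sketched here.

```python
from typing import Iterable


def is_disease_affected(depths_mm: Iterable[float]) -> bool:
    """Dichotomous periodontitis classification for one examination.

    Returns True (disease-affected) if at least one pocket is 5 mm or
    deeper, and False (disease-free) if all pockets are 4 mm or shallower.
    """
    return any(d >= 5 for d in depths_mm)
```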

| Estimating the heritability of periodontal phenotypes
For each periodontal trait, quantitative genetic models were fitted with data from monozygotic and dizygotic twin pairs and same- and opposite-sex twin pairs (Kohler et al., 2011). These models decompose variation into three variance components, namely variation attributable to additive genetic effects (termed variance component A), shared environmental factors (termed variance component C) and unexplained variation due to a combination of unique environmental factors and error in the model (termed variance component E). These models assume that periodontal traits are partially correlated in twin pairs due to components A and C, but can distinguish between these two components since monozygotic twin pairs share ~100% of their nuclear DNA while dizygotic twin pairs share on average ~50% of their nuclear DNA.
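The logic of separating A from C can be illustrated with Falconer's classical approximation, which is a deliberate simplification and not the maximum-likelihood model fitting used in the study. Under the ACE assumptions, the expected standardized twin correlations are r_MZ = A + C and r_DZ = 0.5A + C, so the three components can be solved from the two correlations. The function name below is hypothetical.

```python
from typing import Tuple


def falconer_ace(r_mz: float, r_dz: float) -> Tuple[float, float, float]:
    """Approximate standardized A, C and E from twin-pair correlations.

    Solves r_MZ = A + C and r_DZ = 0.5 * A + C, with A + C + E = 1.
    This closed-form shortcut ignores covariates, sex differences and
    sampling error, all of which a fitted structural model would handle.
    """
    a = 2.0 * (r_mz - r_dz)   # additive genetic variance
    c = 2.0 * r_dz - r_mz     # shared environmental variance
    e = 1.0 - r_mz            # unique environment plus measurement error
    return a, c, e
```

With made-up correlations of 0.6 in monozygotic and 0.4 in dizygotic pairs (values chosen for illustration, not taken from the study), this gives A of about 0.4, C of about 0.2 and E of about 0.4.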
Quantitative traits (moderate and severe surface sum, severe surface sum and rate of change in these two sums, missing surface sum) were modelled on the observed scale. Traits were regressed on covariates, and residuals were transformed to z-scores. Qualitative traits were modelled using a liability threshold model with one threshold (for dichotomous traits) or multiple thresholds (for ordinal categorical traits). These models assume an underlying continuous distribution of disease liability which presents as an observed category when a threshold level of liability is reached. All models included adjustment for age, age squared, sex and examination year and were fitted using OpenMx (version 2.17.2) implemented in R (version 3.6.3) (Neale et al., 2016). All estimates of A, C and E are standardized to the total phenotypic variance and presented as per-

| Demographics
In total, 27,107 twins were potentially eligible for inclusion in the study; 11,974 twins met all inclusion criteria and were included in the final sample (Figure 1). There were a similar number of monozygotic, dizygotic same-sex and dizygotic opposite-sex pairs, but a slightly higher proportion of female than male participants (Table 1).

| Latent classes
Latent class analysis was undertaken using the 11,974 twins, and a three-class model was chosen which had confident class allocations.
Underlying metrics are found in Table S1 and posterior probabilities are plotted in Figure S1.
The characteristics of participants in each class are shown in Table 2 and the pattern of periodontitis features in Figure 2. Briefly, Class I was the largest group and was formed of people with few signs of periodontitis. Class II was the next largest group and included people with localized features of periodontitis but low levels of tooth loss. Class III included under 10% of the population with more severe and generalized signs of periodontitis and higher levels of tooth loss. The proportion of female participants was similar in the three classes, but those in classes II and III were older than in Class I.

| Latent class longitudinal validation
Longitudinal validation of the three latent classes was performed by re-classifying the participants based on their first visit and comparing rates of tooth loss at subsequent appointments. As anticipated, a slightly higher proportion of participants were classified in Class I than in the main analysis, given the younger age and generally better periodontal health at the first visit compared to subsequent visits (
TABLE 1 Demographic information of the study sample (n = 11,974 twins) included in the principal analysis
Birth year, median [range] 1976 [1926, 1989] 1961 [1919, 1989] 1945 [1919, 1989] Examination

FIGURE 2 Prevalence of features of periodontitis (proportion of pockets ≥5 mm) at the six estimated tooth surfaces in the three latent classes at the index dental examination
participants in Class I (Table 3). Given these findings, latent class membership was treated as an ordinal trait in heritability analysis, where classes II and III are assumed to represent different severity levels of a similar underlying trait.

| Heritability of periodontal phenotypes
The heritability of quantitative traits for number of periodontitis-affected sites was estimated at approximately 40%, with similar results for moderate/severe and severe phenotypes. For the severe phenotype, the estimated heritability of rate of change (that is, the gradient random effect from linear mixed modelling) was slightly higher than in the cross-sectional analysis, with confidence intervals excluding the cross-sectional estimate; for the moderate/severe phenotype, the two estimates were similar. Heritability of number of missing teeth was estimated at around 48% (Table 4).
The heritability of periodontal latent class membership was estimated at around 45%, with a slightly higher point estimate than a crude definition from presence or absence of a pocket ≥5 mm, but with overlapping confidence intervals. This approach is hampered by a lack of defined criteria for disease classification and does not take disease severity (extent and progression) into account. We hypothesized that an empirical classification based on a latent class model using probing pocket measures would produce more biologically informative categories with higher heritability than dichotomous classification. Although the latent classes appeared to perform better than a crude dichotomous classification in observational analysis of incident tooth loss, the estimated heritability was only slightly higher than the dichotomous classification.

| DISCUSSION
Given that these latent classes introduce additional complexity into the analysis and interpretation of data (that is, in extrapolating findings across populations with differing latent structures), it is unclear whether this approach would be of benefit in future GWA studies.
Although the value of a latent class approach in GWA studies is not clear, it may be useful in other research contexts. The latent class analysis identified a group of just under 10% of the study population with signs of severe periodontitis and high risk for tooth loss during longitudinal follow-up. This is in keeping with epidemiological data (Kassebaum et al., 2014) and a recent clinical study in Sweden where around 60% had no, 30% moderate and 10% severe periodontitis based on probing pockets and attachment levels from clinical examination and radiographs (Jönsson et al., 2020) and other studies from Sweden (Wahlin et al., 2018). Moreover, the group response pattern parallels the findings that people with different periodontal profiles at baseline have different rates of disease progression and tooth loss (Morelli et al., 2018). This potentially supports a role for empirically derived risk prediction rules in clinical precision diagnosis and risk stratification and potentially in risk screening for adverse health outcomes, such as cardiovascular disease (Holmlund et al., 2017).
We hypothesized that quantitative traits taking disease severity into account would be more heritable than a crude dichotomous measure (Diehl et al., 2005). There was partial support for this hypothesis, as the highest heritability estimate was seen for rate of change in number of affected pockets, but cross-sectional analyses did not yield higher heritability estimates than dichotomous classification. This may be because rate of change was more robust to some forms of measurement error in the present setting. We performed imputation using information on the reason for tooth loss to identify teeth which were unlikely to have signs of periodontal disease, and age and tooth imputations based on the non-missing tooth surfaces to assign values for other missing teeth. Errors introduced by this imputation are not expected to explain changes in periodontal status over time, which may explain the improved results in models using longitudinal rate of change over cross-sectional data. Thus, for future GWA studies in cohorts where genetic data have already been obtained, measures of disease progression may be preferred to cross-sectional quantitative or qualitative assignments and have the additional advantages that they can be obtained from pre-existing dental records at low cost and have greater statistical power for the same sample size than dichotomous measures (Altman & Royston, 2006). Other approaches to minimize measurement error involve using additional clinical data (e.g. with radiographic assessment of alveolar bone levels) or combining clinical data with biomarkers. These approaches are appealing but such data are not routinely available from clinical records and are expensive and demanding to collect at scale. In addition, biomarkers are themselves heritable, which complicates inference and may exclude use of GWA results in downstream analyses (Figure 3).
Given the need for very large sample sizes in genetic epidemiology, it seems likely that longitudinal progression traits provide a suitable trade-off between heritability and practicality and are therefore suitable for future GWA studies. We note however that these traits can only be derived where dental records can be linked, and in large studies where this is not possible, there may still be value in the use of less refined proxy phenotypes.
Compared with the literature, the narrow-sense heritability estimates from the present study are slightly higher than the meta-analysis estimate of broad-sense heritability of periodontitis of 0.38 in twin studies reported by Nibali et al. (2019). Results are similar to the narrow-sense heritability estimate of 0.42 reported in two previous studies in STR twins (Mucci et al., 2005; Mucci et al., 2009).
FIGURE 3 Theoretical consideration for including a heritable biomarker in phenotypic definition. Case a: the biomarker is a measure of periodontitis and host genotype cannot affect the biomarker except through periodontitis. Inclusion of the biomarker in phenotypic definition reduces measurement error in estimation of genetic effects on periodontitis. Case b: the biomarker is affected both by periodontitis and by host genotype. The estimate of genetic effect on periodontitis is biased. Case c: genotypes with no real effect on periodontitis can still be associated through the biasing pathway provided the biomarker is partially heritable
The major strengths of the study include the sample size, which is considerably larger than previous twin studies with clinical periodontal data and therefore allows more precise estimates of heritability. The sample is also considered representative of Swedish adults who attend dental primary care services, as the twins were not recruited on the basis of their dental status. The inclusion of longitudinal data for most participants is considered a strength as few previous studies have modelled genetic effects on longitudinal change in periodontal status.
In conclusion, the study found that a little over 40% of variation in periodontitis traits is due to additive genetic factors in Swedish adults.
The study demonstrates the potential value of electronic clinical records in genetic epidemiology, as moderately heritable periodontal traits can be generated from clinical records alone without the use of biomarker or other adjuvant data. Quantitative traits derived from clinical records are an attractive target for future GWA studies aiming to identify the specific genetic risk loci implicated in this important disease.

ACKNOWLEDGEMENTS
The participating twins and the staff at the Swedish Twin Registry and the SKaPa quality register are gratefully acknowledged.
had any influence on the study design, data collection, analysis, or interpretation, or the writing of the manuscript.