Effects of risk factors on periodontal disease defined by calibrated community periodontal index and loss of attachment scores

Objectives We evaluated whether and how the effects of risk factors on periodontal disease (PD) were modified by measurement errors in the community periodontal index (CPI) and loss of attachment (LA) in a community‐based survey. Methods A pilot validation study was performed to estimate the false‐negative and false‐positive rates for both CPI and LA in 31 subjects from different regions, using measurements from 12 well‐trained dentists and a senior periodontist as the gold standard. Afterward, a Taiwanese nationwide survey enrolling 3,860 participants was used to estimate the effect of each risk factor on PD, calibrated with the sensitivity and specificity of the two indices. Results The ratio of sensitivity to the false‐positive rate for CPI ranged widely from 1.12 to 7.71, indicating regional variation in both errors. The calibrated adjusted odds ratio for smoking vs non‐smoking was higher than the uncalibrated odds ratio for PD defined by CPI (2.75 (2.01, 3.77) vs 2.02 (1.63, 2.52)) and LA (3.85 (2.44, 6.13) vs 1.93 (1.47, 2.54)) scores. Similar underestimation was noted for other risk factors. Conclusion The effects of risk factors on PD measured using CPI and LA in a large population‐based survey were underestimated without correcting for measurement errors.

reflect disease severity. More importantly, CPI is recommended by the WHO for use as an indicator of early PD in individuals 35 years and older as part of large community-based screening programs.
In addition to its multistage and progressive properties, PD is characterized as a multifactorial chronic disease. Elucidating the risk factors responsible for PD is therefore important for strengthening its primary prevention (Petersen & Ogawa, 2005). Numerous previous studies have identified a constellation of causative risk factors, including male gender, old age, smoking, education, and comorbidities such as type 2 diabetes and obesity (Haber et al., 1993; Bergström & Preber, 1994; Kinane & Chestnutt, 2000; Lai et al., 2007; Wang et al., 2009; Lai et al., 2015).
While these epidemiological studies have reported the effect sizes (often reported by odds ratio or relative risk) of risk factors associated with PD, it is argued that measurement errors (false negative and false positive) in CPI or LA may affect these estimations. The effect of measurement errors on the evaluation of risk factors for PD measured by CPI or LA may not be a serious problem in clinical studies because if PD is severe, its diagnosis is unlikely to be affected by measurement errors. However, the misclassification of PD, particularly when examined by even well-trained general dentists, may result in either an underestimation or overestimation of the effect sizes for the risk factors of interest in the setting of a community survey in which many of the participants may be presymptomatic PD cases.
At this time, few studies have been conducted to address this issue. A previous large-scale community-based survey in Taiwan, targeting residents aged 18 years and older, used CPI and LA, as measured by a group of trained general dentists, to assess the prevalence of PD and a collection of conventional risk factors. These characteristics make it a suitable candidate for assessing the impact of measurement errors on effect size. The aim of this study was to apply a two-stage design. The first stage was a pilot validation study conducted to estimate the rates of false-negative and false-positive results for the CPI and LA measured by the examiners. The second stage used these estimates of measurement error to correct the effect sizes of the risk factors associated with PD, based on the calibrated CPI and LA measured by the same examiners in a large-scale community-based survey, in order to assess whether measurement errors led to underestimation or overestimation of the effect size of each risk factor.

| Study design
In this study, we used a two-stage study design to estimate sensitivity and specificity. In the first stage, we conducted a validation study to calibrate the discrepancy in CPI and LA measurement between the gold standard (a senior periodontist, Lai H) and dentists after professional training in PD. The estimated sensitivity and specificity from this pilot study were used to calibrate the association between risk factors and PD obtained from the main study, a nationwide survey.
The periodontal examination was measured by CPI (WHO, 1997). The examination consisted of CPI scores in the following five categories: healthy, gingival bleeding, calculus, shallow pockets of 4-5 mm, and deep pockets of 6 mm or deeper (Ainamo et al., 1982). All participants provided informed consent after receiving sufficient information. This study was approved by the Joint Institutional Review Board of Taipei Medical University (TMUJIRB No. 201207011).

| Validation study
General dentists who participated in a nationwide survey of periodontal disease were invited to participate in a validation study comparing their measurements of CPI and LA with those taken by a senior dentist specializing in periodontology (gold standard). Two trained dentists were selected from each area participating in the nationwide survey (six areas: two northern, one central, two southern, and one eastern area of Taiwan). Together with the gold standard dentist, they examined a total of 31 subjects; these data were included in the analysis of the intra- and inter-rater reliability of the CPI and LA measurements. We excluded subjects who had undergone scaling or treatment for periodontitis in the 2 months before calibration. Each subject was examined by both the trained general dentists and the gold standard dentist. All the teeth of each subject were examined by the different raters at six conventional sites: mesiobuccal, mid-buccal, disto-buccal, mesiolingual, mid-lingual, and disto-lingual. At each site, CPI and LA scores were measured and recorded. The highest score of all the sites in each sextant was treated as the representative score of that sextant.
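The validation step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the example scores, and the use of a normal-approximation (Wald) interval as a stand-in for the binomial confidence interval are all assumptions made here for demonstration.

```python
from math import sqrt

def sextant_score(site_scores):
    """Representative sextant score: the highest score among its sites."""
    return max(site_scores)

def sens_spec(examiner, gold, cutoff=3):
    """Sensitivity and specificity of an examiner against the gold standard.

    Scores are dichotomized as PD-positive when score >= cutoff
    (CPI >= 3 is the cutoff used for PD in the text).
    """
    tp = fn = tn = fp = 0
    for e, g in zip(examiner, gold):
        e_pos, g_pos = e >= cutoff, g >= cutoff
        if g_pos and e_pos:
            tp += 1
        elif g_pos:
            fn += 1
        elif e_pos:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

def wald_ci(p, n, z=1.96):
    """Approximate 95% interval for a proportion (assumed method, see lead-in)."""
    half = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)

# Hypothetical sextant-level CPI scores, for illustration only.
gold = [3, 4, 0, 2, 3, 1, 0, 4]
examiner = [3, 2, 0, 3, 4, 1, 0, 2]

sensitivity, specificity = sens_spec(examiner, gold)
print(sextant_score([0, 1, 2, 3, 1, 0]), sensitivity, specificity)
```

With only a handful of sextants per examiner, the interval is very wide, which is one reason pooled or hierarchical estimates are attractive in a validation study of this size.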

| Nationwide survey with calibration
The main objective of the nationwide survey, commissioned by the Health Promotion Administration, Ministry of Health and Welfare, Taiwan, in 2008, was to explore the prevalence and severity of periodontal disease and its association with oral hygiene, lifestyle, and other risk factors in adults aged 18 years and older. The details of this study have been described in full elsewhere (Lai et al., 2015). The nationwide survey invited 13 dentists and one gold standard dentist to measure the sextant-level CPI and LA for 4,601 subjects from different regions.
As one dentist did not complete the calibration stage, we excluded data related to that dentist, leaving twelve dentists and one gold standard dentist and their measurements on 3,860 subjects (17,244 sextants) for inclusion in the current study. These data were used to calculate the uncalibrated and calibrated ORs for the association between the risk factors and PD with simultaneous consideration of the correlated properties of sextant-level data from the same subject or the same region using the following Bayesian hierarchical random-effect model.

| Risk factors
For participants in the main survey, we designed a structured questionnaire to collect information on a constellation of variables, including demographic variables; anthropometric measurements, such as height and weight; lifestyle factors, including cigarette smoking, alcohol consumption, and betel quid chewing; and personal diseases, such as type 2 diabetes mellitus. The questionnaire was administered by public health nurses between 2007 and 2008 in the nationwide survey.

| Definition of underestimation and overestimation
In epidemiological studies, biases due to these measurement errors are classified into differential and non-differential. If the effect size is away from the null hypothesis (no association expressed by OR = 1), it is a differential misclassification and often results in overestimation of effect size if uncorrected. On the other hand, if the effect size is toward the null hypothesis, it is called non-differential misclassification and often results in underestimation if uncorrected.
We conducted a validation study to assess the possibility of measurement errors. The supplementary material provides an example demonstrating how the effect size of smoking on the odds of PD measured by CPI changed substantially after correcting for the measurement errors. These results were classified as non-differential misclassification because the effect size of smoking had been underestimated: the odds ratio rose from 2 without calibration to 4.43 with calibration. This Bayesian hierarchical model may be further applied to large-scale epidemiological surveys to calibrate the odds ratios of other risk factors.
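The direction of this correction can be illustrated with a simple two-group calculation. The sketch below is not the Bayesian hierarchical model used in the study; it applies the standard back-correction of an observed proportion for known sensitivity and specificity, assumed equal in both exposure groups. All numbers (sensitivity 0.6, specificity 0.9, and the observed PD proportions) are hypothetical and chosen only to show the mechanism.

```python
def corrected_prevalence(p_obs, sensitivity, specificity):
    """Back-correct an observed proportion for outcome misclassification.

    Uses p_obs = Se * p + (1 - Sp) * (1 - p), solved for the true proportion p.
    """
    return (p_obs + specificity - 1) / (sensitivity + specificity - 1)

def odds_ratio(p_exposed, p_unexposed):
    """Odds ratio comparing two outcome proportions."""
    return (p_exposed / (1 - p_exposed)) / (p_unexposed / (1 - p_unexposed))

# Hypothetical inputs, not taken from the study.
se, sp = 0.6, 0.9
p_smokers_obs, p_nonsmokers_obs = 0.40, 0.25

or_uncalibrated = odds_ratio(p_smokers_obs, p_nonsmokers_obs)
or_calibrated = odds_ratio(
    corrected_prevalence(p_smokers_obs, se, sp),
    corrected_prevalence(p_nonsmokers_obs, se, sp),
)
print(round(or_uncalibrated, 2), round(or_calibrated, 2))
```

With these inputs the odds ratio moves from about 2.0 to about 3.5, that is, away from the null once the misclassification is undone, which is the pattern the text describes as underestimation.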

| Statistical analyses
To measure the impact of the risk factors on PD, the attributable proportion (AR) and population attributable proportion (PAR) were used in the analysis. AR was defined as the proportion of disease in the exposed group that could be attributed to a given risk factor and was derived from the odds ratio alone. PAR is the proportional reduction in population disease that would occur if exposure to a risk factor were reduced to an alternative exposure scenario; it was derived from the odds ratio and the proportion of the population exposed.
In the calibration study, we reported the sensitivity and specificity of PD status measured by the participating trained dentists against PD status measured by the gold standard dentist, with confidence intervals based on the binomial distribution. For the nationwide survey, we first reported the distribution of sextant-level PD measured by the participating trained dentists by personal characteristics, including gender, age, education level, body mass index (BMI), type 2 diabetes mellitus (DM), and lifestyle factors such as cigarette smoking and alcohol consumption.
To take into account the correlation of sextant-level data from the same subject or the same dentist, we applied a Bayesian hierarchical model incorporating both the correlated properties (Yen, Liou, Lin, & Chen, 2006) and the measurement errors to estimate the calibrated odds ratio between risk factors and PD. We first applied a hierarchical univariate logistic regression model with a random intercept, accounting for different baselines across clusters, to estimate the crude odds ratio (cOR) for the effect of each risk factor on PD. The random intercept term was assumed to follow a normal distribution centered at zero with a standard deviation, denoted by σ, which was used to test whether the random effect was statistically significant. Finally, hierarchical multivariable logistic regression models with and without calibration were applied to calculate the adjusted odds ratios (OR), with the risk factors adjusted for one another. We also calculated AR and PAR for each risk factor given the estimated adjusted odds ratios before and after calibration.
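As a worked illustration of the two measures, the paper's formulas are AR = (OR − 1)/OR and PAR = Exposure% × (OR − 1) / (1 + Exposure% × (OR − 1)). The sketch below applies them to the uncalibrated and calibrated adjusted odds ratios for smoking reported for CPI-defined PD (2.02 and 2.75). The exposure prevalence of 25% is a hypothetical value chosen for the example, not a figure from the survey, and the function names are illustrative.

```python
def attributable_proportion(odds_ratio):
    """AR = (OR - 1) / OR."""
    return (odds_ratio - 1) / odds_ratio

def population_attributable_proportion(exposure, odds_ratio):
    """PAR = exposure * (OR - 1) / (1 + exposure * (OR - 1)).

    `exposure` is the proportion of the population exposed (0 to 1).
    """
    excess = exposure * (odds_ratio - 1)
    return excess / (1 + excess)

# Adjusted ORs for smoking with PD defined by CPI, as reported in the text.
or_uncalibrated, or_calibrated = 2.02, 2.75
smoking_prevalence = 0.25  # hypothetical exposure prevalence

for label, or_value in [("uncalibrated", or_uncalibrated),
                        ("calibrated", or_calibrated)]:
    ar = attributable_proportion(or_value)
    par = population_attributable_proportion(smoking_prevalence, or_value)
    print(f"{label}: AR = {ar:.3f}, PAR = {par:.3f}")
```

Because both measures increase with the odds ratio, any underestimation of the odds ratio carries over into AR and PAR.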
The hierarchical models were estimated by Markov chain Monte Carlo simulation, based on a Bayesian directed acyclic graphical model, using the Windows-based Bayesian Inference Using Gibbs Sampling (WinBUGS) software (Spiegelhalter, Thomas, Best, & Lunn, 2004). The 95% confidence interval was extracted from the posterior distribution of each parameter and reported for the assessment of statistical significance.
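For readers who want a concrete form, one common way to write a random-intercept logistic model with outcome misclassification is sketched below. This is an assumed formulation offered for orientation, not the authors' exact specification. The symbols are introduced here: $Y_{ij}$ is the PD status recorded by the examiner for sextant $i$ in cluster $j$, $\pi_{ij}$ is the probability of true PD, $\mathbf{x}_{ij}$ are the risk factors, $u_j$ is the random intercept, and $Se$ and $Sp$ are the sensitivity and specificity estimated in the validation study.

```latex
\begin{aligned}
\operatorname{logit}(\pi_{ij}) &= \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij} + u_j,
  \qquad u_j \sim N(0, \sigma^2), \\
\Pr(Y_{ij} = 1) &= Se \,\pi_{ij} + (1 - Sp)\,(1 - \pi_{ij}).
\end{aligned}
```

Under this formulation, $\exp(\beta_k)$ is the calibrated odds ratio for the $k$-th risk factor, and setting $Se = Sp = 1$ recovers the uncalibrated model.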

| RESULTS
The overall sensitivity and specificity of CPI measurement at the sex-
Table 1 shows the descriptive data for the nationwide survey on periodontal disease at the sextant level with CPI and LA scores. Table 2 compares the crude and adjusted odds ratios (ORs), with PD defined by CPI ≥ 3 or LA ≥ 1 as the outcome, between the uncalibrated and the calibrated models. In the univariate analysis, the calibrated cOR for each variable was considerably further from the null (cOR = 1) than the corresponding uncalibrated cOR, displaying so-called non-differential misclassification and suggesting that the influence of each variable on PD was underestimated in the absence of calibration. Taking smoking as an example, the cOR for PD in smokers vs non-smokers was nearly two times greater with calibration than without, increasing from 3.42 (2.81, 4.17) to 6.50 (4.65, 9.39).
Similar underestimations of cORs were also noted for other variables with various extents of non-differential misclassification.
Table 2 also shows that the influence of such non-differential misclassification on the underestimation of effect sizes was attenuated by adjustment, but the tendency toward non-differential misclassification still remained in the multivariate analysis with adjustment for variables with (multivariate model 2) and without (multivariate model 1) incorporation of alcohol drinking.

$$\text{Attributable proportion} = \frac{\text{Odds Ratio} - 1}{\text{Odds Ratio}}$$

$$\text{Population attributable proportion} = \frac{(\text{Exposure\%}) \times (\text{Odds Ratio} - 1)}{1 + (\text{Exposure\%}) \times (\text{Odds Ratio} - 1)}$$

TABLE 1 Descriptive data on nationwide survey of periodontal disease at sextant-level with CPI score and LA score

TABLE 2 Estimated adjusted odds ratio of risk factors for periodontal disease with CPI and LA (CPI ≥ 3 or LA ≥ 1) at sextant-level for univariate logistic regression model before and after calibrating measurement errors using Bayesian hierarchical model

The underestimation of effect sizes with PD defined by CPI ≥ 3 alone is listed in Table 3. For the association between smoking and PD, the OR (95% confidence interval) rose with calibration from 2.62 (2.19, 3.18) to 4.29 (3.22, 5.84) in the univariate analysis and from 2.02 (1.63, 2.52) to 2.75 (2.01, 3.77) in the multivariate analysis. The calibrated OR was consistently higher than the uncalibrated OR across different cutoffs of the CPI score (Table S2). These findings suggest that such non-differential misclassification is unlikely to be modified by different definitions of PD. Table 4 shows the corresponding results using an LA score ≥ 1 alone.
It is interesting to note that the alteration of effect size was greater than that observed using CPI alone. However, such a change was ameliorated when all the variables were considered in the multivariate analysis.

| DISCUSSION
Because periodontal probing measurements depend on a hand-held probe, the outcome measurements are subject to a dentist's subjective judgment and periodontal expertise (such as probing force and position). Therefore, the potential probability of measurement error for PD is greater than for other diseases when community-based screening for early detection of PD is conducted. This may explain why prevalence of PD varied from study to study.
As expected, the periodontal measurement errors in our validation study varied with region. Sensitivities were higher in the northern and eastern areas, whereas there were more false-negative cases in the central area and the two southern areas. In periodontal disease prevalence surveys, measurement errors exist across dentists. Therefore, before the nationwide survey, we conducted a validation study to assess the measurement errors in the measures for PD and used them for calibration to improve the accuracy of PD prevalence estimation. Moreover, the magnitude of the effects of the risk factors on periodontal disease was also affected.
Calibration consistently revealed non-differential misclassification in the estimated effect size of each risk factor for PD, whether PD was defined by the CPI or the LA score. Specifically, the calibrated OR was generally higher than the uncalibrated OR, although the underlying effect sizes in terms of OR varied with different cutoffs for the CPI score. These findings suggest that non-differential misclassification is unlikely to be modified by using different outcomes to define PD.

TABLE 5 AR and PAR by status of PD with CPI and LA at sextant level

One might ask whether the measurement errors differ by site. If a senior periodontist is less likely to count gingival recession of this kind toward the CPI outcome, the sensitivity at mid-buccal sites would be expected to be lower than at other sites, apart from any effect of the quality of professional training in CPI and LA measurement. It would be of interest to assess the impact of measurement errors attributable to this effect of brushing. Unfortunately, the main study collected data only at the sextant level, not at the site level, so we could not run a sensitivity analysis with and without the mid-buccal sites. However, we can examine this concern by comparing sensitivity and specificity across sites using the first-stage validation data, which were collected at the site level. Based on the validation data from 31 participants, the sensitivity at the mid-buccal site (36%) was lower than at other sites (47%) for the CPI measurements.
Likewise, the sensitivity at the mid-buccal site (56%) was lower than at other sites (67%) for the LA measurements. Lower sensitivity of this kind might lead to underestimation of the effect sizes of the risk factors. However, the analysis of the main study could only be performed at the sextant level because site-level information was unavailable. This is one of our study limitations. Another limitation is that our periodontal measurements were recorded at the sextant level in a large-scale epidemiological study, where the sextant-level measurement was determined by index teeth, whereas in our calibration the highest score of all sites in each sextant was selected as the representative of that sextant. Whether measurement error at the sextant level in the validation study can represent measurement error in a large epidemiological study should therefore be confirmed in future studies.
In conclusion, our study shows that the measurement error in PD varied with dentists and regions. The results of our validation study provide the basis for correcting the effect size of the association between relevant correlates and PD. The estimated odds ratios for certain risk factors (such as smoking) in association with PD were substantially underestimated without calibration, which may also undervalue the ability of risk factor intervention through health education to reduce PD at the population level.