We developed rheumatoid arthritis (RA) risk models based on validated environmental factors (E), genetic risk scores (GRS), and gene–environment interactions (GEI) to identify factors that can improve accuracy and reclassification.
Models including E, GRS, and GEI were developed among 317 white seropositive RA cases and 551 controls from the Nurses' Health Studies (NHS) and validated in 987 white anti–citrullinated protein antibody–positive cases and 958 controls from the Swedish Epidemiologic Investigation of Rheumatoid Arthritis (EIRA), stratified by sex. Primary analyses included age, smoking, alcohol, parity, weighted GRS using 31 non-HLA alleles and 8 HLA–DRB1 alleles, and the HLA × smoking interaction. Expanded models included reproductive, geographic, and occupational factors and additional GEI terms. Hierarchical models were compared for discriminative accuracy using the area under the receiver operating characteristic curve (AUC) and reclassification using the integrated discrimination improvement (IDI) and the continuous net reclassification improvement.
The mean age at RA diagnosis was 56 years in the NHS and 51 years in the EIRA. Primary models produced AUCs of 0.716 in the NHS, 0.716 in women in the EIRA, and 0.756 in men in the EIRA. Expanded models produced improvements in discrimination with AUCs of 0.738 in the NHS, 0.724 in women in the EIRA, and 0.769 in men in the EIRA. Models including genetic factors (G) or G + GEI improved reclassification over E models; the full E + G + GEI model provided the optimal predictive ability by IDI analyses.
We have developed comprehensive RA risk models incorporating E, G, and GEI that have improved the discriminative accuracy for RA. Further work developing and assessing highly specific prediction models in prospective cohorts is still needed to inform primary RA prevention trials.
Rheumatoid arthritis (RA), an autoimmune disease that causes inflammatory and disabling arthritis, is thought to develop in individuals with inherited genetic risk factors after exposure to environmental factors (E), including cigarette smoking, residential history, air pollution, occupational exposures, alcohol, female reproductive factors, and low socioeconomic status ([1-31]). The identification of risk alleles for RA through genome-wide association studies and meta-analyses, along with the discovery of gene–environment interactions (GEI), could potentially allow the prediction of RA risk among individuals without symptoms ([2, 32-42]). The Framingham Risk Score was developed with the specific goal of clinical risk prediction, aiding clinicians in making both recommendations about risk factor modification and decisions about preventive treatment. This successful paradigm of individualized risk factor assessment and stratification has led to a reduction of cardiovascular morbidity and mortality worldwide ([44, 45]). Efforts are now underway to develop similar predictive models for the early identification of individuals at high risk of developing RA among asymptomatic populations who could be enrolled in primary prevention trials. The first step in this process is to determine the optimal variables to include in such models.
The goal of this study was to develop a risk model based on E that can be collected easily at a clinical visit, and to study the benefit of adding genetic (G) and GEI terms to the model. We used novel statistical methods to choose the optimal combination of predictors among E, genetic susceptibility alleles, and GEI for training, with validation in an independent data set. We hypothesized that models including E, G, and GEI terms would have the optimal predictive accuracy.
We conducted a nested case–control study of RA susceptibility among the NHS and NHSII prospective cohorts. Among 121,700 female nurses ages 30–55 years in the NHS, 32,826 participants (27%) provided blood samples and another 33,040 (27%) provided buccal cell samples. Of 116,609 female nurses ages 25–42 years in the NHSII, 29,611 (25%) provided blood samples. The 2 cohorts were combined in this study and are referred to as the NHS. Incident RA cases in the NHS were identified using a 2-stage method: a connective tissue disease screening questionnaire for RA symptoms, followed by confirmation by chart review by 2 board-certified rheumatologists (EWK, KHC). Rheumatoid factor was determined by chart review, and anti–citrullinated protein antibody (ACPA) status was determined by chart review and/or direct assay for RA cases with banked plasma samples from prior to diagnosis. For each confirmed RA case, a healthy control was chosen, matched on cohort (NHS/NHSII), year of birth, menopause status, and postmenopausal hormone use. There were 585 women with validated RA who provided blood samples; 21 cases (4%) were excluded because they did not self-report white race and an additional 22 (4%) were excluded due to missing HLA information. For participants missing other single-nucleotide polymorphisms, we assigned the expected value (2 × risk allele frequency defined in cases or controls separately). Finally, since prior genetic association studies focused on seropositive RA, analyses were limited to seropositive RA cases (n = 317) compared to healthy controls (n = 551).
The EIRA is a population-based case–control study that enrolled newly diagnosed cases of RA ages 18–70 years between May 1996 and December 2009 in Sweden. Controls were randomly selected and matched to cases on age, sex, and geographic location ([7, 40]). In the EIRA, a total of 1,218 ACPA-positive RA cases and 1,129 controls recruited from May 1996 to the end of 2009 were selected for genome-wide genotyping. After sample quality control (sample genotype call rate >0.95, ethnicity outliers removed), 988 ACPA-positive cases (81%) and 958 controls (85%) remained. One male case with 3 HLA alleles was further removed from the analyses. A final data set with 987 cases (702 women and 285 men) and 958 controls (715 women and 243 men) with information on the 31 non-HLA loci and the HLA alleles was used for analyses.
All aspects of these studies were approved by Partners' HealthCare or the Karolinska Institutet Institutional Review Boards.
We selected E that had been demonstrated in the literature by other groups and replicated in our data sets to be significantly associated with RA susceptibility, including age, smoking, alcohol, education, and parity (in women only), and could be easily ascertained in clinical practice ([3, 4, 8, 9, 16, 18, 19, 24, 30, 31, 49]). In the primary analysis, we only included factors that were available in both cohorts and not discovered in either the NHS or the EIRA. For a secondary analysis, we explored expanded models that also considered risk factors first published by our groups, including additional reproductive factors, geographic region, and occupational exposures (in men only) ([10, 11, 14, 15, 24]), as well as GEI for the GST and HMOX1 genes.
E for the primary models in the NHS included year of birth, smoking (pack-years), alcohol consumption (cumulative average of daily intake), husband's educational attainment (as a marker of socioeconomic status), and parity (for women). For expanded NHS models, we included region of residence at age 30 years, age at menarche, menstrual regularity, breastfeeding, oral contraceptive use, menopause status, and postmenopausal hormone use ([10, 24]). All E were updated through the biennial questionnaire before RA diagnosis (or the index date for controls).
E for the primary models in the EIRA included year of birth, smoking exposure before disease onset (pack-years), alcohol consumption, education level (as a marker of socioeconomic status), and parity (for women). For expanded models in men in the EIRA, we also included occupational exposure to silica, mineral oil, and solvents, and HLA shared epitope (SE) × exposure interaction terms ([14, 15]). All data were collected from subjects at the time of incident RA and pertained to exposures prior to RA onset.
Thirty-nine validated risk alleles for RA were combined to form a continuous GRS, a weighted combination of 8 HLA–DRB1 SE alleles and 31 non-HLA risk alleles (GRS39), weighted by the natural log of the published odds ratio (see Supplementary Table 1, available in the online version of this article at http://onlinelibrary.wiley.com/doi/10.1002/acr.22005/abstract). To assess the independent contribution of the HLA SE alleles, we created a GRS limited to only the non–HLA SE alleles (GRS31). Genotyping and quality control procedures for both the NHS and EIRA are described in detail elsewhere ([2, 37, 41, 42]).
GSTT1 homozygous deletion (GSTT1 null) and HMOX1 variants, along with GSTT1 × smoking and HMOX1 × smoking interaction terms, were included in the expanded models because both genes were found to significantly interact with smoking in RA risk in the NHS, with replication of the GSTT1 × smoking interaction in the EIRA.
Logistic regression models were used to estimate the predicted odds of seropositive or ACPA-positive RA. GEI was modeled on a multiplicative scale using a product term (G × E). Analyses were performed in accordance with the 25 recommendations for evaluations of risk prediction models (Genetic Risk Prediction Studies statement).
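The weighting rule can be illustrated with a minimal sketch. Only the rule itself (risk allele count multiplied by the natural log of the published odds ratio) and the expected-value substitution for missing genotypes (2 × risk allele frequency, as described for the NHS) come from the text; the allele names, odds ratios, allele frequencies, and the function name `weighted_grs` are hypothetical placeholders, not values from Supplementary Table 1.

```python
import math

# Hypothetical odds ratios and risk allele frequencies (RAF); illustrative only,
# not the values from Supplementary Table 1.
ODDS_RATIOS = {"allele_A": 1.20, "allele_B": 1.50, "allele_C": 2.00}
RAF = {"allele_A": 0.30, "allele_B": 0.10, "allele_C": 0.25}


def weighted_grs(allele_counts, odds_ratios, raf):
    """Weighted GRS: sum over alleles of ln(OR) x risk-allele count (0, 1, or 2).

    A missing count (None) is replaced by its expected value, 2 x RAF.
    """
    score = 0.0
    for allele, odds_ratio in odds_ratios.items():
        count = allele_counts.get(allele)
        if count is None:
            count = 2 * raf[allele]  # expected number of risk alleles
        score += math.log(odds_ratio) * count
    return score


complete = {"allele_A": 2, "allele_B": 0, "allele_C": 1}
partial = {"allele_A": 2, "allele_B": None, "allele_C": 1}  # allele_B missing

grs_complete = weighted_grs(complete, ODDS_RATIOS, RAF)
grs_partial = weighted_grs(partial, ODDS_RATIOS, RAF)
```

Because each allele contributes on the log odds ratio scale, alleles with larger published effects move the score more than alleles with weak effects.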
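As a rough illustration of what a multiplicative product term means, the sketch below evaluates a logistic model with main effects for G and E and a G × E term. The coefficients and the function name `predicted_odds` are hypothetical and are not estimates from the NHS or the EIRA.

```python
import math


def predicted_odds(g, e, b0, b_g, b_e, b_ge):
    """Predicted odds from a logistic model with a G x E product term."""
    log_odds = b0 + b_g * g + b_e * e + b_ge * g * e
    return math.exp(log_odds)


# Hypothetical coefficients on the log-odds scale (illustration only).
coef = {"b0": -2.0, "b_g": 0.5, "b_e": 0.3, "b_ge": 0.4}

odds_ref = predicted_odds(0, 0, **coef)   # neither factor present
odds_g = predicted_odds(1, 0, **coef)     # genetic factor only
odds_e = predicted_odds(0, 1, **coef)     # environmental factor only
odds_both = predicted_odds(1, 1, **coef)  # both factors present

or_g = odds_g / odds_ref
or_e = odds_e / odds_ref
or_joint = odds_both / odds_ref

# On the multiplicative scale, the interaction is the ratio of the joint
# odds ratio to the product of the two separate odds ratios: exp(b_ge).
interaction_ratio = or_joint / (or_g * or_e)
```

With `b_ge` set to 0, the joint odds ratio would equal the product of the two separate odds ratios, which is the null hypothesis of no interaction on this scale.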
The lists of all variables and interactions considered for inclusion are summarized in Table 1 in 2 categories described as “primary” variables available in both the NHS and EIRA and “expanded” variables available in either the NHS or EIRA cohorts. The NHS was used as the primary data set to determine the optimal combination of variables. The EIRA was not used as the primary data set because many of the genetic risk alleles and the HLA × smoking interaction were discovered in this study and could have led to overfitting of the models. Variables were selected based on contribution to the overall model prediction based on the integrated discrimination improvement (IDI; described below). The final list of variables included in the NHS model was then assessed in the EIRA validation data sets.
|Factors||Primary model||Expanded NHS model||Expanded EIRA model|
|Environmental||Year of birth||Year of birth||Year of birth|
| ||Parity (F)||Parity/breastfeeding (F)||Parity (F)|
| || ||Menses age <12 years (F)||Silica (M)a|
| || ||Menstrual irregularity (F)||Mineral oil (M)b|
| || ||Menopause (F)||Solvents (M)|
| || ||PMH use (F)|| |
| || ||Region (US) (F)||Region (Sweden)|
|Genetic||HLA SE (0, 1, 2)||HLA SE (0, 1, 2) (F)||HLA SE (0, 1, 2)|
|Gene–environment interactions||HLA SE × smoking||HLA SE × smoking||HLA SE × smoking|
| || ||GSTT1 × smoking||GSTT1 × smoking|
| || ||HMOX1 × smoking||HMOX1 × smoking|
|Environment–environment interactions|| || ||Silica × smoking (M)a|
| || || ||Mineral oil × smoking (M)b|
| || || ||Solvent × smoking (M)|
As a secondary analysis, we studied expanded risk models that considered all variables, including those available only in one cohort or discovered in either the NHS or EIRA. For men in the EIRA, we excluded parity but added silica, mineral oil, and solvents to the expanded models. These models were further categorized according to variable groups: E, G, and GEI. In order to assess the benefit of adding G variables and GEI terms to risk models, we compared models with just E variables to those with E + G and finally models with E + G + GEI + E × E. These models were developed in a stepwise fashion in each data set (NHS, EIRA women, and EIRA men). The optimal combination of variables was chosen based on significant contribution to the model using the IDI explained below.
Models were assessed on variance explained, goodness of fit, and discrimination ability. Nagelkerke's R² was used as a measure of variance explained by the model. Goodness of fit was measured using the Hosmer-Lemeshow chi-square test, where a nonsignificant value indicates a good model fit. Finally, the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) were used to assess how well each model discriminates between RA cases and controls. Ninety-five percent confidence intervals (95% CIs) were obtained via bootstrapping. The models were compared within each data set using the IDI, both to decide on inclusion in the primary NHS model (and then tested in the EIRA) and also between models in the expanded analysis. Finally, the continuous net reclassification improvement (cNRI) was used to compare the primary and validation models to the optimal model from the expanded analyses within each data set.
Developed by Pencina et al ([54, 55]), the IDI is a measure of overall improvement in sensitivity and specificity between 2 models (e.g., E model compared with the E + G model). The IDI is calculated from the predicted probability (p) of the outcome (seropositive RA or ACPA-positive RA) in 2 models: it is the difference between the models in the mean predicted probability among cases minus the corresponding difference among controls.
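For readers less familiar with these metrics, the following sketch computes the AUC as the probability that a randomly chosen case has a higher predicted probability than a randomly chosen control, and attaches a bootstrap interval. The percentile method, the resampling of cases and controls separately, the toy scores, and the function names are assumptions made for illustration; the text does not specify which bootstrap variant was used.

```python
import random


def auc(case_scores, control_scores):
    """AUC: share of case-control pairs in which the case scores higher (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def bootstrap_ci(case_scores, control_scores, n_boot=1000, seed=0):
    """Percentile bootstrap 95% CI, resampling cases and controls separately."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        cases = rng.choices(case_scores, k=len(case_scores))
        controls = rng.choices(control_scores, k=len(control_scores))
        stats.append(auc(cases, controls))
    stats.sort()
    return stats[n_boot * 25 // 1000], stats[n_boot * 975 // 1000 - 1]


# Toy predicted probabilities (hypothetical, not study data).
cases = [0.8, 0.6, 0.7, 0.4]
controls = [0.3, 0.5, 0.2, 0.6]

point = auc(cases, controls)
lower, upper = bootstrap_ci(cases, controls)
```

The pairwise loop is quadratic in sample size, which is acceptable for a sketch; a rank-based calculation would be preferable for data sets of the size analyzed here.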
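In the standard formulation of Pencina et al, the IDI can be written as below, where a bar denotes the mean of the predicted probabilities within a group. The subscript labels (new, old, cases, controls) are chosen here for illustration and are not taken from the original notation.

```latex
\mathrm{IDI} =
  \left( \bar{p}_{\mathrm{new,\,cases}} - \bar{p}_{\mathrm{old,\,cases}} \right)
  - \left( \bar{p}_{\mathrm{new,\,controls}} - \bar{p}_{\mathrm{old,\,controls}} \right)
```

A positive IDI indicates that the new model, on average, assigns higher probabilities to cases and/or lower probabilities to controls than the old model.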
One limitation is that the IDI is calculated using predicted probabilities, which are biased in a case–control design because the probability of RA in the sample is not an estimate of probability of RA in the population. However, for our analyses, we used the IDI to compare models within the same cohort, not to generalize across cohorts or to the general population.
To further address this issue, we also used the cNRI developed by Pencina et al. Whereas the original NRI required categories of risk and quantified the overall upward and downward movement between categories, the cNRI does not require categories. The cNRI quantifies any movement in predicted probability between the models. This can also be thought of as the amount of correct reclassification among events and nonevents without respect to the magnitude of the change in predicted risk. The cNRI is calculated as the net proportion of events whose predicted probability increases plus the net proportion of nonevents whose predicted probability decreases. Depending on the future application of a prediction model, an improvement in reclassification of cases may be more important than that of controls, or vice versa. The cNRI can therefore be interpreted for cases (cNRI [events]) and for controls (cNRI [nonevents]) separately.
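A minimal sketch of the cNRI, split into its event and nonevent components, is shown below. It follows the standard category-free definition (net proportion of events moving up plus net proportion of nonevents moving down); the function name `cnri` and the toy probabilities are hypothetical and do not reproduce the study's analysis code.

```python
def cnri(old_probs, new_probs, is_case):
    """Continuous NRI for events, nonevents, and in total.

    events    = P(up | case) - P(down | case)
    nonevents = P(down | control) - P(up | control)
    """
    up_case = down_case = up_ctrl = down_ctrl = 0
    n_case = n_ctrl = 0
    for old, new, case in zip(old_probs, new_probs, is_case):
        if case:
            n_case += 1
            up_case += new > old
            down_case += new < old
        else:
            n_ctrl += 1
            up_ctrl += new > old
            down_ctrl += new < old
    events = (up_case - down_case) / n_case
    nonevents = (down_ctrl - up_ctrl) / n_ctrl
    return events, nonevents, events + nonevents


# Toy predicted probabilities from two models (hypothetical, not study data).
old = [0.2, 0.5, 0.6, 0.4, 0.3, 0.5, 0.2, 0.4]
new = [0.4, 0.6, 0.5, 0.7, 0.2, 0.6, 0.1, 0.3]
case = [1, 1, 1, 1, 0, 0, 0, 0]

cnri_events, cnri_nonevents, cnri_total = cnri(old, new, case)
```

Each component ranges from −1 to 1, so the total ranges from −2 to 2. A negative nonevent component means that more controls moved to a higher predicted probability than to a lower one.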
In the NHS, 317 seropositive RA cases had a mean ± SD age at diagnosis of 56 ± 10 years and 195 (63%) were current or former smokers (Table 2). In the EIRA, 987 ACPA-positive RA cases had a mean ± SD age at diagnosis of 51 ± 12 years, 644 (73%) were current or former smokers, and 702 (71%) were women. All available variables were included in the final primary model, since the addition of each variable improved the predictive ability of the model in the NHS as measured by the IDI. This combination of variables showed an AUC of 0.716 (95% CI 0.681–0.755) in the NHS, 0.716 (95% CI 0.693–0.749) in women in the EIRA, and 0.756 (95% CI 0.725–0.808) in men in the EIRA (Figure 1 and Table 3).
|Characteristic||NHS seropositive RA cases (n = 317)||NHS controls (n = 551)||EIRA ACPA-positive RA cases (n = 987)||EIRA controls (n = 958)|
|Age, mean ± SD yearsa||55.1 ± 8.1||55.5 ± 7.9||51.2 ± 12.0||52.5 ± 11.7|
|Women, no. (%)||317 (100)||551 (100)||702 (71)||715 (75)|
|Current or past smoker, no. (%)||195 (63)||309 (56)||644 (73)||518 (59)|
|Pack-years among smokers, mean ± SD||25.0 ± 18.0||22.7 ± 20.9||19.2 ± 14.8||16.5 ± 14.6|
|GRS39 (with HLA SE), mean ± SD||5.14 ± 0.86||4.70 ± 0.79||5.39 ± 0.95||4.72 ± 0.81|
|GRS31 (without HLA SE), mean ± SD||4.27 ± 0.60||4.10 ± 0.86||4.51 ± 0.59||4.26 ± 0.58|
|Age at symptom onset, mean ± SD years||55.5 ± 10.4||–||50.3 ± 12.1||–|
|Age at diagnosis, mean ± SD years||56.1 ± 9.8||–||51.1 ± 12.1||–|
|RF positive, no. (%)||297 (94)||–||835 (88)||–|
|ACPA positive, no. (%)b||112 (55)||–||987 (100)||–|
|Seropositive, no. (%)c||317 (100)||–||987 (100)||–|
|Statistical parameters||Primary data set, NHS womena||Validation data set, EIRA women||Validation data set, EIRA men|
|Nagelkerke's R² (95% CI)||0.185 (0.129–0.246)||0.185 (0.147–0.241)||0.275 (0.189–0.360)|
|AUC (95% CI)||0.716 (0.681–0.755)||0.716 (0.693–0.749)||0.756 (0.725–0.808)|
|Hosmer-Lemeshow χ² (P)||5.48 (0.71)||8.71 (0.367)||4.40 (0.810)|
Secondary analyses were performed specific to the expanded list of variables available in each data set. All models that included G performed significantly better than the E models, based on Nagelkerke's R2, AUC, IDI, and cNRI (Table 4). The expanded model in women in the NHS that included additional reproductive variables and additional G and GEI terms produced the highest AUC (in the NHS) of 0.738 (95% CI 0.721–0.790) (Figure 1A). An optimal expanded model in women in the EIRA produced an AUC of 0.724 (95% CI 0.705–0.761) and included the GRS and HLA SE, but not GEI with smoking, since these did not increase prediction (e.g., addition of the HLA SE × smoking interaction term did not show improvement with the IDI) (Figure 1B). The optimal expanded model in men in the EIRA that included occupational exposures produced the highest AUC of 0.769 (95% CI 0.747–0.830) and included the GRS and HLA × smoking interaction, but not other interactions (Figure 1C).
|Statistical parameters||E||E + Ga||E + G + GEIb|
|NHS expanded modelsc|
|Nagelkerke's R² (95% CI)||0.094 (0.076–0.172)||0.186 (0.162–0.279)||0.209 (0.188–0.323)|
|AUC/C statistic (95% CI)||0.655 (0.635–0.709)||0.718 (0.702–0.774)||0.738 (0.723–0.795)|
|IDI (P) compared to:|
|E model||Ref.||0.069 (6.0 × 10⁻¹⁴)||0.088 (7.4 × 10⁻¹⁷)|
|E + G model||−0.069 (6.0 × 10⁻¹⁴)||Ref.||0.019 (0.0001)|
|E + G + GEI model||−0.088 (7.4 × 10⁻¹⁷)||−0.019 (0.0001)||Ref.|
|EIRA women expanded modelsc|
|Nagelkerke's R² (95% CI)||0.069 (0.055–0.117)||0.199 (0.169–0.265)||0.200 (0.171–0.267)|
|AUC/C statistic (95% CI)||0.632 (0.614–0.671)||0.724 (0.705–0.760)||0.724 (0.706–0.762)|
|IDI (P) compared to:|
|E model||Ref.||0.101 (1.0 × 10⁻³⁵)||0.102 (8.4 × 10⁻³⁶)|
|E + G model||−0.101 (1.0 × 10⁻³⁵)||Ref.||0.0002 (0.499)|
|E + G + GEI model||−0.102 (8.4 × 10⁻³⁶)||−0.0002 (0.499)||Ref.|
|EIRA men expanded modelsc|
|Nagelkerke's R² (95% CI)||0.125 (0.098–0.237)||0.273 (0.231–0.401)||0.282 (0.240–0.414)|
|AUC/C statistic (95% CI)||0.685 (0.657–0.752)||0.767 (0.744–0.828)||0.769 (0.747–0.830)|
|IDI (P) compared to:|
|E model||Ref.||0.116 (1.1 × 10⁻¹⁴)||0.123 (2.2 × 10⁻¹⁵)|
|E + G model||−0.116 (1.1 × 10⁻¹⁴)||Ref.||0.006 (0.12)|
|E + G + GEI model||−0.123 (2.2 × 10⁻¹⁵)||−0.006 (0.12)||Ref.|
In the NHS, the primary model with age, smoking, alcohol, education, parity, GRS31, HLA SE, and HLA × smoking explained 19% of the variance, whereas in the NHS expanded models, the variance explained by E alone (E model) was 9%. As expected, since more variables were added to the expanded model, the variance explained increased, first to 19% with inclusion of G, and finally to 21% with the addition of the G × E interaction (E + G + GEI model). In the EIRA, the primary model explained 19% of the variance in women and 28% of the variance in men. In the expanded analysis among women in the EIRA, the variance explained by E alone (E model) was 7%; as more variables were added, the variance explained increased to 20% with inclusion of G and remained at 20% with inclusion of GEI factors. In men in the EIRA, the primary model explained 27.5% of the variance. In the expanded model for men in the EIRA, the variance explained by E alone (E model) was 12.5%; as more variables were added, the variance explained increased to 27.3% with inclusion of G and to 28.2% with the inclusion of GEI variables.
In the NHS expanded models, the addition of G showed a significant improvement in prediction over the E model as measured with the IDI (P = 6 × 10⁻¹⁴); addition of GEI further improved the IDI (P = 0.0001). For women in the EIRA, addition of G in the expanded models showed a significant improvement in prediction over the E model as measured by the IDI (P < 1.0 × 10⁻³⁵); however, no improvement in the IDI was seen with the addition of GEI terms (P > 0.05). In men in the EIRA expanded models, the addition of G and G + GEI each showed a significant improvement in prediction over the E model (P = 1.1 × 10⁻¹⁴ and P = 2.2 × 10⁻¹⁵, respectively). However, there was no significant improvement in the IDI adding GEI to the E + G model.
The stratified cNRI results shown in Table 5 demonstrate the change in sensitivity (reclassification of cases) and change in specificity (reclassification of controls) of the primary model compared to the expanded models. In the NHS overall, the primary model performed more poorly than the E + G + GEI model, with a total cNRI of −0.26 (P = 0.0004), indicating lower specificity. However, from the stratified results we can see that the primary model performed similarly in reclassifying cases (same sensitivity; cNRI = −0.03, P = 0.64), but worse in reclassifying controls (lower specificity; cNRI = −0.23, P = 1.3 × 10⁻⁷). This is interpreted as an excess of 23% of controls reclassified with a higher predicted probability of RA in the primary model versus the E + G + GEI model. In women in the EIRA, the primary model performed similarly to the E + G model (P = 0.08). However, in the stratified analysis, the primary model correctly reclassified controls with lower predicted probabilities (higher specificity; cNRI = 0.33, P = 5.9 × 10⁻¹⁸), and also reclassified cases with lower probabilities of developing RA (lower sensitivity; cNRI = −0.42, P = 5.0 × 10⁻²⁸). In men in the EIRA, the primary model performed worse than the E + G + GEI model according to the cNRI of −0.27 (lower specificity; P = 0.004). However, again in the stratified analysis, the primary model performed similarly to the E + G + GEI model in reclassifying controls (same specificity; cNRI = 0.08, P = 0.24), but did more poorly in reclassifying cases (lower sensitivity; cNRI = −0.36, P = 2.8 × 10⁻⁸).
|Cohort||cNRI for cases (P)||cNRI for controls (P)||Total cNRI (P)|
|NHS||−0.027 (0.64)||−0.233 (1.3 × 10⁻⁷)||−0.260 (0.0004)|
|EIRA women||−0.424 (< 1.0 × 10⁻¹⁶)||0.328 (< 1.0 × 10⁻¹⁶)||−0.096 (0.08)|
|EIRA men||−0.355 (2.8 × 10⁻⁸)||0.081 (0.24)||−0.274 (0.004)|
Leveraging an extensive body of epidemiologic and genetic research on the risk of developing RA among asymptomatic cohorts along with modern statistical techniques, we developed and validated comprehensive risk models for RA. We demonstrate that including genetic variants and GEI terms significantly improves the predictive power of epidemiologic models. Using the AUC to assess discriminative ability, the optimal primary model in a US female cohort included a GEI term for HLA–DRB1, the strongest genetic risk factor for RA, and smoking, the strongest environmental risk factor for RA, and had an AUC of 0.716. In the Swedish cohort, the primary model had an AUC of 0.716 in women and 0.756 in men. Expanded models improved the AUCs to 0.738 among US women, 0.724 among Swedish women, and 0.769 among Swedish men. The variance explained by the expanded E, G, and GEI models, ranging from 21% for women to 28% for men, suggests that there are more risk factors yet to be discovered.
We demonstrate that optimal models for prediction of RA should include E, G and, in some cases, their interaction terms. In other diseases and complex traits, for example, type 2 diabetes mellitus ([56-60]) and cardiovascular disease ([61, 62]), genetic variants have provided at most a modest increase in the predictive ability of clinical models. Willems and colleagues reviewed 20 studies assessing risk prediction for type 2 diabetes mellitus and found that adding G (up to 40 polymorphisms) did not significantly add to the discriminative ability of any of the clinical models. They showed that where the E models are strong (AUCs ranging from 0.68 to 0.92), adding genetic variants with weak effects (even as a cumulative GRS) does not add much to prediction. In a simulation study to explore the potential improvement in discrimination with models that include G × G and G × E interactions, Aschard et al demonstrated that inclusion of interaction effects in models for 3 diseases (breast cancer, type 2 diabetes mellitus, and RA) was unlikely to dramatically improve the discrimination ability of these models. This study of RA risk is one of the few examples where the inclusion of G and GEI factors in a model resulted in a significant increase in discrimination and reclassification.
A primary model consisting of 5 clinical variables that could be collected at a routine clinic visit could be advantageous when screening large numbers of individuals for RA risk. These results are timely, since the selection of high-risk individuals for enrollment in primary prevention trials is an exciting prospect in RA research ([65, 66]). Therefore, we strove to construct a model that performed well statistically, but also included clinical variables that could be collected on a larger scale. The variables for our model were chosen such that all of the E could be easily obtained by survey (e.g., year of birth, smoking, alcohol, education, and parity). We studied whether the predictive ability of the model improved with an expanded set of predictors that could be assessed with longer surveys along with genetic risk alleles. We showed that although the AUC statistics were only slightly lower in the primary models, their predictive accuracy was significantly lower than that of the expanded models, as demonstrated by the cNRI results stratified into case and control subgroups. This demonstrates the importance of considering statistical evaluation beyond analysis of AUC statistics ([54, 55, 67-69]).
If these results were applied to selecting high-risk individuals for a prevention trial, the expanded set of variables should be collected by survey and genetic risk factors should be assessed. Alternatively, family history may be a good proxy for genetic risk alleles (although family history information on all subjects was not available in our study). Among men in the EIRA, the primary model classified cases as having a lower predicted probability than the expanded model (lower sensitivity), but had specificity similar to the full model. Among women in the EIRA, the primary model resulted in a significant decrease in sensitivity, with 42% of cases being classified with lower predicted probabilities; however, there was also an increase in specificity, with 33% of controls being classified with lower predicted probabilities. This would result in fewer potential cases qualifying to be randomized in a trial among men and women, but also a reduced false-positive fraction among women, lowering the chance of unnecessary treatment. However, the performance of the primary model compared to the expanded model in the NHS data set suggests lower specificity, with 23% of the controls being incorrectly reclassified with a higher predicted probability and no change in sensitivity. This would lead to an increase in the false-positive fraction, which would result in more women being enrolled and treated unnecessarily. When considering a prognostic model for enrollment in a prevention trial targeted to high-risk groups, maintaining high specificity (and therefore a low false-positive rate) is a higher priority than increasing sensitivity in the setting of treatment with a potentially toxic medication.
One limitation of this study is that models were developed in primarily white populations. There may be E, G, or GEI factors that might lead to differences in risk models in nonwhite populations. Therefore, the predictive models developed in this data set need to be validated in other populations. Another limitation of this study is that some factors in the models were originally discovered in the NHS or EIRA. These include region of residence and GSTT1 × smoking interaction (NHS), and HLA × smoking, silica, and solvents (EIRA) ([10, 14, 15, 42]). Therefore, the results of the expanded models could be inflated due to overfitting. We also recognize that assessing the models on the data set in which they were developed leads to optimistic measures of variance explained and discrimination. The primary limitation of the study design is that in a matched case–control study, we cannot estimate weights for each factor that could be used in other studies or in clinical prediction. However, the wealth of data allowed us to gain insights into the optimal collection of variables that should be included in prospective studies for development of prediction rules. Further, although the NHS cohorts involved >230,000 women, blood was collected on <25% of women, therefore limiting the sample size for analyses that include G. Finally, blood samples were collected after RA diagnosis in most subjects in the NHS and in all subjects in the EIRA; therefore, we do not have biomarker data among preclinical RA collected prior to the onset of RA symptoms, such as autoantibodies or cytokines, that have been shown to be strongly associated with risk of development of RA ([48, 70-74]).
The strength of this study is the use of statistical metrics to parse the effect of the addition of each factor to a model. Over the last decade, limitations of using change in AUC as a primary outcome when comparing both diagnostic and prognostic models have been widely discussed ([54, 55, 68, 75]). Cook and Pepe et al separately showed that any new factor would require an exceptionally large odds ratio to show an impact on the AUC. Both Pencina et al ([54, 55]) and Cook pointed out that for assessing the utility of a prognostic model, we are interested in reclassification to a more appropriate risk category (higher for cases and lower for controls), rather than discrimination. By using the IDI, which measures reclassification, we can perform model comparison because the same scale is used in all situations. Even with our relatively modest sample size in the NHS, we had >90% power to find an IDI as small as 0.02 and >80% power to find a cNRI as small as 0.15. We were able to show the benefit of adding G and GEI terms to an E model in the expanded models.
These results demonstrate the challenges of creating a simple risk prediction model for primary prevention trials of RA. Therefore, further work on development of highly specific prediction models using prospective cohorts to assess weights is still needed for primary prevention trials. Our data suggest that addition of cumulative RA genetic variants as a GRS and, in some cases, GEI to an RA prediction model with E, significantly improves the predictive ability of the model for this complex human autoimmune disease. However, the inclusion of highly specific biomarkers such as ACPA in risk models is likely to improve risk stratification of asymptomatic individuals. Further, we show that identifying risk factors separately in men and women is important, particularly if occupational exposures differ. Ultimately, this collection of variables should be used to estimate weights in other cohorts, and then the models validated and their performance assessed before using them for risk stratification.
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Karlson had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. Karlson, Costenbader, Klareskog, Alfredsson, Chibnik.
Acquisition of data. Karlson, Costenbader, Klareskog, Alfredsson.
Analysis and interpretation of data. Karlson, Ding, Keenan, Liao, Costenbader, Alfredsson, Chibnik.
The authors would like to thank all of the participants and staff of the NHS in the US and the EIRA in Sweden for their contributions.