To improve prognostic ability in ankylosing spondylitis (AS), we sought to identify demographic, clinical, and immunogenetic characteristics associated with radiographic severity in a large cohort of patients.
Patients with AS for ≥20 years were enrolled in a cross-sectional study (n = 398). Pelvic and spinal radiographs were scored using the Bath Ankylosing Spondylitis Radiology Index for the spine (BASRI-s), and radiographic severity was measured as the BASRI-s/duration of AS. Clinical factors and HLA–B, DR, DQ, and DP alleles associated with the highest quartile of the distribution of radiographic severity were identified by first using random forests and then using multivariable logistic regression modeling. Similar procedures were used to identify factors associated with the lowest quartile of radiographic severity.
Radiographic severity (being in the top quartile of BASRI-s/duration of AS) was associated with older age at onset of AS (odds ratio [OR] 1.10 per year), male sex (OR 1.90), current smoker (OR 4.72), and the presence of HLA–B*4100 (OR 11.73), DRB1*0804 (OR 12.32), DQA1*0401 (OR 5.24), DQB1*0603 (OR 3.42), and DPB1*0202 (OR 23.36), whereas the presence of DRB1*0801 was strongly negatively associated (OR 0.03). Being in the lowest quartile of BASRI-s/duration of AS was also less likely among those with an older age at onset of AS (OR 0.94 per year), men (OR 0.28), and current smokers (OR 0.29).
The accuracy of the prognosis of radiographic severity in AS is improved by knowing the age at disease onset, sex, smoking history, and the presence of HLA–B*4100, DRB1*0804, DQA1*0401, DQB1*0603, DRB1*0801, and DPB1*0202 alleles.
Patients newly diagnosed with ankylosing spondylitis (AS) are often concerned about the future course of their illness, and in particular, about the possibility of spinal fusion. These concerns are well founded, because the severity of spinal arthritis is an important determinant of health outcomes in patients with AS (1–5). At present, our ability to provide accurate prognostic information for individual patients is poor. Because spinal arthritis is generally progressive and ankylosis is irreversible, radiographic severity increases with the duration of AS (5–9). However, there is substantial heterogeneity in radiographic severity among patients with similar durations of AS, suggesting a strong influence of other factors (6, 9, 10).
Little is known about factors that affect radiographic severity. Men have more severe radiographic changes than women, as do patients with hip arthritis (5, 9, 10–15). Patients with a history of iritis have been reported to have more severe radiographic changes, particularly in the cervical spine (3, 10). Cigarette smoking has been inconsistently associated with radiographic severity (5, 16). The fact that few clinical predictors have been identified suggests that radiographic severity may be largely genetically determined. This idea is supported by family studies that demonstrated a high heritability of radiographic severity and concordance among siblings (17, 18). Although the presence of HLA–B27 has not been associated with more severe radiographic changes, it is not known if other HLA alleles influence radiographic severity (19–21). Previous studies of immunogenetic associations have not examined radiographic changes, yet radiographic severity is thought to be the feature of AS severity that is most highly genetically influenced (17, 18, 22–24).
To develop a system that would provide prognostic information in AS, we tested the association of clinical characteristics and immunogenetic markers with radiographic severity in a large group of patients. Because radiographic changes occur slowly, patients with early AS will not have accrued enough time for the severity of radiographic changes to become evident. To reduce misclassification of radiographic severity, we limited the analysis to patients with AS for 20 years or more.
Patients were participants in the Prospective Study of Outcomes in Ankylosing Spondylitis, and were recruited from clinics of the investigators, local rheumatologists, and from the community (25–27). All patients who met the modified New York criteria for AS (28), were ≥18 years old, and were interested in participating in a study of genetics of AS were enrolled. For this analysis, we included patients with AS for ≥20 years, timed from the onset of persistent musculoskeletal symptoms. Patients completed questionnaires on personal and medical history and underwent a clinical evaluation by a study rheumatologist (MMW, JCD, JDR, MHW). All patients underwent phlebotomy for genotyping and had pelvic and spinal radiographs performed.
Clinical and demographic data examined as potential prognostic factors included age at onset of AS, sex, ethnicity, education level, marital status, smoking history (current, former, or nonsmoker, as well as pack-years of smoking), and number of comorbid medical conditions. Weighted occupational physical activity was calculated based on how much physical activity was performed (little, moderate, or much) in each past job and how many years were spent in each job (25). Also included were a family history of AS in a first-degree relative, history of inflammatory bowel disease, history of iritis, and self-reported recreational activity in their teens and twenties relative to peers (less = 1, same = 2, or more = 3). Recreational activity was limited to these age groups because previous analyses suggested that the level of recreational activity in later years was a consequence of, rather than a predictor of, AS severity.
Genomic DNA was extracted from peripheral blood leukocytes by standard techniques. HLA–B alleles were examined by sequence-specific primer typing using commercially available kits (Dynal/Invitrogen, Carlsbad, CA). HLA–DRB1, DQA1, DQB1, and DPB1 typing was done by standard oligotyping techniques. High-resolution DRB1 typing was further confirmed by a nucleotide sequence analysis of exon 2.
We used the Bath Ankylosing Spondylitis Radiology Index for the spine (BASRI-s) to assess radiographic severity. The BASRI-s is the sum of the average scores of the sacroiliac joints, the lumbar spine, and the cervical spine (possible range 0–12) (29), and is a reliable and valid measure of radiographic damage in AS (9, 10, 30–33). Although other scoring methods may be more sensitive to change, in this study we were concerned with the status of severity and not with radiographic progression. All radiographs were scored by a single musculoskeletal radiologist (TJL). The intrareader reliability of the BASRI-s, based on 2 readings of the films of 50 patients performed at least 3 months apart, was 0.987 (95% confidence interval [95% CI] 0.981–0.991).
We used a 2-stage approach to identify prognostic factors: first, using random forests to identify and validate the predictors from among a large group of candidate variables, and second, using logistic regression to determine the strength of association of these predictors with radiographic severity and to build prognostic models.
A random forest is an ensemble classifier that uses multiple classification trees for prediction (34). A single classification tree is a hierarchical procedure that uses recursive partitioning to identify subgroups of patients that are increasingly homogenous with respect to the outcome. For example, the tree program first splits the patient group into 2 subgroups based on the characteristic that best segregates the patients in the severe radiographic damage group from those in the lesser damage group. The program then repeats this process for each subgroup until subgroups of sufficient homogeneity are found. The procedure is nonparametric, not model based, and identifies the independent variables that best segregate subgroups of patients.
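The single split described above can be illustrated with a minimal sketch. It assumes Gini impurity as the splitting criterion, which is a common choice but is not stated in this report; the function names and the toy data are hypothetical and are not drawn from the study cohort.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure subgroup)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) for the best binary split.

    Rows with value <= threshold go to the left subgroup, the rest to the right.
    If no split improves on the unsplit group, feature_index and threshold are None.
    """
    best = (None, None, gini(labels))
    n = len(rows)
    for j in range(len(rows[0])):
        # The largest observed value cannot separate the group, so skip it.
        for t in sorted({row[j] for row in rows})[:-1]:
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best

# Hypothetical toy data: columns are [age_at_onset, male]; label 1 = severe damage.
rows = [[18, 0], [20, 1], [22, 0], [30, 1], [34, 1], [40, 0]]
labels = [0, 0, 0, 1, 1, 1]
feature, threshold, impurity = best_split(rows, labels)
```

A full tree would apply `best_split` again within each resulting subgroup until the subgroups are sufficiently homogeneous.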
Random forests work by generating a large number of classification trees. Each tree is developed on a random sample of patients (bootstrap sample with replacement) and uses a randomly sampled subset of independent variables (sampled without replacement) as candidate predictors at each node in the tree. Accuracy of prediction is determined by how well each tree classifies each patient who was omitted from the development of the tree. The proportion of times that a test patient was misclassified by each tree, averaged over all patients, is a relatively unbiased estimate of classification error. This process intensively cross-validates the prediction and obviates the need for separate training and test samples for validation.
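The resampling steps described above can be sketched as follows. This is an illustration under stated assumptions, not the software used in the study: the function names are hypothetical, each "tree" is represented only as a callable that returns a predicted class, and the counts of 385 patients, 155 variables, and 12 variables per node are taken from the text.

```python
import random

def bootstrap_split(n_patients, rng):
    """Draw a bootstrap sample (with replacement); return it with the omitted indices."""
    in_bag = [rng.randrange(n_patients) for _ in range(n_patients)]
    out_of_bag = set(range(n_patients)) - set(in_bag)
    return in_bag, out_of_bag

def candidate_features(n_features, m, rng):
    """Sample m distinct variable indices (without replacement) to try at one node."""
    return rng.sample(range(n_features), m)

def oob_error(predictors, oob_sets, rows, labels):
    """Average, over patients, of the proportion of trees that misclassify a patient,
    counting only the trees that were developed without that patient."""
    per_patient = []
    for i, (row, label) in enumerate(zip(rows, labels)):
        votes = [predict(row) for predict, oob in zip(predictors, oob_sets) if i in oob]
        if votes:  # skip patients who were used to develop every tree
            per_patient.append(sum(v != label for v in votes) / len(votes))
    return sum(per_patient) / len(per_patient) if per_patient else float("nan")

rng = random.Random(0)  # fixed seed so the sketch is reproducible
in_bag, out_of_bag = bootstrap_split(385, rng)
node_features = candidate_features(155, 12, rng)

# Toy check with two constant "trees" and three patients (hypothetical values).
toy_trees = [lambda row: 1, lambda row: 0]
toy_oob = [{0, 1}, {1, 2}]
toy_error = oob_error(toy_trees, toy_oob, [[0], [0], [0]], [1, 1, 0])
```

Because each bootstrap sample leaves out roughly one-third of patients, every tree has its own set of test patients, which is why no separate validation sample is required.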
The relative importance of independent variables is determined by first counting the number of test patients correctly classified by each tree, then randomly altering the value of one independent variable of the test patients (e.g., changing the sex from male to female randomly), and determining if the tree correctly classifies these patients. A large difference in the number of patients correctly classified when the independent variable was altered indicates that the variable is important. This difference is averaged over all trees and repeated for each independent variable. We used the mean decrease in accuracy measure to rank the relative importance of predictors. Analyses were performed using the randomForest package for R (available at http://cran.us.r-project.org).
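The importance measure described above can be sketched for a single tree as follows. The sketch assumes that "randomly altering" a variable is done by shuffling its values among the test patients, which is the usual permutation approach; the function names, the toy classifier, and the data are hypothetical.

```python
import random

def accuracy(predict, rows, labels):
    """Proportion of test patients that the tree classifies correctly."""
    return sum(predict(row) == y for row, y in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, feature, rng):
    """Decrease in accuracy after shuffling one variable among the test patients."""
    baseline = accuracy(predict, rows, labels)
    shuffled = [row[feature] for row in rows]
    rng.shuffle(shuffled)
    permuted = [row[:feature] + [value] + row[feature + 1:]
                for row, value in zip(rows, shuffled)]
    return baseline - accuracy(predict, permuted, labels)

def toy_tree(row):
    """Hypothetical one-split tree: predicts severe damage if onset age is >= 25."""
    return int(row[0] >= 25)

# Hypothetical test patients: columns are [age_at_onset, male].
rows = [[18, 1], [20, 0], [22, 1], [30, 0], [34, 1], [40, 0]]
labels = [toy_tree(row) for row in rows]  # labels the toy tree classifies perfectly

rng = random.Random(1)
importance_age = permutation_importance(toy_tree, rows, labels, 0, rng)
importance_sex = permutation_importance(toy_tree, rows, labels, 1, rng)
```

Shuffling a variable that the tree never uses (sex, in this toy case) leaves accuracy unchanged, so its importance is zero; in a forest, the decrease would be averaged over all trees.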
Random forests are useful when the number of independent variables exceeds the number of subjects, a situation that limits the application of many conventional statistical methods, but which is common in studies of genetic associations (35). Random forests can also identify rare characteristics as important predictors, whereas these are often excluded from analyses using conventional statistical methods. In addition, random forests are useful for identifying associations with independent variables when none is considered a priori to be more important, as was the case with the immunogenetic markers in this study.
Even with limiting the analysis to patients with AS for ≥20 years, there was a wide range of durations of AS present in the sample. To standardize the measure of radiographic severity further, we defined radiographic damage as the BASRI-s divided by the duration of AS, and classified those in the top quartile as having severe damage (BASRI-s/duration ≥0.3639). The duration of AS was dated from the onset of persistent musculoskeletal symptoms, and not the date of diagnosis. The candidate predictors included 14 clinical variables and dichotomous variables for each HLA–B, DRB1, DPB1, DQA1, and DQB1 allele (155 variables in total). Alleles were coded as present if one copy was present. Single random forests were constructed of 500 classification trees, using a randomly selected 75% subset of patients, 12 randomly selected independent variables tested at each node, and a weighted voting across trees such that the false-positive rate is then approximately equal to the false-negative rate. This process was repeated for 1,000 different random forests.
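The outcome definition above amounts to a simple ratio and a cut point, sketched here. The threshold of 0.3639 is the value reported in the text; the function names and the example patients are hypothetical, and the quantile method used in the study is not stated, so `top_quartile_cutoff` relies on the default method of Python's `statistics.quantiles` as an assumption.

```python
import statistics

SEVERE_THRESHOLD = 0.3639  # top-quartile cut point of BASRI-s/duration reported in the text

def severity_ratio(basri_s, duration_years):
    """Radiographic damage standardized by duration of AS (BASRI-s per year)."""
    if duration_years <= 0:
        raise ValueError("duration of AS must be positive")
    return basri_s / duration_years

def is_severe(basri_s, duration_years, threshold=SEVERE_THRESHOLD):
    """True if the patient falls at or above the severe-damage threshold."""
    return severity_ratio(basri_s, duration_years) >= threshold

def top_quartile_cutoff(ratios):
    """75th percentile of a list of ratios (default 'exclusive' method, an assumption)."""
    return statistics.quantiles(ratios, n=4)[2]

# Hypothetical patients as (BASRI-s, duration of AS in years).
patients = [(11.5, 25.0), (8.5, 31.8), (4.0, 40.0), (12.0, 30.0)]
ratios = [severity_ratio(b, d) for b, d in patients]
flags = [is_severe(b, d) for b, d in patients]
cutoff = top_quartile_cutoff(ratios)
```

Dividing by duration means that two patients with the same BASRI-s can be classified differently: a score of 12 reached in 30 years counts as severe, whereas the same score reached in 40 years does not.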
To refine the prediction, the 30 top-ranked predictor variables from the initial set of random forests were then tested in a second set of 1,000 random forests of 500 trees each. Although the rankings are useful to identify variables of prognostic interest, the rank of any variable is a function of the other variables in the list, and the aggregation of rankings from separate lists may not respect the original rankings. Importantly, the relative prognostic importance of 2 variables is not proportional to their rank order.
Therefore, in the second stage, we developed multivariate logistic regression models to determine the relative strength of association between predictors and radiographic damage. We used the category of radiographic damage as the dependent variable and the top-ranked predictors from the random forests as the independent variables, including at a minimum the 8 top-ranked variables. These analyses provided adjusted odds ratios (ORs) that estimated the likelihood that a patient with the clinical characteristic or allele would be in the more severe group. Although the random forest is very efficient at detecting important predictors, it is complex. We used logistic regression as an interpretable reference model. These analyses were performed using SAS programs (SAS Institute, Cary, NC).
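For reference, the standard form of a logistic regression model and the relation between its coefficients and the adjusted ORs can be written as follows. The notation is ours and is not taken from the original analysis: $p$ is the probability of being in the more severe group, $x_j$ is the $j$th predictor (for example, 1 if an allele is present and 0 otherwise), and $\beta_j$ is its coefficient.

```latex
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \sum_{j=1}^{k} \beta_j x_j ,
\qquad
\mathrm{OR}_j = e^{\beta_j}
```

Under this form, each adjusted OR multiplies the odds $p/(1-p)$ when the corresponding characteristic is present and the other predictors are held fixed; for a continuous predictor such as age at onset, the OR applies per unit (per year).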
The random forests were developed using 385 patients (of 402 patients in the cohort) who had complete data (excluding 4 patients with missing data on pack-years of smoking and 13 patients who lacked genotype information for any locus [missing, untypable, or new allele]). The logistic regression models were developed using 398 patients, excluding 4 patients with missing data on pack-years of smoking.
Because predictors of mild radiographic damage may be different from predictors of severe damage, we repeated the random forest analysis using patients in the lowest quartile of BASRI-s/duration of AS as the group of interest (BASRI-s/duration <0.1780). We also repeated the analysis comparing patients in the top tertile with those in the bottom tertile, but this analysis did not provide additional insights.
The features of the patients (n = 398) are shown in Table 1. The mean duration of AS was 31.8 years. The cohort was composed mostly of men, and 87% were HLA–B27 positive. Immunogenetic markers were highly polymorphic, with 45 different HLA–B alleles, 38 different DRB1 alleles, 16 different DQA1 alleles, 15 different DQB1 alleles, and 21 different DPB1 alleles among our patients. Three HLA–B alleles were present in >10% of patients (B*2705, B*4400, and B*0800). Similarly, 1 DRB1 allele (*0101), 5 DQA1 alleles (*03, *0101, *0102, *0201, and *0501), 5 DQB1 alleles (*0301, *0302, *0501, *0602 and *0201), and 4 DPB1 alleles (*0401, *0402, *0301, and *0201) were present in >10% of patients.
|Age, mean ± SD years||55.0 ± 10.8|
|Age at onset of AS, mean ± SD years||23.1 ± 7.8|
|Duration of AS, mean ± SD years||31.8 ± 10.0|
|Education level, mean ± SD years||16.0 ± 3.0|
|Weighted job activity (range 0–3), mean ± SD||1.8 ± 0.7|
|Recreational exercise in teens and twenties (range 0–3), mean ± SD||2.1 ± 0.6|
|Smoking status, %|
|Pack-years of smoking, mean ± SD|
|All patients||11.3 ± 18.7|
|Ever smokers only||20.4 ± 21.1|
|Number of comorbid conditions, %|
|Family history of AS, %||28|
|History of inflammatory bowel disease, %||4.5|
|History of iritis, %||42|
|HLA–B27 positive, %||87|
|BASFI (range 0–100), mean ± SD||40.9 ± 26.5|
|BASRI-s (range 0–12), mean ± SD||8.5 ± 3.2|
The median BASRI-s score was 9 and the mean ± SD BASRI-s/duration was 0.283 ± 0.13. The median (interquartile range) BASRI-s score was 11.5 (10–12) among those in the highest quartile of BASRI-s/duration, and 3.5 (3–6) among those in the lowest quartile.
We evaluated the association of 14 clinical variables and all HLA–B, DRB1, DQA1, DQB1, and DPB1 alleles with radiographic severity, using the 75th percentile of BASRI-s/duration as the threshold for severe damage. Among the 1,000 random forests, age at onset of AS was consistently ranked as the most important variable (Figure 1). The alleles DRB1*0801, DRB1*0804, DQA1*0401, and HLA–B*4100 were the next most highly ranked, followed by DQB1*0603, current smoker, male sex, and DPB1*0202, although these were less consistently associated with severe radiographic damage among different random forests. The mean ± SEM overall misclassification error was 30.34% ± 2.29%.
We next developed a multivariate logistic regression model to determine the strength of association between severe radiographic damage and variables identified as important in the random forests (Table 2). The likelihood of being in the group with severe damage increased with age at onset of AS, and men were almost twice as likely as women to be in this group. Current smokers were >4 times as likely as former smokers or nonsmokers to be in the severe damage group. Among the immunogenetic variables, HLA–B*4100, DRB1*0804, DQA1*0401, DQB1*0603, and DPB1*0202 were strongly associated with more severe radiographic damage. In contrast, adjusting for the other variables in the model, patients with DRB1*0801 were much less likely to be in the group with severe radiographic damage (adjusted OR 0.03). The model fit the data well (c statistic = 0.80, Hosmer-Lemeshow test P = 0.38). Variables ranked tenth and eleventh in the random forests were not significantly associated with severe radiographic damage, and were not included in the final model.
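The c statistic reported above measures discrimination: it is the proportion of all pairs of one patient in the severe group and one patient not in the severe group in which the model assigns the higher predicted probability to the patient in the severe group. A minimal sketch follows, with hypothetical function names and toy values that are not study data; ties are counted as one-half, which is the usual convention.

```python
def c_statistic(probabilities, outcomes):
    """Concordance between predicted probabilities and binary outcomes (1 = severe)."""
    cases = [p for p, y in zip(probabilities, outcomes) if y == 1]
    noncases = [p for p, y in zip(probabilities, outcomes) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one patient in each outcome group")
    concordant = 0.0
    for p_case in cases:
        for p_noncase in noncases:
            if p_case > p_noncase:
                concordant += 1.0   # model ranks the pair correctly
            elif p_case == p_noncase:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(noncases))

# Toy example: 2 severe and 2 nonsevere patients (hypothetical predictions).
toy_c = c_statistic([0.9, 0.7, 0.4, 0.2], [1, 0, 1, 0])
```

In the toy example, 3 of the 4 case-noncase pairs are ranked correctly, giving a c statistic of 0.75. A value of 0.5 corresponds to chance, and 1.0 to perfect discrimination.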
|Variable||Adjusted OR (95% CI)||P|
|Age of onset, per year||1.10 (1.07–1.15)||< 0.0001|
|Current smoker||4.72 (2.16–10.30)||< 0.0001|
Significant associations were present even though some immunogenetic alleles were uncommon. HLA–B*4100 was present in 6 patients (1.5%), DQA1*0401 was present in 38 patients (9.5%), DQB1*0603 was present in 33 patients (8.3%), DRB1*0804 was present in 6 patients (1.5%), DRB1*0801 was present in 24 patients (6.0%), and DPB1*0202 was present in 4 patients (1.0%). DQA1*0401 and both DRB1*0801 and DRB1*0804 are known to be in linkage disequilibrium, but in this analysis DQA1*0401 was associated with more severe radiographic damage, whereas DRB1*0801 was protective. Of the 38 patients who had DQA1*0401, 21 also had DRB1*0801 and 17 did not. None of the 21 patients with DRB1*0801 were in the severe radiographic group, whereas 12 (70%) of the 17 patients who were DQA1*0401 positive but DRB1*0801 negative were in the severe group, accounting for the divergent associations found for these linked alleles and indicating that the protective effect of DRB1*0801 dominated the effect of DQA1*0401.
The probability that a patient would be in the severe radiographic group was computed from the logistic model for men and women at 3 different ages at disease onset (23.1 [the group mean], 18, or 30 years), among current smokers and nonsmokers, and for selected combinations of HLA alleles (Table 3 and Supplementary Appendix A, available in the online version of this article at http://www3.interscience.wiley.com/journal/77005015/home). For the base case of a male nonsmoker with an onset of AS at age 23.1 years, the probability of being in the severe radiographic group was 0.181 in the absence of any of the risk alleles. This probability increased to >0.70 if either HLA–B*4100 or DRB1*0804 was present. This probability was 0.935 if both DRB1*0804 and DQA1*0401 were present, but decreased dramatically if DRB1*0801 was present. The probability of being in the severe radiographic group was lower among women than men for any combination of prognostic factors.
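The probabilities above can be approximately reproduced from the base-case probability and the adjusted ORs reported in the abstract, by converting the probability to odds, multiplying by the OR of each factor present, and converting back. The sketch below assumes that the ORs combine multiplicatively (no interaction terms) and uses the rounded published ORs, so results may differ slightly in the third decimal place from those derived from the fitted coefficients. The function name is hypothetical.

```python
def probability_with_factors(base_probability, odds_ratios):
    """Apply a list of odds ratios to a base-case probability."""
    odds = base_probability / (1.0 - base_probability)
    for odds_ratio in odds_ratios:
        odds *= odds_ratio
    return odds / (1.0 + odds)

# Base case from the text: male nonsmoker, onset at age 23.1 years, no risk alleles.
BASE = 0.181

# Adjusted ORs as reported (rounded) in the text.
OR_B4100 = 11.73
OR_DRB1_0804 = 12.32
OR_DQA1_0401 = 5.24
OR_DRB1_0801 = 0.03

p_b4100 = probability_with_factors(BASE, [OR_B4100])              # about 0.72
p_0804 = probability_with_factors(BASE, [OR_DRB1_0804])           # about 0.73
p_0804_0401 = probability_with_factors(BASE, [OR_DRB1_0804, OR_DQA1_0401])
p_0801 = probability_with_factors(BASE, [OR_DRB1_0801])           # below 0.01
```

With the rounded ORs, the combination of DRB1*0804 and DQA1*0401 gives a probability within rounding error of the reported 0.935, and the presence of DRB1*0801 alone lowers the base-case probability from 0.181 to <0.01.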
|Age at AS onset, years||Current smoker||HLA–B*4100||DRB1*0804||DRB1*0801||DQA1*0401||DQB1*0603||DPB1*0202||Probability, men||Probability, women|
Because the identification of prognostic factors for mild AS may also be useful, we used the same approach to examine variables associated with less severe radiographic changes, using the 25th percentile of BASRI-s/duration as the threshold for distinguishing milder from more severe radiographic damage. In this analysis, sex was consistently ranked the highest (Figure 2). Age at onset of AS was the next most highly ranked variable, followed by 2 variables characterizing smoking status, DRB1*0901 and DQA1*0102, ethnicity, and recreational activity in their teens and twenties. The mean ± SEM overall misclassification error among random forests was 36.334% ± 2.74%.
We used the variables ranked the highest as input for the multivariate logistic model to assess the strength of their associations with less severe radiographic damage (Table 4). The likelihood of being in the less severe group was lower for those with an older age at onset of AS, men, and current smokers. Patients with DQA1*0102, who comprised 30% of the sample, were somewhat less likely to be in the less severe group, whereas those with DRB1*0901 were somewhat more likely to be in this group. The model fit the data well (c statistic = 0.75, Hosmer-Lemeshow test P = 0.65). No other highly ranked variables were significantly associated with membership in the less severe group. In a separate model that included pack-years of smoking instead of smoking status, those with a history of >20 pack-years (adjusted OR 0.53, 95% CI 0.25–1.10; P = 0.09) were somewhat less likely to be in the less severe group compared with nonsmokers.
|Variable||Adjusted OR (95% CI)||P|
|Age of onset, per year||0.94 (0.89–0.97)||0.0005|
|Current smoker||0.29 (0.09–0.85)||0.03|
|Recreational exercise during teens and twenties, per 1 unit||1.20 (0.79–1.81)||0.39|
Accurate prognostic information can be used to counsel patients about their illness and its future course. It also allows risk stratification, which can be used in clinical trials to select the subgroups of patients most at risk for the outcome of interest, thereby improving the efficiency of the trial. Radiographic severity is an important outcome for which to identify prognostic markers because the degree of spinal damage is highly variable among patients, because it affects health outcomes of concern to patients, and because testing of treatments for their ability to slow radiographic damage is of great interest.
Our prognostic model differentiated patients at risk for severe radiographic damage based on a small set of clinical variables and HLA markers. The key prognostic variables were identified and validated using random forests, a statistical machine learning method that allows testing of the associations of a large number of candidate predictors. With this approach, we were able to identify several alleles of prognostic importance among the highly polymorphic HLA markers. Nineteen percent of patients had at least one of the 5 HLA alleles prognostic for severe radiographic damage, and 7.8% had 2 or more of these alleles. Among these patients, the presence of these risk alleles greatly changed the probability that severe radiographic damage was present, demonstrating that even if alleles are uncommon, they can be jointly predictive with good accuracy. In the remaining patients, prognostic information was contributed by age at disease onset, sex, and smoking status. These 3 variables can be used to estimate the likelihood of severe radiographic damage after 20 years of AS, with the probability being higher for men, smokers, and those with onset of AS at an older age.
Men with AS had more severe radiographic damage than women, confirming previous studies (5, 9, 10–15). We also found that the likelihood of more severe damage was greater among those with an older age at symptom onset. This association may result if these patients have more extensive spinal inflammation or are more prone to bone formation. Some (27, 36, 37), but not all (5, 10), studies support this possibility. This association may also result if patients with an older age at disease onset are more likely to be asymptomatic early in their illness or to have stuttering or milder symptoms, which might cause them to underestimate the duration of their AS. HLA–B27 alleles were not associated with radiographic severity, supporting previous studies (19–21).
Of the HLA alleles found to be associated with radiographic severity, HLA–B*4100, DRB1*0804, DQA1*0401, DQB1*0603, and DPB1*0202 were associated with an increased risk of radiographic severity. Although the presence of DRB1*0801 was associated with a decreased likelihood of being in the highest 25% of the distribution of BASRI-s/duration of AS, this does not necessarily mean that patients with this allele had mild radiographic changes; they could have had average radiographic severity. We do not know if these alleles are mechanistically related to the development of radiographic damage in AS, or if they are linked to other genes that are directly related. HLA–DRB1*0801 differs from DRB1*0804 at 2 amino acid positions that are crucial in antigen binding and T cell receptor interaction, although how this might influence radiographic severity is unknown. However, the absence of information about the pathogenetic importance of these alleles does not diminish their prognostic value, in the same way that absence of information about the ways in which male sex influences radiographic severity does not diminish its prognostic value.
Men, patients with an older age at onset of AS, and current smokers were not only more likely to be in the group with more severe radiographic damage, but were also less likely to be in the group with relatively little radiographic damage. This finding emphasizes the importance of these factors in stratifying risk. Two previous studies of the association between smoking and radiographic severity in AS reported conflicting results (5, 16). However, smoking has been consistently associated with more severe limitations in physical function in patients with AS, suggesting indirectly that smoking may increase skeletal damage (5, 16, 25, 38, 39). Although patients who had DRB1*0901 were more than 3 times more likely than those without this allele to be in the less severe group, this association was marginally statistically significant. Those with DQA1*0102 were somewhat less likely to be in the lowest quartile of radiographic severity, indicating an increased risk for average or severe radiographic damage among these patients.
The strengths of this study include the large well-characterized sample, examination of a number of clinical characteristics and HLA loci, and testing of prognostic factors for both severe and less severe radiographic damage. The use of random forests allowed us to test and validate associations of the highly polymorphic HLA alleles, even in this circumstance when the ratio of the number of alleles to the number of subjects was relatively high. However, the misclassification error rates suggest that prediction of the random forests could be improved, and indicate that additional prognostic factors are still to be discovered. These additional factors are likely to be other genetic markers, which can easily be incorporated in extensions of this model. The relative importance of HLA alleles will best be determined after other prognostic markers have been identified and the strengths of their association are estimated. Other prognostic markers may be more common but have weaker associations with radiographic severity. This study is limited in that we did not examine AS activity or certain clinical features, such as nonsteroidal antiinflammatory drug use, as potential prognostic factors due to the absence of complete historical data. Prognostic variables were either time invariant or able to be assessed at the time of diagnosis of AS. We likely underestimated radiographic severity in some patients who had reached the maximum BASRI-s score. We limited the study to patients with AS for ≥20 years to allow time for patients to develop radiographic changes and minimize false-negative associations of the prognostic variables. Timing the onset of AS can be imprecise. Inaccurate recall may have introduced variation, but recall would not be expected to vary by HLA status, and therefore would not confound associations with HLA alleles. 
BASRI scores in this cohort were similar to those of patients with ≥20 years of AS in other reports, suggesting that the sample was not atypical with respect to radiographic severity (9).
Our models demonstrate that age at onset of AS, sex, smoking history, and selected HLA alleles predict radiographic severity in AS. The models can easily incorporate new genetic markers as they are discovered to provide updated prognostic estimates that are able to stratify risks. In addition to their prognostic importance, the HLA associations can be investigated for potential clues to the pathogenesis of radiographic damage in AS.
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Ward had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. Ward, Malley, Davis, Reveille, Weisman.
Acquisition of data. Ward, Learch, Davis, Reveille, Weisman.
Analysis and interpretation of data. Ward, Hendrey, Malley, Davis, Reveille, Weisman.
We thank Stephanie Morgan, Lori Guthrie, Felice Lin, and Laura Diekman for their assistance.