Rheumatoid arthritis (RA) is a common autoimmune disease characterized by chronic inflammation of the synovial membrane, which can lead to joint damage and a variety of other clinical manifestations. The number of genes reported to be associated with susceptibility to RA continues to grow, now totaling ∼60. Although RA affects individuals from diverse ethnic backgrounds, genome-wide association studies (GWAS) of RA have been performed only in individuals of European or east Asian (particularly Japanese) ancestry. In addition, as is the case with most complex diseases, a substantial proportion of “missing heritability” remains to be identified in RA. The difference between RA heritability estimates from family-based studies and the variance explained by single-nucleotide polymorphisms (SNPs) from GWAS has not been investigated in non-European populations.
In this issue of Arthritis & Rheumatism, Negi et al report a novel association of the gene ARL15 (ADP-ribosylation factor–like 15) from the first GWAS of RA in North Indians. Although India's population of 1.2 billion comprises thousands of ethnic groups characterized by differences in language, customs, and religion, 2 ancient populations are ancestral to most present-day Indians: ancestral North Indians and ancestral South Indians. Of the 2 ancestral groups, ancestral North Indians are genetically closer to Middle Easterners, Central Asians, and Europeans. Ancestral North Indian ancestry ranges from 39% to 71% in most Indian groups, and the degree of ancestral North Indian ancestry is reportedly higher in traditionally upper-caste groups and Indo-European speakers. In the study by Negi et al, there is a clear substructure in the ancestral North Indian population, with 3 clusters identified using multidimensional scaling (see Figure 1b in the article by Negi et al), but single-marker tests were adjusted for the resulting genomic inflation (see Figure 2 in the article by Negi et al).
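To make the idea of adjusting for genomic inflation concrete, the following is a minimal sketch of the widely used genomic control correction. It assumes 1-degree-of-freedom chi-square association statistics; the exact procedure used by Negi et al is not described here, and the function names and example statistics are hypothetical.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Inflation factor: observed median statistic over the expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Divide each statistic by the inflation factor (never inflate: floor at 1)."""
    lam = max(genomic_inflation(chi2_stats), 1.0)
    return [s / lam for s in chi2_stats]


# Hypothetical single-marker test statistics, for illustration only.
example_stats = [0.2, 0.5, 0.91, 1.5, 3.0]
inflation = genomic_inflation(example_stats)
adjusted_stats = genomic_control(example_stats)
```

An inflation factor well above 1 suggests that population substructure (or another systematic bias) is inflating the test statistics; after correction, the median statistic matches its expectation under the null hypothesis.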
The association of ARL15 with RA reported by Negi et al underscores the importance of the search for genetic underpinnings of RA and other immune-mediated diseases among different ethnicities. There are several reasons to study RA genetics among different ethnicities: to look for different, ethnic-specific risk factors (as was the focus of Negi et al), to look for risk factors shared across multiple ethnicities, and to use divergent linkage disequilibrium patterns to narrow the distance between the tag polymorphism and the causal variant.
The presence of a significant number of ethnic-specific risk alleles for RA might lead to the development of ethnic-specific diagnostic and therapeutic tools. However, associations that are limited to one or more but not all ethnicities, such as that of PTPN22 with RA in Europeans but not in African Americans, appear to be the exception rather than the rule. The paucity of ethnic-specific risk alleles points to the potential value of large-scale genetic association studies across RA patients of different ethnicities, namely, that they enable trans-ethnic analyses. Analyzing large groups of RA patients of different ethnicities together permits fine-mapping of causal variants and provides increased statistical power to identify new genetic associations. Such an approach has been used to identify new loci associated with RA as well as broader phenotypes such as serum protein levels.
The findings of Negi et al underscore the importance of adipocytokine pathways in RA. As reviewed by Müller-Ladner and Neumann, adipocytokines such as adiponectin are produced by synovial fibroblasts, are present in substantial amounts in the serum and joints of patients with inflammatory joint diseases, and can up-regulate proinflammatory pathways and RANKL-dependent osteoclast activation. In addition, serum adiponectin levels are associated with radiographic damage in RA. While the association between RA and cardiovascular disease (CVD) is known, there are conflicting data on whether circulating adiponectin levels are associated with CVD. Negi et al found that the risk allele (C) of the SNP rs255758 influences adiponectin levels in RA patients. Given the links between RA and ARL15 and between ARL15 and adiponectin, it is interesting to speculate that ARL15 variants may contribute to the CVD phenotype in RA through the adiponectin pathway.
In addition to conventional association analysis, the investigators used a machine learning approach, namely, support vector machine (SVM) analysis, to identify novel genetic susceptibility loci for RA. The SVM approach is a well-developed machine learning technique used in computer science for pattern recognition; it uses a set of training data to learn how to classify objects. As applied to the problem of disease risk prediction, the SVM approach attempts to identify an optimal set of genetic variants that can distinguish cases from controls. This question is different from asking whether individual markers explain variation in case–control status. Therefore, it is not surprising that 6 additional loci, not including ARL15, were found to be most informative for RA risk prediction in ancestral North Indians. SVM analysis has been used to predict genetic susceptibility based on genotype data in complex diseases such as type 2 diabetes mellitus. Another interesting application of the SVM methodology would be to preselect known RA risk variants from other or all ethnicities and test their predictive ability in the ancestral North Indian sample set. This analysis would then help to answer the question of the extent to which ancestral North Indians and other ethnicities share RA risk factors.
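As an illustration of how an SVM classifies cases and controls from genotype data, the sketch below trains a linear soft-margin SVM by subgradient descent on the regularized hinge loss. It is not the pipeline used by Negi et al: the genotype matrix (risk-allele counts of 0, 1, or 2 at three SNPs), the labels, the function names, and the hyperparameters are all invented for illustration.

```python
def decision_value(w, b, x):
    """Signed score of genotype vector x relative to the separating hyperplane."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.001, eta=0.01, epochs=1000):
    """Minimize lam/2 * ||w||^2 + hinge loss with fixed-step subgradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * decision_value(w, b, xi)
            # Shrink the weights (gradient of the L2 regularization term).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss subgradient: update only when the margin is violated.
            if margin < 1.0:
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def predict(w, b, x):
    """Return +1 (case) or -1 (control)."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical toy data: rows are individuals, columns are SNPs (allele counts).
X = [[2, 2, 0], [2, 1, 1], [1, 2, 2], [2, 2, 1],   # cases
     [0, 0, 1], [0, 1, 0], [1, 0, 2], [0, 0, 0]]   # controls
y = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

The fitted weights apply to the set of SNPs jointly, which is why a marker can be informative for prediction without showing the strongest single-marker association. In practice, predictive ability would be judged on individuals held out from training rather than on the training set itself.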
Use of different statistical analytic approaches may lead to the discovery of additional novel genetic associations with RA in ancestral North Indians. For example, in the replication stage of the study by Negi et al, at least 20 SNPs tagging the SNP with the lowest P value in each region could have been used, rather than 1 SNP per region, to provide more detailed data. In addition, the significance threshold used to account for multiple testing (5 × 10⁻⁸) could have been relaxed if the spectral decomposition method had been used to find the effective number of SNPs for association. Finally, during the quality control analysis, using a P value of 1 × 10⁻⁷ as a threshold to exclude SNPs not in Hardy-Weinberg equilibrium may have been too stringent; all SNPs with deviation from Hardy-Weinberg equilibrium could have been included in the analysis with subsequent use of zoom plots to investigate clustering of genotypes and ensure correct calling by the algorithm.
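The two thresholds discussed above can be made concrete with a short sketch. The first function is the standard 1-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium computed from genotype counts; the second is a Bonferroni-style threshold, in which 5 × 10⁻⁸ corresponds to 0.05 divided by 1 million independent tests. The function names, the genotype counts, and the effective number of 500,000 tests are hypothetical, and the spectral decomposition step that would estimate the effective number of SNPs is not implemented here.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for n_tests independent tests."""
    return alpha / n_tests


# Hypothetical genotype counts: one SNP in equilibrium, one with no heterozygotes.
chi2_ok, p_ok = hwe_chi_square(25, 50, 25)
chi2_bad, p_bad = hwe_chi_square(50, 0, 50)

genome_wide = bonferroni_threshold(0.05, 1_000_000)   # approximately 5e-8
relaxed = bonferroni_threshold(0.05, 500_000)         # hypothetical effective number
```

A SNP such as the second example, with a complete absence of heterozygotes, is the kind of marker whose genotype cluster plot would be worth inspecting for calling errors. A smaller effective number of tests yields a numerically larger, and therefore less stringent, significance threshold.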
In summary, the finding of an association of ARL15 with RA in ancestral North Indians illustrates the utility of such genetic studies in populations of different ethnicity. By performing larger analyses, including trans-ethnic meta-analyses, the totality of the genetic contributions to this disease may finally start to become apparent. These studies have important implications for the full delineation of the multiple pathways involved in RA. Genetic profiles may allow identification of subsets of patients with different dominant pathologic pathways. This stratification may in turn lead to tailored approaches to early diagnosis or targeted therapy. For example, persons with variants in CTLA4, which is associated with RA, may be more (or less) likely to respond to abatacept (CTLA-4Ig). Through incremental gains, the big picture of RA becomes clearer and hopefully will lead to improved outcomes for patients with this chronic disease.