Functional genomics in osteoarthritis: Past, present, and future

ABSTRACT Osteoarthritis (OA) is a common complex disease of high public health burden. OA is characterized by the degeneration of affected joints leading to pain and reduced mobility. Over the last few years, several studies have focused on the genomic changes underpinning OA. Here, we provide a comprehensive overview of genome‐wide, non‐hypothesis‐driven functional genomics (methylation, gene, and protein expression) studies of knee and hip OA in humans. Individual studies have generally been limited in sample size and hence power, and have differed in their approaches; nonetheless, some common themes have started to emerge, notably the role played by biological processes related to the extracellular matrix, immune response, the WNT pathway, angiogenesis, and skeletal development. Larger‐scale studies and streamlined, robust methodologies will be needed to further elucidate the biological etiology of OA going forward. © 2016 Orthopaedic Research Society. Published by Wiley Periodicals, Inc. J Orthop Res 34:1105–1110, 2016.

Osteoarthritis (OA) is a degenerative joint disease characterized by joint pain and stiffness. OA mostly affects the knee, hip, and hand joints, and leads to changes in multiple joint tissues. Radiographic evidence of osteoarthritis is present in over 40% of individuals over the age of 70, and the incidence of OA increases with age. 1 Treatment of OA is often unsuccessful in alleviating joint pain, leading to total joint replacement (TJR). Known epidemiological risk factors for OA include obesity (especially for knee OA), joint injuries, and bone morphology. 2 Twin and family studies have documented that genetic factors also play an important role in the etiology of OA. The heritability of knee OA is estimated to be about 40%, while the heritability of hip and hand OA is estimated to be about 60%. 3 Genetic association studies have to date identified 17 OA risk loci, 4-6 many of which are specific to the joint, sex, or ethnic group (European or Asian). All of these variants are common in frequency, and have only a small to moderate effect on risk of OA. The genetic studies have shown that OA is a complex disease, so that many different genetic risk variants will contribute to the susceptibility. This is in line with many other human traits, such as height 7 and BMI, 8 and many diseases, such as type 2 diabetes and schizophrenia.
For OA, the known associations have not yet proven sufficient to pinpoint biological pathways involved in disease etiology. However, the disease tissue is accessible through TJR surgery. This offers an opportunity to carry out functional genomics studies, examining the methylation, RNA, and protein level signatures of OA. Notably, genetics studies of OA aim to identify heritable risk factors. By contrast, functional genomics studies can establish how molecular processes change in disease; such changes can reflect causes or consequences of the disease and are of a dynamic nature.
The techniques to interrogate methylation, RNA, or protein levels genome-wide have recently become more affordable and scalable, leading to an increase in the number of corresponding studies (Fig. 1). 9-14 To pave the way for future progress, we review here the existing results, as well as the challenges that need to be addressed.
We consider systematic, non-hypothesis-driven functional genomics studies of OA, focusing on knee and hip OA as the two major joints. Many studies of OA have been carried out in animal models (e.g., mice, reviewed by Fang and Beier, 15 or zebrafish 16 ). The studies reviewed here were identified from PubMed searches (carried out in February 2015) for hip or knee OA, together with the terms "gene expression array" or "RNA sequencing" or "methylation" or "proteomics," and focus exclusively on human data. We only consider studies with at least five OA and five control samples (the latter can be unaffected tissue from individuals with OA) (Supplementary Tables S1-3). Given epidemiological and genetic differences between affected joint sites, we first review knee OA and hip OA studies separately, and then discuss cross-joint studies. Notably, studies use different methodological approaches and different thresholds for declaring novel findings (e.g., different significance thresholds, multiple-testing corrections, and minimum fold-changes between healthy and disease tissue), making direct comparison of results difficult. We therefore discuss general insights gleaned from within and across studies, and conclude with opportunities and challenges for the future.
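To illustrate why differing thresholds make results hard to compare, the following minimal sketch applies two calling rules to the same toy results: a nominal p-value cutoff without a fold-change filter, and a Benjamini-Hochberg adjusted cutoff combined with a minimum fold-change. The gene names, p-values, and fold-changes are invented for illustration only, and the function names `benjamini_hochberg` and `call_hits` are hypothetical; none of this reproduces the pipeline of any study reviewed here.

```python
from typing import Dict, List, Set, Tuple


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_hits(
    results: Dict[str, Tuple[float, float]],
    alpha: float,
    min_abs_log2_fc: float,
    adjust: bool,
) -> Set[str]:
    """Select genes given (p-value, log2 fold-change) pairs and a calling rule."""
    genes = list(results)
    p_values = [results[g][0] for g in genes]
    if adjust:
        p_values = benjamini_hochberg(p_values)
    return {
        g
        for g, p in zip(genes, p_values)
        if p <= alpha and abs(results[g][1]) >= min_abs_log2_fc
    }


# Invented toy data: gene -> (nominal p-value, log2 fold-change).
toy_results = {
    "GENE_A": (0.0004, 2.1),
    "GENE_B": (0.012, 0.4),
    "GENE_C": (0.030, -1.6),
    "GENE_D": (0.045, 1.2),
    "GENE_E": (0.200, 0.9),
    "GENE_F": (0.700, -0.1),
}

lenient = call_hits(toy_results, alpha=0.05, min_abs_log2_fc=0.0, adjust=False)
strict = call_hits(toy_results, alpha=0.05, min_abs_log2_fc=1.0, adjust=True)
print(sorted(lenient))
print(sorted(strict))
```

In this toy case the lenient rule reports four genes and the strict rule only one, even though the underlying data are identical, which is the kind of discrepancy that complicates comparison across studies.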

KNEE OA

Methylation
All genome-wide methylation studies in knee OA have focused on cartilage (Supplementary Table S1) and have used arrays ranging from 27,000 10 to 450,000 probes. 12,13 The gene sets found to be enriched among differentially methylated loci (DMLs) or genes include development and differentiation pathways 12 and genes relating to the inflammatory response. 10 Moreover, two studies observed subclustering of OA samples 10,13 : in one, 10 the between-cluster DMLs were enriched in immune system or extracellular matrix (ECM) genes; and in the other, the between-cluster DMLs were enriched in immune pathways. 13 In a different non-array-based approach, levels of a particular methylation mark (5-Hydroxymethylcytosine) were found to be increased in OA chondrocytes. 14

Gene Transcription
Numerous studies have used microarrays to investigate gene expression changes in knee OA. The majority considered cartilage 17-20 (Supplementary Table S1) and two studies each highlighted ECM-related genes 19,20 and angiogenesis. 17,19 A further study identified two subclusters of samples 10 pointing to immune response pathways. Compared to methylation-based clustering in an independent set of patients, the overlap and direction of change of cluster-distinguishing genes was only moderate. Two further studies 21,22 specifically considered the expression of long non-coding RNAs (lncRNAs) in cartilage (Supplementary Table S1).
Three microarray studies considered synovial tissue, subchondral bone, or peripheral blood (Supplementary Table S1). Gene sets associated with OA based on the synovial membrane include inflammation, cartilage metabolism, WNT signalling, and angiogenesis. 23 By contrast, from a study of subchondral bone, the enriched gene sets included lipid metabolism, mineral metabolism, connective tissue disorders, cellular growth and proliferation, and connective tissue development and function. 24 Finally, in an analysis of blood leukocytes, OA samples were subdivided into two clusters based on cytokine expression. 25 The "increased inflammation" cluster, characterized by elevated expression of IL-1β, TNF-α, IL-6, IL-8, and COX-2, was associated with both increased pain and increased risk of progression.

Protein Expression
Thus far, studies have mostly used gels to identify protein spots that differ between OA and control samples, or between intact and damaged OA samples. The number of differentially expressed proteins has been consistently low (Supplementary Table S1). The gene sets implicated from three cartilage-based studies show little overlap: cellular metabolism, structure, or protein targeting gene sets in one study, 26 oxidative stress response, metabolic, and apoptotic pathways in a second study, 27 inflammatory response and response to wounding in the third study. 28 Two further studies were based on synovial fluid. One of these found an association with the acute-phase response signalling pathway, the complement pathway, and the coagulation pathway. 29 The second study did not find differences between what they classified as "early" and "late" OA samples, but found two different subgroups of OA samples based on protein expression profiles. 30

Common Pathways
Despite the differences in the tissues, molecular phenotypes, and analytical methods across studies, some biological gene sets or pathways have been implicated multiple times. Gene sets involved in immune response were implicated across all molecular phenotypes and in multiple tissues (cartilage and synovium). 10,13,23,28,29 Several studies identified genes related to the extracellular matrix 19,20 or angiogenesis. 17,19 Moreover, several studies have found that knee OA samples form two different clusters 10,13,30 ; two of these studies related differences between the clusters to the immune response. 10,13

HIP OA

Methylation
Three studies have investigated methylation patterns in hip OA to date; two considered cartilage (Supplementary Table S2) and found that DMLs included 24 genes previously associated with OA, 11 and that they were enriched in genes annotated to the extracellular matrix, collagen, angiogenesis, and the TGF-β pathway. 13 Hip OA samples were also found to form two clusters, with genes in the between-cluster DMLs enriched among inflammation and immunity pathways. 13 A study of trabecular bone 31 identified enrichment of genes associated with bone phenotypes in the GWAS catalogue, 32 genes annotated to the skeleton, glycoprotein, neuronal differentiation, adherence, homeobox, and cell proliferation pathways.

Gene Transcription
Each of the three gene expression studies of hip OA has focused on a different joint tissue (Supplementary Table S2). In a study of cartilage, 33 the enriched pathways included stress response, cell death, regulation of development, skeletal development, WNT/β-catenin signalling, HIF1α signalling, and p38 MAPK signalling. A study of trabecular bone 34 suggested a role for genes involved in angiogenesis, as well as osteoblast, osteocyte, and osteoclast function. Some evidence was also found for association of the WNT pathway and TGF-β/BMP signalling. Finally, a study of bone marrow mesenchymal stem cells 35 found differential expression of genes annotated to signal transduction, cell development and differentiation, the WNT/catenin pathway, and collagen.

Protein Expression
Protein expression studies in hip OA have so far used gels followed up by mass spectrometry (MS), as described for knee OA above. A study of adult mesenchymal stem cells 36 identified a small number of differentially expressed proteins, which included metabolic enzymes and proteins involved in the cytoskeleton or motility.

Common Pathways
The common themes emerging in functional genomics studies of hip OA have primarily included gene sets relevant to collagens, 13

OVERLAP AND DIFFERENCES BETWEEN JOINTS
While most studies of knee or hip OA have focused exclusively on one joint, a few recent studies have considered the commonalities and differences between knee and hip OA (Supplementary Table S3).

Methylation
One study of cartilage looked for regions differentially methylated in a comparison between hip OA and controls, as well as in comparison between knee OA and the same control samples, 13 and found that shared DMLs were enriched in genes involved in immune response, but constituted a limited proportion of DMLs individually found in hip and knee OA. In a comparison between hip and knee samples, DMLs were enriched in genes involved in development. In a different study, a clustering analysis of intact and damaged cartilage from knee and hip OA samples showed that knee and hip samples clustered separately. 9 Here, the regions with differential methylation between joints were also enriched in genes involved in development, especially in homeodomain genes.

Gene Transcription
One study compared the differentially expressed genes identified in a cartilage case/control analysis for hip OA 33 to the genes from an analogous study of knee OA. 19 There was moderate overlap between the differentially expressed genes. However, 34 canonical pathways were significantly enriched in both studies, suggesting that different genes but similar pathways might play a role across joints in OA. In a different case/control study in blood, ascertained on the basis of OA in multiple joints (knee, hip, hand, or spine) or multiple joint sites in the hand, 37 the differentially expressed genes were enriched among biological processes relating to protein transport and localisation, macromolecular complex subunit organisation, cell cycle, RNA processing, and apoptosis.
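Whether an overlap between two lists of differentially expressed genes is larger than expected by chance is commonly assessed with a hypergeometric test. The sketch below shows one way to compute such an upper-tail probability; it is an illustration under stated assumptions, not the method used in the studies cited above. The function name `overlap_p_value` is hypothetical, and the numbers in the usage example (20,000 tested genes, lists of 500 and 400 genes, 40 shared) are invented.

```python
from math import comb


def overlap_p_value(
    universe_size: int, set_a_size: int, set_b_size: int, overlap: int
) -> float:
    """Upper-tail hypergeometric probability of observing at least `overlap`
    shared genes when set B is drawn at random from the gene universe."""
    max_overlap = min(set_a_size, set_b_size)
    total = comb(universe_size, set_b_size)
    tail = sum(
        comb(set_a_size, k) * comb(universe_size - set_a_size, set_b_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Invented example: 20,000 genes tested in both studies, 500 differentially
# expressed in one joint, 400 in the other, 40 shared between the two lists.
expected_by_chance = 500 * 400 / 20000  # mean of the hypergeometric distribution
p = overlap_p_value(universe_size=20000, set_a_size=500, set_b_size=400, overlap=40)
print(f"expected overlap by chance: {expected_by_chance:.0f}, p-value: {p:.3g}")
```

The same calculation applies to gene set enrichment, with one list replaced by the genes annotated to a pathway. The choice of gene universe matters: it should contain only genes that could have been detected in both studies, otherwise the test is anti-conservative.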

CONCLUSIONS
The last few years have witnessed a flurry of studies focussing on the functional genomics of OA. While there is limited concordance of study results even within the same joint, some gene sets have been implicated across multiple studies, notably those related to immune response, the extracellular matrix, angiogenesis, and WNT signalling. These associations may reflect causal processes as well as consequences of OA, especially when relatively intact tissue from individuals with OA serves as the control. Across the hip and knee OA joints, both similarities and differences have been observed, with higher overlap at the level of biological pathways than that of individual genes. Notably, the etiology of OA may be heterogeneous even within the same joint. This is supported by the clustering of OA samples into two distinct groups as observed in several studies. Further research will be required to explore this hypothesis.
Figure 2. (a) At a sample size of 10 cases and 10 controls, an estimated less than 10% of the "true" differentially expressed genes are significant at 5% FDR. Sample sizes of about 300 cases and 300 controls are needed to detect over 90% of differentially expressed genes in OA (expected discovery rate, EDR). Higher sample sizes also increase true positive (TP) and true negative (TN) rates. The results are estimates obtained from PowerAtlas by extrapolation from the analysis of cartilage from 12 individuals, with intact and degraded cartilage from each individual, and significance defined at a p-value threshold of 0.001, which corresponds to 5% false-discovery rate in the analysis of the given samples. (b) Defining significance at a nominal p-value of p < 0.05 leads to lower rates of true positive (TP) results. Estimates obtained as in (a).
Biological changes may be mediated by epidemiological OA risk factors, such as body mass index (BMI), joint morphology, and pain perception that impact the symptomatic aspects of OA. Detailed phenotypes at large scale will be required to address their interplay with molecular phenotypes. In addition, the etiology of OA may differ across ethnicities, as suggested by the different prevalence of lateral compared to medial OA between Chinese and European-descent individuals. 38 Consequently, studies of different ethnicities have the potential to provide insights into different genetic and environmental risk factors and their effect on molecular phenotypes.
Functional genomics studies carried out to date in OA have been small or modest in scale, and although necessary to demonstrate experimental feasibility and provide first insights into disease biology, they are restricted in terms of power. Using data from RNA sequencing in intact and affected cartilage from individuals with knee OA (manuscript in review), we estimate that the genes identified as differentially expressed at 5% false-discovery rate from 10 to 12 individuals represent less than 10% of the actually changed genes (Fig. 2). Using the same p-value threshold and extrapolation from the pilot data, over 300 cases and 300 controls would be needed to identify at least 90% of the "true" differentially expressed genes (Fig. 2). Notably, none of the studies conducted to date approach these numbers. While these estimates are not exact and may depend on joint and cell type, they indicate the required order of magnitude. Importantly, replication is the gold standard for declaring robust findings, and to date, few of the functional genomics studies in OA have explicitly included a replication component. Further development of analytical methodology and standardized approaches to declaring significance in the presence of multiple-omics testing will additionally help utilize existing and future samples to their full potential.
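The dependence of power on sample size can be made concrete with a simple textbook calculation. The sketch below uses a normal approximation for an unpaired two-group comparison of a single gene; it is not the PowerAtlas extrapolation behind Fig. 2 and does not model the paired design of the pilot data. The function name `power_two_sample` is hypothetical, and the standardized effect size of 0.5 is an arbitrary assumption chosen only for illustration.

```python
from statistics import NormalDist


def power_two_sample(n_per_group: int, effect_size: float, p_threshold: float) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the difference in group means divided by the common
    standard deviation (an assumed value, not estimated from OA data).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - p_threshold / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size * (n_per_group / 2) ** 0.5
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Assumed effect size of 0.5 standard deviations, p-value threshold of 0.001.
for n in (10, 50, 100, 300):
    power = power_two_sample(n, effect_size=0.5, p_threshold=0.001)
    print(f"n = {n:3d} per group: approximate power = {power:.3f}")
```

Under these assumptions, power is very low at 10 samples per group and approaches 1 at 300 per group. In real data, effect sizes vary from gene to gene, so the expected discovery rate is an average of such per-gene powers over the distribution of true effects.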
There are additional genomics approaches for which application to OA has been limited to date, including metabolomics (e.g., Zhang et al. 39 ) and studies of noncoding RNA (e.g., microRNAs 40 ).
Genomics approaches can also help to clarify the mechanisms through which genetic risk variants exert their effect. For example, some OA risk variants were recently found to be correlated with DNA methylation levels of nearby probes. 41 However, the mechanism of action is unknown for most variants. Some could regulate genes located far away; such long-range interactions could be identified through chromosome conformation capture. 42 For example, a range of genetic and genomic approaches was recently applied to show that the obesity risk variant located in FTO 43 acts by affecting the regulation of the genes IRX3 and IRX5, which are over 500 kb away. 44 To determine the causal variant among several in linkage disequilibrium (LD) and to verify its action, genome editing techniques such as CRISPR-Cas9 45,46 will be crucial.
The combination of different genomics data will also provide further insights. For example, a recent study has combined methylation and gene expression data from OA patients to identify genes whose expression was correlated with methylation of nearby probes, with differential expression of the genes and differential methylation of the probes in intact versus degraded cartilage. 47 Some of these gene expression or probe methylation values were also correlated with nearby genetic variants. 47 Finally, OA is a disease affecting multiple tissues, and interactions between tissues may be important for disease etiopathogenesis. 48 Studies so far have only considered one tissue and have largely been focused on cartilage (Supplementary Tables S1-3), as it is both central to the disease and consists of only one cell type (chondrocytes). Building on this work, analyses taking into account several relevant tissues will be necessary to elucidate the corresponding biological mechanisms and to identify biomarkers. In particular, cross-tissue analyses may help to resolve the role of inflammation in OA, which is currently still unclear and may depend on the tissue examined. Ultimately, integration across multiple tissues and different omics levels at large scale holds the promise of characterising OA processes in depth, thus leading to the development of new therapeutic interventions.