Integrative analysis of Dupuytren's disease identifies novel risk locus and reveals a shared genetic etiology with BMI

Abstract Dupuytren's disease is a common inherited tissue‐specific fibrotic disorder, characterized by progressive and irreversible fibroblastic proliferation affecting the palmar fascia of the hand. Although genome‐wide association study (GWAS) have identified 24 genomic regions associated with Dupuytrens risk, the biological mechanisms driving signal at these regions remain elusive. We identify potential biological mechanisms for Dupuytren's disease by integrating the most recent, largest GWAS (3,871 cases and 4,686 controls) with eQTLs (47 tissue panels from five consortia, total n = 3,975) to perform a transcriptome‐wide association study. We identify 43 tissue‐specific gene associations with Dupuytren's risk, including one in a novel risk region. We also estimate the genome‐wide genetic correlation between Dupuytren's disease and 45 complex traits and find significant genetic correlations between Dupuytren's disease and body mass index (BMI), type II diabetes, triglycerides, and high‐density lipoprotein (HDL), suggesting a shared genetic etiology between these traits. We further examine local genetic correlation to identify 8 and 3 novel regions significantly correlated with BMI and HDL respectively. Our results are consistent with previous epidemiological findings showing that lower BMI increases risk for Dupuytren's disease. These 12 novel risk regions provide new insight into the biological mechanisms of Dupuytren's disease and serve as a starting point for functional validation.


| INTRODUCTION
) is a common and disabling connective tissue disorder affecting 5-25% of individuals of European ancestry, characterized by progressive and irreversible fibroblastic proliferation affecting the palmar fascia of the hand (Dupuytren, 1834;Gudmundsson, Arngrıḿsson, Sigfússon, Björnsson, & Jónsson, 2000). DD initially manifests as nodules in the palm of the hand, resulting in contraction and ultimately flexion contractures of the digits in a proportion of individuals affected with DD. Recent twin studies estimate the heritability (i.e., the proportion of phenotypic variation explained by genetics) of DD to be~80% (Larsen et al., 2015). The largest previous genome-wide association study (GWAS) of DD in individuals of European ancestry identified 26 genome-wide significant single-nucleotide polymorphism (SNP) associations in 24 independent risk regions (Ng et al., 2017), and estimated the proportion of phenotypic variance attributable to additive effects of common variants (i.e., SNP-heritability) to be 0.53 (Ng et al., 2017;Yang, Lee, Goddard, & Visscher, 2011). The vast majority (23 of 24) of DD associations lie in noncoding genomic regions with only one located in an intron (Ng et al., 2017), thus the biological implications of these associations are not immediately clear. Investigations into the mechanisms behind the strongest GWAS association, rs16879765 (P =7.2×10 GWAS −41 ), located in the intron of EPDR1, revealed an effect on expression and protein secretion of the nearby gene SFRP4 (Ng et al., 2017) but implicated EPDR1 functionally (Staats, Wu, Gan, O'Gorman, & Ophoff, 2016). Overall, the regulatory mechanisms driving signal at the GWAS associations on DD remains unknown.
In this study, we aimed to explore genetic mechanisms at known risk regions for DD, identify complex traits with possible shared genetic etiologies, and find novel risk regions for DD. Recently, transcriptome-wide association studies (TWAS; Gamazon et al., 2015;Gusev et al., 2016) have emerged as a way to identify associations between gene expression and a trait. We performed a multi-tissue TWAS by combining a recent DD GWAS (Ng et al., 2017) with expression quantitative trait loci (eQTL; Fromer et al., 2016;GTEx Consortium, 2013;Laakso et al., 2017;Nuotio et al., 2014;Raitakari et al., 2008;Wright et al., 2014), integrating gene expression from five consortia in 43 unique tissues, to test for association between expression and DD in 15,198 genes. We identified 43 associations between tissue-specific gene expression and DD, including one novel risk region on chromosome 17. Next, we aimed to understand the genetic relationship between DD and 45 other complex phenotypes by identifying traits that have genetic correlation (i.e., the similarity in genetic effects across two traits) providing etiological insights and plausible causal relationships to investigate Johnson, Shi, Pasaniuc, & Sankararaman, 2018;Pickrell et al., 2016;Shi, Mancuso, Spendlove, & Pasaniuc, 2017). We performed genetic correlation analyses through cross-trait linkage disequilibrium (LD) score regression (LDSC; B. , and find that body mass index (BMI), type II diabetes (T2D), triglycerides (TG), and high-density lipoprotein (HDL) levels are significantly genetically correlated with DD. Additionally, we sought to further refine and understand these relationships with DD and BMI, T2D, TG, and HDL by exploring local regions with enrichment of genetic correlation using ρ-HESS , and found 8 risk regions significantly correlated with BMI and three risk regions significantly correlated with HDL. Finally, we aimed to identify a tissue or cell type to prioritize when studying DD.

| DD GWAS summary statistics
Results from a GWAS of DD in UK Europeans (3,871 cases and 4,686 controls) were previously reported (Ng et al., 2017). This GWAS summary data contained association statistics for 7,218,238 SNPs, with 6,991,033 SNPs that were imputed from individuals of European ancestry in the Haplotype Reference Consortium (McCarthy et al., 2016). We excluded multi-allelic SNPs, SNPs with ambiguous alleles (e.g., A to T or C to G), and SNPs without an rsID defined by dbSNP144, resulting in 6,126,071 SNPs for downstream analyses.
We used PLINK (Purcell et al., 2007) to compute independent risk regions (at least one SNP with ≤ P 5 × 10 GWAS −8 ) in the DD GWAS data by clumping SNPs into regions based on LD and distance, using R 2 thresholds of 0.3 and 0.25 for between-block LD and within-block LD, respectively. This resulted in 24 independent risk regions.

| TWAS reference panels and details
To find novel risk genes and biologically meaningful associations, we performed a TWAS to test genes expression levels for association with DD. We used FUSION (Gusev et al., 2016) software (see Web Resources) along with prepackaged gene expression weights. Briefly, TWAS identifies candidate risk genes for DD by integrating results from GWAS and reference panels of gene expression measurements from eQTL studies to associate cis-regulated expression with DD, while accounting for LD. Weights for gene expression were from the Genotype-Tissue Expression Project v6 (GTEx; 43 tissues, n = 449), the Metabolic Syndrome in Men study (METSIM; adipose, n = 563), the Young Finns Study (YFS; blood, n = 1,264), the CommonMind Consortium (CMC; dorsolateral prefrontal cortex, n = 452), and the Netherlands Twin Registry (NTR; blood, n = 1,247) reference panels (Fromer et al., 2016;GTEx Consortium, 2013;Laakso et al., 2017;Nuotio et al., 2014;Raitakari et al., 2008;Wright et al., 2014). This totaled to 47 different reference tissue panels that represent 43 unique tissues (see Table S1). Description of quality control procedures for these expression data has been previously described (Gusev et al., 2016;Mancuso et al., 2017). FUSION estimates the strength of association between predicted expression of a gene and DD (Z TWAS ) as a function of the vector of GWAS summary Z-scores at a given cis-region (Z GWAS ) and the weights vector (w GE ) learned from one of the 47 gene expression panels aforementioned. Specifically, the strength of association between predicted expression of a gene in tissue and DD is defined as where w′ GE is the transpose weights vector and V is the reference panel LD (European ancestry from the 1000 Genomes [1000G] Project (Genomes Project Consortium et al., 1000). A p-value (P TWAS ) is obtained using a two-tailed test under N (0,1). This process was repeated for each reference tissue panel and gene, resulting in 98,147 tissue-specific gene models involving 15,189 genes (Table S1). We assessed significance with the Bonferronicorrected threshold at ≤ P TWAS 0.05 98,147 .

| Genome-wide genetic correlation with cross-trait LDSC
We estimate the genome-wide genetic correlation between DD and 45 complex traits to identify shared genetic risk for DD with other complex traits. To this end, we used cross-trait LDSC , a method for estimating genome-wide genetic correlation between two traits that requires only GWAS summary statistics and reference panel LD (European ancestry from the 1000G Project [Genomes Project Consortium et al., 1000). We defined the genetic correlation (r g ) between DD and another trait as significant if the associated p-value from cross-trait LDSC passed the Bonferroni-corrected threshold of ≤ P T T 1, 2 0.05 45 .
2.4 | Local genetic correlation and putative causality using ρ-HESS To understand the genome-wide correlation from crosstrait LDSC at a local level we use ρ-HESS , a method to identify genomic regions that have significant enrichment of genetic correlation between two traits. Finding regions with enriched geneticcorrelation can lead to finding more possible risk regions for one trait by leveraging power from the other trait. Exploration of such regions may also lead to more insights into biological mechanisms that are affecting both traits and their shared etiology. For each of the traits with significant genome-wide genetic correlation with DD, we run ρ-HESS to estimate the local genetic correlation between each trait and DD within 1702 approximately independent regions genome-wide. We restricted our analyses to 1702 approximatelyindependent LD regions in Europeans . We filtered out one region because there were no SNPs genotyped in the DD GWAS within that region. For reference LD, we used European ancestry from the 1000G Project (Genomes Project Consortium et al., 1000. The local genetic correlation (r g local , ) between two traits at a given region was defined as significant if the associated p-value from ρ-HESS passed the Bonferroni-corrected threshold of ≤ P region 0.05 1,702 . We also aimed to find evidence for putative causal relationships between DD and other genetically correlated traits. We used the implementation in ρ-HESS based on a previously described method  to prioritize putative causal models between pairs of complex traits. Essentially, for two complex traits, the local genetic correlation is evaluated at regions harboring genome-wide significant GWAS SNPs from either trait, rather than across 1702 independent regions. Trait 1 specific regions are regions harboring significant GWAS SNPs for trait 1 but not trait 2; trait 2 specific regions are regions harboring significant GWAS SNPs for trait 2 but not trait 1. The local genetic correlation for all trait 1 specific regions are summed (r T1,regions ) and the local genetic correlation for all trait 2 specific regions are summed (r T2,regions ); these summed values are a good representation of true genetic correlation at trait 1 specific regions or trait 2 specific regions. Confidence intervals are determined by 1.96 times jackknife standard error on each side; significance is determined if the confidence intervals do not overlap. If trait 1 correlation is significantly nonzero while trait 2 correlation is near zero, then this is consistent with a model that trait 1 causally influences trait 2. The intuition behind this test is that if trait 1 causally influences trait 2 then trait 1 specific regions would have a strong genetic correlation MAJOR ET AL.

| 631
with trait 2 but trait 2 specific regions would not have a strong genetic correlation with trait 1. To avoid spurious claims, we only do this test if there are more than 10 regions harboring GWAS significant SNPs for each trait. Thus, we can leverage the difference in correlations for a trait-specific signal at these regions to see if the correlations are consistent with a suggestive causal model Shi et al., 2017).

| Tissue and cell type prioritization
To identify tissues and/or cell types that are biologically relevant to DD, we used stratified LD score regression to estimate the enrichment of DD SNP-heritability in 205 publicly available specifically expressed gene (SEG) annotations, each of which represents a set of genes that are specifically expressed in a single tissue or cell type (LDSC-SEG; Finucane et al., 2018). Briefly, the 205 annotations were originally created from two datasets: RNA-seq gene expression measurements in 53 human tissues from GTEx v6p (GTEx Consortium, 2013; average of 161 samples per tissue), and a microarray gene expression data set comprised of 152 tissues and cell types from either human, mouse, or rat (the "Franke Lab" data set; Fehrmann et al., 2015;Pers et al., 2015). For each set of specifically expressed genes, an annotation was created by adding 100-kb windows upstream and downstream from the transcribed region of each gene. In addition, we tested for enrichment of DD SNPheritability in a set of 489 publicly available tissue-or cell type-specific chromatin annotations (Finucane et al., 2018). 396 of these annotations were originally created from five activating histone marks (H3K27ac, H3K4me3, H3K4me1, H3K9ac, and H3K36me3) and DNase I hypersensitivity (DHS) regions that were present in a subset of 88 tissues and cell types in the Roadmap Epigenomics Consortium (Roadmap Epigenomics . An additional 93 annotations were created from a set of four activating histone marks (H3K27ac, H3K4me3, H3K4me1, and H3K36me3) in 27 tissues from EN-TEx (ENCODE Project Consortium, 2012) that were also present in GTEx. Details on the construction of both the SEG annotations and chromatinbased annotations can be found in the original study (Finucane et al., 2018). Each annotation was tested individually for the enrichment of DD SNP-heritability on top of the baseline-LD model (Gazal et al., 2017) by assessing whether the expected additional per-SNP heritability contribution due to the annotation is significantly nonzero (FDR < 0.1).
We also used the web application FUMA (Watanabe, Taskesen, van Bochoven, & Posthuma, 2017) in the aim of finding tissues or cell types with differentially expressed genes relevant to DD. FUMA maps GWAS results to create a gene set in three ways: (a) physical proximity on the genome, (b) eQTL associations, and (c) chromatin interaction. We used the gene property analyses (implemented from MAGMA (de Leeuw, Mooij, Heskes, & Posthuma, 2015)) and differentially expressed gene (DEG) analysis to prioritize different tissues or cell types. For the gene property analysis, FUMA tests if the expression of the GWAS gene set in a single tissue or cell type is statistically different than the average expression of the GWAS gene set across all tissues or cell types. We perform this gene property analysis in 53 GTEx (GTEx Consortium, 2013) 0.05 53 ) as well as in 5115 study-defined cell types ( ≤ P GP CT , 0.05 5,115 ) using single cell RNA-seq data from 28 studies (Alles et al., 2017;Breton et al., 2016;Campbell et al., 2017;Chen, Wu, Jiang, & Zhang, 2017;Darmanis et al., 2015;Enge et al., 2017;Furlan et al., 2016;Gokce et al., 2016;Haber et al., 2017;Habib et al., 2017;Han et al., 2018;Häring et al., 2018;Hochgerner et al., 2017;Hu et al., 2017;Joost et al., 2016;La Manno et al., 2016;Mohammed et al., 2017;Romanov et al., 2017;Saunders et al., 2018;Tasic et al., 2016;Usoskin et al., 2015;Vanlandewijck et al., 2018;Zeisel et al., 2018Zeisel et al., , 2015Zhong et al., 2018;Zhou et al., 2017) as described on the FUMA website (see Web Resources). For the DEG analysis, FUMA defines differentially expressed genes in each tissue by performing a two-sided t test for that one tissue against all other tissues. Each of the 53 GTEx (GTEx Consortium, 2013) tissues are tested for upregulation, downregulation, and both-sided DEG sets. We removed tissues where DEG sets had less than 30 genes to avoid underpowered correlations; significance was defined by Bonferroni correction for the number of tests ( ≤ P DEG 0.05 153 ).

| TWAS identifies 18 risk genes for DD
To explore putative biological mechanisms at known DD risk regions, we performed a multi-tissue TWAS to identify genes (specifically, cis-regulated gene expression), associated with DD (see section 2). Briefly, TWAS identifies candidate risk genes for DD by integrating results from GWAS and reference panels of gene expression measurements from eQTL studies to associate cis-regulated expression with DD, while accounting for LD. We used tissue reference panels from GTEx (GTEx Consortium, 2013), METSIM (Laakso et al., 2017), YFS (Nuotio et al., 2014;Raitakari et al., 2008), CMC (Fromer et al., 2016), and NTR (Wright et al., 2014) resulting in 47 different reference tissue panels with a combined sample size of 3,975 (see section 2; Table S1). Using these reference panels, we tested 98,147 tissuespecific gene models and found 43 significant tissue-specific gene-trait associations at a Bonferroni-corrected threshold of ≤ P TWAS 0.05 98,147 (Table 1; Table S2). GWAS SNP association strength and TWAS tissue-specific gene model association strength can be seen in Figure 1. These 43 significant models were composed of 18 genes among 23 tissue panels-7 genes were significant in multiple tissues (Table 1). A total of 36 of the 43 significant tissue-specific gene models were within 0.5 Mb of any of the previously identified 24 risk regions.
One region of interest is on chromosome 7, where EPDR1 was found to be significant in 10 different tissue panels (most significant in lung tissue, P = 6.4 × 10 TWAS −31 ). This region has been previously investigated because of its strong association signal (Odds Ratio 1.93 and P = 7.2 × 10 GWAS −41 ) with DD (Ng et al., 2017). The variant with the strongest association in this region, rs16879765, is in an intron of EPDR1. Although the decreased secretion of the nearby WNT-agonist SFRP4 was correlated with the high-risk genotype (Ng et al., 2017), genetic and functional evidence point toward EPDR1 being the disease-relevant gene for this region, which has been functionally validated as contributing to myofibroblast contractility (Staats et al., 2016). All three transcripts of EPDR1 are found in affected DD tissue and knockdown of EPDR1 attenuates contractility in fibroblast-populated collagen lattice assays (Staats et al., 2016).

| TWAS identifies novel risk region on chromosome 17
To identify possible novel risk regions from TWAS associations, we aimed to see if any tissue-specific gene models were independent of established GWAS associations. After grouping association signal into 1 Mb regions, we found 13 regions with only significant GWAS SNP(s), 1 region with only significant TWAS model(s), and 11 regions with both significant GWAS SNP(s) and TWAS model(s). Here we define a region identified through TWAS to be novel if (1) the strongest DD associated SNP in the gene's region is not genome-wide significant (i.e., ≥ P 5 × 10 GWAS −8 ) and (2) that the TSS of the TWAS-gene is not within 0.5 Mb of the previously known 24 risk regions. With these constraints, we identified one novel risk region for DD (Figure 2). To ensure our result was robust to long-range LD, we expanded our window criteria to include 1 Mb and 2 Mb and found no change. We found a single tissue-specific gene model, TMEM106A (P = 1.2 × 10 TWAS −7 ; GTEx breast mammary tissue), was significantly associated with DD risk at this region ( Figure 2). To determine that the TMEM106A association was robust to possible LD confounding, we performed a permutation test using GWAS summary statistics and found similar results (P = 8.91 × 10 perm −3 ). There were 12 tissue panels that expression for TME-M106A was modeled from (Table S3).
3.3 | Estimates of SNP-heritability in DD are higher than previously proposed We obtained an SNP-heritability estimate of 0.67 (SE = 0.08), using LDSC (B. K. . We also used Heritability Estimator from Summary Statistics (HESS), a previously described method using similar framework as ρ-HESS that estimates local SNP-heritability, and found the total SNP-heritability to be 0.532 (SE = 0.282), similar to the previous estimate of 0.533 using GCTA (Ng et al., 2017;Yang et al., 2011). Because HESS is optimized for GWAS with sample sizes greater than 50,000 (contributing the large standard error), we included the LDSC regression estimate of SNP-heritability when running HESS to obtain more stable estimates of local SNP-heritability (B. K. Shi, Kichaev, & Pasaniuc, 2016).

| Genetic correlation suggests shared genetic etiology with DD
To identify traits that have a shared genetic etiology with DD, we used cross-trait LDSC (B. Bulik-Sullivan et al., 2015) which estimates the genetic correlation between two traits using GWAS summary statistics (see section 2). Results for the genetic correlation test between DD and 45 other traits (average sample size 132,115) can be found in Table 2. These 45 traits include a variety of anthropometric, immune, hematological, neurological, and cardiovascular-related traits and disorders. Four traits were found to have a significant genetic correlation with DD ( ≤ P T T  Table 2). These results are compatible with previous observational studies (Chammas et al., 1995;Gudmundsson et al., 2000;Sanderson, Morris, Stanley, & Fahmy, 1992). Notably, the negative genetic correlation between BMI and DD is consistent with a previous epidemiological investigation showing that the risk of DD was inversely proportional to BMI, after correcting for age, race, and sex in 14,844 patients diagnosed with DD (Hacquebord, Chiu, & Harness, 2017).  Abbreviations: GWAS: genome-wide association study; SNP: single nucleotide polymorphism; TWAS: transcriptome-wide association studies.

Note.
These are the 43 significant ( Having found evidence for BMI, HDL, TG, and T2D sharing genetic factors with DD at a genome-wide scale, we next aimed to locate possible shared genomic regions. We did this by running ρ-HESS  to estimate the local genetic correlation between DD and each of the four traits in 1702 approximately independent LD blocks ) (see Materials and Methods). The genome-wide genetic correlation results from cross-trait LDSC and ρ-HESS are fairly consistent (Pearson's r = 0.94; Table S4); differences may have resulted from using metabochip array (Voight et al., 2012) GWAS statistics with LDSC (HDL and TG), which is discouraged for cross-trait LDSC (B. , as well as smaller sample size in the DD GWAS adding noise to estimates from ρ-HESS . We found eight regions significantly genetically correlated (P T1,T2 ≤ 0.05/45) between DD and BMI, three regions between DD and HDL, and no regions between DD and TG or between DD and T2D (Table 3). Of these 11 regions, three contained a genome-wide significant association in the DD GWAS (Ng et al., 2017). Only one of the 11 regions contained significant tissue-specific gene models from TWAS; the 10 models for EPDR1 were within the DD and HDL genetically correlated 7:37555184-38966703 region (Table S2, Table 3).

| Genetic correlation patterns of BMI/TG and DD consistent with putative causality
To further elucidate the relationships of these traits with DD, we used ρ-HESS  to test for evidence GWAS: genome-wide association study; SNP: single nucleotide polymorphism; TWAS: transcriptome-wide association studies of putative causality through GWAS estimated genetic effects for BMI, HDL, TG, and T2D acting on DD or vice versa (see Materials and Methods). Both BMI and TG showed suggestive patterns that would be consistent with a putative causal relationship with DD ( Figure 3). HDL and T2D did not show suggestive patterns with DD since there is no clear direction of the trait-specific genetic correlation that would be consistent with a putative causal relationship ( Figure S1). We would expect that trait 1 specific genetic correlation would be much more extreme than trait 2 specific genetic correlation and trait 2 specific genetic correlation close to zero, or vice versa if there were a causal relationship between trait 1 and trait 2 . For example, when considering BMI and DD, the correlation at 399 BMI-specific regions (−0.27; SE = 0.044) is seemingly stronger than the correlation at 19 DD-specific regions (− 0.03; SE = 0.15), indicating that regions that increase BMI tend to decrease the risk of DD; this is consistent with a model where BMI genetic effects decrease the risk of DD (Figure 3). The same is true for TG and DD; the correlation at 65 TGspecific regions (−0.3; SE = 0.08) is seemingly stronger than the correlation at 22 DD-specific regions (0.018; SE = 0.2; Figure 3). Both of these results are not significant (assessed by the overlap of confidence intervals, r ± 1.96 × SE TX regions , SE); this is most likely because of the relatively reduced sample size in the DD GWAS study. Nonetheless, there is evidence of a putative causal relationship with BMI affecting TG Shi et al., 2017) and results from TG and DD may be from a mediated causal relationship of BMI affecting TG, which in turn affects DD (Figure 3).

| DD most relevant tissue or cell type unidentifiable with current data
Finally, we aimed to identify relevant tissues or cell types for DD. First, we used S-LDSC  to estimate the enrichment of SNP-heritability of DD (controlling for the baseline-LD model [Gazal et al., 2017]) in two sets of publicly available annotations (Finucane et al., 2018): annotations representing specifically expressed genes (SEG) in 205 tissues or cell types (Fehrmann et al., 2015;GTEx Consortium, 2013;Pers et al., 2015) and 489 annotations representing 6 chromatin features (DHS and five histone marks) in 91 tissues or cell types (ENCODE Project Consortium, 2012;Roadmap Epigenomics Consortium et al., 2015). Among the 205 SEG annotations, synovial membrane tissue was the most enriched for DD SNP-heritability on top of the baseline-LD model, but none of the 205 annotations were statistically significant (FDR < 0.1; Table S5). Among the 489 chromatin annotations, we found that esophagealmucosa tissue was the most enriched for DD SNPheritability, however, none of 489 annotations were statistically significant (FDR < 0.1; Table S6). Next, we prioritized tissues and cell types using FUMA (Watanabe et al., 2017), a platform to visualize and interpret GWAS summary statistics (see Materials and Methods). After using FUMA to create a gene set from the GWAS statistics, we first performed a gene property analysis (which tests if gene expression in a single tissue or cell type is statistically different than the average gene expression across all tissues or cell types) in 53 tissue types (GTEx Consortium, 2013). Although none of the 53 tissues showed a significant effect (P GP,T ≤ 0.05/53), the effect was strongest in cell transformed fibroblasts (Table  S7). We then assessed whether the GWAS gene set was enriched in any of the differentially expressed gene (DEG) sets for tissues. The upregulated DEG sets for tibial artery and aorta tissues both demonstrated significant (P DEG = 5.5 x 10 -5 and 7.8 x 10 -5 , respectively) overlap with the GWAS gene set (Table S8). We also performed a gene property analysis using cell type specific expression data for 5115 study-defined cell types from 28 scRNA-seq studies (Alles et al., 2017;Breton  Note. This table lists the eight regions demonstrating a significant genetic correlation between DD and BMI, and the three regions demonstrating a significant correlation between DD and HDL; significance was assessed at a Bonferroni-corrected threshold of ≤ P 0.05/1, 702 region for each trait. Also included is the number of SNPs within each region ("# SNPs") as well as the minimum GWAS association p-value for either BMI or HDL ("Min. Trait P GWAS ") and DD ("Min. DD P GWAS "). All other regions demonstrated no significant genetic correlation between DD and any trait tested.
F I G U R E 3 Tentative evidence for putative causality with DD. Here we show the genetic correlation for three different pairs of traits (DD/BMI, left; DD/TG, center; and BMI/TG, right) between four groupings of SNPs: (1) GWAS-significant SNPs specific to trait 1, (2) GWAS-significant SNPs specific to trait 2, (3) GWAS-significant SNPs for both trait 1 and trait 2, and (4) all nonsignificant SNPs shared between studies. In the left plot, since GWAS-significant SNPs specific to BMI have more enrichment of genetic correlation compared to those specific to DD, we can putatively interpret that BMI SNPs are driving the shared genetic etiology. The same can be said for the middle plot with TG. On the right, for completeness, we show the same correlation for BMI and TG, which was significant. Error bars are defined by the genetic correlation ± 1.96 times the s.e. for each grouping of SNPs. BMI: body mass index; DD: Dupuytren's disease; GWAS: genomewide association study; SNP: single nucleotide polymorphism; TG: triglycerides; TWAS: transcriptome-wide association studies Campbell et al., 2017;Chen et al., 2017;Darmanis et al., 2015;Enge et al., 2017;Furlan et al., 2016;Gokce et al., 2016;Haber et al., 2017;Habib et al., 2017;Han et al., 2018;Häring et al., 2018;Hochgerner et al., 2017Hochgerner et al., , 2018Hu et al., 2017;Joost et al., 2016;La Manno et al., 2016;Mohammed et al., 2017;Romanov et al., 2017;Saunders et al., 2018;Tasic et al., 2016;Usoskin et al., 2015;Vanlandewijck et al., 2018;Zeisel et al., 2018Zeisel et al., , 2015Zhong et al., 2018;Zhou et al., 2017). While none of the single cell types were significant (P GP,CT ≤ 0.05/5115), stromal cells and muscle cells were among the top five results (Table S9). As a final analysis, we averaged the χ 2 -statistic (Z TWAS 2 ) for the 43 significant TWAS models within each tissue to determine which tissue had the most enrichment of TWAS signal. We found adipose subcutaneous tissue was most enriched among the 23 tissues with significant TWAS models ( Figure S2). Because of the lack of consistency between methods and lack of statistical significance in many methods, we are lead to believe that likely the relevant tissue or cell type is not represented in current datasets.

| DISCUSSION
In this work, we aimed to better understand the genetic architecture of DD, find plausible biological mechanisms at known risk regions for DD, understand the relationship between DD and a variety of other traits, and identify possible novel risk regions through local genetic correlation with other traits or geneticmediated gene expression effects. We highlight that the estimated SNP-heritability of DD (0.53-0.67) is relatively close to estimates of heritability from twin studies (0.8). We also note that the strong concentration of DD GWAS signal in a handful of genomic regions is more consistent with an oligogenic architecture than a polygenic one, suggesting that further functional studies could be particularly fruitful as compared to more polygenic traits and diseases. We also identify a negative genetic correlation between DD and BMI, supporting a previous epidemiological study that observationally showed a negative correlation between the traits (Hacquebord et al., 2017); understanding the relationship between DD and BMI as well as that between DD and TG could shed light on shared biologically important pathways. Finally, we identify one novel risk region from TWAS and identify 11 regions with the significant local genetic correlation between DD and BMI or HDL. Overall, our findings highlight the need for more investigation into these regions as a first step.
Additionally, we note a few caveats in our results. First, though the sample size of 8,557 for the DD GWAS is the largest yet, it is possible that additional GWAS regions remain undiscovered due to the limit in power and this also would further reduce power to fully detect associations and relationships with other traits. Second, while the patterns of genetic correlation between BMI and DD as well as TG and DD are somewhat consistent with causal relationships, true causality between these traits cannot be determined without functional experimentation. Third, we emphasize that TWAS may not detect the true mechanism of disease if the gene expression is not mediated through genetics or if disease-relevant tissue is not well-represented in available gene expression reference panels. This may be further illustrated by the fact we were unable to identify a specific tissue or cell type to prioritize for further study in DD. This could also be due to the small sample size of the DD GWAS, the cell-type specificity of enhancer elements, or again the publication bias away from musculoskeletal connective tissues, leading to a gap in the available datasets.
Future work should be taken in multiple directions. First, we provide additional evidence that EPDR1 may contribute to the pathogenesis of DD; further work should be dedicated to functionally validate and understand this gene in connection with DD, as it may represent an attractive therapeutic target. Second, there is strong evidence for a relationship between BMI/TG and DD-elucidating the mechanism may lead to interesting observations with implications for the treatment of both traits. Third, additional GWAS, with larger sample sizes and in additional populations, will uncover more of the contribution of genetic variation to DD. And fourth, given the putative oligogenic architecture of DD, and that our tissue and cell type analyses lacked consistent results, it might also be rewarding to generate more functionalomics data, such as reference gene expression panels or chromatin accessibility data in the palmar fascia tissue. These resources would offer valuable insight into the underlying mechanisms of DD and opportunity to explore therapeutic avenues.