The association of genetic variability in patatin-like phospholipase domain-containing protein 3 (PNPLA3) with histological severity of nonalcoholic fatty liver disease


  • Yaron Rotman,

    1. Liver Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD
    • These authors contributed equally to this work.

  • Christopher Koh,

    1. Liver Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD
    • These authors contributed equally to this work.

  • Joseph M. Zmuda,

    1. Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA
  • David E. Kleiner,

    1. Laboratory of Pathology, National Cancer Institute, National Institutes of Health, Bethesda, MD
  • T. Jake Liang,

    Corresponding author
    1. Liver Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD
    • NIH, NIDDK, Liver Diseases Branch, 10 Center Drive, Building 10, Room 9B16, Bethesda, MD, 20892-1800
    • fax: 301-402-0491

  • the NASH CRN

    1. The Nonalcoholic Steatohepatitis Clinical Research Network, Bethesda, MD

  • Potential conflict of interest: Nothing to report.


Genome-wide association studies identified single-nucleotide polymorphisms (SNPs) that are associated with increased hepatic fat or elevated liver enzymes, presumably reflecting nonalcoholic fatty liver disease (NAFLD). To investigate whether these SNPs are associated with histological severity of NAFLD, 1117 individuals (894 adults and 223 children) enrolled in the Nonalcoholic Steatohepatitis (NASH) Clinical Research Network and National Institutes of Health Clinical Center studies with histologically confirmed NAFLD were genotyped for six SNPs that are associated with hepatic fat or liver enzymes in genome-wide association studies. In adults, three SNPs on chromosome 22 showed associations with histological parameters of NASH. After adjustment for age, sex, diabetes, and alcohol consumption, the minor allele of rs738409 C/G, a nonsynonymous coding SNP in the patatin-like phospholipase domain-containing protein 3 (PNPLA3) (adiponutrin) gene encoding an Ile148Met change, was associated with steatosis (P = 0.03), portal inflammation (P = 2.5 × 10−4), lobular inflammation (P = 0.005), Mallory-Denk bodies (P = 0.015), NAFLD activity score (NAS, P = 0.004), and fibrosis (P = 7.7 × 10−6). Two other SNPs in the same region demonstrated similar associations. Three SNPs on chromosome 10 near the CHUK (conserved helix-loop-helix ubiquitous kinase) gene were independently associated with fibrosis (P = 0.010). In children, no SNP was associated with histological severity. However, the rs738409 G allele was associated with younger age at the time of biopsy in multivariate analysis (P = 0.045). Conclusion: In this large cohort of histologically proven NAFLD, we confirm the association of the rs738409 G allele with steatosis and describe its association with histological severity. In pediatric patients, the high-risk rs738409 G allele is associated with an earlier presentation of disease. We also describe a hitherto unknown association between SNPs at a chromosome 10 locus and the severity of NASH fibrosis. (Hepatology 2010)

Nonalcoholic fatty liver disease (NAFLD) has become the most common cause of chronic liver disease and has an estimated prevalence of 20%-30% in the general population and 67%-75% in the obese population.1, 2 This disease ranges from simple steatosis to nonalcoholic steatohepatitis (NASH) and has been closely associated with obesity, insulin resistance, diabetes, metabolic syndrome, and hyperlipidemia.3, 4 The precise mechanism responsible for the development of the NASH phenotype is yet to be elucidated.5 Metabolic derangements associated with NAFLD such as obesity and diabetes are known to have a strong underlying genetic component. Moreover, familial clustering of steatosis, NASH, and cryptogenic cirrhosis has been reported, suggesting that genetic factors may contribute to the development of NAFLD.6-8 An important initial step in the identification of these genetic factors is through the use of genome-wide association (GWA) studies. These studies evaluate the genotypic-phenotypic association in large population-based cohorts and have identified susceptibility loci in numerous diseases.9

Recently, GWA studies have identified several single-nucleotide polymorphisms (SNPs) that are associated with increased hepatic fat content10 or with elevated liver enzymes.11 Of these, the patatin-like phospholipase domain-containing protein 3 (PNPLA3) locus on chromosome 22, and especially the nonsynonymous coding SNP rs738409 C/G (Ile148Met), emerged as an important genetic variant associated with the presence of NAFLD. These findings were confirmed by candidate-gene studies that assessed the associations with this SNP in several cohorts of different ethnicity.12-15

These previous studies have focused on the genetic variability associated with the presence of NAFLD, mainly diagnosed using noninvasive markers. However, whether these genetic variants also determine disease severity is currently unknown. In this study, we investigated the association of well-replicated SNPs identified from published literature with histological parameters of disease severity in a large cohort of patients with NAFLD.


Abbreviations: ALT, alanine aminotransferase; AST, aspartate aminotransferase; BMI, body mass index; CHUK, conserved helix-loop-helix ubiquitous kinase; CI, confidence interval; CPN1, carboxypeptidase N polypeptide 1; FDR, false discovery rate; GWA, genome-wide association; NAFLD, nonalcoholic fatty liver disease; NAS, NAFLD activity score; NASH, nonalcoholic steatohepatitis; NASH CRN, Nonalcoholic Steatohepatitis Clinical Research Network; NIH, National Institutes of Health; OR, odds ratio; PNPLA3, patatin-like phospholipase domain-containing protein 3; SNP, single-nucleotide polymorphism; WHO, World Health Organization.

Patients and Methods

Study Population.

The patient population consisted of participants in one of three National Institutes of Health (NIH)-sponsored NASH Clinical Research Network (NASH CRN) multicenter studies and a cohort of individuals with NASH at the NIH Clinical Center. The NASH CRN studies include: (1) the NAFLD natural history database; (2) adult individuals enrolled in the Pioglitazone versus Vitamin E versus Placebo for the Treatment of Nondiabetic Patients with Nonalcoholic Steatohepatitis (PIVENS) trial,16 and (3) the Treatment of NAFLD in Children (TONIC).17 The NIH cohort consists of patients enrolled in two pilot treatment studies of NASH: (1) A pilot study of pioglitazone treatment for nonalcoholic steatohepatitis18 and (2) a pilot study of metformin for the treatment of NASH.19 The design and enrollment criteria for these studies have been previously reported.

Individuals included in the analysis had histological evidence of NAFLD or NASH on a liver biopsy obtained prior to enrollment. All liver biopsy specimens from the NASH CRN studies were scored by the pathology committee of the network in a blinded manner; biopsies from the NIH pilot studies were evaluated by a single pathologist (D.E.K.). Biopsies were scored according to the NASH CRN scoring system20, 21 (Supporting Information Table 1). All studies were approved by the institutional review boards and all patients gave written informed consent for participation and for analysis of genetic material.

Population control samples are from a cohort of 336 community-dwelling Caucasian men aged ≥50 years who were enrolled in a study of bone mineral density in Pittsburgh, PA.22 Participants were recruited by population-based mailings to age-eligible men. Samples were selected regardless of the bone mineral density or health status of participants. Written informed consent was obtained from all participants, and the protocol was approved by the University of Pittsburgh Institutional Review Board.

SNP Selection and Genotyping.

Six SNPs that were previously shown to be associated with liver fat (on noninvasive tests) or elevated liver enzymes10, 11 (Table 2) were chosen for genotyping. DNA was obtained from peripheral mononuclear cells or Epstein-Barr virus–immortalized B cell lines.

The SNPs were genotyped using the Sequenom MassARRAY iPLEX Gold platform (Sequenom, Inc., San Diego, CA), with polymerase chain reaction primers purchased from Invitrogen (Carlsbad, CA). Genotyping concordance was 99.9% as determined by inclusion of 301 replicate NAFLD patient samples and 100% using 96 replicate control population samples. The average genotyping call rate across all SNP assays was 97.9% and all call rates were >94%.

Measurements and Data.

Body mass index (BMI) was calculated as weight (kg)/height (m)². In pediatric patients, BMI was adjusted for age according to the World Health Organization (WHO) growth charts for the appropriate age and sex,23 and the Z-score was used for analysis. The frequency of alcoholic beverage consumption was assessed using the AUDIT (alcohol use disorders identification test) questionnaire. Race and ethnicity were self-reported.
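
As a minimal sketch of the BMI calculation described above, the following Python snippet computes BMI from weight and height. The `z_score` helper is a deliberate simplification that assumes a plain reference mean and standard deviation; the WHO reference tables themselves are not reproduced here, and both function names are hypothetical.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI in kg/m^2: weight divided by the square of height."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def z_score(value: float, reference_mean: float, reference_sd: float) -> float:
    """Number of standard deviations between a value and a reference mean.

    Assumption: a simple mean/SD reference is used for illustration only;
    the age- and sex-specific WHO reference values are not included.
    """
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return (value - reference_mean) / reference_sd


# Example with made-up measurements (not study data).
bmi = body_mass_index(100.0, 1.70)
print(round(bmi, 1))
```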

Statistical Analysis.

The univariate association of SNP genotypes with histological ordinal variables was assessed using the Jonckheere-Terpstra test for ordered alternatives. To determine the appropriate genetic model for multivariable analysis, a modification of the method suggested by Minelli et al.24 was used. The outcome was converted to a binary variable, and univariate odds ratios (ORs) were calculated for homozygotes of the minor allele compared to homozygotes of the major allele (denoted ORGG) and for heterozygotes compared to homozygotes of the major allele (denoted ORGg). The ratio of log ORGg to log ORGG, λ, captures the genetic mode of inheritance: λ values of 0, 0.5, and 1 are consistent with recessive, additive (codominant), and dominant models, respectively. We arbitrarily defined λ < 0.25 as recessive, 0.25 ≤ λ < 0.75 as additive, and 0.75 ≤ λ as dominant. Bootstrap analysis (500 samples with no replacement, 1000 repetitions) was used to assess the dispersion of λ values, reporting the median λbootstrap and the percentage of λbootstrap values that were classified in the same category as the actual λ.
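
The λ calculation can be illustrated with a short Python sketch. It assumes genotype counts tabulated against a binary outcome and uses the thresholds stated above; the resampling helper draws subsamples without replacement, as described. All function names and the example counts are hypothetical and are not study data.

```python
import math
import random

GENOTYPES = ("CC", "CG", "GG")  # major homozygote, heterozygote, minor homozygote


def odds_ratio(group_pos, group_neg, ref_pos, ref_neg):
    """Odds of the binary outcome in a genotype group relative to the reference group."""
    return (group_pos * ref_neg) / (group_neg * ref_pos)


def genetic_lambda(counts):
    """counts maps genotype -> (n with outcome, n without outcome).

    Returns log(OR heterozygote) / log(OR minor homozygote), both against CC.
    """
    ref_pos, ref_neg = counts["CC"]
    or_het = odds_ratio(*counts["CG"], ref_pos, ref_neg)
    or_hom = odds_ratio(*counts["GG"], ref_pos, ref_neg)
    return math.log(or_het) / math.log(or_hom)


def classify_model(lam):
    """Apply the cut-offs from the text: <0.25 recessive, <0.75 additive, else dominant."""
    if lam < 0.25:
        return "recessive"
    if lam < 0.75:
        return "additive"
    return "dominant"


def subsample_lambdas(records, size=500, repetitions=1000, seed=0):
    """records is a list of (genotype, outcome) pairs, outcome being True/False.

    Draws `repetitions` subsamples of `size` records without replacement and
    returns the lambda of each subsample; degenerate subsamples are skipped.
    """
    rng = random.Random(seed)
    lambdas = []
    for _ in range(repetitions):
        counts = {g: [0, 0] for g in GENOTYPES}
        for genotype, outcome in rng.sample(records, size):
            counts[genotype][0 if outcome else 1] += 1
        try:
            lambdas.append(genetic_lambda(counts))
        except (ZeroDivisionError, ValueError):
            continue  # empty cell or OR of exactly 1 in the denominator
    return lambdas


# Illustrative counts only: OR(CG) = 2 and OR(GG) = 4, so lambda = 0.5.
example = {"CC": (100, 200), "CG": (150, 150), "GG": (40, 20)}
lam = genetic_lambda(example)
print(round(lam, 2), classify_model(lam))
```

The share of resampled λ values that fall in the same category as the observed λ can then be computed by applying `classify_model` to each element of the list returned by `subsample_lambdas`.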

Ordinal logistic regression was used for multivariable analyses. Logit link function was selected for uniformly distributed dependent variables, whereas for normally distributed variables, a Probit link function was used. Binary variables were assessed using binary logistic regression, reporting OR and 95% confidence interval (CI). Because ordinal regression does not provide meaningful ORs, a secondary analysis was performed, in which dependent variables were converted to a binary variable and binary logistic regression was applied to obtain ORs. The genetic heredity model suggested by the λ value was used in the ordinal and binary logistic regressions.
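
To make the genetic models concrete, the sketch below shows one common way to encode a genotype as a numeric predictor before it enters a regression. The function name and the string genotype format are assumptions for illustration; the study's actual data handling in SPSS is not described at this level of detail.

```python
def encode_genotype(genotype: str, model: str, risk_allele: str = "G") -> int:
    """Encode a two-letter genotype (e.g. "CG") under a genetic model.

    additive  -> number of risk alleles (0, 1, or 2)
    dominant  -> 1 if at least one risk allele is present, else 0
    recessive -> 1 only if both alleles are the risk allele, else 0
    """
    if len(genotype) != 2:
        raise ValueError("genotype must contain exactly two alleles")
    copies = genotype.count(risk_allele)
    if model == "additive":
        return copies
    if model == "dominant":
        return int(copies >= 1)
    if model == "recessive":
        return int(copies == 2)
    raise ValueError(f"unknown genetic model: {model}")


# Encoding of the three rs738409 genotypes under each model.
for g in ("CC", "CG", "GG"):
    print(g, [encode_genotype(g, m) for m in ("additive", "dominant", "recessive")])
```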

Liver enzymes were log-transformed to achieve a normal distribution, and their association with SNPs was analyzed using one-way analysis of variance for univariate analysis and linear regression for multivariate analysis. Age, sex, BMI, alcohol consumption, and the presence of diabetes mellitus type 2 were selected a priori as covariates in multivariable analyses. A two-sided significance level of α < 0.05 was used for all significance tests.

When more than one SNP was tested for association with an attribute, the Bonferroni correction for multiple comparisons was used. To correct for testing for multiple histological dependent parameters, the false discovery rate (FDR) was calculated using the Benjamini-Hochberg procedure.25 The FDR significance level was set at q < 0.05. Correction was not used in univariate analyses, because these were only used to select SNPs for multivariable analyses. Statistical analyses were performed on SPSS version 13.0 and Microsoft Excel.
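
The two corrections mentioned above can be sketched in a few lines of Python. This is a generic illustration of the Bonferroni threshold and the Benjamini-Hochberg adjustment, not a reconstruction of the study's own computation; the function names and the example P values are made up.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (q values) in the input order.

    Each p value is scaled by m / rank, and monotonicity is enforced by
    taking the running minimum from the largest p value downward.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Six SNPs tested against one attribute gives a threshold of about 0.0083.
print(round(bonferroni_alpha(0.05, 6), 4))

# Illustrative P values only.
print([round(q, 3) for q in benjamini_hochberg([0.01, 0.04, 0.03, 0.005])])
```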

In order to observe the association of genetic variability with the full spectrum of histological disease, we limited the primary analysis to adult patients (age 18 years or older at the time of biopsy). Pediatric cases were used in a secondary analysis, to extend the results obtained from the adult population.

Results


SNP genotypes were available from 894 adult patients with NAFLD (Table 1), the majority of whom (766 patients, 85.7%) were Caucasians. A definite histological diagnosis of NASH was obtained in 59.4% of patients. SNPs that are associated with NAFLD are expected to be enriched in this cohort relative to the general population. We compared the allele frequencies in the adult Caucasian patients to that of the Caucasian control cohort and to the reported frequency in 120 healthy persons of northern European descent in the HapMap project.19 Highly significant enrichment of the minor allele was found for three SNPs in the PNPLA3 gene or its vicinity on chromosome 22: rs738409, rs2281135, and rs2143571 (Table 2). The minor allele was less frequent in three SNPs on chromosome 10, but only one of those — rs11597086 located near the conserved helix-loop-helix ubiquitous kinase (CHUK) gene — achieved borderline significance. The allelic frequencies in the Caucasian control population were similar to those reported in HapMap Caucasians.
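
A comparison of allele frequencies between patients and controls of the kind described above can be sketched as a two-sided Pearson χ2 test on a 2 × 2 table of allele counts. The snippet below is an illustration under stated assumptions: no continuity correction is applied, the allele counts are invented, and the function name is hypothetical.

```python
import math


def allele_chi_square(case_minor, case_major, control_minor, control_major):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi-square statistic, two-sided P value). For one degree of
    freedom the upper tail probability equals erfc(sqrt(chi2 / 2)).
    """
    table = [[case_minor, case_major], [control_minor, control_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: 60/100 minor alleles in cases versus 40/100 in controls.
chi2, p = allele_chi_square(60, 40, 40, 60)
print(round(chi2, 2), round(p, 4))
```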

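Table 1. Patient Characteristics

| Characteristic | Adult Cohort | Pediatric Cohort | P Value (for Comparison with Adults) |
|---|---|---|---|
| Male sex | 337 (37.7%) | 162 (72.6%) | <0.001* |
| Data source | | | |
| Data source: NASH CRN database | 642 (72%) | 91 (41%) | |
| Data source: PIVENS | 213 (24%) | 0 (0%) | |
| Data source: NIH trials | 39 (4%) | 0 (0%) | |
| Data source: TONIC | 0 (0%) | 132 (59%) | |
| Race (%) | | | <0.001* |
| Race: African American | 2.3 | 0.9 | |
| Race: American Indian | 3.2 | 18.8 | |
| Race: >1 race | 3.9 | 0.9 | |
| Hispanic ethnicity | 97 (10.9%) | 144 (64%) | <0.001* |
| Age | 47.9 ± 11.6 (18-76) | 12.4 (6-17) | |
| BMI | 34.4 ± 6.4 (20-59.5) | +3.3 SD (1.5-7.5) | |
| ALT (median, range) | 58 (10-538) | 82 (16-328) | <0.001 |
| AST | 42 (10-480) | 49 (14-317) | <0.001 |
| Gamma glutamyl transferase | 45 (5-793) | 33 (5-208) | 0.003 |
| Alcoholic drinks ≤ 2-4/month | 96.1% | 98.6% | 0.089 |
| Diabetes mellitus type 1 | 4 (0.5%) | 0 | 0.586 |
| Diabetes mellitus type 2 | 244 (28.5%) | 9 (4%) | <0.001* |
| NAS (median, range) | 4 (0-8) | 4 (1-8) | 0.08§ |
| Steatosis grade | | | <0.001§ |
| Steatosis grade 0 (<5%) | 6.2% | 2.2% | |
| Steatosis grade 1 (5-33%) | 39.2% | 30% | |
| Steatosis grade 2 (33-66%) | 21.8% | 28.7% | |
| Steatosis grade 3 (>66%) | 22.9% | 39% | |
| Fibrosis stage | | | <0.001* |
| Fibrosis stage 0 (no fibrosis) | 22.4% | 28.3% | |
| Fibrosis stage 1 (zone 3 or periportal only) | 26.5% | 40.8% | |
| Fibrosis stage 2 (zone 3 and periportal) | 18.1% | 15.2% | |
| Fibrosis stage 3 (bridging) | 20.8% | 13.5% | |
| Fibrosis stage 4 (cirrhosis) | 10.1% | 1.8% | |
| NASH diagnosis | | | <0.001* |
| NASH diagnosis: No NASH | 22% | 24.7% | |
| NASH diagnosis: Definite NASH | 59% | 31.4% | |

  • * Two-sided χ2 test.
  • BMI for children reported in standard deviations from age- and sex-fitted norms.
  • Student t test.
  • § Mann-Whitney U test.
  • Fisher's exact test.
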
Table 2. Allele Frequency of SNPs in Adult Caucasians with NAFLD and Control Populations

| SNP | Chromosome and Location | Nearest Gene | Alleles | HapMap-Caucasian (CEU) Frequency | Control Caucasian Cohort Frequency | NAFLD Adult Caucasian Cohort Frequency | P Value* |
|---|---|---|---|---|---|---|---|
| rs11597390 | 10:101861435 | CPN1 | G/A | 0.675/0.325 | | 0.666/0.334 | 0.48 |
| rs738409 | 22:44324727 | PNPLA3 | C/G | 0.767/0.233 | 0.772/0.228 | 0.495/0.505 | 9 × 10−143 |
| rs2281135 | 22:44332570 | PNPLA3 | C/T | 0.792/0.208 | 0.835/0.165 | 0.632/0.368 | 6 × 10−98 |
| rs2143571 | 22:44391686 | SAMM50 | G/A | 0.808/0.192 | 0.831/0.169 | 0.680/0.320 | 2 × 10−54 |

  • * Two-sided χ2 test for comparison between NAFLD adult Caucasians and control Caucasians.
  • rs11597390 could not be genotyped in controls. Two-sided χ2 test is reported for comparison between NAFLD adult Caucasians and Caucasians from HapMap.
  • Statistically significant (corrected α = 0.0083, Bonferroni procedure).

The three SNPs on the chromosome 22 locus were highly correlated with each other (Spearman's rho 0.599-0.775, P < 0.001 for all pairwise comparisons). Because rs738409 has the highest level of significance, is the only nonsynonymous coding SNP among the three, and is potentially a functionally relevant mutation,26 it was chosen as the representative SNP for that locus for further analyses.

Association with Histological Parameters of Disease Activity.

Histological parameters of NAFLD are assessed along several axes: steatosis, inflammation, cellular injury, and fibrosis. On univariate analysis, rs738409 was significantly associated with increased steatosis, portal inflammation, and lobular inflammation and showed a near-significant trend for association with the presence of Mallory-Denk bodies (Table 3). For steatosis, lobular inflammation, and Mallory-Denk bodies, genotypes containing the minor rs738409 G allele were associated with more severe disease in a dominant pattern (Fig. 1A–C), whereas for portal inflammation the association was better fitted by an additive model. After adjustment for age, sex, BMI, diabetes type 2, and alcohol consumption, all associations remained significant. No association was found with hepatocellular ballooning. Because the results of ordinal regression cannot be easily converted to ORs, we used multivariate binary logistic regression to estimate the magnitude of effect. Patients with at least one minor allele of rs738409 were more likely to have a steatosis score ≥ 2 (OR = 1.46, 95% CI = 1.07-2.01). Similarly, the OR for lobular inflammation was 1.84 (95% CI = 1.33-2.55) and for Mallory-Denk bodies, OR = 1.55 (95% CI = 1.07-2.26). For portal inflammation, the OR for the presence of each minor allele (additive model) was 1.57 (95% CI = 1.24-1.99). These ORs are of the same order of magnitude as those of other binary variables, such as sex or the presence of diabetes (Table 3).

Table 3. Univariate and Multivariate Regression and Odds Ratio for Association of rs738409 Genotype with Histological Parameters of Disease Severity in Adults with NAFLD

| Predictor | Steatosis: Univariate P Value | Steatosis: Multivariate P Value (FDR q Value**) | Steatosis: OR (95% CI)§ | Portal Inflammation: Univariate P Value | Portal Inflammation: Multivariate P Value (FDR q Value) | Portal Inflammation: OR (95% CI) | Lobular Inflammation: Univariate P Value | Lobular Inflammation: Multivariate P Value (FDR q Value) | Lobular Inflammation: OR (95% CI) | Mallory-Denk Bodies: Univariate P Value# | Mallory-Denk Bodies: Multivariate P Value# (FDR q Value) | Mallory-Denk Bodies: OR (95% CI)# |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rs738409 | 0.005* | 0.012 (0.033) | 1.46 (1.07-2.01) | 0.009* | 2.5 × 10−4 (0.017) | 1.57 (1.24-1.99) | 4 × 10−4* | 0.002 (0.025) | 1.84 (1.33-2.55) | 0.058 | 0.015 (0.042) | 1.6 (1.46-3.07) |
| Age | 1.1 × 10−5 | 5.5 × 10−5 | 0.98 (0.97-0.99) | 4.5 × 10−7 | 8.1 × 10−6 | 1.03 (1.02-1.05) | 0.998 | 0.098 | 0.99 (0.98-1.01) | 1.3 × 10−9 | 1.4 × 10−4 | 1.03 (1.01-1.05) |
| Female sex | 0.564 | 0.075 | 1.20 (0.89-1.62) | 0.008 | 0.754 | 1.01 (0.69-1.48) | 0.001 | 4.8 × 10−5 | 1.60 (1.18-2.17) | 1.7 × 10−8 | 7.3 × 10−5 | 2.12 (1.46-3.07) |
| BMI | 0.756 | 0.937 | 0.99 (0.97-1.01) | 5 × 10−5 | 4 × 10−5 | 1.04 (1.01-1.06) | 0.103 | 0.020 | 0.98 (0.96-1.00) | 0.382 | 0.952 | 1.00 (0.98-1.03) |
| Diabetes type 2 | 0.075 | 0.266 | 0.83 (0.60-1.13) | 9.7 × 10−5 | 0.010 | 1.72 (1.19-2.48) | 0.649 | 0.795 | 0.97 (0.70-1.33) | 7.2 × 10−7 | 1.5 × 10−4 | 1.95 (1.38-2.75) |
| Alcohol consumption | 0.171* | 0.043 | 0.91 (0.44-1.87) | 6.9 × 10−5* | 0.088 | 0.53 (0.18-1.55) | 0.694* | 0.336 | 1.52 (0.73-3.18) | 1.8 × 10−4 | 0.06 | 1.38 (0.606-3.143) |

  • * Jonckheere-Terpstra test.
  • Ordinal regression.
  • Mann-Whitney U.
  • § Multivariable binary logistic regression for steatosis ≥ 33% vs. < 33%. rs738409 entered using dominant model (i.e., genotype CG or GG vs. CC).
  • Multivariable binary logistic regression for more than mild portal inflammation (score of 2) vs. none (0) or mild (1). rs738409 entered using additive model.
  • Multivariable binary logistic regression for lobular inflammation ≥ 2 foci vs. < 2. rs738409 entered using dominant model.
  • # Multivariable binary logistic regression. rs738409 entered using dominant model.
  • ** False discovery rate evaluated for association with SNP only. q < 0.05 is significant.

Figure 1.

PNPLA3 rs738409 genotype association with histological attributes: (A) Association with steatosis in a dominant pattern (λ = 0.79, median λbootstrap = 0.80, 55% of λbootstrap values classified as dominant). (B) Lobular inflammation, dominant pattern (λ = 0.99, median λbootstrap = 0.99, 90% classified as dominant). (C) Presence of Mallory-Denk bodies, dominant pattern (λ = 0.93, median λbootstrap = 0.91, 71% classified as dominant). (D) Fibrosis scores, additive hereditary pattern (λ = 0.43, median λbootstrap = 0.42, 57% classified as additive). Error bars denote standard error of the mean.

The association of rs738409 with hepatic steatosis could theoretically explain the association with inflammation and Mallory-Denk bodies. However, even when the degree of steatosis was included in the multivariable model, the association with inflammation and Mallory-Denk bodies remained significant (Table 4).

Table 4. The Association of rs738409 Genotype with Histological Parameters of NAFLD Before and After Inclusion of Steatosis in the Model

| Dependent Parameter | Multivariate* | Multivariate with Steatosis |
|---|---|---|
| Portal inflammation | 2.5 × 10−4 | 1.2 × 10−4 |
| Lobular inflammation | 0.005 | 0.008 |
| Mallory-Denk bodies | 0.015 | 0.017 |
| Fibrosis | 7.7 × 10−6 | 2.7 × 10−6 |

  • * Ordinal regression adjusted for age, sex, BMI, alcohol consumption, and diabetes type 2.
  • Ordinal regression adjusted for age, sex, BMI, alcohol consumption, diabetes type 2, and steatosis score.
  • Binary logistic regression (univariate and adjusted multivariate).

The NAFLD activity score (NAS) is defined as the sum of the scores for lobular inflammation, ballooning, and steatosis and is proposed as a measure of overall activity of the disease.21 As expected from the analyses of individual components, carriers of the rs738409 G allele had higher NAS on univariate (P = 0.009) and multivariate (P = 0.004) analyses in a dominant pattern (λ = 0.991). Homozygotes or heterozygotes for rs738409 G had a mean NAS of 4.52 ± 1.69, compared to 4.11 ± 1.72 in homozygotes for rs738409 C, and were more likely to have NAS > 3, with an OR of 1.56 (95% CI = 1.12-2.17).
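
The NAS definition above amounts to a simple sum, sketched here in Python. The component ranges used for validation (steatosis 0-3, lobular inflammation 0-3, ballooning 0-2) are an assumption taken from the commonly used NASH CRN scoring system and are consistent with the 0-8 NAS range in Table 1; the function names are hypothetical.

```python
def nafld_activity_score(steatosis: int, lobular_inflammation: int, ballooning: int) -> int:
    """Sum of the three component scores, giving a NAS between 0 and 8.

    Assumed component ranges: steatosis 0-3, lobular inflammation 0-3,
    ballooning 0-2.
    """
    limits = {
        "steatosis": (steatosis, 3),
        "lobular_inflammation": (lobular_inflammation, 3),
        "ballooning": (ballooning, 2),
    }
    for name, (score, maximum) in limits.items():
        if not 0 <= score <= maximum:
            raise ValueError(f"{name} must be between 0 and {maximum}")
    return steatosis + lobular_inflammation + ballooning


def has_high_nas(nas: int, threshold: int = 3) -> bool:
    """Binary outcome used for the odds ratio in the text: NAS > 3."""
    return nas > threshold


# Example with made-up component scores.
nas = nafld_activity_score(steatosis=2, lobular_inflammation=1, ballooning=1)
print(nas, has_high_nas(nas))
```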

The two other SNPs on chromosome 22 in the sorting and assembly machinery component 50 homolog (SAMM50)-PNPLA3 gene region (rs2281135 and rs2143571) demonstrated similar associations as those seen with rs738409. No other SNP was associated with any of the histological parameters of activity.

Association with Fibrosis.

The rs738409 SNP was associated with higher fibrosis scores (Table 5) in an additive pattern (λ = 0.43), with an average fibrosis score that was 0.47 stage higher for patients with genotype GG compared to CC (mean score 2.50 versus 2.03; Fig. 1D). This association was highly significant even after controlling for steatosis, inflammation, and the presence of Mallory-Denk bodies, suggesting that the association of rs738409 with fibrosis may be independent of its association with other parameters of disease severity. The adjusted OR for having at least bridging fibrosis was 1.50 (95% CI = 1.19-1.89) for each G allele present.

Table 5. Univariate or Multivariable Regression for the Association of SNPs with NASH Fibrosis Score in Adults

| Characteristic | Univariate P Value | Multivariate P Value (FDR q-Value§) |
|---|---|---|
| rs738409 | 0.005* | 1.9 × 10−5 (0.008) |
| Age | 1.8 × 10−13 | 9.2 × 10−10 |
| Sex | 2.3 × 10−4 | 0.998 |
| Diabetes type 2 | 1.1 × 10−15 | 1.2 × 10−10 |
| Alcohol consumption | 1.9 × 10−13* | 0.003 |

  • Both SNPs are applied as an additive inheritance model.
  • * Jonckheere-Terpstra test.
  • Ordinal regression.
  • Mann-Whitney U.
  • § False discovery rate evaluated for association with rs738409 only. q < 0.05 is significant.

Three SNPs (rs11591741, rs11597086, and rs11597390), all on chromosome 10 in the carboxypeptidase N polypeptide 1 (CPN1)–endoplasmic reticulum lipid raft associated 1 (ERLIN1)–CHUK gene region and known to be in linkage disequilibrium with each other, were also found to be associated with fibrosis. Of these three SNPs, rs11591741 had the most significant association (minor allele C associated with increased fibrosis) and was selected as the representative SNP for further analysis. Age, sex, BMI, diabetes, and alcohol consumption were also associated with fibrosis. When both rs738409 and rs11591741 were included in the multivariate analysis, both retained significance (Table 5).

Effect of Race on Genetic Association.

Because only a small number of patients were non-Caucasians (∼18%), we were unable to control for race in multivariate analyses. When limiting the analyses to Caucasian adults only, all of the associations remained statistically significant. The number of non-Caucasian patients was too small to allow for statistical significance analysis. However, the direction and magnitude of the effect of rs738409 genotype was similar to that seen in Caucasians for steatosis, inflammation, and NAS (Supporting Information Table 2). There was a less clear relationship between rs738409 and fibrosis in non-Caucasians, probably because of the small number.

Association with Spectrum of NAFLD.

For every patient, the pathologists assigned a diagnosis of definite steatohepatitis, borderline steatohepatitis, or not steatohepatitis, on the basis of accepted criteria.27 We compared the frequency of the rs738409 G allele in Caucasian patients from three populations: patients with a definite diagnosis of steatohepatitis (n = 438), patients with steatosis only (taken from the group classified as not steatohepatitis and defined as liver fat, minimal inflammation, no cellular injury, and no fibrosis, n = 82; Supporting Information Table 3), and 336 population controls. The allelic frequency of rs738409 G did not differ between the steatohepatitis (49.2%) and simple steatosis (51.8%, P = 0.54) groups, but was markedly higher in both groups than normal controls (22.8%, P = 2 × 10−13 and 4 × 10−26 for the comparison with steatosis and steatohepatitis, respectively). On univariate and multivariate genotypic analysis, rs738409 did not differentiate steatohepatitis from steatosis alone, even after controlling for potential confounders.

Association with Liver Enzymes.

On univariate analysis, the minor allele of rs738409 was associated with increased serum alanine aminotransferase (ALT) and aspartate aminotransferase (AST; P = 0.027, P = 0.023, respectively). On multivariate linear regression, rs738409 genotype maintained significance with both enzymes (ALT, regression coefficient 0.027, P = 0.03; AST, regression coefficient 0.024, P = 0.033; additive model). Similar associations were seen with the other two SNPs in the PNPLA3 vicinity.

The minor allele of rs11591741 in the CHUK gene was associated with decreased ALT on univariate analysis (P = 0.025). This association remained significant on multivariate linear regression, adjusted for age, BMI, diabetes, sex, and alcohol consumption (regression coefficient −0.028, P = 0.038, additive model). No significant association was observed with AST on univariate or multivariate analysis. Similar findings were seen for rs11597086, a SNP closely linked to rs11591741. There was no association between the third chromosome 10 SNP, rs11597390, and aminotransferases.

Results in Pediatric Patients.

Liver histology was available for 223 pediatric patients (Table 1). The patients in this cohort were more likely to be males and Hispanic in comparison to the adult cohort, were less likely to be diabetic, and had higher aminotransferases but lower gamma-glutamyl transferase levels. Histologically, the pediatric cohort had more steatosis, a nonsignificant trend toward a decreased NAS, and milder fibrosis, with portal fibrosis (fibrosis stage 1c) predominating, as previously described.28 In addition, these patients were less likely than the adult patients to have a confirmed diagnosis of NASH and were more likely to have a borderline diagnosis.

No association was found for any SNP with histological parameters in the pediatric patients. However, patients carrying the rs738409 G allele tended to be younger by 11 months at the time of biopsy, in a dominant pattern (Fig. 2A), and the association with the minor allele was monotonic with age (Fig. 2B). This association reached borderline significance in univariate analysis (P = 0.054, Student t test) and was significant on multivariate analysis (P = 0.045, linear regression adjusted for sex and BMI Z-score). Hispanic ethnicity was also associated with younger age at time of biopsy in univariate and multivariate analyses. Hispanics were more likely to carry the rs738409 G allele (91% versus 70% of non-Hispanics, P = 9 × 10−5, Fisher's exact test), and thus, when both Hispanic ethnicity and rs738409 genotype were entered as parameters in the regression, both became nonsignificant predictors. However, even when Hispanics and non-Hispanics were analyzed separately, the G allele was associated with a younger age at biopsy in both groups, although the difference was not statistically significant, probably because of the smaller number of patients in each group.

Figure 2.

Association of rs738409 genotype with age at time of the biopsy in pediatric patients. (A) Average age at biopsy according to rs738409 genotype. (B) Frequency of rs738409 G allele according to age at time of biopsy. Error bars denote standard error of the mean.

Discussion


In this study, using a large cohort of patients with NAFLD and well-defined histological information, we confirm the association of the minor alleles of rs738409 and other SNPs in the PNPLA3 gene region with liver steatosis and liver enzymes. More importantly, we show a strong association of PNPLA3 with histological parameters of disease activity (inflammation and Mallory-Denk bodies) and with fibrosis severity, associations that are independent of the effect on steatosis. Thus, the presence of the minor allele of rs738409 is a strong predictor of NAFLD histological severity.

Interestingly, the rs738409 G allele was present in the same frequency in patients with NASH or patients with steatosis alone, despite its marked enrichment in both groups compared to population controls, and despite its association with disease severity. This is probably related to the lack of association between rs738409 and hepatocyte ballooning, a histological feature that heavily favors a diagnosis of NASH by the pathologist.27 Another explanation may result from a potential selection bias. In order to be included, patients had to have been referred to a hepatologist and have an indication for a liver biopsy. Thus, it is possible that the patients with only steatosis are not fully representative of simple steatosis and have very mild features of NASH (perhaps obscured by sampling error).

One study to date has investigated the association of rs738409 with histological parameters of NAFLD29 by comparing 172 patients with NAFLD (103 of whom had liver biopsies) to 94 normal controls. In that study, the minor allele was associated with histological steatosis and overall severity, similar to our findings. Our larger sample size enabled us to study in greater detail the associations with specific components of histological severity and to demonstrate their independence of steatosis grade. In contrast to our findings, Sookoian et al.29 observed a significant difference in rs738409 genotypes between patients with NASH and steatosis. The two studies differ in the population prevalence of the minor allele, in the definition of NASH, and in the recruitment strategy of patients. Further studies are necessary to resolve this discordant result.

NAFLD is becoming increasingly prevalent in pediatric patients.30 We did not identify an association between the PNPLA3 SNPs and severity in the pediatric cohort, probably because of a milder disease at an early stage of the NASH natural course, as evident by the smaller proportion of pediatric patients with a definite NASH diagnosis. However, the presence of the rs738409 G allele was associated with a younger age at the time of liver biopsy, suggesting a younger age of disease presentation.

There are important racial and ethnic differences in the prevalence of fatty liver disease and the predisposition to NASH; patients of Hispanic ethnicity are more likely to have NASH and NAFLD, whereas African Americans seem to be relatively protected.31 The racial distribution in our adult cohort did not allow us to determine whether the effects of genetic variation explain the ethnic-racial differences or are independent of them.

PNPLA3 encodes adiponutrin, a membrane-bound protein expressed primarily in adipose tissue.32 The expression level of adiponutrin messenger RNA is tightly regulated by food intake33, 34 through changes in insulin levels.35, 36 The function of adiponutrin is not fully understood. It has close homology with adipose triglyceride lipase37 and seems to have both triglyceride lipase and acyl-coenzyme A–independent transacylase activity.38 The rs738409 C→G SNP is a nonsynonymous coding SNP, creating an Ile148Met change in the adiponutrin protein. Recently, this mutation was suggested to impair triglyceride hydrolysis in hepatocytes, a potential explanation for increased triglyceride accumulation.26 Interestingly, despite the association with liver steatosis, this SNP does not seem to be associated with insulin resistance10, 12, 13 or with obesity.39, 40 It has been reported to be associated with lower cholesterol and low-density lipoprotein cholesterol levels,14 but whether this is related to the liver disease is unclear. It is possible that another genetic variant, which is strongly linked to rs738409, may be responsible for the difference in phenotypes.

We have also found an association of NASH fibrosis with several SNPs on chromosome 10, in the vicinity of the CHUK gene, which encodes IκB kinase-α, a kinase involved in nuclear factor-κB signaling, and CPN1, which encodes carboxypeptidase N, a metalloproteinase that regulates the activity of kinins and anaphylatoxins. Despite the association with increased fibrosis, the minor alleles at this locus are associated with lower ALT levels in our cohort as well as in other studies.11 This apparently paradoxical association is explained by the nonmonotonic relationship between ALT and fibrosis (increasing fibrosis stage associated initially with higher ALT and then with lower ALT) in patients with NAFLD. These SNPs are not associated with any other histological aspect of NAFLD, and thus, it is unclear whether they are associated with NASH fibrosis specifically or with fibrosis in other liver diseases as well. Further studies are needed to confirm their role in NAFLD and other liver diseases.

Our findings of association with histological parameters reach conventional significance, but not the genome-wide significance level used in GWA studies. Although a genome-wide threshold is often advocated in genetic studies, we believe it may not be applicable in this case for several reasons. First, we performed candidate-gene analysis with only six SNPs, as compared to hundreds of thousands in GWA studies. Second, because we tested associations with several phenotypes, we corrected for multiple comparisons using the FDR procedure.25 Because the histological attributes we tested are not independent of each other, the use of the Bonferroni correction would have been too conservative and would greatly reduce the power to detect an association. The FDR procedure is much more suitable for this situation.41 Third, the separate and independent association of rs738409 with different histological aspects of NAFLD, all in the same direction of effect, strongly supports a true association. Finally, the histological scoring scheme for NAFLD is semiquantitative and has a limited dynamic range. This factor markedly reduces the ability to achieve the high significance values that can be observed with continuous traits such as liver enzymes or liver fat quantification by magnetic resonance spectroscopy.
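The difference in power between the two corrections can be illustrated with a minimal sketch. It assumes that "the FDR procedure" refers to the Benjamini-Hochberg step-up procedure, which the text does not state explicitly, and it uses hypothetical P values chosen only for illustration; they are not results of this study. The function names `bonferroni` and `benjamini_hochberg` are likewise illustrative.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a test only if its P value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at alpha.

    Sort the P values, find the largest rank k such that
    p_(k) <= (k / m) * alpha, and reject the k smallest P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical P values for eight correlated phenotypes (illustration only).
p_values = [0.03, 0.00025, 0.005, 0.015, 0.004, 0.0000077, 0.20, 0.60]

print("Bonferroni rejections:", sum(bonferroni(p_values)))
print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg(p_values)))
```

With these illustrative inputs, the Bonferroni threshold of 0.05/8 retains only the four smallest P values, whereas the step-up procedure retains six, which is the sense in which the Bonferroni correction is the more conservative of the two.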

In summary, we confirm the association of the rs738409 G allele in the PNPLA3 (adiponutrin) gene with steatosis in the largest cohort of patients with histologically proven NAFLD reported to date. Furthermore, this is the first study to describe an association of this SNP with histological parameters of disease severity, including inflammation, Mallory-Denk bodies, and fibrosis. In pediatric patients, the high-risk allele seems to be associated with earlier presentation of disease. Our findings suggest that the rs738409 G allele may predispose patients to fat accumulation in the liver, but that other factors, whether environmental or hereditary, may be required for the development of inflammation, cellular injury, and fibrosis. However, once patients develop NASH, the rs738409 G allele predisposes them to worse injury. In addition, we report an association of SNPs in the PNPLA3 locus and in the CPN1-CHUK locus on chromosome 10 with severity of NASH fibrosis. The mechanism underlying these associations requires further investigation.


We are grateful for the technical assistance of Ronda Sapp and Yoon Park, and the statistical advice of Dr. Xiongce Zhao and Professor Yoav Benjamini.