Department of Clinical and Molecular Hepatology, Institute of Medical Research A Lanari-IDIM, University of Buenos Aires-National-Council of Scientific and Technological Research (CONICET), Ciudad Autónoma de Buenos Aires, Argentina
Instituto de Investigaciones Médicas A. Lanari-CONICET, Combatiente de Malvinas 3150, Buenos Aires (1427), Argentina
Department of Molecular Genetics and Biology of Complex Diseases, Institute of Medical Research A Lanari-IDIM, University of Buenos Aires-National-Council of Scientific and Technological Research (CONICET), Ciudad Autónoma de Buenos Aires, Argentina
Instituto de Investigaciones Médicas A. Lanari-CONICET, Combatiente de Malvinas 3150, Buenos Aires (1427), Argentina
Potential conflict of interest: Nothing to report.
Partially supported by grants PICT 2006-124 and PICT 2008-1521 (Agencia Nacional de Promoción Científica y Tecnológica), and UBACYT M55 (Universidad de Buenos Aires). The authors belong to Consejo Nacional de Investigaciones Científicas.
Our objective was to estimate the strength of the effect of the I148M (rs738409 C/G) patatin-like phospholipase domain containing 3 (PNPLA3) variant on nonalcoholic fatty liver disease (NAFLD) and disease severity across different populations. We performed a systematic review with meta-analysis; literature searches identified 16 studies. Our results showed that rs738409 exerted a strong influence not only on liver fat accumulation (GG homozygotes showed 73% higher liver fat content than CC homozygotes; data from 2,937 subjects; P < 1 × 10−9), but also on the susceptibility to a more aggressive disease (GG homozygotes had a 3.24-fold greater risk of higher necroinflammatory scores and a 3.2-fold greater risk of developing fibrosis than CC homozygotes; P < 1 × 10−9; data from 1,739 and 2,251 individuals, respectively). Nonalcoholic steatohepatitis (NASH) was more frequently observed in GG than in CC homozygotes (odds ratio [OR] 3.488, 95% confidence interval [CI] 1.859-6.545, random model; P < 2 × 10−4; data from 2,124 patients). Evaluation of the risk associated with heterozygosity for the variant suggests that the additive genetic model best explains the effect of rs738409 on the susceptibility to develop NAFLD. Nevertheless, carrying two G alleles rather than one does not seem to further increase the risk of severe histological features. Meta-regression showed a negative correlation between male sex and the effect of rs738409 on liver fat content (slope: −2.45 ± 1.04; P < 0.02). The rs738409 GG genotype versus the CC genotype was associated with a 28% increase in serum alanine aminotransferase levels. Conclusion: By summarizing the available data, this study provided unequivocal evidence of rs738409 as a strong modifier of the natural history of NAFLD in different populations around the world. (HEPATOLOGY 2011;)
Advances in genome analysis, including high-throughput genotyping technology, have strongly contributed to the understanding of the genetic component of nonalcoholic fatty liver disease (NAFLD). In fact, the nonsynonymous variant I148M (rs738409 C/G, chr22:42656060-42656060) located in the human patatin-like phospholipase domain containing 3 gene (PNPLA3, also known as adiponutrin) was initially associated with fatty liver in the first genome-wide association study (GWAS) on NAFLD,1 and was subsequently robustly replicated in independent candidate gene studies. The study of Romeo et al.,1 a large multiethnic population-based epidemiological study of fatty liver, not only demonstrated that the rs738409 variant is significantly associated with increased liver fat content, as evaluated by proton magnetic resonance spectroscopy, but also raised important questions about the pathogenesis and biology of NAFLD. The first question concerned the potential association between the I148M variant and histological disease severity. We first demonstrated,2 in a finding replicated by others,3-6 that the G allele in the forward strand was significantly associated with disease severity, as evaluated by histological assessment of liver biopsies. Nevertheless, although nonalcoholic steatohepatitis (NASH) was more frequently observed among G allele carriers in some studies,2, 4-6 others were unable to replicate this finding even after including a larger sample of cases.3 In addition, the magnitude of the effect of the variant on disease severity varied widely among populations of similar ethnic background, with odds ratios (ORs) ranging from 1.5 (ref. 6) to 3.26 (ref. 4) for NASH, and from 1.5 (ref. 6) to 3.37 (ref. 4) for liver fibrosis.
The related question that arose after the GWAS findings was whether the rs738409 variant also influences the prevalence and behavior of NAFLD in early life. Large and well-characterized studies including pediatric populations were therefore carried out; some5 but not all3 confirmed an association between rs738409 and the severity of NAFLD in pediatric cohorts.
Almost simultaneously, the research community sought to know whether the metabolic comorbidities that are frequently associated with the pathogenesis of NAFLD, such as insulin resistance and obesity, are also influenced by rs738409. Interestingly, although most of the NAFLD studies have simultaneously confirmed the independence of the effect of rs738409 on fatty liver from the mentioned related phenotypes, a population-based study that surveyed a large sample of subjects without fatty liver (n = 1,811) has shown that PNPLA3 variants, including rs738409, are associated with obesity and insulin sensitivity and secretion.7 In addition, the G allele of rs738409 is associated with a favorable metabolic profile, including decreased body mass index (BMI) and decreased risk of type 2 diabetes in one of the large NAFLD studies.4
The association of the I148M variant with increased liver enzymes, in particular alanine aminotransferase (ALT) levels, was first discovered by a GWAS of plasma liver-enzyme levels in three different populations,8 and thereafter replicated by several independent studies. In contrast to that observed in adults, the rs738409 variant was not associated with ALT levels in a series of pediatric patients with NAFLD, proven by liver biopsy.5
Finally, Romeo et al.1 showed that the effect of the rs738409 variant was more pronounced among individuals of Hispanic ancestry, in whom the risk allele was also more frequent than in European-Americans and African-Americans. This observation prompted new investigations into the magnitude of the effect of the variant in different populations.
In view of the evidence mentioned above, our primary purpose was to estimate from the available literature the strength of the effect of the rs738409 variant on NAFLD and on histological disease severity across different populations, as well as the potential influence of the intermediate associated phenotypes. In addition, we systematically evaluated the study characteristics that could be responsible for the association. Furthermore, because the issue is still unresolved in the literature, we addressed in this meta-analysis which genetic model best explains the effect of the rs738409 single nucleotide polymorphism (SNP) on the susceptibility to develop NAFLD or NASH.
Published studies were identified through electronic searches of PubMed at the National Library of Medicine (http://ncbi.nlm.nih.gov/entrez/query) and of Medline databases using the query “(PNPLA3, adiponutrin) and (rs738409, gene or variants or polymorphism or alleles) and (fatty liver).” Reference lists in relevant publications were also examined. The literature search covered studies published up to January 2011 and required the availability of an English-language abstract or article for review; this yielded 32 hits. One study that performed a GWAS on NAFLD was excluded because rs738409 was not captured by the chip.9
There were no country restrictions. The authors independently reviewed all abstracts to determine eligibility and the appropriateness of the research question; when an abstract qualified, the full article was retrieved. There were no discrepancies between reviewers.
Details about inclusion and exclusion criteria and data collection can be seen in the Supporting Material online.
The evaluation of histological disease severity was based on data about liver biopsy of NAFLD patients, including the presence of NASH as defined by Kleiner et al.,10 presence of lobular necroinflammation (grade >1), and presence of fibrosis (stage >1).
Because the variation seemed to follow an undefined model of inheritance in some of the outcomes, to avoid choosing any a priori model, we decided to compare the extreme genotypes, namely, homozygous CC (148 I/I) versus homozygous GG (148 M/M), as reported.11 In addition, in order to address which genetic model best explains the effect of the rs738409 SNP on the susceptibility to develop NAFLD and NASH, we also included an evaluation of the risk associated with heterozygosity for the variant (heterozygous CG versus homozygous CC, the reference group). For each phenotype we evaluated the association results stratified by age and ethnicity.
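The genotype contrasts described above (GG versus CC, and CG versus CC) each reduce to an odds ratio from a 2×2 table within a single study. The following is a minimal sketch of that per-study step, assuming a Woolf (log-based) confidence interval and a 0.5 continuity correction for empty cells; neither choice is stated in the text, and the genotype counts are hypothetical.

```python
import math

def odds_ratio_ci(cases_exposed, controls_exposed, cases_ref, controls_ref, z=1.96):
    """Odds ratio and Woolf (log-based) 95% CI for one 2x2 table."""
    a, b, c, d = cases_exposed, controls_exposed, cases_ref, controls_ref
    if min(a, b, c, d) == 0:
        # Assumed continuity correction: add 0.5 to every cell if any is empty.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical (cases, controls) counts per genotype, not taken from any study.
counts = {"CC": (40, 60), "CG": (50, 45), "GG": (30, 15)}
ref_cases, ref_controls = counts["CC"]
for genotype in ("GG", "CG"):
    cases, controls = counts[genotype]
    print(genotype, odds_ratio_ci(cases, controls, ref_cases, ref_controls))
```

Using CC as the reference in both contrasts keeps the two odds ratios directly comparable, which is what later allows the heterozygote effect to be set against the homozygote effect.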
Study quality of the reviewed articles was also evaluated using the median impact factor of the journals in which they had been published.12
For quantitative variables, the effect is the standardized difference (D), defined as the mean difference (between the GG and CC groups, and also between the CG and CC groups) divided by the common within-group standard deviation; for dichotomous variables, the effect is the OR with homozygous CC as the reference group unless otherwise indicated. Summary ORs and the corresponding 95% CIs were estimated with both fixed and random effects models, using the Mantel-Haenszel method to obtain the pooled OR.
For D, Cohen's d (which expresses the magnitude of differences between groups) was used to summarize the results, and heterogeneity was evaluated with the Q statistic and the I2 statistic, a transformation of Q that estimates the percentage of the variation in effect sizes that is due to heterogeneity. An I2 value of 0% indicated no observed heterogeneity, and larger values indicated increasing heterogeneity. In the case of heterogeneity, we proceeded as explained before13 (details can be seen in Supporting Material online). To check for publication bias, we used Begg and Mazumdar's rank correlation test (this test reports the rank correlation, also known as the rank correlation coefficient or simply Kendall's tau, between the standardized effect size and the variances, or standard errors, of these effects).14 A P value equal to or less than 0.05 was considered statistically significant.
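The two effect measures defined above can be illustrated with a short sketch. It computes the standardized difference D from group summaries and a fixed-effect Mantel-Haenszel pooled OR from per-study 2×2 tables. The function names and input numbers are hypothetical, and the random-effects variant is not shown; the actual analysis was run in dedicated software.

```python
import math

def standardized_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Mean difference divided by the pooled within-group standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)

def mantel_haenszel_or(tables):
    """Fixed-effect Mantel-Haenszel pooled odds ratio.

    Each table is (a, b, c, d):
    a = GG with trait, b = GG without trait,
    c = CC with trait, d = CC without trait.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Hypothetical inputs for illustration only.
d_value = standardized_difference(12.0, 2.0, 10, 10.0, 2.0, 10)
pooled_or = mantel_haenszel_or([(30, 15, 40, 60), (60, 30, 80, 120)])
print(d_value, pooled_or)
```

The Mantel-Haenszel weights (b·c/n per study) give larger studies more influence without requiring a per-study variance estimate, which is convenient when some cells are small.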
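The heterogeneity and publication-bias statistics named above can be sketched as follows, assuming per-study effect sizes with known variances. The sketch computes Q, I2 (as the percentage of Q in excess of its degrees of freedom), and the Kendall's tau at the core of the Begg and Mazumdar test; the P value for tau is omitted, ties are not handled specially, and all inputs are hypothetical.

```python
import math

def fixed_effect_pool(effects, variances):
    """Inverse-variance pooled effect and the sum of the weights."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    return pooled, total

def q_and_i2(effects, variances):
    """Cochran's Q and I2 (percent of variation attributed to heterogeneity)."""
    pooled, _ = fixed_effect_pool(effects, variances)
    q = sum((e - pooled) ** 2 / v for e, v in zip(effects, variances))
    df = len(effects) - 1
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2

def begg_kendall_tau(effects, variances):
    """Kendall's tau between standardized effects and their variances."""
    pooled, total = fixed_effect_pool(effects, variances)
    # The variance of (effect - pooled) is v_i - 1 / sum(weights).
    standardized = [(e - pooled) / math.sqrt(v - 1.0 / total)
                    for e, v in zip(effects, variances)]
    n = len(effects)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (standardized[i] - standardized[j]) * (variances[i] - variances[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)

# Hypothetical effect sizes and variances.
print(q_and_i2([0.0, 1.0], [0.1, 0.1]))
print(begg_kendall_tau([0.1, 0.2, 0.3, 0.4], [0.01, 0.02, 0.03, 0.04]))
```

Because I2 = (Q − df)/Q, a reported I2 can be converted back to Q when the number of studies is known: for example, an I2 of about 12% across 11 studies (10 degrees of freedom) corresponds to a Q of roughly 11.4, assuming all 11 studies entered the test.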
All calculations were performed using the Comprehensive Meta-Analysis computer program (Biostat, Englewood, NJ).
We evaluated 16 studies that met the selection criteria and that were identified using the search strategy described in Supporting Fig. 1. Study characteristics are shown in Table 1. Data from one study that fulfilled the eligibility criteria were included after personal contact with the investigators15; data from one further study were unavailable because the authors did not disclose the raw data in the article and our attempts to contact them were unsuccessful.16 All the studies scored well in terms of adequate descriptions of selection criteria and reference test, blind assessment of the reference test, and the availability of clinical data. A general limitation is that genotype counts per evaluated phenotype were rarely reported across the studies.
Table 1. Characteristics of the Studies on the Association Between the Nonsynonymous rs738409 Variant of PNPLA3 and Fatty Liver Disease
Columns: First Author, Year; Population Ethnicity, Country; Study Design and Sample Size (N); Features and Patient Characteristics; Liver Biopsy (N); Age of the Subjects; Female Sex, n (%)
Abbreviations: H-MRS, hydrogen magnetic resonance (H-MR) spectroscopy; US, liver ultrasonographic examination; CT, computed tomography; LB, liver biopsy; DEXA, dual energy x-ray absorptiometry; T2D, type 2 diabetes; NA, not available.
Hepatic steatosis measured by abdominal CT scanning.
Hispanic American 524 (62.6); African American 229 (61.7)
Eleven studies were hospital-based case-control studies,2-6, 15, 17-21 and the other five were population-based case-control studies,1, 22-24 or family-based studies.25
Information about liver biopsy was available in six studies,2-6, 17 and data about disease severity was analyzed in 2,651 patients with NAFLD; ALT levels according to the rs738409 genotypes were available in 11 studies.1, 2, 5, 6, 15, 17-21, 24
Genotyping for rs738409 was carried out using the Taqman assay in 11 studies,1, 5, 6, 17-24 by allele-specific oligonucleotides in two studies,2, 15 and by the Sequenom MassARRAY iPLEX Gold platform (Sequenom, San Diego, CA) in the remaining three studies.3, 4, 25
Fatty Liver Disease and Liver Fat Content.
Data regarding fatty liver disease as a disease trait extracted from 11 studies included 5,100 individuals,2-4, 6, 15, 17, 18, 20, 23-25 and, as expected, the analysis showed a significant association between fatty liver and the rs738409 variant either in the fixed or the random model (P < 1 × 10−9) (Fig. 1a); details of the association stratified by age are shown in Supporting Fig. 2. At any rate, we did not observe heterogeneity among studies as assessed by the Q statistic (P = 0.33), I2: 11.97. From the Begg and Mazumdar's rank correlation test (two-tailed P = 0.15), it seems that there was no publication bias. The evaluation of the risk associated with heterozygosity for the variant and fatty liver as a dichotomic variable also showed a significant association with the G allele. Interestingly, this analysis suggests that rs738409 exerts an additive effect on the susceptibility to develop NAFLD (Fig. 7); the details of the association analysis results for NAFLD and the CG versus CC genotypes are given in Supporting Table 1.
In addition, we found five homogeneous reports (P = 0.22, I2: 27.5) that provided extractable data on liver fat content (determined using hydrogen magnetic resonance spectroscopy [H-MRS]) according to the rs738409 genotypes.1, 18, 19, 22, 23 The comparison between cases and controls, including 2,937 subjects, showed that liver fat content was significantly associated with the GG genotype in both the random effect and fixed models (P < 1 × 10−9) (Fig. 1b), without evidence of publication bias (two-tailed P = 0.37) (details of the association stratified by ethnicity are shown in Supporting Fig. 3).
The evaluation of the risk associated with heterozygosity for the variant and liver fat content showed that, although significant, the effect seems to be much smaller when carrying only one G allele (Fig. 7) (details in Supporting Table 1).
By meta-regression analysis, we observed a negative correlation between the male proportion in the studied populations and the effect of rs738409 on liver fat content (slope: −2.45 ± 1.04, P < 0.02; Fig. 2), suggesting that a sexual dimorphism might be involved in the effect of the SNP on NAFLD development. Conversely, a significant correlation between the effect of the SNP on either NAFLD risk or liver fat content and BMI, and fasting glucose or fasting insulin could not be demonstrated (data not shown).
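The meta-regression described above relates each study's effect size to a study-level covariate, here the proportion of males. The following is a minimal sketch assuming a fixed-effect, inverse-variance weighted regression with a single covariate; the text does not state which regression model was used, and the data points are hypothetical.

```python
import math

def weighted_meta_regression(covariate, effects, variances):
    """Inverse-variance weighted least squares of effect size on one covariate.

    Returns (slope, intercept, standard error of the slope) under a
    fixed-effect model in which the within-study variances are treated as known.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, covariate)) / total
    y_bar = sum(w * y for w, y in zip(weights, effects)) / total
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, covariate))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, covariate, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)
    return slope, intercept, se_slope

# Hypothetical studies: proportion of males, effect size D, variance of D.
male_fraction = [0.3, 0.5, 0.7]
effect_sizes = [0.4, 0.0, -0.4]
effect_variances = [0.02, 0.01, 0.03]
print(weighted_meta_regression(male_fraction, effect_sizes, effect_variances))
```

A negative slope, as in the reported result, means that studies with a higher proportion of males tended to show a smaller effect of the variant on liver fat content.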
Histological Severity of NAFLD.
We found six heterogeneous reports (P < 0.001, I2: 83.7) that disclosed extractable data about the presence of NASH and either ORs per risk allele or the prevalence of NASH according to the rs738409 genotypes.2-6, 17 The comparison among NAFLD patients, including 2,124 subjects with confirmed diagnosis by liver biopsy, showed that NASH was more frequently observed in GG than in CC carriers by fixed (OR 3.125, 95% CI 2.690-3.630; P < 1 × 10−9) or random effect (OR 3.488, 95% CI 1.859-6.545; P < 2 × 10−4) models, without evidence of publication bias (two-tailed P = 0.45); details of the association stratified by ethnicity are shown in Supporting Fig. 4. To investigate the source of heterogeneity, we analyzed the data by grouping the reports by age, and after separating one study that included a pediatric population and showed a disparately high OR of 88.65 (Fig. 3), the heterogeneity still persisted between the remaining four studies that included an adult population. The heterogeneity disappeared after excluding one outlier study,3 and the effect was still significant (OR 3.223, 95% CI 2.849-3.875, fixed and random model; P < 1 × 10−9).
Data about lobular necroinflammation according to either genotypes or ORs per risk allele were available in four heterogeneous studies (P < 0.002, I2: 79),2-5 including 1,739 patients. The analysis showed that the GG genotype was significantly associated with higher inflammation scores (fixed P < 1 × 10−9 and random P < 1 × 10−7), without evidence of publication bias (two-tailed P = 0.31; Fig. 4). By separating one report5 that included pediatric patients (and again showed a disparately high OR of 72) the heterogeneity was removed, and the effect was still significant (OR 3.18, 95% CI 2.77-3.64, fixed and random model; P < 1 × 10−9).
Finally, data about fibrosis score was extractable from five homogeneous studies,2-6 including 2,251 patients. We observed a significant association with the variant either in fixed or random model (P < 1 × 10−9); there was no evidence of publication bias (two-tailed P = 0.81; Fig. 5). Owing to the limited number of studies that disclosed data about the NAFLD traits associated with disease severity and metabolic syndrome intermediate phenotypes, meta-regression analysis of these variables with BMI, homeostasis model assessment of insulin resistance (HOMA-IR), fasting glucose, or fasting insulin was not possible.
The evaluation of the risk associated with heterozygosity for the variant and liver disease severity showed that the effect of carrying only one G allele does not differ from that of the GG genotype (Fig. 7), suggesting that carrying two G alleles does not lead to a large change in the risk of severe histological features (details in Supporting Table 1).
ALT was significantly associated with the rs738409 variant in 11 heterogeneous studies (P < 0.001, I2: 86.5),1, 2, 5, 6, 15, 17-21, 24 including 5,366 individuals; fixed effect P < 1 × 10−9, and random effect P < 0.0009, without evidence of publication bias (two-tailed P = 0.30); details of the association stratified by ethnicity are shown in Supporting Fig. 5. Subjects were stratified by age (Fig. 6), ethnicity, study design, and associated disease condition, but the heterogeneity remained significant. The heterogeneity did not disappear even after removing the outlier studies.5, 18, 19, 21 Nevertheless, the effect estimate seems to be robust because similar and significant results (standardized difference between 0.32 and 0.45, P < 1 × 10−8) still remained after excluding one study at a time.
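The robustness check mentioned above, excluding one study at a time, can be sketched as a leave-one-out loop around a pooling function. The sketch assumes fixed-effect inverse-variance pooling of standardized differences; the effect sizes and variances are hypothetical and the pooling model is an assumption.

```python
def pooled_fixed(effects, variances):
    """Inverse-variance fixed-effect pooled estimate."""
    weights = [1.0 / v for v in variances]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)

def leave_one_out(effects, variances):
    """Pooled estimate recomputed with each study removed in turn."""
    estimates = []
    for i in range(len(effects)):
        remaining_effects = effects[:i] + effects[i + 1:]
        remaining_variances = variances[:i] + variances[i + 1:]
        estimates.append(pooled_fixed(remaining_effects, remaining_variances))
    return estimates

# Hypothetical standardized differences and their variances.
study_effects = [0.3, 0.4, 0.5]
study_variances = [0.01, 0.01, 0.01]
loo = leave_one_out(study_effects, study_variances)
print(min(loo), max(loo))
```

A narrow range between the smallest and largest leave-one-out estimate indicates that no single study drives the pooled result.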
The analysis of the heterozygosity for the variant showed that ALT levels were significantly associated with the rs738409 G allele when the reference genotype (CC) was compared with the CG genotype, suggesting again an additive genetic effect (details in Supporting Table 1).
Associated Metabolic Phenotypes: Insulin Resistance and BMI.
Additional information about the NAFLD-associated insulin resistance phenotype was available in six studies that reported data on HOMA-IR,2, 5, 6, 20, 21, 24 including 1,404 subjects; in seven studies that reported data on fasting insulin levels,2, 6, 18-21, 24 including 1,721 subjects; and in nine studies that reported data on fasting glucose levels,1, 2, 5, 6, 18-21, 24 including 4,160 subjects (Supporting Table 2). Interestingly, no significant association with the variant was found for HOMA-IR and fasting glucose or insulin levels. Neither was the trait obesity, as measured by BMI, associated with the variant (Supporting Table 2) in 4,141 subjects included in 10 heterogeneous studies.1, 2, 6, 15, 18, 19, 21, 22
Overall Study Quality.
The median impact factor for all the included studies was 7.81 (range, 2.84-34.28). Thus, the data included in the analysis were published in leading journals with a high impact factor.
Whereas the physiological role and biological function of PNPLA3 in the liver are still unclear, the evidence on the impact of the rs738409 nonsynonymous I148M polymorphism not only on hepatic triglyceride content but also on a more severe course of NAFLD seems to be irrefutable. The results of this well-powered meta-analysis, by summarizing the amount of evidence, degree of replication, and absence of publication bias, show that rs738409 exerts a strong influence not only on liver fat accumulation (GG carriers have a 73% higher liver fat content than CC carriers), but also on the susceptibility to a more aggressive disease with greater liver injury (GG homozygous subjects have a 3.24-fold greater risk of higher necroinflammatory scores than those homozygous for the C allele; data from 1,739 individuals) and higher fibrosis scores (GG homozygous subjects have a 3.2-fold greater risk of developing liver fibrosis than those homozygous for the C allele; data from 2,251 patients). These results combine data from the studies with the largest available series of patients with NAFLD proven by liver biopsy.
In addition, novel information is added, as we observed that the effect of rs738409 on fatty liver disease seems to follow an additive genetic model for all phenotypes except liver disease severity. This observation suggests that the influence of the variant on the susceptibility to develop a more progressive disease follows a dominant model.
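The distinction between additive and dominant models can be made concrete by comparing the heterozygote effect with the homozygote effect on the log odds scale. The sketch below uses a commonly used heuristic, the ratio ln(OR for CG) / ln(OR for GG), which is near 0 for a recessive pattern, near 0.5 for an additive (co-dominant) pattern, and near 1 for a dominant pattern. This is an illustration, not necessarily the procedure the authors followed; the tolerance and the input ORs are hypothetical.

```python
import math

def heterozygote_ratio(or_cg, or_gg):
    """Ratio of log odds ratios: ln(OR CG vs CC) / ln(OR GG vs CC)."""
    return math.log(or_cg) / math.log(or_gg)

def describe_model(ratio, tolerance=0.15):
    """Map the ratio to a genetic model label; the tolerance is arbitrary."""
    if abs(ratio) <= tolerance:
        return "recessive"
    if abs(ratio - 0.5) <= tolerance:
        return "additive (co-dominant)"
    if abs(ratio - 1.0) <= tolerance:
        return "dominant"
    return "intermediate"

# Hypothetical odds ratios for illustration only.
for or_cg, or_gg in [(1.8, 3.2), (3.0, 3.2), (1.05, 3.2)]:
    ratio = heterozygote_ratio(or_cg, or_gg)
    print(or_cg, or_gg, round(ratio, 2), describe_model(ratio))
```

Under this reading, a heterozygote effect that is roughly as large as the homozygote effect, as reported for histological severity, yields a ratio close to 1 and points toward a dominant pattern.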
Notably, meta-regression showed a negative association between the effect of rs738409 on liver fat content and male sex, a relationship not previously analyzed even in the large studies. This effect may explain the previously reported sex differences in the prevalence of NAFLD26 and, even more interestingly, may suggest a potential sexual dimorphism in the effect of the gene variant on NAFLD susceptibility. At any rate, the different proportions of males across studies explained, at least in part, the variation in the estimated effect of the gene variant on liver fat content. Although the present study cannot explain the gene-by-sex interaction, some evidence exists that the PNPLA3 gene is under the control of several factors, for instance, nutritional control in close relationship with SREBP-1c and liver X receptor.27 Interestingly, sex hormones, such as estrogen, modulate lipogenic genes, including SREBP-1c, and participate in adiposity and fuel partitioning.28 Hence, modulation by sex hormones is a possible mechanism.
A note of caution should be added, as the presence of heterogeneity may restrict the interpretation of the pooled risk estimates, in particular concerning the association of the variant with closely related features such as the presence of NASH, scores of liver necroinflammation, and serum ALT levels. Heterogeneity in a meta-analysis is mostly produced by differences in study design and background characteristics of the subjects, and the extent of heterogeneity might influence the conclusions. However, the random effect model, which accounts for between-study variance, also yielded significant results for these features closely related to disease severity and the magnitude of liver injury. Although heterogeneity was addressed statistically by applying a random effect model, we aimed to further investigate its potential sources where possible. Thus, the full dataset was used to investigate heterogeneity by sensitivity analysis. In some studies, NAFLD patients were mostly recruited because of elevated ALT levels; thus, there was a marked enrichment of patients with NASH, and few of them showed simple steatosis. Moreover, potential selection bias when selecting patients for liver biopsy may also explain the heterogeneity.
The current meta-analysis helps to clarify the magnitude of the effect of rs738409 on the histological severity of NAFLD, which is far beyond the small magnitude observed for common variants on complex traits,29 and may be explained by the nonsynonymous nature of the polymorphism, which induces an amino acid change from isoleucine (I) to methionine (M) (missense Ile (ATC) → Met (ATG)) with possible functional consequences.30
A potential limitation of this study is its reliance on two studies,3, 4 which included cases from the Nonalcoholic Steatohepatitis Clinical Research Network (NASH CRN). This fact may limit the results regarding liver disease severity. Hence, to address this potential caveat, we repeated the whole analysis of all the NAFLD-related phenotypes after excluding the smaller of the two studies in terms of sample size3 (details in Supporting Table 3). The magnitude of the variant effect on the analyzed phenotypes was very similar.
Data about a GWAS on NAFLD performed in female adults also from the NASH CRN were not included in this meta-analysis because rs738409 was not captured by the chip.9 Although 15 SNPs in PNPLA3 were captured by the GWAS and none of them showed significant association with NAFLD, it is important to note that among five variants, only rs2076211 was in moderate linkage disequilibrium (r2: 0.65) with rs738409, precluding any imputation even assuming access to the complete dataset.
An interesting observation from the analysis of the significant association between rs738409 and liver enzymes is the 28% increase in serum ALT levels observed in GG homozygous individuals, a relevant aspect that might be considered when selecting patients for clinical trials, deciding on the indication for liver biopsy, and evaluating the magnitude of any treatment response based on serum ALT levels.
The question of whether some NAFLD-associated metabolic comorbidities such as insulin resistance and obesity are also influenced by the rs738409 variant can also be answered by this meta-analysis, because all the included studies in which these phenotypes were measured showed a lack of significant difference among genotypes for HOMA-IR (data from 1,404 individuals), glucose and insulin levels (data from 4,160 and 1,721 subjects, respectively), and BMI (data from 4,141 individuals). These results, which are unlikely to be due to a lack of statistical power, are in agreement with data from large population-based association studies from several consortia that evaluated subjects without the associated NAFLD phenotype.4
As a final observation, we strongly encourage the inclusion in genetic association studies of all the data in standard format (complete record of all of the phenotypes available by genotypes and genotype counts for cases and controls), either in the print or supplementary electronic material, to facilitate further meta-analysis with more robust estimates of the genetic effect.
The Human Genome Epidemiology Network recommends that a meta-analysis is necessary before the evidence for a particular association can be regarded as strong.31 Accordingly, we provide strong statistical evidence for the impact of the rs738409 variant on the clinical course of NAFLD, including susceptibility to liver injury and fibrosis progression. A conservative population attributable risk (%) estimated for the GG genotype may be as high as 33 in Hispanics, with a minor allele frequency (MAF) of around 50%, and as low as 9 in African-Americans, who have an MAF below 25%. Ethnic differences in susceptibility to NAFLD are evident, as shown in previous studies that described a high prevalence in Hispanics and a significantly lower prevalence in African-Americans,32 but they probably reflect differences in the variant MAF and not a different risk associated with the variant. Considering that NAFLD has also reached epidemic proportions in China and Japan,33 a strength of this study is the evaluation of the effect of the variant on NAFLD in Asians, which is almost identical to that in Caucasians (3.26 versus 3.11) (Fig. 1a).
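The text does not state how the population attributable risk was calculated. As an illustration only, the sketch below uses Levin's formula, PAR = p(RR − 1) / (1 + p(RR − 1)), where p is the frequency of the GG genotype, taken here as MAF squared under an assumed Hardy-Weinberg equilibrium, and RR is approximated by a round illustrative value of 3. The MAF of 0.22 for the second population is a hypothetical value consistent with "below 25%".

```python
def genotype_frequency_gg(maf):
    """Expected GG frequency under Hardy-Weinberg equilibrium (assumption)."""
    return maf ** 2

def population_attributable_risk(exposure_frequency, relative_risk):
    """Levin's population attributable risk, returned as a percentage."""
    excess = exposure_frequency * (relative_risk - 1.0)
    return 100.0 * excess / (1.0 + excess)

# Illustrative inputs: RR of 3 is a rounded stand-in for the pooled ORs.
illustrative_rr = 3.0
for label, maf in [("MAF 0.50", 0.50), ("MAF 0.22 (hypothetical)", 0.22)]:
    gg_frequency = genotype_frequency_gg(maf)
    par = population_attributable_risk(gg_frequency, illustrative_rr)
    print(label, round(par, 1))
```

With these inputs the formula gives about 33% for an MAF of 0.50 and roughly 9% for an MAF of 0.22, which is in line with the range quoted above. This shows how a similar per-genotype risk can translate into very different population burdens when allele frequencies differ.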
The future challenge of the medical community will be to manage the expectations about rapidly translating this knowledge into more individualized decision-making and personalized medicine. Nevertheless, the observed effect of the rs738409 variant on the behavior of NAFLD is perhaps one of the strongest ever reported for a common variant modifying the genetic susceptibility for complex diseases.