Objective: The excessive consumption of confectionery might have adverse effects on human health. To screen genetic factors associated with confectionery-intake frequency, a genome-wide association study (GWAS) in Japan was conducted.
Design and Methods: For the discovery phase (stage 1), we conducted a GWAS of 939 noncancer patients in a cancer hospital. Additive models were used to test associations between genotypes of approximately 500,000 single-nucleotide polymorphisms (SNPs) and the confectionery-intake score (based on intake frequency). We followed up association signals with P < 1 × 10−5 and minor allele frequency >0.01 in stage 1 by genotyping the SNPs of 4,491 participants in a cross-sectional study within a cohort (replication phase [stage 2]).
Results: We identified 12 SNPs in stage 1 that were potentially related to confectionery intake. In stage 2, this association was replicated for one SNP (rs822396; P = 0.049 for stage 2 and 4.2 × 10−5 for stage 1+2) in intron 1 of the ADIPOQ gene, which encodes the adipokine adiponectin.
Conclusions: Given the biological plausibility and previous relevant findings, the association of an SNP in the ADIPOQ gene with a preference for confectionery is worthy of follow-up and provides a good working hypothesis for experimental testing.
Confectionery or snacks are often sweet, fatty, and energy dense, so their excessive consumption might have adverse effects on human health in terms of obesity and the metabolic syndrome [1, 2]. Genetic as well as environmental factors have recently been implicated as correlates of the consumption of sweet foods. For example, a genetic variation in TAS1R2, a sweet taste-receptor subunit, was reported to affect habitual consumption of sugars in overweight and obese individuals, whereas a functional polymorphism of the dopamine transporter SLC6A3 was related to the intake of high-calorie sweet foods among women with high depressive symptoms. Furthermore, polymorphisms of the leptin gene (LEP) and the leptin receptor gene (LEPR) were associated with sweet preference. Identification of genetic polymorphisms associated with a preference for sweet foods might therefore help us to understand the physiology and pathophysiology of eating behaviors and addiction to sweet foods [6, 7].
However, most genetic findings to date derive from candidate-gene approaches in which biologically plausible associations were tested between genetic factors and the consumption of sweet foods. In addition to candidate-gene approaches, a few genome-wide linkage studies have addressed this topic [8, 9]; however, these studies used only several hundred microsatellite markers, so the regions of the genome identified were too broad to locate relevant genes.
Recent genome-wide association studies (GWASs) have utilized several hundred thousand single-nucleotide polymorphisms (SNPs) as a means of finding genes that are potentially related to various phenotypes without prior hypotheses. GWASs have been used to identify novel candidate or target genes regulating obesity and diabetes; however, a preference for the consumption of sweet foods has not been examined as a phenotype in recent GWASs. This study therefore used a GWAS to screen genetic factors associated with the intake frequency of confectionery throughout the human genome, followed by a replication study in another independent population.
In the discovery phase (stage 1), we conducted a GWAS of 977 participants of the Hospital-based Epidemiological Research Program II at Aichi Cancer Center Hospital (HERPACC-II) between January 2001 and September 2005. All participants were enrolled during their first visit to the Aichi Cancer Center Hospital (ACCH; Nagoya, Japan). The framework of the HERPACC-II has been described elsewhere [12, 13]. Briefly, all first-visit outpatients to the ACCH aged 20-79 years were asked to fill in a self-administered questionnaire about their lifestyle and medical factors, and trained interviewers checked their responses. The outpatients were also asked to provide a blood sample. In total, 96.7% of contacted patients completed the questionnaire and about 50% of respondents provided a blood sample.
The current analyses were limited to noncancer participants; approximately 35% of the subjects were diagnosed with cancer within 1 year of their first visit. Our previous study showed that the lifestyle patterns of first-visit outpatients without cancer corresponded well with those of individuals who were randomly selected from the general population of Nagoya city.
Association signals selected in the stage 1 GWAS were followed up by genotyping the SNPs in 4,491 participants aged 35-69 years in a cross-sectional study within the Japan Multi-Institutional Collaborative Cohort (J-MICC) Study (replication phase [stage 2]). We previously reported the detailed design of this cross-sectional study and the J-MICC Study as a whole. In brief, participants in the current study completed a questionnaire about lifestyle and medical factors, and donated a blood sample at the time of the J-MICC Study baseline survey. J-MICC Study participants were recruited from 10 areas throughout Japan between 2004 and 2008, and included community citizens, first-visit patients to a cancer hospital, and health check-up examinees. The response rates for the baseline survey by study area varied according to the source population, and were recorded as 7.0-24.0% in the community (recruitment by mailing invitation letters or distributing leaflets), 58.4% in first-visit patients to a cancer hospital, and 14.0-65.5% in health check-up examinees. The respondents for the cross-sectional study comprised 400-600 participants who were enrolled consecutively from each area of the J-MICC Study, with the exception of two areas (Kyoto and Tokushima) where fewer participants were recruited.
All participants in this study gave their written informed consent prior to inclusion. The ethics committees of Kyoto University Graduate School of Medicine (Kyoto, Japan) and Aichi Cancer Center approved the protocol for the stage 1 GWAS. The committee of Aichi Cancer Center also approved the protocols of the HERPACC-II, and the committees of Nagoya University School of Medicine (Nagoya, Japan), Aichi Cancer Center, and all participating research institutions approved the protocols of the J-MICC Study, including the current cross-sectional study. The present study was conducted in accordance with the World Medical Association Declaration of Helsinki and its later amendments.
We defined the intake score for confectionery as described below, and treated it as a quantitative trait in the current GWAS to seek relevant quantitative trait loci. Participants in both the stage 1 and stage 2 studies were asked to report their usual frequency of consumption of 43 food items in a self-administered questionnaire with the following eight possible responses: 1 = almost never; 2 = 1-3 times per month; 3 = 1-2 times per week; 4 = 3-4 times per week; 5 = 5-6 times per week; 6 = once per day; 7 = twice per day; and 8 = ≥3 times per day. The respondents were requested to circle one of the numbers to provide an answer. Western-style and Japanese-style confectionery were included as two separate food items in the questionnaire. The responses were then converted into intake scores of 0, 0.1, 0.2, 0.5, 0.8, 1, 2, and 3, respectively, and the sum of the two intake scores was used for association analysis. Generic or grouped questions about the consumption of confectionery (not those on individual items such as cookies and sponge cakes) are generally used in studies in Japan and have been validated through comparisons with diet records [21, 22].
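The conversion from questionnaire responses to the summed intake score can be sketched as follows. This is a minimal illustration rather than the study's own code: the names `RESPONSE_TO_SCORE` and `confectionery_intake_score` are hypothetical, and returning `None` for a missing or invalid response is an assumption that mirrors the exclusion of participants with missing intake data.

```python
# Response codes 1-8 mapped to the intake scores listed in the text.
RESPONSE_TO_SCORE = {1: 0.0, 2: 0.1, 3: 0.2, 4: 0.5, 5: 0.8, 6: 1.0, 7: 2.0, 8: 3.0}


def confectionery_intake_score(western_response, japanese_response):
    """Sum the scores for Western- and Japanese-style confectionery.

    Returns None when either response is missing or outside 1-8
    (assumption: such participants are dropped from the analysis).
    """
    if western_response not in RESPONSE_TO_SCORE:
        return None
    if japanese_response not in RESPONSE_TO_SCORE:
        return None
    return RESPONSE_TO_SCORE[western_response] + RESPONSE_TO_SCORE[japanese_response]
```

For example, a participant reporting Western-style confectionery 1-2 times per week (response 3) and Japanese-style confectionery 1-3 times per month (response 2) would receive a score of 0.2 + 0.1.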
For stage 1, the DNA of each participant was extracted from the buffy-coat fraction using a DNA Blood Mini kit (Qiagen Group, Tokyo, Japan). All 977 samples were genotyped on an Illumina Human610-Quad BeadChip (Illumina, San Diego, CA, USA) with 576,736 SNP markers at the Center for Genomic Medicine of Kyoto University Graduate School of Medicine.
We excluded two participants whose recorded gender was inconsistent with genotyping data. A further sample was excluded because the call rate was below the threshold (0.95), another was excluded because of an extremely high proportion of heterozygotes among the genotyped SNPs, and two were excluded because they were from closely related participants with a pi-hat >0.4 (estimated using the PLINK whole-genome association-analysis toolset). For each closely related pair, we excluded the member with the lower call rate. Based on principal component analysis, no outlier was identified in terms of ancestry from East-Asian populations. In addition, 32 participants were excluded because of missing data on the intake frequency of confectionery, leaving a total of 939 for the present analysis. After removing SNPs that failed the quality control criteria (Hardy–Weinberg equilibrium P-value ≥ 1 × 10−6 [excluded SNPs: n = 277]; SNP call rate >0.95 [n = 2,921]; and minor allele frequency [MAF] ≥ 0.01 [n = 82,414]), 491,738 markers were used for the analysis (some SNPs were excluded based on two or more criteria). The procedures used to select candidate SNPs for stage 2 are described below.
DNA samples for stage 2 were prepared from the buffy coat or whole blood using a BioRobot M48 Workstation (Qiagen Group) or an automatic nucleic-acid isolation system (NA-3000; Kurabo, Osaka, Japan). We then genotyped the 12 SNPs identified in stage 1 using the multiplex polymerase chain reaction-based Invader assay (Third Wave Technologies, Madison, WI, USA) at the Laboratory for Genotyping Development of the Center for Genomic Medicine at RIKEN (Yokohama, Japan). The call rates for all 12 SNPs were 99.6% or higher at stage 2.
In stage 1, PLINK software version 1.07 (http://pngu.mgh.harvard.edu/purcell/plink/) was used to test the association between SNP genotypes and the confectionery-intake score. We utilized standard additive models for assessing associations, and adjusted for gender and age using general linear models. The intake score was regressed on the number of minor alleles for each SNP, and the regression coefficient (β) was estimated, with gender and age included as covariates in the linear model. The SNPs were chosen if the additive model P was <1 × 10−5. For SNPs within one linkage disequilibrium (LD) block (defined by pairwise r2 > 0.8), we selected the one with the lowest P value. SNPs were excluded from stage 2 analysis when both of the following two conditions were satisfied: the MAF was <0.05; and they were not within or near (<50 kb) a gene. The genome-wide –log10P value plot (Manhattan plot) from stage 1 was depicted using Haploview version 4.2 (http://www.broadinstitute.org/scientific-community/science/programs/medical-and-population-genetics/haploview/haploview). The Q–Q plot was drawn from the PLINK output using Stata version 11.1 (Stata Corporation, College Station, TX, USA).
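The sample accounting and the SNP-level quality control thresholds above can be written out as a short sketch. The function name `passes_snp_qc` and the dictionary keys are hypothetical labels introduced for illustration; the thresholds and counts are those stated in the text.

```python
def passes_snp_qc(hwe_p, call_rate, maf):
    """Return True if a SNP meets all three retention criteria from the text."""
    return hwe_p >= 1e-6 and call_rate > 0.95 and maf >= 0.01


# Sample-level exclusions reported for stage 1 (labels are illustrative).
SAMPLE_EXCLUSIONS = {
    "gender_mismatch": 2,
    "low_call_rate": 1,
    "excess_heterozygosity": 1,
    "closely_related": 2,
    "missing_confectionery_intake": 32,
}

GENOTYPED_SAMPLES = 977
remaining_samples = GENOTYPED_SAMPLES - sum(SAMPLE_EXCLUSIONS.values())
```

With these counts, `remaining_samples` works out to 939, the number of participants analyzed in stage 1.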
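The additive model described above, in which the intake score is regressed on the minor-allele count with gender and age as covariates, can be sketched in plain Python. This is an illustration only and not the PLINK or SAS implementation used in the study: `solve_linear_system` and `additive_model` are hypothetical names, gender is assumed to be coded 0/1, and the P value uses a normal approximation to the t distribution, which is reasonable only for large samples.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def additive_model(scores, minor_allele_counts, genders, ages):
    """Ordinary least squares of score on allele count (0/1/2), gender, and age.

    Returns (beta, standard_error, p_value) for the allele-count term.
    """
    rows = [
        [1.0, float(count), float(gender), float(age)]
        for count, gender, age in zip(minor_allele_counts, genders, ages)
    ]
    n, k = len(rows), 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)

    residual_ss = sum(
        (y - sum(c * x for c, x in zip(coef, r))) ** 2 for r, y in zip(rows, scores)
    )
    sigma2 = residual_ss / (n - k)

    # Column 1 of (X'X)^-1 gives the variance factor for the allele-count term.
    inverse_column = solve_linear_system(xtx, [0.0, 1.0, 0.0, 0.0])
    standard_error = math.sqrt(sigma2 * inverse_column[1])

    z = coef[1] / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided, normal approximation
    return coef[1], standard_error, p_value
```

Under this coding, β is the estimated change in intake score per additional copy of the minor allele, holding gender and age fixed.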
Using stage 2 genotyping data, we repeated the association analyses carried out in stage 1. The combined datasets of stages 1 and 2 studies were analyzed in the same manner to yield pooled P values. For the replicated SNPs (P < 0.05 in stage 2), the mean confectionery intake scores were computed according to the genotype of the whole, or subgroups of, the pooled population. In this analysis, the heterozygotes were combined with the homozygotes of minor alleles because of the small number of minor homozygotes. The distribution of the intake score was skewed toward lower values compared with the normal distribution. However, the mean scores by genotype would approximate a normal distribution in accordance with the central limit theorem because of the relatively large sample size.
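The genotype grouping used for the mean scores, with heterozygotes and minor homozygotes combined, can be illustrated with a small sketch. The function name `mean_score_by_carrier_status` is hypothetical, and genotypes are assumed to be coded as the number of minor alleles (0, 1, or 2).

```python
from statistics import mean


def mean_score_by_carrier_status(scores, minor_allele_counts):
    """Return (mean for major homozygotes, mean for carriers of >=1 minor allele)."""
    non_carriers = [s for s, g in zip(scores, minor_allele_counts) if g == 0]
    carriers = [s for s, g in zip(scores, minor_allele_counts) if g >= 1]
    return mean(non_carriers), mean(carriers)
```

The difference between the two returned means corresponds to the kind of between-genotype difference in intake score that is reported by stratum in the results.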
Statistical analyses for stage 2 and the pooled dataset were performed using Statistical Analysis System (SAS) version 9.1 (SAS Institute Inc., Cary, NC, USA). We repeated the analysis for stage 1 using SAS, and reproduced the results obtained using the PLINK software. Accordance with the Hardy–Weinberg equilibrium was assessed by the exact test using Stata software. The background characteristics of participants were compared by the t-test or the χ2 test. BMI [kg/m2] was calculated on the basis of self-reported height and body weight, because measured values were not available in two study areas. In the remaining eight areas, however, the BMI based on self-reported height and weight was similar to that derived from measured height and weight; the intraclass correlation coefficient between the two indices was 0.98 in both men and women. To consider the potential effect of BMI, we repeated the association analysis of data from stages 1 and 2, or 1+2 with further adjustment for BMI (as a continuous variable).
We initially decided to examine the association between SNPs and the sum of the two intake scores for Western- and Japanese-style confectionery, and therefore did not run the analyses separately. However, for SNPs with P < 0.05 in the stage 2 study, we did conduct separate analyses for Western- and Japanese-style confectionery because the correlation was moderate between the intakes of the two types (correlation coefficient, 0.33 for stage 1+2 data).
The participants of the stage 2 study were older and slightly more likely to be female than those of the stage 1 study, whereas former drinkers and current smokers were more prevalent in stage 1 (Table 1). BMI was slightly higher in the stage 2 study, and the confectionery-intake score was comparable between the two studies. The mean intake score ± SD was 0.30 ± 0.32 for stage 1 and 0.28 ± 0.27 for stage 2. The distribution of the score was skewed toward lower values compared with the normal distribution: the median was 0.2 (interquartile range, 0.1-0.4) for stage 1 and 0.2 (0.1-0.3) for stage 2.
Table 1. Background characteristics of participants
Characteristic | Stage 1 | Stage 2 | Stages 1+2 | P (stage 1 vs. 2)
Age (years) | 47.9 ± 16.3 | 55.8 ± 8.9 | 54.4 ± 10.9 |
Current drinkers (%) | | | |
Current smokers (%) | | | |
BMI (kg/m2) | 22.5 ± 3.1 | 23.2 ± 3.2 | 23.0 ± 3.2 |
Confectionery intake score | 0.30 ± 0.32 | 0.28 ± 0.27 | 0.28 ± 0.28 |
Values are means ± SD.
In stage 1, we found 22 SNPs with P values for the additive model <1 × 10−5. The selected SNPs are shown as spots above the blue line at P = 1 × 10−5 in the genome-wide –log10P value plot (Manhattan plot; Figure 1). The Q–Q plot indicates that the observed P values that were less than the cutoff (i.e., 1 × 10−5) all deviated from the expected P value under the null hypothesis (Figure 2). The genomic inflation factor (λ) was 1.003.
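For readers unfamiliar with the genomic inflation factor, a common definition is the median of the observed 1-df chi-square statistics divided by the median expected under the null (about 0.455). The sketch below follows that median-based definition, which is an assumption about how λ was obtained here; the function name `genomic_inflation_factor` is hypothetical.

```python
from statistics import NormalDist, median

MEDIAN_CHI2_1DF = 0.4549364  # median of the chi-square distribution with 1 df


def genomic_inflation_factor(p_values):
    """Median-based lambda computed from two-sided P values in (0, 1]."""
    normal = NormalDist()
    # A two-sided P value p corresponds to a 1-df chi-square of z**2,
    # where z is the standard normal quantile at p / 2.
    chi2 = [normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / MEDIAN_CHI2_1DF
```

A value close to 1, such as the 1.003 reported above, indicates little systematic inflation of the test statistics from population stratification or other confounding.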
Of the 22 SNPs, five were excluded because they were located within the same LD block as a selected SNP with a lower P value. An additional five SNPs were omitted as they were not within or near a gene (<50 kb) and their MAF was <0.05. Eventually, we selected 12 SNPs (Table 2) from HERPACC-II participants for the stage 2 follow-up based on predefined criteria. The genotype distributions of these 12 SNPs were in Hardy–Weinberg equilibrium (P > 0.05) in both studies, with the exceptions of rs12351510 in stage 1 (P = 0.014) and rs2839519 in stage 2 (P = 0.020). The deviation from the expected genotype distribution, however, was relatively small, at less than 1% in both cases.
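The three-step selection of stage 2 candidates can be sketched as a filter over per-SNP records. This is an illustration under stated assumptions: the record fields (`p`, `maf`, `ld_block`, `near_gene`) and the function name `select_candidates` are hypothetical, and LD blocks are assumed to have been assigned beforehand from pairwise r2 > 0.8.

```python
def select_candidates(snps, p_cutoff=1e-5, maf_cutoff=0.05):
    """Apply the P cutoff, keep the lowest-P SNP per LD block, then drop
    SNPs that are both rare (MAF < cutoff) and not within 50 kb of a gene."""
    significant = [s for s in snps if s["p"] < p_cutoff]

    best_per_block = {}
    for s in significant:
        current = best_per_block.get(s["ld_block"])
        if current is None or s["p"] < current["p"]:
            best_per_block[s["ld_block"]] = s

    return [
        s
        for s in best_per_block.values()
        if not (s["maf"] < maf_cutoff and not s["near_gene"])
    ]


# Counts reported in the text: 22 signals, 5 removed by LD, 5 removed as
# rare and intergenic.
selected_count = 22 - 5 - 5
```

The arithmetic at the end reproduces the 12 SNPs carried forward to stage 2.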
Table 2. SNPs identified in GWAS analysis for confectionery intake score (n = 939 for stage 1 and 4,491 for stage 2)
a Alleles are indexed to the forward strand of the National Center for Biotechnology Information (NCBI) Build 36.3.
Among the 12 selected SNPs, the association of rs822396 with the confectionery-intake score was replicated in stage 2 of the J-MICC population (P = 0.049; Table 2). In the pooled analysis of stages 1 and 2 data, the smallest P was found for rs822396 (P = 4.2 × 10−5), followed by rs17042603, rs13356198, and rs1147522. The polymorphism rs822396 was shown to be an SNP in intron 1 of the ADIPOQ gene (IVS1-3971A>G). Further adjustment for BMI did not substantially alter the results. The P values for rs822396 in stages 1, 2, and 1+2 were 4.2 × 10−7, 0.049, and 4.3 × 10−5, respectively. The P values for other selected SNPs were also similar to those without BMI adjustment (data not shown), although those in stage 1 for rs10810211 and rs6039211 were 1.1 × 10−5.
The association of the rs822396 polymorphism with the confectionery-intake score was stronger for Japanese-style than for Western-style confectionery: the respective P values in stages 1, 2, and 1+2 were 1.9 × 10−8, 0.013, and 4.8 × 10−6 for Japanese-style confectionery, and 5.1 × 10−3, 0.73, and 0.083 for Western-style confectionery.
We compared the mean confectionery-intake score by the rs822396 genotype (major homozygotes versus heterozygotes + minor homozygotes) and the background characteristics of participants in the pooled dataset (Table 3). The score was higher among participants with at least one minor allele than among those without. Moreover, this difference was consistently observed across the strata of gender, age, smoking and drinking habits, and BMI. It was particularly large (0.083, P = 5.1 × 10−6) in younger participants aged <55 years.
Table 3. Mean confectionery intake score by rs822396 genotype and background characteristics of participants in the pooled dataset (stages 1 and 2 studies)
In this study, we identified 12 candidate SNPs that were potentially related to confectionery intake in a GWAS. Among them, the association was replicated in an independent population for one SNP (rs822396 or IVS1-3971A>G) in intron 1 of the ADIPOQ gene.
The ADIPOQ gene on chromosome 3q27 encodes adiponectin, which is an adipokine that is extensively expressed in adipose tissue, and is a highly abundant plasma protein with circulating levels that are, in part, genetically controlled [27, 28]. Adiponectin has excited intense interest because of the robust negative correlations of its circulating levels with indices of insulin resistance and the risk of type 2 diabetes, as well as their consistent inverse associations with fat mass. Because central nervous insulin action might be related to inhibition of eating behavior, and might be negatively correlated with peripheral insulin resistance and obesity, insulin resistance associated with hypoadiponectinemia could be involved in increases in food intake.
In addition to its potential role as an insulin-sensitizing adipokine, it is hypothesized that adiponectin plays an important role in the regulation of energy homeostasis, including appetite stimulation [27, 30], although its effects on food intake show considerable diversity across studies. Collectively, however, previous studies show that adiponectin directly and/or indirectly affects eating activities, which might partly explain our current findings.
Physiologically, adiponectin has been shown to stimulate food intake by activating AMP-activated protein kinase (AMPK) in the arcuate hypothalamus via its receptor AdipoR1 [27, 31]. The putative downstream pathways for food-intake regulation in response to hypothalamic AMPK are acetyl-coenzyme A carboxylase/malonyl-coenzyme A/carnitine palmitoyltransferase-1/fatty-acid oxidation and mammalian target of rapamycin signaling. However, it remains to be investigated whether adiponectin specifically affects consumption of energy-dense foods such as confectionery. Our relatively broad screening approach with a cutoff of P < 1 × 10−5 in stage 1 might have resulted in the observed low replication rate of only one of the 12 candidate SNPs identified in stage 1. Nevertheless, this replicated SNP appears to be biologically plausible as an indicator of a gene involved in eating behaviors.
The ADIPOQ gene spans 17 kb, contains three exons, and its translation starts at exon 2 and ends at exon 3 [32, 33]. SNPs throughout the gene or nearby have recently been related to the circulating levels of adiponectin in GWASs or in studies genotyping tag SNPs [33, 37]. SNPs representing the most significant associations, however, vary considerably among studies. They are distributed throughout or near the ADIPOQ gene from upstream (e.g., rs864265), through the promoter region (e.g., rs822387, rs17300539), intron 1 (e.g., rs16861210, rs17366568), exon 2 (e.g., rs2241766), and intron 2 (e.g., rs3774261), to the 3′ untranslated region (UTR; e.g., rs6773957, rs2082940). These SNPs are frequently in LD with one another, so that researchers cannot easily focus on the genetic polymorphisms that are responsible for the adiponectin levels.
The rs822396 polymorphism associated with the confectionery-intake score in the present study was also in LD, albeit weak-to-moderate, with some of the previously mentioned polymorphisms, including rs864265, rs822387, rs3774261, rs6773957, and rs2082940 in a Japanese population within the International Haplotype Map (HapMap) project (http://hapmap.ncbi.nlm.nih.gov/). Therefore, even if the rs822396 polymorphism is not directly linked with circulating adiponectin levels, genetic polymorphisms of ADIPOQ around the rs822396 SNP might control blood adiponectin levels, and could be associated with a propensity to favor foods of high-energy density such as confectionery. Additionally, alternative splicing sites of the ADIPOQ gene have been found near this SNP (within 4 kb upstream and downstream of the SNP; http://www.ensembl.org/Homo_sapiens/Gene/Splice?db=core;g=ENSG00000181092;r=3:186560479-186576252;t=ENST00000444204). Thus, the rs822396 SNP might affect the expression of ADIPOQ through alternative splicing.
Interestingly, the Québec Family Study by Choquette et al., which involved genome-wide linkage analysis, found linkage on chromosome 3q27.3 with intakes of energy, lipid, and carbohydrate. As the 3q27 region harbors the ADIPOQ gene, this study might corroborate these earlier findings, suggesting that a variation of this gene is associated with the intake of high-calorie foods.
The association with the confectionery-intake score showed genome-wide significance for SNP rs2839525 in the stage 1 study (Table 2, P = 5.5 × 10−9). This was, however, not replicated in stage 2. Although we could not identify the precise reason for this discrepancy, highly significant associations found in GWASs have often failed to be replicated.
Although we identified and replicated an association of the rs822396 polymorphism with the confectionery-intake score, it did not reach genome-wide significance (P < 10−8) either in stage 1 alone or in the pooled analysis of stages 1 and 2, and the association was comparatively weaker in stage 2. This might have been partly due to the relatively small number of participants (n = 939) in stage 1, or the simplistic self-reporting method used to assess the intake frequencies of Japanese-style and Western-style confectionery. Moreover, the difference in background characteristics of participants between the studies might have partly accounted for the weaker association in stage 2 than in stage 1. The two populations differed notably in age: the average age for the stage 2 group was 8 years higher than that for stage 1 (Table 1). The association of the rs822396 polymorphism with the confectionery-intake score was much stronger in the younger group (<55 years; Table 3). When we analyzed data only from the stage 2 study by age stratum (<55 and ≥55 years), as in Table 2, the association of SNP rs822396 was stronger in the younger age stratum (n = 1,772; β for additive model = 0.0487; P = 0.008) than in the older one (n = 2,717; β = 0.0090; P = 0.56). The older age distribution of the population in stage 2 might therefore have attenuated the SNP association compared with that in stage 1; the association might have been more replicable if the stage 2 population had been more similar in age distribution to that in stage 1.
Although a more detailed questionnaire including questions on portion sizes might have provided more conclusive findings, informative data were obtained in a previous familial study based on simple questions about the intake frequencies of sweet foods. Furthermore, the association of an SNP in the ADIPOQ gene with the intake of high-energy foods such as confectionery is biologically plausible and supports the findings of a previous analysis.
In summary, we found that an SNP in the ADIPOQ gene was correlated with a preference for confectionery through a two-stage GWAS with discovery and replication phases. Given the biological plausibility and relevant previous findings, this association warrants further follow-up and provides a good working hypothesis for experimental testing.
We thank Miki Kokubo at the Center for Genomic Medicine of Kyoto University Graduate School of Medicine for technical assistance with the stage 1 GWAS, Kyota Ashikawa, Tomomi Aoi, and other members of the Laboratory for Genotyping Development at the Center for Genomic Medicine of RIKEN for support with genotyping in the stage 2 study, Yoko Mitsuda, Keiko Shibata, and Etsuko Kimura at the Department of Preventive Medicine of Nagoya University Graduate School of Medicine, Miki Watanabe and Isao Oze at the Division of Epidemiology and Prevention of the Aichi Cancer Center Research Institute, Fusako Katsurada at the Department of Health Science of Shiga University of Medical Science, and Mitsuhiko Matsushita and Yasunobu Sagara at the Tokushima Prefecture Health Examination Center for their cooperation, technical assistance, and valuable comments. We also thank Shinkan Tokudome at the National Institute of Health and Nutrition (formerly Nagoya City University), Chiho Goto at Nagoya Bunri University, Nahomi Imaeda at Nagoya Women's University, Yuko Tokudome at Nagoya University of Arts and Sciences, Masato Ikeda at the University of Occupational and Environmental Health, and Shinzo Maki at the Aichi Prefectural Dietetic Association for providing the food frequency questionnaire.
The discovery phase (stage 1) of this study was supported by a Grant for the CREST program of the Japan Science and Technology Agency, and a Grant for the Third Term Comprehensive Control Research for Cancer from the Ministry of Health, Labor and Welfare of Japan (H21-3rdCancer-G003). The replication phase (stage 2) was supported by Grants-in-Aid for Scientific Research on Priority Areas (No. 17015018) and Innovative Areas (No. 221S0001) from the Japanese Ministry of Education, Culture, Sports, Science, and Technology.