
Genetic factors related to gastric cancer susceptibility identified using a genome-wide association study


To whom correspondence should be addressed.

E-mail: nsaeki@ncc.go.jp


Gastric cancer (GC) is one of the major malignant diseases worldwide, especially in Asia, where Japan and Korea have the highest incidence in the world. Gastric cancer is classified into intestinal and diffuse types. While the former is almost always initiated by Helicobacter pylori infection, the latter seems to include cases in which the role of infection is limited, if present at all, and a contribution of genetic factors is anticipated. Previously, we performed a genome-wide association study (GWAS) on diffuse-type GC using single nucleotide polymorphisms (SNP) catalogued for the Japanese population (JSNP), and identified the prostate stem cell antigen (PSCA) gene, which encodes a glycosylphosphatidylinositol-anchored cell surface antigen, as a GC susceptibility gene. From the second candidate locus identified using the GWAS, 1q22, we found the Mucin 1 (MUC1) gene, which encodes a cell membrane-bound mucin protein, as another gene related to diffuse-type GC. A two-allele analysis based on risk genotypes of the two genes revealed that approximately 95% of the Japanese population have at least one of the two risk genotypes, and approximately 56% of the population have both risk genotypes. The two-SNP genotype might offer ample room to further stratify a high GC risk subpopulation in Japan and Asia by adding other genetic and/or non-genetic factors. Recently, GWAS on the Chinese population disclosed an additional three GC susceptibility loci: 3q13.31, 5p13.1 and 10q23. (Cancer Sci 2013; 104: 1–8)

Gastric cancer (GC) is one of the major malignant diseases and the second leading cause of cancer death worldwide.[1] It is usually classified into two types, intestinal and diffuse, a classification that was originally based on histological observation but has more recently been thought to reflect pathogenesis.[2] The majority of intestinal-type GC (IGC) arises through a sequence of inflammatory changes of the gastric epithelium resulting from bacterial infection: Helicobacter pylori (HP) infection – chronic inflammation – intestinal metaplasia – dysplasia – adenocarcinoma. In contrast, de novo diffuse-type GC (DGC) is thought to emerge in a histologically almost normal epithelium as a consequence of some genetic change in gastric stem cells and/or epithelial precursor cells, although some cases of DGC might represent dedifferentiated stages of IGC, and a contribution of HP is also suggested.[3] In other words, the pathogenesis of IGC is apparently initiated by infection with HP, a class I carcinogen acknowledged by the WHO, and therefore IGC is essentially preventable by eradicating HP infection. However, DGC might develop earlier in life than IGC,[4] and no definite DGC-specific environmental risk factor has been established. Therefore, we have so far neither a solid strategy nor a promising theory with which to envision a consistent reduction in the incidence of DGC.

The incidence of GC has strong geographical and ethnic characteristics. For example, it is one of the rare cancers in North America and Europe, while its incidence is markedly high in Japan and Korea. This can be explained roughly by the difference in regional prevalence of HP infection.[4] However, Japan has a high incidence of GC (age-standardized incidence rate 62.7/100 000) but lower HP seroprevalence (39.3%) than, for example, Bangladesh (92%) and India (79%), which have much lower GC incidences of 1.6/100 000 and 5.7/100 000, respectively, suggesting the contribution of some other factor to the carcinogenesis of gastric epithelial cells.[5] Moreover, the Helicobacter and Cancer Collaborative Group reported that HP infection was not associated with the overall risk of GC developing in the cardia of the stomach.[6]

Genome-wide association study (GWAS) of genetic factors for GC development

Genome-wide association studies have been successful in exploring genetic susceptibility factors for a number of polygenic or so-called lifestyle-related diseases based on the common disease – common variant hypothesis.[7, 8] The current polymorphic markers of choice in GWAS are single nucleotide polymorphisms (SNP), and the spectrum and frequency of SNP differ among ethnic populations. Japan preceded other countries in preparing to conduct GWAS, because SNP in the Japanese population (JSNP) had already been catalogued in the early 2000s by Dr Yusuke Nakamura at the Institute of Medical Science, The University of Tokyo. The JSNP database led to a number of fruitful GWAS on genetic factors for common diseases in the late 2000s.[9] As a part of the so-called Millennium Project in Japan, a GWAS on GC was performed as a two-step association study.[10, 11] The first step was performed on 85 576 SNP using 188 DGC cases and 752 references, and the second step was performed on 2753 selected SNP with 749 DGC cases and 750 controls. The study finally listed the top 10 SNP associated with DGC with statistical significance, which included four SNP located in chromosome 8q24.3 and two SNP in 1q22 (Table 1).[12] The subsequent linkage disequilibrium (LD) analyses revealed two genes in the LD block at 8q24.3 and five genes at 1q22.[12, 13]

Table 1. Gastric cancer susceptibility loci identified using a genome-wide association study
| Locus | Representative SNP (major/minor allele) | Odds ratio (95% CI) | P-value | Ethnicity | Cancer type | Nearest gene | Primary report |
|---|---|---|---|---|---|---|---|
| 1q22 | rs2070803 (G/A) | 1.63 (1.33–1.98)^a | 1.2 × 10^−6 ^a | Japanese | Diffuse | MUC1 | Sakamoto et al.[12] |
| 3q13.31 | rs9841504 (C/G) | 0.76 (0.69–0.83)^b | 1.7 × 10^−9 ^b | Chinese | Non-cardia | ZBTB20 | Shi et al.[23] |
| 5p13.1 | rs13361707 (T/C) | 1.41 (1.32–1.49)^b | 7.6 × 10^−29 ^b | Chinese | Non-cardia | PRKAA1 | Shi et al.[23] |
| 8q24.3 | rs2976392 (A/G) | 1.62 (1.38–1.89)^a | 1.1 × 10^−9 ^a | Japanese | Diffuse | PSCA | Sakamoto et al.[12] |
| 10q23 | rs2274223 (A/G) | 1.31 (1.19–1.43)^a | 8.40 × 10^−9 ^a | Chinese | Cardia, non-cardia | PLCE1 | Abnet et al.[24] |

CI, confidence interval; SNP, single nucleotide polymorphism.
^a Allelic model.
^b Additive model.

In the 8q24.3 locus, the prostate stem cell antigen (PSCA) gene was identified as a DGC susceptibility gene, with a significant association between DGC and two SNP in the gene, rs2976392 and rs2294008 (rs2976392: 926 cases, 1397 controls, allele-specific odds ratio = 1.71, 95% confidence interval = 1.50–1.94, P = 1.5 × 10^−16; rs2294008: 925 cases, 1396 controls, allele-specific odds ratio = 1.67, 95% confidence interval = 1.47–1.90, P = 2.2 × 10^−15).[12] The association was replicated in the Korean population (rs2976392: 449 cases, 390 controls, allele-specific odds ratio = 1.90, 95% confidence interval = 1.56–2.33, P = 8.0 × 10^−11; rs2294008: 454 cases, 390 controls, allele-specific odds ratio = 1.91, 95% confidence interval = 1.57–2.33, P = 6.3 × 10^−11), and the gene also showed a weaker association with IGC in populations from both Japan (rs2976392: 599 cases, 1397 controls, allele-specific odds ratio = 1.29, 95% confidence interval = 1.12–1.49, P = 5.0 × 10^−4) and Korea (rs2976392: 416 cases, 390 controls, allele-specific odds ratio = 1.37, 95% confidence interval = 1.12–1.68, P = 0.0017). Later, the association of rs2976392 or rs2294008 with GC was validated in other Japanese and Korean panels and in Chinese and Caucasian studies (Table 2).[14-21] Intriguingly, PSCA was also identified by GWAS as a gene related to bladder cancer susceptibility in Caucasians.[22]
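
As a rough consistency check on figures such as those quoted above, the standard error of the log odds ratio can be recovered from a reported 95% confidence interval and converted into an approximate two-sided P value. The following is a minimal sketch, assuming a Wald-type interval on the log scale and a normal approximation; neither assumption is stated in the source, and the function name `p_from_or_ci` is hypothetical. Because the published interval is rounded, the result is expected to agree with the reported P value only in order of magnitude.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided P value from an odds ratio and its 95% CI.

    Assumes a Wald interval on the log-odds scale (an assumption made
    for illustration, not something stated in the source).
    """
    # Half-width of the CI on the log scale equals z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# rs2976392 in the Japanese DGC panel: OR = 1.71, 95% CI = 1.50-1.94.
se, z, p = p_from_or_ci(1.71, 1.50, 1.94)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, P ~ {p:.1e}")
```

The same function can be applied to any odds ratio and interval in Tables 1 and 2, which makes it easy to spot entries whose P value and interval do not fit together.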

Table 2. Association studies of prostate stem cell antigen (PSCA) and gastric cancer
| Report | SNP (major>minor)^a | Ethnicity | Cases | Controls | Model^b | Reference^c | Odds ratio | 95% confidence interval | P value | Analyzed histology or subclasses |
|---|---|---|---|---|---|---|---|---|---|---|
| Sakamoto et al.[12] | rs2294008 T>C | Japanese | 749 | 750 | Per-allele | C | 1.58 | 1.35–1.85 | 6.3 × 10^−9 | Diffuse |
| | | | 925 (diffuse), 599 (intestinal) | 1396 | Per-allele | C | 1.67/1.29/1.30 | 1.47–1.90/1.11–1.49/1.10–1.52 | 2.2 × 10^−15/5.1 × 10^−4/0.0015 | Diffuse/intestinal/intestinal versus diffuse |
| | | | | | Dominant | CC | 4.18/1.59/2.70 | 2.88–6.21/1.15–2.21/1.64–4.50 | 1.5 × 10^−17/0.0041/4.7 × 10^−5 | |
| | | | | | Recessive | CC + CT | 1.62/1.24/1.35 | 1.35–1.93/1.01–1.52/1.06–1.71 | 9.4 × 10^−8/0.040/0.013 | |
| | | Korean | 454 (diffuse), 417 (intestinal) | 390 | Per-allele | C | 1.91/1.37/1.39 | 1.57–2.33/1.12–1.68/1.14–1.69 | 6.3 × 10^−11/0.0017/7.9 × 10^−14 | |
| | | | | | Dominant | CC | 3.61/1.85/1.81 | 2.41–5.51/1.27–2.71/1.17–2.83 | 3.2 × 10^−11/0.0011/0.0066 | |
| | | | | | Recessive | CC + CT | 1.61/1.22/1.39 | 1.15–2.26/0.84–1.77/1.00–1.94 | 0.0051/0.31/0.050 | |
| | rs2976392 A>G | Japanese | 749 | 750 | Per-allele | G | 1.62 | 1.38–1.89 | 1.1 × 10^−9 | Diffuse |
| | | | 926 (diffuse), 599 (intestinal) | 1397 | Per-allele | G | 1.71/1.29/1.32 | 1.50–1.94/1.12–1.49/1.13–1.56 | 1.5 × 10^−16/5.0 × 10^−4/6.0 × 10^−4 | Diffuse/intestinal/intestinal versus diffuse |
| | | | | | Dominant | GG | 4.24/1.55/2.73 | 2.92–6.29/1.13–2.16/1.67–4.56 | 6.4 × 10^−18/0.0059/3.3 × 10^−5 | |
| | | | | | Recessive | GG + GA | 1.66/1.24/1.35 | 1.39–1.99/1.02–1.52/1.07–1.71 | 1.5 × 10^−8/0.035/0.012 | |
| | | Korean | 449 (diffuse), 416 (intestinal) | 390 | Per-allele | G | 1.90/1.37/1.39 | 1.56–2.33/1.12–1.68/1.14–1.69 | 8.0 × 10^−11/0.0017/9.0 × 10^−4 | |
| | | | | | Dominant | GG | 3.47/1.86/1.75 | 2.32–5.27/1.27–2.72/1.13–2.74 | 1.1 × 10^−10/0.0010/0.010 | |
| | | | | | Recessive | GG + GA | 1.64/1.24/1.41 | 1.17–2.30/0.86–1.80/1.01–1.97 | 0.0036/0.26/0.041 | |
| Wu et al.[14] | rs2294008 C>T | Chinese | 1020 (non-cardia), 716 (cardia) | | | | | | | |
| | rs2976392 G>A | | | | GA | GG | 1.21/1.09 | 1.01–1.45/0.89–1.34 | 0.041/0.402 | |
| Matsuo et al.[15] | rs2294008 T>C | Japanese | 708 | 708 | Per-allele | C | 1.4 | 1.19–1.65 | 3.7 × 10^−5 | Gastric cancer |
| | | | | | Dominant | CC | 2.07 | 1.45–2.95 | 6.4 × 10^−5 | |
| | | | | | Recessive | CC + CT | 1.31 | 1.11–1.65 | 0.003 | |
| | rs2976392 A>G | | | | Per-allele | G | 1.4 | 1.19–1.65 | 4.1 × 10^−5 | |
| | | | | | Dominant | GG | 2.09 | 1.46–2.99 | 5.7 × 10^−5 | |
| | | | | | Recessive | GG + GA | 1.36 | 1.10–1.67 | 0.004 | |
| Lu et al.[16] | rs2294008 C>T | Chinese | 1053 | 1100 | CT | CC | 1.16/1.24/1.17 | 0.97–1.39/0.89–1.73/0.95–1.42 | | Gastric cancer/diffuse/intestinal |
| | rs2976392 G>A | | | | GA | GG | 1.40/1.34/1.37 | 1.17–1.67/0.96–1.87/1.12–1.66 | | |
| Ou et al.[17] | rs2294008 C>T | Tibetan | 196 | 246 | Per-allele | C | 1.34 | 1.00–1.79 | 0.049 | Gastric cancer |
| | rs2976392 G>A | | | | Per-allele | G | 1.07 | 0.80–1.45 | 0.645 | |
| Lochhead et al.[18] | rs2294008 C>T | Caucasian | 312 | 383 | CT | CC | 1.9/2.9/1.6 | 1.2–2.9/1.0–10.1/1.0–2.6 | 0.003/0.028/0.040 | Gastric cancer/diffuse/intestinal |
| | | | | | Recessive | CC + CT | 1.2/1.7/1.2 | 0.9–1.7/0.9–3.2/0.8–1.7 | 0.184/0.089/0.431 | |
| | | | | | Recessive | CC + CT | 1.9/0.9/1.9/1.5 | 1.2–3.0/0.5–1.6/1.1–3.5/0.7–2.9 | 0.005/0.766/0.018/0.246 | Non-cardia/cardia/non-cardia diffuse/non-cardia intestinal |
| Zeng et al.[19] | rs2294008 C>T | Chinese | 460 | 549 | CT | CC | 1.38 | 1.06–1.78 | 0.018 | Gastric cancer |
| Song et al.[20] | rs2294008 C>T | Korean | 3245 | 1700 | Per-allele | C | 1.29 | 1.18–1.41 | <0.01 | Gastric cancer |
| Sala et al.[21] | rs2294008 C>T | Caucasian | 411 | 1530 | Log-additive | | 1.42/1.47/1.54/1.52 | 1.23–1.66/1.19–1.81/1.20–1.96/1.20–1.93 | 6.5 × 10^−6/0.0003/0.0005/0.0005 | Gastric cancer/non-cardia/diffuse/intestinal |
| | | | | | CT | CC | 1.46/1.43/1.32/1.68 | 1.23–1.66/0.98–2.10/0.84–2.07/1.08–2.62 | 3.7 × 10^−5/0.0015/0.0018/0.0022 | |

^a Major, major allele; minor, minor allele.
^b Genetic model for the biological effect of risk alleles (rs2294008:T, rs2976392:A).
^c Genetic model as reference (odds ratio = 1).

Our subsequent analyses revealed that the 1q22 region contains 13 SNP in strong LD spanning five genes, but we concluded that the Mucin 1 (MUC1) gene is responsible for the observed association and is thus the second DGC susceptibility gene: rs2070803 showed P = 2.20 × 10^−6 and an adjusted per-allele odds ratio = 1.63 (606 cases and 1264 controls), which was replicated in additional Japanese (P = 3.93 × 10^−5, odds ratio = 1.81, 304 cases and 1465 controls) and Korean (P = 2.19 × 10^−4, odds ratio = 1.82, 452 cases and 372 controls) case-control panels.[13] While rs2070803 was one of the original LD mapping markers found to be associated with DGC, we later identified rs4072037 in the MUC1 gene as a functional SNP.[13]

In addition, the combined genotype association data for rs2294008 in PSCA and rs4072037 in MUC1, both of which were shown to have biological functions (discussed in the sections PSCA gene and MUC1 gene), revealed that 66.5% of the Japanese control subjects had the risk genotype of rs4072037 (risk allele = A, in a recessive model of the allele effect), 84.6% had the risk genotype of rs2294008 (risk allele = T, in a dominant model) and 55.8% had both, showing an odds ratio of 8.4 (Fig. 1).[13] This suggests that approximately 95% of the Japanese population possess at least one of the two risk genotypes. The risk allele of rs2294008 is the major allele in the Japanese population, but it is the minor allele in some other ethnic populations, including Caucasians (Supporting Information Table S1).[12, 13] Moreover, in Korea, which has a GC incidence almost equivalent to that of Japan,[5] both risk alleles are similarly major, and it was estimated that more than 90% of the Korean population has at least one risk genotype of the two SNP.[13] In the Japanese population and other ethnic groups, an association with GC was demonstrated for one or both of the risk alleles, and it seems possible that the ethnic prevalence of GC is influenced by the frequency of risk alleles of SNP with proven biological functions.[12, 13]
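
The figure of approximately 95% follows from the three reported control frequencies by inclusion-exclusion. The short sketch below reproduces that arithmetic and also derives the share of controls in each of the four genotype combinations; the variable names are illustrative only, and the input percentages are the ones quoted in the text.

```python
# Genotype frequencies among Japanese controls, as reported in the text (%).
muc1_risk = 66.5   # rs4072037 AA (recessive model, risk allele A)
psca_risk = 84.6   # rs2294008 TT or TC (dominant model, risk allele T)
both_risk = 55.8   # carriers of both risk genotypes

# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B).
at_least_one = muc1_risk + psca_risk - both_risk

# The remaining cells of the 2 x 2 genotype table follow by subtraction.
only_muc1 = muc1_risk - both_risk
only_psca = psca_risk - both_risk
neither = 100.0 - at_least_one

print(f"at least one risk genotype: {at_least_one:.1f}%")
print(f"MUC1 risk genotype only:    {only_muc1:.1f}%")
print(f"PSCA risk genotype only:    {only_psca:.1f}%")
print(f"neither risk genotype:      {neither:.1f}%")
```

The result is about 95.3% with at least one risk genotype, so fewer than 5% of controls carry the protective genotype at both loci.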

Figure 1.

Prostate stem cell antigen (PSCA) and Mucin 1 (MUC1) genotypes are associated with risk for diffuse-type gastric cancer (DGC). Association studies were performed with a distinct model for each risk allele's effect, dominant for rs2294008 (risk genotype: TT and TC; protective genotype: CC) and recessive for rs4072037 (risk genotype: AA; protective genotype: GG and GA), using genotype data of rs2294008 in PSCA and rs4072037 in MUC1 (Japanese 605 DGC cases and 1264 controls). Bar, upper bound of 95% confidence interval.
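
The dominant and recessive models named in the caption differ only in how many copies of the risk allele are needed to place a subject in the risk group. A minimal sketch of that classification is given below; the function name `has_risk_genotype` and the two-letter genotype strings are illustrative assumptions, not part of the original analysis.

```python
def has_risk_genotype(genotype, risk_allele, model):
    """Return True if `genotype` (e.g. "TC") is a risk genotype.

    dominant:  at least one copy of the risk allele is sufficient.
    recessive: two copies of the risk allele are required.
    """
    copies = genotype.upper().count(risk_allele.upper())
    if model == "dominant":
        return copies >= 1
    if model == "recessive":
        return copies == 2
    raise ValueError(f"unknown model: {model}")

# rs2294008 in PSCA: risk allele T, dominant model (TT and TC at risk).
for g in ("TT", "TC", "CC"):
    print("rs2294008", g, has_risk_genotype(g, "T", "dominant"))

# rs4072037 in MUC1: risk allele A, recessive model (only AA at risk).
for g in ("AA", "GA", "GG"):
    print("rs4072037", g, has_risk_genotype(g, "A", "recessive"))
```

Applying both calls to each subject yields the four combined groups compared in the figure.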

The majority of IGC arises in chronic inflammatory lesions of the gastric epithelium resulting from HP infection, but some genetic contribution has also been suggested, and, indeed, the association of PSCA with IGC was demonstrated.[12] Independent of the DGC GWAS described above, an IGC GWAS was also initiated in Japan. The first screening (1600 cases, 3400 controls, 501 909 SNP) has already been performed by the National Cancer Center in collaboration with RIKEN; intriguingly, rs2294008 and three other SNP in PSCA were among the six SNP showing the most statistically significant association (P < 1 × 10^−6, Hiromi Sakamoto, Teruhiko Yoshida and Yusuke Nakamura, unpublished data). In the first screening, even PSCA, which showed the strongest association, had a relatively low odds ratio; for example, rs2294008 had an odds ratio of 1.27 for IGC.

Besides the Japanese study, two GC GWAS were recently conducted on the Chinese population; one unveiled 3q13.31 and 5p13.1 as GC-related chromosomal regions, in addition to 1q22 and 8q24, and the other unveiled 10q23 (Table 1).[23, 24] The 5p13.1 locus includes eight genes in the vicinity of rs13361707, and the susceptibility gene in this region is yet to be identified. Recombination hotspot analyses suggested that PLCE1, a member of the phospholipase C family, at 10q23, and ZBTB20, encoding zinc finger and BTB domain-containing protein 20, at 3q13.31, are likely to be the genes responsible for the associations in the GWAS.[23, 24]

PSCA gene

As mentioned above, PSCA was identified as a DGC susceptibility gene by the Japanese GWAS, although it was originally reported as a gene upregulated in prostate cancer.[25] It is also upregulated in many other types of cancer, including urinary bladder cancer, renal cell carcinoma, hydatidiform mole, ovarian mucinous tumor, pancreatic cancer, non-small-cell lung cancer and glioma (Table 3).[26-32] In those cancers, PSCA can act to promote tumor progression, and it was demonstrated that suppression of the gene with siRNA resulted in growth inhibition of prostate cancer cells.[33] In contrast, downregulation of the gene had been reported only in esophageal and gastric cancers.[12, 34] Recently, we reported that the gene is also downregulated in gallbladder cancer.[35] Intriguingly, both gallbladder cancer and IGC develop through a similar sequence of chronic inflammation, intestinal metaplasia, dysplasia and cancer. Moreover, PSCA is downregulated in intestinal metaplasia in both gallbladder and gastric epithelia.[12, 35] There could be other cancers in which PSCA is silenced during carcinogenesis.

Table 3. Cancer-type dependent expression status of the prostate stem cell antigen (PSCA) gene
| Upregulated | Downregulated |
|---|---|
| Prostate cancer[25] | Esophageal cancer[34] |
| Urinary bladder cancer[26] | Gastric cancer[12, 34] |
| Renal cell carcinoma[27] | Gallbladder cancer[35] |
| Hydatidiform mole[28] | |
| Ovarian mucinous tumor[29] | |
| (Aberrant expression)^a | |
| Pancreatic cancer[30] | |
| Non-small-cell lung cancer[31] | |

^a Upregulation in cancer but no expression in normal tissue.

It was demonstrated that PSCA has growth inhibition activity on GC cells, which is concordant with the finding of frequent downregulation in the cancer.[12] In the stomach, PSCA is expressed in the isthmus/neck region, a middle portion of the gastric epithelium, in which rapidly amplifying pre-pit cells are present to support the rapid turnover of mucus-secreting pit cells (Fig. 2). It is speculated that PSCA has a role in regulating cell growth of the pre-pit cells and that reduction of its function predisposes the pre-pit cells to abnormal cell division and carcinogenesis (Fig. 2). It is thought that the initial lesion of DGC arises in the isthmus/neck region, based on a detailed histopathological investigation in Japan, which revealed that the smallest lesions of dysplasia and carcinoma seem to be confined to the region.[36]

Figure 2.

Prostate stem cell antigen (PSCA) regulates proliferation of pre-pit cells in gastric epithelium? (a) PSCA is mainly expressed in the epithelium in the middle portion, the isthmus and neck regions, of the gastric gland (immunohistochemical double stain; blue for PSCA and brown for proliferating cell nuclear antigen). Together the two regions are called the isthmus/neck region as the boundary between the two regions is often ambiguous. Weak PSCA expression is also observed in the epithelium of the pit region. (b) The isthmus/neck region harbors pre-pit cells, a precursor of pit cells, which are rapidly proliferating to compensate for rapid turnover of pit cells. It is hypothesized that PSCA regulates proliferation of pre-pit cells, which also contributes to prevention of carcinogenesis in the epithelium.

It was demonstrated that rs2294008 in the gene is a functional SNP affecting the transcriptional activity of the PSCA promoter. However, the biological effect of the T allele remains controversial. The SNP rs2294008 determines the position of the translation initiation codon: the T allele forms part of the codon encoding the first methionine (ATG), whereas the C allele changes the encoded amino acid from methionine to threonine (ACG), resulting in a shift of the first methionine position. The T allele associated with the risk for GC has a negative effect on the promoter activity in gastric, urinary bladder and gallbladder cancer cell lines.[12, 22, 35] Therefore, it is anticipated that people possessing the T allele have a lower amount of PSCA protein in these organs than those possessing the C allele. However, recent microarray transcriptome analyses showed that normal and malignant urinary bladder tissues from people with the T allele contained more PSCA transcripts than those from people with the C allele.[37] The discrepancy between the in vitro reporter assay and the in vivo expression data further suggests complexity in PSCA regulation, which might be influenced by tissue-specific transcription factors and DNA methylation, as shown in gastric and gallbladder cancer cell lines.[35] In contrast, a recent study reported that the C allele of rs2294008 in PSCA was associated with an increased risk of duodenal ulcer (DU; odds ratio = 1.84; P = 3.92 × 10^−33 in a recessive model).[38] Moreover, functional analyses showed that the C allele changes the subcellular localization of PSCA protein from the cell surface to the cytoplasm and also reduces the protein's stability.[38] As PSCA might have several functions, some of which could be contradictory depending on the tissue and pathological state, it is likely that the functional effect of rs2294008 also differs.
In particular, the reciprocal association between the rs2294008 alleles and two major HP-related gastrointestinal diseases is notable; the T allele predisposes to GC, while the C allele confers an increased risk for DU. It is known that patients with DU have a decreased risk for GC, but this depends on the location of gastritis; DU patients with chronic corpus gastritis have an increased risk of GC.[39] This could be explained by the relation between the location of gastritis and the amount of acid secretion; corpus-predominant gastritis is accompanied by hypochlorhydria and results in the highest risk for GC, whereas antrum-predominant gastritis is associated with hyperchlorhydria and predisposes to DU.[40, 41] As PSCA is not expressed in the duodenum, its function in the stomach might affect DU development, possibly through effects related to acid secretion or to the location and extent of HP infection or gastritis. It was also reported that the T allele was especially associated with non-cardia GC in Chinese and Caucasian patients but with cardia GC in Korean patients.[14, 18-21]
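
The effect of rs2294008 on the translation initiation codon described above can be illustrated with a toy example. The sketch below assumes only what the text states, namely that the SNP occupies the middle position of the codon (ATG with the T allele, ACG with the C allele); the function name and the two-entry codon table are hypothetical simplifications and do not model the surrounding PSCA sequence.

```python
# Only the two codons discussed in the text are needed here.
CODON_TO_AA = {"ATG": "Met", "ACG": "Thr"}

def codon_at_rs2294008(allele):
    """Return (codon, amino acid, is_start) for the given rs2294008 allele.

    The SNP is treated as the middle base of the codon, as described in
    the text (T allele -> ATG, C allele -> ACG).
    """
    allele = allele.upper()
    if allele not in ("T", "C"):
        raise ValueError("rs2294008 alleles are T and C")
    codon = "A" + allele + "G"
    amino_acid = CODON_TO_AA[codon]
    # ATG is the canonical translation initiation codon.
    return codon, amino_acid, codon == "ATG"

for allele in ("T", "C"):
    codon, aa, is_start = codon_at_rs2294008(allele)
    print(f"{allele} allele: {codon} ({aa}), start codon here: {is_start}")
```

With the C allele the codon at this position no longer encodes methionine, which is why the first methionine position of the protein shifts.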

The cell growth inhibition activity of PSCA was also demonstrated in gallbladder cancer (GBC) cells.[35] GBC cells into which PSCA cDNA was introduced showed lower in vitro and in vivo growth than controls, and their invasion ability, assayed with a Matrigel chamber, was also attenuated.[35] PSCA is expressed homogeneously in normal gallbladder epithelium, which is characterized by a monolayer of columnar cells and functions by absorbing water and electrolytes. It is possible that PSCA has a role in cell-division control and/or other activities such as active transport of molecules in gallbladder epithelium.

The product of the PSCA gene is a glycosylphosphatidylinositol (GPI)-anchored membrane protein with unknown biological function.[42] It is believed that, as with other GPI-anchored proteins, PSCA might be located in a special microdomain called the lipid raft, enriched in glycosphingolipids, cholesterol and other lipidated proteins, on the outer surface of the cell membrane. The lipid raft is known as a domain where molecular interactions for subcellular signaling take place.[43] However, there have been no reports identifying the PSCA ligand or a molecule that PSCA modifies. Our attempts to co-immunoprecipitate PSCA-associating molecules have not been successful, even with cross-linking of proteins using DSP (dithiobis[succinimidylpropionate]) or DTSSP (3,3′-dithiobis[sulfosuccinimidylpropionate]). The function of PSCA might be restricted to preparing the microenvironment on the cell membrane by changing the local composition of lipids and other molecules, thereby supporting molecular interactions required for subcellular signal transduction. If that is the case, its apparently organ-dependent opposing functions, either tumor promotion or suppression, would be readily understandable.

MUC1 gene

The Mucin family (MUC1 to MUC21) consists of secreted and membrane-bound types, and MUC1 belongs to the latter.[44] After translation, a single MUC1 peptide is cleaved by autoproteolysis into N-terminal and C-terminal subunits, designated MUC1-N and MUC1-C, respectively, but the two subunits remain associated by non-covalent binding and are localized to the cell membrane on the apical side of epithelial cells. MUC1-N, present on the cell surface, has multiple glycosylation sites and is thought to be the second line of protection for cells against many types of insults, after the front layer of defense provided by the secreted mucins in mucus.[45] MUC1-C, in contrast, has a transmembrane domain and a cytoplasmic tail (CT), which is involved in subcellular signal transduction. The CT contains several phosphorylation sites and a β-catenin binding site. Phosphorylation of Thr in the TDRSPYEKV sequence within the CT stimulates interactions between the CT and β-catenin, which leads to nuclear localization of the complex and regulation of genes including p53.[46, 47] MUC1 has been, and perhaps still is, considered an oncoprotein, because several studies have demonstrated a correlation between MUC1 expression and poor prognosis in cancer patients. It was also reported that MUC1 acts as a growth factor receptor on undifferentiated human embryonic stem cells.[48-50]

In addition to the GWAS conducted in Japan, the recently conducted GWAS on the Chinese population also listed 1q22 as a candidate GC-related locus (rs4072037; odds ratio = 0.75, P = 4.22 × 10^−7).[12, 24] The association between non-cardia GC and this locus was also demonstrated in imputation analyses on large-scale Chinese case-control samples (rs4072037; odds ratio = 0.73, P = 1.0 × 10^−4).[23] Moreover, an association between MUC1 gene polymorphisms and GC had been reported previously by other groups.[51, 52] The MUC1 gene contains a variable number of tandem repeats (VNTR), which can be detected as an electrophoresis pattern after restriction enzyme digestion. When the polymorphic alleles are divided into large (L) and small (S) alleles, the latter was shown to be associated with GC in Caucasians.[51, 52] The association between the A allele of rs4072037 and DGC identified in Japanese patients was also found in Chinese and Caucasian patients.[53, 54] The A allele of rs4072037 identified using the GWAS is in linkage disequilibrium with the S allele in Japanese and Caucasian patients.[13, 55] The results of these association studies on different ethnic populations strongly support the result of the Japanese GWAS: MUC1 is a GC susceptibility gene.

It was demonstrated that rs4072037 in MUC1 has a biological function. In the gastric epithelium, variants 2 and 3 are the major MUC1 transcripts.[13] The SNP rs4072037 is located at the 5′ side of the second exon of MUC1 and determines the splicing acceptor site in the second exon, which in turn defines the type of variant; the G and A alleles result in the expression of variants 2 and 3, respectively (Fig. 3).[13, 55] Consequently, the nine amino acid deletion in the second exon changes the supposed cleavage site of the N-terminal signal peptide, which might lead to a difference in the function of the encoded protein between the two splicing variants. It is understood that, in GC and other cancer cells that have lost cell polarity, the MUC1 protein interacts freely with other molecules, including membrane receptors involved in cell growth, and consequently acts as an oncoprotein; in contrast, in normal epithelial cells, MUC1 is restricted to the apical surface of the cells, where interaction with other molecules is limited, and it acts as a barrier against exogenous insults to the cells.[56] It is speculated that rs4072037 affects the barrier function in the stomach by determining the major variant expressed there, which results in the difference in GC susceptibility.
From a different viewpoint, MUC1 was identified by GWAS as being associated with the serum magnesium level.[57] Because a correlation between a low serum magnesium level and GC was suggested, it is possible that MUC1 affects GC susceptibility by playing a role in magnesium homeostasis.[58] However, the GWAS demonstrated that hypomagnesaemia was correlated with the G allele of rs4072037, the protective allele for GC development.[57] As no correlation was observed between MUC1 and IGC in the present study ([13] and Hiromi Sakamoto, Teruhiko Yoshida and Yusuke Nakamura, unpublished data), MUC1 probably has a role specifically in DGC, in contrast to PSCA, whose association was revealed in both DGC and IGC (Table 2). This difference might derive from the difference in their pathogenesis.

Figure 3.

Single nucleotide polymorphism (SNP) rs4072037 determines the major splicing variants expressed in the gastric mucosa. In the gastric mucosa, the major splicing forms were variant 2 (NM_001018016) and variant 3 (NM_001018017). The allele of SNP rs4072037 is related to the splicing acceptor site selection in the second exon (upper panel). Nucleotide sequences of the first/second exon boundary of Mucin 1 (MUC1) variants 2 and 3 were revealed using RNA ligase-mediated rapid amplification of the 5′ cDNA end on RNA samples from normal stomach and gastric cancer cell lines (lower panel).[13] In the present study, all variant 2 transcripts containing the first 27 bp of the second exon (double-headed red arrow) had a G allele at rs4072037, while all variant 3 transcripts lacking the 27 bp had an A allele. This result is concordant with a previous report.[55] It is anticipated that deletion of the 27 bp, corresponding to nine amino acids, changes the cleavage sites (green arrows) of the signal peptide between the variants. One-letter amino acid abbreviations are shown just below or above the second nucleotide of each codon.


It is expected that identification of GC susceptibility genes will contribute to the development of new approaches to GC prevention in the future. The risk genotypes of the two genes identified by the previous Japanese GWAS classified the majority of Japanese people into a high-risk group (Fig. 1). This finding is supported by HapMap Project data on 11 ethnic populations, which show, for example, that the Japanese population has the lowest frequency of the protective genotype (C/C) of rs2294008 among the 11 populations (Supporting Information Table S1). This offers a good starting point for the development of personalized DGC prevention, because it means that we could add other layers of risk factors to capture even higher DGC risk subgroups without shrinking or restricting the target population too much. Adding non-genetic risk factors for further stratification is of particular interest, because they could be modifiable, while the genetic predisposition presents a fixed, baseline risk for each individual. It should also be noted that the Japanese GWAS on GC was initiated almost 10 years ago. Since then, more powerful platforms for efficient SNP typing, including next-generation sequencers, have been developed, and more DNA samples have been accumulated for GWAS by several groups in Japan, including those involved in prospective cohort studies. Other novel GC susceptibility genes and their interactions with non-genetic risk factors can be identified by conducting a GWAS with a large number of samples and the latest typing platforms, especially in a nested case-control design based on a molecular epidemiological or “genome” cohort.
Such a systematic approach will contribute not only to GC prevention but also to the development of new GC therapeutics by unveiling novel molecular pathways involved in GC carcinogenesis and should be one of the urgent agenda items in medical research in light of the overwhelming social burden of cancer death in Japan and the world.


This review is based on research grants from the Ministry of Education, Culture, Sports, Science and Technology, Japan (JST grant), Grants-in-Aid for Scientific Research (KAKENHI) by the Japan Society for the Promotion of Science (No. 23501327), and the Program for Promotion of Fundamental Studies in Health Sciences of the National Institute of Biomedical Innovation (NiBio, Grant No. 10–41).

Disclosure Statement

The authors have no conflict of interest.