Fruit and vegetable intake and the risk of gastric adenocarcinoma: A reanalysis of the European Prospective Investigation into Cancer and Nutrition (EPIC-EURGAST) study after a longer follow-up


  • This article was published online on 2 April 2012. An error was subsequently identified. This notice is included in the online and print versions to indicate that both have been corrected 22 October 2012.


In a previous European Prospective Investigation into Cancer and Nutrition (EPIC) analysis, we found an inverse association between total intake of vegetables, onion and garlic, and risk of intestinal gastric cancer (GC) and between citrus fruit and risk of cardia GC. The aim of this study is to reanalyze the effect of fruit and vegetables (F&V), based on a longer follow-up and twice the number of GC cases. Subjects are 477,312 men and women, mostly aged 35 to 70 years, participating in the EPIC cohort, including 683 gastric adenocarcinomas with 11 years of follow-up. Information on diet and lifestyle was collected at baseline. A calibration study in a subsample was used to correct for dietary measurement errors. When comparing the highest vs. lowest quintile of intake, we found an inverse association between total intake of F&V and GC risk [hazard ratio (HR) 0.77; 95% confidence interval (CI) 0.57–1.04; p for trend 0.02], between fresh fruit and risk of the diffuse type (HR 0.59; 95% CI 0.36–0.97; p for trend 0.03) and between citrus fruit and risk of cardia cancer (HR 0.61; 95% CI 0.38–1.00; p for trend 0.01). Although calibration revealed somewhat stronger inverse associations, none of the calibrated estimates reached statistical significance. There was no association between total or specific vegetable intake and GC risk. The inverse association between fresh fruit and citrus fruit and risk of GC seems to be restricted to smokers and the Northern European countries. Fresh fruit and citrus fruit consumption may protect against diffuse and cardia GC, respectively.

The incidence of gastric cancer (GC) has declined in most countries over the last decades. However, GC remains the second most common cause of cancer death and the fourth most common cancer in the world.1 Helicobacter pylori infection2 and smoking3 are well-recognized causal risk factors for GC. It is thought that diet also plays an important role in the etiology of stomach cancer, although there is not yet convincing evidence. Although results from case–control studies are consistent, evidence from cohort studies is inconclusive. Consumption of nonstarchy vegetables and fruits (F&V) probably protects against stomach cancer.4 A meta-analysis of cohort data as described by the World Cancer Research Fund showed a significantly decreased risk for green-yellow vegetables and allium vegetables4 but not for total vegetables or total fruit. Another meta-analysis5 of cohort studies assessing the association between F&V intake and GC risk found a significant inverse association with fruit intake when the outcome was cancer incidence, while the effect of vegetables was weaker and not statistically significant.

In a previous analysis of the European Prospective Investigation into Cancer and Nutrition (EPIC-EURGAST),6 which included participants with a wide range of F&V intake7 and was based on 330 gastric adenocarcinomas, we found an inverse but not statistically significant association between the intestinal type and total vegetable, onion and garlic intakes, as well as an inverse, not statistically significant association between citrus fruit intake and cardia GC. The aim of this study is to describe the effect of intake of F&V on the risk of gastric adenocarcinoma, based on a longer follow-up and about twice the number of GC cases.

Material and Methods

Study subjects

EPIC is a prospective study designed to investigate the relation between diet, lifestyle, genetic and environmental factors and the incidence of cancer and other chronic diseases, carried out in 23 centers in 10 European countries: Denmark, France, Germany, Greece, Italy, the Netherlands, Norway, Spain, Sweden and the United Kingdom. The study has been described in detail elsewhere.8 The EPIC cohort consists of 521,448 subjects (70% women), mostly aged 35–70 years and recruited mostly between 1992 and 1998, usually from the general population. Exceptions were the French cohort, based on members of the health insurance of school employees; the Utrecht and Florence cohorts, based on women attending breast cancer screening; part of the Italian and Spanish cohorts, based on blood donors; and the Oxford cohort, based mostly on vegetarians. Eligible participants gave written informed consent and completed questionnaires on their diet, lifestyle and medical history. Approval for this study was obtained from the ethical review boards of the International Agency for Research on Cancer and from all local participating centers.

Diet and lifestyle questionnaires

The usual diet over the previous 12 months was measured at recruitment by country-specific validated questionnaires.8, 9 Most centers adopted a self-administered questionnaire of 88 to 266 food items. In Greece, Spain and Ragusa, the questionnaire was administered at a personal interview. Questionnaires in France, Italy, Spain, The Netherlands, Germany and Greece were quantitative, estimating individual average portion size systematically. Those in Denmark, Naples and Umea were semi-quantitative, with the same standard portion assigned to all subjects, whereas in Norway standard portion size was used for some foods and individual portion size for others. In Malmö (Sweden) and the United Kingdom, a questionnaire method combined with a food record was used. In Spain, a computerized version of a diet history questionnaire was used. Lifestyle questionnaires included questions on education, physical activity, lifetime history of smoking and alcohol intake, occupation, reproductive history and use of hormones, history of previous illnesses and some surgical operations.

Follow-up and identification of cancer cases

The follow-up was based on population cancer registries, except in France, Germany and Greece, where a combination of methods, including health insurance records, cancer and pathology hospital registries and active follow-up, was used. Mortality data were collected from registries at the regional or national level. Follow-up began at the date of recruitment and ended at either the date of diagnosis of GC, death or date of the last complete follow-up. A total of 892 incident GC cases had been reported to the central database at IARC up to September 2010. Cancer of the stomach included cancers coded as C16 (C16.0 for cardia and C16.1–16.6 for noncardia) according to the 10th Revision of the International Statistical Classification of Diseases, Injuries and Causes of Death (ICD). Validation and confirmation of the diagnosis, classification of tumor site and morphology of tumor (according to ICDO2 Classification and Lauren classification for histology) for 81% of cases were carried out by a panel of pathologists,10 using original histological slides and/or slices obtained from paraffin blocks, as well as original histopathological reports. Among the incident cases, 41 gastric lymphomas and 91 other nonadenocarcinoma GC were excluded, so 760 case subjects with gastric adenocarcinoma were available for the analysis. Fifty-three gastroesophageal junction tumors were combined with cardia tumors.

Statistical methods

The proportional hazards model (Cox regression) was used for the analyses of the cohort data. The analysis was stratified by center, to control for potential confounding due to differences in follow-up procedures and questionnaire design. Age was used as the time scale variable in all models. Entry time was defined as age at recruitment and final time as age at diagnosis (cases) or age at censoring (at-risk subjects). All models were adjusted for sex, body mass index (BMI) (<25; 25–30; >30), educational level (none, primary, technical, secondary, university and missing), alcohol intake (g/day) at baseline, smoking (never, current 1–15 cigarettes/day, 16–25 cigarettes/day, 26+ cigarettes/day, former quit <10 years, 11–20 years, >20 years, current pipe or cigar, current or former missing and unknown), physical activity (inactive, moderately inactive, moderately active, active and missing),11 energy intake (kcal/day) and consumption of red and processed meat (g/day). The analysis of fruit was adjusted for vegetable consumption and vice versa. Vegetable and fresh fruit intakes from the dietary questionnaires were estimated in grams per day. The list of specific vegetables included in each subgroup of vegetables is shown at the bottom of Table 5. Dried fruit and fruit and vegetable juices were excluded. Juices were quantified in liquid form and it is not meaningful to pool liquid and solid quantities; moreover, their intake in the EPIC cohort is very low (less than 10% of total fruit and vegetable consumption).7 Intakes were analyzed as both continuous (increments of 100 g/day for groups and of 50 g/day for subgroups) and categorical variables using EPIC-wide sex-specific quintiles. Categorical variables (scored from 1 to 5) were used to calculate trend tests. No interaction by sex was observed and therefore results for both sexes combined are presented (p for interaction 0.43, 0.39 and 0.26 for vegetables, fresh fruit and citrus fruits, respectively).
The Wald statistic12 was used to test for homogeneity of risk for cardia and noncardia and for intestinal and diffuse subtypes of GC. We assessed possible interaction between fruit and vegetable intake and smoking, alcohol intake and EPIC countries from the North and South of Europe in a stratified analysis, and by including a product term between the exposure and the potential effect modifier. To exclude reverse causation, a subsequent sensitivity analysis was run after the exclusion of cases diagnosed during the first 2 years of follow-up.
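The quintile scoring used for the trend tests can be illustrated with a minimal sketch. It assumes cut points are computed separately within each sex and that the resulting score (1 to 5) is what enters the model as the trend variable; the function names and the intake values are hypothetical and are not EPIC data.

```python
from statistics import quantiles


def quintile_cutpoints(intakes):
    """Return the four cut points that split intakes into five groups."""
    return quantiles(intakes, n=5, method="inclusive")


def quintile_score(value, cutpoints):
    """Map an intake to a score from 1 (lowest quintile) to 5 (highest)."""
    return 1 + sum(value > c for c in cutpoints)


def score_by_sex(records):
    """records: list of (sex, intake in g/day).

    Returns one score per record, using cut points computed
    separately within each sex (sex-specific quintiles).
    """
    cutpoints = {}
    for sex in {s for s, _ in records}:
        cutpoints[sex] = quintile_cutpoints([x for s, x in records if s == sex])
    return [quintile_score(x, cutpoints[s]) for s, x in records]


# Hypothetical intakes (g/day), for illustration only
records = [("M", 50 + 10 * i) for i in range(20)]
records += [("F", 80 + 12 * i) for i in range(20)]
scores = score_by_sex(records)
```

Because the cut points differ by sex, the same absolute intake can fall into different quintiles for men and women, which is why the quintile ranges are reported separately for each sex.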

Table 1. Description of the EPIC-EURGAST cohort
inline image

Spline regression

Restricted cubic splines (using 3 or 4 knots) were used to evaluate whether the associations between F&V intakes and GC were linear. Knot positions were determined using the Harrell criteria,13 and Akaike's Information Criterion (AIC) was used to select the best model.
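As an illustration of what a restricted cubic spline term looks like, the sketch below builds the basis in the truncated-power form commonly attributed to Harrell. It is a generic construction, not the authors' code; the knot positions are hypothetical, and in practice they would be placed at fixed percentiles of the intake distribution.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis (truncated-power form).

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots. A model that is
    linear in these terms is cubic between knots and linear in x
    beyond the outer knots.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("need at least 3 knots")

    def cube_plus(u):
        # (u)_+^3: cubic above zero, flat below
        return u ** 3 if u > 0 else 0.0

    span = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2  # keeps spline terms on the scale of x
    basis = [float(x)]
    for j in range(k - 2):
        term = (
            cube_plus(x - t[j])
            - cube_plus(x - t[-2]) * (t[-1] - t[j]) / span
            + cube_plus(x - t[-1]) * (t[-2] - t[j]) / span
        )
        basis.append(term / scale)
    return basis


# Hypothetical knot positions (g/day), for illustration only
knots = [10.0, 40.0, 80.0, 150.0]
```

With four knots the basis has three columns (the linear term plus two spline terms), so a spline model spends two more parameters than the linear model; AIC penalizes those extra parameters when the two fits are compared.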

Calibration of the dietary data

A second dietary measurement was taken from an 8% random sample of the cohort (36,994 participants) using a detailed computerized 24-hr dietary recall (24HR) method14 to calibrate dietary measurements of total fruits, total vegetables and total fruit and vegetable intake across countries and to correct for systematic overestimation or underestimation of dietary intakes.15, 16 The 24HR values of these 36,994 cohort participants were regressed on the main dietary questionnaire values for vegetables and fruits. Weight, height, age at recruitment and center were included as covariates, and data were weighted by day of the week and season of the year during which the 24HR was collected. Zero consumption values in the main dietary questionnaires were excluded from the regression calibration models (0–8% of the participants depending on the food variable) and a zero was directly imputed as the corrected value. Country- and sex-specific calibration models were used to obtain individual predicted values of dietary exposure for all participants. Cox regression models were then run using the predicted (calibrated) values for each individual on a continuous scale. The standard error of the deattenuated coefficient was calculated with bootstrap sampling in the calibration and disease models consecutively.16 For all analyses, p-values < 0.05 were considered statistically significant.
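The core of the regression calibration step can be sketched as follows. This is a deliberately simplified illustration: it uses a single predictor and omits the covariates (weight, height, age, center), the day-of-week and season weights, and the country- and sex-specific stratification described above. Function names and intake values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def calibrate(ffq_all, ffq_sub, recall_sub):
    """Predict calibrated intakes for the whole cohort.

    The 24HR values of the calibration subsample are regressed on the
    questionnaire values. Zero questionnaire intakes are left out of
    the fit and a zero is imputed directly as the calibrated value.
    """
    pairs = [(q, r) for q, r in zip(ffq_sub, recall_sub) if q > 0]
    a, b = fit_line([q for q, _ in pairs], [r for _, r in pairs])
    return [0.0 if q == 0 else a + b * q for q in ffq_all]


# Hypothetical values (g/day): the questionnaire overestimates high intakes
ffq_sub = [0, 100, 200, 300, 400]
recall_sub = [0, 90, 150, 210, 270]
calibrated = calibrate([0, 150, 500], ffq_sub, recall_sub)
```

The calibrated values, rather than the raw questionnaire values, are then entered in the disease model on a continuous scale. Because the predictions compress the range of intake, a fixed increment such as 100 g/day corresponds to a larger contrast on the questionnaire scale.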


Results

From the original cohort of 521,448 individuals, 28,268 (50 GC) were excluded because they had a prevalent cancer or were lost to follow-up. A further 15,868 individuals (27 GC) for whom no dietary information was available or who were in the top or bottom 1% of the ratio of energy intake to estimated energy requirement were also excluded from the analysis.17 The final sample consisted of 477,312 individuals, including 683 (57.8% men) stomach adenocarcinomas (hereafter denoted as GC), with 5,262,994 person-years and an average of 11.02 (SD 2.8) years of follow-up.

Table 1 shows the distribution of cases by country, as well as mean intakes of fresh fruits and vegetables based on 24HR data. According to site, 201 (29.4%) cancers were located in the cardia and 323 (47.3%) in the distal part of the stomach; for 159 (23.3%) cases, localization was unknown, and these cases were therefore excluded from the subsequent analysis by sublocalization. According to the Lauren classification, 203 (29.7%) were classified as intestinal, 217 (31.8%) as diffuse and 263 (38.5%) as mixed, unclassified or undetermined.
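The case and cohort counts reported in the text can be reconciled with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are illustrative.

```python
# Counts as reported in the text
cohort, excluded_prevalent, excluded_diet = 521_448, 28_268, 15_868
incident_gc, lymphomas, other_non_adeno = 892, 41, 91
gc_excluded_prevalent, gc_excluded_diet = 50, 27

final_cohort = cohort - excluded_prevalent - excluded_diet
adenocarcinomas = incident_gc - lymphomas - other_non_adeno
final_cases = adenocarcinomas - gc_excluded_prevalent - gc_excluded_diet

by_site = {"cardia": 201, "distal": 323, "unknown": 159}
by_lauren = {"intestinal": 203, "diffuse": 217, "mixed or unclassified": 263}


def shares(counts):
    """Percentage of the total for each category, to one decimal place."""
    total = sum(counts.values())
    return {k: round(100 * v / total, 1) for k, v in counts.items()}


site_pct = shares(by_site)
lauren_pct = shares(by_lauren)
```

The subtractions reproduce the 760 adenocarcinomas available before exclusions, the 683 cases analyzed and the final cohort of 477,312, and both classifications sum to 683.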

Table 2. Range and mean intake1 (g/day) of fruits and vegetables by quintiles and sex
inline image

Table 2 shows the range and mean intakes (based on dietary questionnaires) and mean intakes (based on the 24HR from the calibration study) of total F&V, total fresh fruit, total vegetables and subgroups, in men and women, within each sex-specific quintile. For both sexes, mean intake of total vegetable in the upper quintile was more than twofold higher than in the lowest, whereas for fruit intake it was sixfold in men and more than threefold in women.

Table 3. Baseline distribution of co-factors according to quintiles of intake of vegetables and fresh fruit
inline image

Baseline characteristics of the participants according to intake of F&V are reported in Table 3. Those with highest intake of fruit and vegetables were more often never smokers, reported a higher educational level, more physical activity, higher energy intake and higher consumption of red meat, but less consumption of processed meat.

Table 4. Multivariate hazard ratio (HR) of gastric cancer (95% confidence interval) for observed and calibrated intakes of total fruits and vegetables, total vegetables, total fresh fruit and citrus according to subsite and histological type in the EPIC-EURGAST cohort
inline image

Table 4 shows the observed and calibrated hazard ratios (HR) of GC according to total F&V, total vegetables, total fresh fruit and citrus fruit intake for all stomach cancer, as well as cardia, noncardia and intestinal and diffuse subtypes. In the categorical analysis, an inverse association was seen with total F&V [HR 0.77; 95% confidence interval (CI) 0.57–1.04; p for trend 0.02] for the highest vs. lowest quintile intake (corresponding to 611 vs. 187 g/day in men and 547 vs. 233 g/day in women). A nonsignificant inverse association was observed between total fresh fruits (p for trend 0.05) and citrus fruit (p for trend 0.07) and GC risk. No association was observed for total vegetable intake. Controlling for measurement error, the calibrated HRs were slightly lower than the continuous noncalibrated values, but none of the calibrated estimates were significant (HR 0.93; 95% CI 0.87–1.00 for an increase of 100 g/day of F&V and HR 0.91; 95% CI 0.82–1.01 for an increase of 50 g/day for citrus fruit). In the categorical analysis by site, we found a borderline significant inverse association between citrus fruit intake and cardia cancer (HR 0.61; 95% CI 0.38–1.00, p for trend 0.01) for the highest vs. lowest quintile intake (corresponding to 104 vs. 11 g/day in men and 84 vs. 23 g/day in women) that was not observed in the noncardia GC (p for heterogeneity 0.04). When comparing the results of linear regression analysis with spline regression analysis for citrus fruit consumption and cardia GC risk using four knots (data not shown), we found that the AICspline was 2508.57, and the AIClinear was 2505.5, indicating that the linear model gave a slightly better fit than a spline model. When analyzing by histological type, we observed a significant inverse association between the diffuse type and total fresh fruit intake (HR 0.59; 95% CI 0.36–0.97, p for trend 0.03) for the highest vs. lowest quintile of intake (corresponding to 386 vs. 64 g/day in men and 346 vs. 107 g/day in women), whereas no association was observed for the intestinal type (p for heterogeneity 0.28). This association was, however, not statistically significant in the continuous or calibration models.
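To see why a confidence interval whose upper limit is 1.00 is described as borderline, the standard error of the log HR can be recovered from the reported interval and turned into a Wald z statistic. The sketch below assumes the interval is symmetric on the log scale and uses the rounded published values, so the result is approximate; the function name is hypothetical. This is the test for the single highest vs. lowest quintile contrast, which is distinct from the trend test and from the heterogeneity test between subsites.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the log-HR standard error from a 95% CI.

    Returns (standard error, z statistic, two-sided p-value),
    assuming the CI is symmetric around log(hr).
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return se, z, p


# Citrus fruit and cardia GC, highest vs. lowest quintile, as reported above
se, z, p = wald_from_ci(0.61, 0.38, 1.00)
```

With these rounded inputs, z is close to -2 and the two-sided p-value falls just below 0.05, which matches the borderline wording in the text.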

Table 5. Intake of specific subgroups of vegetables and the risk of gastric cancer1 by subsite and histological type
inline image

In relation to specific types of vegetables (Table 5), no evidence of association was found for any of the subgroups of vegetables and GC risk. A positive association was observed for cardia GC and fruiting vegetable consumption, but the HRs showed an inverted U-shaped pattern.

We explored the effect of F&V intake after excluding cases diagnosed in the first 2 years of follow-up, and overall, we did not observe substantial changes in any of the studied associations. The inverse association between citrus fruit and risk of cardia GC remained statistically significant (HR for the highest vs. lowest intake 0.54; 95% CI 0.32–0.90; p for trend 0.02; Supporting Information Table 6). Stratifying by alcohol, no effect modification was observed (data not shown). Results of the analysis stratified by smoking are presented in Supporting Information Table 7. The inverse association between total F&V (p for interaction 0.23), total fresh fruit (p for interaction 0.09), citrus fruit (p for interaction 0.07) and GC risk seems to be restricted to current smokers. Stratifying by smoking in noncardia GC, using a continuous variable of dietary intake, we observed (data not shown) a significant interaction for fresh fruit (p for interaction 0.042) and for citrus fruit (p for interaction 0.035). Stratifying by EPIC countries from the South and North of Europe (Supporting Information Table 8), we found a significant interaction between intake of total fruit and vegetables (p for interaction 0.024) and total vegetables (p for interaction 0.049) and GC risk. A borderline inverse association was observed for the Northern countries, whereas no association was found for the Southern countries.


Discussion

In a previous analysis of this cohort study, based on 330 cases of GC and 6.5 years of follow-up, we found a weak inverse association between citrus fruit intake and risk of cardia GC, as well as a weak inverse association between risk of intestinal GC and leafy vegetables as well as onion and garlic intake.6 In this reanalysis based on 683 GC and 11 years of follow-up, we confirmed the inverse association between citrus fruit and cardia GC risk. We also found an inverse association, with a significant dose-response, between risk of diffuse GC and total fresh fruit intake, as well as between F&V consumption and overall GC risk. In contrast, the previous suggestion of an inverse association between leafy vegetables, onion and garlic intake and risk of intestinal GC was no longer present.

The statistically significant inverse association between plasma levels of carotenoids and retinol and GC risk observed in a nested study of the EPIC cohort18 may suggest, however, that systematic and random errors in the measurement of vegetable intake and, to a lesser extent, of fruit intake may have attenuated questionnaire-based associations with GC risk. It is also possible that most participants are already above the biological level needed to have a beneficial effect of bioactive chemical compounds in vegetables and fruits. This might explain the lack of effect observed in our study in the Southern countries, where total F&V and citrus fruit intake is almost twice the consumption in the Northern countries. The mean levels of F&V intake in our cohort, even in the lowest quintile, were relatively high (187 g/day for men and 233 g/day for women). In the largest case–control studies carried out in Western Europe, almost 20 years ago, the cut-off for the lowest category of vegetable intake was 2.1 times a month in Sweden,19 2.9 times a week in Italy20 and 47 g/day in Spain.21

The significantly decreased risk for green-yellow vegetables and allium vegetables found in the meta-analysis of cohort data of GC included in the last WCRF and AICR4 report could not be replicated in the current updated analysis. Our results are more consistent with a previous meta-analysis5 of 13 cohort studies assessing the association between fruit intake and GC and 8 cohort studies assessing the association between vegetable intake and GC risk. When the analysis was restricted to incidence studies, this meta-analysis suggested that the inverse association for GC incidence was stronger for fruit (RR = 0.82; 95% CI: 0.73–0.93) than for vegetables (RR = 0.88; 95% CI: 0.69–1.13). The magnitude of the effect was stronger in studies with longer follow-up and was related to the validity of the dietary assessment method and the adjustment for covariates. Since the last WCRF and AICR4 report, apart from our previous study,6 seven other cohort studies have been published.22–28 A significant inverse association with total vegetables was found in only one25 of the seven cohorts, which included a small number of GC cases, and an inverse association for some specific vegetables (brassica or root) was found in only two cohort studies.25, 28 In contrast, a significant inverse association with fruit or citrus fruits was observed in four22, 24, 27, 28 of these seven cohorts. In two of them,22, 28 there was a significant inverse association between citrus fruit or any fruit intake and risk of cancer of the cardia but not of the noncardia.

The inverse association between cardia GC and citrus fruit is in agreement with the role of vitamin C in gastric carcinogenesis, which may act through inhibition of endogenous formation of nitrosamines and scavenging of potentially mutagenic oxidative free radicals.4 It is also consistent with our finding of a significant inverse association between plasma vitamin C and GC risk (stronger also for cardia), observed in a nested case–control study within EPIC-EURGAST.29

We did not observe differences between intestinal and diffuse type regarding the effect of total F&V or of total and specific vegetable intake, but the significant inverse association between total fresh fruit and GC was restricted to the diffuse type. The relatively small sample of each histological type stratified by subsite does not allow us to assess the effect by histology and site simultaneously. The largest European case–control studies19–21 have shown similar patterns for both histological types, but evidence from cohort studies is lacking and the pathways and features of these histological types are still unknown. None of the seven cohort studies published in recent years had information on histology. The results presented in this article are based mostly on histologically confirmed adenocarcinoma cases that have been validated by a panel of pathologists.10

In a case–control study nested within the EPIC cohort, we examined the relationship between H. pylori infection status, measured in plasma30 and noncardia GC risk (data not shown). However, in another EPIC-EUROGAST study,31 we have shown that most GC cases classified as negative for H. pylori status by ELISA are actually false negative when they are analyzed by Western blot, supporting the hypothesis that H. pylori may be a necessary condition for noncardia GC. Given that Western blot was used in this study in only half of the included cases, some of the uninfected cases could be false negatives. Therefore, the analysis of interaction stratifying by sero-positivity and sero-negativity for H. pylori infection would be uninformative.

When we stratified by smoking status, the inverse association with GC risk tended to be restricted to smokers, although the interaction term was significant only for noncardia GC. The same results were observed in another cohort study in relation to fruit intake.27 In another EPIC-EURGAST study, on total antioxidant capacity and GC risk,32 we also found a significant inverse association in ever and current smokers, whereas no association was observed in never smokers. It is known that active smokers have lower blood concentrations of ascorbic acid, alpha-carotene, beta-carotene and cryptoxanthin,33 and these differences are in part due to differences in dietary habits between smokers and nonsmokers. It has been suggested27 that smokers benefit more from fruit intake because they have lower blood levels of vitamin C and carotenoids. An alternative explanation is that this effect could in part be the result of residual confounding by smoking.

Epidemiological studies, including cohort studies, do have limitations,34 measurement error of dietary exposure being the most important. This forces us to be cautious in drawing definitive conclusions, also because several comparisons were made in the analyses and some results may have been due to chance. It has been shown that the magnitude of the distortion in the estimated relative risk depends on the ratio of the interindividual variation in intake to the intraindividual measurement error.34 This means that the relatively broad range of vegetable and fruit intake in the EPIC cohort may reduce the potential impact of measurement error. Statistical calibration allowed us to correct part of this measurement error. The calibrated HRs were slightly lower than the noncalibrated values, but none were significant, indicating either a lack of a true effect or that the calibration, at least for some food groups, does not completely correct the measurement error. It is known that the measurement errors of the dietary questionnaire and the 24HR are correlated.15 Another limitation is that the EPIC cohort does not have estimates of salt intake. The accurate measurement of total salt intake from the diet is very difficult because the proportion added during food preparation or at the table is highly variable and difficult to quantify. We were only able to adjust for processed meat, which is one of the major sources of salt intake in Western Europe.
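The dependence of the distortion on the ratio of between-person variation to within-person error can be made explicit under the classical measurement error model. The sketch below is a textbook simplification, not a result from this study: it assumes a single exposure, additive and nondifferential error that is independent of true intake, and a log-linear disease model. The symbols $T$, $e$, $\sigma_b^2$, $\sigma_w^2$ and $\lambda$ are introduced here for illustration.

```latex
% Observed intake Q = true intake T + error e,
% with Var(T) = \sigma_b^2 (between persons) and Var(e) = \sigma_w^2 (within person)
\[
  \lambda \;=\; \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2}
          \;=\; \frac{1}{1 + \sigma_w^2 / \sigma_b^2},
  \qquad
  \beta_{\text{observed}} \;\approx\; \lambda \, \beta_{\text{true}},
  \qquad
  \mathrm{HR}_{\text{observed}} \;\approx\; \mathrm{HR}_{\text{true}}^{\,\lambda}.
\]
```

Under these assumptions, a larger between-person variance relative to the error variance moves $\lambda$ toward 1, so the observed HR lies closer to the true HR. A correlated error between the questionnaire and the 24HR, as noted above, violates these assumptions, and the correction is then only partial.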

In conclusion, even though the mean consumption of F&V in the lowest quintile in EPIC is relatively high, we did observe an inverse association, with a significant trend test, between total F&V and GC risk, between total fresh fruit and the diffuse type and between citrus fruit and cardia GC. The effect seems to be restricted to smokers and the Northern European countries. No associations were observed with total vegetables or different subtypes of vegetables. The 5-year survival rate of GC is very low, and the identification and control of risk factors represent the most effective way of reducing the burden of these tumors.


Acknowledgements

We are grateful to the members of the pathologist panel for their valuable work: Dr. Fatima Carneiro (coordinator, Porto, Portugal), Dr. Roger Stenling (Umea, Sweden), Dr. Johan Offerhaus (Amsterdam, The Netherlands), Dr. Laszlo Igali (United Kingdom), Dr. Julio Torrado (San Sebastian, Spain), Dr. Gabriella Nesi (Firenze, Italy), Dr. U. Mahlke (Potsdam, Germany), Dr. Hendrik Bläker (Heidelberg, Germany), Dr. Claus Fenger (Denmark) and Dr. Dimitrious Roukos (Ioannina, Greece). We also thank Catia Moutinho (Porto, Portugal) for her technical work in the preparation of pathological material.


Calibration equation

The multivariate calibration model used is defined as:

equation image

where $R_{ij} = (R_{ij,1}, \ldots, R_{ij,K_1})$ is the vector of 24-hr recall measurements for each individual $i$ ($i = 1, \ldots, n_j$) in each country $j$ ($j = 1, \ldots, J$), $Q_{ij}$ is the questionnaire measurement and $Z_{ij} = (Z_{ij,1}, \ldots, Z_{ij,K_2})$ is the vector of error-free covariates. The errors $\varepsilon_{ij}$ are i.i.d. $N(0, \Sigma_{\varepsilon \mid \text{group}=j})$ and capture the residual random variability within each group.

The regression calibration allows the imputation of predicted values, which are used in the disease model. Bootstrap sampling is used to estimate the variance of the corrected parameter.
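For orientation, a linear form consistent with the definitions above can be sketched as follows. The original equation is not reproduced in this text, so this is an assumed form rather than the authors' exact specification; the coefficient symbols $\alpha_j$, $B_j$ and $\Gamma_j$, and the disease-model notation in the second line, are introduced here for illustration.

```latex
% Assumed linear calibration model, fitted within each country j
\[
  R_{ij} \;=\; \alpha_j + B_j\, Q_{ij} + \Gamma_j\, Z_{ij} + \varepsilon_{ij},
  \qquad
  \varepsilon_{ij} \;\overset{\text{i.i.d.}}{\sim}\; N\!\left(0,\ \Sigma_{\varepsilon \mid \text{group}=j}\right)
\]
% Predicted (calibrated) intake used as the exposure in the Cox disease model
\[
  \widehat{R}_{ij} \;=\; \widehat{\alpha}_j + \widehat{B}_j\, Q_{ij} + \widehat{\Gamma}_j\, Z_{ij},
  \qquad
  h\!\left(a \mid \widehat{R}_{ij}\right) \;=\; h_{0}(a)\, \exp\!\left(\beta^{\top} \widehat{R}_{ij} + \ldots\right)
\]
```

Because $\widehat{R}_{ij}$ is itself estimated, the usual standard error of $\beta$ understates the uncertainty. Resampling both the calibration fit and the disease model in each bootstrap replicate, as described above, propagates the calibration uncertainty into the variance of the corrected coefficient.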