Lung cancer is mainly caused by smoking, but the quantitative relations between smoking and histologic subtypes of lung cancer remain inconclusive. By using one of the largest lung cancer datasets ever assembled, we explored the impact of smoking on risks of the major cell types of lung cancer. This pooled analysis included 13,169 cases and 16,010 controls from Europe and Canada. Studies with population controls comprised 66.5% of the subjects. Adenocarcinoma (AdCa) was the most prevalent subtype in never smokers and in women. Squamous cell carcinoma (SqCC) predominated in male smokers. Age-adjusted odds ratios (ORs) were estimated with logistic regression. ORs were elevated for all metrics of exposure to cigarette smoke and were higher for SqCC and small cell lung cancer (SCLC) than for AdCa. Current male smokers with an average daily dose of >30 cigarettes had ORs of 103.5 (95% confidence interval (CI): 74.8–143.2) for SqCC, 111.3 (95% CI: 69.8–177.5) for SCLC and 21.9 (95% CI: 16.6–29.0) for AdCa. In women, the corresponding ORs were 62.7 (95% CI: 31.5–124.6), 108.6 (95% CI: 50.7–232.8) and 16.8 (95% CI: 9.2–30.6), respectively. Although ORs started to decline soon after quitting, they did not fully return to the baseline risk of never smokers even 35 years after cessation. The major result that smoking exerted a steeper risk gradient on SqCC and SCLC than on AdCa is in line with previous population data and biological understanding of lung cancer development.
Lung cancer is a leading cause of death, and the proportion of cases attributable to smoking has reached up to 90% in countries with a history of tobacco consumption.1 Although the general relative risk pattern of smoking and lung cancer is well known, a few questions remain open, for example, how strongly smoking contributes to the etiology of the different histologic types of lung cancer.2
Maintaining gas exchange in the lung requires tight coordination of functional components, including neural regulation of breathing, plasticity and permeability of the lung surface and protection from inhaled toxicants. This is reflected in a number of different cell populations that, in case of malignant transformation, can result in a variety of tumors as described in the WHO's histological classification of lung cancer.3 Smoking is a strong risk factor for all forms of lung cancer, and among male smokers, squamous cell carcinoma (SqCC) is the predominant subtype. Smoking is also closely associated with small cell lung carcinoma (SCLC).4 Adenocarcinoma (AdCa) is the most common subtype in never smokers and women, with increasing incidence rates over time.5–7
Current knowledge of the biological pathways in lung development, tissue repair and cancer can improve the understanding of the distribution of histological subtypes by extent of smoking, and, conversely, the distribution of the major subtypes in humans can corroborate the experimental results. A comprehensive statistical analysis of lung cancer by subtype is, therefore, an important step toward translational research.
The SYNERGY project was designed as a pooled analysis of case–control studies on the interaction of occupational carcinogens in the development of lung cancer.8 Our work uses information gathered by SYNERGY to explore the risks of smoking on the major histological subtypes using the largest observational database of lung cancers and controls of mainly European descent.
Material and Methods
Information on smoking, subtype of lung cancer and other data were extracted from the SYNERGY database for eight European case–control studies in 11 countries and for one Canadian study (Supporting Information Table S1). The pooled dataset comprised 13,169 cases (10,653 males and 2,516 females) and 16,010 controls (12,758 males and 3,252 females) enrolled from 15 study centers between 1985 and 2005. Controls were individually or frequency matched to cases by gender and age and recruited from the general population or hospitals. Population controls comprised 68.8% of the subjects from the pooled database. Smoking information was predominantly collected by interviews with the subjects themselves, though proxy next-of-kin respondents were accepted in the Swedish and the Canadian studies if subjects were unavailable (12.3% of all subjects). The subtype of lung cancer was classified according to WHO guidelines3 by the pathologists associated with the participating hospitals. Reference pathology was performed for the German cases.9 The ethics committees of the individual studies approved the conduct of the study, as did the Institutional Review Board at IARC.
Assessment of tobacco smoking
Information on cigarette smoking included age at initiation, age at quitting for former smokers and cigarettes per day in calendar-year periods. A regular smoker was defined as someone who had smoked more than one pack-year (py), and a current smoker was defined as a regular smoker who still smoked in the year of interview or in the year before. We excluded the year of the interview and the previous year for the calculation of py, average intensity and duration of smoking. Subjects who had ever smoked but had not accumulated one py were considered occasional smokers; these were treated as nonsmokers in this analysis. Smokers of types of tobacco other than cigarettes were classified as “other smokers” (yes or no).
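The smoking definitions above can be expressed as a short classification rule. The following Python sketch is illustrative only and is not the SYNERGY code: it assumes a single smoking period with constant intensity and a pack of 20 cigarettes, and all function and parameter names are hypothetical.

```python
from typing import Optional

CIGARETTES_PER_PACK = 20  # assumption: conventional pack size, not stated in the text


def pack_years(cigs_per_day: float, years_smoked: float) -> float:
    """Pack-years = packs smoked per day multiplied by years of smoking."""
    return cigs_per_day / CIGARETTES_PER_PACK * years_smoked


def smoking_status(cigs_per_day: float,
                   start_year: Optional[int],
                   quit_year: Optional[int],
                   interview_year: int) -> str:
    """Classify a subject as 'never', 'occasional', 'former' or 'current'.

    Simplification (assumption): one smoking period with constant intensity.
    """
    if start_year is None or cigs_per_day <= 0:
        return "never"
    # The year of interview and the previous year do not count toward pack-years.
    cutoff = interview_year - 1
    end = cutoff if quit_year is None else min(quit_year, cutoff)
    years = max(0, end - start_year)
    if pack_years(cigs_per_day, years) <= 1:
        # Ever smokers with at most one pack-year are occasional smokers,
        # who are analyzed together with never smokers.
        return "occasional"
    # Current: still smoking in the year of interview or in the year before.
    if quit_year is None or quit_year >= interview_year - 1:
        return "current"
    return "former"
```

For example, a hypothetical subject who smoked 20 cigarettes per day from 1970 and was interviewed in 2000 while still smoking would be classified as a current smoker with 29 pack-years.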
Smoking-related relative risks were estimated for all lung cancers combined (N = 13,169) and for each of the three main subtypes AdCa (N = 3,397), SqCC (N = 5,310) and SCLC (N = 2,201). All calculations were performed with SAS/STAT software, version 9.2 (SAS Institute, Cary, NC). Odds ratios (ORs) and 95% confidence intervals (CIs) were estimated with conditional logistic regression models. The relative risks were estimated separately in men and women, conditional on study center (15 centers) and adjusted for age (<45, 45–49, 50–54, 55–59, 60–64, 65–69, 70–74, 75–79 and ≥80 years). The reference group comprised never smokers of any type of tobacco. Ever smoking of other types of tobacco was implemented as a binary variable when estimating the relative risks of cigarette smoking. Exposure to tobacco smoke was analyzed with commonly used cut-point categories. Trend tests were performed with continuous smoking variables, including never smokers. We calculated the adjusted population-attributable fraction for smoking cigarettes in pys according to the method of Bruzzi et al.10 Meta-regression models were applied to estimate ORs for the subtypes of lung cancer in the individual studies and to estimate heterogeneity using I2 statistics (Comprehensive Meta-Analysis Version 2.2.027, Biostat, Englewood, NJ). Forest plots were presented as Supporting Information to visualize the risk estimates for the major subtypes in current smokers.
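To make the reported quantities concrete, the following Python sketch computes a crude OR with a Wald 95% CI from a two-by-two table. It is only the unadjusted analogue of the estimates described above: the analysis itself used conditional logistic regression in SAS, conditional on study center and adjusted for age, which this sketch does not reproduce. The function name and the counts in the example are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_wald(exposed_cases: int, unexposed_cases: int,
                    exposed_controls: int, unexposed_controls: int,
                    z: float = 1.96) -> Tuple[float, float, float]:
    """Return (OR, lower, upper) for a 2x2 table using the Wald interval.

    The interval is built on the log scale, where the standard error is
    the square root of the summed reciprocal cell counts.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                          + 1 / exposed_controls + 1 / unexposed_controls)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 90 smoking and 10 never-smoking cases,
# 50 smoking and 50 never-smoking controls.
or_, lo, hi = odds_ratio_wald(90, 10, 50, 50)
print(f"OR {or_:.1f} (95% CI: {lo:.1f}-{hi:.1f})")
```

The sketch also shows why small reference groups matter: the reciprocal of the number of never-smoking cases enters the standard error directly, so few unexposed cases yield wide intervals.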
Lung cancer risk by smoking status
SqCC was the leading subtype in men (4,747 of 8,891, 53.4%) and AdCa in women (1,013 of 2,017, 50.2%; Table 1). The proportion of SCLC among the major subtypes showed no obvious gender preference (19.8% in men and 21.9% in women). Never having smoked was reported by 220 (2.1%) male and 609 (24.2%) female cases. Their leading subtype was AdCa (57.6% of all three subtypes combined in men and 70.1% in women). SqCC was the predominant type in male ever smokers (53.9%), whereas AdCa predominated in women (43.8%).
Table 1. Relative risk of lung cancer and major subtypes by smoking status
Most cases had been regular smokers (97.9% in men and 75.8% in women). A higher proportion of smoking women consumed tobacco until diagnosis of lung cancer (66.0% in men and 74.6% in women). Current smoking of cigarettes was associated with an OR for lung cancer of 23.6 (95% CI: 20.4–27.2) in men and 7.8 (95% CI: 6.8–9.0) in women. Because the ORs did not obviously vary by type of control, we further refer to the estimates of the combined control groups if not otherwise stated.
The ORs differed greatly by histological type in current smokers. The ORs from the pooled analyses were 45.6 (95% CI: 34.3–60.6) for SqCC, 45.7 (95% CI: 29.9–70.0) for SCLC and 10.8 (95% CI: 8.7–13.3) for AdCa in men. The overall ORs from the meta-analysis are shown as Figure S1 in the Supporting Information. Very few male never smokers were cases with SqCC or SCLC. This resulted in a high heterogeneity of the subtype-specific ORs (SqCC: I2 48.1%, p = 0.03; SCLC: I2 62.8%, p = 0.005 and AdCa: I2 28.7%, p = 0.16). The risk estimates were lower than the pooled ORs because of the exclusion of studies with lacking cases in the reference group [SqCC: 37.9 (95% CI: 28.3–50.5), SCLC: 28.9 (95% CI: 18.6–44.9) and AdCa: 9.7 (95% CI: 7.8–12.0)]. The corresponding ORs from the pooled and meta-analysis in women were more similar [SqCC: 13.6 (95% CI: 10.5–17.7) vs. 13.2 (95% CI: 9.8–17.6), SCLC: 21.7 (95% CI: 15.5–30.1) vs. 19.5 (95% CI: 13.5–28.3) and AdCa: 4.2 (95% CI: 3.5–5.0) vs. 4.1 (95% CI: 3.3–4.9)]. No female never smoker with SCLC was observed in one study, which was excluded from the calculation of the overall OR in the meta-analysis. The heterogeneity between the studies was highest for AdCa (SqCC: I2 54.3%, p = 0.02; SCLC: I2 0%, p = 0.66 and AdCa: I2 74.0%, p < 0.0001).
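The I2 values quoted above express the share of between-study variation that exceeds what chance alone would produce. A minimal Python sketch follows, assuming inverse-variance fixed-effect pooling of study-specific log ORs and Cochran's Q; the commercial software used for the analysis may differ in detail, and the function name and inputs are hypothetical.

```python
from typing import List, Tuple


def i_squared(log_ors: List[float], std_errs: List[float]) -> Tuple[float, float, float]:
    """Return (pooled log OR, Cochran's Q, I2 in percent)."""
    weights = [1.0 / se ** 2 for se in std_errs]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    # Q sums the weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I2 = (Q - df) / Q, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Two hypothetical studies with log ORs of 0.0 and 2.0 and equal precision.
pooled, q, i2 = i_squared([0.0, 2.0], [0.5, 0.5])
print(pooled, q, i2)
```

Studies without cases in the reference group have no finite OR and cannot enter such a calculation, which is why they were excluded from the overall estimates.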
Fewer women than men had quit smoking (59.6 vs. 48.7% in controls and 34.0 vs. 25.4% in cases). Furthermore, 37.8% of male hospital controls had ceased smoking compared to 48.9% in population controls. Former smokers had an OR for lung cancer of 7.5 (95% CI: 6.5–8.7) in men and 2.8 (95% CI: 2.4–3.3) in women. When additionally stratifying the risk estimates by type of controls, the risk estimates were again based on small numbers of never smoking cases. The relative risk estimates for SqCC and SCLC were higher than for AdCa. Among cases, 153 (1.4%) men exclusively smoked other forms of tobacco than cigarettes and had an OR of 5.9 (95% CI: 4.6–7.4) with a corresponding variation by subtype.
Relative risk of lung cancer by exposure to cigarette smoke
Table 2 presents the relative risk estimates for various measures of exposure to cigarette smoking in current smokers. More men than women were heavy smokers. We found a significant trend with increasing dose for all subtypes and metrics of exposure (p < 0.0001). This trend was accompanied by a pronounced shift toward a higher proportion of SqCC and SCLC relative to AdCa. The proportion of AdCa among the major subtypes decreased, for example, from 57.6% in male never smokers to 26.5% in men smoking ≥60 py. Smoking of ≥60 py was associated with an OR of 47.7 (95% CI: 38.5–59.0) in men and an OR of 25.7 (95% CI: 14.5–45.5) in women. The corresponding relative risk estimates for the subtypes were 93.2 (95% CI: 65.8–132.0) for SqCC, 100.2 (95% CI: 60.3–166.5) for SCLC and 18.0 (95% CI: 13.1–24.8) for AdCa in men. In women, the relative risk estimates were 74.9 (95% CI: 35.2–159.3) for SqCC, 49.4 (95% CI: 21.5–113.8) for SCLC and 8.0 (95% CI: 4.0–16.0) for AdCa. Very high ORs were found in those smoking >30 cigarettes per day, whereas the relative risk estimates for duration leveled out beyond about 50 years.
Table 2. Relative risk of lung cancer in current smokers by duration, average intensity and pack-years of smoking cigarettes
In contrast to this observed leveling-off of relative risks in all men, there was a monotone increase of ORs in studies with population controls (smoking duration ≥60 years: OR 110.9, 95% CI: 52.2–235.6 for SqCC and OR 22.3, 95% CI: 10.2–48.6 for AdCa, Supporting Information Table S2). Figure 1 visualizes the trend in current male smokers by duration of smoking for studies with population controls vs. those with hospital controls. The OR among those with ≥60 years of smoking was 12.0 (95% CI: 4.3–33.4) in studies with hospital controls in comparison to 42.0 (95% CI: 23.7–74.4) in studies with population controls.
Supporting Information Table S3 presents the ORs by pys in all participants, and the synergistic effect of duration and intensity is shown in Supporting Information Table S4. There appears to be a steep rise of the relative risk up to a certain amount of cigarettes per day with no further increase by intensity, whereas a longer duration was associated with higher relative risks within all intensity strata. Again, stronger effects were observed for SqCC and SCLC than for AdCa.
Lung cancer relative risk by age at initiation
To further explore the underlying association of smoking habits with lung cancer relative risk, we present relative risk estimates by age at initiation in Table 3 among current smokers, additionally adjusted for average intensity. In both men and women, we observed decreasing relative risks with increasing age at initiation for all subtypes. Males who began smoking at age <10 years experienced higher ORs for SCLC (23.3; 95% CI: 14.3–37.9) and SqCC (22.5; 95% CI: 16.1–31.5) than for AdCa (6.3; 95% CI: 4.7–8.5). This contrast between subtypes was found in all strata for age at initiation. In current smokers of the control populations, early beginners of both genders (<15 years) smoked approximately twice as many cigarettes per day as late beginners (>25 years) (22 vs. <15 cigarettes per day). Supporting Information Tables S5–S7 depict the relative risk estimates for lung cancer by age at initiation in current smokers stratified by intensity, duration and pys. The relative risk estimates showed a decline when we accounted for differences in the smoking habits.
Table 3. Risk of lung cancer in current smokers by age at initiation of smoking cigarettes
Relative risk reduction by cessation of cigarette smoking
The risk reduction following smoking cessation was demonstrated in relation to never smokers and was observed for all subtypes of lung cancer (Table 4 for all studies, Supporting Information Table S8 for studies with population controls). Men who quit more than 35 years ago did not return to baseline level (OR 2.2, 95% CI: 1.8–2.8), whereas women reached the baseline relative risk after 25 years, with the exception of heavy female smokers (Supporting Information Table S9). A decline in OR following cessation was evident as soon as 2–5 years after cessation, especially in younger cases (Supporting Information Table S10). Quitting at a younger age (<60 years) showed a greater benefit.
Table 4. Relative risk of lung cancer by time since cessation of smoking cigarettes
Smoking is the leading cause of lung cancer with different quantitative effects on risks of different histological subtypes. Our analysis of lung cancer relative risks by smoking and histological subtype confirmed the known epidemiological risk pattern. Overall, AdCa was the most prevalent subtype among never smokers and among women, whereas SqCC and SCLC were more frequently observed with increasing exposure to tobacco smoke. There were very high relative risks of all subtypes in current heavy smokers, with the highest risk estimates for SqCC and SCLC. These results are consistent with many epidemiological studies and comprehensive reviews, which have been evaluated by a working group of the International Agency for Research on Cancer.1 The observation that smoking exerted a steeper relative risk gradient on SqCC and SCLC than on AdCa may reflect different response pathways of the lung depending on the exposure to pulmonary carcinogens and the extent of tissue damage. Our findings of very high relative risks for SqCC and SCLC in smokers are in line with the higher potency of their possible cellular precursors to restore damage of larger extent.11
Our work is generated from one of the largest collections of lung cancer cases from a series of case–control studies. The individual studies were conducted in various countries in Eastern and Western Europe and Canada over more than 2 decades and entailed interstudy variability in both smoking habits and relative occurrence of different lung cancer subtypes. A previous pooled analysis explored the geographic variation of the lung cancer risk in European centers.12 We included those and other centers and focused on the associations between smoking and each of the main histological subtypes of lung cancer. They are different in their molecular signatures and origins. Results from case–control studies complement findings from prospective cohort studies because they have different strengths and weaknesses. Although differential recall of smoking is of concern for the case–control design, prospective studies typically lack smoking information for the time period between recruitment and diagnosis. This impairs the assessment of quitting, duration and pys of smoking. Our pooled analysis included case–control studies with somewhat different designs and field-work procedures. We explored the choice of control populations, the variation in response rates and other factors in sensitivity analyses but found no obvious influences on the general relative risk pattern that smoking is more strongly associated with SqCC and SCLC than with AdCa. A similar but weaker association was observed in uranium miners with increasing occupational exposure to pulmonary carcinogens.13
Two potential misclassifications, involving smoking and histological subtype, have to be considered. Misclassification of smoking could have a strong impact on relative risk estimates,12, 14 especially in regard to the identification of nonsmokers, the reference category for OR estimates. In the original case–control studies that comprise SYNERGY, different criteria had been used for defining nonsmokers. For our analysis, the most feasible criterion to apply to all available studies was the py variable, and we set the cut-point at one py. Despite the large study size and admission of subjects with up to one py as nonsmokers, there were only 220 “nonsmoking” cases as the reference group for men, with the consequence that confidence limits of OR estimates were quite wide. Further, recall or reporting error of smoking status can distort OR estimates.15 For instance, if there had been a false report rate as low as 1% among 10,000 male smokers with lung cancer, this would lead to about 100 false negatives among the purported 220 never smokers. With such a “contaminated” unexposed group, the ORs could be seriously underestimated. This “contamination” of the reference group could operate differentially among different histological subtypes.
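The arithmetic in this example can be traced step by step. The Python sketch below is illustrative: it takes the 10,000 smoking cases, the 1% false-report rate and the 220 purported never smokers from the text, and it assumes (this is not stated in the source) that controls report their smoking correctly, so that the control odds cancel when the observed OR is divided by the true OR. Function and variable names are hypothetical.

```python
from typing import Tuple


def reference_group_contamination(true_smoker_cases: float,
                                  reported_never_cases: float,
                                  false_report_rate: float) -> Tuple[float, float]:
    """Return (misreported smokers, observed OR divided by true OR).

    Assumption: only cases misreport, so the exposure odds among
    controls are identical in the observed and the true OR.
    """
    # Smoking cases who falsely report never having smoked.
    false_never = true_smoker_cases * false_report_rate
    # Genuine never smokers among the reported reference cases.
    true_never = reported_never_cases - false_never
    reported_smokers = true_smoker_cases - false_never
    observed_case_odds = reported_smokers / reported_never_cases
    true_case_odds = true_smoker_cases / true_never
    return false_never, observed_case_odds / true_case_odds


false_never, or_ratio = reference_group_contamination(10_000, 220, 0.01)
print(false_never, round(or_ratio, 2))
```

Under these assumptions, 100 of the 220 reference cases would be misreported smokers, and the observed OR would be 0.54 times the true OR.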
In these populations of mainly European descent, AdCa was the leading subtype in never smokers. This is consistent with many other investigations. In a prospective population-based US cohort,16 the proportion of nonsmoking lung cancer cases that were AdCa was even higher than in the present pooling of case–control studies (82 vs. 57% in men and 89 vs. 70% in women). In addition, misclassification of the subtype has to be taken into account in this pooled analysis that included a diversity of countries in which different diagnostic methodologies and standards apply. Various studies revealed a higher inter-reader variability for AdCa than for other subtypes (for review, see Ref.17). Reference pathology was performed for the German cases. The agreement between pathologists was 94% (κ = 0.82) for SCLC and 81% for non-small cell carcinoma (κ = 0.58 for SqCC and κ = 0.55 for AdCa) and lower in never smokers.9 Biopsy specimens frequently consist of small amounts of diagnostic tissue, often limited to a single fine needle aspirate. Many cases will not have a tumor resection, and a postoperative confirmation of the tumor classification is frequently not possible.
All histological types were strongly associated with smoking, though the relative risks were considerably lower for AdCa than for SqCC and SCLC. This was seen in the pooled as well as in the meta-analysis and confirms results from many earlier studies (e.g., Refs.12 and16). We further found an increasing proportion of AdCa over calendar time that was particularly evident in women and not modified by smoking (data not shown). Many reasons have been offered to explain the observed global rise of AdCa.5 Chest imaging and detection of peripheral pulmonary nodules have improved.18 Changes in the WHO classification together with an improved staining of mucin-producing cells resulted in a shift of cases from large cell carcinoma to AdCa (for review, see Ref.17). Also, modifications in the make-up of cigarettes and changes in the tar and nicotine content have to be considered.19 Furthermore, the introduction of cellulose acetate filters may have led to the inhalation of smaller particles that penetrate more deeply into distal airways.20
In our pooled studies of mainly European populations from countries with relatively high lung cancer mortality,21 SqCC was the predominant subtype of lung cancer among male smokers recruited between 1985 and 2005. By contrast, AdCa has become the leading subtype in the United States and Japan.22 Very few cases with SCLC and SqCC were observed in never smokers resulting in a large variation of the subtype-specific ORs of the individual studies. The overall ORs from the meta-analysis may underestimate the relative risks compared to the pooled risk estimates for these subtypes when excluding studies with lacking cases in the reference group. The highest relative risk estimates from the pooled analysis were about 100 for SCLC in both genders for nonquitting heavy smokers. Even higher relative risks were observed in a large US cohort study for SCLC in current female smokers and for SqCC in both men and women.16 The somewhat lower estimates of relative risks observed in the retrospective design may point to some misclassification of the smoking status or subtype due to relatively more cases with SqCC and SCLC among never smokers than in the prospective design.
Although many questions on cancer development are yet unanswered, progress has been made explaining the diverse ontogeny of lung cancer (for review, see Ref.23). Tobacco smoke can exert a wide range of irreversible changes in lung structure and function (for review, see Ref.24). The adult lung hosts a hierarchy of progenitor and stem cells (for review, see Ref.25). Greater damage requires the activation of quiescent subpopulations of cells with greater “stemness,” i.e., lower differentiation and higher potency for regeneration.26 These cells are more centrally located in anatomically distinct niches of the lung.27 The molecular profiles of SqCC and SCLC indicate a greater “stemness,” whereas expression signatures of AdCa have been associated with more differentiated stages.28, 29 AdCa are peripheral tumors that express proteins typical for the interaction of the lung with the environment. Mucins keep the peripheral epithelial layer hydrated and act as a filtration barrier. NKX2-1 is essential for the formation of alveolar cells of type 2 (AT2)30 and is frequently overexpressed in AdCa.31 AT2 pneumocytes maintain surface tension, among other functions, by surfactant production.32 These relatively differentiated cells were suggested to be facultative progenitor cells in tissue repair.11, 25 Precursor cells of AdCa may be related to aberrant cells from this lineage. The more distant localization of these putative progenitors may also explain the distant localization of AdCa.
Major damage needs remodeling of fundamental structures of the tissue architecture.33 There is growing evidence that signaling programs, which direct branching morphogenesis and axis determination in lung development, are recapitulated both in lung injury and cancer.34 Although these pathways are deactivated approaching later stages in epithelial differentiation during normal tissue repair, they stay persistent in cancer cells with a subtype-specific signature. For example, SCLC shows neuroendocrine features.3
Higher lung cancer relative risks were commonly found with earlier ages at initiation. This raises the question whether this is based on a higher susceptibility at young age or on the longer duration of smoking in the group of early beginners. In current smokers from the control populations, we observed that early beginners were heavier smokers than late beginners in terms of intensity. After adjustment for intensity, the reduction in relative risk with a later age at initiation persisted. Together with the vanishing trend across different ages at initiation observed within groups of a similar dose, our data suggest that smoking habits could explain this effect.
Although all metrics of exposure to tobacco smoke are associated with an increase in lung cancer relative risk, the summary evaluation of IARC revealed duration as the strongest determinant.1 Our analysis demonstrated high relative risks also for the average intensity in current smokers with a monotone increase that is in line with findings from recent cohort studies.16, 35 The daily dose acted synergistically with duration up to a certain number of cigarettes per day. We observed a plateauing of the relative risks by duration of smoking. A leveling-off of the relative risk in heavy smokers has also been demonstrated in a previous analysis.36 A possible reason may be competing mortality from other tobacco-related diseases. Besides potential saturation effects, dose-dependent inhalation habits, or other explanations, underreporting of the daily dose among cases cannot be ruled out.
When restricted to studies with population controls, we observed a monotone increase in the duration-response trend among current male smokers, whereas the dose-response trend even decreased in studies with hospital controls. Another possible explanation for the different trends could be that hospitalization of elderly smokers may be related to chronic diseases that are caused at least in part by an unhealthy lifestyle. This may not fully be captured in the exclusion criteria for hospital controls. It is important to note that current male smokers were more prevalent in hospital than in population controls.
Besides the higher relative risks observed for longer smoking duration in studies with population controls, we found only minor relative risk differences for smoking status or cessation between studies with population and hospital controls. Choice of source of controls remains a controversial issue in cancer case–control studies.37 Although population controls are often favored over hospital controls, their typically lower response rate is a methodological disadvantage.
Quitting reduces the relative risk of lung cancer. This has been demonstrated in many studies, although the time slope of the reduction remains an outstanding question.38 A drop in relative risk was already evident 2–5 years after smoking cessation, pointing to a potential promoting effect of tobacco smoke. This highlights the importance of smoking cessation in all age groups, including the elderly. The benefits of cessation were both short term and long term.1, 39 Smokers showed a large reduction in relative risk after quitting, but the risk in men and heavy female smokers did not fully return to baseline, in accordance with other observations.40 Tobacco control policies should better address women, who quit less frequently than men in populations from Europe and North America.16
A large fraction of lung cancer cases are attributable to tobacco smoking.1 In our study populations, and in the areas of data collection, the estimated pooled population-attributable fraction was 93.1% (95% CI: 90.1–96.2%) in men and 82.6% (95% CI: 79.6–86.0%) in women. The lower fraction in women is likely due to historic and national gender differences in smoking patterns. Lung cancer mortality shows an increasing trend for women in various European countries.21 Especially in young Europeans, lung cancer mortality is converging between men and women due to an increase in women's tobacco consumption.41 The issue of a differential susceptibility of males and females to tobacco carcinogenesis is controversial (e.g., Ref.42), and indeed there is no consensus about the meaning and methods of detection of interaction. We are undertaking a formal assessment of this issue in a separate report.
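For a single exposure with several levels and no further stratification, the estimator of Bruzzi et al. reduces to one minus the sum, over exposure levels, of the share of cases at that level divided by the corresponding relative risk. The Python sketch below illustrates this simplified form only; the adjusted fractions reported above additionally account for the covariates of the regression model. The function name and the example counts are hypothetical.

```python
from typing import Sequence


def attributable_fraction(case_counts: Sequence[float],
                          odds_ratios: Sequence[float]) -> float:
    """Population-attributable fraction from the case distribution.

    case_counts[j] is the number of cases at exposure level j and
    odds_ratios[j] the OR of that level relative to the reference
    level, which must be included with an OR of 1.
    """
    total_cases = sum(case_counts)
    # Each level contributes its share of cases, discounted by its OR.
    retained = sum(n / total_cases / odds_ratio
                   for n, odds_ratio in zip(case_counts, odds_ratios))
    return 1.0 - retained


# Hypothetical example: 20 never-smoking cases (reference, OR 1)
# and 80 smoking cases with an OR of 10.
paf = attributable_fraction([20, 80], [1.0, 10.0])
print(round(paf, 2))
```

Because the calculation relies on the exposure distribution among cases, a larger share of never smokers among female cases lowers the fraction even if the ORs were the same in both sexes.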
SCLC comprised a similar fraction of about 20% of the major subtypes in both men and women, which is in line with incidence data from a US cohort.16 However, SqCC was more common in men and AdCa in women. It is important to note that ERBB2 (HER-2), which can play a considerable role in breast and ovarian cancer, is more frequently overexpressed or mutated in AdCa.43 Besides gender-specific differences in smoking habits,44 differences in lung architecture and physiology have been described.45 Women exhibit a relatively higher surface area due to smaller alveoli that has been considered an evolutionary advantage in meeting the oxygen demand during pregnancy. Although various sex-related influences on lung cancer have been demonstrated,46 smoking clearly outweighs these factors.
Here, we present a consolidated assessment of the impact of smoking on relative risks of different subtypes of lung cancer. AdCa was the most common subtype among never smokers and among women. The greater the amount smoked, the greater the proportion of SqCC and SCLC relative to AdCa. Smoking cessation reduced the relative risk in the short term and long term, but risks among heavy former smokers never fully returned to baseline risks of nonsmokers. The risk pattern from this large collection of retrospective studies is in line with prospective studies and hypotheses from experimental studies on aberrant repair of lung tissue by level of exposure and extent of damage.