Background: Global gene expression analysis is proving to be an important means of assessing human tumors and may identify key components of carcinogenesis or clinical prognosis. This technique has been successfully applied to head and neck squamous cell carcinoma (HNSCC) and thyroid carcinomas; however, little has been done to evaluate premalignant states.
Methods: Human buccal mucosal cells were sampled from smokers and nonsmokers using a noninvasive brush technique. The method was validated by assessing the quantity and quality of the RNA obtained. The purified RNA was then assayed on cDNA microarrays containing 27,323 cDNA clones to look for differences in buccal mucosal gene expression patterns between the two groups. Using unsupervised and supervised hierarchical clustering methods, we developed a gene expression signature from an initial training set of smokers and nonsmokers and then used it to predict smoking status in a subsequent test set of subjects. Selected genes were then cross-referenced with gene sets that our group had previously identified and published in HNSCC.
Results: Nineteen subjects (9 smokers and 10 nonsmokers) were included in this pilot analysis. Smoking exposure among the smokers ranged from 1 to 60 pack-years. RNA purified from buccal mucosal brushings demonstrated a high degree of similarity in gene expression profiles among independent samples. Using supervised clustering techniques, we identified 113 genes whose expression differed significantly between samples from smokers and nonsmokers (t test, P < .001). This expression signature correctly identified the smokers in the second set of subjects, with the exception of one subject who had a minimal tobacco history and clustered with the nonsmokers. By cross-referencing these data with those from HNSCC, we identified a tumor suppressor gene involved in the c-myc pathway (Mxi1) that was under-expressed both in smokers and in cancer patients with progressive disease.
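The gene selection and prediction steps described above can be illustrated with a minimal sketch. It is not the pipeline used in the study. It assumes a pooled-variance two-sample t statistic per gene, and it swaps in a simple nearest-centroid rule for the hierarchical clustering used for prediction. All function names, the toy data, and the cutoff `t_crit=5.0` are hypothetical. In practice the cutoff would be derived from the t distribution for the actual degrees of freedom at P < .001.

```python
import math
from statistics import mean, variance


def t_statistic(a, b):
    """Two-sample Student t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled == 0:
        return 0.0  # no spread in either group: treat the gene as uninformative
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def select_genes(smokers, nonsmokers, t_crit):
    """Return indices of genes whose |t| exceeds t_crit.

    smokers / nonsmokers: lists of samples, each a list of per-gene values.
    """
    selected = []
    for g in range(len(smokers[0])):
        a = [sample[g] for sample in smokers]
        b = [sample[g] for sample in nonsmokers]
        if abs(t_statistic(a, b)) > t_crit:
            selected.append(g)
    return selected


def predict(sample, smokers, nonsmokers, genes):
    """Assign a sample to the closer group centroid over the selected genes.

    Nearest-centroid is an assumption made for this sketch only.
    """
    def sq_dist(group):
        return sum((sample[g] - mean(s[g] for s in group)) ** 2 for g in genes)

    return "smoker" if sq_dist(smokers) < sq_dist(nonsmokers) else "nonsmoker"


# Toy data (hypothetical): 6 genes; genes 0-1 are under-expressed and
# gene 2 is over-expressed in smokers; genes 3-5 do not differ.
SHIFT = [-1.0, -1.0, 1.0, 0.0, 0.0, 0.0]


def noise(i, g):
    # Small deterministic perturbation so the example is reproducible.
    return ((i * 7 + g * 3) % 5 - 2) * 0.1


def make_sample(i, smoker):
    return [(SHIFT[g] if smoker else 0.0) + noise(i, g) for g in range(len(SHIFT))]


train_smokers = [make_sample(i, True) for i in range(5)]
train_nonsmokers = [make_sample(i, False) for i in range(5)]

signature = select_genes(train_smokers, train_nonsmokers, t_crit=5.0)
print("signature genes:", signature)
print("test subject A:", predict(make_sample(5, True), train_smokers, train_nonsmokers, signature))
print("test subject B:", predict(make_sample(6, False), train_smokers, train_nonsmokers, signature))
```

The sketch selects genes on the training set only and then applies the fixed signature to new samples, mirroring the training and test design described in the Methods.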
Conclusions: Although the sample size in this preliminary dataset was small, our analysis revealed several groups of genes that were either over- or under-expressed in smokers and that could be used to predict smoking exposure. Many of these genes are of possible interest as early molecular markers of head and neck carcinogenesis.