Transglutaminase 3 as a prognostic biomarker in esophageal cancer revealed by proteomics

To develop a prognostic biomarker for esophageal squamous cell carcinoma (ESCC), we examined the proteomic profile of ESCC using two‐dimensional difference gel electrophoresis (2D‐DIGE), and identified proteins associated with prognosis by mass spectrometry. The prognostic performance of the identified proteins was examined by immunohistochemistry in additional cases. We identified 22 protein spots whose intensity was statistically different between ESCC cases with good (N = 9; survived more than 5 years without evidence of recurrence) and poor (N = 24; died within 2 years postsurgery) prognosis, within the patient group that had two or more lymph node metastases. Mass spectrometric protein identification resulted in 18 distinct gene products from the 22 protein spots. Transglutaminase 3 (TGM3) was inversely correlated with shorter patient survival. The prognostic performance of TGM3 was further examined by immunohistochemistry in 76 ESCC cases. The 5‐year disease‐specific survival rate was 64.5% and 32.1% for patients with TGM3‐positive and TGM3‐negative tumors, respectively (p = 0.0033). Univariate and multivariate analyses revealed that TGM3 expression was an independent prognostic factor among the clinicopathologic variables examined. It is noteworthy that the prognostic value of TGM3 was shown to be higher than those of the lymph node metastasis, intramural metastasis and vascular invasion status. These results establish TGM3 as a novel prognostic biomarker for ESCC for the first time. Examination of TGM3 expression may provide novel therapeutic strategies to prevent recurrence of ESCC. © 2008 Wiley‐Liss, Inc.

Esophageal cancer is the 8th most common cancer 1 and the 6th leading cause of cancer death worldwide. 2 Despite the use of modern surgical techniques in combination with radio- and chemotherapy, early recurrence is common and the overall 5-year survival rate remains below 40%. [3][4][5] Although the use of adjuvant and neoadjuvant chemotherapies has improved the survival times of esophageal cancer patients, 6 these treatment modalities cause serious side effects in a large number of patients and only benefit a limited number of patients in terms of overall survival times. On the other hand, 45-52% of patients with resectable esophageal cancer treated with surgery alone survive for more than 5 years. 7,8 The patients who can be completely cured by surgery alone receive unnecessary and harmful combination therapy. The response to treatment such as surgery or chemo-radiotherapy is variable, even when the patients are at the same clinical stage, and is not predicted by the existing diagnostic modalities. Accurate risk stratification is therefore of paramount importance to either avoid potential morbidity due to over-treatment or prevent further progression of disease.
Global mRNA expression studies have identified the gene clusters associated with the progression of esophageal cancer, [9][10][11] suggesting that multiple gene and protein alterations are implicated. 12 These alterations can be considered as potential biomarkers for detecting cancer, determining prognosis, and monitoring disease progression or therapeutic response. However, none of them has been proven to be clinically useful, and the response to treatment such as surgery or chemo-radiotherapy is not predicted by the existing diagnostic modalities. Practical biomarkers to predict response to treatment have long been desired to optimize therapeutic strategies and improve clinical outcomes.
The genomic aberrations of cancer cells are transcribed into the transcriptome and translated into the proteome, which then determines cancer phenotypes. In this sense, the proteome is a functional translation of the genome that directly regulates tumor behavior. Proteomic features therefore reflect tumor characteristics more directly than genomic content does. Proteomic studies can generate unique data about the final products of genome information. Many lines of evidence have demonstrated discordance between mRNA and protein expression. [13][14][15] In addition, examining DNA sequences and measuring mRNA expression do not accurately predict the status of post-translational modifications such as phosphorylation and glycosylation, which play a key role in regulating the malignant behavior of cancer cells. Taken together, the proteome can be a rich source for biomarker identification.
In this study, we performed a proteomic study to identify biomarkers to predict the clinical outcome of esophageal squamous cell carcinoma (ESCC) patients. We used laser microdissection to recover tumor cells and neighboring normal epithelial cells from surgical specimens of esophageal cancer cases, and subjected the recovered cells to proteomic analysis using two-dimensional difference gel electrophoresis (2D-DIGE). We took particular note of postoperative prognosis in advanced esophageal cancer treated with surgery alone, and discovered prognostic biomarker candidates to optimize the existing surgical treatment strategy. As a result, transglutaminase 3 (TGM3) was identified as a prognostic biomarker candidate. The prognostic performance of TGM3 was successfully validated by immunohistochemistry in 76 additional ESCC cases. This is the first report concerning the prognostic value of TGM3 expression in ESCC. By measuring TGM3 expression in primary tumors, we will be able to refine the prognostic protocol and optimize current therapeutic strategies.

Patients and clinical information
We examined primary tumor tissues from 82 ESCC patients who underwent surgery at the National Cancer Center Hospital consecutively from 1998 to 2002. All patients underwent curative resection, and were not treated with chemo- or radiotherapy. The patients were newly diagnosed with thoracic ESCC and were followed up for at least 5 years after surgery. The overall clinicopathological data of the cases are summarized in Table I, while information on the individual cases is available in Supplemental Table S1. Two or three tissue fragments, less than 10 mm³ in volume, were grossly obtained from primary tumors. Matched normal mucosal tissues located at least 5 cm away from the tumor margins were also included in this study. The resected tissues were snap-frozen in liquid nitrogen and stored at -80°C until use. The recovered specimens were histologically examined and the clinicopathological stage was determined according to the International Union Against Cancer tumor-node-metastasis (TNM) classification. 16 All cases were classified as T3N0-1M0. This study was approved by the ethics committee of the National Cancer Center and written informed consent was obtained from the patients.
The patients who survived more than 5 years without evidence of recurrence were categorized in the good prognosis group (N = 39), while the patients who died within 2 years postsurgery were categorized in the poor prognosis group (N = 28). The proteomic profiles of these two sample groups were compared.
We performed immunohistochemistry on 76 cases, which included 14 cases that were not categorized in either group. The clinicopathological data of the 76 esophageal cancer cases are shown in Table II.

Laser microdissection
Specific cell populations were recovered by laser microdissection according to our previous reports 17,18 (Fig. 1a). A microdissected area of 1 mm², recorded during microdissection, was recovered from hematoxylin-stained tissues for each 2D-DIGE gel. As tumor tissues could not be recovered in 9 cases (Supplemental Table S1) because of poor preservation, we finally examined 58 tumor tissues and 53 normal epithelium tissues.

2D-DIGE and image analysis
2D-DIGE was performed as previously described. 17,18 In brief, a common internal control sample was created by mixing a small portion of all protein samples used in this study, and was labeled with Cy3 fluorescent dye (CyDye DIGE Fluor saturation dye, GE Healthcare Biosciences, Uppsala, Sweden). Individual samples were labeled with Cy5 fluorescent dye (CyDye DIGE Fluor saturation dye, GE Healthcare Biosciences). These differently labeled protein samples were mixed together and separated by two-dimensional gel electrophoresis (2D-PAGE) according to their isoelectric point and molecular weight. The first dimension separation was achieved using a 24 cm-length immobiline gel (IPG, pI 4-7, GE Healthcare Biosciences) and Multiphor II (GE Healthcare Biosciences), while the second dimension separation was achieved using a home-made gradient gel with GiantGelRunner (Biocraft, Tokyo, Japan), with a separation distance of 36 cm. The gels were scanned using a laser scanner (Typhoon Trio, GE Healthcare Biosciences) at the appropriate wavelength for Cy3 or Cy5. For all protein spots, the Cy5 intensity was normalized with the Cy3 intensity in the same gel using the Progenesis SameSpots software version 3 (Nonlinear Dynamics, Newcastle, UK), so that gel-to-gel variations were canceled out (Fig. 1b). We monitored the system reproducibility by running the same sample twice (case 15; Supplemental Table S1). The scatter plot showed that the intensity values of 95% of protein spots were scattered within a 2-fold difference, and that the correlation coefficient was 0.8352, demonstrating the high reproducibility of our profiling method (Fig. 1c). The spot intensity data were exported from the Progenesis SameSpots software as Excel files, amenable to data analysis.
Table I (fragment): good prognosis group (survived more than 5 years without evidence of recurrence), 34; poor prognosis group (died within 2 years postsurgery), 24; p < 0.05 was considered to be significant.
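The internal-standard normalization and the duplicate-run check described above can be illustrated with a minimal sketch. It is not the Progenesis SameSpots implementation: the function names are hypothetical, the normalization is assumed to be a simple per-spot Cy5/Cy3 ratio, and the correlation is assumed to be a Pearson coefficient computed on the normalized values.

```python
import numpy as np

def normalize_to_internal_standard(cy5, cy3):
    """Per-spot ratio of the sample (Cy5) to the internal control (Cy3) in the same gel.
    Assumption: a plain ratio stands in for the software's normalization."""
    return np.asarray(cy5, dtype=float) / np.asarray(cy3, dtype=float)

def reproducibility(run_a, run_b):
    """Compare two runs of the same sample.
    Returns (fraction of spots within a 2-fold difference, Pearson correlation)."""
    a = np.asarray(run_a, dtype=float)
    b = np.asarray(run_b, dtype=float)
    # A 2-fold difference corresponds to |log2(a / b)| <= 1.
    within_two_fold = np.mean(np.abs(np.log2(a / b)) <= 1.0)
    r = np.corrcoef(a, b)[0, 1]
    return float(within_two_fold), float(r)
```

Applied to two normalized runs of one sample, the first returned value corresponds to the proportion of spots within a 2-fold difference and the second to the correlation coefficient reported for the scatter plot.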

Data analysis
As a preprocessing step, raw intensity data for each experiment were log2 transformed and then Z score transformation was applied to standardize the distribution of the intensity data. 19 Hierarchical clustering was performed on the standardized data with the Euclidean distance and the unweighted pair group method using the arithmetic average (UPGMA) to reveal the global features of the proteomic profiles acquired. To identify the spots that differed in intensity between the 2 groups, the z-test was used for each spot. As the obtained p-value list possibly included false positive results due to multiple testing, we estimated the false discovery rate (FDR) following the Benjamini-Hochberg procedure according to the previous report. 20 We subsequently selected the spots for which the FDR was less than 0.05. For the comparison of normal tissues with tumors, we chose the spots for which the intensity ratio of the group means was at least 4, in addition to meeting the aforementioned FDR criteria.
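The preprocessing and spot selection steps can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, the data are assumed to be arranged with spots as rows and samples as columns, the Z score is assumed to be taken within each sample, and the z-test is assumed to be a two-sided, unpooled two-sample test.

```python
import math
import numpy as np

def standardize(raw):
    """log2-transform raw intensities, then z-score each sample (column)."""
    logged = np.log2(np.asarray(raw, dtype=float))
    return (logged - logged.mean(axis=0)) / logged.std(axis=0, ddof=1)

def z_test_pvalues(group_a, group_b):
    """Two-sided two-sample z-test for each spot (row)."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1] + b.var(axis=1, ddof=1) / b.shape[1])
    z = (a.mean(axis=1) - b.mean(axis=1)) / se
    # Two-sided tail probability of the standard normal distribution.
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (FDR estimates), in the input order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity from the largest p-value downward, then cap at 1.
    adjusted = np.minimum(np.minimum.accumulate(scaled[::-1])[::-1], 1.0)
    out = np.empty(m)
    out[order] = adjusted
    return out
```

Spots would then be selected where the adjusted value is below 0.05, for example with `benjamini_hochberg(pvals) < 0.05`.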

Mass spectrometric protein identification
The proteins corresponding to the protein spots detected were identified by mass spectrometry according to our previous report. 21 Cy5-labeled proteins separated by 2D-PAGE were recovered in gel plugs and digested with modified trypsin (Promega, Madison, WI). The trypsin digests were subjected to liquid chromatography coupled with tandem mass spectrometry, using a Finnigan LTQ linear ion trap mass spectrometer (Thermo Electron, San Jose, CA) equipped with a nano-electrospray ion source (AMR, Tokyo, Japan). The Mascot software (version 2.1, Matrix Science, London, UK) was used to search for the mass of the peptide ion peaks against the SWISS-PROT database (Homo sapiens, 16,529 sequences in Sprot_52.5 fasta file). Proteins with a Mascot score of 34 or more were subjected to protein identification. When multiple proteins were identified in a single spot, the proteins with the highest number of peptides were considered as those corresponding to the spot.

Pathway analysis of expression data
Pathway analysis of the protein expression pattern was performed using the MetaCore software (GeneGo, St. Joseph, MI). MetaCore identifies networks based on a manually curated database containing known molecular interactions, functions, and disease interrelationships using proteome data sets. The pathways are identified by the probability that a random set of proteins the same size as the input list would give rise to a particular mapping by chance.
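The chance probability described above can be illustrated with a hypergeometric tail calculation. This is only a sketch of the general idea: the hypergeometric model, the function name, and all parameter names are assumptions made for illustration, and the actual MetaCore scoring may differ.

```python
from math import comb

def hypergeom_enrichment_p(population, pathway_size, list_size, overlap):
    """Probability that a random list of `list_size` proteins drawn from
    `population` proteins shares at least `overlap` members with a pathway
    containing `pathway_size` proteins."""
    upper = min(pathway_size, list_size)
    total = comb(population, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

A small value indicates that the observed overlap between the input list and the pathway is unlikely to arise from a random protein set of the same size.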

Immunohistochemistry and tissue microarray
Immunohistochemical staining for TGM3 was performed on methanol-fixed, paraffin-embedded tissue sections from 76 cases (Supplemental Table S1) using the Dako REAL EnVision Detection System (DAKO, Glostrup, Denmark) following the manufacturer's instructions. The sections were deparaffinized, dehydrated and blocked with 3 mL/L H₂O₂ in methanol for 30 min to remove endogenous peroxidase activity. The sections were autoclaved in 10 mM citrate buffer (pH 6.0) at 121°C for 10 min. The primary antibody used was a rabbit polyclonal, mono-specific antibody against TGM3 (HPA004728; Atlas Antibodies, Stockholm, Sweden) at a dilution of 1:100. One pathologist (Y. N.) and one medical doctor (N. U.) reviewed the sections stained with anti-TGM3 antibody while blinded to the clinical data. The normal esophageal epithelium served as an internal positive control. Cases in which more than 10% of tumor cells were positively stained with anti-TGM3 antibody were considered TGM3 positive, while cases with less than 10% TGM3-positive tumor cells were considered TGM3 negative. Staining was evaluated at the dominant differentiation area of the tumor if considerable tumor heterogeneity was present.
We examined TGM3 expression using our home-made tissue microarray containing 59 normal tissues and 323 tumor tissues (Supplemental Table S2).

Statistical analysis
The correlation between TGM3 expression and clinicopathological features was evaluated using Fisher's exact test for categorical variables and the Mann-Whitney U test for continuous variables. The disease-specific survival time was calculated from the first resection of the primary tumor to death from disease-specific causes. All time-to-event end points were computed by the Kaplan-Meier method. 22 Potential prognostic factors were identified by univariate analysis using the log-rank test. Independent prognostic factors were evaluated using Cox's proportional hazards regression model. p values of <0.05 were considered to be significant. Statistical analyses were performed using the SPSS 11.0 statistical package (SPSS, Chicago, IL).
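The Kaplan-Meier product-limit estimate used for the time-to-event end points can be sketched as below. The analyses in this study were run in SPSS; this standalone function, its name, and its input encoding (1 for a disease-specific death, 0 for a censored observation) are illustrative assumptions.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up durations (e.g. months from resection)
    events: 1 = disease-specific death, 0 = censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve
```

Running the function separately on the TGM3-positive and TGM3-negative cases and reading off the estimate at 5 years would give the kind of survival rates compared here; the log-rank test and Cox regression are not covered by this sketch.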

Results
2D-DIGE generated quantitative expression profiles that included 3,623 protein spots per sample. Based on the overall similarity of the acquired protein expression profiles, the samples were divided into 2 groups: tumor tissues and normal epithelial tissues (Supplemental Fig. S1); that is, the proteomic profiles reflected the tissue origin of the sample. Considerable differences were observed between the proteomic profile of tumors and normal tissues; we found 200 protein spots that matched the criteria of an FDR < 0.001 and a fold difference >4 between the tissue groups. The intensity of 33 of these spots indicated increased protein expression levels while the remaining 167 spots indicated decreased expression in tumor tissues. All proteins corresponding to these 200 protein spots were identified (Supplemental Table S3).
The samples were not grouped according to the prognosis group to which they belonged based on their overall protein expression features. Similarly, no protein spots with significantly different intensity between these two groups were observed. However, the gender, the number of lymph node metastases, the lymphatic and vascular invasion status, and the intramural metastasis status were significantly different between the patient groups with different prognosis (Table I).
The number of lymph node metastases is one of the major prognostic factors in esophageal cancer. 23 We therefore restricted the comparison of the good and poor prognosis groups to the patients with two or more lymph node metastases, and found 22 protein spots with significantly different intensity between the two groups (FDR < 0.05). The localization of the 22 spots on the two-dimensional image is shown in Figure 2a (enlarged image in Supplemental Fig. S2). Mass spectrometric protein identification revealed that the 22 protein spots corresponded to 18 distinct gene products (Table III, Fig. 2b and Supplemental Table S4). Pathway analysis using the MetaCore software showed that 17 of the 18 identified proteins were part of a network (Fig. 2c) in which STAT1, p53 and HNF4 seemed to be key proteins. TGM3 was connected to STAT1 through Sp1, which directly regulates TGM3 expression 24 and is an intermediary of p53, which is known to be a prognostic factor of several malignancies including esophageal cancer. 25 TGM3 spots appeared 3 times in the list of the 22 protein spots, with consistently lower intensity in the poor prognosis group.
To further validate the prognostic value of TGM3 expression in ESCC, we examined the expression of TGM3 in 76 ESCC cases using immunohistochemistry. Both cytoplasmic and nuclear TGM3 staining were observed, depending on the case (Fig. 3a, enlarged image in Supplemental Fig. S3), although only the former has been reported previously 26,27 and was considered as indicating positive staining in this study.
The 5-year disease-specific survival rate was significantly higher in the 48 TGM3-positive cases than in the 28 TGM3-negative cases (64.5% versus 32.1%; p = 0.0033; Fig. 3b, Table II). Multivariate analysis revealed that TGM3 expression was an independent predictor of disease-specific survival (Table II). The immunohistochemical expression of TGM3 did not correlate with any other clinicopathological variables (Table II).
In tissue microarray analysis, TGM3 was shown to be expressed in all normal squamous epithelia and squamous cell carcinomas examined, including those arising in the skin, lung, oral cavity and uterus. In addition, TGM3 was expressed in a few cases of non-squamous epithelia and non-squamous cell carcinomas, including those arising in the breast, prostate and thyroid gland (Fig. 3c).

Discussion
A variety of treatments is currently available for esophageal cancer. 6 The choice of treatment is crucial, as the response is diverse between the patients, even when they are diagnosed at the same clinical stage. 6 Treatment-related complications may easily lead to serious and occasionally fatal adverse reactions, such as myocardial infarction, heart failure, and pneumonia. 28,29 Therefore, by predicting response to treatment and optimizing individualized therapy, we will be able to improve the clinical outcome of esophageal cancer patients.
Novel prognostic modalities have long been desired to improve the management of ESCC. Global genomic and transcriptomic expression studies have been conducted to detect prognostic molecular biomarkers for ESCC. 30,31 However, these studies did not result in the identification of novel practical biomarkers, because they used too many molecules to predict clinical outcome, and did not perform sufficient verification experiments in clinical-scale sample sets using practical methods such as immunohistochemistry. Although proteomics has much potential to reveal the molecular background of esophageal cancer, this is the first report to employ proteomics to identify prognostic biomarkers in esophageal cancer, and to successfully establish TGM3 as a single prognostic biomarker.
The prognosis of esophageal cancer patients may be affected by various factors, including those used in TNM classification and the presence of intramural metastasis and vascular invasion. 29 Therefore, the molecular background of tumors from patients with similar prognosis could vary even when they have the same T and M stage, as in this study. Indeed, we did not identify any protein spots with different intensity between the good and poor prognosis groups. We assumed that this was probably due to the high heterogeneity of the molecular background of the tumors. With this notion, we subsequently focused our analysis on comparing the proteomic profiles of patients with different survival periods within the patient group that had two or more lymph node metastases at the time of pathological diagnosis. This analysis is clinically significant because this patient group generally has poor prognosis and is in more need of the development of suitable prognostic biomarkers. As a consequence, we successfully identified 22 protein spots that had different intensity between the aforementioned patient subgroups. These observations suggest that following a strategy that is based on the use of such clinically relevant parameters is an effective way to identify the proteins that correlate with the malignant potential of tumors.
We identified 18 proteins of prognostic value in the primary tumors. Network analysis revealed that 17 of them were linked through the STAT1, p53 and HNF4 transcription factors, all of which are aberrantly regulated in esophageal cancer. Suppression of the EGF-STAT1 pathway leads to progression of esophageal cancer. 32 p53 immunoreactivity has been detected in 34-67% of ESCC cases, [33][34][35][36] and is significantly correlated with cancer-specific death. 25 HNF4alpha expression significantly correlates with MUC4 expression, 37 which is a mediator of tumor growth and metastasis by acting as a ligand for the ErbB2 tyrosine kinase receptor. [38][39][40] These observations suggest that a limited number of transcription factors may affect a large number of genes, resulting in poor prognosis for esophageal cancer.
We considered that TGM3 was a strong prognostic biomarker candidate, because its immunohistochemical expression clearly correlated with clinical outcome in our study and it has also been shown to be potentially relevant to ESCC 27 and to correlate with lymph node metastasis of oral squamous cell carcinoma. 41 Although Liu et al. reported that TGM3 expression correlated with histological grade in ESCC, 27 we did not observe any correlation between TGM3 expression and the clinicopathological parameters examined, including the tumor stage. This discordance is probably due to the different antibody used, the different surgical procedure followed, 42 and the different clinical background of the patients included in the studies. A multi-institutional validation study that will take into account such differences will be required to firmly establish TGM3 as a practical prognostic biomarker. In oligomicroarray analysis in ESCC, the TGM3 gene was reported to be suppressed and to correlate with lymph node metastasis. 31 Although these reports suggested that TGM3 may be involved in cancer progression, they have not shown the prognostic or other practical value of TGM3 expression examination in ESCC. Genome and transcriptome studies have listed many biomarker candidates including TGM3, 31 without, however, detecting or proposing single biomarkers of potential practical use. In this study, we detected the proteins associated with the survival of ESCC patients using a proteomic approach, and subsequently selected TGM3 as an individual prognostic biomarker candidate. We then showed that the positive or negative immunohistochemical expression of TGM3 in our study corresponds to TGM3 expression as assessed by the proteomics tools we employed, and thus can be used to assess the expression level of TGM3 (Fig. 3a) in practice.
Table footnote (fragment): Spot numbers refer to those in Figure 1A.
TGM3 plays a key role in epidermal terminal differentiation through cross-linking structural proteins such as involucrin, loricrin and small proline-rich proteins. 43 Although the role of TGM3 has been well established in the differentiation of skin keratinocytes, 44 little information is available concerning its involvement in esophageal epithelium. TGM3 stabilizes the cornified envelope of cells, a process that precedes the transition of keratinocytes to corneocytes by apoptosis. Therefore, down-regulation of TGM3 in ESCC may interrupt the potentially critical initiation of apoptosis, thereby favoring tumor cell survival.
TGM3 expression was decreased in ESCC tissues compared with normal tissues; all normal tissues strongly expressed TGM3 compared to only 63% of the ESCC tissues. This observation is consistent with previous microarray studies 45,46 that indicated that TGM3 is down-regulated in many types of malignancies compared with the corresponding normal tissues, suggesting that the reduced expression of TGM3 may play a common role in the carcinogenesis of not only ESCC but also other carcinomas.
In conclusion, we performed the first esophageal cancer proteomics study that uses a large-scale clinical sample set that includes prognostic information, and identified TGM3 expression as a novel prognostic indicator in ESCC. The use of a common internal control sample in 2D-DIGE, and the use of laser microdissection contributed to accurate protein expression profiling. The immunohistochemical examination of TGM3 expression may help identify patients with high risk for recurrence, and may improve the clinical outcome of these patients through closer postoperative follow-up and additional treatment. Our results therefore provide the possibility for the development of novel strategies for ESCC management.