N-glycoprotein profiling of lung adenocarcinoma pleural effusions by shotgun proteomics†
Article first published online: 7 MAR 2008
Copyright © 2008 American Cancer Society
Volume 114, Issue 2, pages 124–133, 25 April 2008
How to Cite
Soltermann, A., Ossola, R., Kilgus-Hawelski, S., von Eckardstein, A., Suter, T., Aebersold, R. and Moch, H. (2008), N-glycoprotein profiling of lung adenocarcinoma pleural effusions by shotgun proteomics. Cancer, 114: 124–133. doi: 10.1002/cncr.23349
The authors declare to have no competing financial interests.
- Issue published online: 11 APR 2008
- Article first published online: 7 MAR 2008
- Manuscript Accepted: 13 DEC 2007
- Manuscript Revised: 21 NOV 2007
- Manuscript Received: 24 SEP 2007
- Helmut Horton Foundation
- Ludwig Institute for Cancer Research
- Swiss National Science Foundation and the National Heart, Lung, and Blood Institute
- National Institutes of Health
- mass spectrometry
- pleural effusion
Malignant pleural effusion of advanced lung adenocarcinoma may be a valid source for detection of biomarkers, such as N-glycosylated proteins (N-GP), because tumor cells grow in this fluid for weeks. The authors aimed to create N-GP effusion profiles from routine cytology specimens to detect relevant biomarkers.
One hundred microliters of malignant pleural effusion from each of 5 patients with lung adenocarcinoma and 5 nonmalignant controls were used for triplicate N-GP capture by solid-phase extraction. After trypsin digestion and PNGase F release, the peptides were analyzed by liquid chromatography coupled online to tandem mass spectrometry (LC/MS/MS).
In the total of 10 samples, 170 and 278 nonredundant proteins were detected with probabilities of ≥.9 and ≥.5, respectively. The specificity for the N-glycomotif was 88% at P ≥ .9. Penetration into the moderate to low protein concentration range (μg-ng/mL) occurred, and several proteins associated with tumor progression or metastasis were identified, including CA-125, CD44, CD166, lysosome-associated membrane glycoprotein 2 (LAMP-2), multimerin 2, and periostin. MS identifications were correlated with the corresponding immunoreactivity in either effusion fluid or tumor tissue.
In conclusion, reduction of sample complexity by N-GP capturing allows detection of proteins in the μg to ng/mL range. Pleural effusion is a useful source for biomarker research in lung cancer. Cancer (Cancer Cytopathol) 2008. © 2008 American Cancer Society.
Proteomics promises significant advancement for disease diagnosis and therapeutic monitoring through identification of novel biomarkers by using high-throughput technologies, such as high-pressure liquid chromatography combined with mass spectrometry.1 The most widely studied body fluid is human blood plasma, and it is assumed that changes in plasma will provide direct information on physiological and metabolic states of disease and drug response. However, the characterization of the blood plasma proteome is analytically challenging because of the top-down problem, meaning that 99% of the plasma protein mass is distributed across only 22 proteins.2 In addition, this fluid is constantly regulated and metabolized through kidney clearance and liver passage, mechanisms that may affect protein marker concentration. Furthermore, the cellular leakage proteins may originate from virtually any cell or tissue type in the body.
To overcome the top-down problem, several methods for the creation of subproteomes have been described.3 For clinical purposes, the recently developed N-GP capturing protocol is particularly interesting, because N-GP containing the N-X–S/T motif are secreted or detached from the plasma membrane into body fluids, and well-known biomarkers belong to this family, eg, the prostate-specific antigen.4–6 This protocol removes the major plasma protein albumin, which is not N-glycosylated, and thereby reduces the complexity of mass spectrometric analyses. Other high-abundance N-GPs in the g/L concentration range, however, are copurified.
We considered malignant pleural effusions to be valid sources for protein biomarker identification for lung adenocarcinoma, with the rationale that prolonged cancer cell growth in the closed pleural cavity will increase concentration of secreted proteins. This protein enrichment over weeks, because of only partial interchange with the plasma, may allow mass spectrometric detection despite the top-down problem.
Therefore, the first aim of the study was to test whether known and/or novel potential N-glycoprotein biomarkers could be detected in routinely processed pleural effusions of advanced lung adenocarcinoma by shotgun mass spectrometry. The second aim was to validate protein identifications by clinical chemistry and immunocytochemistry.
MATERIALS AND METHODS
Five patients with malignant pleural effusions from advanced lung adenocarcinoma were analyzed. Because tumor cells may not appear until the second or third effusion drainage, the first liquid may be negative. We therefore chose as controls 5 patients who remained free of tumor during a clinical course of 1 year. Hemolytic effusions and empyema were ruled out. Because patients' histories, drug regimens, and total protein concentrations differed, an equal-volume approach, as used in routine clinical chemistry, was considered most appropriate. The study was reviewed and approved by the ethical committee of the Zurich University Hospital.
Effusion liquids were centrifuged at 2000 × g for 10 minutes at room temperature. The cell-free supernatant was aliquoted and frozen at −80°C. The time delay from thoracic puncture to freezing was in the range of 1 to 4 hours, allowing complete clotting. No protease inhibitors were added because pleural effusion is a plasma equivalent, containing all its constitutive protease inhibitors. The upper white phase of the sediment was used to prepare 3 Papanicolaou-stained smears as well as a formalin-fixed paraffin-embedded cell block by adding plasma (4 droplets) and thrombin (1 droplet). Tumor cells, mesothelial cells, lymphocytes, and granulocytes were counted by a semiquantitative score of 0 to 3 on 10 high-power fields.
Clinical Chemistry and Immunohistochemistry
One milliliter of the supernatant was subjected to clinical chemistry analyses. The following parameters were assessed according to Light's criteria7, 8: Total protein (g/L, biuret method), LDH (U/L, UV test), glucose (mmol/L, glucose oxidase test), and cholesterol (mmol/L) were measured by using tests from Roche diagnostics on either a Cobas Integra 800 or a Cobas Modular analyser (Roche Diagnostics, Basel, Switzerland). CA-125 (kU/L) and CEA (μg/L) were measured by using the Kryptor analyser and time-resolved amplified cryptate emission immunoassays (BRAHMS, Hennigsdorf, Germany). Human interferon gamma (IFN-g) was measured by a FACS-based bead monoplex assay (Luminex, Austin, Tex). Immunohistochemistry with antibodies against TTF1 (1:50, Dako, Glostrup, Denmark), CK7 (1:100, Dako), periostin (1:200, Biovendor, Modrice, Czech Republic), CD44 (1:100, BD Biosciences Pharmingen, Franklin Lakes, NJ), and CD166 (1:50, Novocastra Laboratories, Newcastle upon Tyne, UK) was performed on the cell blocks and tumor tissues by means of a Ventana automat (Ventana Medical Systems, Tucson, Ariz).
The lung adenocarcinoma cell line A549 was cultured in RPMI 1640 Medium (Sigma Aldrich, St. Louis, Mo) supplemented with 10% heat-inactivated fetal bovine serum. At 70% confluence, cells were washed 3 times with 1× cold phosphate-buffered saline (PBS) and serum-starved for 1 week. The 10 mL of supernatant was removed and concentrated by ultrafiltration (Vivaspin 15R Hydrosart with 5 kDa cutoff; Vivascience, Aubagne, France) to a final protein concentration of 6 mg/mL. Of this concentrate, 100 μL was used for capturing the N-GP.
The enrichment of formerly N-linked glycopeptides was performed in triplicate as described.5, 6 In brief, 100 μL of cell-free effusion liquid, serum, or cell culture supernatant was desalted by using a 96-well G10 gel-filtration plate (Harvard Apparatus, Holliston, Mass) and coupling buffer (100 mM sodium acetate, 150 mM NaCl; pH 5.5). Glycoproteins were oxidized by adding 15 mM sodium periodate (Affi-Gel oxidizer, Bio-Rad Laboratories, Hercules, Calif) for 1 hour in the dark and conjugated to hydrazide resin (Affi-Gel Hz Hydrazide support, Bio-Rad) for 16 hours. The covalent binding was stabilized by reduction of the hydrazone C=N bonds to C–N bonds (330 mM sodium cyanoborohydride, 500 mM Tris-HCl, pH 8.3) for 1 hour. Nonglycoproteins were removed by washing the resin with denaturing urea buffer (8 M urea, 0.05% SDS, 5 mM EDTA, and 200 mM Tris-HCl; pH 8.3). The captured proteins were reduced by adding 8 mM TCEP (Pierce, Thermo Fisher Scientific, Waltham, Mass) for 30 minutes and alkylated by adding 10 mM iodoacetamide for 30 minutes. Digestion was performed at 37°C overnight by using sequencing grade-modified trypsin (Promega, Madison, Wis) at a concentration of 20 μg of trypsin per sample in 1:4 diluted urea buffer. The trypsin-released peptides were removed by washing the resin 5 times each with 1.5 M NaCl, 80% (volume-to-volume ratio) acetonitrile, 100% (volume-to-volume ratio) methanol, and 0.1 M NH4HCO3. N-linked glycopeptides were released from the resin by addition of PNGase F (New England Biolabs, Ipswich, Mass) at a concentration of 0.3 μL PNGase F per 100 μL sample for 16 hours at 37°C. The supernatant was collected and combined with 2 resin washes with 80% (volume-to-volume ratio) acetonitrile. The peptides were finally lyophilized under vacuum and stored at −30°C.
Reversed-phase Capillary Liquid Chromatography/Tandem Mass Spectrometry (LC/MS/MS) Analyses
One μL of sample was analyzed using an HPLC system coupled online to a linear ion trap mass spectrometer (LTQ, Thermo Fisher Scientific, Waltham, Mass) through an electrospray ionization (ESI) interface. The reversed-phase capillary column was prepared by slurry packing 3 μm C18-bonded particles into a 10-cm-long fused silica capillary with an inner diameter of 75 μm. The mobile phase consisted of solvent A (0.1% formic acid in water) and solvent B (2% water, 0.1% formic acid in acetonitrile). Elution was performed by increasing the mobile-phase composition from 5% to 45% B over 75 minutes, followed by a washing step at 100% B for 15 minutes. The mass spectrometer was operated in a data-dependent tandem mass spectrometry (MS/MS) mode (mass-to-charge ratio [m/z] 400–1600), in which a full scan was followed by MS/MS scans of the 3 most intense precursor ions. A dynamic exclusion of 1 minute was applied to already sequenced ions.
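The acquisition logic described above (full scan, then MS/MS of the 3 most intense precursors, with a 1-minute dynamic exclusion) can be illustrated with a minimal sketch. The function name `select_precursors`, the toy scan data, and the exact-match handling of m/z values are assumptions made for illustration; real instrument firmware matches precursors within a mass tolerance and is not reproduced here.

```python
from typing import Dict, List, Tuple

def select_precursors(
    scans: List[Tuple[float, Dict[float, float]]],
    top_n: int = 3,
    exclusion_min: float = 1.0,
    mz_range: Tuple[float, float] = (400.0, 1600.0),
) -> List[Tuple[float, List[float]]]:
    """For each full scan (time in minutes, {m/z: intensity}), pick the
    top_n most intense precursors that lie inside mz_range and are not
    currently under dynamic exclusion."""
    excluded_until: Dict[float, float] = {}  # m/z -> time when exclusion ends
    picks = []
    for time_min, spectrum in scans:
        candidates = [
            (intensity, mz)
            for mz, intensity in spectrum.items()
            if mz_range[0] <= mz <= mz_range[1]
            and excluded_until.get(mz, float("-inf")) <= time_min
        ]
        # Sort by intensity, highest first, and keep the top_n precursors.
        chosen = [mz for _, mz in sorted(candidates, reverse=True)[:top_n]]
        for mz in chosen:
            excluded_until[mz] = time_min + exclusion_min
        picks.append((time_min, chosen))
    return picks

# Hypothetical toy data: three full scans taken 0.5 minutes apart.
scans = [
    (0.0, {500.0: 9e5, 620.5: 7e5, 733.2: 5e5, 845.1: 1e5, 1700.0: 8e5}),
    (0.5, {500.0: 9e5, 620.5: 7e5, 733.2: 5e5, 845.1: 1e5}),
    (1.0, {500.0: 9e5, 620.5: 7e5, 845.1: 1e5}),
]
picks = select_precursors(scans)
```

In this toy run the weak ion at m/z 845.1 is only sequenced in the second scan, once the three dominant ions are excluded. That is the mechanism by which dynamic exclusion gives lower-abundance peptides a chance to be fragmented.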
MS/MS spectra were searched independently against the human International Protein Index (IPI) database, versions 3.01 and 3.19, as well as the National Cancer Institute (USA) database, release 2006 of 11/29, using the search algorithms COMET and SEQUEST (PMID 1002776). The search window of potential precursor masses in the database was set to 4 Da. We included a static modification of 57 Da on cysteine and a variable modification of 16 Da on methionine. The identified peptide sequences were linked to protein identifications through the Trans-Proteomic Pipeline by Protein-Prophet software (http://proteinprophet.sourceforge.net/index.html).9 This algorithm was run on every single data file, as well as on merges of all samples (3 × 10), only tumor (3 × 5), and only control (3 × 5). Biological annotations were obtained with Sisyphus software (developed by Dr. B. Wollscheid, IMSB). We set probability cutoffs at .2, .5, and .9. A probability of .5 was regarded as the cutoff between confident and doubtful identification, as there is double selection in the protocol by both identification and N-glycomotif.
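The second selection criterion, presence of the N-X–S/T glycomotif, can be checked with a short sequence scan. This is a minimal sketch and not the authors' software: the function names and the example peptides are hypothetical, and the exclusion of proline at the X position follows the common textbook definition of the sequon rather than anything stated in this article.

```python
import re
from typing import List

# N-X-S/T sequon; the lookahead lets overlapping motifs be reported.
# Assumption: X may be any residue except proline (standard convention).
SEQUON = re.compile(r"(?=N[^P][ST])")

def sequon_positions(sequence: str) -> List[int]:
    """Return 1-based positions of the Asn in every N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

def has_sequon(sequence: str) -> bool:
    """True if the peptide contains at least one N-X-S/T motif."""
    return bool(sequon_positions(sequence))

# Hypothetical peptides, for illustration only.
peptides = ["AANGSK", "LNPSAR", "VNNTTK", "GGGGK"]
motif_flags = {p: has_sequon(p) for p in peptides}
specificity = 100 * sum(motif_flags.values()) / len(peptides)
```

A specificity figure like the one reported in the Results corresponds to the share of identifications for which such a check succeeds.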
Clinical data from the 10 patients are summarized in Table 1. A significant difference was found for blood C-reactive protein (CRP), but not for age, body mass index, hemoglobin, or leukocyte count. All 5 tumor patients had node-positive lung adenocarcinoma with metastases, as confirmed by immunohistochemistry with antibodies against TTF-1 and CK7, and had moderate to high amounts of tumor cells in their effusion smears. In addition, mesothelial cells, lymphocytes, and granulocytes were identified in both populations to similar degrees. Clinical chemistry analyses demonstrated significant differences for total protein, lactate dehydrogenase (LDH), glucose, and cholesterol as well as the tumor markers CA-125 and CEA. Further diagnoses and drug regimens are listed in Table 2.
|Tumor, n=5||Control, n=5||P-value|
|Sex||3 women/2 men||3 men/2 women|
|N-classification||N1 to N3|
|Side||3 left/2 right||4 right/1 left|
|Total protein, g/L, (range)||38 (30–43)||20 (8–40)||<.001|
|Patient||Major diagnosis||1. Minor diagnosis||2. Minor diagnosis||1. Drug||2. Drug|
|1||Lung adenocarcinoma||Colon cancer||Hypertension||Prednisone||LMW heparin|
|2||Lung adenocarcinoma||Lung embolism||Atrial fibrillation||Gemcitabine||Paracetamol|
|3||Lung adenocarcinoma||COPD||Atrial fibrillation||Fluticasone||Paracetamol|
|4||Lung adenocarcinoma||Heart insufficiency||—||Acetylsalicylic acid||—|
|5||Lung adenocarcinoma||Malleolar fracture||—||Bisphosphonate||—|
|7||Peripheral arterial vascular disease||Myocardial infarction||Diabetes||Phenprocoumon||Insulin|
|8||Cholecystitis||Renal transplantation||Malignant melanoma||Cyclosporine||Mycophenolate|
|10||Knee abscess||Coronary artery disease||iv drug abuse||Amoxicillin||Methadone|
Protein Probability and Specificity
For every tumor and control sample, triplicate N-GP capture and LC-MS/MS were performed. The second and third captures were run on the same 96-well plate, 9 months after the first isolation. Thirty individual data files were generated in the data-dependent MS/MS mode with a dynamic exclusion of 1 minute, yielding, in total, 131 nonredundant proteins with probability ≥.9, based on IPI v3.19. The number rose to 170 when the individual files were merged, as triplicate probabilities are additive. The respective numbers for P ≥ .5 or P ≥ .2 and for the merges of only tumor or only control are shown in Table 3. At protein probability ≥.9, 88% of proteins were identified by at least 1 glycopeptide containing the N-X–S/T motif (Table 4). Importantly, 97% of all peptide identifications were in the P ≥ .9 range. Identifications with P < .2 were not listed. Furthermore, we observed a remarkable reduction in albumin content, as demonstrated by the presence of only 300 albumin peptides in the total of 63,697 peptide identifications at P ≥ .2. Complete absence of albumin peptide identifications was observed in 3 of 30 individual analyses (10%). Immunoglobulins and alpha-1–antitrypsin were the most frequently identified proteins. Generally, the 20 most abundant so-called classical plasma proteins accounted for 75% of the peptide identifications, leaving 25% for lower-abundance proteins. We concluded that the full complexity of the pleural effusion N-glycoproteome is only partly covered by shotgun proteomics.
|Protein probability||No. of proteins|
|Total single files|
|P ≥ .9||131|
|Total merged files|
|P ≥ .9||170|
|P ≥ .5||278|
|P ≥ .2||425|
|P ≥ .9||140|
|P ≥ .9||143|
|A. Merged total||No. of proteins||No. without N-glycomotif||Specificity|
|P ≥ .9||170||20 (12%)||88%|
|P ≥ .5||278||39 (14%)||86%|
|P ≥ .2||425||81 (19%)||81%|
|B. Merged total||No. of peptides||Albumin||Immunoglobulins||alpha-1-AT|
|P ≥ .9||62027||300 (0.5%)||18182 (29%)||8038 (13%)|
|P ≥ .2||63697||300 (0.5%)||18182 (29%)||8038 (13%)|
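The percentages in the table above can be recomputed from the listed counts. The sketch below assumes that the third column of part A counts proteins identified without a motif-containing glycopeptide, which is the reading under which that column and the specificity column sum to 100%. The variable names are illustrative.

```python
# Part A: cutoff -> (proteins in merged list, proteins without N-glycomotif).
# Assumption: the second number counts identifications lacking the motif.
table_a = {0.9: (170, 20), 0.5: (278, 39), 0.2: (425, 81)}

def specificity_percent(total: int, without_motif: int) -> float:
    """Share of proteins supported by at least one N-X-S/T glycopeptide."""
    return 100 * (total - without_motif) / total

specificities = {
    cutoff: round(specificity_percent(total, without))
    for cutoff, (total, without) in table_a.items()
}

# Part B: peptide-level counts taken from the table.
peptides_p09, peptides_p02, albumin_peptides = 62027, 63697, 300
albumin_share = round(100 * albumin_peptides / peptides_p02, 1)
high_confidence_share = round(100 * peptides_p09 / peptides_p02)
```

Under this reading the recomputed values match the 88%, 86%, and 81% specificities, the 0.5% albumin share, and the 97% of peptide identifications at P ≥ .9 quoted in the text.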
Protein Spectrum and Concentration
The Entrez gene numbers (National Library of Medicine [USA] online database) were submitted to PANTHER (www.pantherdb.org) for protein-family analysis. The protein-family spectrum of the all-sample merge at P ≥ .9 is shown in Figure 1. The 3 most prominent protein families were 1) select regulatory molecules, 2) defense/immunity, and 3) extracellular matrix. Similar results were found for the merged files of only tumor and only control. By data mining of the literature,10–14 several tumor-progression or metastasis-associated proteins (CA-125, CD44, CD166, lysosome-associated membrane glycoprotein 2, multimerin 2, and periostin) as well as lung-specific proteins (tracheobronchial mucin 5B, thyroid transcription factor 1, and pulmonary surfactant protein A) were identified. CA-125 and CD44 are already validated markers, whereas CD166, LAMP-2, multimerin 2, and periostin are currently under evaluation. Because the promising tumor marker periostin is in the ng/mL concentration range in plasma and was well identified in 2 malignant effusions (Fig. 2), we concluded that the protocol allows penetration into the moderate to low protein concentration range (μg to ng/mL). To address the problem of several cell populations in pleural effusion, we used as a control the concentrated supernatant of the A549 lung adenocarcinoma cell line. After 1-week serum starvation (data not shown), 32 proteins were identified in the supernatant at P ≥ .9, among them several of the pleural effusion proteins, in particular CD166 and tracheobronchial mucin 5B.
Validation of MS Identification
Shotgun MS is based on a continuous peptide stream from the high-pressure liquid chromatography (HPLC) column into the ion trap. The 3 most prominent ions were fragmented in the MS2 mode after the MS1 full scan. A peptide identification in only tumor samples does not mean absence of the protein in control samples; rather, an identification depends on the individual peptide ion abundance at a given time. We were, therefore, interested in how Protein-Prophet probabilities from either COMET/IPI 3.19 or SEQUEST/IPI 3.01 are distributed among the triplicate profiles and how they correlate with validating data. The tumor marker CA-125 was identified 4 times with a probability of .21 to .74 (Table 5). No correlation was observed with the respective clinical chemistry values. Potent molecules are often in the very low concentration range of pg/mL. Besides the identification of several low-abundance kinases and transcription factors, the antitumor cytokine interferon-gamma (IFN-γ) was found with P = .32 in 1 of 15 tumor profiles. No correlation was seen with the corresponding bead monoplex data. Therefore, penetration of the ion trap-based LC-MS/MS set-up into the pg/mL range is doubtful. The above-mentioned novel potential tumor-progression and metastasis-associated proteins were confidently identified with both search strategies, predominantly in the tumor samples (Table 6). Despite high individual probabilities, triplicate reproducibility was random. Finally, the MS identifications of periostin and CD166 were found to correlate with immunohistochemistry on corresponding tumor tissue and cell blocks, respectively (Fig. 3).
|CA-125 COMET/IPI 3.19||–||–||–||–||–||–||–||–||–||–||–||–||–||–||0.74|
|CA-125 SEQUEST/IPI 3.01||–||–||–||0.51||–||–||–||–||–||–||–||–||–||–||–|
|IFN-g COMET/IPI 3.19||–||–||–||–||–||–||–||–||–||–||–||–||–||–||–|
|IFN-g SEQUEST/IPI 3.01||–||–||–||–||0.32||–||–||–||–||–||–||–||–||–||–|
|CA-125 COMET/IPI 3.19||–||–||–||–||–||0.58||–||–||–||–||–||–||–||–||–|
|CA-125 SEQUEST/IPI 3.01||–||–||–||0.21||–||–||–||–||–||–||–||–||–||–||–|
|IFN-g COMET/IPI 3.19||–||–||–||–||–||–||–||–||–||–||–||–||–||–||–|
|IFN-g SEQUEST/IPI 3.01||–||–||–||–||–||–||–||–||–||–||–||–||–||–||–|
|Total protein concentration||43||30||43||44||30|
|Periostin COMET/IPI 3.19||–||–||–||0.99||–||–||0.99||–||–||–||–||–||–||–||–|
|Periostin SEQUEST/IPI 3.01||–||–||–||0.98||–||–||0.97||–||–||–||–||–||–||–||–|
|CD166 COMET/IPI 3.19||–||–||–||–||–||–||–||–||–||–||0.73||–||–||–||–|
|CD166 SEQUEST/IPI 3.01||–||–||–||–||–||–||–||–||–||–||0.98||–||–||–||–|
|LAMP-2 COMET/IPI 3.19||–||0.96||0.98||–||–||–||–||–||0.96||–||–||–||–||–||–|
|LAMP-2 SEQUEST/IPI 3.01||–||0.21||0.99||–||–||–||–||–||0.99||–||–||–||–||–||–|
|Multimerin-2 COMET/IPI 3.19||–||–||0.58||–||–||–||0.99||0.99||–||–||–||–||–||–||0.78|
|Multimerin-2 SEQUEST/IPI 3.01||–||0.65||0.95||–||–||–||0.96||0.88||–||–||–||–||–||–|
|Total protein concentration||12||14||40||8||25|
|Periostin COMET/IPI 3.19||–||–||–||–||–||–||–||–||–||–||–||–||–||–||–|
|Periostin SEQUEST/IPI 3.01||–||–||–||–||–||–||–||–||–||–||–||–||–||–||–|
|CD166 COMET/IPI 3.19||–||–||–||–||–||–||–||–||–||–||–||–||–||–||–|
|CD166 SEQUEST/IPI 3.01||–||–||–||–||–||–||–||–||–||–||–||–||–||–||–|
|LAMP-2 COMET/IPI 3.19||–||–||–||–||–||–||–||–||–||–||–||–||–||–||–|
|LAMP-2 SEQUEST/IPI 3.01||–||–||–||–||–||–||–||–||–||–||–||–||–||–||–|
|Multimerin-2 COMET/IPI 3.19||–||–||–||–||–||–||–||–||–||–||–||–||–||–||–|
|Multimerin-2 SEQUEST/IPI 3.01||–||–||–||–||–||–||–||–||–||–||–||–||–||–||–|
In conclusion, even with dynamic exclusion and significant reduction in albumin content, the LC-MS/MS set-up with fragmentation of the 3 most prominent ions has a strong influence on the composition of an individual N-GP profile. This influence is more important than the total protein concentration of the sample. Parallel captures on the same 96-well plate with the same total protein concentration may yield divergent Protein-Prophet probabilities of 1 and 0 because of the all-or-nothing nature of shotgun mass spectrometry. The merge of individual profiles, nevertheless, produces a valuable protein list in which relevant biomarkers are detectable.
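The scatter of identifications across triplicates can be summarized by counting, per patient, how many replicate profiles reach a given probability. The sketch below uses the COMET/IPI 3.19 values from the tumor rows of the preceding table. It assumes that the 15 columns are ordered as 5 patients with 3 consecutive replicates each, which the extracted table does not state explicitly. The names are illustrative.

```python
from typing import Dict, List

N_PATIENTS, N_REPLICATES = 5, 3

# Tumor block, COMET/IPI 3.19: {1-based column: Protein-Prophet probability}.
# Assumption: columns 1-3 belong to patient 1, columns 4-6 to patient 2, etc.
comet_tumor: Dict[str, Dict[int, float]] = {
    "Periostin": {4: 0.99, 7: 0.99},
    "CD166": {11: 0.73},
    "LAMP-2": {2: 0.96, 3: 0.98, 9: 0.96},
    "Multimerin-2": {3: 0.58, 7: 0.99, 8: 0.99, 15: 0.78},
}

def hits_per_patient(columns: Dict[int, float], cutoff: float = 0.5) -> List[int]:
    """Count replicates per patient with probability >= cutoff."""
    hits = [0] * N_PATIENTS
    for column, probability in columns.items():
        if probability >= cutoff:
            hits[(column - 1) // N_REPLICATES] += 1
    return hits

replicate_hits = {name: hits_per_patient(cols) for name, cols in comet_tumor.items()}
```

Under this column assumption no protein is found in all 3 replicates of any patient, which is one way to express the all-or-nothing behavior described above.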
Our study reports the first N-GP catalog for malignant pleural effusion of lung adenocarcinoma. This subproteome profile partly overlaps with the data of Tyan and colleagues, who performed a more global proteomic analysis.15 By using an N-GP capture and shotgun mass spectrometry-based approach, we were able to identify known and novel potential biomarkers. We chose pleural effusions because a longer secretion period can be expected in comparison to plasma, and we focused especially on tumor progression or metastasis-related proteins, with the rationale that such proteins may be detected later in the blood plasma of pT1 patients. At the pT1 time point, they are presumably in the pg/mL to ng/mL range; thus they are far more difficult to detect by mass spectrometry. Next to already validated markers, like CA-125, CD44, TTF-1, and pulmonary surfactant protein A (SP-A), we identified novel potential markers through data mining. First, periostin (osteoblast-specific factor 2) is highly homologous to βig-h3, a member of the fasciclin I protein family, and contains 1 N-glycomotif. The protein promotes oncogenesis by means of cell adhesion and spreading and has been identified as a mesenchymal gene overexpressed in human cancers (SAGE library, Chiron Co., Emeryville, Calif). Periostin is a marker for metastatic lung and breast cancers,13, 16 promotes invasion and metastatic growth in head and neck,17 pancreatic,18 and colon cancer,19 and acts on the crosstalk between αvβ5 integrins and the EGF-receptor.20, 21 This protein is currently 1 of the most promising biomarkers, as its upregulation depends on the formation of desmoplastic stroma. It also seems to be accessible by the blood stream.22 Interestingly, the plasma concentration of periostin in metastatic cancer may be as high as 1000 ng/mL, which is 100-fold higher than prostate-specific antigen (PSA) values, which are in the 10 ng/mL range.
Second, multimerin-2 (Endoglyx-1) is a multisubunit glycoprotein of the vascular endothelium and may be associated with neoangiogenesis in melanoma metastases.11 Third, CD166 (activated leukocyte-cell adhesion molecule, ALCAM) is a CD6 ligand, which is overexpressed in prostate, breast, and colon cancers10 and controls invasive tumor growth.23 Finally, lysosome-associated membrane glycoprotein-2 (LAMP-2, CD107b) is highly expressed in lung, protects the lysosomal membrane from autodigestion, and has been implicated in metastasis of pancreatic carcinoma.12
Cancer development is a multistep process; therefore, different biomarker populations for early-stage (local growth), mid-stage (vessel infiltration), and late-stage disease (metastatic growth) have to be envisioned, each of them characterized by an individual N-glycosylation profile. A malignant pleural effusion from lung adenocarcinoma is, by definition, pT4. To reach this stage, either destruction of the visceral pleura or spread through lymphatic vessels is required. Both processes will give rise to stromal breakdown and remodeling as well as necrosis due to local ischemia. Furthermore, cancer cells in liquid are attacked by accompanying immune cells, and mesothelial-lining cells often demonstrate reactive proliferation. The resulting N-GP profile reflects a cellular leakage status of these combined populations and is related to metastasis and tumor progression. In this anatomic context, we expected to identify serum proteins, extracellular matrix formation or breakdown products, secretory proteins of cancer cells, immune cells, or mesothelium, membrane proteins from shed exosomes, and cytosolic proteins of necrotic cells.24–26 Thus, compared with lung cancer tissue, the pleural effusion model may be somewhat simpler in terms of cellular composition but, evidently, is still complex with respect to the relative contributions to a particular protein concentration. In the case of periostin, we have observed a relative protein intensity of 0 to 1+ in the alveolar septa of normal lung. In many nonsmall cell lung cancers (NSCLC), the intensity in the peritumoral desmoplastic stroma reaches 2 to 3+. Finally, the submesothelial stroma also shows faint expression of 1+. In a pT4 stage with pleural breakthrough, periostin may enter into the liquid from different sources. To simplify this model, cultures of cancer cells and/or mesothelial cells have to be performed.
We identified several of the effusion proteins in the supernatant of the A549 cells, in particular CD166 and tracheobronchial mucin 5B. Periostin was not detected, probably because of the reported low endogenous expression in this cell line. Notably, this protein seems to require an in vivo tumoral invasion front against desmoplastic stroma to become upregulated.27
To solve the top-down problem in clinical fluid samples with high-matrix background, N-GP capture is an attractive strategy to remove the major serum non–N-GP albumin, but high-abundance N-GP are copurified. We observed a remarkable reduction of albumin content but also the copurification of the high-abundance N-GP, which accounted for 75% of all peptide identifications. Nevertheless, penetration into the μg/mL to ng/mL range occurred by using shotgun proteomics at the given total protein concentrations of 10–40 mg/mL. The complexity and dynamic concentration range may be further reduced by use of affinity chromatography depletion columns. These columns are able to remove up to 10 highly abundant proteins, including immunoglobulins and alpha-1–antitrypsin. Yet, many low-abundance potent proteins are bound to their high-abundance carriers such as albumin or transferrin. Given the 3D structure of albumin, these low-abundance proteins are buried in superficial grooves, together with associated lipids, and may not be amenable to N-GP capture. Therefore, removal of the 6 to 10 predominant plasma proteins may lead to substantial loss of low-abundance proteins. Parallel mass spectrometry of the retained fraction is required to address this issue.28 Alternatively, trypsin digestion before N-GP capture is currently being investigated, but the protocol needs high enzyme amounts to overcome the efficient antiprotease cascades of plasma. Furthermore, N-GP capture of a completely digested protein solution means that one has to fish single N-glycopeptides out of a background of millions of albumin peptides.
The advantage of tandem mass spectrometry lies in the ability to obtain identifications from a peptide solution in a short time. LC-MS/MS, therefore, is an invaluable method for protein analysis of biological fluids because it identifies more than 100 proteins in 1 sample run of 90 minutes. Including gradient washes and blank samples, 4 hours are needed per sample. In 24 hours of mass spectrometry time, up to 6 samples can be processed. The method can be considered complementary to 2D gel electrophoresis, with the major advantage of direct protein identification. The drawback lies in the lack of full coverage and quantification. As the HPLC produces a constant peptide stream, trap loading and ion detection have to be performed in the millisecond range. After the full scan of all signals, only the 3 to 5 most prominent signals can be fragmented per time frame in the tandem mode. The weaker signals are lost. Identified proteins need to be verified by alternative technologies such as enzyme-linked immunosorbent assay (ELISA) or quantitative mass spectrometry for their relative or absolute concentration. We envision further investigations into periostin by the quantitative multiple-reaction monitoring mass spectrometry (MRM) technology.29 MRM may be able to penetrate into the pg/mL range, but precursor ions need to be known.
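The throughput figures quoted above follow from simple arithmetic, shown here as a small worked example. The extrapolation to the 30 analyses of this study (10 samples in triplicate) is an illustration added here, not a figure reported by the authors, and it assumes that the 4 hours per sample apply to every individual analysis.

```python
GRADIENT_MIN = 75      # 5% to 45% solvent B
WASH_MIN = 15          # washing step at 100% solvent B
HOURS_PER_SAMPLE = 4   # including gradient washes and blank samples
N_ANALYSES = 30        # 10 samples x 3 captures (illustrative extrapolation)

run_minutes = GRADIENT_MIN + WASH_MIN
samples_per_day = 24 // HOURS_PER_SAMPLE
total_instrument_hours = N_ANALYSES * HOURS_PER_SAMPLE
total_instrument_days = total_instrument_hours / 24

print(f"{run_minutes} min per run, {samples_per_day} samples per 24 h")
print(f"{total_instrument_hours} h ({total_instrument_days:.0f} days) for {N_ANALYSES} analyses")
```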
In conclusion, our approach of N-GP profiling of pleural effusions by LC-MS/MS shotgun mass spectrometry allows detection of more than 100 N-GPs in 1 sample analysis. Despite persistent top-down problems with repetitive detection of high-abundance N-GP, penetration into the moderate to low protein-concentration range (μg/mL to ng/mL) occurs, although randomly. Importantly, identifications of such lower abundance proteins are confident and allow further investigation by more quantitative methods.
The authors thank Patrizia Cione for excellent technical assistance and collection of effusion samples.