Using HPLC-mass spectrometry to teach proteomics concepts with problem-based techniques



Practical instruction of proteomics concepts was provided using high-performance liquid chromatography coupled with a mass selective detection system (HPLC-MS) for the analysis of simulated protein digests. The samples were prepared from selected dipeptides in order to facilitate the mass spectral identification. As part of the prelaboratory preparation, students calculated the parent ion patterns of the dipeptides using peptide calculator websites. Following instruction on the use of the HPLC-MS instrument, students analyzed mixtures of the dipeptides and identified the individual dipeptides in the unknowns. In addition, purchased chicken egg white lysozyme alkylated with iodoacetamide and digested with trypsin was analyzed using the same approach. Key tryptic peptides were identified from the HPLC-MS chromatogram with information generated with the FindPept tool. This experiment demonstrates that complex concepts can be taught in the undergraduate biochemistry laboratory using a problem-based approach.

“The great difficulty in education is to get experience out of ideas”

George Santayana

Santayana, the Spanish-born philosopher and social commentator, recognized that communication of ideas in the classroom did not always result in understanding, which he called “experience.” The experiment described here is used in our biochemistry course to communicate complex “ideas” (i.e. concepts) in a practical, problem-based laboratory exercise. We wanted students to understand how the researcher can use high-performance liquid chromatography-mass spectrometry (HPLC-MS) to identify peptides that are present in a mixture.

Our upper-level undergraduate biochemistry laboratory is a five-hour course that meets weekly for one-half of the semester. Its purpose is to introduce students to concepts and techniques used in research labs. The course exercises include kinetic analysis of mushroom tyrosinase [1], the purification and analytical characterization of lysozyme [2], a bioinformatic exercise for lysozyme [3], and the HPLC-MS analysis of a simulated proteolytic digest using purchased dipeptides and trypsin-digested lysozyme. Hands-on use of the HPLC-MS lets students become familiar with a powerful bioanalytical technique, which can then be applied to samples isolated and prepared in the research lab.

Students choose to purify lysozyme from one of several avian egg white sources. Our laboratory course uses eggs from five avian species in the exercise: chicken (Gallus gallus), turkey (Meleagris gallopavo), duck (Anas platyrhynchos), bobwhite quail (Colinus virginianus), and ostrich (Struthio camelus). Students learn how to purify proteins by ion-exchange chromatography. They analyze their protein-containing fractions for lysozyme enzyme activity against suspensions of Micrococcus lysodeikticus cell walls. Students also determine the molecular weight of their isolated lysozyme using SDS-polyacrylamide gel electrophoresis against proteins of known molecular weights. Finally, they review the differences in the lysozyme proteins from different species with a bioinformatics exercise.

The tertiary structure of a protein, and hence the protein's function, is determined by the amino acid sequence [4]. Sequencing proteins is generally more difficult than sequencing DNA. The emergence of mass spectral techniques, however, has made the determination of protein sequences and molecular size much easier. Finehout and Lee have provided an excellent review of MS techniques in biological research [5]. Matrix-assisted laser desorption/ionization (MALDI) time-of-flight (TOF) has become the instrument of choice for protein sequence/proteomics research [6, 7]. In fact, Albright et al. have recently demonstrated that in-gel digestion of proteins followed by MALDI-TOF MS analysis can be used in the classroom or laboratory [8]. Unfortunately, MALDI-TOF and MALDI-TOF/TOF instruments are very expensive, sometimes costing in excess of $0.5 million. HPLC systems equipped with mass selective detectors (MSD) can be purchased for about a fifth of the cost of MALDI-TOF instrumentation. Although this is still expensive, the acquisition of an MSD is within the means of many undergraduate institutions. The experiment described here was developed using an HPLC-MS instrument equipped with an MSD.



An Agilent 1100 High Performance Liquid Chromatograph equipped with a binary gradient pumping system and a polystyrene-divinylbenzene HPLC column, PRP-1 (Hamilton), was used for the peptide separation. The HPLC system was equipped with an Agilent 1000 Diode Array Detector (DAD) in series with an Agilent 1100 Series LC/MSD mass spectrometer detection system operated in the electrospray ionization mode. The HPLC-MS system was controlled with integrated HPCORE ChemStation software (Creation Date 2004).


Dipeptides were purchased from Sigma-Aldrich (St. Louis, MO) and are listed in Table I.

Table I. Dipeptides used in this study
Dipeptide | 3-Letter code (1-letter code) | Property
Glycine-D,L-aspartic acid | Gly-Asp (GD) | Polar, acidic
Tyrosine-alanine | Tyr-Ala (YA) | Nonpolar
Lysine-valine | Lys-Val (KV) | Polar, basic
Aspartic acid-phenylalanine-O-methyl ester | Asp-Phe-O-Me (DF-O-Me) | Nonpolar

Chicken egg white lysozyme, dithiothreitol, and iodoacetamide were obtained from Sigma-Aldrich (St. Louis, MO). Trypsin was purchased from Invitrogen (Carlsbad, CA). The HPLC column, a poly(styrene-divinylbenzene) PRP-1 (4.1 mm × 150 mm, 5 μm), was purchased from Hamilton (Reno, NV). Durapore HPLC syringe filters were obtained from Millipore (Billerica, MA). HPLC-grade acetonitrile and formic acid used to prepare mobile phases for HPLC-MS were obtained from Sigma-Aldrich (St. Louis, MO). Distilled, deionized water was used to prepare buffers and HPLC mobile phases. Chemicals used to prepare buffers and solutions that are not specified were of reagent grade quality or better.


Solutions of the individual dipeptides were prepared at concentrations of 5 mg mL−1 in 0.3% aqueous formic acid. Binary mixtures of the dipeptides were then prepared by mixing equal parts of the individual dipeptide solutions. The concentration of each dipeptide in the binary mixture was 2.5 mg mL−1. This high concentration of the dipeptide mixtures was chosen so that students would not struggle with the detection limits of the instrument.

Chicken egg white lysozyme (100 mg) was weighed and dissolved in 20 mL of 0.025 M ammonium bicarbonate, pH 8.5. A 10-fold molar excess of dithiothreitol solution (0.070 mL at 1 M in 0.025 M ammonium bicarbonate buffer, pH 8.5) was added to the 5 mg mL−1 solution of lysozyme. The solution was mixed well and incubated at 37°C for 30 min to reduce the disulfide bonds in lysozyme. A fivefold molar excess of 1 M iodoacetamide (0.035 mL, made up in 0.025 M ammonium bicarbonate, pH 8.5) was added to the lysozyme-DTT solution and incubated at 37°C for 1 hour. The alkylated lysozyme was dialyzed at 2–8°C against 50 mL of 0.025 M ammonium bicarbonate, pH 8.5. The buffer was changed three times, once every 45 min. The dialyzed lysozyme solution (assumed to be 20 mL) was then treated with 2.5 mL of a 1.0 mg mL−1 trypsin solution prepared in 0.025 M ammonium bicarbonate, pH 8.5. The mixture was incubated at 37°C for 4 hours. The reaction was quenched by adjusting the pH to about 3 with 6 M acetic acid [9].
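The reagent volumes quoted above can be checked with a short calculation. The sketch below is only an illustration: it assumes a molar mass of about 14,300 g/mol for chicken egg white lysozyme (a value not stated in the text) and takes the molar excess relative to the protein, which is the reading that reproduces the quoted volumes. The function name `reagent_volume_ml` is hypothetical.

```python
# Assumption (not given in the text): approximate molar mass of
# chicken egg white lysozyme, in g/mol.
LYSOZYME_MW = 14_300.0


def reagent_volume_ml(protein_mg, molar_excess, stock_molar,
                      protein_mw=LYSOZYME_MW):
    """Volume (mL) of a reagent stock needed for a given molar excess.

    The excess is taken relative to moles of protein.
    """
    protein_mmol = protein_mg / protein_mw      # mg / (g/mol) = mmol
    reagent_mmol = protein_mmol * molar_excess
    return reagent_mmol / stock_molar           # mmol / (mmol/mL) = mL


if __name__ == "__main__":
    dtt = reagent_volume_ml(100, 10, 1.0)   # 10-fold excess of 1 M DTT
    iaa = reagent_volume_ml(100, 5, 1.0)    # fivefold excess of 1 M iodoacetamide
    print(f"DTT: {dtt:.3f} mL, iodoacetamide: {iaa:.3f} mL")
```

Under these assumptions the results round to the 0.070 mL and 0.035 mL stated in the procedure.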

HPLC mobile phase was programmed as a binary gradient: i) 0.3% (v/v) aqueous formic acid (polar mobile phase) and ii) acetonitrile containing 0.3% (v/v) formic acid (nonpolar mobile phase). The flow rate for the separation was 0.4 mL min−1. Solutions were filtered through 0.45 μm Durapore (PVDF) filters before injection on the HPLC-MSD system, with an injection volume of 5 μL. The column temperature for the separations was maintained at 25°C.

The mobile phase program used for the separation of dipeptides is shown in Table II.

Table II. The mobile phase program used for the separation of dipeptides
Time (min) | % Nonpolar mobile phase (of total composition)

The mobile phase program used for the separation of the trypsin digest of acetamidylated lysozyme is shown in Table III.

Table III. The mobile phase program used for the separation of the trypsin digest of acetamidylated lysozyme
Time (min) | % Nonpolar mobile phase (of total composition)

The DAD was set to collect and store data at two wavelengths, 222 nm and 274 nm. The MSD was used in the electrospray ionization mode with positive polarity, that is, ESI+. The MSD was set to collect masses for ions from 100 to 350 mass units for the dipeptides and from 100 to 1000 mass units for the trypsinized lysozyme. The gas temperature for the MSD was 350°C and the drying gas was delivered at a rate of 13.0 L min−1.

Parent ion masses for the dipeptides were determined using the following web-based calculator [10]: Servlet.html
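For readers without access to a web calculator, the parent ion values can be approximated directly from monoisotopic residue masses. The following is a minimal sketch, not the calculator used in the course; the function name `mz` and the flag `c_term_methyl_ester` are hypothetical, and the mass table is limited to the residues in Table I.

```python
# Standard monoisotopic residue masses (Da) for the amino acids in Table I.
RESIDUE_MASS = {
    "G": 57.02146, "D": 115.02694, "Y": 163.06333, "A": 71.03711,
    "K": 128.09496, "V": 99.06841, "F": 147.06841,
}
WATER = 18.010565         # H2O added for the free N- and C-termini
PROTON = 1.007276         # mass of the charge-carrying proton
METHYL_ESTER = 14.015650  # CH2 gained when -COOH becomes -COOCH3


def mz(sequence, charge=1, c_term_methyl_ester=False):
    """Monoisotopic m/z of the (M+nH)n+ ion for a one-letter sequence."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    if c_term_methyl_ester:
        neutral += METHYL_ESTER
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    print(f"Gly-Asp      (M+H)+   {mz('GD'):.2f}")
    print(f"Lys-Val      (M+H)+   {mz('KV'):.2f}")
    print(f"Lys-Val      (M+2H)2+ {mz('KV', charge=2):.2f}")
    print(f"Tyr-Ala      (M+H)+   {mz('YA'):.2f}")
    print(f"Asp-Phe-O-Me (M+H)+   {mz('DF', c_term_methyl_ester=True):.2f}")
```

The computed values agree, to within the unit resolution of the detector, with the parent ions reported in the figure captions (191.0, 246.1, 123.6, 253.1, and 295.1).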

The interpretation of the tryptic map of acetamidylated chicken egg white lysozyme was facilitated using the ExPASy FindPept tool [11] provided at the following URL: http://www.

The protein sequence of the chicken egg white lysozyme was obtained from the National Library of Medicine public database [12] located at the following URL address:

Analysis of the peptide fragments of the chicken egg white lysozyme was performed with the FindPept tool assuming uniform acetamidylation of cysteines following reduction of cystine disulfides.


The idea of using unknown mixtures of dipeptides for this experiment derives from the model of elemental ion identification used in qualitative analysis lab experiments [13]. There is no reason that tri-, tetra-, penta-, and other oligopeptides could not be incorporated into the students' unknown mixtures. As four dipeptides were purchased (i.e. Gly-Asp, Lys-Val, Tyr-Ala, and Asp-Phe-O-Me), there were six possible unknown mixtures containing two of the dipeptides.
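The count of six binary unknowns is simply the number of ways to choose two dipeptides from four. A small sketch that enumerates them (the helper name `binary_mixtures` is hypothetical) may be handy when assigning unknowns:

```python
from itertools import combinations

DIPEPTIDES = ["Gly-Asp", "Lys-Val", "Tyr-Ala", "Asp-Phe-O-Me"]


def binary_mixtures(names):
    """Return every unordered pair of dipeptides."""
    return list(combinations(names, 2))


if __name__ == "__main__":
    for number, pair in enumerate(binary_mixtures(DIPEPTIDES), start=1):
        print(f"Unknown {number}: {pair[0]} + {pair[1]}")
```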

The dipeptides were selected because of their different polarities. Gly-Asp is acidic, and hence, is a polar dipeptide. Lys-Val is basic, and hence, is also a polar molecule. The PRP-1 column binds molecules on the basis of their nonpolarity, releasing them when the equilibrium solubility of the dipeptide in the mobile phase exceeds the equilibrium affinity of the dipeptide for the solid support. The HPLC-MS chromatogram of the mixture is shown in Fig. 1a. Lys-Val and Gly-Asp coelute on the PRP-1 column (Fig. 1b). Tyr-Ala is less polar than either of the previous two dipeptides and elutes second in retention order on the column (Fig. 1c). The methyl ester in Asp-Phe-O-Me causes this dipeptide to be retained the longest. The acetonitrile concentration of the mobile phase had to exceed 50% before this nonpolar dipeptide left the solid phase HPLC support and traveled with the nonpolar mobile phase (Fig. 1d).

Figure 1.

HPLC-MS Chromatogram of a Mixture of Four Dipeptides with Associated Mass Spectra of Chromatographic peaks. (a) The total ion HPLC-MS chromatogram (TIC) of the mixture of four dipeptides. The mass spectra of the components within the separated peaks are the following: (b) Mass spectrum of peak containing Gly-Asp with parent ion (M+H)1+ of 191.0 amu, and Lys-Val with parent ion (M+H)1+ of 246.1 amu, (c) mass spectrum of peak containing Tyr-Ala with parent ion (M+H)1+ of 253.1 amu, and (d) mass spectrum of peak containing Asp-Phe-O-Me with parent ion (M+H)1+ of 295.1 amu.

The theory of reversed-phase HPLC separations, which is based on nonpolar affinities of the dipeptides for the HPLC solid support, became real for the students as they observed the retention order of different dipeptides. Also, there was a simple elegance in this mixture of four dipeptides because the mass spectrometric detector software permits extraction of an ion chromatogram from the total ion chromatogram (TIC). The Gly-Asp parent ion had an m/z of 191.0, that is, (M+H)1+. The Lys-Val parent ion was found at m/z of 246.1, that is, (M+H)1+, and the (M+2H)2+ ion was observed at m/z of 123.6. The TIC contains the sum of the ion intensities observed in the prespecified mass range and showed only three peaks from the mixture of the four dipeptides (Fig. 1a). However, using the extracted ion chromatogram (EIC) software tool for specified ions, peaks originating from Gly-Asp and Lys-Val were extracted from the first peak using the unique ion masses of the two dipeptides shown in the mass spectrum (Fig. 2a). A slight shift in the peak retention times (2.928 and 3.001 min) and different peak shapes were observed for the two dipeptides (Figs. 2b and 2c, respectively). The unique capabilities of the MSD were used to show the students how amino acid sequence data could be derived, as in the case of Lys-Val, which had an (M+2H)2+ ion of 123.6 mass units in the ESI+ detection mode (Fig. 2c). The second and third peaks from a mixture of all four dipeptides were identified as Tyr-Ala and Asp-Phe-O-Me, respectively, using their expected parent ion m/z ratios, (M+H)1+ (Figs. 1c and 1d).

Figure 2.

Chromatograms of a Binary Co-Eluting Mixture of Gly-Asp and Lys-Val. (a) Mass spectrum of peak at 3.082 min from which extracted ion chromatograms in (b) and (c) were obtained. (b) Extracted ion chromatogram (EIC) for (M+H)1+ of 191.0 for Gly-Asp dipeptide from TIC peak at retention time 2.961 min observed in Fig. 2. Note retention time of EIC peak is recorded at 2.928 min. (c) Extracted ion chromatogram (EIC) for (M+2H)2+ of 123.6 for Lys-Val dipeptide from TIC peak at retention time 2.961 min shown in Fig. 1. Note retention time of EIC peak is recorded at 3.001 min.
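The relationship between the TIC and an EIC can be illustrated in a few lines of code. The sketch below is not the ChemStation implementation: the function names, the ±0.5 m/z window, and the scan data are all invented for illustration, with the intensities chosen only so that two ions overlap in one TIC peak.

```python
def total_ion_chromatogram(scans):
    """Sum all ion intensities in each scan."""
    return [(time, sum(i for _, i in peaks)) for time, peaks in scans]


def extracted_ion_chromatogram(scans, target_mz, tol=0.5):
    """Sum only the intensities within +/- tol of target_mz in each scan."""
    return [
        (time, sum(i for mz, i in peaks if abs(mz - target_mz) <= tol))
        for time, peaks in scans
    ]


def apex_time(chromatogram):
    """Retention time of the most intense point."""
    return max(chromatogram, key=lambda point: point[1])[0]


# Invented scans: (retention time in min, [(m/z, intensity), ...]).
SCANS = [
    (2.90, [(191.0, 20.0), (123.6, 5.0)]),
    (2.93, [(191.0, 100.0), (123.6, 30.0)]),
    (2.96, [(191.0, 70.0), (123.6, 80.0)]),
    (3.00, [(191.0, 25.0), (123.6, 110.0)]),
    (3.04, [(191.0, 5.0), (123.6, 40.0)]),
]

if __name__ == "__main__":
    print("TIC apex:", apex_time(total_ion_chromatogram(SCANS)))
    print("EIC 191.0 apex:", apex_time(extracted_ion_chromatogram(SCANS, 191.0)))
    print("EIC 123.6 apex:", apex_time(extracted_ion_chromatogram(SCANS, 123.6)))
```

With this made-up data, the single TIC maximum falls between the maxima of the two EICs, which mirrors the way one TIC peak in Fig. 2 resolves into two slightly offset EIC peaks.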

Sequencing of proteins using HPLC-MS involves chromatographic separation of a proteolytically digested protein followed by identification of the resultant peaks in the protein digest. A TIC of the tryptic digest of purchased chicken egg white lysozyme (Fig. 3) is shown superimposed with unique peptides (3i, 3ii, and 3iii) that were identified in the protein digest using the EIC software tool. No attempts were made to curb autoproteolysis of the lysozyme in the presence of the trypsin. Some of the peaks contain peptides that were autoproteolytically degraded beyond the specific cleavages by trypsin at the C-terminus of lysine and arginine residues. For example, using the FindPept tool [11], the peak at about 4.5 min (3i) shows evidence for the peptide C(Cam)NDGR with a (M+H)1+ of 621 (Note: (Cam) indicates a cysteine that has been carboxamidomethylated with iodoacetamide). This smaller peptide derives from the peptide in the peak at about 12.2 min (3iii), which has a (M+H)1+ of 993 and corresponds to (R79)/WWC(Cam)NDGR/(T87), produced by specific tryptic cleavage at arginines 79 and 86. Autoproteolysis removes the two N-terminal tryptophan residues and results in the smaller pentapeptide.

Figure 3.

Superimposed HPLC-MS Chromatograms (TIC and EICs) of Chicken Egg White Lysozyme Digested with Trypsin. Arrows identify the location of the trypsin digest chromatographic trace (TIC) in the regions where unique peptide peaks are identified. The chromatograms of unique peptides (EICs) are superimposed atop the peptide digest. Individual peptide peaks were obtained from the tryptic digest chromatogram using the extracted ion chromatogram software tool. The EICs of the peptides are labeled to the right of each peak as follows: i) peak containing C(Cam)NDGR with parent ion (M+H)1+ of 621 amu, ii) peak containing tryptic peptide TPGSR with parent ion (M+H)1+ of 517 amu, and iii) peak containing WWC(Cam)NDGR with parent ion (M+H)1+ of 993 amu.
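The peptide masses labeled in Fig. 3 can be reproduced with a simple in-silico digest. This sketch is not the FindPept algorithm. It applies the common rule of cleaving after lysine or arginine unless the next residue is proline, and it adds 57.021 Da to each cysteine for carboxamidomethylation. The input fragment `RWWCNDGRTPGSR` is an assumption assembled from the peptides named in the text (treated here as contiguous), not the full lysozyme sequence, and the function names are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the residues needed here.
RESIDUE_MASS = {
    "R": 156.10111, "W": 186.07931, "C": 103.00919, "N": 114.04293,
    "D": 115.02694, "G": 57.02146, "T": 101.04768, "P": 97.05276,
    "S": 87.03203, "K": 128.09496,
}
WATER = 18.010565
PROTON = 1.007276
CAM = 57.021464  # carboxamidomethyl group added to each cysteine


def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic (M+H)+ with all cysteines carboxamidomethylated."""
    mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    mass += CAM * peptide.count("C")
    return mass + PROTON


if __name__ == "__main__":
    # Assumed illustrative fragment, not the complete protein.
    for peptide in tryptic_digest("RWWCNDGRTPGSR"):
        print(f"{peptide:8s} (M+H)+ = {mh_plus(peptide):.2f}")
    # The truncated peptide is not a tryptic product, so compute it directly.
    print(f"{'CNDGR':8s} (M+H)+ = {mh_plus('CNDGR'):.2f}")
```

Under these assumptions the calculated values for WWC(Cam)NDGR, TPGSR, and C(Cam)NDGR fall at about 993.4, 517.3, and 621.2, consistent with the nominal masses given in the Fig. 3 caption.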

At the final stage, students were asked to use bioinformatic tools to identify the peak with (M+H)1+ of 993 from lysozyme digested with trypsin. Students successfully identified the peptide with the skills they had acquired from the previous 5-hour bioinformatics lab. In a post-lab exercise, the chromatogram of the trypsinized chicken egg white lysozyme was further probed using the extracted ion chromatogram tool for (M+H)1+ and (M+2H)2+ ions that would be produced by cleavage with chymotrypsin. Only two small peaks were observed in the mass range from 500 to 1000 m/z. These were not significant because the signal-to-noise (S/N) ratios for the peaks were less than 10. By comparison, peptide peaks derived by trypsin cleavage had S/N ratios that ranged from 20 to 60. This showed that the 4-hour trypsinization of lysozyme was successful and that students could use the resulting chromatograms of the digest for preliminary protein sequence analysis.

In summary, we have shown that HPLC-MS instrumentation can be incorporated into the upper-level undergraduate or graduate level biochemistry laboratory to illustrate how protein sequence information is derived. Using peptides with different polarities, students can separate and identify the unique chromatographic peaks of unknown mixtures with this tool. By analyzing unknowns in the laboratory, students learn how to obtain the data they need from the instrument to identify their unknown mixtures. Finally, students are exposed to proteomics tools that are often reserved for the research laboratory at the graduate level. In fact, as one student wrote: “This is a very fun and interesting lab because you get to use instruments that others have no idea on how to run.”


The authors thank the biochemistry laboratory students from Winters 2008 and 2009.