E. J. F. is supported by a National Science Foundation Graduate Fellowship. We gratefully acknowledge support for this work by National Science Foundation Grant BES 0120315, U.S. Department of Agriculture (USDA) Grant SCA 58-1907-1-146, and the USDA Agricultural Research Service.
Mass spectrometry involves the measurement of the mass-to-charge ratio of ions. It has become an essential analytical tool in biological research and can be used to characterize a wide variety of biomolecules such as sugars, proteins, and oligonucleotides. In this review, a brief history of mass spectrometry is discussed, and the basic principles of the technology are introduced. A summary of some current applications is provided, as are examples of recently published research. The current methods used to identify, quantify, and characterize proteins and peptides are then reviewed. The range of applications of mass spectrometry is considerable and only promises to grow as the technology continues to improve.
The definition of a mass spectrometer may seem simple: it is an instrument that can ionize a sample and measure the mass-to-charge ratio of the resulting ions. However, the versatility of this function has allowed it to become a vital tool in a wide range of fields, including biological research. This versatility arises from the fact that mass spectrometers can give qualitative and quantitative information on the elemental, isotopic, and molecular composition of organic and inorganic samples. Furthermore, samples can be analyzed from the gas, liquid, or solid state, and the masses that can be studied range from single atoms (several Da) to proteins (over 300,000 Da).
The application of mass spectrometry (MS) to biology began in the 1940s, when heavy stable isotopes were used as tracers to study processes such as CO2 production in animals. Since that time, advances in technology have increased the range of sample types that can be ionized and the range of masses that can be measured, which has diversified the applications of MS. The biological applications of MS currently encompass such diverse areas as screening newborns for metabolic disorders, comparing protein expression levels between cells grown in different media, determining the bioavailability of minerals in food, and studying how pharmaceutical drugs are metabolized in vivo. As an example of the growing importance of MS, a recent search of PubMed (an archive of citations from life science journals) for the phrase “mass spectrometry” resulted in over 65,900 total hits, with over 6,500 of these articles published in the year 2002 alone. The current importance of MS to biological research is highlighted by the 2002 Nobel Prize in Chemistry, which was awarded to John Fenn and Koichi Tanaka “for their development of soft desorption ionization methods for mass spectrometric analysis of biological macromolecules.”
The goal of this review is to familiarize the reader with MS and some of its current applications to biological research. The total number of applications is too large to cover in a single review, so we provide a general overview with an emphasis on protein characterization, one of the newest and most widespread applications of biological MS.
A BRIEF HISTORY OF THE BIOLOGICAL APPLICATIONS OF MS
J. J. Thomson constructed the first mass spectrometer in 1912. The early mass spectrometers were primarily used by physicists to study the atomic weight of the elements and the natural relative abundance of elemental isotopes. Although these early instruments could not be used to study biomolecules, it was not long before they were used to study heavy isotopes as tracers in biological systems. Improvements in the range of masses that could be analyzed and the types of sample that could be vaporized made possible the study of organic compounds in 1956. In 1959, MS was first used to sequence peptides and oligonucleotides, and in 1962 it was used to study the structure of nucleotides.
Ionization of larger molecules such as proteins was not possible until 1981, when the fast atom bombardment ionization method was introduced . The ability to ionize larger molecules was further improved with the advent of electrospray ionization (ESI) by Fenn and coworkers in 1988 . The electrospray ion source was easily connected to on-line liquid chromatography (LC), which made possible the analysis of complex mixtures. Time-of-flight (TOF) mass analysis had been developed and commercialized by 1956, but relatively poor mass resolution was a problem until improvements were made in the early 1970s . An ionization technique for biological molecules that could be used with TOF analysis was introduced in 1991. This ion source, matrix-assisted laser desorption/ionization (MALDI), was the result of work in Germany by Hillenkamp, Karas, and coworkers  and in Japan by Tanaka and coworkers . Like ESI, the MALDI ion source was capable of ionizing and vaporizing large molecules such as proteins.
Protein structure was first studied by MS in 1990 , and the peptide mass fingerprinting technique used to identify proteins via MS was developed in 1993. Shortly thereafter, the ability to identify proteins was further enhanced by the development of software programs to search protein mass spectra data against online databases of amino acid sequences . Isotope-coded affinity tags (ICAT), developed in 1999, made possible the quantification of individual proteins in a complex mixture. This technology allowed the simultaneous comparison of the expression levels of all proteins in cells grown in different media .
All mass spectrometers contain at least three major components: an ion source, a mass analyzer, and an ion collection/detection system. The instrument must also be connected to a computer system to process and record the data and a vacuum pump to control the pressure within the mass spectrometer (see Fig. 1). One property shared by almost all internal components is that they are maintained at pressures far below atmospheric (10^-6 to 10^-8 torr). Low pressure is necessary to limit the number of ion collisions, which would alter the path of the ions and possibly produce unwanted reaction products or loss of charge. The specific device used in each of the components depends on the type of information desired and the properties of the sample. In this review, we consider in detail only a few of the devices commonly used in protein MS. (For a comprehensive review of the types of instrumentation used, the reader is directed to Refs. 1 and 19.)
The purpose of the ion source is to ionize, and in some cases vaporize, the sample. Some of the most common source types used in biological research today include electron impact ionization, chemical ionization, thermal ionization, ESI, and MALDI [14, 15]. The choice of ion source depends on the sample properties and the degree of ionization and fragmentation desired. Some experiments require that the sample be ionized intact, and others require that some molecular fragmentation occur.
In an ESI source, the sample is ionized when the inlet stream (liquid phase) is emitted from a capillary that has a voltage applied to it. This process forms a spray of highly charged droplets, which are then desolvated as they pass through several stages of decreasing pressure. In some cases, the droplets also pass through a heated capillary to improve desolvation (Fig. 2A). One reason behind the widespread use of ESI is that it is compatible with on-line separations such as LC. Also, ESI often results in multiply charged ions. Because mass spectrometers measure the mass-to-charge ratio (m/z), having multiple charges increases the range of masses that can be analyzed.
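The effect of multiple charging on the measurable mass range can be illustrated with a short calculation. The sketch below assumes positive-mode ions of the form [M + zH]z+ (each charge carried by one added proton); the 66,000-Da protein, the analyzer limit of m/z 2,000, and the function names are hypothetical values chosen for illustration.

```python
PROTON_MASS = 1.007276  # Da, mass of one proton


def mz_of_charge_state(neutral_mass, z):
    """Return the m/z of an [M + zH]z+ ion for a neutral mass given in Da."""
    if z < 1:
        raise ValueError("charge state must be a positive integer")
    return (neutral_mass + z * PROTON_MASS) / z


def charge_states_in_range(neutral_mass, mz_limit, max_z=100):
    """List the charge states (1..max_z) whose m/z is at or below mz_limit."""
    return [z for z in range(1, max_z + 1)
            if mz_of_charge_state(neutral_mass, z) <= mz_limit]


if __name__ == "__main__":
    protein_mass = 66_000.0   # hypothetical protein, Da
    analyzer_limit = 2_000.0  # hypothetical upper m/z limit
    observable = charge_states_in_range(protein_mass, analyzer_limit)
    print("lowest observable charge state:", observable[0])
    for z in (1, 20, 40):
        print(z, round(mz_of_charge_state(protein_mass, z), 2))
```

Under these assumptions the singly charged ion lies far above the analyzer limit, whereas the more highly charged ions of the same protein fall within it.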
MALDI is the most recently developed of the aforementioned ion sources. In MALDI, the sample is co-crystallized within an organic matrix such as sinapinic acid or α-cyano-4-hydroxycinnamic acid . Co-crystallization is achieved by mixing a solution of the sample analyte with a solution of the matrix. The mixture is then applied to a metal target plate and allowed to dry. The resulting crystals are irradiated with laser pulses at a wavelength at which the matrix has high spectral absorption. This process desorbs the mixture and photoexcites the matrix. The excited matrix then ionizes the analyte via proton transfer. The result of this process is analyte ions in the gas phase, the majority of which are singly charged  (Fig. 2B).
After the sample has been ionized, the beam of ions is focused and directed into a mass analyzer, which separates the ions based on their m/z. The most common mass analyzers are the magnetic sector, the quadrupole, the ion trap, the TOF, and the Fourier transform-ion cyclotron resonance analyzer. Here, the TOF and the ion trap will be discussed in more detail.
The performance of a mass analyzer is described by three characteristics: the upper mass limit, the transmission, and the resolution. The upper mass limit is the largest m/z that can be measured. The transmission is the fraction of ions produced at the source that reach the detector. The resolution is a measure of the ability to differentiate two m/z that are close together. If δm is the smallest difference between two mass peaks (m and m+δm) that can still be distinguished, the resolution is m/δm.
The m/z measurements for TOF analyzers are based on the equation:

$$KE = \frac{1}{2} m v^{2}$$

where KE is the kinetic energy of the ion, m is the mass of the ion, and v is the velocity of the ion. The ions are initially accelerated in an electric field with potential V, which results in a final kinetic energy of zV, where z is the charge on the ion. The ions then enter a flight tube (which is a field-free region) of length L, at the end of which there is an ion detector. Because the flight tube is field-free, all the KE of the ions results from their initial acceleration. Therefore, from the previous equation:

$$zV = \frac{1}{2} m \left(\frac{L}{t}\right)^{2} \quad\Longrightarrow\quad \frac{m}{z} = \frac{2 V t^{2}}{L^{2}}$$
where t is the time it takes the ion to reach the detector. Because V and L are known, the m/z of an ion can be determined from the amount of time that passes between when the laser pulses (TOF analyzers are often paired with MALDI sources) and when the ion reaches the detector. TOFs have a high resolution and an extremely large mass range compared with other mass analyzers .
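As a numerical illustration of the TOF relationship, the sketch below converts between flight time and m/z. It assumes an idealized linear instrument (no initial velocity spread, no reflectron), writes the ion charge explicitly as z times the elementary charge so that SI units can be used, and takes a hypothetical 20-kV acceleration potential and a 1-m flight tube; none of these values come from a specific instrument.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # C
DALTON = 1.66053906660e-27           # kg


def flight_time(mz, accel_voltage, tube_length):
    """Flight time in seconds for an ion of the given m/z (Da per charge).

    From z*e*V = (1/2)*m*(L/t)**2 it follows that t = L*sqrt(m / (2*z*e*V)).
    """
    mass_per_charge = mz * DALTON / ELEMENTARY_CHARGE  # kg per coulomb
    return tube_length * math.sqrt(mass_per_charge / (2.0 * accel_voltage))


def mz_from_flight_time(t, accel_voltage, tube_length):
    """Invert the relation: m/z = 2*V*t**2 / L**2, converted back to Da per charge."""
    mass_per_charge = 2.0 * accel_voltage * t ** 2 / tube_length ** 2
    return mass_per_charge * ELEMENTARY_CHARGE / DALTON


if __name__ == "__main__":
    voltage = 20_000.0  # hypothetical acceleration potential, V
    length = 1.0        # hypothetical flight tube length, m
    for mz in (1_000.0, 4_000.0, 16_000.0):
        t = flight_time(mz, voltage, length)
        print(f"m/z {mz:>8.0f}: {t * 1e6:.2f} microseconds")
```

Because the flight time scales with the square root of m/z, a fourfold increase in m/z only doubles the flight time.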
An ion trap is composed of a ring electrode and two end-cap electrodes (Fig. 3). Initially, the field within the trap is such that all ions that enter the analyzer begin a stable oscillation within the analyzer and are “trapped.” The rf frequencies of the fields are then ramped up to eject ions of increasing m/z. The ions travel to the detector and, based on the frequency being used at the time of detection, the m/z of the ion can be calculated. An ion trap has high sensitivity and transmission but also a limited mass range and lower resolution compared with the TOF analyzer .
Ion Detection/Collection Systems—
The original mass spectrometers used photographic plates as detectors. However, the design of Nier in 1947 initiated the use of electronic detectors (hence the term mass “spectrometry” instead of “spectroscopy”). For isotope ratio measurements and some inorganic MS, Faraday cup collectors are used as the detector. Otherwise, the most commonly used detectors are the electron multiplier and the microchannel plate. In both of these detectors, ions strike a metal plate, leading to a cascade of electron emissions that result in a measurable current.
Tandem MS (also known as MS/MS or MS2) is a method that isolates ions with a selected m/z (precursor ions), fragments them, and then measures the m/z of the resulting product ions. Tandem MS is widely used in both organic MS and biomolecular MS to elucidate structural details, or, in the case of peptides, an amino acid sequence. Tandem MS can be performed using an ion trap device or by using two mass analyzers separated by a collision cell. In an ion trap, all ions are ejected from the “trap” except those with the m/z of interest. The analyzer is then pressurized with a nonreactive gas such as helium, and ions are fragmented via collision-induced dissociation (CID). The fragments are then sequentially ejected and their masses determined . Using an ion trap, multiple stages of fragmentation (MSn) are possible. The analyzer can isolate an ion of interest, perform CID fragmentation, isolate a fragment, induce further fragmentation via CID, isolate a secondary fragment, etc. These subsequent fragmentation steps can reveal additional information about the structure of the analyte.
A recent addition to tandem MS instrumentation is the MALDI TOF/TOF . In this instrument, two TOF analyzers are separated by a pressurized collision cell, which fragments the precursor ion and then reaccelerates the product ions before they enter the second TOF analyzer. This instrument benefits from both the high mass accuracy associated with TOF mass analyzers and the ability to do CID.
APPLICATIONS OF MS IN BIOLOGICAL RESEARCH
The range of MS applications to biology is extensive. Here we divide them into three basic categories: isotope-ratio MS, small organic molecule MS, and macromolecular MS. For each of these groups we discuss common applications and present summaries of recently published papers.
Isotope-ratio MS (IRMS) is a technique that measures the relative stable isotopic abundance of elements. The elemental isotope ratio can be analyzed for a complex system (bulk IRMS), for a specific compound within a mixture, or for a specific position within a compound. As an example, the 13C/12C ratio can be measured for a tissue sample, for a specific fatty acid within the tissue, or for a specific carbon position within the fatty acid. IRMS can reveal information regarding the origin or state of a complex system . For instance, the ratio of 13C/12C in apple juice can indicate if the juice has been adulterated with corn syrup . The ratios of 13C/12C, 15N/14N, and 87Sr/86Sr can pinpoint the geographic origin of ivory, which can help conservationists prevent future poaching .
In biomedical research, IRMS is frequently used with projects involving stable isotopic tracers. For these experiments, compounds containing elements with abnormal stable isotope ratios are introduced into a system of interest (e.g. by feeding or by injection). The change in the isotopic ratio of the element in the system is then recorded with IRMS. Stable isotope tracers can be used to study the absorption, retention, and utilization of a nutrient in vivo . Stable isotope tracers can also be used to study energy expenditure  or as a clinical tool for disease diagnosis . IRMS uses magnetic sectors as the mass analyzers and Faraday cups as the ion detectors. The ion source is usually an electron impact or thermal ionization source .
An article by Davidsson and coworkers  describes an experiment to test the effectiveness of two different iron absorption enhancers (Na2EDTA and ascorbic acid) to increase the iron absorbed by children in Peru's school meal program. On the first day, children were fed a breakfast “shake” containing one of the enhancers and 57FeSO4. The next day, the same children were fed a breakfast “shake” containing either the original enhancer at a different concentration or a second enhancer and 58FeSO4. Blood samples were taken before the first meal was given and 2 weeks after the second meal was consumed. The ratio of 58Fe/57Fe in the blood was measured to determine which meal allowed more iron absorption. It was found that ascorbic acid at a concentration of 70 mg/meal increased the iron absorbed to 115% of the daily requirement. This double stable isotope technique is useful because two variables can be tested in the same patient at the same time.
Small Organic Molecules—
The mass spectrometric analysis of organic compounds gives information on the molecular mass, chemical formula, chemical structure, or quantity of the analyte. Based on the measured m/z and their peak intensities, the formula and chemical structure can be determined manually  and/or by comparison with a reference database of spectra. As an example, Lavermicocca et al.  used MS to identify novel antifungal compounds produced by Lactobacillus plantarum strain 21B (a bacterium used in making sourdough bread). Components of a culture filtrate were separated and tested for antifungal activity. Those that showed the highest activity were then characterized using gas chromatography-MS. The spectra were compared with those in a MS spectra library, and the compounds present were identified.
MS is also a valuable tool in the high-throughput analysis of compounds created in combinatorial libraries, such as those created to discover new drugs. If the mass spectrometer is connected to an LC system with appropriate columns, both structural information and information regarding the binding affinity of a molecule can be obtained . MS can also be used to study drug metabolism by identifying the primary metabolites from in vitro studies and by measuring them in vivo to determine pharmacokinetic parameters . An example of this type of experiment is a study done by Oo et al. . In this study, researchers administered oseltamivir (a treatment for influenza) to healthy patients between 1 and 12 years of age. The concentrations of oseltamivir and one of its metabolites were then monitored over time in the urine and plasma using MS. The pharmacokinetic data obtained allowed appropriate dosing schedules for children to be developed.
For instrumentation, small organic molecule MS often uses an electron impact or chemical ionization ion source. A wide variety of mass analyzers are used, and tandem MS is often employed to increase the structural information obtained.
MS analysis of macromolecules includes the study of proteins, peptides, oligonucleotides, oligosaccharides, and lipids. We defer a discussion of applications to the study of proteins and peptides to “Protein Characterization.” The research on the analysis of oligonucleotides has focused on studying modified oligonucleotides that may not be compatible with the current enzymatic techniques used to sequence DNA . McLafferty and coworkers were able to sequence oligonucleotides up to 100 nucleotides in length using an ESI source coupled to a Fourier transform-ion cyclotron resonance analyzer . MS can also be used to analyze DNA modifications such as methylation .
The characterization of oligosaccharides is more difficult than that of proteins and oligonucleotides because of the isomeric nature of the subunit and its ability to form branched structures. However, the structures of a wide variety of oligosaccharides have been determined by using tandem MS .
Many classes of molecules within the “lipid” category, including fatty acids, acylglycerols, and steroids, have been characterized by MS. The characterization of fatty acylcarnitines in blood can be used to screen newborns for hereditary fatty acid oxidation disorders. MS is used to measure the chain length and abundance of fatty acylcarnitines present in the blood. By comparing the resulting spectrum to that of a normal infant, doctors can diagnose if a fatty acid oxidation disorder is present, and in some cases they can determine the exact enzyme in which the newborn is deficient .
Most macromolecular research relies on the use of ESI or MALDI as an ionization source. A variety of mass analyzers, including the ion trap and TOF, are used. Tandem MS is often employed to elucidate structural details or sequence information.
MS has become a vital tool in proteomic research. It can give information on the identity of a protein, the amount of the protein that is present, and the modifications the protein contains.
The most commonly used MS method to identify proteins is a combination of peptide mass fingerprinting and amino acid sequencing via tandem MS (see Fig. 4). Using these methods, sensitivity is routinely in the femtomole range . The amino acid sequence of each protein is unique, and therefore the set of peptide masses resulting from proteolytic cleavage provides a “fingerprint” of the protein. In the peptide mass fingerprinting technique, an unknown protein is reduced and alkylated (to break any disulfide bonds) and then digested enzymatically using a sequence-specific proteolytic enzyme such as trypsin. Next, the masses of the resulting peptides are measured using MS. For the spectra in Fig. 4, a MALDI ion source was used, which results in almost all of the ions being singly charged (see “Ion Sources”).
The list of peptide masses from the spectra is entered into a search program, such as Mascot or ProFound, along with information about the enzyme used, accuracy of the mass measurement, and possible protein modifications that may be present. The software then searches through databases of amino acid sequence information (e.g. Swiss-Prot or NCBInr). For each protein entry in the database, the software uses the amino acid sequence to predict the peptide masses that would result from digestion with the user-specified enzyme. The software then compares the list of peptide masses measured (using MS) to the list of predicted peptide masses and calculates the probability of a match. The result is a list of possible protein identifications and the probability that each identification is correct.
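The database search described here can be sketched in a few lines. The example below is a deliberately simplified illustration, not the algorithm used by Mascot or ProFound: it assumes trypsin cleaves after lysine or arginine except before proline, ignores missed cleavages and modifications, treats every peak as a singly protonated [M + H]+ ion, and ranks candidates by a simple count of matched peptides rather than a probability. The toy database and all function names are hypothetical.

```python
# Monoisotopic residue masses in Da
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # Da, added once per peptide for the free termini
PROTON = 1.00728   # Da, charge carrier of a singly protonated ion


def tryptic_digest(sequence):
    """Split a sequence after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mh(peptide):
    """Predicted monoisotopic m/z of the singly protonated peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, sequence, tol=0.2):
    """Number of observed masses within tol (Da) of any predicted peptide."""
    predicted = [peptide_mh(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(obs - pred) <= tol for pred in predicted)
               for obs in observed)


def rank_candidates(observed, database, tol=0.2):
    """Rank database entries (name -> sequence) by number of matched masses."""
    scores = {name: count_matches(observed, seq, tol)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy_database = {"protA": "AKGGR", "protB": "SSKTTR"}  # hypothetical entries
    measured = [218.15, 289.16]                            # hypothetical peak list
    print(rank_candidates(measured, toy_database))
```

Real search programs replace the simple match count with a statistical score that accounts for database size and mass accuracy, which is what allows them to report a probability for each identification.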
The probability that the match is correct can be increased by performing tandem MS on one or more of the original peptides. In tandem MS, peptides of a specific m/z, selected from the peptide mass fingerprint, are isolated from peptides of all other m/z. The isolated peptides are then fragmented by collisions with a gas such as helium that has been introduced into the mass spectrometer. The new peptide pieces formed by the CID are then measured. The weakest bond in a peptide is the one between the amino acids. Therefore, for low-energy CID, the resulting MS/MS spectrum is a series of peaks that represent peptides that differ only in the number of amino acids they contain. By measuring the mass difference between each peak, one can determine the amino acid sequence of the original peptide. High-energy CID also causes side-chain fragmentation. This information can be useful to differentiate between amino acids of the same molecular mass (e.g. leucine and isoleucine) .
In Fig. 4, the peptide with an m/z of 2574.30 was chosen for tandem MS analysis. The peptide was isolated and fragmented and the mass spectrum of the resulting fragments is shown. The partial amino acid sequence can be found by calculating the difference in mass between peaks. For example, the mass difference between the peaks with m/z = 1577.80 and m/z = 1449.76 is 128.04. This is within error of the mass of glutamine (Q), which is 128.06 Da. The probability of fragmentation, however, is not the same for all of the amide bonds . This is evident by the wide range of peak intensities in the tandem MS data and the fact that some peaks are missing. For instance, due to the structure of proline, fragmentation at the C-terminal end of proline is rare . This is why in Fig. 4 we do not see a peak at m/z = 2020.94, which is equal to m/z = 2117.99 minus the mass of proline.
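The mass-difference reasoning used for Fig. 4 can be expressed as a small lookup. The sketch below assumes monoisotopic residue masses and a hypothetical mass tolerance of 0.03 Da; it returns every residue consistent with a given peak spacing, so ambiguous assignments (such as leucine and isoleucine) are reported rather than resolved.

```python
# Monoisotopic residue masses in Da
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_for_difference(delta, tol=0.03):
    """Residues whose mass lies within tol (Da) of the peak spacing delta."""
    return sorted(aa for aa, mass in RESIDUE_MASS.items()
                  if abs(mass - delta) <= tol)


def read_ladder(peaks, tol=0.03):
    """Candidate residues for each gap between consecutive fragment peaks."""
    ordered = sorted(peaks, reverse=True)
    return [residues_for_difference(high - low, tol)
            for high, low in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    # The two peaks discussed in the text for Fig. 4
    print(read_ladder([1577.80, 1449.76]))
    # Leucine and isoleucine have the same mass and cannot be told apart here
    print(residues_for_difference(113.084))
```

With a looser tolerance, the 128.04-Da spacing would also be consistent with lysine (128.09 Da), which shows why mass accuracy matters for this type of sequencing.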
Although some work has been done to identify simultaneously all the proteins in a complex mixture, protein separation prior to enzymatic digestion is often performed. Proteins are separated using LC  or gel electrophoresis (either one- or two-dimensional gel electrophoresis) . If separated by gel electrophoresis, the protein band or spot is excised, the proteins within are digested “in-gel,” and the resulting peptides are extracted for MS analysis [36, 41].
Proteomic research often requires knowing not only what proteins are being expressed by an organism, but also the level of protein expression. A common research technique is to compare the protein expression levels between multiple systems. In the past, quantitative analyses of protein levels using MS have been done by the metabolic labeling of the proteins in one system (e.g. cell lysates) with a heavy isotope such as 15N . However, this procedure is limited to cells and tissues compatible with metabolic labeling. A new method for labeling that does not have this limitation is the ICAT method.
In the ICAT approach, the ICAT reagent derivatizes the side chains of the cysteinyl residues. Proteins from one system are derivatized with a “light” form of the reagent and proteins from the other system with a “heavy” form. In the original ICAT procedure, the heavy form contained eight 2H atoms ; however, in the current method, the heavy form includes nine 13C atoms [42, 43]. The ICAT reagent has four parts: an affinity tag (biotin), a thiol-specific reactive group for covalent attachment to cysteine side chains, a linker, and an isotope tag (which contains 13C in the heavy form) (see Fig. 5). The tagged protein mixtures are combined and enzymatically digested. The digest is fractionated using cation exchange chromatography, which also removes any neutral species from the tryptic peptides. The tagged peptides (those that contain a cysteine) in each fraction are then separated from the others on an affinity column, and the biotin portion of the tag is cleaved off. The peptides are then separated using LC and analyzed by MS . The ratio of the intensities of the light- and heavy-labeled peptides gives a measurement of the ratio of protein expression in the two systems. The peptide is then analyzed with tandem MS to identify the parent protein.
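The quantification step can be illustrated by pairing light and heavy peaks and taking the ratio of their intensities. The sketch below assumes the nine-13C heavy reagent described above, so each labeled cysteine shifts the peptide mass by nine times the 13C/12C mass difference. The peak list, the tolerance, and the function names are hypothetical, and single peak intensities are used for simplicity where a real analysis would typically integrate the signal over the LC elution profile.

```python
C13_MINUS_C12 = 1.0033548   # Da, mass difference between 13C and 12C
HEAVY_LABEL_CARBONS = 9     # 13C atoms in the heavy reagent (per labeled cysteine)


def icat_pair_spacing(n_cys=1, z=1):
    """Expected m/z spacing between the light and heavy forms of a peptide."""
    return HEAVY_LABEL_CARBONS * C13_MINUS_C12 * n_cys / z


def find_icat_pairs(peaks, n_cys=1, z=1, tol=0.02):
    """Find light/heavy pairs in a list of (m/z, intensity) tuples.

    Returns (light_mz, heavy_mz, light_to_heavy_ratio) for every pair whose
    spacing matches the expected shift within tol.
    """
    spacing = icat_pair_spacing(n_cys, z)
    pairs = []
    for light_mz, light_intensity in peaks:
        for heavy_mz, heavy_intensity in peaks:
            matches_shift = abs((heavy_mz - light_mz) - spacing) <= tol
            if matches_shift and heavy_intensity > 0:
                ratio = light_intensity / heavy_intensity
                pairs.append((light_mz, heavy_mz, ratio))
    return pairs


if __name__ == "__main__":
    # Hypothetical singly charged peaks: (m/z, intensity)
    spectrum = [(1000.00, 200.0), (1009.03, 100.0), (1200.00, 50.0)]
    for light, heavy, ratio in find_icat_pairs(spectrum):
        print(f"{light:.2f} / {heavy:.2f}: light-to-heavy ratio {ratio:.2f}")
```

A ratio of 2 in this hypothetical example would suggest that the protein is roughly twice as abundant in the system labeled with the light reagent.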
Proteins with Post-translational Modifications—
The function and activity of a protein are determined in part by any post-translational modifications that may be present. Over 200 distinct types of covalent modifications have been reported; the most common of these include phosphorylation, glycosylation, and ubiquitination. Here the use of MS to study phosphorylation will be reviewed.
It is estimated that over 30% of proteins are phosphorylated . The conventional biochemical approach to studying phosphopeptides requires the use of radioactive labels and Edman sequencing , but a variety of approaches have been developed to study phosphorylation using MS. If the amino acid sequence of the protein is already known, the sample is digested and the peptides analyzed with MS. The m/z of the peptides measured with MS are compared with the predicted values. An increase in a peptide mass by 80 or 160 Da indicates covalent modification by phosphate (HPO3) (mono- and diphosphorylated species, respectively) . The phosphorylated peptide can then be analyzed using MS/MS to determine the exact position of the modification. If the protein is unknown, the sample is divided after digestion, and one portion is treated with a phosphatase. This step removes any phosphate groups, and a comparison of spectra before and after phosphatase treatment will indicate which peptides are modified. The modified peptides can be analyzed with MS/MS to identify the parent protein and the site of phosphorylation .
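The comparison of measured and predicted peptide masses can be sketched as follows. The example assumes neutral monoisotopic peptide masses, uses the monoisotopic mass of HPO3 (79.96633 Da, which the text rounds to 80), and applies a hypothetical tolerance of 0.02 Da; the function name and the example masses are illustrative only.

```python
HPO3 = 79.96633  # Da, monoisotopic mass added per phosphate group


def phosphate_count(observed_mass, predicted_mass, max_sites=3, tol=0.02):
    """Number of phosphate groups that explains the observed mass shift.

    Returns 0 for an unmodified peptide, or None when the shift does not
    correspond to a whole number of phosphate groups (0..max_sites) within tol.
    """
    shift = observed_mass - predicted_mass
    for n in range(max_sites + 1):
        if abs(shift - n * HPO3) <= tol:
            return n
    return None


if __name__ == "__main__":
    predicted = 1000.500  # hypothetical unmodified peptide mass, Da
    for observed in (1000.500, 1080.466, 1160.433, 1040.500):
        print(observed, phosphate_count(observed, predicted))
```

A result of None flags a mass shift that is not explained by phosphorylation and may point to a different modification.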
The presence of phosphorylation can also be determined using different scanning modes on the mass spectrometer. In “neutral loss scanning,” the mass spectrometer scans for an ion that when fragmented results in a specified neutral loss (change in mass but not in charge) . When searching for phosphorylated peptides, the tandem MS spectra are searched for a fragment that has a mass 98 Da lower than the original peptide mass (with no change in charge). This neutral loss indicates a loss of H3PO4 due to β-elimination. This process cannot detect phosphotyrosines, however, due to stability of the β-protons in the benzene ring . In “precursor ion scanning,” the mass spectrometer scans for a specific product ion after fragmentation . For phosphotyrosines, the peptides are fragmented using CID and scanned for a positive fragment ion with 216 m/z (phosphotyrosine immonium ion). For other phosphorylations, the peptides are fragmented using CID and scanned for a negative ion with 79 m/z due to the release of PO3− .
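The neutral loss check can be written as a short calculation. Because the analyzer measures m/z rather than mass, a loss of H3PO4 from a precursor of charge z appears as a shift of about 98/z on the m/z axis. The sketch below assumes the fragment keeps the charge of the precursor, uses the monoisotopic mass of H3PO4 (97.97690 Da), and applies a hypothetical tolerance; the function names and example values are illustrative.

```python
H3PO4 = 97.97690  # Da, monoisotopic mass of the neutral phosphoric acid loss


def neutral_loss_mz(precursor_mz, z):
    """Expected m/z of the fragment formed by loss of H3PO4 at constant charge z."""
    if z < 1:
        raise ValueError("charge state must be a positive integer")
    return precursor_mz - H3PO4 / z


def shows_phosphate_loss(precursor_mz, z, fragment_mzs, tol=0.02):
    """True if any fragment lies within tol of the expected neutral loss peak."""
    target = neutral_loss_mz(precursor_mz, z)
    return any(abs(mz - target) <= tol for mz in fragment_mzs)


if __name__ == "__main__":
    # Hypothetical doubly charged precursor and fragment list
    fragments = [402.21, 751.41, 655.30]
    print(neutral_loss_mz(800.40, 2))
    print(shows_phosphate_loss(800.40, 2, fragments))
```

For a doubly charged precursor the expected peak is therefore about 49 m/z units below the precursor, not 98.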
MS has been used to study a multitude of post-translational modifications, often employing the same approaches discussed here for phosphorylation: treatment with an enzyme to remove the modification (with subsequent spectra comparison), neutral loss scanning, or precursor ion scanning. It should be noted that for modifications by heterogeneous molecules (such as glycans), the structure of the modification could also be analyzed with MS .
The use of MS was originally limited to the study of elemental isotopes. However, as the instrumentation, computer technology, and associated software have improved, the range of MS applications has increased. The wide variety of sample types that can now be analyzed, and the breadth of information that can be obtained, have helped MS to permeate into an extensive range of research areas. The result is that MS has become an essential analytical tool in biological research.
The abbreviations used are: MS, mass spectrometry; CID, collision-induced dissociation; ESI, electrospray ionization; ICAT, isotope-coded affinity tag; IRMS, isotope ratio mass spectrometry; LC, liquid chromatography; MALDI, matrix-assisted laser desorption/ionization; m/z, mass-to-charge; TOF, time-of-flight mass analyzer.