Gene expression analyses of stem cells (SCs) will help to uncover or further define signaling pathways and molecular mechanisms involved in the maintenance of self-renewal, pluripotency, and/or multipotency. In recent years, proteomic approaches have produced a wealth of data identifying proteins and mechanisms involved in SC proliferation and differentiation. Although many proteomics techniques have been developed and improved in peptide and protein separation, as well as mass spectrometry, several important issues, including sample heterogeneity, post-translational modifications, protein-protein interaction, and high-throughput quantification of hydrophobic and low-abundance proteins, still remain to be addressed and require further technical optimization. This review summarizes the methodologies used and the information gathered with proteome analyses of SCs, and it discusses biological and technical challenges for proteomic study of SCs.
Disclosure of potential conflicts of interest is found at the end of this article.
Stem cells (SCs) are undifferentiated cells generally characterized by their functional capacity both to self-renew and to generate a large number of differentiated progeny cells. Conventionally, SCs are classified as derived from either embryonic or adult tissues. Embryonic SCs (ESCs), embryonal carcinoma cells (ECCs), and embryonic germ cells are derived from the preimplantation embryo (e.g., inner cell mass of blastocyst, morula [reviewed in ], and single blastomeres), teratocarcinomas, and primordial germ cells, respectively. These cells are pluripotent; that is, they have the ability to form all embryonic germ layer derivatives except extraembryonic tissues (e.g., placenta). SCs found in adult organisms are present in most tissues and are referred to as adult SCs, such as mesenchymal SCs (MSCs), hematopoietic SCs (HSCs), and neural SCs (NSCs). They are considered multipotent, since they can produce mature cell types of one or more lineages but cannot reconstitute the organism as a whole. SC potency largely depends on intrinsic properties of SCs, as well as extrinsic cues provided by the niche (the microenvironment where SCs reside). Because of their exceptional properties, SCs have the potential to be used for developmental biology, drug screening, functional genomics applications, and regenerative medicine.
Gene expression analyses of SCs will help uncover and further define signaling pathways and molecular mechanisms involved in the maintenance of the undifferentiated state and the initial loss of pluripotency and/or multipotency. A detailed understanding of these molecular mechanisms will thus be essential for the aforementioned SC applications. In contrast to the transcriptome, which is studied with microarrays [5–24], important aspects of proteins, such as their amount, stability, subcellular localization, post-translational modifications (PTMs), and interactions, can be elucidated at the proteome level. Wilkins et al. coined the term “proteome” (PROTEins expressed by a genOME) to refer to the total set of proteins expressed in a cell, tissue, or organism.
Currently, two-dimensional gel electrophoresis (2-DE) and non-2-DE-based approaches are broadly applied to proteomic analyses. Applying proteomics to investigate the programs that control self-renewal, differentiation, and plasticity will provide valuable insight into how the factors involved induce differentiation of SCs to specific lineages.
Recent reviews have comprehensively addressed various aspects that are relevant in the context of SC proteomics [26–29]. Here, we review various proteomics methodologies used to study SCs, review proteome analyses of SCs, discuss biological and technical challenges encountered with proteomic studies of SCs, and provide insight into how proteomics-based research is likely to develop.
Proteomics: An Overview of Technology
Sample Preparation and Protein Extraction
Although proteomic analysis can be used for qualitative comparisons, it is much more informative when used quantitatively. The isolation of proteins from SCs and their derivatives for proteome analyses is complicated. The human genome harbors 26,000–31,000 protein-encoding genes, whereas the total number of human protein products, including splice variants and essential PTMs, has been estimated to be close to 1 million [31, 32]. Another important factor for proteomic analysis is the dynamic range of protein concentrations; one cell can contain between 1 and more than 100,000 copies of a single protein. Coverage of this high dynamic range can be partially achieved by fractionation of the proteome into subproteomes, for example, by applying affinity purification. Sample complexity can also be reduced by specific isolation of individual proteins or protein complexes. In general, hydrophobic membrane proteins are much more difficult to handle than hydrophilic proteins; hence, hydrophobic proteins require specific extraction procedures [35, 36].
High-resolution 2-DE of proteins is the fundamental tool of proteomics and allows thousands of proteins to be analyzed simultaneously (Fig. 1). 2-DE has been available since 1975 . In 1988, a basic protocol of electrophoresis with immobilized pH gradients (IPGs) was described . The advent of IPGs for the first dimension has produced significant improvements in 2-DE separation, with higher resolution, improved reproducibility, and higher loading capacity for preparative gels . Other important technological advances in 2-DE include the development of sensitive protein stains and the use of in-gel sample application in contrast to loading at either anodic or cathodic ends of the gel.
The most significant breakthrough in proteomics has been the application of mass spectrometry (MS). It allows the identification of proteins in the femtomole to picomole range and has superseded classic Edman N-terminal sequencing, which is less automated and less sensitive and requires an unblocked N terminus [40, 41]. The main components of a mass spectrometer are an ion source, one or several mass analyzers that measure the mass-to-charge ratio (m/z) of the ionized analytes, and a detector that registers the number of ions at each m/z value (Fig. 2).
Identification of Separated Proteins from 2-DE Gels Using MS
This approach is commonly applied for the identification of proteins isolated from 2-DE. It usually begins with peptide mapping, initially suggested by Henzel et al. . The separated proteins are digested with an enzyme (for instance, trypsin), and the masses of the proteolytic peptides are measured by MS. The mass spectra are obtained with a relatively simple MS instrument, such as matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF). The masses of the measured proteolytic peptides are compared with predicted proteolytic peptides from protein sequence databases. This step can be fully automated, but it requires the complete sequence of the protein or of the coding region of its gene to be present in the database. As more gene sequence data become available, the success rate of this method will increase. The method is now popularly called peptide mass fingerprinting. In some cases where peptide mapping does not provide sufficient information for confident identification, there is a need for more sophisticated instrumentation, such as a MALDI-TOF/TOF or quadrupole TOF instrument, which provide higher mass accuracy and sensitivity and include peptide fragmentation and partial peptide sequence determination for several tryptic fragments (Fig. 2).
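The matching step of peptide mass fingerprinting can be illustrated with a minimal Python sketch. It is not the procedure of any particular search engine: it assumes complete tryptic cleavage (after lysine or arginine, but not before proline), no missed cleavages, no modifications, singly protonated peptides, and a 50 ppm tolerance chosen only for illustration. The function names and the two toy sequences are hypothetical; the residue masses are the standard monoisotopic values.

```python
import re

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N and C termini)
PROTON = 1.00728   # charge carrier for the [M+H]+ ion


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mh(peptide):
    """Theoretical monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def matched_peaks(sequence, observed_mz, tol_ppm=50.0):
    """Count observed peaks that fall within tol_ppm of any theoretical peptide."""
    theoretical = [peptide_mh(p) for p in tryptic_digest(sequence)]
    return sum(
        1 for mz in observed_mz
        if any(abs(mz - t) / t * 1e6 <= tol_ppm for t in theoretical)
    )


def best_match(database, observed_mz, tol_ppm=50.0):
    """Return the database entry whose in-silico digest explains the most peaks."""
    return max(database, key=lambda name: matched_peaks(database[name], observed_mz, tol_ppm))


# Hypothetical two-entry "database" and a simulated peak list.
toy_database = {"protA": "AKRPGKLLR", "protB": "GGGKWWWR"}
peaks = [peptide_mh("AK"), peptide_mh("LLR") + 0.001]
print(best_match(toy_database, peaks))
```

Real search tools score matches statistically rather than by a raw count; the count is used here only to keep the comparison of measured and predicted masses visible.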
MS-Based Protein Profiling
Although 2-DE has been widely used for proteome analysis, this methodological approach has several limitations. For example, it is inadequate for the analysis of more complex mixtures, and detection as well as identification is strongly biased toward the more abundant proteins [43–45]. Moreover, hydrophobic proteins, such as membrane proteins, which are not readily soluble in aqueous media, are rarely detected with 2-DE.
In MS-based protein profiling, the proteins are enzymatically digested and subjected directly to MS. This system, also referred to as “shotgun proteomics,” features a protein separation step coupled to a mass spectrometer with superior resolving power and dynamic mass range. Most popular at present are two-dimensional (2D) (strong cation exchange/reversed phase) [47, 48] or three-dimensional (strong cation exchange/avidin/reversed phase)  chromatographic separation methods of peptide mixtures. The protein samples can also be prefractionated using SDS-polyacrylamide gel electrophoresis (SDS-PAGE) or isoelectric focusing (IEF) prior to analysis.
MS-Based Quantitative Analysis
Several MS-based strategies have been developed that allow different samples to be compared quantitatively. In extracted ion current (XIC)-based quantification, the signal intensity of peptides that elute from the chromatographic column is plotted over time, and the area under this curve is the XIC [35, 50]. In this approach, the intensity of the peptide signals between two states can be compared. Two major advantages of XIC-based quantification are that no labeling is required and that it can be used with any type of sample. A more versatile approach for precise relative quantification involves the differential labeling of two or more sets of proteins or peptides derived from different cell states with light and heavy isotopes of the same chemical reagent followed by MS analysis (Fig. 3). These techniques also allow relative quantification of basic, hydrophobic, or large proteins excluded from analysis using 2-DE or difference in-gel electrophoresis (DIGE).
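As a rough illustration of XIC-based, label-free comparison, the sketch below integrates a peptide's intensity over retention time with the trapezoidal rule and compares the resulting areas between two states. The function names and the two elution profiles are hypothetical, and real workflows would add peak detection, retention-time alignment, and normalization, none of which is shown here.

```python
def xic_area(times, intensities):
    """Area under an extracted ion chromatogram (trapezoidal rule)."""
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    area = 0.0
    for i in range(1, len(times)):
        mean_height = 0.5 * (intensities[i] + intensities[i - 1])
        area += mean_height * (times[i] - times[i - 1])
    return area


def fold_change(area_reference, area_sample):
    """Ratio of the sample area to the reference area for the same peptide."""
    if area_reference <= 0:
        raise ValueError("reference area must be positive")
    return area_sample / area_reference


# Hypothetical elution profiles of one peptide in two cell states.
retention_min = [0.0, 1.0, 2.0, 3.0, 4.0]
undifferentiated = [0.0, 10.0, 20.0, 10.0, 0.0]
differentiated = [0.0, 20.0, 40.0, 20.0, 0.0]

area_u = xic_area(retention_min, undifferentiated)
area_d = xic_area(retention_min, differentiated)
print(area_u, area_d, fold_change(area_u, area_d))
```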
Protein chips will likely be the next major manifestation of the revolution in proteomics. They offer another way to analyze low-abundance proteins and have the potential for high-throughput applications to identify biomarkers. Protein chips differ from the previously described methods: whereas screening by 2-DE or liquid chromatography (LC)-MS/MS can potentially detect any protein, protein chips can provide data only on a set of proteins selected by the investigator (Fig. 4).
Modern surface-enhanced laser desorption and ionization (SELDI) technology uses MS as a read-out system to analyze differential protein expression on spot arrays. Comparison of two mass spectrometry data sets generated from two different samples immediately reveals the differentially expressed proteins. Thus, high-throughput analysis of crude samples readily and rapidly generates data that can be used for diagnosis or prognosis. The key disadvantage is that the mass spectrum obtained usually does not enable identification of the proteins analyzed, necessitating additional experimental procedures (e.g., enrichment by affinity chromatography and identification by methods such as tandem MS).
Stem Cell Proteomics
Profiling and Differential Expression Analysis
Proteome mapping serves as a starting point for building up a comprehensive database of the SC proteome. Several groups have used proteomics to identify SC-specific proteins in mouse ESCs (mESCs) [52–54], human ESCs (hESCs) [54, 55], human umbilical cord blood (UCB) MSCs, human bone marrow (BM) MSCs, rat NSCs, and human NSCs. A comprehensive list of proteomic investigations including different SC types, practical approaches, and major achievements is given in Table 1. Although proteins involved in energy metabolism comprise the largest group of identified proteins in adult SCs [56, 58, 59], a significant proportion of identified proteins in ESCs are involved in protein synthesis, processing, and transport [53, 55], reflecting the potential of hESCs to either maintain an undifferentiated state or quickly change phenotype, as observed in rapid differentiation processes. One of the characteristics of the protein subset identified in ESC lines is that they contain relatively abundant nuclear proteins in terms of both variety and protein content. This might be related to the high nucleus-to-cytoplasm ratio of ESCs.
Table 1. A summary of published papers about stem cell proteomics
In all, 92 proteins were commonly identified in both mouse and human ESCs (Fig. 5). The comparison of rat NSCs with human NSCs and the comparison of human UCB-MSCs with human adipose-derived (AD) MSCs revealed 52 and 65 MSC- and NSC-specific proteins, respectively. NSCs have been shown to be more similar to ESCs than to MSCs (Fig. 5). The global overlap between genes expressed in ESCs and NSCs supports previous results at the transcriptome level and corroborates the observed default differentiation pathway of ESCs to neural lineages. This is in line with observations in which embryonic cells of both frogs and mice become neural cells in the absence of cell-to-cell signaling. Possible reasons for discrepancies in proteome analyses of SCs are discussed below.
Over the past few years, there has been a growing interest in applying proteomics to study differential expression of SC genes at different developmental stages, specifically aiming to unravel the regulatory networks active during differentiation of mESCs [63–68], hESCs, MSCs [57, 60, 69, 70], NSCs [27, 59], and HSCs. Interestingly, the nine SC-specific proteins distinguished in this review were among those differentially expressed; that is, Peroxiredoxin 1 [27, 59, 60, 63], Heat shock cognate 71 kDa protein [27, 59], Enolase 1 [59, 66], 78-kDa glucose-regulated protein precursor [27, 66], T-complex protein 1, Translationally controlled tumor protein, and ATP synthase β chain. A small number of other differentiation-associated proteins have also been published, including proteins involved in stress response and oxidative defense (HSP27 [60, 64, 65, 72], 60-kDa heat shock protein [59, 64, 66], and Peroxiredoxin 4 [59, 64]), the cell cytoskeleton (Tubulin-α [27, 54, 59, 63, 64, 72] and vimentin [63, 65, 66]), a receptor for intracellular transport (Syntaxin 7 [27, 60]), and a multifunctional protein (calreticulin [64, 66, 72]).
Although these studies have generated a wealth of data, it is rather difficult to create a definitive proteome profile of undifferentiated and differentiated SCs from the different published studies. One of the major hurdles that have to be overcome in large-scale proteomic studies in general is the reduction of sample complexity. As yet, there is no single preparation method that allows identification of all proteins present in a sample. Applying different separation methods will produce various sample compositions, thus resulting in different data sets after proteome analysis. Also, different quantitative analysis methods may show variation in accuracy and sensitivity to samples of various complexities. Wu et al. compared three quantitative methods frequently used in proteomics: 2-DE-DIGE, isotope-coded affinity tag (ICAT), and isobaric tags for relative and absolute quantitation (iTRAQ). They reported that there is a limited overlap of differentially expressed proteins identified by the three methods from two closely related HCT-116 cell lines, suggesting the complementary nature of these approaches. Nevertheless, the complementary information obtained through different methods should potentially provide a better portrait of the biological system under investigation.
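The idea of limited overlap between methods can be made concrete with a small set-based sketch. It reports the proteins found by every method and a pairwise Jaccard index (shared proteins divided by the union). The protein identifiers below are placeholders invented for illustration and are not data from Wu et al.; the function name is likewise hypothetical.

```python
from itertools import combinations


def overlap_report(hits_by_method):
    """Return (proteins found by all methods, pairwise Jaccard indices)."""
    shared_by_all = set.intersection(*hits_by_method.values())
    pairwise = {}
    for (name_a, set_a), (name_b, set_b) in combinations(hits_by_method.items(), 2):
        union = set_a | set_b
        pairwise[(name_a, name_b)] = len(set_a & set_b) / len(union) if union else 0.0
    return shared_by_all, pairwise


# Placeholder identifiers, NOT results from any published comparison.
hits = {
    "2-DE-DIGE": {"P1", "P2", "P3"},
    "ICAT": {"P2", "P3", "P4", "P5"},
    "iTRAQ": {"P3", "P5", "P6"},
}

common, jaccard = overlap_report(hits)
print(sorted(common))
for pair, value in jaccard.items():
    print(pair, value)
```

A low Jaccard index between two methods indicates that each contributes proteins the other misses, which is the sense in which the approaches are complementary.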
On the other hand, differences in culture methodologies applied in different laboratories are likely to induce variations in protein expression. SCs are notoriously difficult to culture compared with more conventional cell lines; this difficulty is mainly due to our lack of knowledge about SCs. Some culture methodologies that work in one laboratory may not work as well in another laboratory because of unknown or less well-defined factors (e.g., serum batches) that affect cell growth and behavior. In addition, the different methods used to derive SC lines will also contribute to differences in cell line characteristics (discussed further in Heterogeneity of Proteome). These factors force individual laboratories to empirically adjust and optimize the culture conditions required to grow SCs. Thus, the different protein separation techniques, proteome analysis methods, and culture conditions, all of which depend on the interest of individual research groups in a specific topic, result in the generation of various proteome data sets that are almost impossible to compare directly. However, the different proteome profiles that have become and will become available are usually supplementary and thereby complement our overall knowledge of SCs and the proteins expressed in different environments as well as under different culture conditions.
Approximately 20%–30% of all genes in an organism encode integral membrane proteins, which are involved in numerous cellular processes. The target residues for tryptic cleavage (i.e., lysine and arginine) are mainly absent in transmembrane helices and preferentially found in the hydrophilic part of these lipid bilayer-incorporated proteins. Because of protein aggregation during IEF, 2-DE is unsuitable for the separation of integral membrane proteins and is limited to detection of membrane-associated proteins and membrane proteins with a low hydrophobicity (e.g., those having only one or two transmembrane helices). In contrast, the combination of one-dimensional (1D) gel separation and LC-MS/MS has been applied with success. Another, more successful approach to isolating membrane proteins relies on cell surface labeling in combination with high-resolution 2D LC-MS/MS. First, cell surface proteins of intact cells are selectively labeled with the membrane-impermeable reagent biotin, and biotinylated plasma membrane proteins are then enriched via affinity capture using immobilized avidin. The biotinylated proteins can be separated by gel electrophoresis and identified with MS. The only record of cell surface proteome characterization of ESCs was made by Nunomura et al., who studied cell surface proteins using cell surface labeling of undifferentiated mESCs (line D3) coupled to high-resolution 2D LC-MS/MS. They identified 324 proteins, 235 of which had a putative signal sequence and/or transmembrane segments. Using 1D gel separation followed by LC-MS/MS, Foster et al. applied an XIC-based quantification method and identified 104 membrane proteins from human MSCs; they found that expression levels changed during differentiation toward osteoblast cells.
In both studies, many of the identified proteins were abundant housekeeping proteins, such as ribosomal constituents, structural molecules, histones, and chaperones. Although some of these proteins might be associated with the membrane, it was difficult to distinguish them from the intracellular components released by dead cells. Combining this method with quantitative proteomic approaches, such as stable isotope labeling, will provide valuable information about stage- and lineage-specific expression of SCs.
Post-Translational Modification
Many regulatory steps, especially those involved in cell proliferation, migration, and differentiation, depend on protein PTMs rather than protein abundance . Several 2-DE reports have identified large numbers of isoforms or PTMs in SCs (Table 1). By comparing proteomic data with transcriptome analyses, Unwin et al.  showed that the shift in proteome from long-term reconstituting HSCs (Lin−Sca+Kit+; LSK+) to non-long-term reconstituting progenitor cells (Lin−Sca+Kit−; LSK−) was associated with post-transcriptional control of protein levels. Another study, performed by Schrattenholz et al. , enabled the enrichment of phosphoproteins of neuronal derivatives of mESCs that were exposed to chemical ischemia in a differential and quantitative proteome analysis. Moreover, in a study that was restricted to a defined set of proteins, Prudhomme et al.  used a computational systems biology approach to study phosphorylation states of 31 intracellular signaling network components across 16 different stimuli at three time points. They applied quantitative Western blotting and partial-least-squares modeling to determine which components showed the strongest correlation with cell proliferation and differentiation rates .
Kratchmarova et al. applied a quantitative phosphoproteomics approach to study the effects of growth factors (epidermal growth factor [EGF] and platelet-derived growth factor [PDGF]) on human MSCs. Carrette et al. metabolically labeled the proteins in cell culture using stable isotope labeling by amino acids in cell culture (SILAC) (Fig. 3), combined the cell lysates of the three states, and incubated this mixture with antibodies against phosphotyrosine. The precipitated complexes were resolved with 1D SDS-PAGE and proteolytically digested, after which the resulting peptide mixture was analyzed with LC-MS/MS. These results showed that EGF and PDGF modulate the osteogenic capability of MSCs through mitogen-activated protein kinase (MAPK)/ERK, p38 kinase, and phosphatidylinositol 3-kinase signaling.
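For a three-state SILAC experiment of this kind, relative quantification reduces to intensity ratios between the isotopic forms of each peptide. The sketch below summarizes a protein by the median of its peptide ratios, which is one common choice and not necessarily the one used in the study. The assignment of light, medium, and heavy labels to particular conditions, the function name, and the intensities are all assumptions made for illustration.

```python
from statistics import median


def protein_ratios(peptide_intensities):
    """Median medium/light and heavy/light ratios over a protein's peptides.

    peptide_intensities: list of (light, medium, heavy) intensity triples.
    Peptides without a positive light signal are skipped.
    """
    usable = [(l, m, h) for l, m, h in peptide_intensities if l > 0]
    if not usable:
        raise ValueError("no peptide with a positive light-channel intensity")
    medium_over_light = median(m / l for l, m, h in usable)
    heavy_over_light = median(h / l for l, m, h in usable)
    return medium_over_light, heavy_over_light


# Hypothetical intensities for three peptides of one protein; which label
# corresponds to which treatment is an assumption of this example.
peptides = [(100.0, 200.0, 50.0), (80.0, 176.0, 40.0), (120.0, 228.0, 66.0)]
m_ratio, h_ratio = protein_ratios(peptides)
print(m_ratio, h_ratio)
```

The median is used because it is less sensitive than the mean to a single poorly quantified peptide.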
Puente et al.  sought to characterize the SC state by identifying the phosphoproteome of mESCs and their derivatives formed in embryoid bodies (EBs). Samples were loaded onto phosphoprotein-affinity columns, and eluted proteins were separated by 2-DE followed by silver staining. Proteins visualized with silver stain were identified by MALDI MS/MS or LC-MS/MS. The set of proteins that exhibited altered PTM during differentiation included several proteins previously displayed in gene expression arrays as conserved features of the SC phenotype. Proteins related to protein catabolism, protein folding, chromatin remodeling, and other functions were found to exhibit altered phosphorylation between the ESC and EB states. As such, these data suggest that kinase activity and the phosphorylation state of target substrates act as critical regulators of SC function.
Heterogeneity of Proteome
The reproducibility of proteome profiles of individual SC samples or their derivatives generated under similar conditions is a major criterion for large-scale proteomics-based studies. The proteome of a cell is highly dynamic and depends on several parameters, including genetic background, the method of derivation, growth condition, and the stage of the cell cycle during sample collection. Therefore, samples should be prepared from cells in the same physiological state to allow accurate and reliable quantitative proteome comparisons with respect to protein up- or downregulation.
Zenzmaier et al.  compared CD34+ preparations from five different umbilical cord samples. Out of hundreds of spots detected on 2D gels, they found only 52 common proteins, 22 of which were identified using nano-high-performance LC-MS/MS . Since the purity of the cell samples was >88%, the observed heterogeneity could not be attributed to contaminating cells. Instead, the difference in the protein patterns was interpreted as SC-intrinsic heterogeneity.
We analyzed the proteome of three hESC lines in triplicate and identified 54 and 14 proteins showing quantitative (p ≤ .01) and qualitative changes, respectively. Moreover, van Hoof et al. reported that the expression levels of proteins such as β-actin and Oct4 were similar between the hESC lines NL-HESC-01 and HES-2, whereas the expression ratios of several of these proteins were different in another hESC line, HUES-1. HES-2 and NL-HESC-01 cells were both passaged mechanically by a cut-and-paste method in serum-containing medium [84, 86], whereas HUES-1 cells were passaged enzymatically by trypsinization and cultured in serum replacement with basic fibroblast growth factor. Potential sources of variation among hESC lines include the following (reviewed in [87, 88]): (a) differences due to origin of cell lines [(i) genomic diversity; (ii) stage of preimplantation embryo at derivation; (iii) conditions of early culture (feeder layer, culture conditions); (iv) differences in culture and derivation procedures applied in laboratories, such as different feeder cell types and densities, culture substrates, culture media, growth factors/other additives, and freezing method; (v) the passage number and method of passaging [2, 90, 91]; and (vi) imprinting and X-inactivation]; (b) differences arising over time in culture [(i) genetic changes (loss or gain of specific sequences); and (ii) general and specific epigenetic changes (DNA methylation, histone acetylation, and microRNAs; reviewed in )]; and (c) differences due to mosaicism in cultures [(i) partial or terminal differentiation of subpopulations within cultures; and (ii) variation among epigenetic and genetic changes].
In adult SCs, it was shown that the proliferation and osteogenic capacity of human MSCs decrease during serial subculturing . Moreover, passage-specific proteins were found, which were suggested to be differentially regulated and to play a role in the decrease of osteogenic differentiation potential under serial subculturing.
The purification and extraction of specific SC-derived cell types and the consistency and reproducibility of sample generation are thus considered important issues. SC differentiation usually yields mixed and heterogeneous cell populations. Therefore, optimization of protocols for enhancing differentiation toward a specific cell lineage and for the subsequent purification should be taken into account (reviewed in ). Feasible methods that may help to achieve this include the following: (a) addition of specific combinations of growth factors or chemical morphogens, (b) changing the physical and geometrical microenvironment, (c) coculture or transplantation of SCs with inducer tissues or cells, (d) implantation of SCs into specific organs or tissues, and (e) overexpression of transcription factors associated with development of specific cell lineages. However, to date, these strategies have not yielded pure populations of mature progeny, and efficient protocols are still required to purify specific cell populations. Methods such as fluorescence-activated cell sorting and magnetic-activated cell sorting allow such purification, but they depend on the cell expressing a surface marker that can be recognized by a fluorescent or magnetic microbead-tagged antibody; to be effective, the marker needs to be cell-type-specific. In most cases, these cell markers are not commercially available; thus, sorting methods rely on, for example, genetic modification of SCs by tagging a lineage-specific promoter to a fluorescent marker. Alternatively, cells could be transduced with a drug-resistance gene instead of a marker, to allow preferential selection of subpopulations.
Application of Protein Array to Stem Cell Proteomics
Protein arrays offer a different solution and have the potential for high-throughput applications to identify novel protein markers and molecular pathways. Hayman and Przyborski applied SELDI-TOF to rapidly generate protein peakmap bioprofiles. They demonstrated that this approach can be used with up to 100% accuracy to distinguish human ECCs from differentiated derivatives. It should be noted that this approach does not identify the individual molecules expressed in the cell sample. Yet if the identification of a particular protein is required, the current approach can be combined with other technology, such as SELDI-tandem MS. Using cytokine protein arrays, it has been shown that cytokine induction and signal transduction are important for the differentiation of human UCB-MSCs. Sakaguchi et al. used a ProteinChip system to identify a molecule in medium conditioned by OP9, a BM cell line, that is responsible for neurosphere formation from NSCs.
The application of reverse-phase protein arrays for the analysis of primary acute myelogenous leukemia samples, as well as leukemic and normal SCs, has been demonstrated . Using this strategy, the differences in protein expression in as few as three cell protein equivalents could be detected. Therefore, it was suggested that this approach can be applied as a highly reliable and reproducible high-throughput system for rapid, large-scale proteomic analyses of protein expression and phosphorylation state in primary acute myelogenous leukemia cells, as well as in human SCs.
The secretome is defined as a subset of the proteome that contains all proteins actively exported out of a cell from any origin. The type of proteins secreted by the cells strictly depends on the type of cell and the cellular state; therefore, the secretome reveals much about what is going on inside the cell.
A proteomics approach was used to characterize an environment that supports the growth of undifferentiated hESCs and to identify factors critical for their independent growth. Proteome analysis of conditioned medium (CM) from mouse embryonic fibroblast feeder layers (STO cell line) and a human neonatal foreskin cell line (HNF02) resulted in the identification of several proteins involved in cell growth, differentiation, and extracellular matrix formation and remodeling; many intracellular proteins were also identified.
Zvonic et al.  compared the secretomes of CMs obtained from four individual primary AD-SC cultures in uninduced or adipogenic-induced conditions and identified several proteins, such as adiponectin and plasminogen activator inhibitor 1, and multiple serine protease inhibitor proteins (serpins).
These studies indicate the complexity of the environment formed by the feeder cells and provide a useful starting point for future studies. Secretome studies show a high potential for identification of biomarkers involved in many cellular processes, including growth, division, differentiation, development, and death.
Although considerable progress in human transplantation medicine has been made, several major obstacles still restrict more widespread application of cell transplantation and in particular that of SCs. The major clinical obstacle that has to be overcome is demonstrating the safety and feasibility of cell therapy. Proteomic analyses of tissues and body fluids after cell therapy could address these concerns. For example, Kaiser et al.  analyzed urine after HSC transplantation (HSCT) and could clearly distinguish between patients with graft-versus-host disease (GVHD) and those with no problems after HSCT with a high specificity (82%) and a sensitivity of 100%.
Wang et al.  quantitatively analyzed the human plasma proteome before and after the onset of GVHD, leading to the identification of a large number of proteins that are affected by GVHD after HSCT. They identified 75 proteins that exhibited quantitative changes between the pre- and post-GVHD samples . Some of these proteins were well-known acute-phase reactants, including serum amyloid A, apolipoproteins A-I/A-IV, and complement C3.
To study salivary protein changes that occur after HSCT, Imanguli et al. analyzed serially collected saliva samples from 41 patients undergoing allo-HSCT using SELDI-MS in conjunction with 2-DE. Significant changes in multiple salivary proteins that lasted at least 2 months post-transplant were detected, including upregulation of lactoferrin and secretory leukocyte protease inhibitor and downregulation of secretory IgA. Weissinger et al. could correlate proteomic data with the clinical diagnosis of acute GVHD. From their proteome analysis, a tentative acute GVHD-specific model consisting of 31 polypeptides was chosen that allowed them to distinguish between patients with GVHD and those with no problems after HSCT with high specificity (98%) and a sensitivity of 100%. The subsequent blinded evaluation of 599 samples enabled diagnosis of acute GVHD, even prior to clinical diagnosis, with a sensitivity of 83.1% and a specificity of 75.6%.
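Sensitivity and specificity figures such as those quoted above are derived from a comparison of the test result with the clinical diagnosis for each sample. The short sketch below shows that calculation on a made-up set of labels; the counts are hypothetical and do not reproduce the data of any cited study, and the function names are illustrative.

```python
def confusion_counts(has_disease, test_positive):
    """Return (TP, FN, TN, FP) from paired lists of booleans."""
    if len(has_disease) != len(test_positive):
        raise ValueError("both lists must describe the same samples")
    tp = fn = tn = fp = 0
    for truth, predicted in zip(has_disease, test_positive):
        if truth and predicted:
            tp += 1
        elif truth:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Fraction of diseased samples that the test flags as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of disease-free samples that the test reports as negative."""
    return tn / (tn + fp)


# Hypothetical cohort: 50 samples with the condition, 100 without.
truth = [True] * 50 + [False] * 100
calls = [True] * 45 + [False] * 5 + [False] * 80 + [True] * 20

tp, fn, tn, fp = confusion_counts(truth, calls)
print(sensitivity(tp, fn), specificity(tn, fp))
```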
These results showed the power of proteomics as an unbiased laboratory-based screening method, enabling diagnosis and pre-emptive therapy.
Insight into Stem Cell Protein Networks and Signaling Pathways for Pluripotency
Understanding the molecular mechanisms underlying SC pluripotency should illuminate fundamental properties of SCs and the process of cellular reprogramming. Proteomics has proved to be a powerful approach for gaining insight into the key intracellular signals governing SC self-renewal and differentiation.
In an attempt to analyze the cue-signal-response relationship underlying SC self-renewal versus differentiation, the phosphorylation states of 31 intracellular signaling network components were quantitatively studied under fibronectin, laminin, leukemia inhibitory factor, and fibroblast growth factor-4 treatments. Using a multivariate proteomic approach, Prudhomme et al. identified a set of signaling network components most critically associated with differentiation (Stat3, Raf1, MEK, and ERK), proliferation of undifferentiated mESCs (MEK and ERK), and proliferation of differentiated cells (PKBα, Stat3, Src, and PKCε).
A quantitative MS-based phosphoproteomics approach has been applied to elucidate critical differences in the signaling mechanisms of EGF and PDGF that lead to their differential effects on osteoblast differentiation of human MSCs (as described in Post-Translational Modification). By studying tyrosine-phosphorylated proteins in response to EGF and PDGF, Kratchmarova et al. found that less than 10% of all phosphotyrosine proteins are specific to either the EGF or PDGF activation program in human MSCs, revealing a range of widely shared signaling pathways. Examples include the mitogen-activated protein kinase (MAPK) cascades and signal attenuation through receptor ubiquitination followed by endocytic removal from the cell surface. However, based on the observation that EGF-treated human MSCs but not PDGF-treated cells undergo osteogenic differentiation, the variation contained in the small fraction (less than 10%) of differentially activated proteins was clearly of crucial significance. The most striking difference was the activation of phosphatidylinositol 3-kinase exclusively by PDGF, signifying a possible control point in the osteogenic differentiation process. These results demonstrated that, at least in some cases, cell fate decisions can be made by preferentially activating a small subset of the signaling network.
Using a chip-based proteomics approach, factors affecting the proliferation of NSCs have been screened. Sakaguchi et al. used a ProteinChip system to identify molecules present in conditioned medium of OP9, a BM cell line that induces neurosphere formation from NSCs. In this screen, they identified a soluble carbohydrate-binding protein, Galectin-1, as a candidate. Galectins make up a family of carbohydrate-binding lectin proteins that are implicated in cell adhesion, growth, differentiation, neoplastic transformation, and metastasis. Galectin-1 has also been identified as one of the relatively abundant proteins in mouse embryonic fibroblast-conditioned medium and human foreskin fibroblast-conditioned medium. Based on results from intraventricular infusion experiments and phenotypic analyses of knockout mice, Sakaguchi et al. suggested that the carbohydrate-binding activity of Galectin-1 is required for its promotion of adult neural progenitor cell proliferation.
In a recent investigation, proteomics has been applied to gain insight into the regulatory protein networks in which Nanog operates in mESCs. A construct bearing the pluripotency factor Nanog with a Flag tag, as well as a peptide tag that serves as a substrate for in vivo biotinylation, was expressed in ESCs. The tagged protein was recovered from cellular lysates with streptavidin beads and further purified using anti-Flag antibodies. MS was then applied to identify its interacting partners. Not surprisingly, many of the candidates were other transcription factors, some of which had already been associated with ESCs. The resulting data set was used to generate a complex network of interacting proteins that was depicted in a concise scheme. Most of the proteins in this network were shown to be essential for early development and/or ESC properties. The knockout of several network proteins, including Prmt1, YY1, Rnf2, BAF155, Rybp, Oct4, Cdk1, NF45, Sall4, Elys, Tif1β, Pelo, Dax1, and REST, resulted in defects in proliferation and/or survival of the inner cell mass or other aspects of early development. The knockout of Err2, Rif1, Nac1, and Zfp281 resulted in defects in self-renewal and/or differentiation of ESCs. The coexpression of most of the network genes and their roles as both targets and effectors indicate that this interactome may serve as a functional module committed to maintaining ESC pluripotency. This network provided a solid base for further exploration of the signaling pathways involved in ESC maintenance. Sall4 had been found to be involved in these signaling pathways by three other groups independently [107–109]. This protein was also identified in the large-scale proteome study by van Hoof et al.; however, the association with Oct4 and Nanog had not been made. This illustrates the likelihood that numerous proteins specifically identified in SCs play a significant role in SC-sustaining processes.
Venn diagrams such as Figure 5 will narrow down the search for novel ESC-associated proteins; however, the involvement and role of such candidates in SC maintenance need to be confirmed by additional experiments.
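The Venn-diagram comparison described above amounts to set operations on the protein identifier lists reported by different studies. The sketch below illustrates one way this could be done in Python. The function name, the study labels, and the identifiers P1 to P5 are all hypothetical placeholders and do not correspond to real data sets or proteins.

```python
from collections import Counter
from typing import Dict, Set


def shared_candidates(datasets: Dict[str, Set[str]], min_studies: int) -> Set[str]:
    """Return identifiers reported by at least `min_studies` of the data sets."""
    # Count in how many data sets each identifier appears (once per data set).
    occurrences = Counter(
        protein for proteins in datasets.values() for protein in set(proteins)
    )
    return {protein for protein, n in occurrences.items() if n >= min_studies}


# Placeholder identifier lists standing in for three independent proteome studies.
datasets = {
    "study_A": {"P1", "P2", "P3"},
    "study_B": {"P2", "P3", "P4"},
    "study_C": {"P3", "P5"},
}

# Central region of the Venn diagram: found in all three studies.
core = shared_candidates(datasets, min_studies=3)

# Looser filter: found in at least two of the three studies.
frequent = shared_candidates(datasets, min_studies=2)

print(sorted(core))
print(sorted(frequent))
```

Requiring detection in every study gives the most conservative candidate list, whereas a lower threshold tolerates the incomplete coverage typical of individual proteome analyses. Either way, the overlap only prioritizes candidates and does not replace functional validation.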
Future Challenges and Outlook
Proteomic methods have produced large data sets of proteins involved in mechanisms and pathways that regulate SC proliferation and differentiation. The insights thus obtained in SC biology have also created many opportunities to improve public health. In recent years, numerous proteomics techniques have been developed and are continuously being improved, in both peptide and protein separation (e.g., LC and 2-DE), as well as MS methods and accuracy. However, several important issues that remain to be addressed rely on further technical advances in proteomics analysis. When large proteomes consisting of thousands of proteins are analyzed, the dynamic resolution is restricted and only the most abundant proteins can be detected.
Despite advances in non-2-DE-based proteomics technologies, 2-DE remains the pivotal and most widespread method in current proteomics [73, 80]. However, we believe that large-scale MS-based quantification approaches will significantly contribute to our understanding of SCs in the future and will soon become the standard for analyzing the SC proteome. To enable proteome-wide quantification, further optimization of chemical-labeling reagents, including chemicals targeting specific protein classes, and of MS instrument performance is necessary.
Proteome-wide quantification of membrane proteins requires methods that solve problems such as contamination with intracellular components, protein insolubility, and loss of hydrophobic peptides, all of which prevent protein identification. Although protein chips are still under development, they have already proven their value for studying protein functions and expression patterns. Because they require only small amounts of material, they are exceptionally well suited to the study of SC populations. However, further optimization of these techniques is needed before they can be widely used in proteome analyses. The application of several array-based approaches that are missing from the current SC literature, such as phosphorylation or G protein-coupled receptor arrays, will provide highly valuable contributions.
The advantage of MS-based proteomics is its ability to indiscriminately study PTMs that affect activity and binding properties of proteins, thereby altering their roles within the cell. It is likely that the phosphoproteome, protein interactions (interactome), and glycomics will soon become major areas of SC proteomics research.
One of the major problems in SC studies is obtaining consistent results across studies of the same type. Proteomics may very well contribute to gaining insight into SC functioning and behavior and thus provide clues for how to tackle this problem. In particular, gaining more insight into how SCs respond to their environment will improve our ability to control their behavior by applying better-defined culture methods.
Despite increasing conformity in proteomics applications and data storage, it remains difficult to draw consistent conclusions from individual studies because they differ in the cell types used, the methods of cell line establishment and maintenance, the number of passages, and the passaging methods applied. Standardization of proteomic methodologies and strategies between different groups of investigators and the introduction of standard operating procedures would facilitate the comparability of proteomics results. In addition, the establishment of unique databases for the ever-increasing wealth of information generated by proteome-wide and in-depth proteomic studies of SCs will be indispensable (Fig. 6). To this end, several initiatives were set up to characterize numerous existing hESC lines using standardized assay conditions to allow unrestrained comparison of the data sets generated. Such initiatives have been instigated by the International Stem Cell Initiative, the International Stem Cell Forum (http://www.stemcellforum.org), the NIH Stem Cell Unit (http://www.stemcells.nih.gov/research/nihresearch/scunit), and the American Type Culture Collection (http://www.stemcells.atcc.org). Combined, the various proteomic approaches will continue to revolutionize insights into SC biology.
Disclosure of Potential Conflicts of Interest
The authors indicate no potential conflicts of interest.
We gratefully thank Dr. Peter Hains (Australia) for critical reading and helpful comments on the manuscript. This work was supported by grants from the Royan Institute.