Comparative proteomics of the recently and recurrently formed natural allopolyploid Tragopogon mirus (Asteraceae) and its parents


  • Jin Koh,

    1. Department of Biology, University of Florida, Gainesville, FL 32611, USA
    2. Interdisciplinary Center for Biotechnology Research, University of Florida, PO Box 103622, Gainesville, FL 32610, USA
  • Sixue Chen,

    1. Department of Biology, University of Florida, Gainesville, FL 32611, USA
    2. Interdisciplinary Center for Biotechnology Research, University of Florida, PO Box 103622, Gainesville, FL 32610, USA
    3. Genetics Institute, University of Florida, Gainesville, FL 32610, USA
  • Ning Zhu,

    1. Department of Biology, University of Florida, Gainesville, FL 32611, USA
  • Fahong Yu,

    1. Interdisciplinary Center for Biotechnology Research, University of Florida, PO Box 103622, Gainesville, FL 32610, USA
  • Pamela S. Soltis,

    1. Genetics Institute, University of Florida, Gainesville, FL 32610, USA
    2. Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
  • Douglas E. Soltis

    1. Department of Biology, University of Florida, Gainesville, FL 32611, USA
    2. Genetics Institute, University of Florida, Gainesville, FL 32610, USA

Author for correspondence:
Douglas E. Soltis
Tel: +1 352 273 1963


  • We examined the proteomes of the recently formed natural allopolyploid Tragopogon mirus and its diploid parents (T. dubius, T. porrifolius), as well as a diploid F1 hybrid and synthetic T. mirus.
  • Analyses using iTRAQ LC-MS/MS technology identified 476 proteins produced by all three species. Of these, 408 proteins showed quantitative additivity of the two parental profiles in T. mirus (both natural and synthetic); 68 proteins were quantitatively differentially expressed.
  • Comparison of F1 hybrid, and synthetic and natural polyploid T. mirus with the parental diploid species revealed 32 protein expression changes associated with hybridization, 22 with genome doubling and 14 that had occurred since the origin of T. mirus c. 80 yr ago. We found six proteins with novel expression; this phenomenon appears to start in the F1 hybrid and results from post-translational modifications.
  • Our results indicate that hybridization has a greater impact on the proteome than polyploidization. Furthermore, two cases of homeolog-specific expression in T. mirus suggest that silencing in T. mirus was not associated with hybridization itself, but occurred subsequent to both hybridization and polyploidization. This study demonstrates the utility of proteomics in the analysis of the evolutionary consequences of polyploidy.


Polyploidy (whole-genome duplication, WGD) has played a major role in speciation and genome evolution in diverse organisms, including yeast (Kellis et al., 2004), vertebrates (Dehal & Boore, 2005; Hufton et al., 2008) and plants (Soltis & Soltis, 2000; Wendel, 2000; Osborn et al., 2003; Adams & Wendel, 2005; Chen & Ni, 2006; Chen, 2007; Jiao et al., 2011). Early estimates (based on chromosome number) of the frequency of polyploidy in angiosperms ranged from 30% to 80%. However, recent studies have revealed that all angiosperms have experienced one or more rounds of polyploidy. Angiosperms for which genome sequences are available exhibit evidence of at least one round of polyploidy (Arabidopsis Genome Initiative, 2000; Simillion et al., 2002; International Rice Genome Sequencing Project, 2005; Tuskan et al., 2006; Jaillon et al., 2007; Velasco et al., 2007; Wei et al., 2007; Ming et al., 2008; Huang et al., 2009; Paterson et al., 2009; Schmutz et al., 2010); genomic and transcriptomic data (including expressed sequence tags (ESTs)) across a diversity of angiosperms reveal multiple ancient polyploidization events (Ku et al., 2000; Cui et al., 2006). Indeed, ancient polyploidy events are associated with the emergence of both seed plants and angiosperms (Jiao et al., 2011), as well as with major large clades within angiosperms (Soltis & Soltis, 2000; Soltis et al., 2009; Fawcett & Van de Peer, 2010).

The most obvious genetic consequence of polyploidy is the simultaneous duplication of all genes, which may provide increased evolutionary potential for any given polyploid. Although genes duplicated via polyploidy can be subsequently silenced or lost (nonfunctionalization), the divergence of duplicated genes can provide genetic redundancy on which mutation and selection may act. These processes may lead to the possible acquisition of new gene functions (neofunctionalization) and the acquisition of divergent gene functions, such as tissue-specific gene expression (subfunctionalization); both neo- and subfunctionalization may be important sources of novelty and adaptability (Kirschner & Gerhart, 1998; Lynch & Conery, 2000; Force et al., 2005; He & Zhang, 2005). Studies of the expression patterns of duplicated genes in polyploids have been conducted using systems such as Arabidopsis (Comai et al., 2000; Lee & Chen, 2001; Madlung et al., 2002, 2005; Lawrence et al., 2004; Wang et al., 2004, 2006), cotton (Zhao et al., 1998; Liu et al., 2001; Adams et al., 2003; Hovav et al., 2008; Chaudhary et al., 2009; Flagel & Wendel, 2009), wheat (Feldman et al., 1997; Shaked et al., 2001; Kashkush et al., 2002, 2003; He et al., 2003; Bottley et al., 2006; Bottley & Koebner, 2008; Pumphrey et al., 2009), Brassica (Song et al., 1995; Lukens et al., 2004, 2006; Gaeta et al., 2007; Marmagne et al., 2010), Spartina (Baumel et al., 2001; Ainouche et al., 2004; Salmon et al., 2005; Fortune et al., 2007), Senecio (Hegarty et al., 2005, 2006, 2008; Hegarty & Hiscock, 2008) and Tragopogon (Tate et al., 2006; Buggs et al., 2009; Koh, 2010; Koh et al., 2010). Most of these approaches have been carried out at the genomic or transcript level using cDNA-amplified fragment length polymorphism (cDNA-AFLP) display, CAPS (cleaved amplified polymorphic sequences), reverse transcriptase-PCR or microarrays. 
Very few studies have examined the outcome of gene expression at the protein level in a polyploid and its parents (Bahrman & Thiellement, 1987; Albertin et al., 2005, 2006; Hu et al., 2011; Ng et al., 2012).

Proteomics is an important complement to genomics and studies of gene expression because proteins are the final products of genes and are more directly related to cellular metabolism and phenotypes. Indeed, the relationship between RNA transcripts and protein abundance is not direct because of post-transcriptional regulation and post-translational modifications (PTMs) (Gygi et al., 1999), making predictions about the proteome of a polyploid relative to its diploid progenitors difficult. Therefore, the application of proteomic approaches to polyploid systems will enhance our understanding of polyploid evolution and adaptation. Only a few studies have implemented proteomic analysis in polyploids. These include studies on Triticum (wheat; Bahrman & Thiellement, 1987; Islam et al., 2003), Brassica napus (Albertin et al., 2006), Gossypium (cotton; Hu et al., 2011) and Arabidopsis suecica (Ng et al., 2012). Several studies have shown additivity of the parental profiles in the proteomes of allopolyploids (Triticum, Gossypium, Arabidopsis) (Bahrman & Thiellement, 1987; Hu et al., 2011; Ng et al., 2012), as well as increased protein production in an autopolyploid (Brassica oleracea; Albertin et al., 2005); nonadditive patterns have been observed in wheat (Bahrman & Thiellement, 1987), cotton (Islam et al., 2003; Hu et al., 2011) and synthetic B. napus (Albertin et al., 2006). However, all of these studies employed two-dimensional electrophoresis (2-DE), which has limitations, including gel-to-gel variation and problems with quantification based on spot intensity. In addition, Triticum, Gossypium and Brassica are polyploid crops and may have experienced strong artificial selection during their evolutionary history, possibly masking other effects. Therefore, the use of robust methodology and the examination of a naturally occurring allopolyploid system will provide valuable new insights into the impact of hybridization and polyploidization on the proteome.

Two natural allotetraploids of Tragopogon have become models for polyploidy research (Soltis & Soltis, 1999; Soltis et al., 2004, 2012). Tragopogon mirus and T. miscellus were formed by genomic merger of the diploid species T. dubius and T. porrifolius, and T. dubius and T. pratensis, respectively. Polyploidization occurred in the northwestern USA after the introduction of the diploid progenitors from Europe in the early 1900s. The three diploids were not all reported from this region before 1928, and so T. mirus and T. miscellus cannot be more than c. 80 yr old (Ownbey, 1950; Soltis et al., 2004). Studies have revealed rapid homeolog loss, plus differential gene expression, in both polyploids relative to their parents, as well as chromosomal rearrangement and compensating aneuploidy (Tate et al., 2006, 2009; Lim et al., 2008; Buggs et al., 2009a,b, 2010, 2011, 2012; Koh, 2010; Koh et al., 2010; Chester et al., 2012). Here, we examined the proteomes of natural allopolyploid T. mirus and its diploid parents (T. dubius and T. porrifolius) on a per-cell basis using isobaric tag for relative and absolute quantification (iTRAQ) LC-MS/MS technology. We included an artificial F1 diploid hybrid line between T. dubius and T. porrifolius and a synthetic (S1 generation) allotetraploid T. mirus line. Analysis of these materials will help to determine the relative contribution of hybridization and genome doubling to expression differences at the proteomic level in young polyploids. We sought to determine whether: (1) differences in protein expression are present among the diploid parents, artificial F1 hybrid, natural allopolyploids and synthetic polyploid, (2) hybridization or polyploidy has had a larger impact on changes in protein expression, and (3) any proteomic changes correlate with existing changes obtained from transcriptional analysis (Koh, 2010; Koh et al., 2010).

Materials and Methods

Plant materials

For all three species, seeds were collected from natural populations in Pullman, WA, USA, and grown in the glasshouse at Washington State University (Pullman, WA, USA) and allowed to self-fertilize. Seeds were collected from these glasshouse-grown plants, germinated and grown under controlled conditions in the Department of Biology glasshouse at the University of Florida (Gainesville, FL, USA). Diploid F1 hybrid and synthetic allopolyploid (S1) plants used in this study were generated by J. Tate, who crossed T. dubius L. with T. porrifolius L. (Table 1) and produced synthetic polyploids (S1) using colchicine treatment of F1 hybrids (Tate et al., 2009). Seeds from the above materials were germinated at 20°C in a Petri dish containing 0.1% bleach; the plants were then grown in a glasshouse. Leaf segments of 30 cm in length were collected directly into liquid nitrogen from young plants, 8 wk after germination. Eight lineages were examined: two samples of natural Tragopogon mirus (Ownbey) as well as two of each of its diploid parents from the same locality (Pullman, WA, USA), one F1 hybrid and one synthetic allopolyploid (S1). For each of these eight lineages, we used two biological replicates in this study (Table 1). For each replicate, leaf tissue from 10 full sibs was combined to account for variation among individuals; thus, 20 plants were used for each lineage, except the F1 hybrid (10 plants).

Table 1. Tragopogon plant lineages used in this study and their iTRAQ (isobaric tag for relative and absolute quantification) tag number
Taxon | Description | Plant lineage | Tag number
T. porrifolius | Maternal parent (2n) | 2611-1 | 113
T. porrifolius | Maternal parent (2n) | 2611-11 | 114
T. dubius | Paternal parent (2n) | 2613-5 | 115
T. dubius | Paternal parent (2n) | 2613-24 | 116
T. mirus | Allotetraploid (4n) | 2680-3 | 117
T. mirus | Allotetraploid (4n) | 2680-7 | 118
F1 hybrid | Hybrid between 2613-24 (♀) × 2611-1 (♂) (2n) | 299 (2611-1 × 2613-24) | 119
S1 generation | Chromosomal doubling of hybrid between 2613-24 (♀) × 2611-11 (♂) (4n) | 73-1 (2611-11 × 2613-24) | 121

Protoplast preparation

Tragopogon protoplasts from leaves were isolated and purified as described in the protocols for Brassica (Zhu et al., 2009) and Triticum (Edwards et al., 1978) with the following modifications. Plants were collected early each morning between 09:00 and 09:30 h. Leaf segments of 4.5 cm in length were prepared with mechanical blades. Approximately 0.04 g of tissue was cut and placed in 2 ml of enzyme medium (0.45 M sorbitol, 5 mM CaCl2, 10 mM 2-(N-morpholino)ethanesulfonic acid (MES), 0.25% Macerozyme (PhytoTechnology Laboratories, Shawnee Mission, KS, USA), 1% Cellulase (PhytoTechnology Laboratories) adjusted to pH 5.8) in a Petri dish (50 mm × 20 mm). The segments were agitated in the enzyme medium and then washed twice with 2 ml of 0.45 M sorbitol, 5 mM CaCl2 and 10 mM MES by gentle shaking. After each washing, the released protoplasts were filtered through a coarse mesh tea strainer (1-mm apertures). We observed and recorded the cell size of protoplasts using a hemocytometer under a Leica DM4000B microscope (Leica, Buffalo Grove, IL, USA) with LAS Live Measurement DE software (Leica). To assess variation among individuals, we measured a mature basal leaf from 10 plants of the diploid F1 hybrid and from 50 plants each of the polyploid and its parents. From the protoplast cell size, we inferred the protoplast volume and the cell density per unit area of leaf tissue.

Total protein comparison

Total proteins were isolated from 3 cm of leaf tissue from each species with 100 mM tris(hydroxymethyl)aminomethane (Tris)-HCl, 1 mM dithiothreitol (DTT), 1 mM phenylmethylsulfonylfluoride (PMSF) and 0.1% Triton X-100. To assess variation among individuals, 54 plants per species were used in this assay. Protein assays were performed using Bradford solution (Invitrogen, Carlsbad, CA, USA) with the SoftMax Pro Software v5.3 (Molecular Devices, Downingtown, PA, USA).

Protein preparation

For each sample listed in Table 1, total proteins were isolated from 2 g of leaf tissue pooled from 10 plants per lineage, and purified as described in Hajduch et al. (2005), except that proteins were washed in 80% cold acetone to remove impurities (Thongboonkerd et al., 2002). Protein assays were performed using an EZQ® Protein Quantitation Kit (Invitrogen) with the SoftMax Pro Software v5.3 (Molecular Devices).

Protein digestion, iTRAQ labeling, strong cation exchange and MS

For each sample, 100 μg of protein was dissolved in dissolution buffer with 2% sodium dodecylsulfate (SDS) in the iTRAQ Reagents 8-plex kit (AB Sciex, Inc., Foster City, CA, USA). The samples were reduced, alkylated, trypsin digested and labeled according to the manufacturer’s instructions (AB Sciex, Inc.). Two independent maternal lines of T. porrifolius were labeled with iTRAQ tags 113 and 114, and two independent paternal lines of T. dubius were labeled with tags 115 and 116. Two independent allopolyploid lines of T. mirus were labeled with tags 117 and 118, and one diploid F1 hybrid line and one synthetic S1 line were labeled with tags 119 and 121, respectively (Table 1). The combined peptide mixtures were lyophilized and dissolved in strong cation exchange (SCX) solvent A (25% (v/v) acetonitrile, 10 mM ammonium formate and 0.1% (v/v) formic acid (pH 2.8)). The peptides were fractionated using an Agilent HPLC system 1100 with a polysulfoethyl A column (2.1 mm × 100 mm, 5 μm, 300 Å; PolyLC, Columbia, MD, USA). Peptides were eluted with a linear gradient of 0–20% solvent B (25% (v/v) acetonitrile and 500 mM ammonium formate (pH 6.8)) over 50 min, followed by ramping up to 100% solvent B in 5 min. The absorbance at 280 nm was monitored, and 12 fractions were collected and lyophilized (Supporting Information Fig. S1). A quadrupole time-of-flight QSTAR MS/MS system (AB Sciex, Inc.) was used for data acquisition, as described previously (Chen & Harmon, 2006; Zhu et al., 2009). Peptides were passed through the HPLC column by a linear gradient from 3% solvent B (96.9% acetonitrile (v/v), 0.1% (v/v) acetic acid) to 40% solvent B for 2 h, followed by ramping up to 90% solvent B in 10 min. Peptides were sprayed into the orifice of the mass spectrometer, which was operated in an information-dependent data acquisition mode.

RNA extraction, purification and cDNA synthesis

Total RNA from leaf tissue was extracted and treated with DNA-free DNase I (Ambion, Austin, TX, USA) as described in Koh et al. (2010). Purified RNA samples were quantified using a Nanodrop ND-4000 spectrophotometer (Thermo Scientific, Wilmington, DE, USA) and evaluated using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). First-strand cDNA was synthesized with 5 μg of purified RNA using reverse transcriptase (RT) Superscript III (Invitrogen, Carlsbad, CA, USA).

Quantitative real-time PCR

Quantitative real-time PCR assays for each target were performed via MyiQ ver. 2.0 with iQ ver. 5.0 (Bio-Rad, Hercules, CA, USA), using the primers (Supporting Information Table S1) and SYBR Green dye method described previously (Koh, 2010).

Data analysis

Comparison of protoplast cell size and protein amount  ANOVA was performed using JMP 8.0.1 (SAS Institute, Cary, NC, USA) to determine whether there were differences in cell size and protein amount among species. A two-sample t-test was performed to assess whether the cell size and protein amount differed between the diploid parental species.

MS data  The MS/MS data were processed by a thorough search considering biological modification and amino acid substitution against the National Center for Biotechnology Information (NCBI) nonredundant fasta database (7 852 350 entries), a customized nonredundant Asteraceae fasta database (5 125 244 entries) and a nonredundant Tragopogon diploid fasta database (T. dubius, T. porrifolius and T. pratensis; 4 153 438 entries) using the Fraglet and Taglet searches under the Paragon™ algorithm (Shilov et al., 2007) of ProteinPilot v.3.0 software (AB Sciex, Inc.). The Asteraceae database was used to enhance the success of protein identification because of the limited number of Tragopogon sequences available. The identified proteins were functionally annotated according to their similarity to other proteins based on protein–protein BLAST searches, protein family database information and/or literature information (Bevan et al., 1998). To examine homeolog-specific expression, we analyzed our data against nonredundant T. dubius (163 479 entries) and T. porrifolius (120 744 entries) fasta databases, respectively. After searching against these databases, the results were combined per experimental group (biological replicate 1 and biological replicate 2). Plant species, fixed modification of methylmethane thiosulfate-labeled cysteine, fixed iTRAQ modification of amine groups in the N-terminus and lysine and variable iTRAQ modifications of tyrosine were considered. The raw peptide identification results from the Paragon™ algorithm were further processed by the ProGroup™ algorithm, which assembled the results into the minimal set of detected proteins. The ProteinPilot cut-off score was set to 1.3, which corresponds to a confidence level of 95%. The false discovery rates were estimated by performing the search against concatenated databases containing both forward and reverse sequences (Table S2).
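The forward/reverse (target/decoy) estimate mentioned above can be illustrated with a minimal sketch. It uses one common estimator for concatenated databases, FDR ≈ 2 × decoy hits / all hits; this is an assumption for illustration, and the calculation performed inside ProteinPilot may differ. The function name and the example counts are hypothetical.

```python
def decoy_fdr(n_forward_hits, n_reverse_hits):
    """Estimate the false discovery rate from a concatenated forward + reverse search.

    Assumption: false matches hit forward and reverse sequences equally often,
    so the number of false forward hits is about the number of reverse hits.
    """
    total = n_forward_hits + n_reverse_hits
    if total == 0:
        return 0.0
    return 2.0 * n_reverse_hits / total


# Hypothetical counts, not taken from Table S2.
example_fdr = decoy_fdr(n_forward_hits=490, n_reverse_hits=10)
print(f"Estimated FDR: {example_fdr:.1%}")
```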

For protein quantification, only MS/MS spectra that were unique to a particular protein, and where the sum of the signal-to-noise ratio for all of the peak pairs was > 9, were used (software default settings, AB Sciex, Inc.). The accuracy of each protein ratio is given by a calculated error factor from the ProGroup analysis in the software, and a P value is given to assess whether the protein is significantly differentially expressed. The error factor is calculated with 95% confidence error, where it is the weighted standard deviation of the weighted average of log ratios multiplied by Student’s t factor. The P value is determined by calculating Student’s t factor by dividing the (weighted average of log ratios – log bias) by the weighted standard deviation, allowing the determination of the P value with n−1 degrees of freedom, where n is the number of peptides contributing to the protein relative quantification (software default settings, AB Sciex, Inc.). To be identified as being significantly differentially expressed, a protein should be quantified with at least three spectra (allowing the generation of a P value), with P < 0.05 with at least six peptides in both experimental replicates. Additivity of protein expression was assessed quantitatively; we calculated additive expression based on mid-parent values (MPVs; averaged values from two biological replicates of each parent) and used a fold change > 1.5 or < 0.5 relative to MPV to identify differential expression.
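The mid-parent comparison described above can be sketched as follows. This is a minimal illustration, assuming the fold change is the observed abundance divided by the MPV; the function names and the example abundances are hypothetical, and the P value and spectrum-count filters are not modeled.

```python
def mid_parent_value(parent_a_reps, parent_b_reps):
    """Average the replicates of each parent, then average the two parental means."""
    mean_a = sum(parent_a_reps) / len(parent_a_reps)
    mean_b = sum(parent_b_reps) / len(parent_b_reps)
    return (mean_a + mean_b) / 2.0


def classify_against_mpv(observed, mpv, upper=1.5, lower=0.5):
    """Label a protein as 'up', 'down' or 'additive' relative to the MPV."""
    fold = observed / mpv
    if fold > upper:
        return "up"
    if fold < lower:
        return "down"
    return "additive"


# Hypothetical relative abundances for one protein.
mpv = mid_parent_value([2.0, 2.0], [4.0, 4.0])
for label, value in [("hybrid", 4.8), ("synthetic", 3.3), ("natural", 1.2)]:
    print(label, classify_against_mpv(value, mpv))
```

With these thresholds a protein is called nonadditive only when it departs from the MPV by more than 50% upward or downward; values in between are treated as additive.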

Real-time RT-PCR  Relative expression of the target genes was calculated using the comparative Ct method (Applied Biosystems Inc., Carlsbad, California, USA). The differences in Ct values (ΔCt) between the target gene and endogenous control were calculated to normalize the differences in the cDNA concentrations for each reaction, as described previously (Koh, 2010).
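A minimal sketch of the comparative Ct calculation follows. It assumes the standard 2^(−ΔΔCt) form with roughly 100% amplification efficiency; the function names and Ct values are hypothetical and are not taken from Koh (2010).

```python
def delta_ct(ct_target, ct_control):
    """Normalize the target gene Ct to the endogenous control Ct."""
    return ct_target - ct_control


def relative_expression(ct_target_sample, ct_control_sample,
                        ct_target_reference, ct_control_reference):
    """Return the fold expression of the sample relative to the reference.

    Assumption: each PCR cycle doubles the product (2 ** -ddCt).
    """
    ddct = (delta_ct(ct_target_sample, ct_control_sample)
            - delta_ct(ct_target_reference, ct_control_reference))
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the sample reaches threshold two cycles earlier
# than the reference after normalization.
print(relative_expression(22.0, 18.0, 24.0, 18.0))
```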


Cell size, density and protein amount in Tragopogon

We examined the cell size, cell density and volume from diploids, an F1 hybrid, and both natural and synthetic polyploids (Fig. 1 and Tables 1, 2). The F1 hybrid has cells that are intermediate in size between those of its parents; the average protoplast volume of the F1 hybrid is 73 709.65 μm³, whereas those of the paternal parent, T. dubius, and the maternal parent, T. porrifolius, are 44 203.09 and 110 458.09 μm³, respectively. Natural (155 397.55 μm³) and synthetic (153 693.04 μm³) T. mirus have comparable values that are approximately twice the cell size of the diploid F1 hybrid and are close to the sum of the cell sizes of the diploid parents (154 661.18 μm³).
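The comparisons in this paragraph are simple arithmetic on the reported mean protoplast volumes, sketched below. The dictionary keys and variable names are illustrative only; the numbers are the means quoted in the text.

```python
# Mean protoplast volumes (cubic micrometres) as quoted in the text.
volumes = {
    "T. dubius": 44203.09,
    "T. porrifolius": 110458.09,
    "F1 hybrid": 73709.65,
    "synthetic T. mirus": 153693.04,
    "natural T. mirus": 155397.55,
}

# Sum and mean of the two diploid parents.
parental_sum = volumes["T. dubius"] + volumes["T. porrifolius"]
mid_parent = parental_sum / 2.0

# Polyploid volumes relative to the diploid F1 hybrid and to the parental sum.
for name in ("synthetic T. mirus", "natural T. mirus"):
    ratio_to_f1 = volumes[name] / volumes["F1 hybrid"]
    ratio_to_sum = volumes[name] / parental_sum
    print(f"{name}: {ratio_to_f1:.2f} x F1 hybrid, {ratio_to_sum:.3f} x parental sum")

print(f"F1 hybrid relative to mid-parent volume: {volumes['F1 hybrid'] / mid_parent:.2f}")
```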

Figure 1.

Representative cells of isolated protoplasts from the Tragopogon species (×400).

Table 2.   Total protein amount based on cell size among Tragopogon species
Taxon | Cell volume (μm3) | SD | Cell number g−1 | SD | Protein amount (mg) | SD
T. porrifolius | 1.10E+05 | 3.27E+04 | 4.28E+05 | 4.70E+04 | 1.94E+04 | 1.56E+03
T. dubius | 4.42E+04 | 2.36E+04 | 8.87E+05 | 1.32E+05 | 2.28E+04 | 2.41E+03
F1 hybrid | 7.37E+04 | 2.53E+04 | 5.26E+05 | 6.84E+04 | 2.70E+04 | 1.35E+03
S1 T. mirus | 1.54E+05 | 2.80E+04 | 3.37E+05 | 4.19E+04 | 1.58E+04 | 1.84E+03
T. mirus | 1.55E+05 | 3.07E+04 | 2.75E+05 | 3.50E+04 | 1.33E+04 | 2.81E+03

We inferred the cell density of T. mirus, its parents and the F1 hybrid using the protoplast sizes reported above and equivalently sized leaf samples from all individuals. Tragopogon dubius shows the highest density, and the allopolyploids exhibit the lowest density. The F1 hybrid has more cells than T. porrifolius, but fewer than T. dubius.

We examined protein quantity in all species and tested whether cell size/density and protein amount are correlated (r² = 0.5818201, P < 1.09e-06). The diploid F1 hybrid has the highest protein content (26.97 mg g−1), and T. mirus has the lowest (13.34 mg g−1; Table 2). We then compared protein amount with the estimates of cell density and cell volume. Plants with more cells per unit area tended to produce more protein, indicating a positive correlation between the total amount of protein and cell number.

Tragopogon protein identification and quantification

Using iTRAQ labeling and the 2D LC-MS/MS method, 476 proteins were identified, 470 from the first experiment and six extra proteins from the second experiment (Table S3). Searching against three concatenated databases allowed the calculation of the false discovery rates for these experiments, which were below 4% at the protein level in all analyses (Table S2). The same proteins were detected in all samples and covered a wide range of molecular functions, including photosynthesis (17.8%), metabolism (16.1%), stress and defense (15.9%), energy (respiration; 12.4%), protein synthesis (8.6%) and signal transduction (6.4%) (Fig. 2a). Of these 476 proteins, 408 showed quantitative additivity of the two parental profiles in T. mirus (both natural and synthetic). Sixty-eight proteins were quantitatively differentially expressed in natural T. mirus, synthetic T. mirus or the F1 hybrid compared with the parental species. When classified into functional categories, the differentially expressed proteins exhibited a functional distribution similar to that of all of the identified proteins (Fig. 2b). In particular, photosynthesis-related proteins account for 24% of the differentially expressed proteins, followed by metabolism (17%), energy (14%), and stress and defense (11%) (Fig. 2b).

Figure 2.

Classification of identified proteins by molecular function. (a) Classification of the 476 proteins. (b) Classification of the 68 proteins differentially expressed in Tragopogon mirus, F1 hybrid or synthetic (S1) T. mirus.

Proteome variation in leaves of diploid parents

We first determined parental variation between two samples of each diploid parent (T. dubius and T. porrifolius). Ten proteins showed differential expression in either T. dubius or T. porrifolius (Table 3). In T. porrifolius, three proteins (cytosolic glutamine synthetase GS β1, predicted protein and vegetative storage protein) showed differential expression between the two samples (Table 3a). However, such differential expression patterns are expected by chance (α = 1%; five of 476 proteins). In T. dubius, seven proteins (phosphoglycerate kinase precursor-like, calcium ion binding, oxygen-evolving enhancer protein 3 precursor-like protein, ATPase β subunit, ATP synthase catalytic subunit A, ATP synthase CF1 β subunit and photosystem II 44-kDa protein) exhibited differential expression between the two samples (Table 3b), and this number is higher than that expected by chance (α = 1%; five of 476 proteins). Therefore, there might be minor variation in the proteome between the two T. dubius samples used in this study. We also analyzed protein expression levels between the maternal progenitor (T. porrifolius) and paternal progenitor (T. dubius) of the polyploid T. mirus. Fifteen proteins were differentially expressed between the diploid parents (Table 3c). The number of proteins identified here is much higher than that expected by chance (α = 1%; five of 476 proteins); thus, the leaf proteomes of the two parents of T. mirus differ significantly.
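The "expected by chance" figure can be reproduced with a short calculation: at α = 1%, 476 tests give about 4.76 false positives, rounded to five in the text. The sketch below also gives a binomial tail probability for each observed count; treating the 476 proteins as independent tests is an assumption made here for illustration and is not a test reported in the study.

```python
from math import comb


def expected_by_chance(n_proteins, alpha):
    """Expected number of proteins called differential purely by chance."""
    return n_proteins * alpha


def binomial_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p), assuming independent tests."""
    below_k = sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))
    return 1.0 - below_k


N_PROTEINS = 476
ALPHA = 0.01

print(f"Expected by chance: {expected_by_chance(N_PROTEINS, ALPHA):.2f}")
# Observed counts from Table 3: within T. porrifolius, within T. dubius,
# between the two parents.
for observed in (3, 7, 15):
    print(observed, f"{binomial_tail(N_PROTEINS, ALPHA, observed):.4f}")
```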

Table 3.   Variations in leaf proteomes between Tragopogon porrifolius individuals, T. dubius individuals, T. porrifolius and T. dubius, and T. mirus individuals
(a) Variation between T. porrifolius individuals (ratio: 2611-11/2611-1)
Accession | Protein ID | Species | Ave. fold | P value
gi|10946357 | Cytosolic glutamine synthetase GS β1 | Glycine max | 3.062 | 0.0002
gi|224112154 | Predicted protein | Populus trichocarpa | 0.374 | 0.0009
gi|15387599 | Vegetative storage protein, VSP | Cichorium intybus | 0.079 | 0.0017

(b) Variation between T. dubius individuals (ratio: 2613-24/2613-5)
Accession | Protein ID | Species | Ave. fold | P value
gi|82621134 | Phosphoglycerate kinase precursor-like | Solanum tuberosum | 0.334 | 0.0002
gi|15234637 | Calcium ion binding | Arabidopsis thaliana | 5.854 | < 0.0001
gi|23308489 | Oxygen-evolving enhancer protein 3-like | Arabidopsis thaliana | 6.313 | < 0.0001
gi|15234637 | ATPase β subunit | Nicotiana sylvestris | 0.28 | 0.0016
gi|15219366 | ATP synthase catalytic subunit A | Arabidopsis thaliana | 0.205 | < 0.0001
gi|81176257 | ATP synthase CF1 β subunit | Lactuca sativa | 0.3195 | < 0.0001
gi|94502487 | Photosystem II 44-kDa protein | Helianthus annuus | 5.613 | 0.0006

(c) Variation between T. porrifolius and T. dubius (ratio: T. dubius/T. porrifolius)
Accession | Protein ID | Species | Ave. fold | P value
gi|584797 | ATP synthase subunit β | Daucus carota | 0.023 | < 0.0001
gi|68566313 | Elongation factor TuA | Nicotiana sylvestris | 0.043 | 0.0001
gi|75271099 | Chlorophyll a/b-binding protein | Oryza sativa | 0.041 | < 0.0001
gi|4325041 | FtsH-like protein Pftf precursor | Nicotiana tabacum | 0.047 | < 0.0001
gi|3328122 | Phosphoglycerate kinase precursor | Solanum tuberosum | 0.048 | 0.0003
gi|158513966 | Ribulose bisphosphate carboxylase large chain precursor | Crucihimalaya wallichii | 0.112 | < 0.0001
gi|195622012 | Membrane-associated 30-kDa protein | Zea mays | 0.12 | 0.0011
gi|255557841 | Chlorophyll a/b-binding protein, putative | Ricinus communis | 2.326 | 0.0005
gi|77551383 | ABC1 family protein | Oryza sativa | 4.805 | < 0.0001
gi|189418957 | Glycolate oxidase | Mikania micrantha | 6.737 | 0.0001
gi|15234637 | Calcium ion binding | Arabidopsis thaliana | 7.687 | < 0.0001
gi|1172664 | Photosystem I reaction center subunit III | Flaveria trinervia | 6.522 | 0.0004
gi|15387599 | Vegetative storage protein, VSP | Cichorium intybus | 8.959 | 0.0004
gi|75755655 | Photosystem II 44-kDa protein | Acorus calamus | 9.738 | 0.0004
gi|2499497 | Phosphoglycerate kinase, chloroplastic | Nicotiana tabacum | 13.204 | < 0.0001

(d) Variation between T. mirus individuals (ratio: 2680-7/2680-3)
Accession | Protein ID | Species | Ave. fold | P value
gi|23308489 | Oxygen-evolving enhancer protein 3 precursor-like protein | Arabidopsis thaliana | 0.380 | 0.0008
gi|81176257 | ATP synthase CF1 β subunit | Lactuca sativa | 0.484 | 0.0005
gi|197132074 | Photosystem II CP47 chlorophyll apoprotein | Geranium palmatum | 14.085 | 0.0007
gi|30695271 | Transketolase | Arabidopsis thaliana | 3.8915 | 0.0005
gi|94502487 | Photosystem II 44-kDa protein | Helianthus annuus | 6.998 | 0.0003
gi|224065198 | Histone 2 | Populus trichocarpa | 3.6305 | 0.0008

Species, name of species on which protein ID is based.

Proteome variation in leaves of allopolyploid T. mirus

We analyzed proteome variation in natural allopolyploid T. mirus and examined parent-specific expression. Tragopogon mirus showed general proteomic additivity of the parental proteomes. Six proteins (oxygen-evolving enhancer protein 3 precursor-like protein, ATP synthase CF1 β subunit, photosystem II CP47 chlorophyll apoprotein, transketolase, photosystem II 44-kDa protein and histone 2) showed differential expression between the two samples of T. mirus (Table 3d), which is slightly higher than the number expected as a result of random variation (α = 1%; five of 476 proteins). In particular, half of these proteins are a result of variation in the proteome of the diploid parent T. dubius. Thus, the variation observed in the proteome of T. mirus may be attributed in part to variation in the parental proteomes (Table 3).

Proteome variation in natural and synthetic allopolyploids and the diploid F1 hybrid

We used three databases (based on NCBI greenplants, Asteraceae and Tragopogon) to provide putative identifications of proteins and detected six novel proteins expressed in the F1 hybrid and synthetic and natural T. mirus, four of which result from PTMs, such as deamidation, oxidation and proteolytic cleavage (Fig. S2). PTMs based on proteolytic cleavage were observed in cytochrome f and ribulose 1,5-bisphosphate carboxylase/oxygenase (Rubisco) activase; deamidation was found in phosphoglycerate kinase. The fourth unique protein resulted from oxidation of filamentation temperature-sensitive H 2A. These proteins were observed in the diploid F1 hybrid and synthetic and natural polyploids. In addition, we detected further PTMs involving N-terminal acetylation: of catalase 3 in the diploid F1 hybrid and of MAP3K delta-1 protein kinase in synthetic T. mirus only.

When we compared the expression levels of the F1 hybrid and synthetic and natural T. mirus with those of the diploid parents, 68 (14.3%) proteins showed differential expression levels (Fig. 3). The other 408 proteins (85.7%) exhibited additive expression levels (Tables 4, S4). There were 44 differentially expressed proteins in the natural allopolyploid T. mirus relative to its diploid parents. Of these, 27 proteins were up-regulated relative to the diploid parents, whereas 15 proteins showed down-regulation. The remaining two proteins (ribosomal protein L12 and Rubisco large chain precursor) showed up-regulation relative to T. dubius, but down-regulation relative to T. porrifolius. Furthermore, up-regulated proteins are almost twice as frequent as down-regulated proteins (27 vs 15), and almost equal proportions of up- and down-regulated proteins were identified compared with each parental proteome (Tables 4, S4).

Figure 3.

Venn diagram showing differentially expressed proteins among F1 hybrid, synthetic (S1) and natural Tragopogon mirus.

Table 4. Analysis of leaf proteomes of F1 hybrid, synthetic Tragopogon mirus (S1) and natural T. mirus

Pattern                            F1 hybrid                Synthetic (S1)           T. mirus
                                   No. of proteins  %       No. of proteins  %       No. of proteins  %
Differential expression            32               6.7     50               10.5    44               9.2
  Relative to Tdu                  16               3.4     23               4.8     16               3.4
  Relative to Tpo                  10               2.1     23               4.8     18               3.8
  Relative to both                 6                1.3     4                0.8     10               2.1
Up-regulated proteins              17               3.6     34               7.1     27               5.7
  Relative to Tdu                  12               2.5     18               3.8     10               2.1
  Relative to Tpo                  3                0.6     13               2.7     10               2.1
  Relative to both                 2                0.4     3                0.6     7                1.5
Down-regulated proteins            14               2.9     15               3.2     15               3.2
  Relative to Tdu                  4                0.8     5                1.1     6                1.3
  Relative to Tpo                  7                1.5     10               2.1     8                1.7
  Relative to both                 3                0.6                              1                0.2
Up- and down-regulated proteins*
No differential expression         444              93.3    426              89.5    432              90.8
Total proteins identified          476                      476                      476

*Proteins are up-regulated relative to Tdu, but down-regulated relative to Tpo. Tdu, T. dubius; Tpo, T. porrifolius.
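
The internal consistency of Table 4 can be checked with a few lines of arithmetic. The sketch below transcribes the counts from the table, reads the one blank cell (down-regulated relative to both parents in S1) as zero, and derives the number of proteins in the "up- and down-regulated" row by subtraction. Those derived values are implied by the other rows and are an assumption here, not figures taken from the table; the variable and function names are hypothetical.

```python
TOTAL = 476
SAMPLES = ("F1 hybrid", "Synthetic (S1)", "T. mirus")

# Counts transcribed from Table 4, in the order of SAMPLES.
table4 = {
    "differential": {"all": (32, 50, 44), "Tdu": (16, 23, 16),
                     "Tpo": (10, 23, 18), "both": (6, 4, 10)},
    "up":           {"all": (17, 34, 27), "Tdu": (12, 18, 10),
                     "Tpo": (3, 13, 10), "both": (2, 3, 7)},
    # Assumption: the blank S1 cell in the "both" row is read as 0.
    "down":         {"all": (14, 15, 15), "Tdu": (4, 5, 6),
                     "Tpo": (7, 10, 8), "both": (3, 0, 1)},
}
no_change = (444, 426, 432)

def subrows_sum_to_total(block: dict) -> bool:
    """True if Tdu-only + Tpo-only + both equals the block total in every sample."""
    return all(
        block["Tdu"][i] + block["Tpo"][i] + block["both"][i] == block["all"][i]
        for i in range(len(SAMPLES))
    )

def implied_mixed(i: int) -> int:
    """Differential proteins that are neither purely up nor purely down."""
    return (table4["differential"]["all"][i]
            - table4["up"]["all"][i]
            - table4["down"]["all"][i])

for name, block in table4.items():
    print(f"{name}: sub-rows consistent = {subrows_sum_to_total(block)}")

for i, sample in enumerate(SAMPLES):
    diff = table4["differential"]["all"][i]
    pct = round(100 * diff / TOTAL, 1)
    print(f"{sample}: {diff} differential ({pct}%), implied mixed = {implied_mixed(i)}")
```

For natural T. mirus, the implied value of two agrees with the two proteins (ribosomal protein L12 and Rubisco large chain precursor) described in the text as up-regulated relative to T. dubius but down-regulated relative to T. porrifolius.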

In the synthetic allopolyploid plant, 50 (10.5%) proteins were differentially expressed relative to its diploid parents (2611-11 as the maternal parent and 2613-24 as the paternal parent; Table 1). The other 426 (89.5%) proteins showed additive patterns (Tables 4, S4), similar to the results for natural T. mirus. Similar proportions of differentially expressed proteins were up- or down-regulated in the synthetic polyploid compared with natural T. mirus (Table 4). However, the number of up-regulated proteins was higher than the number of down-regulated proteins. Among the up-regulated proteins, 21 showed higher expression relative to T. dubius, whereas 16 exhibited higher expression relative to T. porrifolius (Table 4). Among the 15 down-regulated proteins, almost equal numbers were differentially expressed in the synthetic polyploid (S1) compared with each parental proteome.

To determine whether hybridization or polyploidization had the greater impact on polyploid proteomes, we compared the proteomic profiles of the diploid F1 hybrid with those of synthetic and natural T. mirus. When protein expression levels in the diploid F1 hybrid were compared with those in the diploid parents (2611-1 as the maternal parent and 2613-24 as the paternal parent; Table 1), we detected 32 (6.7%) proteins with differential expression, slightly fewer than in synthetic and natural allopolyploid T. mirus, and 444 (93.3%) proteins that showed additivity. In the natural and synthetic polyploids, the numbers of differentially expressed proteins were nearly identical relative to each parent. By contrast, there were more differentially expressed proteins in the F1 hybrid plants when compared with T. dubius (22 proteins) than with T. porrifolius (16 proteins). This difference between the F1 hybrid and its parents became more obvious when we considered up-regulated proteins: 15 of 20 up-regulated proteins were found by comparison with T. dubius. However, this bias was not observed in down-regulated proteins in the F1 hybrid (7 vs 11 proteins relative to T. dubius and T. porrifolius, respectively).

Homeolog-specific expression in Tragopogon allopolyploids

In the homeolog-specific analysis, we compared protein expression results with the T. dubius and T. porrifolius databases, respectively. We found two instances of homeolog-specific expression in natural polyploid T. mirus, but not in the diploid F1 hybrid or synthetic polyploid. Trimeric dihydrolipoamide succinyltransferase and an unidentified conserved protein have nine and two amino acid substitutions between T. dubius and T. porrifolius, respectively, and the expressed proteins in T. mirus come from only the T. dubius proteome (Fig. 4a,b). The absence of the T. porrifolius proteins is a result of silencing of T. porrifolius homeologs at the transcript level, so that only T. dubius homeologs were expressed at the protein level in natural allopolyploids (Fig. 4c). However, the diploid F1 hybrid and synthetic polyploid have both homeologs present in genomic DNA, and both are also present at the transcript and protein levels (Fig. 4d).

Figure 4.

Homeolog-specific expression at the protein level. In amino acid sequences, Tragopogon porrifolius has arginine (R) and T. dubius has proline (P) in the catalytic domain from trimeric dihydrolipoamide succinyltransferase. (a) T. porrifolius database search. (b) T. dubius database search. (c) Quantitative real-time PCR. (d) Genomic. (e) Comparison of amino acid sequences of trimeric dihydrolipoamide succinyltransferase in T. porrifolius and T. dubius; red letters indicate the fragment detected in the database. Tdu, T. dubius; Tpo, T. porrifolius; Tm, T. mirus; F1, diploid F1 hybrid between T. dubius and T. porrifolius; S1, first-generation synthetic T. mirus; M, marker.

Protein differences between the diploid F1 hybrid and allopolyploid T. mirus

A total of 68 proteins showed differential expression in F1 hybrid and/or synthetic and natural T. mirus plants (Fig. 3). Thirty-two of these 68 proteins showed up- or down-regulation in the F1 hybrid relative to either of the diploid parents; four of these 32 proteins showed changes only in the F1 hybrid plant, whereas 11 and 17 proteins maintained the changes observed in the F1 hybrid in the synthetic T. mirus (S1), and in the F1 hybrid, synthetic and natural T. mirus plants, respectively. Importantly, 22 of 68 proteins showed expression differences between the first-generation synthetic T. mirus and natural T. mirus plants. Nine of the 22 proteins were differentially expressed only in plants of synthetic T. mirus (S1), whereas 13 of 22 proteins maintained their expression levels in the natural T. mirus plants. Interestingly, 14 of 68 proteins showed novel differential expression changes in natural T. mirus plants compared with the diploid parental proteomes.
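
The partition described above can be cross-checked against the per-sample totals in Table 4. The sketch below encodes the region sizes as we read them from the text and assumes that no protein is shared by the F1 hybrid and natural T. mirus alone, because the text lists none; the names used are hypothetical.

```python
# Venn regions as stated in the text.
# F1 = diploid F1 hybrid, S1 = synthetic T. mirus, Tm = natural T. mirus.
regions = {
    frozenset({"F1"}): 4,               # changed only in the F1 hybrid
    frozenset({"F1", "S1"}): 11,        # F1 changes maintained in S1
    frozenset({"F1", "S1", "Tm"}): 17,  # F1 changes maintained in S1 and Tm
    frozenset({"S1"}): 9,               # changed only in S1
    frozenset({"S1", "Tm"}): 13,        # S1 changes maintained in Tm
    frozenset({"Tm"}): 14,              # novel changes in natural T. mirus
    # Assumption: no region shared by F1 and Tm only.
    frozenset({"F1", "Tm"}): 0,
}

def sample_total(sample: str) -> int:
    """Number of differentially expressed proteins in one sample."""
    return sum(n for members, n in regions.items() if sample in members)

totals = {s: sample_total(s) for s in ("F1", "S1", "Tm")}
union = sum(regions.values())

print(totals)  # per-sample totals, comparable with Table 4
print(union)   # all proteins differentially expressed in at least one sample
```

With this reading, the regions sum to 68 and give 32, 50 and 44 differentially expressed proteins for the F1 hybrid, synthetic T. mirus and natural T. mirus, matching Table 4.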

When compared with the parental proteomes, more proteins were up-regulated than down-regulated in the F1 hybrid and polyploids (Table 4). However, the F1 hybrid and S1 T. mirus show more up-regulated proteins compared with the T. dubius proteome, whereas natural T. mirus shows an equal number of up-regulated proteins in comparison with each of the parental proteomes (Table 4).

Comparison of proteome data with transcription data

To determine whether transcript-level changes correlate with protein accumulation, we compared the proteomic data presented here with the transcript data of Koh (2010) (Table S3). Six of 10 genes investigated by Koh (2010) (FRUCTOSE-BISPHOSPHATE ALDOLASE (FBP), GLYCERALDEHYDE-3-PHOSPHATE DEHYDROGENASE (G3PDH), NUCLEIC ACID BINDING, NUCLEAR RIBOSOMAL DNA, POLY-UBIQUITIN and THIOREDOXIN M-TYPE 1) were included in the protein analysis. Four of the six proteins showed additive protein expression in the F1 hybrid and synthetic and natural T. mirus, and two proteins were nonadditively expressed at the protein level. The T. dubius FBP protein was up-regulated in synthetic T. mirus, 1.6-fold higher than in the diploid parent (P = 0.034), and the T. porrifolius G3PDH protein showed 0.6-fold lower expression in the F1 hybrid than in the diploid parents (P = 0.011; Fig. 5).

Figure 5.

Expression levels of FRUCTOSE BISPHOSPHATE ALDOLASE and THIOREDOXIN M-TYPE 1 at transcriptional and protein levels in various Tragopogon species. (a) and (b) show the expression of FRUCTOSE BISPHOSPHATE ALDOLASE, and (c) and (d) show the expression of THIOREDOXIN M-TYPE 1. (a) and (c) present gene expression, and (b) and (d) present protein expression. Tdu, T. dubius; Tpo, T. porrifolius; Tm, T. mirus; F1, diploid F1 hybrid between T. dubius and T. porrifolius; S1, first-generation synthetic T. mirus.


Technical challenges of proteomics in allopolyploid T. mirus and its parents

Here, we employed a novel approach to study polyploid proteomes on a per-cell basis. Our results demonstrated that the protoplasts of the diploid F1 hybrid are intermediate in size to those of the diploid parents, and that protoplasts of the polyploids are twice the size of those of the F1 hybrids. Interestingly, despite the doubled cell size, polyploids produce similar amounts of proteins to those of the diploid parents at the cellular level.

Comparison of the T. dubius and T. porrifolius proteomes revealed that only 15 of 476 proteins (3.2%; 0.01 < P < 0.05) were differentially expressed between them, indicating little difference between the proteomes of the two parental species. At the genomic level, however, these species differ substantially: there is a 3.5% genomic sequence divergence between them based on 30 gene sequences, as demonstrated in our previous analyses (Koh, 2010; Koh et al., 2010). This suggests that either the nucleotide differences do not translate into protein differences or that our current analysis may underestimate the proteomic differences between the two species. In many proteomic studies, including ours, whole tissue was used, and protein differences between different cell types cannot be detected. Indeed, when the proteomes from two different cell types were compared in B. napus, 217 differentially expressed proteins (217 of 1458; 15%) were discovered (Zhu et al., 2009), which is > 6 times as many as found in the Tragopogon system. In the future, proteomics using specific types of cells, organelles or other tissues may reveal more proteome differences between the species examined here.

We may also have underestimated the number of differentially expressed proteins because of the current limitations of the Tragopogon sequence databases. In this study, although we used three different databases, the NCBI nonredundant database does not contain sufficient information for the Tragopogon system, and both the Asteraceae and Tragopogon databases are incomplete, compiled largely from ESTs and without true functional annotations. Therefore, our database searches could reveal matches to only a small part of the entire proteome. Thus, the small number of differentially expressed proteins identified from iTRAQ might be a result of insufficient information about Tragopogon gene sequences. A Tragopogon genome sequence would greatly enhance future proteomics work.

Novel insights into the proteomes of the natural and synthetic allopolyploids and diploid F1 hybrid

The allotetraploid T. mirus shows additivity of cell size compared with its diploid parents. This result is consistent with the general additivity of its parental genome sizes (Pires et al., 2004). Here, we found that, although cell size differs between the diploid parents and their polyploid derivative, the protein amount is similar between diploids and the polyploid on a per-cell basis. However, the diploid hybrid expressed slightly larger amounts of protein than the other plants examined, a result also found at the transcript level (Koh, 2010). This phenomenon may be explained by ‘transcriptomic shock,’ as observed in T. miscellus (Buggs et al., 2011), in which regulatory controls in F1 hybrids seem to be relaxed, leading to excessive transcript levels.

Here, we observed six instances of novel expression in both synthetic and natural polyploids (Fig. S2). These novel expression patterns did not involve any changes at the sequence level in the parents; instead, they result from protein PTMs. In particular, proteolysis causes cleavage of a protein at a peptide bond as a structural change (Fruebis et al., 2001), deamidation changes the chemical nature of the amino acid (Zhou et al., 2004) and oxidation involves addition of a smaller chemical group (Kim et al., 2006). These PTMs also occur in the diploid F1 hybrid and therefore result from the hybridization, rather than the polyploidization, process, and may constitute a form of ‘proteomic shock’ (cf. genomic and transcriptomic shock previously reported for hybrids; McClintock, 1984; Hegarty et al., 2008; Buggs et al., 2011). The detailed physiological and evolutionary significance of the PTMs deserves further investigation.

We also observed two instances of homeolog-specific expression. However, these cases do not result from regulation at the translational level; instead, they both result from silencing of the T. porrifolius transcript. Interestingly, this pattern of T. dubius-only protein expression was only observed in the natural polyploids, not in synthetic (S1) T. mirus or the F1 hybrid, which exhibit additivity of both gene and protein expression. The fact that the F1 hybrid and S1 polyploid undergo transcription at these loci indicates that silencing does not occur immediately on hybridization or in the earliest stage of polyploid formation, but rather several generations after polyploidization. Of course, the F1, S1 and natural T. mirus examined here do not constitute a direct lineage, and therefore variation among parental genomes from different individuals may contribute to the difference between the natural and synthetic polyploid plants investigated here.

Currently, a limitation of iTRAQ lies in the underestimation of fold changes, that is, small measured changes may reflect significant differences between samples (Bantscheff et al., 2008; DeSouza et al., 2009; Karp et al., 2010); thus, most iTRAQ analyses use fold-change cut-offs of < 0.8 or > 1.2. Here, we applied cut-offs of < 0.5 or > 1.5 for statistical and biological significance.
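
The effect of the chosen cut-offs can be illustrated with a minimal sketch. It classifies a single expression ratio (sample relative to a parent) using the cut-offs stated above as defaults, and it deliberately omits the accompanying significance test (P value) used in the study. The function name and labels are hypothetical.

```python
def classify_fold_change(ratio: float, lower: float = 0.5, upper: float = 1.5) -> str:
    """Label an expression ratio using fold-change cut-offs only.

    Defaults follow the cut-offs stated in the text (< 0.5 or > 1.5);
    no P-value filter is applied in this sketch.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if ratio < lower:
        return "down"
    if ratio > upper:
        return "up"
    return "unchanged"

# The same ratio can be called differently under the looser cut-offs
# (< 0.8 or > 1.2) that the text describes as typical for iTRAQ.
for ratio in (0.4, 1.0, 1.3, 1.6):
    strict = classify_fold_change(ratio)
    loose = classify_fold_change(ratio, lower=0.8, upper=1.2)
    print(f"ratio {ratio}: strict={strict}, loose={loose}")
```

A ratio of 1.3, for example, counts as unchanged under the stricter cut-offs but as up-regulated under the looser ones, which is why the stricter choice yields fewer differentially expressed proteins.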

Comparative proteomics of the diploid parents (T. dubius and T. porrifolius) and natural and synthetic T. mirus, as well as the F1 hybrid of T. dubius × T. porrifolius, revealed that 32 proteins changed their expression patterns after hybridization between the two parents, and an additional 22 proteins were differentially expressed immediately following the process of genome doubling. In addition, our studies of natural populations of T. mirus compared with synthetic T. mirus revealed changes in expression profiles for an additional 14 proteins, indicating further change in the proteome over the c. 40 generations since polyploidization. Overall, our data suggest that the hybridization effect is more important in proteome evolution than is polyploidization itself, despite extensive homeolog loss in T. mirus (Koh et al., 2010).

In addition, there are nearly identical numbers of proteins up- or down-regulated in the F1 hybrid (17 vs 14; P = 0.7201 in a binomial test), whereas there are more proteins up-regulated than down-regulated in both natural (27 vs 15; P = 0.08843 in a binomial test) and synthetic (34 vs 15; P = 0.009399 in a binomial test) T. mirus (Table 4). This indicates that genome doubling might stimulate the mechanisms controlling protein up-regulation. Significantly, the biased up-regulated proteomic profiles were noted in synthetic T. mirus compared with the T. dubius proteome, but were not observed in natural T. mirus (Table 4). This biased proteomic expression pattern is consistent with the previous results of Buggs et al. (2010) based on gene expression in the newly formed allopolyploid T. miscellus. Of the 22% of single nucleotide polymorphisms that were differentially expressed in T. miscellus, the paternal T. dubius homeologs were more often up-regulated than the T. pratensis homeologs (77% vs 23%, respectively). However, it is currently unclear whether T. mirus shows paternally biased transcript expression, like T. miscellus; the transcriptome profiles of T. mirus and its diploid parents are needed to resolve the relationship between transcriptome and proteome in T. mirus.
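
The binomial tests quoted above can be reproduced with the standard library. The sketch below assumes a two-sided exact test against a 50:50 null (equal chance of up- or down-regulation); the text does not state which software or test variant was used, so this is one plausible reading, and the function name is hypothetical.

```python
from math import comb

def binomial_two_sided_p(up: int, down: int) -> float:
    """Exact two-sided P value for up vs down counts under a 50:50 null.

    With p = 0.5 the distribution is symmetric, so the two-sided P value
    is twice the tail probability at or beyond the larger count.
    """
    n = up + down
    k = max(up, down)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    return min(1.0, 2.0 * upper_tail)

# Up- vs down-regulated protein counts from Table 4.
counts = {
    "F1 hybrid": (17, 14),
    "natural T. mirus": (27, 15),
    "synthetic T. mirus": (34, 15),
}
p_values = {name: binomial_two_sided_p(u, d) for name, (u, d) in counts.items()}

for name, p in p_values.items():
    print(f"{name}: P = {p:.4g}")
```

Under this assumption, the computed values agree with the reported P values to within rounding, with only the synthetic polyploid falling below the conventional 0.05 level.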

Comparison between protein expression and gene transcript data in T. mirus

Through comparison of proteomic data with a recent study of gene expression (Koh, 2010), we found six pairs of protein–transcript matches. All six of these genes showed biased gene expression patterns towards one of the parental species (Koh, 2010) (Fig. 5 and Table S4), but exhibited additive patterns in the proteomic profiles, except in two cases. Apparently, the up-regulation of the T. dubius-derived FBP is a result of the absence of the T. porrifolius-derived proteins following silencing of the T. porrifolius homeolog in the synthetic polyploid (Koh, 2010). However, protein expression of T. porrifolius G3PDH was 0.6-fold lower in the F1 hybrid than in diploid T. porrifolius individuals (P = 0.011). Interestingly, genes that were nonadditively expressed in other polyploid plant systems showed additivity in the Tragopogon system (e.g. the small subunit of Rubisco (Wang et al., 2004; Hegarty et al., 2005), NAD-dependent malate dehydrogenase (Wang et al., 2004, 2006; Hegarty et al., 2005) and chlorophyll a/b-binding protein (Hegarty et al., 2005)), demonstrating that gene and protein expression for homologous loci differ among polyploid plant groups.

Although protein abundance does not correlate well with transcript amounts (Anderson & Seilhamer, 1997; Haynes et al., 1998; Gygi et al., 1999; Kersten et al., 2002; Albertin et al., 2005; Marmagne et al., 2010), the divergence between the two parental proteomes could have an impact on the phenotypic variation in T. mirus. In addition, parental homeologs can work differently in metabolic pathways, making it important to understand the dynamics of proteins in polyploids. Thus, high proteome coverage is needed. To achieve this, sample fractionation, a better sequence database and the implementation of other proteomics tools are essential for a nonmodel system such as Tragopogon.


Here, we report a comparative proteomic analysis of hybridization and polyploidy in a naturally occurring polyploid plant species (Tragopogon mirus and its parents), one of the first such studies to be conducted. Other proteomic analyses of polyploid plants have used 2-DE (Bahrman & Thiellement, 1987; Islam et al., 2003; Albertin et al., 2005, 2006; Hu et al., 2011). In particular, two studies showed proteomic bias towards one of the two parents, especially the paternal genome (Albertin et al., 2006; Hu et al., 2011). However, we observed unbiased proteomic profiles in both natural and synthetic allopolyploid T. mirus; this could be a result of technical differences. As 2-DE quantifies only spot intensity and patterns on a gel, it is possible to misinterpret the parental origin of spots; sequencing of isotope-tagged peptides is needed to provide the required resolution. In contrast, iTRAQ both identifies and quantifies proteins on the basis of unique sequence regions. Thus, the proteomic profiles can be compared more specifically.

In the only other study employing iTRAQ in a polyploid system, Ng et al. (2012) showed 8% protein divergence between the synthetic allopolyploids of Arabidopsis suecica and its parents. We found that 9–10% of proteins (44 and 50 of 476 proteins in natural and synthetic T. mirus, respectively) were quantitatively differentially expressed in T. mirus relative to its diploid parents. The similar values observed in both studies might indicate that proteomic profiles evolve along a similar track in response to polyploidization. Clearly, more studies are needed to test this hypothesis. In addition, comparison between transcripts and protein expression showed that differential protein regulation is not necessarily correlated with changes in gene expression in Brassica and Arabidopsis polyploids (Marmagne et al., 2010; Ng et al., 2012). This suggests that most transcripts can be post-transcriptionally regulated. Further studies of gene expression based on differentially expressed proteins are needed to explore the regulatory mechanisms that underlie physiological changes or phenotypic variation in the evolution of allopolyploid Tragopogon.


We thank Drs Jennifer A. Tate, Ann Oberg and Lauren M. McIntyre for technical assistance on this project. We acknowledge the Proteomics Division of the University of Florida’s Interdisciplinary Center for Biotechnology Research (ICBR) for assistance in LC-MS/MS analysis. This work was funded by National Science Foundation grants MCB-0346437 and DEB-0614421 and a Research Opportunity Seed Fund grant from the University of Florida (project 00095031).