Omic technologies (i.e., genomics, proteomics, and metabolomics) have become widely incorporated into the fields of eco- and aquatic toxicology. This confluence was underscored by Snape et al. 1, who suggested that the term “ecotoxicogenomics” be applied to this area of inquiry. Proteomic techniques, in particular, offer great potential for insight into chemical modes of toxic action and are useful tools in biomarker discovery 2, 3. While expression analysis of specific proteins has been used extensively in the field of aquatic toxicology to monitor organism exposure and effect, the analysis of whole proteomes enables one to examine potentially unforeseen responses. First applied in aquatic toxicology by Shepard and Bradley 4 and Shepard et al. 5, who analyzed the proteomic response of mussels (Mytilus edulis) exposed to polychlorinated biphenyls (PCBs), Cu, and salinity stress, proteomic techniques have recently been applied to several species of fish, including fathead minnows (Pimephales promelas 6) and medaka (Oryzias latipes 7), as well as to other aquatic invertebrates, such as the Chinese mitten crab (Eriocheir sinensis) 8. Initial proteomic studies utilized two-dimensional gel electrophoresis (2DGE) to identify changes in the patterns of protein expression under chemical stress 9. Many of these studies did not identify the proteins that underwent significant change, however, and therefore offered limited information. Since this initial “blast off” phase 10, researchers in this field have adopted newer technologies and have become more technically sophisticated. The use of liquid chromatography (LC) and mass spectrometry (MS), along with the growth of available genome sequence information and powerful bioinformatics tools, has facilitated protein separation and identification and has enhanced the knowledge gained from these studies 11.
The number of publications indexed by the Aquatic Pollution and Environmental Quality database (CSA Illumina, ProQuest) with the keyword terms “proteome” or “proteomics” rose steadily from two in 2000 to a peak of 43 in 2008 (Fig. 1). Within this period, a number of review articles have also highlighted proteomic applications in ecotoxicogenomics. Snape et al. 1, and Wetmore and Merrick 2, provided two of the earliest and most comprehensive reviews, which set the framework for proteomics research in ecotoxicology; synthesis papers 3, 12–14 and book chapters 15, as well as excellent technical and methodological reviews 16, 17, have followed. This review synthesizes the recent proteomic literature in the field of aquatic toxicology and focuses on studies centered on model and non-model vertebrate (fish) (Table 1) and invertebrate (Table 2) species.
Table 1. Summary of laboratory-controlled and field/caged studies that have utilized proteomic techniques in ecotoxicological research with teleost fish species
Proteomic approaches are readily applied to species of fish considered traditional and nontraditional models in aquatic toxicology. While the discussion over what species of fish are considered good toxicological models is lively 18, there are clear advantages to using both the traditional and nontraditional models. Arguably the utmost benefit gained from using model species in proteomics research is the greater amount of genome sequence coverage that is available in searchable databases. From a bioinformatics standpoint, this heightens the potential for identifying proteins and reduces the uncertainty associated with matching amino acid sequences across species. Monsinjon and Knigge 11, along with Snape et al. 1, have recognized this limitation and, in addition to sequencing the genomes of more nontraditional models, advocate that model organisms (i.e., genome-sequenced) be used more frequently in ecotoxicogenomics studies. The clear advantage of using non-model organisms is that in many, if not most cases, they provide a much better surrogate for biotic responses in a particular system of interest. The lack of robust and searchable species-specific sequence information, however, requires the employment of broader search categories (e.g., Actinopterygii, ray-finned fish), consequently reducing the likelihood and certainty of true protein identification. Dowling and Sheehan 16 argue that organisms underrepresented in sequence databases should be exploited, and suggest that de novo sequencing approaches should be utilized to circumvent these limitations. Because it is recognized that there are discernable differences in the approaches, tools, and interpretation of proteomic data derived from model and non-model species of fish, the following sections will address them separately. 
Model fish species considered herein are those with species-specific, searchable headings within the MASCOT search engine (Matrix Science): zebrafish (Danio rerio) and pufferfish (Takifugu rubripes).
Model fish species
Recent proteomics-based studies of zebrafish and pufferfish have predominantly used 2DGE and tandem time-of-flight (TOF/TOF) MS as the separation and identification techniques, respectively. De Wit et al. 19 utilized a modified version of 2DGE, differential in-gel electrophoresis (DiGE), to examine the protein expression changes in zebrafish exposed to tetrabromobisphenol-A (TBBPA). This technique enabled the researchers to separate the fluorescently dyed proteomes of two samples (control and exposure treatments) within one gel, thereby reducing the gel-to-gel variation that inherently increases the error associated with conventional 2DGE expression analysis. Developed over a decade ago (1997; 20), DiGE has not been readily used in aquatic toxicology, although Ankley et al. 21 have proposed using the method within their framework for developing biomarkers of endocrine disruption. Liquid chromatography (LC) offers another means of proteome separation that is likewise rarely used with model aquatic species, although it offers several advantages over 2DGE. While highlighting the limitations of 2DGE, Lin et al. 22 established a shotgun, liquid chromatography–tandem mass spectrometry (LC MS/MS) (ion trap) method for analyzing the proteomes of single zebrafish embryos at 72- and 120-h postfertilization. The authors were able to identify 509 and 210 proteins from these two developmental stages. Just as 2DGE dominates as a separation technique, the tandem TOF/TOF has been the most commonly used mass analyzer in recent years. This overrepresentation of TOF/TOF analyzers is likely because they offer a good combination of cost-effectiveness, accuracy, resolution, and speed 23. Few studies have used other analyzers. Gündel et al. 24, and Kling and Förlin 25, used ion trap and Fourier transform ion cyclotron resonance (FT-ICR; coupled with an ion trap) analyzers, respectively, to examine vitellogenin processing by zebrafish embryos and proteins associated with brominated flame retardant (BFR) toxicity in zebrafish livers. Although FT-ICR analyzers offer excellent mass accuracy and resolving power, they are not well-suited for high-throughput experiments and are often prohibitively expensive to operate 23. Ion traps are relatively inexpensive and are capable of high-throughput analysis, but they offer poor mass accuracy and resolving power 23.
Proteomic responses of model fish species to toxicants have recently been evaluated for perfluorooctane sulfonate 26, excessive fluoride 27, microcystin-LR 28, and BFR exposure 19, 25, 29. Protein profiles were typically evaluated in whole liver tissues 28, liver cell lines 25, whole embryos 26, or less often in muscle tissues 27. Brominated flame retardants have garnered the most attention from proteomics researchers. De Wit et al. 19, Kling et al. 29, and Kling and Förlin 25 have all evaluated the hepatic responses of zebrafish exposed to specific BFR substances (e.g., TBBPA) or to a mixture of them. De Wit et al. 19 were able to identify seven proteins (three up- and four down-regulated) that were differentially regulated after exposure to 1.5 µM TBBPA for 14 d. The up-regulated proteins included phosphoglycerate mutase 1, heat shock 70 kDa protein 5, and chaperone protein GP96 isoform 10 (similar). Down-regulated proteins included β-actin 2 and three isoforms of betaine homocysteine methyltransferase. Kling et al. 29 were able to identify 13 and 19 differentially expressed hepatic proteins in male and female zebrafish, respectively, exposed to a mixture of BFRs at one of two doses (10 and 100 nmol/g feed for 21 d). The authors concluded that although specific protein expression responses were largely exclusive to one sex, the sex-specific responses were both protective and indicative of an oxidative stress response. These authors also noted an induction of betaine homocysteine methyltransferase in both sexes. Kling and Förlin 25 analyzed the proteome response of zebrafish liver cells after 24- and 72-h exposure to hexabromocyclododecane (HBCD; 0.05, 0.5, 5, and 187 µM), TBBPA (0.05, 0.5, 5, and 220 µM), and a mixture of both (1, 10, 50, 100 µM each). They were able to identify seven differentially expressed proteins in response to HBCD exposure, two in response to TBBPA exposure, and nine in response to the mixture. 
They were also able to identify eight proteins that were co-regulated across the HBCD and TBBPA single exposures and the mixture. Overall, it was found that HBCD uniquely affected protein metabolism, and TBBPA specifically altered protein folding and the production of reduced nicotinamide adenine dinucleotide phosphate (NADPH). The mixed exposure likewise affected NADPH production along with cell-cycle control, and overlapping responses involved an increase in gluconeogenesis.
Non-model fish species
Evaluations of chemically induced protein expression response in non-model fish species have been conducted on largemouth bass (Micropterus salmoides 30), rainbow trout (Oncorhynchus mykiss 31), rare minnows (Gobiocypris rarus 32), and others. Similar to research on model species, the most commonly used method of protein separation in non-model species is 2DGE, although LC has been utilized in a few studies 6. Tandem TOF/TOF is the most frequently used mass spectrometric setup in non-model proteomic research, but TOF analyzers coupled with quadrupoles are not uncommon 33. An FT-ICR mass analyzer coupled with an ion trap was also used in one study 31. Surface-enhanced laser desorption ionization (SELDI)-TOF is a technique that exploits the chemical characteristics (e.g., hydrophobicity, charge) of proteins as a means of simplifying complex samples; SELDI-TOF does not by itself identify differentially expressed proteins, but it does yield information that may be helpful in identifying these proteins (peaks) of interest through subsequent tandem MS analysis. This technique has been used in at least two recent non-model studies 33, 34, and was highlighted in another 3. The isobaric tags for relative and absolute quantification (iTRAQ) approach was also used recently to quantitatively profile hepatic fathead minnow proteins involved in androgen receptor signaling 6. iTRAQ is a non-gel–based, tandem MS approach that enables isobarically labeled samples to be quantitatively compared simultaneously. The approach has advantages over other quantitative labeling methods (e.g., cleavable isotope-coded affinity tags, cICAT) because it enables more than two samples to be compared at once and offers greater proteome coverage 35. That study 6 was the first to apply this technique to aquatic toxicology and demonstrated its utility and power.
Proteomic research on non-model fish species has examined physiological responses to endocrine disrupting compounds 33, 34, Cd 36, 37, and microcystin-LR 7, 38, 39, among other environmental contaminants 30. Protein profiles have mostly been evaluated in hepatic tissues, but occasionally in gill 37, brain 36, and kidney tissues 40. Laboratory-based exposures were conducted most frequently, although there were a few good examples of in situ field exposures. Ripley et al. 40 analyzed the anterior kidney protein profiles in smallmouth bass (Micropterus dolomieu) collected from three sites along the Shenandoah River (VA, USA), which experienced numerous unexplained fish kills in 2005 and 2006. The researchers determined that the expression of proteins associated with stress and immune responses differed among sites, indicating that compromised leukocyte production may have contributed to the mass die-offs. The work also assisted in the development of further, testable physiologically based hypotheses. Albertsson et al. 31 examined the hepatic proteome response of caged rainbow trout located downstream of a sewage treatment facility near Gothenburg, Sweden. The authors identified three down-regulated proteins and one up-regulated protein in sewage effluent-exposed fish, and determined, through a parallel study, that the expression changes were not induced by estrogens. A third in situ study conducted in Beijing, China examined the hepatic protein expression of goldfish (Carassius auratus) collected from Gaobeidian Lake 41. The lake was influenced by coolant and wastewater effluent from thermal power and treatment plants. The expression of detoxification enzymes and of antioxidant, energy production, and stress-related proteins in lake-collected fish was altered relative to control fish. 
While these studies provide good examples of proteomics being applied to field-based studies in aquatic toxicology, and offer an indication of where this field will find its greatest value, they likewise reinforce the benefit of concomitant laboratory experimentation 31. Such experimentation in controlled settings has the potential to lend itself to more thorough interpretations of findings.
Due to their environment and continuous contact with aquatic sediments, which can act as reservoirs for many environmental pollutants, aquatic invertebrates are popular sentinel species used in estuarine and coastal monitoring programs. Unfortunately, to date, few proteomic studies have utilized these key organisms to study the impacts of environmental pollutants on aquatic ecosystems or for biomarker discovery. Of those published, many have focused on bivalve species, such as marine mussels of the genus Mytilus (M. edulis, M. galloprovincialis, M. trossulus) that reside off the European coast. Most of the studies published have utilized proteomic analysis to examine taxonomic differences 42–47, heritability 48, and evolution of the nacre 49. One of the first environmental toxicology proteomic studies with aquatic invertebrates was conducted on Mytilus. Several authors have used protein expression profiles to establish “protein signatures” after exposure to a variety of environmental contaminants, including PCBs, polyaromatic hydrocarbons (PAHs), heavy metals 50, 51, and crude oil 52–54. Additionally, proteomics has been used to compare profiles of mussels residing in polluted areas to those of reference sites 55, 56. The proteomes of these bivalve species have been evaluated after exposure not only to individual contaminants but to mixtures as well 57. ProteinChip array technology combined with SELDI-TOF MS was adopted to examine proteome changes of mussels exposed to oil and oil spiked with alkylphenols and PAHs 50, 58.
Clams are another group of bivalves that has been used in environmental toxicology studies. Romero-Ruiz utilized 2DGE to examine proteomic profiles of Scrobicularia plana residing in the Guadalquivir Estuary, Spain 59. Other studies have examined the impacts of Cd 60, 61 and p,p'-dichlorodiphenyldichloroethylene (DDE) 62 on the proteome of Ruditapes decussatus. Two-dimensional electrophoresis has also been implemented to study glutathione affinity-selected proteins of the clam Tapes semidecussatus 63. Finally, proteomics was used to screen changes in protein expression of the clam Chamaelea gallina exposed to Aroclor 1254, Cu(II), tributyltin, and arsenic (III) 64.
Aquatic invertebrates other than bivalve mollusks exposed to various environmental conditions have also been evaluated. For example, 2DGE followed by MS/MS was used to characterize differentially expressed proteins of the Chinese mitten crab exposed to Cd 8. Also, ProteinChip technology was implemented for biomarker discovery in spider crabs (Hyas araneus) exposed to diallyl phthalate, bisphenol A, and polybrominated diphenyl ether (PBDE-47), as well as in shore crabs (Carcinus maenas) exposed to crude oil and oil spiked with alkylphenol and 4-nonylphenol 65. Proteomic analysis was used to examine the effect of Cd on the aquatic midge (Chironomus riparius) 66, 67, as well as to study metamorphosis of barnacles 68, 69 and diapaused embryonic development of brine shrimp (Artemia franciscana) 70.
As with fish species, utilizing model species provides a greater amount of genome sequence coverage available in searchable databases, thus increasing the potential for identifying proteins and reducing the uncertainty associated with matching amino acid sequences across species. Unfortunately, few aquatic invertebrate species have been sequenced, thus making protein identification difficult. For example, Jonsson et al. 53 reported that matrix-assisted laser desorption ionization (MALDI)-TOF-MS of tryptic digested spots of M. edulis exposed to diallyl phthalate and crude oil did not reveal the identities of any of the proteins affected by exposure. Also, only two proteins, hypoxanthine-guanine phosphoribosyltransferase and glyceraldehyde-3-phosphate dehydrogenase, were identified in S. plana clams inhabiting sites with high metal content 59. Conversely, upwards of 39 C. riparius larval proteins were identified through MALDI-TOF-MS, with spectra searched against the National Center for Biotechnology Information nonredundant protein database or the Mass Spectrometry protein sequence database (MSDB) with the MS-Fit program 67. However, many proteomic studies with aquatic invertebrates have identified, on average, 15 proteins 8, 43, 56, 66, illustrating the difficulty of matching amino acid sequences across species. Some studies have tried various techniques to improve protein identification by incorporating fractionation of samples, as well as the use of nanospray-ion trap MS, followed by database searching and de novo sequencing 3. Others have focused on machine learning techniques to process and extract informative expression signatures from MS data, instead of focusing on single protein identification 13. Furthermore, changes in chemical status of proteins, such as modifications due to oxidative stress, have been used to help in protein identification 55, 61, 62, 71–75. 
Overall, this demonstrates the limitations in protein identification when the species of interest does not have a sequenced genome.
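The dependence of identification on sequence coverage can be illustrated with a minimal peptide mass fingerprinting sketch. It is not the search procedure of any study cited here: the two sequences are hypothetical, the cleavage rule (after K or R, except before P) is the usual simplification for trypsin, the mass tolerance is an arbitrary assumption, and real search engines score matches statistically rather than counting them. The example assumes Python 3.7 or later.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_peptides(sequence):
    """Split after K or R unless the next residue is P (simplified trypsin rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed_masses, database_sequence, tolerance=0.2):
    """Count observed masses that fall within `tolerance` Da of any
    theoretical tryptic peptide of the database sequence."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(database_sequence)]
    return sum(
        any(abs(obs - theo) <= tolerance for theo in theoretical)
        for obs in observed_masses
    )


# Hypothetical protein from the study species and a homolog from a related,
# sequenced species that differs by a single F -> Y substitution.
study_species = "AGLKDFTRSEVK"
related_species = "AGLKDYTRSEVK"

observed = [peptide_mass(p) for p in tryptic_peptides(study_species)]
same_species_hits = count_matches(observed, study_species)
cross_species_hits = count_matches(observed, related_species)
```

In this toy case a single substitution removes one of three peptide matches, which is the kind of loss that accumulates when spectra from an unsequenced species are searched against broader taxonomic categories.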
The only aquatic invertebrate whose genome has been sequenced is Daphnia (http://wFleaBase.org). A recent article by Fröhlich et al. clearly demonstrated the usefulness, as well as the necessity, of a sequenced genome of the species of interest for MS-based high-throughput proteomic analysis 76. A comprehensive set of 701,274 peptide tandem-mass spectra derived from D. pulex was generated and led to the identification of 531 proteins. To further validate the utility of the D. pulex database for use with other daphnid species, tandem-mass spectra were generated for D. longicephala, leading to the identification of 317 proteins. These results demonstrated that use of the D. pulex protein database for other daphnids is quite feasible, although protein identification is somewhat compromised. Conversely, when D. pulex spectra were searched against the Drosophila melanogaster database (http://flybase.org), only 71 Daphnia proteins were identified, whereas 92 were identified when using the Swiss-Prot database 76. Thus, for the full potential of proteomics in ecotoxicology to be met, invertebrate model species must be selected and their genomes sequenced.
The typical proteomics workflow includes the following three main steps: First, protein extraction; second, protein separation and quantification; and third, protein identification and characterization. Pros and cons of different methods used for the second and third steps will be discussed here. Two-dimensional gel electrophoresis is the most commonly used method for initially separating proteins from tissue samples. Because proteins are separated based on both their isoelectric point and their molecular weight, 2DGE allows for the analysis of posttranslational modifications. It also provides a greater resolution compared to standard one-dimensional gel electrophoresis 77. However, 2DGE is time consuming, labor intensive, and requires a lot of hands-on training before a good quality gel can be produced. Because of the large variation in the quality of the gel that is produced, this method lacks reproducibility, and several gels need to be run per treatment. Another disadvantage of 2DGE is that it can mask the expression of less abundant proteins, resulting in inaccurate quantification and potential misidentification 77, 78.
An alternative to 2DGE is the use of DiGE. In this method, CyDye fluors are used to label proteins through an amide linkage, and usually control and treated samples are labeled using different dyes (e.g., Cy3 and Cy5, respectively), while a mixture consisting of an equal amount of the control and treated samples is labeled with Cy2. The labeled samples are combined and run in a single 2D gel to decrease variability across gels; therefore, in theory, no more than one gel would need to be run per experiment 77.
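The role of the Cy2-labeled pooled standard can be sketched numerically. The following is a minimal illustration rather than the algorithm of any commercial DiGE software: the spot volumes are invented, the dictionary keys are hypothetical names, and the dye assignment (Cy3 for control, Cy5 for treated) simply follows the example in the text. Each spot volume is expressed relative to the Cy2 signal of the same spot on the same gel, so that a gel that scans brighter or dimmer overall does not shift the comparison.

```python
from math import log2
from statistics import mean


def standardized_abundance(sample_volume, standard_volume):
    """log2 ratio of a sample spot volume to the Cy2 internal standard
    measured for the same spot on the same gel."""
    if sample_volume <= 0 or standard_volume <= 0:
        raise ValueError("spot volumes must be positive")
    return log2(sample_volume / standard_volume)


def spot_log2_fold_change(gels):
    """Mean treated minus mean control standardized abundance for one spot.

    `gels` is a list of dicts with hypothetical keys 'cy2' (pooled standard),
    'cy3' (control) and 'cy5' (treated), one dict per gel.
    """
    control = [standardized_abundance(g["cy3"], g["cy2"]) for g in gels]
    treated = [standardized_abundance(g["cy5"], g["cy2"]) for g in gels]
    return mean(treated) - mean(control)


# Invented volumes for one spot on two gels; the second gel is twice as
# bright overall, which the internal standard absorbs.
gels = [
    {"cy2": 1000.0, "cy3": 800.0, "cy5": 1600.0},
    {"cy2": 2000.0, "cy3": 1600.0, "cy5": 3200.0},
]
fold = spot_log2_fold_change(gels)  # log2 fold change, treated vs. control
```

Because both gels give the same standardized abundances, the spot shows a consistent twofold increase despite the difference in raw intensity between gels.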
Because of the disadvantages associated with 2DGE, non-gel protein isolation techniques have become increasingly popular. The simultaneous use of LC as a separation tool and MS/MS as an identification tool results in much higher efficiency (i.e., higher number of proteins identified per unit of time), compared to standard gel approaches 76. In addition, several new proteomic tools have been derived from LC-based methods, and although not yet commonly used in ecotoxicological research, could become an important set of tools in this field. These include cICAT and iTRAQ that are discussed in more detail below.
In cICAT, proteins from control and treated groups are labeled at cysteine residues with light and heavy isotope tags, respectively, each carrying a biotin moiety. After an affinity purification step, peaks corresponding to the same peptide are identified as doublets in MS due to the mass difference between light and heavy isotopes. The peak intensities of the peptides directly correlate with the relative abundance of the proteins in the two treatment groups 79. In iTRAQ, all peptides in a digest mixture are labeled with amine-reactive tags of identical total mass (hence the name isobaric), which derivatize peptides at the N-terminus and the lysine side chains 79. Upon fragmentation in MS/MS, signature ions (m/z from 114–117) are produced, which provide quantitative information upon integration of the peak areas. Although these new approaches are considered more sensitive (measured as the number of peptides per protein identified) compared to the other proteomic techniques described earlier, the best approach continues to be the use of several of these techniques in combination, because this usually results in the identification and quantification of nonoverlapping groups of proteins and thus more closely reproduces a true whole proteome response of an organ or organism after exposure to stressors 79.
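How the iTRAQ signature ions yield relative quantities can be sketched as follows. This is an illustrative simplification, not a vendor algorithm: the peak list is invented, reporter channels are located by their nominal m/z values (114–117) with an arbitrary tolerance, the choice of channel 114 as the reference is an assumption, and isotope-impurity correction is omitted.

```python
REPORTER_MZ = (114, 115, 116, 117)  # nominal m/z of the four signature ions


def reporter_intensities(peaks, tolerance=0.2):
    """Sum the intensity of MS/MS peaks lying within `tolerance` of each
    nominal reporter m/z. `peaks` is an iterable of (m/z, intensity) pairs."""
    totals = {channel: 0.0 for channel in REPORTER_MZ}
    for mz, intensity in peaks:
        for channel in REPORTER_MZ:
            if abs(mz - channel) <= tolerance:
                totals[channel] += intensity
    return totals


def ratios_to_reference(totals, reference=114):
    """Express every channel relative to the chosen reference channel."""
    ref = totals[reference]
    if ref <= 0:
        raise ValueError("reference channel has no signal")
    return {channel: value / ref for channel, value in totals.items()}


# Invented MS/MS peak list for one labeled peptide; the last peak is a
# sequence fragment ion and does not contribute to quantification.
spectrum = [
    (114.11, 1000.0),
    (115.11, 2000.0),
    (116.11, 500.0),
    (117.12, 1000.0),
    (175.12, 4000.0),
]
ratios = ratios_to_reference(reporter_intensities(spectrum))
```

In practice, ratios from the several peptides assigned to one protein would be combined (for example, by a median) to give a protein-level estimate, and up to four samples are compared within a single run.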
SUMMARY AND FUTURE RESEARCH NEEDS
Proteomics is defined as the simultaneous analysis and quantification of cellular or extracellular protein abundance. Compared to genomics, proteomics can provide functional mechanistic information because mRNA is a disposable message and only a limited amount of it gets translated into proteins. In addition, proteomic techniques can capture changes in the activity of proteins measured as posttranslational modifications. One major drawback of using proteomics in ecotoxicological research is that the genome of most organisms used for these studies is largely unknown. This results in the identification of a limited number of proteins.
Similar to other omic techniques, proteomics offers great potential as a biomarker discovery tool, as well as for increasing our understanding of the cellular mechanisms involved in stress response. Thus, the major potential of these tools lies in their ability to predict adverse impacts of stressors on invertebrates and fish. Because organisms are rarely exposed to a single chemical, these high-throughput, mechanistically based tools can also improve our knowledge of the effects of mixtures. However, several major challenges remain before proteomics can be successfully applied to environmental risk assessment. Some of these challenges include the following.
First, linking protein expression profiles to phenotypic and population-level effects. Integration of proteomic data with phenotypic changes and ecologically relevant endpoints is needed if these tools are to be implemented by regulatory agencies as predictors of risk to free-ranging animals. The use of proteomics in existing risk assessment approaches would allow for a high-throughput screening of simultaneous changes in thousands of molecules, and thus provide a more rapid evaluation of the toxicity of chemicals.
Second, more field studies that utilize proteomic approaches are needed. As can be seen from this review, few field studies have employed proteomic tools to evaluate the exposure of biota to pollutants and the resulting effects. Results are promising in that multivariate analyses of protein profiles are predictive of site and, thus, of specific contaminant exposure.
Third, most proteomic studies have focused on a single time point. More studies are needed that add a temporal component and evaluate changes in proteins over several time points, as well as during different life stages.
Fourth, a major challenge will involve separating biological noise from effects elicited by specific stressors. It is known that factors such as sex, age, and physiological condition can affect the profile of proteins. These types of effects must be taken into consideration during experimental design and data interpretation.
Finally, a major challenge is to annotate the proteomes of species routinely used in ecotoxicological research. This gap in proteomics research could be overcome in the short-term with the use of de novo sequencing approaches. Regardless of all the pressing challenges, there is no doubt that proteomics as a holistic measure of organism response to environmental stressors will continue to make a great impact in the ecotoxicological field.