Abstract

  1. Top of page
  2. Abstract
  3. Introduction
  4. Identifying reproductive proteins
  5. Evolutionary patterns of reproductive proteins
  6. Functional analysis of reproductive proteins
  7. Conclusions
  8. Acknowledgements
  9. References

Reproductive proteins maintain species-specific barriers to fertilization, affect the outcome of sperm competition, mediate reproductive conflicts between the sexes, and potentially contribute to the formation of new species. However, the specific proteins and molecular mechanisms that underlie these processes are understood in only a handful of cases. Advances in genomic and proteomic technologies enable the identification of large suites of reproductive proteins, making it possible to dissect reproductive phenotypes at the molecular level. We first review these technological advances and describe how reproductive proteins are identified in diverse animal taxa. We then discuss the dynamic evolution of reproductive proteins and the potential selective forces that act on them. Finally, we describe molecular and genomic tools for functional analysis and detail how evolutionary data may be used to make predictions about interactions among reproductive proteins.


Introduction

Reproduction is a key component of a sexual organism's fitness, and across diverse taxa, questions about reproductive interactions are fundamental to understanding how organisms succeed evolutionarily. A female chimpanzee mates with several males in quick succession; what factors mediate the sperm competition that ensues? Two overlapping abalone species spawn simultaneously; what ensures the species specificity of fertilization? Sperm and egg meet in the reproductive tract of a female mouse; how do they fuse to form a zygote? After mating, a female fruit fly becomes unreceptive to courting; what causes this behavioral change? At the molecular level, reproduction is an intricate act composed of interactions between many proteins. Research over the past two decades has shown that these interactions govern at least some aspect of each question above. In cases such as the spawning abalone1, 2 and the uninterested female fly,3, 4 the causative proteins and molecular interactions are clear, but most reproductive phenotypes and behaviors remain unexplained at the molecular level.

Advances in genomics and proteomics are bridging the gap between reproductive phenotypes and their molecular mechanisms; for the first time it is possible to identify entire suites of reproductive proteins. Here, we review progress in identifying reproductive proteins from animals, with an emphasis on model systems. In particular, we consider proteins involved in sperm-egg interactions and proteins found in the male and female reproductive tracts. We discuss how comparative genomics allows tests of hypotheses about the evolutionary forces that act on reproductive proteins and describe how molecular evolutionary methods may be used to predict which reproductive proteins functionally interact. We also examine molecular and genomic approaches to functionally characterize reproductive proteins and suggest future experiments to address unresolved questions in the field.

Identifying reproductive proteins

Early studies of reproductive proteins began when the biochemical fractionation of gametes and gonads led to the purification and characterization of specific proteins. These cells and tissues were accessible and abundant in free-spawning marine organisms: the marine mollusk abalone and the echinoderm sea urchin have since become models for studying fertilization.5–8 Indeed, these taxa represent two of three systems in which interacting pairs of male and female reproductive proteins are known, though individual gametic and accessory proteins of mammals, Drosophila and other systems have also received extensive study.3, 9–14

Recent work has focused on identifying reproductive proteins en masse with genomic and proteomic approaches. One method has been to sequence libraries of expressed sequence tags (ESTs) from reproductive tissues in order to find transcripts likely to encode reproductive proteins.15–22 To distinguish genes with specific reproductive functions from housekeeping genes that are expressed ubiquitously, EST libraries are commonly screened against cDNA libraries made from non-reproductive tissues. For example, to identify genes expressed in the male accessory glands of D. simulans, Swanson et al.18 screened an accessory gland cDNA library with cDNA from female flies and sequenced only those accessory gland cDNAs that did not hybridize to female transcripts. Another method to identify candidate reproductive genes from EST libraries is to screen for proteins with motifs that are predictive of a particular cellular localization, such as a secretion signal or transmembrane domain in screens for female reproductive tract proteins.17, 19

More recently, whole-genome transcriptional profiling has improved upon the coverage of EST screens. Ravi Ram and Wolfner23 used such data to identify all annotated D. melanogaster genes that are expressed specifically in the male accessory glands and encode putative secreted proteins. Similarly, Prokupek et al.24, 25 used EST analysis and microarrays to identify genes that are up- or downregulated at different time points after mating in female Drosophila sperm storage organs.

A recent explosion of proteomics studies has identified many reproductive proteins in taxa ranging from crickets and honeybees to rodents and humans.15, 26–40 Applying mass spectrometry (MS)-based proteomics to the study of reproduction has been particularly fruitful, in our view, because reproductive tissues are often specialized cell types with the discrete function of producing proteins involved in a specific biological process. Thus, biochemical isolation and purification of proteins from these tissues and cell types are straightforward, and large amounts of relatively pure protein mixtures can be analyzed by MS. While MS methodologies vary by study,41, 42 the elements of an MS experiment include the digestion of proteins into peptides by a protease (e.g., trypsin), the chromatographic or electrophoretic separation of proteins or peptides, and the analysis of these peptides by a mass spectrometer. Mass spectra are matched to proteins using a computer algorithm to determine whether peptides found in annotated protein databases could produce spectra that resemble those observed experimentally (Fig. 1). For species with well-curated genomes, the database of proteins may be the annotated set of proteins available from a genome browser. For species without sequenced genomes, an EST library from a relevant tissue may be used as a starting point.15, 26, 27 Across taxa, these screens have repeatedly identified intriguing functional classes of reproductive proteins (Table 1).
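The protease digestion step described above can be sketched in silico. This is a minimal illustration, assuming the commonly used rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name, the minimum peptide length, and the toy protein sequence are hypothetical and are not taken from any study cited here.

```python
def tryptic_digest(protein, min_length=6):
    """Return peptides from an in silico tryptic digest.

    Assumed rule: cleave after K or R, except when followed by P.
    Peptides shorter than min_length are dropped (an arbitrary cutoff
    chosen for this sketch).
    """
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Keep the C-terminal remainder that does not end in K or R.
        peptides.append(protein[start:])
    return [p for p in peptides if len(p) >= min_length]


# Hypothetical toy sequence (not a real protein) built to contain the
# peptide used as the example in Fig. 1.
TOY_PROTEIN = "MASKPLLRAEEKLNLGPAWGGRCDEK"

if __name__ == "__main__":
    for peptide in tryptic_digest(TOY_PROTEIN):
        print(peptide)
```

Applying the same function to every entry of a protein database would yield the list of candidate peptides against which observed spectra are compared.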

Figure 1. Identifying proteins from MS data. Shotgun proteomics experiments typically make two types of measurements. A: First, multiple peptides are analyzed in the first MS step (MS1) and the mass-to-charge (m/z) ratio of each is determined. B: In the second MS step (MS2), individual peptides from MS1 are sequentially isolated by the mass spectrometer and fragmented by collision with an inert gas. The m/z ratios of the fragments are then measured. After all spectra are collected, a computer algorithm44 is used to make peptide identifications. C: The first step of this algorithm identifies peptides from a protein database predicted to have the same m/z observed in MS1. D: The algorithm then predicts the theoretical fragmentation profile of each of these peptides and determines whether any of the predictions match the fragmentation profile observed in MS2. If so, then the protein from which the peptide was derived is identified. In this cartoon, the Drosophila SP protein produces a peptide, LNLGPAWGGR, with m/z = 520.8 and fragmentation m/z ratios as shown in (B). The predicted fragmentation spectrum for this peptide matches the observed MS2 spectrum, while the predicted spectra for other peptides with the same predicted m/z (derived from proteins CG31882 and CG3386) do not.
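The two-step matching logic in this caption can be sketched as follows. This is a simplified illustration rather than the algorithm of ref. 44: it assumes standard monoisotopic residue masses, considers only singly charged b and y fragment ions, and scores a candidate by the fraction of its predicted fragments found in the observed spectrum. The function names, the mass tolerance, and the scrambled comparison peptide are assumptions made for the example.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def peptide_mz(peptide, charge):
    """Precursor m/z of a peptide at a given charge (the MS1 measurement)."""
    neutral = sum(MONO[r] for r in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def fragment_ions(peptide):
    """Predicted singly charged b and y ions (the MS2 measurement)."""
    ions = []
    for i in range(1, len(peptide)):
        ions.append(sum(MONO[r] for r in peptide[:i]) + PROTON)          # b ion
        ions.append(sum(MONO[r] for r in peptide[i:]) + WATER + PROTON)  # y ion
    return ions


def matched_fraction(candidate, observed, tol=0.02):
    """Fraction of a candidate's predicted fragments seen in the spectrum."""
    predicted = fragment_ions(candidate)
    hits = sum(any(abs(p - o) <= tol for o in observed) for p in predicted)
    return hits / len(predicted)


if __name__ == "__main__":
    true_peptide = "LNLGPAWGGR"   # peptide from the Fig. 1 cartoon
    scrambled = "GLNLPAWGGR"      # hypothetical peptide with the same mass
    observed = fragment_ions(true_peptide)  # idealized, noise-free spectrum
    print(round(peptide_mz(true_peptide, 2), 2))
    print(matched_fraction(true_peptide, observed))
    print(matched_fraction(scrambled, observed))
```

At charge 2, the computed precursor m/z for LNLGPAWGGR agrees with the value of 520.8 given in the caption. The scrambled peptide has an identical precursor m/z but matches only part of the fragment spectrum, which is why the second step is needed to discriminate among candidates.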

Table 1. Interesting classes of reproductive proteins identified by proteomics experiments
Protein class | Organism and source of proteins | Potential functional role
Regulators of the complement pathway | Human seminal plasma | May suppress female immune response against transferred sperm36
Serine protease inhibitors | Mouse seminal vesicles | Upon transfer to female, may protect sperm from proteolytic attack or slow degradation of the copulatory plug30
Zona pellucida domain-containing proteins | Abalone (Haliotis spp.) egg vitelline envelopes | Structural components of egg coat, including a candidate receptor for a fusagenic sperm protein22
Odorant binding proteins | D. melanogaster seminal fluid proteins | May present odorants, pheromones or other small molecules to receptors in female reproductive tract32
Leucyl aminopeptidases | D. melanogaster sperm | Possible involvement in acrosome reaction31
Iron regulation and blood clotting | Aedes aegypti (mosquito) seminal fluid proteins | Possible localization to midgut after transfer to female reproductive tract, where proteins could help process blood meals that are necessary for egg production38
Antioxidant defense enzymes | Apis mellifera (honeybee) female spermatheca | May protect sperm from oxidative damage during long periods of storage28

MS-based identification of reproductive proteins has two advantages over sequencing ESTs. First, reproductive genes are identified based on evidence of the encoded protein's presence, leaving no doubt as to the protein-coding potential of the gene. Second, proteins that function in reproduction, but are also expressed elsewhere, can still be detected, since performing MS directly on proteins from reproductive tissues circumvents the need for subtractive hybridization. However, MS approaches have limitations. Without targeted purification schemes or analysis methods, MS may more readily identify certain classes of reproductive proteins (e.g., soluble, secreted, and/or abundant proteins) than others (e.g., transmembrane receptors, though see15, 31). Furthermore, not every protein identified in an MS screen for reproductive proteins functions in reproduction. For example, recent studies identified proteins from male reproductive tissues in flies and mice.30, 31, 40 Many of these proteins may interact with female proteins, and others may influence the regulation or modification of interacting proteins. Others, however, could play housekeeping roles that do not relate specifically to reproduction.

The drawbacks of large-scale MS studies can be addressed in several ways. If tissue-specific, whole-genome expression data are available,23 researchers can search lists of identified proteins for those with tissue-specific expression patterns. Alternatively, identified genes may be confirmed to have reproduction-specific expression patterns using RT-PCR.20 In certain systems, proteomics can be targeted toward those proteins with a direct effect on reproduction. Biochemical or physical fractionation, e.g., of mouse sperm heads39 or abalone vitelline envelopes,15, 26 is a simple, powerful method of isolating reproductive structures to guide proteomic analysis to proteins that play direct reproductive roles.

We recently developed a differential labeling method that directly identifies the reproductive proteins transferred in seminal fluid from male Drosophila to females.32 We produced female flies that were isotopically labeled with a non-radioactive isotope of nitrogen, 15N, by rearing flies for one complete generation on a paste of yeast that was grown with (15NH4)2SO4 as the sole nitrogen source.43 We then mated unlabeled males to these females and used MS to detect proteins from dissected female reproductive tracts. Female proteins were not identified because the masses of their peptide fragments were increased by incorporation of the heavy nitrogen. Specifically, female peptides were unidentified in the database-searching step of MS because their artificially increased masses (Fig. 1a) led the identification algorithm44 to select the wrong peptides from the protein database as possible matches (Fig. 1c), which produced predicted spectra that could not be matched to the observed spectra (Fig. 1b, d). This method identified more than 60 male proteins not previously known to be involved in reproduction and provided direct evidence for the transfer of >130 proteins. These proteins are, by definition, present during sperm competition, fertilization, and egg laying and are now targets for functional characterization. In subsequent work, we used MS data to identify seminal fluid genes that had been unannotated in the genomes of D. melanogaster and related species, suggesting that even for organisms with well-studied genomes, proteomic methods can identify many new reproductive genes.33
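The size of the mass shift that hides labeled female peptides from the database search can be estimated with a short calculation. This sketch assumes complete replacement of 14N by 15N and uses the standard number of nitrogen atoms in each amino acid residue. The function names are hypothetical, and the example peptide is simply the one shown in Fig. 1.

```python
# Nitrogen atoms per amino acid residue (backbone amide plus side chain).
N_ATOMS = {residue: 1 for residue in "GASPVTCLIDEMFY"}
N_ATOMS.update({"K": 2, "N": 2, "Q": 2, "W": 2, "H": 3, "R": 4})

# Mass difference between 15N and 14N (Da).
DELTA_15N = 0.997035


def nitrogen_count(peptide):
    """Total number of nitrogen atoms in a peptide."""
    return sum(N_ATOMS[r] for r in peptide)


def labeled_mz_shift(peptide, charge):
    """Increase in precursor m/z if every nitrogen is 15N (assumed 100% labeling)."""
    return nitrogen_count(peptide) * DELTA_15N / charge


if __name__ == "__main__":
    peptide = "LNLGPAWGGR"
    print(nitrogen_count(peptide))
    print(round(labeled_mz_shift(peptide, 2), 3))
```

For a doubly charged peptide of this size the shift is several m/z units, far larger than the precursor mass tolerances typically used in database searches, so a fully labeled peptide would be assigned the wrong candidate list in the first search step.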

We anticipate that as MS technologies advance, differential labeling approaches will grow more powerful. New methods that specifically direct the mass spectrometer to analyze low-abundance proteins45, 46 will increase the sensitivity of detection for transferred proteins, and within several years, MS instruments will be able to discern in real time which peptides are unlabeled (representing transferred male proteins) and which are labeled (representing female-derived proteins). Such an advance would further increase the sensitivity to detect male-derived proteins, but it would also enable the identification of female-derived proteins that were newly synthesized upon mating by allowing researchers to compare which 15N peptides were identified in mated versus virgin female samples. Such comparisons are possible because database-searching algorithms can be configured to identify peptides with isotopic labels.

Evolutionary patterns of reproductive proteins

Detecting selection on reproductive proteins

Reproductive protein identification fuels a parallel interest in examining the evolutionary forces that act during reproduction. This interest began when many of the first characterized reproductive proteins were found to have highly divergent sequences between closely related species. This observation was initially surprising: reproductive proteins must function properly in order for an organism to reproduce, so the expectation was that new variants that arose in reproductive genes would harm fitness and thus be selected against. However, this issue can be viewed from the opposite perspective: if reproductive proteins evolve to function more efficiently, competitively, or potently, then these proteins' direct involvement in reproductive success should ensure that such new, adaptive variants are strongly favored.

At the molecular level, one way to distinguish between these patterns is to examine changes in a gene's coding DNA sequence between related species. The dN/dS ratio compares the number of non-synonymous substitutions per non-synonymous site (dN) to the number of synonymous (silent) substitutions per synonymous site (dS) (Box 1). Most genes are conserved between related species and show dN/dS < 1, which indicates the action of purifying selection against amino acid-altering substitutions; pseudogenes that have been rendered non-functional by mutations may evolve neutrally, with dN/dS ≈ 1. However, some genes have dN/dS > 1, indicating that positive selection has driven the diversification of the protein sequence. The canonical examples of rapidly evolving genes are those that encode proteins involved in host-pathogen interactions, but when the genes encoding the abalone sperm proteins lysin and Sp18 and the D. melanogaster seminal fluid protein ovulin were first sequenced in multiple species, they were observed to be changing as rapidly as immunity genes.47–50 Genomic screens have repeatedly found that certain types of reproductive proteins have significantly higher dN/dS values than most genes (Fig. 2).17–19, 32, 51–53 Other methods, which detect recent selection acting within populations, have found adaptive evolution on reproductive proteins with a similar frequency.54, 55
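To make the dN/dS calculation concrete, the following sketch estimates pairwise dN and dS with a simplified counting approach in the spirit of the Nei-Gojobori method. It is not the method used in the studies cited here. It assumes the standard genetic code, counts changes to stop codons as non-synonymous, applies a Jukes-Cantor correction, and refuses codon pairs that differ at more than one position (full implementations average over mutational pathways instead). The function names and the toy sequences are hypothetical.

```python
import math
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites in a codon (between 0 and 3)."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODE[mutant] == CODE[codon]:
                    s += 1 / 3
    return s


def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits."""
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(seq1, seq2):
    """Return (dN, dS) for two aligned, gap-free coding sequences."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    pairs = [(seq1[i:i + 3], seq2[i:i + 3]) for i in range(0, len(seq1), 3)]
    s_sites = sum((syn_sites(a) + syn_sites(b)) / 2 for a, b in pairs)
    n_sites = len(seq1) - s_sites
    syn = nonsyn = 0
    for a, b in pairs:
        n_diff = sum(x != y for x, y in zip(a, b))
        if n_diff > 1:
            # Simplification for this sketch only.
            raise ValueError("codons differing at >1 position not handled")
        if n_diff == 1:
            if CODE[a] == CODE[b]:
                syn += 1
            else:
                nonsyn += 1
    return jukes_cantor(nonsyn / n_sites), jukes_cantor(syn / s_sites)


if __name__ == "__main__":
    # Toy alignment: one synonymous (GCT/GCC) and one non-synonymous
    # (AAA/AGA) difference.
    dn, ds = dn_ds("ATGGCTAAACTG", "ATGGCCAGACTG")
    print(round(dn, 3), round(ds, 3), round(dn / ds, 3))
```

Because there are many more non-synonymous than synonymous sites in a typical coding sequence, one difference of each kind yields dN/dS well below 1 in this toy case. Normalizing by the number of sites is what distinguishes the ratio from a simple count of substitutions.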

Box 1. Detection of positive selection on specific protein sites and phylogenetic lineages

The detection of positive selection on reproductive (and other) proteins has been enhanced by methods to detect specific sites and/or phylogenetic lineages that have experienced adaptive evolution. Estimates of dN/dS over the entire length of a gene are highly conservative, since most proteins are likely to experience multiple selective pressures. For example, key structural and core residues may be subjected to purifying selection, while residues responsible for interacting with a coevolving protein may evolve adaptively. Thus, dN/dS is rarely > 1 over the entire length of a gene. More sensitive methods use coding DNA sequences from multiple species to identify specific classes of protein sites that have experienced positive selection.116 These methods require no a priori knowledge of the structure of the protein, though when structures are available, the rapidly evolving sites often map to protein regions known to be involved in a coevolutionary process.117 It has also been observed that proteins with whole-gene dN/dS > 0.5 are often found in more sensitive analyses to have a set of adaptively evolving sites.19, 32, 51 Thus, genes that show a whole-gene dN/dS > 0.5 are prime candidates for having specific sites under selection, although the acquisition and analysis of gene sequences from additional species is required to conclude that positive selection has acted. Finally, additional methods can test whether selection has acted on particular phylogenetic lineages, which can be inferred with low power without an a priori hypothesis as to the lineages under selection118 or tested with higher power on specific lineages of interest.119

Figure 2. Many Drosophila reproductive proteins evolve at elevated rates. A: Pairwise estimates of dN and dS between D. melanogaster and D. simulans for 123 seminal fluid proteins identified in D. melanogaster.32, 33 B: Pairwise estimates of dN and dS between the same species for 100 randomly selected D. melanogaster genes with FlyBase-annotated orthologs in D. simulans. In each graph, the solid line shows dN/dS = 1, and the dashed line shows dN/dS = 0.5 (see Box 1 for an explanation of this threshold). Note that many, but far from all, seminal fluid proteins have elevated evolutionary rates.

Detecting positive selection on reproductive proteins is so common that it is now a mantra that reproductive proteins evolve rapidly, yet genomic screens have shown that most reproductive proteins have not experienced detectable positive selection. The D. melanogaster sperm proteome has evolved, on average, at the same rate as non-reproductive proteins.31 Of six accessory tissues of the murine male reproductive tract examined by proteomics, only the proteins of the seminal vesicle showed a significant elevation of evolutionary rate over the genome-wide average rate between mouse and rat.30 Furthermore, even those studies that have identified many rapidly evolving reproductive proteins have found even more proteins that are not adaptively evolving. For instance, of the 138 D. melanogaster seminal fluid proteins that we identified by proteomics,32 22 had whole-gene, pairwise dN/dS > 0.5, and only 16 were identified as having at least one protein site that was predicted to be adaptively evolving.

Of course, genomic and proteomic screens for reproductive proteins are likely conservative in their estimates of the proportion of genes evolving under positive selection. The most sensitive methods to detect selection (see Box 1) require gene sequences from multiple species, preferably at least six.56 For non-model organisms found in clades that have not yet been sequenced, obtaining these sequences requires considerable experimental work, and even in taxa like Drosophila, there are only five to six sequenced species that are closely related enough for analysis. Thus, proteomic-scale investigations of reproductive protein evolution are not detecting every rapidly evolving protein, but it remains likely that more reproductive proteins evolve under purifying selection than under positive selection. Also, different components of the reproductive process may evolve under different selective regimes, with data from Drosophila suggesting that secreted seminal fluid proteins may evolve faster than structural sperm proteins.31–33

Functions of rapidly and slowly evolving proteins

The interest in adaptively evolving reproductive proteins has also led to the implicit hypothesis that fast-evolving proteins are the most likely to have important functional roles. This hypothesis is logical: positive selection should act on novel protein variants that have large, positive effects on fitness. However, the functional data suggest that both slow- and fast-evolving reproductive proteins are important. Many marine invertebrate reproductive proteins evolve rapidly, and these proteins play essential roles in sperm-egg recognition and the species specificity of fertilization.2, 57–60 In Drosophila, some fast-evolving seminal proteins have essential functions: Acp26Aa (ovulin) dramatically increases egg laying in the 24 h following mating,10 and when males do not produce the protease CG9997, their sperm are inefficiently stored in female sperm storage organs.61, 62 However, the seminal fluid protein with the widest range of effects on mated females is the short, 36-residue sex peptide (SP). SP manipulates female behavior by increasing female egg production, upregulating an immune response, inducing feeding behavior, and preventing females from remating for days after mating.3, 63, 64 SP is also implicated in causing harm to females, in the form of shortened lifespan and reduced progeny production.65

Because of SP's apparent benefit to males and harm to females, SP and its receptor, SPR,4 seem prime candidates for cycles of male-female coevolution. However, neither SP nor SPR shows evidence of adaptive evolution. Between D. melanogaster and D. simulans, SP has a dN/dS = 0.33, which is above the genome-wide average of 0.09 but not indicative of adaptive evolution, and polymorphism-based tests find minimal evidence for recent selection.32, 66 SPR shows even greater conservation, as D. melanogaster SP can induce an SPR-mediated response from SPR orthologs identified in mosquitoes and moths.4 Furthermore, the rapidly evolving CG9997 acts in a network to ensure proper localization of SP to sperm storage organs,67 and the other network proteins evolve slowly. Perhaps these proteins anciently evolved to optimal sequences that preclude further improvement and play such essential roles that further amino acid substitutions would be deleterious.

The relative importance of rapidly and slowly evolving reproductive proteins may also be evident in mammalian systems. Proteins functioning in several steps of mammalian reproduction have evolved adaptively, including protamines that package DNA in sperm nuclei,68 sperm-egg surface proteins that potentially mediate gamete interactions,69–71 and seminal fluid proteins that form or degrade the copulatory plug.51 A more complete understanding of the specific protein-protein interactions that mediate mammalian fertilization will allow further assessment of this issue.

Gene duplications and lineage-specific proteins

Comparative genomics has revealed that gene duplication is an important force in the formation of new reproductive proteins. However, the mechanisms of duplication differ between protein types. Several groups have observed that retrotransposed genes often acquire testis-specific expression in both Drosophila and primates.72–74 These genes often move from the X chromosome to autosomes, which may allow them to escape X inactivation during spermatogenesis.75 Indeed, several members of the D. melanogaster sperm proteome are recently duplicated, retrotransposed genes.76 Among non-gametic proteins, tandem gene duplications that do not involve retrotransposition appear more common. We found that one-third of the transferred seminal fluid proteins in D. melanogaster were tandem gene duplicates.32 Curiously, female reproductive tract proteases in the desert drosophilid, D. arizonae, also show high rates of tandem gene duplication and rapid evolution,17 suggesting that throughout the Drosophila genus, duplication followed by divergence may be an important evolutionary mechanism by which males increase seminal fluid proteomic diversity and females adapt to this diversity.

Other studies have investigated lineage-specific changes in reproductive protein content. Some of the most exciting comparisons are performed between closely related species with divergent mating systems, under the hypothesis that the strength of sexual selection acting on reproductive proteins may depend on the intensity of sperm competition or the rate of female remating. In primates, Clark and Swanson51 found loss-of-function mutations in two genes (TGM4 and KLK2) involved in the formation or dissolution of the copulatory plug in species in which levels of sperm competition had decreased, whereas the genes remained functional and adaptively evolving in other species. Similarly, Dorus et al.77 found a positive correlation between the rate of evolution of SEMG2, which encodes a key component in semen coagulum, and the level of female promiscuity in apes and old-world monkeys. In rodents, the evolutionary rate of SVS II, a protein that becomes cross-linked upon insemination to form the copulatory plug, is correlated with testis size, a proxy for the level of sperm competition.37, 78 These studies suggest that in mammals, the mating system is an important factor that influences the rate of reproductive protein evolution. In Drosophila, more promiscuous species may have more rapidly evolving reproductive proteins,79 and with the new availability of larger data sets for male34 and female17 proteins of these species, we expect progress in identifying those proteins whose evolutionary histories have been most affected by mating system changes.

Selective forces driving reproductive protein evolution

In light of the repeated observations of adaptive evolution in reproductive proteins, one of the most elusive questions in the field is: what selective forces underlie this rapid divergence? This question has been repeatedly addressed;80–82 here, we briefly describe some leading hypotheses and provide citations to work relating to each. Two proposed hypotheses, sexual selection and sexual conflict, each predict the direct coevolution between male and female proteins, but they differ over the predicted consequences of this coevolution. Under sexual selection, females should choose specific variants of interacting male reproductive proteins, either because the male variant is an indicator of an unrelated but desirable trait or because a particular combination of male and female alleles leads to increased reproductive success.58, 77, 83 Under sexual conflict, males and females have different optima for a particular reproductive trait, and the rapid coevolution of reproductive proteins may represent each sex's effort to achieve its own optimum, at a fitness cost to the opposite sex.65, 84–87

Another hypothesis is selection that reinforces against hybrid formation. Reinforcement operates when two species occupying the same area interbreed, and the resulting hybrids have reduced fitness. In these circumstances, selection could act to prevent hybridization by causing divergence in reproductive characters.88, 89 A fourth hypothesis is that reproductive proteins evolve to avoid pathogens: if gametes, nutrient-rich eggs in particular, are subject to pathogen attack during development, spawning, ovulation, or fertilization, their surface proteins may evolve rapidly to avoid this cost. If changes to inhibit pathogen entry make sperm entry more difficult, then sperm must coevolve with the egg to maintain fertilization compatibility. This hypothesis makes the same predictions about male-female coevolution as the others, but it is difficult to test because of the paucity of data on which pathogens may prey upon gametes in different systems.

These hypotheses have been repeatedly discussed, but few experiments have directly tested them. It is also likely that no single force can explain each case of adaptive reproductive protein evolution, and in certain cases, multiple forces may be interrelated. For example, pathogen-driven egg protein evolution could, in turn, cause sexual selection for particular alleles of sperm proteins. Nonetheless, sexual conflict may be an important diversifying force in many situations, given the prevalence of documented conflicts in model systems.85, 86, 90, 91 Appealingly, the hypothesis does not depend on external factors, such as the presence of related species or pathogens, to explain rapid evolution in non-monogamous mating systems. However, measurements of the costs and benefits accrued by each sex during mating are necessary to assess the opportunities for conflict.80

Understanding the forces that drive reproductive protein evolution is also important because divergence in these proteins may be involved in speciation.92 If reproductive traits diverge along distinct trajectories within each of two populations, reproductive incompatibilities may occur when the populations experience secondary contact. Mathematical models of sexual conflict predict this sort of divergence in male and female alleles,93, 94 and in abalone species in which lysin-VERL incompatibilities appear to be a principal barrier to heterospecific fertilization,95 patterns of intra-population variation at these loci match the models' predictions.84 Of course, between-population divergence caused by any of the forces described above could cause these types of reproductive incompatibilities. Thus, identifying the forces that drive reproductive protein evolution may reveal forces at play during the process of speciation.

Functional analysis of reproductive proteins

It has become easier to identify reproductive proteins on a genomic scale and to analyze their evolutionary histories, but studies that do so often provide little functional information. In this section, we review molecular and genomic methods to functionally analyze reproductive proteins and describe ways in which evolutionary data may be used to predict protein-protein interactions.

Molecular techniques for functional analysis

Biochemical and genetic methods have yielded insights into the functions of specific proteins in a variety of reproductive interactions. A major open question in mammalian systems is the identity of the specific sperm and egg proteins that mediate gamete fusion. CD9 was identified as an egg protein required for this process when CD9-knockout mice were found to be infertile. To identify sperm proteins that also act in gamete fusion, Inoue et al. used a monoclonal antibody, which was developed to specifically inhibit the fusion of mouse gametes, to immunoblot for sperm proteins. A reactive protein was identified by MS, and the underlying gene, Izumo, was cloned and verified by mouse knockout experiments to be necessary for gamete fusion.96 Subsequent biochemical evidence suggested that Izumo may act as part of a larger sperm protein complex during gamete fusion.97 This combination of biochemistry and genetics illustrates the continuing utility of molecular methods.

These methods have also enhanced our understanding of mammalian egg coat [zona pellucida (ZP)] proteins, particularly in mice.14 Mouse ZPs are principally composed of three glycosylated proteins (ZP1-ZP3), which share a common ZP protein domain and interact non-covalently to create an elastic, fibrous coat. The ZP2 and ZP3 proteins are sperm receptors, and ZP3 is thought to induce the acrosome reaction, perhaps through a supramolecular complex with other ZP components.98 Knockout studies show that when ZP2 or ZP3 is not expressed, females produce eggs lacking ZPs and are infertile. The ZP domain of each protein contains two distinct portions: the N terminus, ZP-N, is responsible for polymerization of ZP proteins into fibrils, while the C terminus, ZP-C, has regions that regulate secretion and, in the case of ZP2 and ZP3, sperm binding. The crystal structure of the ZP-N domain of ZP3 was recently solved,99 providing insight into the formation of ZP structures and raising the possibility of understanding, at the molecular level, why particular mutations in ZP proteins lead to infertility. Curiously, ZP2 has multiple ZP-N domains preceding a single ZP-C domain, and these domains may function in preventing polyspermy and maintaining the species specificity of fertilization. Using the ZP3 ZP-N structure to comparatively model the ZP2 ZP-N domains99 and examining the structural effects of changes in positively selected sites should enhance our understanding of sperm-egg recognition and coevolution.

Other examples of genetic analysis come from ectopic expression studies of Drosophila seminal proteins. Yang et al.100 expressed a membrane-bound version of SP in different subsets of neurons in female flies and identified six to eight specific neurons in the female reproductive tract, marked by the expression of SPR and the genes pickpocket (ppk) and fruitless (fru), which appear sufficient for SP-mediated post-mating behavioral changes. Remarkably, a complementary approach using an RNAi screen drew the same conclusions.101 These findings represent perhaps the most detailed molecular interaction known between non-gametic reproductive proteins. Ectopic overexpression of seminal proteins in females has also been used to “magnify” small effects that might be missed by knockout experiments in males. Systemic expression of several male-derived proteins alters females' tolerance of bacterial infection, and four seminal proteins are toxic to females when overexpressed.102, 103 One of the toxic proteins was SP, a result consistent with experiments showing that females mated to SP-knockdown males had significantly higher fitness.65

Genomic screens for interacting proteins

Particularly in model organisms, large-scale genetic screens can identify interacting reproductive proteins. For example, SPR was initially identified by using a library of fly lines that enabled the tissue-specific RNAi knockdown of nearly every gene in the genome.104 To identify SPR, Yapici et al. screened for genes whose knockdown in females caused the same phenotype seen when females mate with males that do not transfer SP.4 Females knocked down for CG16752 (now SPR) laid fewer eggs and were unusually receptive to remating, and subsequent genetic, immunohistochemical and neurobiological analysis confirmed the interaction. While these sorts of screens – let alone the construction of a genome-wide RNAi library – are non-trivial, methods like a yeast two-hybrid screen105 are feasible in non-model systems, since they require no prior knowledge of any particular gene or potential interactor.

Phylogenetic signals of coevolution to predict functional interactions

In addition to the functional tests described above, recently developed evolutionary methods may allow predictions of functionally interacting reproductive proteins identified in genome-wide screens. The central idea behind these methods is that proteins that share a direct, physical interaction or that functionally interact (e.g., as a part of the same complex) experience shared evolutionary constraints. If, along one phylogenetic lineage, a substitution occurs in one protein of an interacting complex, then we might expect to observe in that same lineage a compensatory mutation in another protein in that same complex to maintain an optimal interaction (Fig. 3a). Along the branches of a phylogeny, then, we would expect to observe correlated evolutionary rates for trees depicting two interacting proteins (Fig. 3b, c).
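The expectation of correlated rates can be illustrated with a small calculation. The Python sketch below is only an illustration under stated assumptions: the per-branch dN/dS values are hypothetical, the function names are ours, and a Pearson correlation with a permutation test stands in for the likelihood-based tests used in published analyses. It also treats branches as independent observations and ignores lineage-wide effects that could raise or lower rates in both proteins at once.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """One-sided p-value: share of branch shufflings with r >= the observed r."""
    rng = random.Random(seed)
    observed = pearson_r(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the pairing between matching branches
        if pearson_r(xs, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical dN/dS estimates for the same eight branches of two gene trees
# (illustrative numbers only, not taken from any study).
male_omega = [0.2, 0.5, 1.1, 0.3, 1.8, 0.9, 0.4, 2.3]
female_omega = [0.3, 0.6, 1.0, 0.2, 1.5, 1.1, 0.5, 2.0]

r = pearson_r(male_omega, female_omega)
p = permutation_p_value(male_omega, female_omega)
print(f"r = {r:.2f}, permutation p = {p:.4f}")
```

In practice the per-branch estimates would come from a codon-substitution analysis of each gene on a shared species tree, and a strong correlation would mark a candidate pair for functional follow-up rather than demonstrate an interaction.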

Figure 3. Coevolutionary signals suggest potentially interacting proteins. A: Proteins that functionally interact may coevolve to maintain their interaction interface. In this schematic, a change in the structure of a female reproductive protein (pink) may select for a corresponding change in the structure of an interacting male protein (blue). B: If coding gene sequences are available for multiple species, mirror trees can be constructed for the male and female proteins and a dN/dS ratio can be estimated for each branch of each phylogeny. C: Coevolving reproductive proteins may show correlated dN/dS rates across a phylogeny. Points in the scatterplot are color-coded to match branches with dN/dS estimates in the phylogenies. Likelihood models can be used to determine whether the correlation is significant.84

This idea of looking for correlations in the so-called “mirror trees” has been central to a variety of methods to infer interacting proteins in diverse biological processes106 and has recently been applied to interacting reproductive proteins in a study of abalone lysin and VERL.84 Clark et al. found a correlation between the dN/dS values for lysin and VERL along each branch of an abalone phylogeny, confirming the sensitivity of this method for a known pair of interactors. The study also detected coevolution through another new method, by looking for inter-locus linkage disequilibrium (LD) between single nucleotide polymorphisms (SNPs) in lysin and VERL within a population of abalone. This method raises the possibility of detecting candidate interacting proteins by searching for LD between physically unlinked loci. It would be interesting to validate these methods in another known case of interacting fertilization proteins, bindin and EBR1 in sea urchins, by performing polymorphism surveys in the species already shown to have associations between male and female genotypes during fertilization.58, 86
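The inter-locus LD approach rests on a standard calculation. As a minimal sketch, assuming phased two-locus haplotypes with alleles coded 0 or 1, the textbook statistics are D = p_AB - p_A p_B and r^2 = D^2 / [p_A(1 - p_A) p_B(1 - p_B)]. The function name and the sample counts below are hypothetical, and this is not necessarily the statistic or the test used in the study cited above.

```python
def ld_r_squared(haplotypes):
    """Return (D, r^2) for two biallelic SNPs.

    `haplotypes` is a list of (allele_at_locus_1, allele_at_locus_2) pairs,
    with each allele coded 0 or 1. Phase is assumed to be known.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    p1 = sum(a for a, _ in haplotypes) / n  # frequency of allele 1 at locus 1
    p2 = sum(b for _, b in haplotypes) / n  # frequency of allele 1 at locus 2
    p12 = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p12 - p1 * p2  # departure from independent association
    denom = p1 * (1 - p1) * p2 * (1 - p2)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d, d * d / denom


# Hypothetical sample of 100 haplotypes in which allele 1 at the first locus
# tends to occur together with allele 1 at the second locus.
sample = [(1, 1)] * 40 + [(0, 0)] * 40 + [(1, 0)] * 10 + [(0, 1)] * 10
d, r2 = ld_r_squared(sample)
print(f"D = {d:.2f}, r^2 = {r2:.2f}")
```

For physically unlinked loci, r^2 is expected to be near zero in a randomly mating population, so a significant excess is what would flag a candidate pair. Population structure can produce the same signal and would need to be ruled out.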

Both of these approaches could be effective in taxa for which large amounts of sequence and SNP data are available. Indeed, in Drosophila there are well-documented associations between reproductive success, variants in seminal fluid proteins, and differences in male and female genetic backgrounds.107–109 These observations, when combined with the recent identification of female proteins likely to affect sperm competition outcomes and interact with seminal proteins,19, 24, 25 make it possible to test for associations between variants in male and female reproductive proteins. Such tests could uncover interacting proteins and shed light on the types of selective forces that maintain the high levels of variation observed in many Drosophila seminal proteins.66

Linking molecular variants to reproductive phenotypes

Another fruitful avenue for future research will be to combine molecular studies of specific genes with whole-organism measures of fitness. Specifically, it will be of interest to examine the effects of variants and null mutations in specific reproductive genes on measures of fitness such as fertilization rate, sperm competitive ability, and overall fecundity. Because of the need to study variation and mutants, some of this work might be most readily accomplished in model systems. In Drosophila, many studies have demonstrated variation in natural or outbred populations in reproductive characters,87, 107–109 and as described above, the goal now is to identify the allelic variants that correspond with adaptive traits. For instance, Rice used experimental evolution to study male seminal fluid adaptations to a given female genotype,87 and it should now be possible to use next-generation DNA sequencing to measure changes in allele frequencies that occur during this type of experiment. It will also be important to study variants in female reproductive tract proteins to see which correspond with reduced harm from males and increased fecundity, as this area of research has gone largely unexplored. Another recent study found that male Drosophila transfer greater quantities of certain seminal proteins and have longer-lasting copulations when mating occurs in the presence of another male.110 These results suggest that males possess the ability to gauge the risk of sperm competition and allocate reproductive resources accordingly. It will be interesting to identify the genetic basis behind this strategic behavior.

Identifying the molecular underpinnings of reproductive phenotypes is important for other systems, as well. Natural strains of Caenorhabditis elegans vary in their males' abilities to form copulatory plugs after mating.111 The presence of a copulatory plug deposited by one male diminishes the chances of successful mating by a subsequent male,112 and the polymorphism in plug formation ability was recently mapped to the plg-1 gene, which encodes a mucin-like protein that comprises the plug.113 In non-plugging strains, a retrotransposon insertion interrupts the plg-1 coding sequence; the high frequency of this polymorphism has been interpreted as evidence that because C. elegans is a largely self-fertilizing, hermaphroditic species, selection for mechanisms to prevent sperm competition has been relaxed. Curiously, however, hermaphrodites preferentially utilize male sperm for fertilization, perhaps because male sperm are larger.114, 115 We speculate that a quantitative proteomic analysis of hermaphroditic and male sperm could reveal whether specific proteins or an overall size advantage cause male sperm precedence.

Conclusions

Proteomic and genomic methods are enabling researchers to identify the specific proteins that underlie long-standing observations in reproductive biology. These methods have shown that gametes and their accessory proteins are complex, with hundreds of proteins functioning together to ensure the creation of the next generation. The evolutionary patterns of these proteins are also complex. Reproductive proteins as a class have elevated evolutionary rates, and positive selection and gene duplication are important diversifying forces. However, many reproductive proteins are conserved between closely related species, including Drosophila SP, which is thought to underlie a conflict between the sexes.65

After the identification of large suites of reproductive proteins, functional characterization is essential. In addition to genetic screens, new phylogenetic and association-based methods are available to identify male and female reproductive proteins that may interact, as demonstrated in the abalone system with lysin and VERL.84 Even in the genomics era, genetic and biochemical methods remain important for revealing reproductive protein function, such as understanding the interactions between mammalian sperm and egg coat proteins.99 It is also critical to exploit naturally occurring variation in reproductive proteins and associations between SNPs in pairs of proteins to link organismal fitness phenotypes to specific genes. While the continued large-scale identification of reproductive proteins is important, we anticipate that over the next decade, much of the exciting research in this field will describe the molecular interactions required for reproduction to succeed.

Acknowledgements

We thank Jan Aagaard, Steve Springer, Renee George, Nathan Clark, and two anonymous reviewers for helpful comments on the paper, Jim Thomas and Mike Palopoli for discussions of C. elegans fertilization, and the Swanson Lab for support. WJS is supported by NIH grants HD057974 and HD042563, and NSF grant DEB-0743539. GDF has been supported by NIH training grant T32 HG00035.

References
