Molecular evolution across developmental time reveals rapid divergence in early embryogenesis

Abstract Ontogenetic development hinges on the changes in gene expression in time and space within an organism, suggesting that the demands of ontogenetic growth can impose or reveal predictable pattern in the molecular evolution of genes expressed dynamically across development. Here, we characterize coexpression modules of the Caenorhabditis elegans transcriptome, using a time series of 30 points from early embryo to adult. By capturing the functional form of expression profiles with quantitative metrics, we find fastest evolution in the distinctive set of genes with transcript abundance that declines through development from a peak in young embryos. These genes are highly enriched for oogenic function and transient early zygotic expression, are nonrandomly distributed in the genome, and correspond to a life stage especially prone to inviability in interspecies hybrids. These observations conflict with the “early conservation model” for the evolution of development, although expression‐weighted sequence divergence analysis provides some support for the “hourglass model.” Genes in coexpression modules that peak toward adulthood also evolve fast, being hyper‐enriched for roles in spermatogenesis, implicating a history of sexual selection and relaxation of selection on sperm as key factors driving rapid change to ontogenetically distinguishable coexpression modules of genes. We propose that these predictable trends of molecular evolution for dynamically expressed genes across ontogeny predispose particular life stages, early embryogenesis in particular, to hybrid dysfunction in the speciation process.


Impact Summary
The development of an organism from a single-celled embryo to a reproductive adult depends on dynamic gene expression over developmental time, with natural selection capable of shaping the molecular evolution of those differentially expressed genes in distinct ways. We quantitatively analyzed the dynamic transcriptome profiles across 30 timepoints in development for the nematode C. elegans. In addition to rapid evolution of adult-expressed genes with functional roles in sperm, we uncovered the unexpected result that the distinctive set of genes that evolve fastest are those with peak expression in young embryos, conflicting with some models of the evolution of development. The rapid molecular evolution of genes in early embryogenesis contrasts with the exceptional conservation of embryonic cell lineages between species, and corresponds to a developmental period that is especially sensitive to inviability in interspecies hybrid embryos. We propose that these predictable trends of molecular evolution for dynamically expressed genes across development predispose particular life stages, early embryogenesis in particular, to hybrid dysfunction in the speciation process.
Ontogenetic development hinges on the changes in gene expression in time and space within an organism. The dynamic molecular networks that specify cell proliferation and differentiation together produce morphogenesis, going from a single-celled zygote to a reproductively mature adult. Evolution favors maximal reproductive success to shape those gene expression dynamics and the functional properties of the proteins they encode, with the strength of selection pressures recorded in their sequences. Therefore, the demands of ontogenetic growth ought to impose or reveal predictable pattern in the molecular evolution of genes expressed dynamically across development (Raff 1996;Kalinka and Tomancak 2012). The rules, if any, that govern the molecular evolution of development must integrate adaptive evolution within the cellular constraints to forming a whole organism in embryogenesis and the life history constraints on a whole organism to reproduce successfully. We can address these issues from the perspective of genetic controls (e.g., cis and trans regulation) or from spatiotemporal dynamics in the formation of the structures of a complete organism.
A physical, spatial perspective motivates one means of molecular evolutionary predictability in development: tissue or cell specificity of gene expression will narrow the breadth of expression in space and consequently narrow the potentially negative pleiotropic effects of changes to gene expression or protein function (Stern 2000;Carroll 2005;Haygood et al. 2010;He et al. 2012). This logic about the impact of pleiotropy mirrors arguments for the disproportionate role of cis-regulatory changes in adaptive divergence, relative to trans-regulatory and coding changes (Wray 2007;Carroll 2008;Stern and Orgogozo 2008;Wittkopp and Kalay 2012). For example, mammalian genes with greater tissue specificity of expression evolve faster in coding sequence but slower in terms of expression change (Liao and Zhang 2006b).
Temporal specificity of gene expression provides a parallel dimension to spatial specificity that can restrict or exacerbate the potential for pleiotropic effects of change to gene regulation or protein structure. Similar to the argument for spatial extent of gene activity, narrower duration of expression in ontogeny ought to narrow the potential for negative pleiotropic effects of changes to a given gene. A counterargument, however, points out the unidirectional nature of time: changes to early points in development can cascade through ontogeny with disproportionate force (Poe and Wake 2004;Irie and Kuratani 2014;Arthur 2015). Because most new mutations with fitness effects are deleterious (Keightley and Lynch 2003), this "early conservation" or "generative entrenchment" view predicts slower evolution of genes expressed earlier in embryonic development, as has been reported for mouse and zebrafish (Roux and Robinson-Rechavi 2008;Irie and Kuratani 2014). By contrast, the most famous temporal paradigm derives from embryological observations of a "phylotypic stage" with greatest phenotypic constraint at intermediate timepoints in development, the "hourglass model" (Raff 1996;Kalinka and Tomancak 2012). Applications of this idea to molecular data have renewed interest in it beyond morphology for diverse taxa, including Caenorhabditis elegans (Castillo- Davis and Hartl 2002;Cutter and Ward 2005;Levin et al. 2012;Zalts and Yanai 2017) and other invertebrates (Davis et al. 2005;Cruickshank and Wade 2008;Kalinka et al. 2010;Mensch et al. 2013;Gerstein et al. 2014;Levin et al. 2016;Liu and Robinson-Rechavi 2018b;Coronado-Zamora et al. 2019), vertebrates (Hazkani-Covo et al. 2005Domazet-Loso and Tautz 2010;Irie and Kuratani 2011;Piasecka et al. 2013;Liu and Robinson-Rechavi 2018a), and even plants (Quint et al. 2012;Drost et al. 2015). Different still, population genetics arguments about weaker purifying selection on genes expressed by just one sex, like maternal-effect gene products deposited in eggs, predict disproportionately rapid evolution of such maternally deposited genes involved in early embryogenesis of zygotes (Cruickshank and Wade 2008;Dapper and Wade 2016).
These "evo-devo" ideas, however, largely focus on embryogenesis, and do not explicitly incorporate the entirety of ontogeny over an organism's life cycle (Kalinka and Tomancak 2012). Ideas from the evolution of aging and senescence, by contrast, consider late life (Flatt and Schmidt 2009). In particular, the mutation accumulation theory of aging predicts more rapid evolution of genes expressed following the onset of reproductive maturity than for those expressed earlier because diminishing reproductive value following maturity weakens the ability of selection to eliminate mutations (Medawar 1952;Charlesworth 1993;Promislow and Tatar 1998;Partridge 2001). Genes with expression in just one sex also ought to experience weaker purifying selection than other genes, leading to faster protein evolution, because mutations would be exposed to selection in just a fraction of the population (Cruickshank and Wade 2008;Dapper and Wade 2016). Sexual contests and mate choice drive rapid divergence in morphological ornaments and their genetic underpinnings (Swanson and Vacquier 2002;Ellegren and Parsch 2007;Rowe et al. 2018), so sexual selection and sexual conflict also predict faster evolution of sex-biased genes and of genes expressed late in life, to the extent that their development gets specified toward adulthood. The coding sequences of adult-expressed genes do tend to evolve faster than embryonic genes in a number of taxa (Cutter and Ward 2005;Davis et al. 2005;Artieri et al. 2009;Liu and Robinson-Rechavi 2018a;Coronado-Zamora et al. 2019).
Caenorhabditis elegans and related nematodes are wellknown for their similarity in form (Haag et al. 2007;Stevens et al. 2019), despite the long times since species separated from one another (Cutter 2008). Indeed, the embryonic cell lineage of different Caenorhabditis species is outwardly preserved to an astonishing degree (Zhao et al. 2008;Memar et al. 2018), albeit with some key differences in timing of developmental milestones (Levin et al. 2012). Upon hatching at the end of embryogenesis, C. elegans individuals comprise 558 cells, then grow to become adult hermaphrodites with 959 somatic cells total (Sulston et al. 1983); sex-biased gene expression is most prominent near reproductive maturity (final L4 larval stage and adult; Reinke et al. 2004;Thomas et al. 2012). The similarity of form across species, however, masks substantial evolution of genetic interactions as revealed by pronounced embryonic mortality in interspecies hybrids (Baird et al. 1992;Baird and Seibert 2013;Bundus et al. 2015). Spindle movement in the first cell division of embryos has diverged across species in a manner consistent with developmental system drift (Riche et al. 2013;Farhadifar et al. 2015;Valfort et al. 2018). Experiments also demonstrate that morphological stasis and even conserved expression patterns mask profound cisregulatory divergence of conserved coding genes (Barriere et al. 2012; Barrière and Ruvinsky 2014;Verster et al. 2014;Barkoulas et al. 2016). Molecular evolution analysis of genes expressed differentially across postembryonic development from microarray data reported faster evolution of coding sequences associated with the onset of reproductive maturity, but little directional effect of timing in embryogenesis (Cutter and Ward 2005). These collective observations motivate characterization of molecular evolution for gene expression dynamics across the entirety of ontogeny to explain the paradox of hybrid dysfunction despite morphological conservation.
Here, we test for evo-devo patterns of molecular evolution by characterizing coexpression modules of the C. elegans transcriptome over the full course of development, using functional principle components analysis (FPCA) on a time series of 30 points from early embryo to adults (Gerstein et al. 2010(Gerstein et al. , 2014. By coarse grain modeling the functional form of these ontogenetic trajectories of gene expression, we capture quantitative metrics that reveal how developmental dynamics relate to rates of molecular evolution. We find predictable trends of molecular evolution across ontogeny that are most conspicuous when analyzing ontogenetically coexpressed sets of genes, with implications for the genetics of postzygotic reproductive isolation in the speciation process.

PROCESSING
We obtained RNAseq transcriptome sequences as SAM format files (mapped to C. elegans reference genome version WS248) from the public modENCODE data repository (http://data. modencode.org) for the C. elegans developmental time series for early embryos, each larval stage, and young adult hermaphrodites (Table S1; Gerstein et al. 2010Gerstein et al. , 2014. We quantified expression for each gene using featureCounts (Liao et al. 2014), based on exon annotations of WS248 (transposable element and pseudogene annotations were excluded; exons corresponding to all alternative splice forms of a given gene contributed to expression quantification for that gene). We then normalized expression counts following the log-counts per million method of (Law et al. 2014). Embryonic transcriptomes included a single biological replicate per timepoint, whereas larval and young adult transcriptomes included duplicates with no reported batch effects (Gerstein et al. 2014); given the high correlation between duplicates (r > 0.95), we used the average log-normalized expression for each larval and adult timepoint for subsequent analyses. We restricted our analyses to those 19,711 genes with an expression level ࣙ1 read count per million (cpm) in at least one timepoint (Robinson et al. 2010). We recalculated the log-cpm values for this set of 19,711 genes to account for the slight change in library sizes after the filtering step.

QUANTIFICATION OF MODULES
To uncover and identify distinct sets of gene expression patterns over time across the 19,711 genes in the C. elegans transcriptome (coexpression "modules"), we performed a FPCA. FPCA is appropriate for longitudinal datasets that may be sampled irregularly, with dense or sparse sampling, or with noisy values (Yao et al. 2005;Hall et al. 2006;Madrigal et al. 2018), as for this transcriptome time series with just a single replicate per timepoint. First, we applied FPCA to the log-normalized gene expression data, using the "FPCA" function in the R package fdapace, observing the first two components to cumulatively explain ß92% of the total variation. We then used each gene's FPC scores of the first two components as input for the clustering algorithm, implemented through the "FClust" function in R that uses a Gaussian Mixture Model approach based on EM-Cluster (http://cran.r-project.org/package=EMCluster). We determined the optimal number of coexpression clusters or modules in our analysis to be k = 14, based on minimizing the Bayesian information criterion (BIC) value. We varied k between 2 and 20 and observed minimum BIC = 11.4 occurring between k = 12 and k = 14. Visual inspection of expression trends affirmed the biological relevance of choosing k = 14 coexpression modules to represent the variation in expression profiles in the C. elegans transcriptome time series. Based on the outputs of the clustering algorithm, we assigned each gene to the module for which the gene has the highest membership probability.
To summarize quantitatively the dominant trends in expression over time for each coexpression module, we fit orthogonal cubic polynomial functions with time to lognormalized expression values, rescaled using the "poly rescale" function in the polypoly R package (https://cran.r-project. org/package=polypoly). To relate the coexpression modules to each other, we then performed hierarchical clustering on the module-wise cubic polynomial regression coefficients. The goal with this functional analysis was not statistical testing of model complexity (e.g., linear vs. quadratic), but to use the parameter values of a flexible functional form as a quantitative metric of expression profile shape that can be compared across coexpression modules and across genes. The parameters extracted from the cubic fits summarize the overall expression level (α), increasing or decreasing trends in expression across development (β 1 ), the degree of concave versus convex expression dynamics over ontogeny (β 2 ), and how S-shaped are the expression dynamics (β 3 ). To obtain a finer-grained view of the temporal trends, we also performed a gene-level analysis, in which we fit an orthogonal cubic polynomial to each individual gene expression profile and extracted the corresponding parameters for analysis.
Finally, we classified genes according to expression pattern in the simplest of ways, by grouping genes according to which timepoint they showed peak expression across the time series.

ENRICHMENT ANALYSIS
To investigate trends of genomic organization for each coexpression module, we used contingency tables and χ 2 -test statistics to test for nonrandom distributions of genes for each of the 14 modules across each of the six chromosomes in the genome. To achieve this, we arranged the data in 84 individual two-way contingency tables, so that we could obtain χ 2 -test statistics on 1 degree of freedom to test for an association within each modulechromosome combination. We further investigated trends of genomic organization by looking within chromosomes, at enrichment within the arm and center regions of each chromosome, with arm versus center domains defined by recombination rate breakpoint positions given by (Rockman and Kruglyak 2009). MtDNA genes were excluded for these analyses, and P-values were adjusted for multiple testing using the Holm-Bonferroni method.
We conducted gene ontology (GO) and phenotype enrichment analysis (PEA) tests using the list of genes in each coexpression module as input into the WormBase Enrichment Analysis Suite (Angeles-Albores et al. 2016, obtaining Benjamini-Hochberg false discovery rate corrected P-values (Q-values) for statistical significance. By also cross-referencing genes with the analysis of Tu et al. (2015), we used their determination of operon identity and calculations of coding sequence divergence between orthologs of C. elegans and Caenorhabditis briggsae to quantify molecular evolution of protein sequence as K A , the rate of nonsynonymous site substitution per nonsynonymous site. Because of the saturated synonymous-site substitution rates (K S ), we focus on K A as a metric of protein molecular evolution rather than K A /K S (Cutter and Ward 2005). Finally, we cross-referenced the genes in the transcriptome time series with genes identified in other studies of C. elegans transcriptomes to have (1) sex-neutral, oogenic, or spermatogenic enrichment of expression from dissected gonads (Ortiz et al. 2014) or (2) maternally deposited, embryonic, transient embryonic, or degradation patterns of expression in early embryos up to 186 min after first cleavage (approximately timepoints 6-7 in our main analysis; Baugh et al. 2003). Gene enrichments in these cross-referenced sets of genes were determined with contingency table analysis, as for chromosome enrichment tests.

STEREOTYPICAL TRANSCRIPTOMIC PATTERNS
We used FPCA to define 14 coexpression modules that describe clusters of the 19,711 genes that get expressed across 30 timepoints from early embryo through young adult stages of hermaphrodite C. elegans ( Fig. 1), based on ModENCODE transcriptome profiling data (Table S1; Gerstein et al. 2010Gerstein et al. , 2014. To obtain quantitative metrics describing the shape of each coexpression module, we then fit a cubic function to the gene expression profiles of each of the 14 developmental time series (Fig. 1). The parameter values extracted from the cubic fits capture the overall expression level (α), increasing or decreasing trends in expression across development (β 1 ), the degree of concave versus convex expression dynamics over ontogeny (β 2 ), and how S-shaped are the expression dynamics (β 3 ). When we fit the cubic functional form to each gene individually (Figs. S1 and S2), discriminant analysis demonstrated that values for these four parameters could correctly determine the coexpression module identity for 92.9% of genes, indicating that parameters from gene-wise cubic function fits capture well the key distinguishing features of ontogenetic expression dynamics.

GENE EXPRESSION MODULES
Upon defining these ontogenetically dynamic gene expression modules, we investigated their distinguishing features in terms of genomic organization, function, and molecular evolution. Interestingly, genes from related expression profiles showed distinctive chromosome biases. Five modules were enriched on the X-chromosome, all of which corresponded to those with peak expression in late embryogenesis (M1, M2, M5, M7, and M8; Fig. 2A). This genomic nonrandomness to expression covariation in ontogeny suggests that chromatin regulation might influence the fitness effects of gene translocations in predictable ways, given the distinctive chromatin dynamics of the X-chromosome (Kelly et al. 2002). The early-embryogenesis module M4 showed the greatest chromosomal bias of any module, being more than twofold enriched on Chromosome II and tending to be Fold enrichment on arms underrepresented on all other chromosomes ( Fig. 2A). Genes from those modules with peak postembryonic expression, by contrast, showed enrichment on chromosomes IV and V (M9, M10, M11, and M14), and highly expressed "constitutive" modules showed enrichment on chromosomes I and III (M3, M6, and M12; Fig. 2A). When we looked within chromosomes at their recombination domain structure of arms versus centers (Rockman and Kruglyak 2009), we found genes for most modules to be present in their expected proportions given chromosomal gene densities (Fig. 2B). However, genes in M4 were significantly enriched in arms on Chromosome II, the chromosome where M4 genes are exception-ally abundant, and also were elevated on arms relative to centers of other chromosomes (Fig. 2B). Postembryonic modules M9 and M10, as well as the low-expression "constitutive" module M13, also showed significant enrichment on arms of several chromosomes (Fig. 2B). By contrast, the highly expressed "constitutive" module M12 was under-enriched on the arms of Chromosomes II and V (Fig. 2B).
At a more local scale of genome organization, we found that three modules were hyper-enriched for membership in operons (Fig. 3). Each of the highly expressed "constitutive" modules M3, M6, and M12 contain >40% of their genes in operons (Fig. 3), compared to just 20.5% of coding genes overall  occurring in operons. Of the remaining modules, only M13 (the fourth "constitutive" module) and M8 had >10% operonic genes, and <4% of genes occurred in operons for all four modules with postembryonic peak expression (M9, M10, M11, and M14; Fig. 3).

DISTINCTIVE FUNCTIONAL PROPERTIES OF ONTOGENETIC GENE EXPRESSION MODULES
We cross-referenced the gene composition of coexpression modules with those gene sets identified by Ortiz et al. (2014) to have sex-neutral, oogenic, or spermatogenic enrichment of expression. These three expression categories had been inferred from differential expression of dissected gonads that had either active oocyte-only or sperm-only development (Ortiz et al. 2014). The early-embryogenesis module M4 showed extreme enrichment for oogenic genes (57%), with the next most enriched modules for oogenic genes being "constitutive" modules M3 (24%) and M12 (23%; Fig. 3, yellow portion of bar plots). By contrast, the four modules with peak expression in postembryonic stages contained almost no oogenic genes, instead being exceptionally enriched for spermatogenic genes (75% to 92%; Fig. 3; M9, M10, M11, and M14). As expected of genes with sperm-related function (Reinke and Cutter 2009), operons were rarest in these modules (M9-M11 and M14; Fig. 3). Eight of the 14 modules overall were composed of >50% sex-neutral genes, including all five of those with peak expression late in embryogenesis, although three of the "constitutive" modules contained the highest abundance of them (71% to 82%; M3, M6, and M12; Fig. 3, gray portion of bar plots).
We also tested for enrichment of maternally deposited and transient zygotically expressed genes for the subset of 6782 genes that we could cross-reference with the early embryo (up to 186 min after first cleavage) transcriptome analysis of Baugh et al. (2003). We found that module M4 was up to ninefold enriched for genes identified as embryonic transient and threefold underenriched for maternally deposited and degraded genes (χ 2 -test, df = 1, all Bonferroni-corrected P ࣘ 0.002; Fig. S3). By contrast, "constitutive" modules (M3, M6, M12, and M13) showed up to twofold enrichment for maternally deposited genes and under-enrichment for embryonic transient genes (Fig. S3). All other coexpression modules showed no enrichment or significant under-enrichment for maternal and transient gene categories from the early-expression dataset of Baugh et al. (2003), with the lone exception of 2.2-fold enrichment in module M10 for embryonic transient genes (χ 2 -test, df = 1, Bonferroni-corrected P = 0.047; Fig. S3).
GO and PEA further showed that the highly expressed "constitutive" modules are enriched for basic cellular processes, like ribosomal and mitochondrial activity, embryonic defects, and chromosome segregation (M3, M6, and M12; Fig. 2C; Tables S2 and S3). By contrast, the modules showing increasing expression across embryogenesis and later stages tended to have significant

. (A) Median rate of protein evolution (nonsynonymous site substitution, K A ± interquartile range for orthologs of C. elegans and C. briggsae) for genes within each coexpression module as a function of the proportion of module genes with the sex-neutral expression category, as defined by Ortiz et al. (2014). (B) Rates of protein evolution (K A , log-scale; zero values plotted at K A = 0.001) plotted as a function of the α parameter (overall expression level) from the polynomial fit to the expression time series. Per-gene values shown as small squares; module median values shown as large circles. Module membership color is the same in (A) and (B).
enrichment of developmental GO and behavioral PEA terms, such as regulation of cell shape, neural activity, linker cell migration, and animal motility (Fig. 2C, purple-and green-shaded modules; Tables S2 and S3). The most overrepresented terms across all coexpression modules were found in early-embryogenesis module M4, involving 21-fold enrichment of genes associated with protein heterodimerization activity (GO) and 19-fold enrichment of early embryonic chromatid segregation (PEA; Fig. 2C; Tables S2 and S3). Among the 105 genes in the C. elegans genome annotated with the protein heterodimerization activity GO term (GO:0046982), 69% correspond to histones, with most of the others composed of TATA-box binding proteins, transcription factors, and CENP centromere-related proteins; M4 alone has 31 histones.

PEAK EXPRESSION IN EARLY EMBRYOGENESIS AND ADULTHOOD
The coexpression modules differ significantly in the rate at which their gene members evolve (n = 12,628 genes with both expression and divergence information; Fig. 3). Surprisingly, we found that it is those genes in M4 with peak expression in early embryogenesis that comprise the most rapidly evolving set of genes (median K A = 0.43; Figs. 3 and 4A). As another sign of rapid evolution of genes in M4, this module contained the lowest percentage of genes with identifiable orthologs between C. elegans and C. briggsae (28% vs. 64% genome-wide and 92% ortholog pairs identified for M6; Fig. 3). The saturated synonymous-site divergence for C. elegans orthologs precludes robust tests of adap-tive evolution (median K S = 2.33), although a large fraction (83%) of nonsynonymous substitutions are estimated to have been driven by positive selection in other Caenorhabditis (Galtier 2016).
Curiously, however, module M4 has the highest fraction of genes with near-zero values of K A (9.3% vs. 0.5% of genes overall; Fig. S4). This observation indicates exceptionally strong selective constraint on this subset of genes within M4: this subset is composed entirely of histones that are well-known to evolve slowly, and yet are still overrepresented in M4. These 14 histone genes, plus another subgroup of 15 genes with K A < 0.02 (14 of which also are histones), imply that about 20% of M4's "early embryogenesis" genes encode histones, genes that evolve extraordinarily slowly. Nevertheless, the remaining 80% evolve so remarkably fast that they confer on M4 the highest average K A of any module ( Fig. 3; Fig. S4). The only other module with substantial abundance of a group of exceptionally conserved coding sequences is "constitutive" module M6 (4.9% of genes with near-zero K A ), which also shows the strongest sequence conservation on average irrespective of this exceptional subset of genes. Module M6 has a median K A = 0.033, implying that only about 3% of nonsynonymous sites in codons have changed between C. elegans and C. briggsae since their common ancestor, estimated at 113 million generations ago (Cutter 2008).
The four modules with peak postembryonic expression and enrichment with spermatogenic function also evolve up to twice as rapidly as the genome-wide median K A = 0.121 (median K A for "postembryonic" modules M9, M10, M11, and M14 from 0.185 to 0.261; Fig. 3). Overall, coexpression modules with lower incidence of sex-neutral genes exhibit more rapid sequence divergence (Fig. 4A). As expected from previous analyses of C. elegans molecular evolution ), genes in those modules with higher average expression tend to evolve more slowly and show more sequence conservation (Fig. 4B); this manifests as unusually low divergence at synonymous sites only for M6 (median K S = 1.1 vs. genome-wide median K S = 2.33). An outlier to the K A -expression relationship, however, is module M4: these early-embryogenesis genes show fast molecular evolution despite relatively high transcript levels ( Fig. 4A and 4B). Our gene-wise analysis of coarse-grained cubic function parameters corroborates these findings (Fig. S5), with the four α and β parameters being capable of explaining 11.5% of the variability in K A across genes (ANOVA F 4,12623 = 408.5, P < 0.0001; log-transformed K A ).
As a complement to the ontogenetic expression module analysis, we quantified rates of molecular evolution for a simpler partitioning of genes, by grouping genes according to the timepoint with highest observed expression (Fig. 5A). Average rates of protein sequence evolution were fastest for those genes with peak expression in the final L4 larval stage, young adults, and in early embryos (Fig. 5B and 5C), corroborating the findings from the ontogenetic coexpression modules. Among those genes with peak expression in embryogenesis, genes with later peak expression tended to evolve more slowly (Fig. 5B and 5C), recapitulating the contrast of K A for "early embryogenesis" module M4 versus "late embryogenesis" coexpression modules (M1, M2, M5, M7, and M8).
Interestingly, however, genes with peak expression at timepoints 7-9 (180-240 min) exhibit a dip in sequence divergence (Fig. 5C), suggesting a trend of greater sequence conservation near ventral enclosure in embryogenesis reminiscent of "hourglass" patterns of expression divergence between species (Levin et al. 2012). Caveats to concluding that this observation strongly supports an "hourglass" model of sequence evolution include the facts that timepoints 7-9 exhibit among the fewest genes with peak expression (from 115 genes in timepoint 7 to 366 in timepoint 9) and the clustering analysis revealed no distinct coexpression module exhibiting maximal expression in this developmental interval. Moreover, genes in the highly conserved and highly expressed "constitutive" modules M6 and M12 predominate among the genes with nominally peak expression between timepoints 7-9 (Fig. 5A), with histone genes especially enriched in timepoint 8. To test the sensitivity of these results to gene sample size and composition, we calculated the "transcriptome divergence index" (TDI; Quint et al. 2012), a metric of average sequence evolution for all 12,628 genes with K A values weighted by their expression level at a given timepoint (Fig. S6). Inspection of TDI over the time series shows TDI minimized at timepoint 7 (Fig. S6), suggesting that ventral enclosure may indeed represent a crucial developmental stage in terms of both conservation of the expression and sequence of genes (Levin et al. 2012). The TDI metric also is especially low at timepoint 1, perhaps consistent with the "early conservation" model, although these earliest transcripts likely are primarily maternal in origin. TDI has a maximal value in adulthood (Fig. S6), also showing high values in early embryonic developmental timepoints 2-6, consistent with our observations for ontogenetic coexpression modules and the peak expression analysis.

Discussion
Understanding the interplay between genes and phenotypes in the evolution of development must accommodates how molecular evolution can associate with both phenotypic divergence and phenotypic conservation. The conservation of phenotype, including developmentally static phenotypes like Caenorhabditis embryo-genesis (Zhao et al. 2008;Memar et al. 2018), need not imply conservation of the genetic pathways that produce them (Kalinka and Tomancak 2012). This idea is the essence of developmental system drift (DSD; True and Haag 2001), and a key question is to what extent are different stages of development more or less susceptible to molecular divergence and DSD in a predictable way. Temporal trajectories of gene coexpression provide a means of interrogating this question to determine what are the rules in the molecular evolution of development.

MOLECULAR EVOLUTION OF DEVELOPMENT
We observe the fastest coding sequence evolution for genes with peak expression early in embryogenesis (coexpression module M4), suggesting that the developmental stage in C. elegans near gastrulation may be especially prone to DSD. This rapid evolution also occurs in terms of gene turnover, with identifiable orthologs being underrepresented among the members of module M4. Why do genes with peak expression in early embryogenesis evolve so fast? This rapid evolution occurs despite an overrepresentation of histone proteins within this coexpression module that have exceptionally slow sequence evolution. Among the genes with rapid evolution, weaker purifying selection on maternally provisioned transcripts could provide one plausible basis for faster evolution of early embryogenesis genes (Cruickshank and Wade 2008;Dapper and Wade 2016). The predominantly selfing mode of C. elegans and C. briggsae with populations composed of >99% hermaphrodites, however, would restrict the evolutionary timeframe of such weaker selection to their ancestral gonochoristic lineages. Moreover, we actually observed under-enrichment of maternally deposited genes in module M4. Work in Drosophila also implicates especially fast evolution of genes expressed in early embryogenesis (Mensch et al. 2013;Coronado-Zamora et al. 2019). Thus, exceptionally rapid evolution of a subset of genes with stereotypical expression in early embryogenesis might suggest a general rule in the molecular evolution of development.
Two other factors could also contribute to especially rapid evolution of the module of genes with peak early embryonic expression. First, a greater incidence of positive selection could contribute to their rapid evolution, perhaps resulting from parentoffspring conflict or protein-protein coevolution yielding DSD between gene partners (True and Haag 2001;Clark et al. 2009;de Juan et al. 2013). Second, the nonrandom genomic organization of genes with shared ontogenetic expression could lead genome structural changes to bias the developmental stages affected. Genes in M4 are over-represented on autosomal arms (64% of M4 genes on arms vs. 37% genome average), genomic regions known to exhibit more rearrangements, indels, gene turnover, and to have genes with greater divergence in genome comparisons between species Ross et al. 2011). Consequently, the genomic organization of genes with shared profiles of expression may make them experience predictable molecular evolution that depends less on their ontogenetic properties and more on the details of a species' genome architecture.
The cell lineage in early embryos is extremely consistent across different Caenorhabditis species (Zhao et al. 2008;Memar et al. 2018). This observation might lead one to predict that embryogenesis is extremely canalized and unusually robust to environmental or genetic perturbation. This idea is dashed by one kind of genetic perturbation: formation of interspecies hybrids. Consistent with divergence of genes and genetic interactions with important biological consequences, embryonic arrest near gastrulation represents the usual fate of interspecies hybrid zygotes (Baird et al. 1992;Baird and Seibert 2013;Dey et al. 2014;Bundus et al. 2015). Thus, the molecular evolutionary consequences of the nonrandom collection of genes with peak expression early in embryogenesis might lead them to be predisposed to DSD and to contribute to hybrid inviability in the speciation process.
Our observation of more rapid coding sequence evolution for genes with peak expression early in embryogenesis clearly conflicts with the "early conservation" model for the evolution of development (Kalinka and Tomancak 2012). Moreover, it has been argued that "conservation at the end of embryogenesis is not endorsed by any model" (Kalinka and Tomancak 2012), and yet the trend we observe shows just that, based on analyses of both coexpression modules and peak gene expression patterns. Our analysis of peak expression timing and an expression-weighted divergence index (Fig. 5C, Fig. S6), however, hint at a phase of mid-embryonic development with strongest constraint (timepoint 7, at 180 min), suggestive of the "hourglass model" that has been endorsed in Caenorhabditis from analysis of expression divergence between species and among mutation-accumulation strains (Levin et al. 2012;Levin et al. 2016;Zalts and Yanai 2017). The prevalence of genes from highly expressed "constitutive" coexpression modules during the "waist" of the hourglass, however, makes it challenging to understand what is distinctive about the genes with expression at this point midway through embryogenesis. Possible factors could involve the abundance of histone genes to define it as a key developmental phase for chromatin remodeling; alternately, this timepoint might simply represent a lull in stage-specific expression with the "constitutive" genes inevitably dominating the expression composition and, consequently, the signal of high sequence conservation. Regardless, the unusually rapid evolution of early embryogenesis represents, in our view, the pattern of molecular evolution requiring special explanation and attention. Expression of genes in this early portion of embryogenesis also shows more sensitivity to perturbation by mutations (Zalts and Yanai 2017). Developmental stages associated with genes having faster rates of molecular evolution ought to be predisposed to more extensive DSD, and impose detectable and predictable phenotypic rules. Specifically, each evolutionary substitution has the potential to contribute to the formation of Dobzhansky-Muller incompatibilities between species (Orr 1995). Consequently, stages prone to DSD through the accumulation of sequence divergence may reveal themselves by manifesting as being most sensitive to hybrid dysfunction in crosses between diverged species (Bundus et al. 2015).
To date, analyses of molecular evolution have primarily revealed gametic and postembryonic stages to have fastest rates of evolution in animals and plants (Cutter and Ward 2005;Ellegren and Parsch 2007;Arunkumar et al. 2013;Piasecka et al. 2013;Liu and Robinson-Rechavi 2018a). Our findings corroborate this result, showing that coexpression modules with peaks in adulthood that are enriched for sperm-related gene function evolve especially rapidly. In the context of C. elegans biology, where self-fertilizing hermaphrodites evolved from an outbreeding male-female species, both sexual selection pressures in the ancestral species and relaxed sexual selection in the modern day likely contribute to the rapid evolution of sperm genes (Cutter 2015). Weaker selection efficacy on genes with sex-limited expression also could have influenced the molecular evolution of such genes in the gonochoristic ancestor, prior to the origin of predominant selfing (Dapper and Wade 2016).
Tissue-specific genes have faster coding sequence evolution in mammals (Liao and Zhang 2006b), and temporal specificity might lead to similar consequences. In our analysis, we can think of genes with extreme values of β 1 , β 2 , and β 3 as having greater temporal specificity of expression and therefore mutations to them having lower potential for pleiotropic effects; however, we observe relatively weak individual associations of these metrics with K A (Fig. S2). Alternately, we can think of mutations to genes with lower α (i.e., a profile of lower overall expression across ontogeny) as having lower potential for pleiotropic effects due to the rarity of gene products, and indeed genes with lower α evolve faster. Genes in module M4, with peak expression during early embryogenesis, represent important outliers to this trend, as they tend to have both fast sequence evolution and moderately high values of α (Fig. S2). In yeast, however, factors like translational robustness appear to be especially important in mediating the correspondence between expression level and rate of coding sequence evolution (Drummond et al. 2005), although it remains unclear how general this explanation holds across eukaryotes.

DIVERGENCE IN SEQUENCE
Our analysis puts to the side the question of the relative importance of regulatory versus coding changes in adaptation and morphological divergence (Wray 2007;Carroll 2008;Stern and Orgogozo 2008). Instead, we focus on coding sequence evolution to ask what features of ontogeny predict differences in the rates of evolution across genes. However, observing differences in rates of coding sequence evolution among distinct coexpression modules implies a mapping between the nature of regulatory control and protein evolution. Previous studies of diverse animals show a weakly positive correlation between molecular evolutionary rates of coding sequences and regulatory regions (Jordan et al. 2005;Lemos et al. 2005;Liao and Zhang 2006a), including for Caenorhabditis (Castillo- Davis et al. 2004;Mark et al. 2019). Both coding sequences and gene expression are subject to purifying selection in C. elegans (Denver et al. 2005;Cutter et al. 2009), but future genome-scale analyses that couple ontogenetic transcriptome profiles with coding and regulatory sequence evolution are required to more fully determine the magnitude of interdependence of these modes of molecular evolution across development. Establishing such links would be valuable in integrating "hourglass" patterns of expression divergence and sequence evolution.
Evo-devo generally focuses on how the relative strength of constraint, which manifests as purifying selection and sequence conservation, could shape temporal ontogenetic patterns of evolution (Kalinka and Tomancak 2012). And yet, microevolutionary studies demonstrate that a majority of amino acid substitutions in protein coding sequence evolution often accumulate as a result of adaptive evolution in many animals, especially those with large effective population sizes like C. elegans' congeners (Galtier 2016). Genes biased toward expression in adults and gametes are known to show elevated rates of adaptive evolution (Swanson and Vacquier 2002;Arunkumar et al. 2013;Liu and Robinson-Rechavi 2018a;Coronado-Zamora et al. 2019), but the extent of embryonic adaptive evolution and its implications are less well established. In Drosophila, rapidly evolving proteins involved in chromatin regulation and genomic conflict are known to play important roles in creating postzygotic reproductive barriers between species during early development (Presgraves 2010;Maheshwari and Barbash 2011;Cooper et al. 2018). Evolutionary conflict over allelic expression in early embryos also can drive rapid sequence evolution (Haig 1997), as can less effective selection on genes expressed by one sex (Cruickshank and Wade 2008;Dapper and Wade 2016). Presuming a substantial contribution of adaptive divergence to coding sequence evolution in Caenorhabditis (Galtier 2016), our findings support the possibility that adaptive evolution, in addition to differences in constraint, contributes importantly to ontogenetic patterns in the molecular evolution of development (Kalinka and Tomancak 2012;Liu and Robinson-Rechavi 2018a;Coronado-Zamora et al. 2019). Rapid evolution of genes expressed at distinct times in embryogenesis, whether due to adaptation or weaker constraint, should lead to predictable developmental manifestations in the form of hybrid dysfunction in the speciation process.

AUTHOR CONTRIBUTIONS
A.D.C., R.G., and L.S. conceived and designed the study; A.D.C., R.G., S.M., and W.W. analyzed data; A.D.C. drafted the initial version of the manuscript and all authors contributed to the final manuscript. This study was supported by a Discovery Grants from the Natural Sciences and Engineering Research Council (NSERC) of Canada to A.D.C. and L.S., and by an NSERC Undergraduate Summer Research Award to R.G.

ACKNOWLEDGMENTS
We are grateful to the ModENCODE consortium for providing publicly available expression data for C. elegans. A.D.C. is supported with funds from the Natural Sciences and Engineering Research Council (NSERC) of Canada; L.S. is supported with funds from NSERC and the Canadian Institutes of Health Research (CIHR). R.G. was supported by ban NSERC Undergraduate Summer Research Award.

DATA ARCHIVING
Expression data collected by ModENCODE are publicly available at http://data.modencode.org.

Supporting Information
Additional supporting information may be found online in the Supporting Information section at the end of the article. Supplementary Table S1. List of ModENCODE C. elegans transcriptome datasets used in analysis of differential expression across development. Supplementary Figure S1. Distributions of polynomial parameter values from expression profile function fits for each gene (α = overall expression level, β 1 = linear change over time, β 2 = quadratic curvature, β 3 = cubic S-shape to expression profile over development). Supplementary Figure S2. Biplots of per-gene polynomial fit parameter values, colored by coexpression module (as in Fig. 4 main text), show clustering of genes with similar parameter values (α = overall expression level, β 1 = linear change over time, β 2 = quadratic curvature, β 3 = cubic S-shape to expression profile over development). Supplementary Figure S3. Enrichment of early embryonic expression categories defined by Baugh et al. (2003) among coexpression modules. Note that not all categories are mutually exclusive. Supplementary Figure S4. Cumulative distribution of non-synonymous site substitutions (log-transformed K A ) for each coexpression module illustrates the distinct incidence of extremely low K A values for M4 and M6 (top panel), indicating the subset of genes with little protein sequence divergence between C. elegans and C. briggsae. Supplementary Figure S5. Rates of protein evolution (K A , log-scale) plotted as a function of the polynomial fit parameter values to the expression time series (α = overall expression level, β 1 = linear change over time, β 2 = quadratic curvature, β 3 = cubic S-shape to expression profile over development). Supplementary Figure S6. Transcriptome divergence index (TDI and TDI * ) shows lowest values at timepoints 7 (180 min) and at timepoint 1. The adult stage (timepoint 30) shows one of the highest values. Supplementary Table S2. Summary of gene ontology (GO) term enrichment for each coexpression module. Supplementary Table S3. Summary of phenotype enrichment analysis (PEA) terms for each coexpression module.