Mating strategy predicts gene presence/absence patterns in a genus of simultaneously hermaphroditic flatworms

Abstract Gene repertoire turnover is a characteristic of genome evolution. However, we lack well‐replicated analyses of presence/absence patterns associated with different selection contexts. Here, we study ∼100 transcriptome assemblies across Macrostomum, a genus of simultaneously hermaphroditic flatworms exhibiting multiple convergent shifts in mating strategy and associated reproductive morphologies. Many species mate reciprocally, with partners donating and receiving sperm at the same time. Other species convergently evolved to mate by hypodermic injection of sperm into the partner. We find that for orthologous transcripts annotated as expressed in the body region containing the testes, sequences from hypodermically inseminating species diverge more rapidly from the model species, Macrostomum lignano, and have a lower probability of being observed in other species. For other annotation categories, simpler models with a constant rate of similarity decay with increasing genetic distance from M. lignano match the observed patterns well. Thus, faster rates of sequence evolution for hypodermically inseminating species in testis‐region genes result in higher rates of homology detection failure, yielding a signal of rapid evolution in sequence presence/absence patterns. Our results highlight the utility of considering appropriate null models for unobserved genes, as well as associating patterns of gene presence/absence with replicated evolutionary events in a phylogenetic context.

Gene repertoire turnover characterizes the evolution of genomes throughout the tree of life (Albalat and Cañestro 2016;Fernández and Gabaldón 2020). A common observation at many taxonomic scales is that some lineages have no detectable homologs of genes present in other, related lineages (Zhaxybayeva et al. 2007;Vakirlis et al. 2020;Weisman et al. 2020). Often, the inference is that such patterns are the result of adaptive gene gain and/or loss. For example, the study of the origins of novel genes has received a lot of attention (e.g., Neme et al. 2017), and gene loss is often adaptive (Albalat and Cañestro 2016). However, a well-understood, but less exotic, explanation is that "missing" genes have simply evolved, at a constant rate, to the point of being unrecognizable by sequence-similarity approaches to homology detection, so-called "homology detection failure" (Zhaxy-bayeva et al. 2007;Vakirlis et al. 2020;Weisman et al. 2020). As many as 20%-55% of so-called "lineage-specific" genes may be due to homology detection failure in outgroups (e.g., Zhaxybayeva et al. 2007;Vakirlis et al. 2020;Weisman et al. 2020). For genes with overall rapid rates of sequence evolution (e.g., reproduction-related genes, see below), homology detection failure may be more common and failure to consider it as a hypothesis will lead to errors of inference. Recent efforts have aimed to develop methods to test whether an observation of gene absence is expected on the basis of sequence similarity at different evolutionary distances from a focal species, when assuming a constant rate of sequence evolution (Weisman et al. 2020). Deviations from this "null" model can then be due to a variety of factors, including changes in the rate of evolution due to changes in selective pressures on existing genes or bona fide gene turnover (gain/loss; Weisman et al. 2020).

REPRODUCTION-RELATED GENES
Reproduction-related genes are typically found to be evolving at higher rates compared to the genomic background (Swanson and Vacquier 2002;Wilburn and Swanson 2016). A commonly proposed explanation for these observations is diversifying sexual selection (Swanson and Vacquier 2002;Wade 2016, 2020; for reviews, see also, e.g., Vacquier and Swanson 2011;Wilburn and Swanson 2016). However, an alternative explanation is that relaxed selection is the driving force behind many of these patterns Wade 2016, 2020). Indeed, there are several reasons why reproduction-related genes might be expected to be under relaxed selection. First, in separate-sexed organisms, reproduction-related genes are often sex biased or even sex limited in expression (Dapper and Wade 2020). In the extreme case of sex-limited expression, the efficiency of selection is half that of a gene with unbiased expression, because it is only exposed to selection when residing in one sex (Dapper and Wade 2020). However, it is unclear to what extent sex biases in expression could operate under other modes of reproduction (e.g., simultaneous hermaphroditism, where individuals are both male and female at the same time; Wiberg et al. 2021). Second, also in separate-sexed organisms, reproduction-related genes are often sexually antagonistic, where the fitness effects differ, and are sometimes in opposition, in the two sexes. In this case, the net selection is the average of that experienced in either sex. All else being equal, sexually antagonistic loci therefore necessarily experience lower net directional selection (with a similar effect to relaxed selection) compared to the rest of the genome. Finally, regardless of the mode of reproduction (i.e., hermaphroditism or gonochorism), genes involved in sperm competition or cryptic female choice experience weaker selection because the genetic variation exposed to selection within females is typically only a fraction of that present in the population at large, especially in mating systems with low female remating rates Wade 2016, 2020).
Be it due to relaxed selection or positive selection, the faster rates of evolution of reproduction-related genes would also predict higher rates of homology detection failure for such genes. A few studies have found evidence that, in addition to rapid rates of sequence evolution, turnover of gene repertoires is a characteristic feature of reproduction-related genes (Zhang et al. 2004;Cutter and Ward 2005;Ellegren and Parsch 2007;Hahn et al. 2007;Ahmed-Braimah et al. 2017). In particular, genes expressed in male reproductive tissues (e.g., the testes, the prostate, or accessory glands) often show high rates of turnover (Zhang et al. 2004;Cutter and Ward 2005;Ellegren and Parsch 2007;Hahn et al. 2007;Ahmed-Braimah et al. 2017). In contrast, some studies, for example, a recent study of 11 passerine species, found no evidence for higher turnover rates for male-biased reproductionrelated genes (Rowe et al. 2020). However, compared to studies of rates of molecular sequence evolution, relatively little work has focused on patterns of presence/absence of reproduction-related genes in different lineages, especially in systems where a link can be drawn to consistent differences in the sexual selection context.
Although there is considerable taxonomic bias in the above studies, they suggest that, in general, reproduction-related genes may not only be rapidly evolving in sequence but also show higher rates of missing genes than other categories of genes. However, this pattern is not strictly universal, raising the question of why such patterns are observed in some systems but not others. Such a pattern is not easily explained as a simple consequence of the higher rates of sequence evolution due to relaxed selection at reproduction-related genes. Because sexual selection is thought to be an important force affecting the evolution of reproduction-related genes, an obvious explanation is that differences in the sexual selection context between systems might account for the observed differences in patterns of gain and loss. This is especially true because changes in copy number may be an important component in the evolution of sexual dimorphism; redundant copies of genes are more free to evolve sex-specific expression patterns and functions, partially resolving intralocus sexual conflict (Ellegren and Parsch 2007). For example, sexbiased expression, allowing sex-specific phenotypes, seems to be more common at recently duplicated genes (Wyman et al. 2011). Moreover, turnover in sex-biased gene expression is associated with differences in the sexual selection context in birds (Harrison et al. 2015). Thus, both a reduced efficiency of selection at reproduction-related genes as well as changes in the intensity of selection at particular loci due to changing sexual selection contexts might result in the loss of genes from a species' repertoire. To date, few studies have associated such changes in gene repertoires with changes in the sexual selection context (but see Thomas et al. 2012;Fierst et al. 2015;Yin et al. 2018, reviewed in Cutter 2019, for changes in repertoires associated with transitions to self-fertilization, which likely also involves a change in the sexual selection context).
A further technical complication is that it is often difficult to distinguish actual absence of a gene from artifacts arising from homology detection failure (e.g., Weisman et al. 2020) or poor gene assembly quality. Poor assembly quality may also in some cases mislabel paralogs as orthologs, as ortholog identification typically relies at least in part on the detection of similar sequences. If true orthologs are not successfully assembled, then (likely more diverged) paralogs may be the most similar sequences detected. Additionally, in the case of transcriptome data, variation in expression levels may also create apparently missing genes if the transcriptome is not sampled to high enough coverage or when expression of some genes is temporally or spatially restricted. Some of this variation in expression, at least across species, may also be a signal of adaptive differences (e.g., Harrison et al. 2015). It is clear that more work is needed to understand whether and how patterns of rapid evolution of reproduction-related genes translate into patterns of gene presence/absence. Additionally, there is a need for data from more taxonomic groups, ideally combined with replicated contrasts of different sexual selection contexts.

THE GENUS Macrostomum
Free-living flatworms of the genus Macrostomum have become excellent models for studying mating behavior and sexual selection, including sexual conflict (Schärer et al. 2004;Vizoso et al. 2010;Ramm et al. 2015;Giannakara et al. 2016;Giannakara and Ramm 2017;Marie-Orleach et al. 2017;Patlar et al. 2020;Singh et al. 2020a,b). In particular Macrostomum lignano has been the focus of numerous experiments and has substantial genetic resources (Ladurner et al. 2005;Wudarski et al. 2020). For example, expression studies in M. lignano have provided information about body regions containing important reproductive tissues that can be used to annotate reproduction-related transcripts and genes (Arbore et al. 2015;Brand et al. 2020). Macrostomum worms are simultaneously hermaphroditic, producing sperm and eggs at the same time. Hermaphroditism is expected to lead to conflicts between individuals over the sex roles they assume during copulation, with the role of sperm donor thought to be preferred over the role of sperm recipient in many scenarios (Charnov 1979;Michiels 1998;Schärer et al. 2015). In Macrostomum, such conflicts likely explain much of the diversity of reproductive morphologies and behaviors that are observed among species (Schärer et al. 2011;Brand et al. 2022a).
One of the main aspects of this diversity is variation in the mating strategies. Many species of Macrostomum engage in reciprocal copulation, where both partners donate and receive sperm at the same time (Schärer et al. 2004;Vizoso et al. 2010). Other species inseminate hypodermically by injecting sperm into the body of their partner using a "needle-like" stylet (Schärer et al. 2011;Ramm et al. 2015). These contrasting strategies have been shown to be strongly associated with other axes of variation in reproductive morphology and behavior (Schärer et al. 2011;Brand et al. 2022a). In particular, reciprocally copulating species have sperm with stiff lateral bristles that likely serve as an anchoring mechanism to prevent postcopulatory sperm removal (Vizoso et al. 2010;Schärer et al. 2011). Meanwhile, hypodermically inseminating species have simpler sperm that lack such bristles. A recent phylogenomic study has shown that hypodermic insemination has evolved at least nine times independently within the genus (Brand et al. 2022a,b). A handful of species also show intermediate sperm morphologies, with short or reduced bristles, possibly representing a transitional state between the reciprocal and hypodermic mating strategies (Schärer et al. 2011;Brand et al. 2022a). The state of the sperm bristle (present, reduced, or absent) is therefore a good indicator of the mating strategy (Schärer et al. 2011;Brand et al. 2022b). Thus, contrasting mating strategies, themselves the result of conflicts over sex roles, represent different sexual selection contexts resulting in repeated evolution of different reproductive morphologies.
We have previously shown that reproduction-related genes evolve rapidly in Macrostomum (Brand et al. 2020;Wiberg et al. 2021), and that species with absent sperm bristles (taken as a proxy for hypodermic mating) show faster genome-wide rates of sequence evolution than species with bristles (Wiberg et al. 2021). However, an open question is whether the bristle state also predicts whether orthologs of reproduction-related genes are observed, or not, across species. Transcriptome assemblies show substantial variation in gene presence/absence patterns, although there is a strong association with genetic distance from M. lignano, the species from which annotations are derived, there is a conspicuous lack of representative sequences within a clade of exclusively hypodermically mating species (Fig. S1). A strong effect of bristle state would indicate that different sexual selection contexts alter the nature of selection acting on particular genes, sets of genes, or the genome as a whole. Such effects might inform us about candidate genes involved in, for example, sperm bristle development in the case of genes expressed in the testes. If shifts to a hypodermic mating strategy impose strong selection to lose the bristles, then the genes involved might degrade and be lost rapidly, making them difficult to detect in studies of the rate of sequence evolution (e.g., Brand et al. 2020 andWiberg et al. 2021). Alternatively, if genes underlying bristle development were also important in other contexts, the loss or reduction of bristles may involve changes in gene expression patterns rather than complete loss. Addressing this possibility will require comparative gene expression datasets from different tissues, which are currently unavailable.
If there were no overall effect of bristle state, then the reason for unobserved transcripts in other species is more difficult to infer and could be linked to one of multiple, nonmutually exclusive factors. For example, the lack of an effect could indicate that there is no overall turnover of gene repertoires due to sexual selection. Technical effects due to the use of transcriptome data, such as variation in the quality of assemblies or low expression of certain genes/transcripts, will lead to noise in presence/absence data (see above).
Here, we use a set of previously established orthogroups (OGs-a collection of orthologous sequences across multiple species; Wiberg et al. 2021;Brand et al. 2022a) that have been annotated on the basis of reproduction-related transcripts from the model species M. lignano (Arbore et al. 2015;Brand et al. 2020). We then test for a signal of decay in similarity scores with increasing phylogenetic distance from the model species M. lignano and evaluate the probability of detection given the observed rates of decay (Weisman et al. 2020). Next, we apply phylogenetically controlled logistic regression to test the hypothesis that different sexual selection contexts predict the probability of observing a representative sequence in an OG. We also extend the annotation of these OGs using RNA-Seq data, novel to M. lignano, comparing gene expression between adults and hatchlings in this model species . Elevated expression in adults presents an additional source of evidence for a reproduction-related function of a gene because only adults are reproductively active.

ORTHOGROUPS AND PAIRWISE SIMILARITY SCORES
Data have previously been generated in de novo transcriptome assemblies from 98 species of Macrostomum (Brand et al. 2022a). OGs were produced from these data using the PDC pipeline of Yang and Smith (2014), resulting in OGs with a single representative sequence from each species (though they may not represent strict 1-to-1 orthologs, as from de novo transcriptome data it is difficult to tell the difference between allelic variants, isoforms, and duplications). These OGs were processed and filtered to produce codon alignments in a previous study on molecular evolution in Macrostomum (Wiberg et al. 2021). We obtained the amino acid alignments corresponding to the codon alignments used in Wiberg et al. (2021). These alignments represent OGs and are broadly annotated into reproduction-related categories, namely, by their expression in the testis, ovary, and tail region of the worm (Arbore et al. 2015;Brand et al. 2020;Wiberg et al. 2021; Table S1 [sheet 2]). There is also a randomly chosen set of OGs that are annotated as ubiquitously expressed with no obvious reproductive function that serves as a comparison to the reproduction-related OGs. A random subset, rather than all ubiquitously expressed transcripts, was chosen also to mitigate the inflated power of any statistical analyses that assume statistical independence between OGs. Annotations are based on data from a previous study (Arbore et al. 2015), which were used to reannotate a more recent version of the M. lignano genome-guided transcriptome assembly (Brand et al. 2020). Briefly, worms were cut at different levels along the body axis to produce fragments containing the head only; the head and the body region including the testes; the head and the body regions containing the testes and ovaries; and the whole worm. RNA-Seq data from pools of these body fragments were mapped to the M. lignano transcriptome assembly (Grudniewska et al. 2018) and transcripts were annotated on the basis of expression differences between fragments (Brand et al. 2020). Note that body fragments contained other tissues than the "target" tissues (e.g., the region containing the testes also contains part of the gut).
The amino acid alignments from Wiberg et al. (2021) were trimmed using ZORRO (version 1.0; Wu et al. 2012) scores, removing columns with a score ≤3 (note that some columns will still contain gaps in some representative sequences). For each trimmed alignment, we then removed gap characters (-) and created protein blast databases (blast version 2.9.0; Camacho et al. 2009). We also refined the annotation of these OGs using expression data from adults and hatchlings of M. lignano to identify transcripts that are more highly expressed in adults (see Text T1 in the Supporting Information for more details as well as Fig. S2 and Tables S2 and S3).

FAILURE
For all OGs, we collected similarity scores (bit-scores) for all hits between the focal M. lignano sequence and all other species present in the OG using a BLASTP (version 2.9.0; Camacho et al. 2009) search of all sequences in an OG against a database created from the same OG always treating the M. lignano sequence as the "target." We then applied an approach to model the decay in the bit-score with increasing genetic distance from M. lignano (Table S1 [sheet 3]). Weisman et al. (2020) defined an exponential decay model based on a simple model of sequence evolution: where S(t) is the bit-score at a genetic distance t from the focal species (see Table S1 [sheet 4] for the genetic distances of all species), L is the bit-score of the gene of the focal species with itself (and thus related to the protein's length), and R is the rate of decay in the bit-score. The expected variance at genetic distance t is given by Because linear and nonlinear regression fitting approaches minimize slightly different sums of squares, and nonlinear regression approaches require specified starting values for estimated parameters, we took a double-fitting approach to estimate the L and R parameters. First, we fit a linear model to log()-transformed bit-scores, and used the L and R parameter estimates as starting values to fit a nonlinear least squared (nls) regression, with untransformed bit-scores, using the "nls()" R function. To fit these models, we only used species with bristles present, and only OGs with data from ≥5 species (Table S1 [sheet 4]). This approach allowed us to evaluate whether the species with reduced or absent bristles deviated from the patterns observed in the species with bristles present. We also created a weighting variable in an attempt to account for variation in assembly quality (weight = 1 -P miss ; where P miss is the proportion of BUSCO [version 5.0.0; Simão et al. 2015;metazoa_odb10] genes that were reported missing from a transcriptome; see Table S1 [sheet 3]). We also performed the analysis excluding species with high rates (>50%) of missing BUSCO genes. Despite a substantial reduction in sample sizes, particularly for species with reduced or absent bristles, our results were qualitatively unchanged (see Text T2 in the Supporting Information). Any OGs that produced R values >0 (i.e., an estimated increase in similarity score with increasing distance from M. lignano; N = 7/999, 0.7%) were treated as spurious, likely due to noisy input data, and were not considered further.
We next computed residuals for all species, including those that were not used to estimate the best-fit line, by subtracting the empirical bit-score from that predicted by the parameter estimates and the nonlinear equation given above (eq. 1). Because none of the species used in the model fitting had genetic distances to M. lignano >0.7, the accuracy of predictions could be compromised when extrapolating to larger genetic distances. We therefore only used residuals computed for species with genetic distances to M. lignano <0.7. This effectively excludes all species in a clade of exclusively hypodermically inseminating species (called the "hypodermic mating clade" in Brand et al. 2022a; yellow box in Fig. 1a). This had the added benefit that there was no longer a detectable systematic difference between the species with different bristle states in the genetic distance to M. lignano (Fig. S3a,b). Moreover, there was no systematic difference in assembly "completeness," as measured by BUSCO, between species with different bristle states (Fig. S3c,d). We summarized the residuals for each bristle state within each OG as the median value across representative species (Fig. 1b,c). We present a validation of the above approach, namely, to fit models using one group of species and to evaluate their fit to another group of species not included in the original model, in the Supporting Information (see Text T3 in the Supporting Information).
Given this model, fit using all available data for species with bristles present, and the parameter estimates of L and R for each OG, it is then possible to compute the probability of detecting a homolog at some genetic distance t, by setting an e-value threshold, and assuming the sequence length. We set an e-value threshold of 0.001 (a commonly used value), and assume that the length of a hypothetical sequence is the same as that of the relevant M. lignano sequence. The database size was set to 853,740,050 amino acids, the size of the initial database in Brand et al. (2022a) used for homology detection. We then computed the probability of detection (P detect ) as the proportion of a normal distribution with mean S(0.7) and standard deviation σ(0.7) that lies above the given bit-score threshold (Weisman et al. 2020; Fig. 1c). We identified OGs with a high probability of detection (P detect > 0.90; Table S1 [sheet 4]). If the fit of data from species with absent or reduced bristles is a good fit to the model, then these P detect values should also be representative for these species. Thus, for OGs with P detect > 0.90, but for which representative sequences are unobserved for some species, alternative explanations can then be explored (see also next section).
The approach by Weisman et al. (2020) has some limitations. First, because bit-scores are dependent on the length of the sequences being compared, and sequence lengths varied both within OG alignments, due to alignment gaps even after trimming, as well as across OGs, due to variation in sequence length across genes, this results in substantial variation in the bit-scores (Fig. S4a). This is likely to be a source of increased variation in bit-scores more generally, especially in nonmodel systems with fewer and less well characterized genetic resources. Second, as in any comparative analysis, species may have similar bit-scores, and genetic distances to the focal species, because they are themselves closely related (i.e., phylogenetic nonindependence ;Felsenstein 1985;Uyeda et al. 2018). Thus, modeling approaches that explicitly account for phylogeny would be preferable. We therefore also modeled the rate of decay in the bit-scores with increasing phylogenetic distance from M. lignano in a way that tried to address these limitations. However, because these additional analyses produced very similar results to those above, we only present the relevant methods and results in the Supporting Information (Text T4 in the Supporting Information; Table S1 [sheet 5]).

GENE PRESENCE/ABSENCE
Our above approach to compute residuals from the fitted decay models does not capture cases of complete absence of a sequence from an OG, because in such cases there are no empirical bitscores to compare against predicted values. We therefore took a complementary approach to analyze transcript presence/absence data. We fit phylogenetically controlled logistic regression models to presence/absence data for those OGs with a high probability of detecting homologs among species with bristles present (P detect > 0.9; see above). These models test whether the bristle state or reproduction-related annotations had an effect on the probability of observing a representative sequence. Because the decay rate analyses establish that simple models of similarity decay fit the data well also for species with reduced and absent bristles, any significant effect of bristle state on the probability can be attributable to other processes. To incorporate the phylogenetic relationships as a random effect, we fit these logistic regression models with the "MCMCglmm" package (version 2.32; Hadfield 2010), using a probit link function (family = "categorical"). To incorporate the phylogenetic random effect, we 1 using the penalized marginal likelihood approach (Sanderson 2002) in TreePL (Smith and O'Meara 2012), as in Brand et al. (2022b). As above, we excluded data from species with a genetic distance from M. lignano >0.7. For a single OG, model fitting was not possible because all species with genetic distances to M. lignano <0.7 had a representative sequence. We summarized the effects of bristle state by collecting, for each OG, the posterior means (on the probit scale) for each bristle state (Table S1 [sheet 6]). We then tested for an overall difference in the posterior means between bristle states within each annotation category, and between annotations within bristle states, using Kruskal-Wallis rank-sum tests and Dwass-Steele-Critchlow-Fligner all-pairs tests for post hoc contrasts. We also identified a set of candidate OGs by comparing the 95% confidence intervals of the posterior means between bristle states (Table S1 [sheet 6]). If the entire interval for species with absent bristles was below that for species with bristles present, indicating a much lower probability of observing a representative sequence, we called the posterior means significantly different.

STATISTICAL SOFTWARE AND PACKAGES
In addition to the above-named packages, several other R packages were used in the analysis pipeline of these data: "ddpcr"

Results and Discussion
Reproduction-related genes have been shown to evolve rapidly in the genus Macrostomum, with rates of sequence evolution being associated with the state of the sperm bristles (Wiberg et al. 2021), a diagnostic feature of divergent mating systems in these flatworms (Brand et al. 2022b). Meanwhile, patterns of pres-ence/absence hint at an association with bristle states (Fig. S1), a pattern also seen for reproduction-related genes in other taxa, and often attributed to sexually selected turnover of gene repertoires (Zhang et al. 2004;Cutter and Ward 2005;Ellegren and Parsch 2007;Hahn et al. 2007;Ahmed-Braimah et al. 2017;Cutter 2019). However, homology detection failure may also produce patterns of presence/absence, and such artifacts may be even more likely among rapidly evolving genes. We apply a recently proposed approach (Weisman et al. 2020) to evaluate whether patterns of presence/absence are related to bristle states beyond what is expected from simple models of sequence similarity decay as a function of genetic distance (homology detection failure). To reject this hypothesis, the probability of observing a representative sequence predicted by the sperm bristle states must be lower than expected from differences between bristle states in the rate of sequence similarity decay. We were able to estimate decay rates for 959 OGs (Fig. S5a-d; Table S1 [sheet 4]), and for 796 of these OGs (83%) a homologous sequence had an estimated probability (P detect ) >0.9 of being detected at a genetic distance of 0.7 from M. lignano. The percentage of genes with P detect > 0.9 varied among the reproduction-related OGs (testis region 81.7%; ovary region 83.6%; tail region 78.2%; ubiquitously expressed 88.0%). As expected, OGs annotated as ubiquitously expressed had the highest percentage. The ubiquitously expressed OGs are expected to be more conserved and thus probably have slower rates of evolution on average, making representative sequences easier to detect at greater genetic distances.
There were significant differences in the median residuals across bristle states for OGs with the testis-region annotations (Kruskal-Wallis rank-sum test, χ 2 = 53.51, P < 0.001), with the pattern being driven by more negative residuals in species with reduced and absent bristles (Fig. 2a). This suggests that the simple model of a constant rate of similarity decay estimated from species with bristles does not adequately explain similarity scores for species with either reduced or absent bristles. This is in line with previous observations that genes with testis-region annotations show faster rates of sequence evolution in species with absent bristles compared to species with bristles (Wiberg et al. 2021). There were also significant effects of bristle state on the probability of observing representative sequences within OGs with testis-region annotations (Kruskal-Wallis ranksum test; χ 2 = 14.26, d.f. = 2, P < 0.001). This effect was driven by differences between species with bristles and those with bit scores, from which the probability of detection (P detect ; see Methods) can be calculated. Panels (d) and (e) are representations of the approach to compute P detect from the estimated variance in expected bit scores at a given evolutionary distance. A bit-score threshold is defined, and the proportion of a normal distribution with mean S (0.7) and standard deviation σ(0.7), given by the fitted model, that lies above this threshold is defined as P detect . Panel (d) shows a case with a high P detect , where most of the predicted distribution of bit-scores lies above the threshold bit-score value, whereas panel (e), in contrast, shows a case with a low P detect . either reduced or absent bristles (Fig. 2b). Thus, for OGs with the testis-region annotations, results indicated rapid rates of sequence evolution as well as lower probabilities of observing a sequence in species with absent or reduced bristles. Although it remains possible that there is true gene loss among testis-region genes driving the lower probability of observing a sequence, rather than homology detection failure, the combination of low probabilities of observing a sequence and the faster rates of sequence evolution mean that it is not possible to tell these hypotheses apart. Indeed, these hypotheses are also not mutually exclusive and either processes could be at work for any particular OG. The most parsimonious explanation, effectively the null hypothesis, is that the more rapid rates of sequence evolution in species with reduced and absent bristles lead to higher rates of homology detection failure. These findings potentially have implications for interpreting previous findings in other taxa. Several studies have found male-biased genes, in particular those expressed in testes and the male germline, to show both high rates of sequence evolution and gene turnover. For example, in both Caenorhabditis briggsae (Cutter and Ward 2005) and Drosophila melanogaster (Zhang et al. 2004), there is evidence of high turnover for genes expressed in the testes and the male germline. Additionally, in a comparative study of Drosophila, Hahn et al. (2007) showed evidence for expansions in gene families associated with spermatogenesis in the lineage to D. melanogaster. Another comparative study of Drosophila, this time in the Drosophila virilis group, found lineage-specific transcripts only for genes expressed in testes of Drosophila americana and the ejaculatory bulb of Drosophila lummei (Ahmed-Braimah et al. 2017). In the light of our findings, such results might be due to detection failure in some lineages stemming from more rapid sequence evolution of genes expressed in testes as compared to other tissues. Our results suggest that this hypothesis should be considered and tested before explanations involving adaptive gain/loss are invoked.
In other taxa, different groups of reproduction-related genes also show interesting patterns of presence/absence. For example, for lineage-specific gene families unique to the Drosophila subgenus, many of the genes unique to D. melanogaster were found to be expressed in the accessory glands (Drosophila 12 Genomes Consortium 2007;Haerty et al. 2007;Hahn et al. 2007), which are also known to evolve particularly rapidly in Drosophila (Haerty et al. 2007). More recently, it was shown that several members of the CheB family of gustatory receptors, which show sexually dimorphic expression and are involved in modulating mating and aggression behaviors in D. melanogaster, show high rates of gene turnover (Torres-Oliva et al. 2016). Our results suggest that homology detection failure might provide an explanation more generally and gene turnover rates should therefore be placed in the context of sequence evolution for a full picture.
Among OGs with ubiquitously expressed annotations, there was a significant effect of bristle state (χ 2 = 12.95, P < 0.001), also in line with previous findings (Wiberg et al. 2021). However, this effect was driven by slightly more negative residuals in species with reduced bristles compared to either species with present bristles or species with absent bristles (Fig. 2a), which is difficult to explain. This might be a spurious effect of the relatively low sample sizes for species with reduced bristles. This predicts that there should be no overall difference among bristle states in the probability of observing a sequence for OGs annotated as ubiquitously expressed. For OGs annotated as ubiquitously expressed, there was a marginally nonsignificant effect of bristle state on the probability of observing a representative sequence (χ 2 = 5.26, d.f. = 2, P = 0.07).
In contrast, we did not detect an effect of bristle state for ovary-or tail-region OGs (χ 2 = 4.53, d.f. = 2, P = 0.1 and χ 2 = 2.78, d.f. = 2, P = 0.25, respectively; Fig. 2a). For these annotation groups, the results therefore differ from previous analyses, where an effect of bristle state was observed for all reproduction-related annotation categories (Wiberg et al. 2021). However, given that the model of Weisman et al. (2020) employed here is a fairly simplistic way of describing sequence evolution, these contrasting results are not so surprising. Codon models, as applied in Wiberg et al. (2021), are likely more sensitive to variation in rates of sequence evolution, and we therefore consider these results largely consistent with those of previous work (Brand et al. 2020;Wiberg et al. 2021). On the other hand, the absence of a significant difference in the residuals between species with present bristles and those with absent bristles suggests that the probability of a sequence being detectable should also not depend on the bristle state for these annotation categories. As expected from these results on bit-score decay, there were no detectable effects of bristle state for ovary-or tail-region OGs on the probability of observing a representative sequence (χ 2 = 3.0, d.f. = 2, P = 0.22 and χ 2 = 4.37, d.f. = 2, P = 0.11, respectively; Fig. 2b).
Some important caveats should be borne in mind. There is substantial variation in assembly quality and completeness across species and this probably explains some of the variation in unobserved sequences. For any given species, we cannot exclude the possibility that a transcript is unobserved due to it not being assembled, even though it occurs in the species. Moreover, lower assembly quality due to unassembled genes may result in the mislabeling of paralogs as orthologs. Ortholog identification typically involves at least an initial search for similar sequences (homologs). If orthologs are unassembled, then the most similar sequence will sometimes be a paralog. Unassembled genes might result for related technical reasons, with low-expressed genes being less likely to be sampled by an RNA-Seq strategy. Moreover, gene expression levels may vary across species. Many of the analyzed worms were caught in the field and are thus not standardized in condition or environment, which might also result in gene expression variation. In addition, turnover of expression across species, for example, due to selection for different expression patterns, might also result in variation. However, sources of bias from systematic differences between species with different bristle states should produce artifactual results throughout the genome (i.e., in all annotation categories). Together with our substantial efforts to test our method by cross-validation and account for sources of bias and variation in our analyses, our finding of an effect largely isolated to testis-region annotated OGs provides an additional validation of our approach.
Despite the above caveats, a number of OGs (N = 15) can be highlighted as potentially interesting candidates for follow-up studies (Table S1 [sheet 7]) on the basis of nonoverlapping confidence intervals between bristle states (see Methods) from logistic regression analysis in the presence/absence data (Table S1 [sheet 6]). Of these, four OGs are annotated as expressed in the testis region, all with significantly higher expression in adults compared to hatchlings, making them strong candidates for genes affecting reproduction-related functions. In particular, the transcript Mlig016310.g1, which annotates OG0001273_1_MIortho10 as being expressed in the testis region, has been confirmed, by in situ hybridization (ISH) in M. lignano, to be limited in expression to the testes (Arbore et al. 2015; called RNA815_7008 therein). Moreover, RNAi knockdown of this transcript in M. lignano produces an aberrant sperm bristle phenotype (Arbore et al. 2015), in which the usually backward pointing bristles are no longer stably anchored in that position, pointing to a possible role in the formation of the bristle complex (Willems et al. 2009). Although a previous study did identify orthologs of this transcript both in a species with bristles (M. spirale) and two species with absent bristles (M. pusillum, and M. hystrix), all showing expression localized to the testes (Brand et al. 2020), no species with absent bristles are represented within this OG in this study. This discrepancy can probably be attributed to differences in the orthology detection approach. Brand et al. (2020) relied exclusively on the sequence similarity-based OrthoFinder pipeline (Emms and Kelly 2015), whereas we here use orthologs generated from the PDC pipeline of Yang and Smith (2014), which incorporates phylogenetic information from gene trees of initial homolog groups. Thus, previously identified OGs may have contained spuriously identified paralogs with similar tissue expression patterns across species that are correctly identified as paralogs in more recent analyses including more species and treebased methods (Yang and Smith 2014;Wiberg et al. 2021;Brand et al. 2022a). Further validation of these findings will require a more thorough, manual investigation of ortholog and paralog relationships of genes with homology to Mlig016310.g1, validation of their absence from genome sequences, as well as confirmation of expression patterns in species with different bris-tle states, all of which are well outside the scope of this study. An additional seven, three, and one candidate OGs are annotated as ubiquitously expressed or specific to the ovary and tail regions, respectively. No further ISH data are currently available for the transcripts in these OGs. As above, these OGs will require more thorough validation, such as more targeted identification of homologs, orthologs, and paralogs, as well as ISH experiments.

Conclusions
We explored patterns of missing genes in ∼100 transcriptome assemblies from diverse Macrostomum species with contrasting mating strategies. We also expand annotations for M. lignano and identify a more stringent set of reproduction-related transcripts with higher expression in adults compared to hatchlings. Our results are consistent with previous findings of rapid sequence evolution at reproduction-related genes, at least for OGs annotated as testis region specific, and in species with reduced and/or absent sperm bristles, a proxy for the hypodermic mating strategy. Our novel findings also show that species with absent bristles have a lower probability of observing a sequence within OGs annotated as testis region specific. We suggest that the most parsimonious explanation is that the faster rate of evolution seen also for these OGs results in a higher rate of homology detection failure for such genes, resulting in a signal of rapid evolution of reproduction-related genes also in sequence presence/absence data. Our study highlights at least one exciting candidate OG for further study where representative transcripts have already been associated with phenotypic effects on sperm bristle phenotypes in M. lignano. Thus, our results highlight the utility of considering an appropriate null expectation for missing or unobserved genes as well as associating patterns of gene presence/absence with replicated evolutionary events in a phylogenetic context.

AUTHOR CONTRIBUTIONS
RAWW and LS conceived the study. RAWW conducted all analyses and led the writing of the manuscript with input from LS. GV performed RNA extractions and purification. All authors have commented on and approved the final manuscript. Transcriptome  assemblies  and  annotated  orthogroups  forming  the  basis  of  these  analyses  are  available  from