Comparative transcriptomics of ice‐crawlers demonstrates cold specialization constrains niche evolution in a relict lineage

Abstract Key changes in ecological niche space are often critical to understanding how lineages diversify during adaptive radiations. However, the converse, or understanding why some lineages are depauperate and relictual, is more challenging, as many factors may constrain niche evolution. In the case of the insect order Grylloblattodea, highly conserved thermal breadth is assumed to be closely tied to their relictual status, but has not been formerly tested. Here, we investigate whether evolutionary constraints in the physiological tolerance of temperature can help explain relictualism in this lineage. Using a comparative transcriptomics approach, we investigate gene expression following acute heat and cold stress across members of Grylloblattodea and their sister group, Mantophasmatodea. We additionally examine patterns of protein evolution, to identify candidate genes of positive selection. We demonstrate that cold specialization in Grylloblattodea has been accompanied by the loss of the inducible heat shock response under both acute heat and cold stress. Additionally, there is widespread evidence of selection on protein‐coding genes consistent with evolutionary constraints due to cold specialization. This includes positive selection on genes involved in trehalose transport, metabolic function, mitochondrial function, oxygen reduction, oxidative stress, and protein synthesis. These patterns of molecular adaptation suggest that Grylloblattodea have undergone evolutionary trade‐offs to survive in cold habitats and should be considered highly vulnerable to climate change. Finally, our transcriptomic data provide a robust backbone phylogeny for generic relationships within Grylloblattodea and Mantophasmatodea. Major phylogenetic splits in each group relate to arid conditions driving biogeographical patterns, with support for a sister‐group relationship between North American Grylloblatta and Altai‐Sayan Grylloblattella, and a range disjunction in Namibia splitting major clades within Mantophasmatodea.


| INTRODUC TI ON
Species that are considered relicts are by definition old, taxonomically depauperate lineages, where extinction has substantially diminished their former diversity (Grandcolas, Nattier, & Trewick, 2014;Habel, Assmann, Schmitt, & Avise, 2010). This can be difficult to assess when the fossil record is poor, but can be approximated by deeply divergent evolutionary lineages that lack extant diversity relative to sister taxa (although this assumes that speciation rates are not exceptionally low). Notably, this definition of relictualism does not imply conservation of characters or a lack of evolution, and so it remains unclear why relict taxa would experience a disproportionally high extinction rate. Do such relict species exhibit evolutionary constraints that limit ecological divergence? One possibility is that relict species are also niche specialists, which utilize a narrow range of ecological conditions and often exhibit restricted biogeographic distributions (Sexton, Montiel, Shay, Stephens, & Slatyer, 2017). Research on niche evolution remains central to understanding rates of ecological divergence in different regions and taxonomic groups (Ackerly, 2009). While there has been considerable interest in niche-determining factors that allow lineages to diversify and generate ecological novelty in adaptive radiations (Gavrilets & Losos, 2009;Schluter, 2000), much less attention has been paid to factors that constrain niche space and therefore constrain diversification. Improving our understanding of how relict species interact with their environments and evolve would broaden our understanding of mechanisms linked to extinction rates and whether there are factors that constrain niche evolution. This knowledge, in turn, would aid in efforts to conserve these phylogenetically divergent lineages in the face of habitat loss and environmental change (Ohlemüller et al., 2008;Vane-Wright, Humphries, & Williams, 1991;Winter, Devictor, & Schweiger, 2013).
Relict species are often used to argue for habitat stability in the regions that they inhabit, as niche conservatism is assumed, but tests for niche evolution are not always undertaken (Grandcolas, 2017).
The insect group Grylloblattodea, commonly known as ice-crawlers and rock-crawlers, provide a case study of a relict lineage appearing to have had much greater diversity in the past and a broader geographical distribution (Schoville & Kim, 2011;Vrsansky, Storozhenko, Labandeira, & Ihraingova, 2001). Grylloblattodea are also considered niche specialists, as they are only found in cool and humid microhabitats in mountainous terrain in eastern Asia and western North America (Schoville, 2014;Schoville, Slatyer, Bergdahl, & Valdez, 2015). Ice-crawlers, like other insects, are poikilotherms that depend on environmental conditions to regulate body temperatures, but they are noteworthy among insects for exhibiting cold specialization. The genus Grylloblatta forages nocturnally on snowfields and prefers ambient temperatures close to 0°C (Mills & Pepper, 1937). The most heat tolerant Grylloblattodeans, members of Galloisiana in Japan, live in cool, rocky soil and prefer only slightly warmer ambient temperatures (4°C-10°C; Nagashima, Ando, & Fukushima, 1982). In contrast, other Polyneopteran insects prefer much warmer temperatures, even ambient temperatures >30°C (Forsman, 2000;Harris, McQuillan, & Hughes, 2013). The closest relatives to Grylloblattodea, their sister taxon Mantophasmatodea (known as the heelwalkers or gladiators), are found in sparsely vegetated regions of South Africa that are on average warmer, although temperatures can vary widely within a single day, from freezing to 35°C (Roth, Molina, & Predel, 2014). Males of the mantophasmid Karoophasma biedouwense prefer temperatures of 25°C for sexual recruitment of females, and cease activity below 15°C and above 30°C (Eberhard, Metze, & Küpper, 2019). In contrast, North American ice-crawlers are freeze-avoidant (surviving up to their supercooling point, −3.9 ± 1.0°C) and stenothermal (Edwards, 1982; TA B L E 1 Taxon sampling, RNA sequencing treatments, and resulting genetic datasets. Previously published data are noted with an asterisk (*). See Table S2  whereas freeze avoidance (ice formation is lethal) is more common in predictable environments (Sinclair, Addo-Bediako, & Chown, 2003).
The constrained thermal breadth and foraging behavior of Grylloblattodea suggest freeze avoidance (Schoville et al., 2015), and this may have evolved due to the energetic requirements of living in consistently cold habitats. Organisms in cold habitats must adjust energy metabolism, ion homeostasis, neuromuscular control, and protein function to survive (Hayward, Manso, & Cossins, 2014).
These adjustments may lead to irreversible evolutionary trade-offs.
For example, in stenothermal Antarctic fish, cold specialization has led to a loss of inducible heat shock responses, and upregulation of pathways involved in protein degradation (especially ubiquitination and proteosomal degradation), antioxidant response, protein synthesis, and metabolism (Hofmann, Buckley, Airaksinen, Keen, & Somero, 2000;Logan & Buckley, 2015). Upon exposure to heat, these fish experience elevated oxidative stress and inflammation that constrain performance in warm or highly variable thermal environments (Logan & Buckley, 2015). However, among cold-specialized insects, research has shown that there are a broad array of strategies to deal with cold environmental conditions (Hayward et al., 2014;Storey & Storey, 2012). Most of these insects rely on inducible mechanisms of gene regulation to cope with cold conditions, which can be reversed under warmer conditions. It is unclear how cold specialization has evolved in Grylloblattodea or whether the accompanying physiological changes constrain niche evolution.
Cellular responses to environmental stressors can shed light on the physiological mechanisms that might constrain evolutionary change in different organisms (Tomanek, 2008). In particular, changes in gene expression following exposure to thermal stress provide insight into gene regulatory pathways that underlie temperature tolerance, although the magnitude and timing of gene expression responses can depend on stress exposure and gene expression is not always strongly correlated with protein abundance and activity (Evans, 2015;Schwanhäusser et al., 2011). Here, we examine whether acute thermal stress activates gene expression differentially in members of Grylloblattodea. We also examine genomic divergence patterns for evidence of positive selection in protein-coding genes related to cold specialization. Finally, these data provide an opportunity to resolve uncertainty in the deeper phylogeny of Grylloblattodea (Jarvis & Whiting, 2006;Schoville, Uchifune, & Machida, 2013), particularly in the sister-group relationship among Asian and North American genera. Using a comparative approach, we investigate cold specialization and physiological niche divergence in Grylloblattodea relative to their sister-group Mantophasmatodea.
Grylloblattodea and Mantophasmatodea diverged more than 150 million years ago (Eberhard, Schoville, & Klass, 2018;Misof et al., 2014), and, despite this long period of evolutionary history, they remain the two smallest insect orders (33 and 25 currently described extant species, respectively). Both lineages had formerly widespread fossil ancestors, and, as a result, they are both considered relictual orders (Roth et al., 2014;Schoville & Kim, 2011). Here, however, we focus on the evolutionary consequences of relictualism in Grylloblattodea, to closely examine the role of niche evolution in this climate specialized lineage. of critical thermal minima and maxima (Schoville et al., 2015), and to characterize rapid responses to extreme temperature exposure.

| MATERIAL S AND ME THODS
Across published studies, the recorded intervals of mRNA levels following stress vary across experiments, yet it is well known that initiation of stress response pathways occurs within minutes (De Nadal, Ammerer, & Posas, 2011) and changes in mRNA levels are readily detectable 10-30 min postexposure in a wide range of organisms, from yeast (Gasch et al., 2000) to fruit flies (Udaka, Ueda, & Goto, 2010). Here, we chose to focus on an acute 30 min exposure to temperature stress (details of these treatments, including temperature settings, can be found in Table S2). Briefly, specimens were exposed directly to a test temperature, which varied among taxa to avoid inducing mortality. For ice-crawlers, −5°C was used for a cold treatment, except for Galloisiana yezoensis, which was 2.5°C.
Under the heat treatment, 20 o C was used for most taxa, except to obtain sufficient RNA. Due to the rarity of grylloblattodean specimens, sample sizes were small (ranging from two to nine individuals, see Table 1).

| Transcriptome assemblies and annotation
Using the short-read RNAseq data, transcriptome assemblies were generated de novo for each taxon. Samples from the 1KITE project were trimmed and assembled using SOAPdenOvO-TrAnS with a 31kmer size (Xie et al., 2014), following protocols described in detail by . For all other samples, raw reads for each species or population were first assessed for quality using FASTqc v.0.10.1 (Babraham Bioinformatics, 2011). Based on these results, we used TrimmOmATic v0.32 (Bolger, Lohse, & Usadel, 2014) to trim the raw reads, removing three base pairs from the head of each read, and any leading and trailing bases with quality scores less than 34. The program TriniTy v2.0.3 (Haas et al., 2013)  and PFAm database (Sonnhammer, Eddy, & Durbin, 1997, accessed August 24, 2017 to match sequences to reference proteins based on amino acid similarity.

| Prediction of orthologous genes and phylogenetic analysis
In order to generate a reliable dataset for phylogenetic analysis, we identified orthologous genes that were shared across our study taxa.
We used two approaches to identify orthologous genes among transcriptomes, both of which assign candidate sequences to predefined orthologous groups based on amino acid similarity. First, we used OrThOgrAPh v0.5.4. (Petersen et al., 2017), which uses a best reciprocal hit criterion to map transcripts to the globally best matching orthogroup, while tracking best matches for all query sequences to avoid redundancy. For OrThOgrAPh, we used a custom-made ortholog set specifically designed for Polyneoptera taxa comprising 3,247 protein-coding genes for four reference species: Ephemera danica, Ladona fulva, Zootermopsis nevadensis, and Rhodnius prolixus (for more detail see: Evangelista et al., 2019). Ortholog prediction of transcripts for each of the 26 taxa in our study was done in OrThOgrAPh using the following settings: max-blast-searches = 50; blast-max-hits = 50; extend-orf = 1; substitute-u-with = X. Second, we used the program hAmSTr v13.2.3 (Ebersberger, Strauss, & von Haeseler, 2009), which also uses a best reciprocal hit criterion, but does not track matches across query sequences. We used the Insecta v3.2 reference dataset of known orthologs (1,668 genes), with Bombyx mori as the focal outgroup, and applied default settings in hAmSTr. The resulting datasets of orthologs from each method (dataset 1, "OrThOgrAPh," and dataset 2, "hAmSTr") were examined separately in downstream phylogenetic analyses.
Nucleotide sequences were aligned for each orthologous gene in each dataset using the mAFFT algorithm (Katoh & Standley, 2013) in the Consensus Multiple Align tool of geneiOUS v9 (Biomatters, Ltd.).
These gene alignments were filtered for occurrence among taxa (>10 species) and then screened for outliers using the kdeTreeS package (Weyenberg, Huggins, Schardl, Howe, & Yoshida, 2014) in r v3.5.3 (R Core Team, 2017). To run kdeTreeS, neighbor-joining trees were produced for each alignment using the bio-nj function in PhAngOrn (Schliep, 2010), assuming the K80 substitution model. Outlier tests were conducted based on branch length distributions in the gene sets.
We used both a species tree approach and a concatenated analysis to estimate phylogenetic relationships among the study taxa. In the species tree approach, independent estimates from each gene are used to jointly infer the evolutionary divergence in a multi-species coalescent framework (Edwards et al., 2016).
This approach accounts for the possibility of incomplete lineage sorting leading to statistical inconsistency in the estimated relationships among taxa, but not for hybridization (which we assume does not occur in our dataset). While some authors have criticized this approach for deep phylogenetic reconstruction (Gatesy & Springer, 2014), investigations of the performance of these methods have shown they are robust to recombination, gene tree heterogeneity, and some uncertainty in nucleotide alignments Wu, Song, Liu, & Edwards, 2013). However, substantial missing data can introduce bias in phylogenetic reconstruction (Wiens, 2006), so we generated a third dataset by combining unique genes from datasets 1 and 2 (to maximize the total number of retained genes) and pruned the data to a subset of genes where all taxa were represented. Separate species tree analyses were conducted for all three datasets in BeAST v2.6.0 (Bouckaert et al., 2019) to estimate individual gene trees. We made several assumptions to reduce the number of free parameters in BeAST. First, we selected the HKY substitution model for each gene partition, allowing for substitution rate variation across nucleotides and nucleotide classes. Additionally, a strict clock model was selected for each gene and the Yule tree prior was chosen. The MCMC chain length was set to 5 × 10 9 steps to ensure enough search time, and the data were stored every 1 × 10 5 steps. TrAcer (Rambaut & Drummond, 2009) was used to assess stationarity of the MCMC chain and to ensure that the effective sample size (ESS) of the parameters was high (>200) after burnin samples were discarded.
A maximum clade credibility tree was estimated for each gene in the BeAST TreeAnnOTATOr module. These gene trees were then used to estimate a species tree in the program ASTrAl-ii v4.10.12 (Mirarab & Warnow, 2015). This method applies the multi-species coalescent framework to quartets of taxa in order to infer the species phylogeny, given the input gene tree bipartitions found in the gene set. Support for each quartet is measured as a local posterior probability value and branch lengths are measured at internal nodes in terms of coalescent units (small values indicate a large amount of gene tree discordance). Tree images were generated for each dataset using the ggTree package (Yu, Smith, Zhu, Guan, & Lam, 2017) in R. To compare differences among trees generated by each dataset, we used the Kendall-Colijn metric (Kendall & Colijn, 2016) of pairwise differences among tree topologies. Trees can be compared by counting the number of edges or the branch lengths between two tips and their most recent common ancestor.
The R package TreeSPAce (Jombart, Kendall, Almagro-Garcia, & Colijn, 2017) was used to compare trees based on topology, as well as branch lengths, using the multidist function.
Phylogenetic analysis of the concatenated amino acid data from dataset 1 (OrThOgrAPh) was conducted as an alternative approach to tree inference. The largest dataset was used, without filtering for taxonomic representation. Orthologous genes/groups were aligned applying the L-INS-i algorithm of mAFFT v.7.221 (Katoh & Standley, 2013) at the translational (amino acid) level. Each multiple sequence alignment (MSA) was assessed and outliers were removed using the procedure outlined by Misof et al. (2014), but using the -addfragments algorithm implemented in mAFFT. Gap-only positions were removed, and ambiguously aligned or randomized MSA sections were identified for each amino acid MSA with AliScOre v.1.2 and removed with AlicUT v.2.3 (Kück et al., 2010;Misof & Misof, 2009).
Phylogenetic relationships were then inferred under the maximum likelihood (ML) optimality criterion as implemented in iq-Tree, using the best model for each gene partition and allowing partitions to have different evolutionary rates (option -ssp). We performed 50 independent tree searches (25 searches with a random start tree and 25 with a parsimony-inferred start tree). To assess the number of unique topologies present within the 50 inferred trees, we used the software UniqUe Tree v.1.9 (T. Wong, unpublished). Node support was estimated via nonparametric bootstrapping of 100 bootstrap replicates in iq-Tree and mapped onto the ML tree with the best log-likelihood. The resulting tree topologies were visualized using the ggTree package in R. As this analysis resulted in alternative ML tree topologies (see Section 3), we applied Four-cluster Likelihood Mapping (FcLM; Strimmer & Von Haeseler, 1997) in combination with permutation analyses (Misof et al., 2014) to determine support for two specific phylogenetic relationships in the Grylloblattodea clade: (a) G. yezoensis and (b) G. pravdini (see Tables S3 and S4 for respective species included in the groups). Alternative topologies might be caused by confounding signal due to among-lineage heterogeneity (heterogeneous composition across amino acid sequences/nonstationary substitution processes) that violate globally stationary (permutation I), a nonrandom distribution of missing data (permutation II), or a mixture of both (permutation III). A detailed explanation of the procedure can be found in Simon, Blanke, and Meusemann (2018). FcLM analyses were performed for the concatenated supermatrix used for the ML analyses using IQ-TREE. We used the lg substitution model for each partition in this analysis.

| Gene expression analysis
In order to identify gene expression responses to temperature exposure, differential expression (DE) of control versus acute thermal stress treatments was examined for each species/population separately using its own de novo transcriptome as reference. We deemed this necessary, as extensive genetic differences among taxa in the focal samples limit our ability to use a single set of reference sequences. As a consequence, we do not present direct statistical comparisons of DE results among taxa. Instead, we compare the annotation of top differentially expressed genes and the statistically significant tests for enrichment of gene ontology terms among taxa.
We focus particularly on species where at least two biological replicates were available for all treatments (K. biedouwense, G. yezoensis, G. pravdini, Grylloblatta bifratrilecta, Grylloblatta sp. "Lillburn Cave," and Grylloblatta sp. "North Cascades" at Whitechuck Mountain), but reference the other species with unbalanced sampling designs or a lack of replicates for generalized patterns.
The following steps were implemented using helper scripts provided in the TriniTy analysis pipeline.

| Positive selection in protein-coding genes
In order to detect protein-coding genes under positive selection, we employed two bioinformatics pipelines: POSigene ( POSigene requires a phylogenetic tree for the branch-site test of positive selection in cOdeml (Yang, 2007;Yang & Nielsen, 2002), and both an anchor species and a target species are designated from the data set. The anchor species serves as the basis for matching isoforms of this species to the best matching isoform of every other species in the data set. The target species denotes the evolutionary branch of the input tree to be tested for positive selection, and multiple species within a clade may be designated as the target species.
We used the species tree from dataset 1 as the input phylogeny, and Grylloblatta bifratrilecta was chosen as the anchor species as it had the most complete data set in Grylloblattodea. The Grylloblattodea clade and the subclade Grylloblatta were chosen as target species in alternative runs of POSigene. In the first step of POSigene, ortholog groups were assigned for all input species using a bi-directional blast and then aligned with clUSTAlw (Larkin et al., 2007). Using the anchor species as the reference, multiple sequence alignment in clUSTAlw was used to select a single matching isoform across species. For isoforms found in a minimum of three species, PrAnk (http://wasab iapp.org/softw are/prank/) was used to align sequences at the codon level. An additional filtering step using gBlOckS was used to remove unreliable alignment regions and to reduce the likelihood of false positives being identified in branch-site tests (Castresana, 2000). cOdeml was then used to conduct branch-site specific tests for positive selection along the designated branches.
Families with at least 15 sequences were tested for positive selection at the site-specific amino acid level using the branch-site test implemented in FUBAr (Murrell et al., 2013).
Genes identified as under positive selection by either method were cross-referenced to look for convergence in the two approaches, and also compared to the list of top differentially expressed genes to determine whether positive selection and gene regulatory evolution operated on the same genes.

| Transcriptome assemblies and ortholog datasets
Transcriptome assemblies varied in size, contig number, and completeness (

| Phylogenetic analysis
Phylogenetic estimates based on species tree analysis (Figure 1) provide strong support for the monophyly of Mantophasmatodea and Grylloblattodea, and consistent relationships among taxa within the Grylloblattodea. The topology inferred from dataset 1 ( Figure 1a) matched the topology estimated in dataset 3 (Figure 1b), although branch support increased as missing data was removed.
Relationships varied in dataset 2 (see the phylogenetic tree from dataset 2 in Figure S1), where the topology differed in one subclade A maximum likelihood (ML) analysis of the concatenated amino acid data from Dataset 1 was run for 50 independent tree searches.
Three alternative tree topologies emerged from these searches, with the best topology ( Figure 2) supported 39 times. The other topologies were recovered eight times and three times, respectively ( Figure S2).
Notably, all three topologies are broadly concordant with the species trees (Figure 1), including relationships within Mantophasmatodea.
One difference in the ML trees and the species trees involves a more  (Table S6). This is in contrast to the species tree which places G.
yezoensis at the base of North American Grylloblatta + G. pravdini.
However, it has to be noted that the number of drawn quartets was low (44 unique quartets). Therefore, the FcLM results might be unreliable without further taxon sampling. We also evaluated the support for a sister-group relationship of G. pravdini to the East Asian  (Table S7).

| Differential expression analysis
Strong patterns of differential expression are observed in the acute heat stress response (when compared to control samples) in the mantophasmatodean K. biedouwense (Figure 3). The most highly differentially expressed genes include ten canonical genes in the heat shock protein (hsp) family related to heat stress (hsp70, hsp68, hsp20.5, alpha-crystallin, and protein lethal(2) essential for life in the hsp20 family; Table S8). The pattern of differential expression was less pronounced in the acute cold stress treatment of K.
biedouwense, but included several genes associated with managing oxidative stress (TAR1 and ART3) and a lysozyme associated with immune function (Table S9). In contrast, differential expression is weak, with few significantly expressed genes in the grylloblattodean species. This is most pronounced in the North American Grylloblatta lineage. Examining the top differentially expressed genes in the acute heat stress treatments, only a single hit in G.
yezoensis is related to heat stress (hsp70 B2; Interestingly, protein degradation (GO:0006508: proteolysis) was both up-and downregulated within and across species, suggesting fine tuning of this pathway.

| Constitutive expression of heat shock proteins
To determine whether constitutive levels of hsp gene expression were simply higher in Grylloblattodea, we compared mean levels of expression for members of the heat shock protein family across treatments (a heat map is shown in Figure 4; the TMM expression values are provided as Table S14). While hsp levels in the control samples were low for genes that were induced during the heat shock response in K. biedouwense (generally <100 TMM; μ TMM = 3.3, σ TMM = 251.4), levels of constitutive expression for some hsp genes were high for G. yezoensis (»200 TMM, μ TMM = 21.8, σ TMM = 488.1).
These represented hsp70 genes, as well as heat shock cognate (hsc) and hsp20 genes that are actively expressed during development.

F I G U R E 2
The maximum likelihood tree inferred from the concatenated amino acid dataset of 3,022 genes. This ML tree received the best log-likelihood score 39 times across 50 independent tree searches. Node support values were estimated based on 100 bootstrap replicates. Branch lengths are unscaled

| Evidence for positive selection on proteins
We tested for evidence of selection on protein-coding genes using two methods. The branch-site test implemented in POSigene focused on orthologous genes found in at least three representative species.
A total of 53 genes were identified as candidates for positive selection at the node of Grylloblattodea, completely overlapping the set of significant genes identified at the node of North American Grylloblatta (Table 2). However, only a subset of these genes (10 of 53, shown in bold type TA B L E 2 Orthologous genes found to be under selection using the cOdeml branch-site test implemented in POSigene. Ratios of dn/ds listed as not applicable ("NA") could not be computed due to a lack of synonymous substitutions. All genes shown were identified as significant within the North American Grylloblatta branch, while genes in bold type include taxa across Grylloblattodea  (Table S15). Grouping gene families by biological function shows a large number of families related to stress or immune response (31 gene families), particularly in protein folding and the proteolytic response pathway (including two hsps).
Other key biological functions include oxygen reduction and cellular respiration (13 gene families), metabolism and development (31 gene families), DNA replication and protein synthesis (43 gene families), and olfaction and nervous system function (seven gene families).
Interestingly, the signature of positive selection on odorant-binding proteins appears to be restricted to the Grylloblattodea. Candidate selection genes did not directly overlap in test results from POSigene and FUSTr, except for one gene (pathogenesis-related protein 5-like) involved in pathogen resistance. Furthermore, there was no overlap in the results from POSigene and the differential expression analysis of thermal stress, while only one ortholog group identified in FUSTr (apolipoprotein D-like protein) overlapped with the differential expression analysis (as significantly downregulated during cold stress in G. sp. "North Cascades" Whitechuck Mountain). However, we did observe broad overlap in gene function among tests, which included  proteolytic activity, lipid metabolic processes (lipophorin activity), regulation of mitochondrial respiration, and protein synthesis.

| D ISCUSS I ON
Here, we tested the question of whether the relict insect lineage Grylloblattodea (Vrsansky et al., 2001;Wipfler et al., 2014) provides molecular evidence of evolutionary constraints in the physiological tolerance of temperature. Previous research has shown that thermal breadth and critical thermal limits do not vary in North American species (Schoville et al., 2015), consistent with constraints on thermal niche evolution. In addition, lineages in Asia have shifted into subterranean habitats with subsequent eye loss where they have persisted in warm climates (Schoville & Kim, 2011). First, we examined whether acute thermal stress activates gene expression differentially in members of Grylloblattodea. We found that Grylloblattodea lack an inducible heat shock response typical of most animal species (Kültz, 2005b), including its sister lineage Mantophasmatodea.
Although these results should be interpreted with some caution, as we did not assay protein activity in ice-crawlers and only focused on rapid responses (~30 min) to acute temperature stress (Evans, 2015), additional evidence from selection on protein-coding genes is consistent with evolutionary constraints due to cold specialization (Logan & Buckley, 2015;Overgaard & MacMillan, 2017;Storey & Storey, 2012). Below we discuss the evidence supporting evolutionary constraints in this relict lineage, as well as insights that transcriptomic analysis provides on the evolutionary diversification of Grylloblattodea and Mantophasmatodea, and conservation of these rare lineages.

| Heat shock responses and cold specialization
Heat shock proteins (hsps) are molecular chaperones that are typically highly conserved in diverse organisms (from bacteria to eukaryotes), with strong purifying selection and gene conversion maintaining nucleotide similarity across divergent lineages (Hess et al., 2018). Similarly, the heat shock response is a canonical part of the cellular stress response (Evgen'ev, 2014;Kültz, 2005a), with upregulation of hsps occurring under stressful conditions to prevent damage to cellular proteins. For example, studies in Drosophila melanogaster have shown that the inducible heat shock genes, particularly the hsp70 family, are critical to heat resistance (Takahashi, Okada, & Teramura, 2011). Mutant lines that lack hsp70 have reduced thermal tolerance and experience high mortality following heat and cold shock (Gong & Golic, 2006;Štětina, Koštál, & Korbelová, 2015). This may be linked to cell cycle arrest and induction of the apoptosis pathways as cellular damage accumulates (Logan & Buckley, 2015). Interestingly, a suite of additional hsps (i.e., hsp20 and hsp90) are upregulated in these hsp70-null lines following acute heat stress (but not acute cold stress), suggesting some redundancy in function among hsps, although these proteins do not fully compensate for the loss of hsp70 function (Bettencourt, Hogan, Nimali, & Drohan, 2008;Štětina et al., 2015). The induction of the heat shock response is also widely observed following cold stress in insects (Storey & Storey, 2011). In contrast, our results show a complete lack of constitutive or inducible hsp expression in two genera within Grylloblattodea (Grylloblattella and its sister lineage Grylloblatta). Although one hsp70 is expressed under heat stress in G. yezoensis, the upregulation of a single gene differs strongly from the dramatic upregulation of multiple hsp genes in the mantophasmatodean K. biedouwense (Figure 3), or heat shock response in other insects (King & MacRae, 2015), as evidenced by enrich-  (Teets, Gantz, & Kawarasaki, 2020). While we favor the interpretation that heat shock responses have been lost in Grylloblattodea (given additional data, see below), we note that further experimental validation is needed.
Alteration of the conserved stress response (Kültz, 2005) to reduce induction of hsps could arise if the pathway is energetically expensive and rarely needed, or alternatively if the hsps are already present in high quantities due to constitutive expression (Tomanek, 2008). As mentioned earlier, stenothermal Antarctic fish similarly show the loss of inducible heat shock responses (Logan & Buckley, 2015), possibly as a result of the disruption of cis-regulatory binding sites (Bogan & Place, 2019). Interestingly, hsps remain active in these fish through constant, constitutive expression (Place & Hofmann, 2005). Similar patterns are found in animals inhabiting xeric environments, which maintain heat shock proteins at physiologically moderate temperatures and thus do not need to rely on a rapid upregulation of hsps during acute thermal stress (Evgen'ev, Garbuz, Shilova, & Zatsepina, 2007). Other species exposed to chronically stressful or fluctuating thermal environments also show constitutive expression of hsps (Tomanek, 2008). For example, fly species from four different families show high constitutive expression, with the level of expression increasing in the most heat-resistant species (Zatsepina et al., 2016). Cold-hardy insects often go through seasonal acclimation that includes accumulation of hsps that can mask an acute cold shock response (Storey & Storey, 2012).
High constitutive expression of hsp70 is seen in G. yezoensis (compared to levels seen in K. biedouwense). Other Grylloblattodea show some high levels of hsc and hsp20 genes, but these genes are known to be constitutively expressed during development. Therefore, it does not appear that constitutive upregulation broadly applies to Grylloblattodea (Figure 4), although we have only examined constitutive levels under a limited set of conditions.
Grylloblattodea species occupy habitats that have constant, cool temperatures throughout the year, such as subterranean, cavernicolous, subnival, and talus environments (Goodrich, 1982;Kamp, 1970;Millar, D. Westfall, & Delany, 2014). This may suggest that constitutive expression of hsp genes is not involved in their cold specialization ability and that the inducible heat shock response may be lost. However, more detailed experimentation will be required to understand the heat shock responses of Grylloblattodea. For example, research on the Antarctic midge, Belgica antarctica (Rinehart et al., 2006), has shown that the ability to exhibit an inducible heat shock response varies across larval and adult life stages, and in other insect species, it can be initiated at different durations following exposure to thermal stress (Storey & Storey, 2012).

| Protein evolution and cold specialization
There is additional evidence that cold specialization has constrained thermal niche evolution in Grylloblattodea. Changes in physiological processes that maximize energy allocation can involve tradeoffs that are challenging to reverse (Angilletta Jr., Wilson, Navas, & James, 2003). Additionally, selection on protein function in cold environments leads to structural changes that impair function at other temperatures (Coquelle, Fioravanti, Weik, Vellieux, & Madern, 2007;Fields, 2001). We found evidence of positive selection on proteincoding genes involved in trehalose transport, metabolic function, cellular respiration, oxygen reduction, oxidative stress response, and protein synthesis. Previous research has proposed that many of these molecular pathways related to survival in cold climates (Storey & Storey, 2012).
For example, a recent review of molecular adaptation to chill tolerance in insects (Overgaard & MacMillan, 2017) suggests that taxa should undergo structural changes in amino acid composition for genes encoding lipid membrane composition, osmolyte production (e.g., trehalose), and regulation of protein synthesis, in addition to selection on thermal sensitivity of key enzymes involved in cell homeostasis like gated ion channels and ion transporters. Cell membrane function and ion homeostasis, as well as the enhancement of antioxidant defenses and suppression of the apoptosis pathway, are generally viewed as key physiological requirements for cold tolerance in insects (Overgaard & MacMillan, 2017;Storey & Storey, 2012).
Selection on protein-coding genes in the Grylloblattodea trehalose pathway (Tret1-2) is not surprising, given that trehalose production is known to protect cellular macromolecules and stabilize the membrane lipid bilayer in other cold-specialized insects (Storey & Storey, 2012). A variety of techniques, including genetics and metabolomics, has confirmed that trehalose not only serves as a stress protectant, but also promotes starvation resistance (Hibshman et al., 2017). Furthermore, knock-out studies in D. melanogaster have shown that trehalose is also required for body water homeostasis and desiccation tolerance in some insects (Yoshida, Matsuda, Kubo, & Nishimura, 2016).
We also found results indicating pervasive selection on mitochondrial function, including genes involved in cellular respiration, mitochondrial protein synthesis, protein folding, and autophagy, as well as oxidoreductase activity. This is noteworthy, as many freeze-avoidant insects show reorganization of mitochondrial metabolism to maintain aerobic respiration at cold temperatures (Storey & Storey, 2012). Selection on these pathways extends beyond insects to other poikilotherms, as the extensive literature on stenothermal fish has shown similar requirements. Adaptive regulatory changes occur in pathways involving protein synthesis, protein folding, and protein degradation (especially via the ubiquitination pathway), as well as in pathways involving lipid metabolism, mitochondrial function, antioxidant response, anti-apoptosis, and immunity (Logan & Buckley, 2015). In results that parallel ours, a recent study of cold specialization in alpine stick insects found selection on sulfotransferase as evidence for metabolic adaptation (Dennis, Dunning, Sinclair, & Buckley, 2015).
Do these molecular changes in protein-coding genes imply that there are constraints that limit thermal niche evolution? Performance at cold temperature is predicted to require amino acid changes that influence thermal sensitivity and activity of proteins (Overgaard & MacMillan, 2017), and thus, the patterns of positive selection that we observe likely include trade-offs with respect to protein function in warmer thermal niches. In stenothermal Antarctic fish, cold specialization has led to evolutionary losses of key proteins that likely impair physiological processes at warm temperatures (Beers & Jayasundara, 2015). Comparative analysis of some of our selected proteins in other cold-specialized species suggests that this might be the case. Several candidate genes in Grylloblattodea have appeared in functional biochemical studies of cold adaptation in other organisms. For example, structural changes in tubulin alpha-1 have been linked to the efficiency of microtubule assembly in Antarctic fish and algae (Detrich, Parker, Williams, Nogales, & Downing, 2000;Willem et al., 1999), and their function is impaired at warm temperatures. Similarly, amino acid changes in Peptide Transporter Family 1 appear to increase protein flexibility at subzero temperatures and enhance nutritional sequestration of animal protein in icefish (Rizzello et al., 2013), but to result instability at higher temperatures.
Additional work is needed to understand if the amino acid changes we observe in Grylloblattodea specifically enhance protein function at cold temperatures.

| Evolutionary diversification of Mantophasmatodea and Grylloblattodea
Our phylogenetic analyses of protein-coding genes provide important insight into the evolutionary relationships of these poorly studied insect lineages. For Mantophasmatodea, the phylogenetic analyses suggest an early split into two major clades, corresponding to a range disjunction formed by an arid region separating northern and central Namibia from southern Namibia and South Africa (Eberhard et al., 2018). This biogeographic division is consistent with a previous phylogenetic analysis using peptidomics (Predel et al., 2012), and seen in other vertebrate and invertebrate groups (Griffin, 1998a(Griffin, , 1998b. Furthermore, our inferred relationships (Figures 1 and 2) are consistent with the peptidomics phylogeny, and largely consistent with a phylogeny based on mitochondrial data (Damgaard, Klass, Picker, & Buder, 2008;Dool, Künzel, Haase, Picker, & Eberhard, 2017). The mitochondrial phylogeny differs in the placement of T. gladiator as the most basal lineage (perhaps due to different taxon sampling), and placing Austrophasmatidae sp. 1 as sister to Austrophasmatidae sp. 2 rather than K. biedouwense in our analysis. Further work will be required to understand the source of the discrepancies with this mitochondrial dataset.
For the Grylloblattodea, the transcriptome data provide the first well-supported phylogeny of relationships among genera, as earlier reconstructions resulted in considerable uncertainty in deeper nodes of the phylogeny (Jarvis & Whiting, 2006;. In the species tree analysis of our three datasets Galloisiana paraphyletic and suggesting this may be an undescribed genus (Schoville, 2014). There is, however, disagreement about the placement of this lineage as sister to Grylloblattella + Grylloblatta ( Figure 1), or sister to Galloisiana + Grylloblattina (Figure 2; Figure S2).
FcLM analysis of the concatenated data supports the latter placement. Further taxonomic sampling may improve estimates of these relationships. Finally, our results of relationships within Grylloblatta are consistent with a recent 322 gene nuclear phylogeny of 96 Grylloblatta samples (Schoville et al., 2019), with the exception that G. sp. "Sierra Buttes" is highly nested within California species in the transcriptome species tree (Figure 1).
Analysis of other candidate proteins may shed additional light into the diversification of Mantophasmatodea and Grylloblattodea.
While the lineages occupy highly divergent habitat and thermal niche space, they are notable among Polyneopteran insects in sharing complete wing loss. It is unclear if winglessness is a synapomorphy, or an independently derived character state in both lineages (Eberhard et al., 2018). Though fully winged fossils have been assigned to Grylloblattodea (Storozhenko & Aristov, 2014;Yingying, Béthoux, Chungkun, & Dong, 2010), there is some controversy over the placement of these fossils as ancestors to extant Grylloblattodea due to the lack of diagnostic synapomorphies (Wipfler et al., 2014). It is therefore noteworthy then that one of the top candidate selection genes identified in Grylloblattodea, protein lingerer, is involved in developing wing tissues in Drosophila (Baumgartner, Stocker, & Hafen, 2013). Further work will be needed to understand if this is linked to winglessness in Grylloblattodea and provides evidence of independent wing loss relative to Mantophasmatodea. Another interesting result is the pattern of Grylloblattodea-specific selection on multiple odorant-binding proteins, which may signal unique chemosensory specializations (Robertson, 2019). For example, odorant-binding proteins have been linked to adaptive shifts in food preferences in novel environments (Wada-Katsumata, Robertson, Silverman, & Schal, 2018). It is possible, however, that some of these candidate genes have strongly divergent coding sequences due to pseudogenization. Pseudogenes can be expressed as RNA for multiple gene regulatory functions, including suppression of molecular pathways (Tutar, 2012). The development of genome resources in Grylloblattodea would aid in more detailed examination of these candidate selection genes.

| Implications for conservation of Grylloblattodea
Relict species are often valued in conservation decision making due to their phylogenetic uniqueness, endemism and rarity (Winter et al., 2013). These species are also associated with greater extinction risk, as extinction risk appears to increase with taxon age (Warren et al., 2018). These species should warrant special protected status if they are exceptionally vulnerable to environmental change. Here, we showed that cold specialization in Grylloblattodea has been accompanied by molecular changes to core physiological processes that create evolutionary constraints. Species of Grylloblattodea persist in small patches of suitable habitat (Schoville et al., 2019), and because they are poorly dispersing and have narrow ecological niches, they are unlikely to have the capacity to move or evolve in response to the rapid rate of contemporary climate change (Moritz & Agudo, 2013). As a result, along with other cold-specialized species that inhabit alpine and cave environments, they should be considered "exceptionally at risk" of decline under climate change projections (La Sorte & Jetz, 2010;Mammola, Goodacre, & Isaia, 2018;Ohlemüller et al., 2008). Much more effort needs to be made to understand their ecology and population dynamics to aid in conservation management (Schoville & Graening, 2013).

ACK N OWLED G EM ENTS
We are grateful to members of the 1KITE project, who contributed resources to this paper. The 1KITE project was supported by China National GeneBank and BGI-Shenzhen. This research was supported by grants to SDS from the National Science Foundation

DATA AVA I L A B I L I T Y S TAT E M E N T
Raw sequence data and transcriptome assemblies have been deposited at the National Center for Biotechnology Information (sample numbers are listed in Table 1).