Transcriptomic analysis of sweet potato under dehydration stress identifies candidate genes for drought tolerance

Abstract Sweet potato (Ipomoea batatas [L.] Lam.) is an important subsistence crop in Sub‐Saharan Africa, yet as for many crops, yield can be severely impacted by drought stress. Understanding the genetic mechanisms that control drought tolerance can facilitate the development of drought‐tolerant sweet potato cultivars. Here, we report an expression profiling study using the US‐bred cultivar, Beauregard, and a Ugandan landrace, Tanzania, treated with polyethylene glycol (PEG) to simulate drought and sampled at 24 and 48 hr after stress. At each time‐point, between 4,000 to 6,000 genes in leaf tissue were differentially expressed in each cultivar. Approximately half of these differentially expressed genes were common between the two cultivars and were enriched for Gene Ontology terms associated with drought response. Three hundred orthologs of drought tolerance genes reported in model species were identified in the Ipomoea trifida reference genome, of which 122 were differentially expressed under at least one experimental condition, constituting a list of drought tolerance candidate genes. A subset of genes was differentially regulated between Beauregard and Tanzania, representing genotype‐specific responses to drought stress. The data analyzed and reported here provide a resource for geneticists and breeders toward identifying and utilizing drought tolerance genes in sweet potato.

tolerance among sweet potato germplasm suggests that drought tolerance loci can be introgressed into drought-sensitive cultivars. Thus, understanding the genetic and molecular basis of drought tolerance will aid in breeding for high-performing, drought tolerant sweet potato cultivars.
Drought stress induces a set of physiological responses with stomatal closure to reduce water loss through transpiration being the most immediate and important (reviewed in Osakabe, Osakabe, Shinozaki, & Tran, 2014). At the onset of drought, abscisic acid (ABA) levels are elevated, initiating signaling pathways that modulate guard cell osmolarity to achieve stomatal closure. As a result, components of the ABA signaling pathway as well as guard cell ion pumps are essential for proper stomatal closure during drought, leading to drought tolerance. However, control of stomatal closure during drought involves compromising between limiting water loss and maintaining CO 2 uptake for photosynthesis, which ultimately impacts crop yield. Thus, understanding the intricate processes that modulate drought responses such as stomatal closure and resiliency of photosynthetic rates during drought are crucial aspects of optimizing drought tolerance in crop plants.
Beauregard is a US-bred variety with low dry matter and orange flesh due to high levels of carotenoids (Rolston et al., 1987), while Tanzania is a popular Ugandan landrace with high dry matter and cream flesh (Mwanga et al., 2001). Though a Tanzania × Beauregard population (Cervantes-Flores, Yencho, Kriegner, et al., 2008) has been used to map quantitative trait loci (QTL) for root-knot nematode resistance, dry matter, starch, and β-carotene content (Cervantes-Flores, Yencho, Pecota, Sosinski, Cervantes-Flores et al., 2011), drought tolerance traits have not yet been investigated in this population. However, we have an ongoing project to map QTL for drought-related traits using a population derived from the reciprocal cross, Beauregard × Tanzania. Promisingly, Kivuva et al. (2015) in a study involving 84 total genotypes showed that Beauregard differs from Tanzania in several drought response traits.
Greenhouse-grown plantlets derived from cuttings of Tanzania stay green longer under drought conditions than those from Beauregard by 6-7 days, suggesting greater drought tolerance in Tanzania (Supporting Information Figure S1a). Mature plants grown at two different sites in Kenya under irrigated and rain-fed conditions resulted in a larger decrease in chlorophyll content in Tanzania compared to Beauregard, yet the number of storage roots and the fresh storage root weight were more sensitive in Beauregard than in Tanzania (Supporting Information Figure S1b-d).
A few studies have examined global transcriptomic responses for drought-related traits in different Ipomoea species. Yang et al. (2018) recently reported a three-way combined de novo transcriptome for whole plants of PEG-treated Ipomoea triloba, I. batatas (cv. Kokei No. 14) and a somatic hybrid between the two species, finding genes that are differentially regulated in the I. batatas parent compared to either I. triloba or the hybrid, which are more drought tolerant. Other de novo transcriptome approaches have been reported for hexaploid Ipomoea trifida and Ipomoea imperati for drought and salt stress, respectively (Peng et al., 2017;Solis, Baisakh, Brandt, Villordon, & Bonte, 2016). In this study, global gene expression patterns in leaf tissue of two varieties of I. batatas, Beauregard, and Tanzania, during PEG-simulated drought stress were characterized with RNA-Sequencing (RNA-Seq); diploid I. trifida, which is closely related to I. batatas and is a putative progenitor species (Austin, 1988;Kobayashi, 1983), was used as the reference genome sequence (http://sweetpotato.plantbiology.msu.ed u/gt4sp_download.shtml). Differentially expressed genes (DEGs) were identified, revealing candidate genes for improving drought tolerance in sweet potato. In addition, genes with unique expression patterns in the two cultivars were classified into co-expression clusters showing enrichment for biological processes that could contribute to phenotypic differences in their drought responses. Data from this study will be useful in identifying candidate genes within drought tolerance QTL and improving drought tolerance in this key food security crop.

| Determination of optimal conditions to induce dehydration stress
In vitro plants of Beauregard (CIP440132) and Tanzania (CIP440166) were obtained from the International Potato Center (CIP) Genebank and nodes were cultured to form plantlets on MPB medium (Murashige and Skoog salts, 3% sucrose, 2 mg/L calcium pantothenate, 100 mg/L L-arginine, 200 mg/L ascorbic acid, 20 mg/L putrescine-HCl, 10 mg/L GA 3 , 0.3% Phytagel, pH 5.7, autoclaved at 121°C for 15 min as recommended by the supplier) in culture tubes (three per tube). Shoot cultures were kept in a growth room at 27 ± 2°C, 3,000 lx, and 12 hr light/12 hr dark.
Cultures were maintained in a growth chamber under the conditions described above and stress symptoms were evaluated at 4, 24, 48 hr, and 21 days after stress application.

| Plant growth conditions for RNA-Seq
For both varieties, one node was cut from the donor plant and placed in a 16 × 125 mm tube containing solid MPB media. Only the internode was submerged into the solid media to allow the node to grow. Fourteen days after planting, six plantlets with similar size were selected and transferred to another vessel containing liquid MPB media. A thin plate of acrylic plastic containing small holes was used to support plantlets and to allow leaf expansion. Three biological replicates and three controls were used. Cultures were maintained in the growth chamber under the same conditions as described above. At 21 days after planting, the initial liquid media was removed and replaced with liquid media containing 25% PEG in liquid MPB medium. Leaf samples were collected for RNA extraction at 24 and 48 hr after stress (HAS) imposition, flash frozen, and stored at −80°C. RNA extraction was performed with 200 mg of leaf tissue for each sample using TRIzol (Invitrogen) according to supplier instructions.
Cufflinks (v2.2.1; Trapnell et al., 2010) was used to calculate fragments per kilobase exon model per million mapped reads (FPKMs) for the set of high-confidence I. trifida gene models (v3) using the options: -multi-read-correct, -min-intron-length 10, -maxintron-length 5000, -library-type fr-first strand. One technical replicate was removed because of a markedly lower alignment rate compared to other technical replicates. Transcripts itf01g31740.t1, itf02g16070.t2, itf03g01940.t1 were removed because these had identical coordinates as another isoform of the same gene. Correlations between different sequencing runs of the same library were quantified using Pearson Correlation Coefficients from FPKMs in R (pairwise.complete.obs, method = pearson). Technical replicates were highly correlated, with most correlation coefficients above 0.95 and others above 0.90. The alignment files for technical replicates were merged using the MergeSamFiles tool in PicardTools (v2.1.1; https:// broadinstitute.github.io/picard/), and used for calculating FPKMs for each biological replicate (Supporting Information Table S2) and for differential expression analysis.
Read counts were used to detect differential expression with DESeq2 v1.16.1 (Love, Huber, & Anders, 2014). For differential expression between control and PEG treatment at each time-point in each variety, unique combinations of treatment, time-points, and variety were treated as a single factor, and the contrasts function in DESeq2 was used to test whether log2 fold change was equal to 0 for each pair of control and PEG treatment levels.
An adjusted p-value cut-off of 0.05 and a log2 fold-change threshold of 1.5 were used. The "lfcShrink" function was used to restrain high log2 fold changes for genes with low expression levels.
To detect genes that responded differentially to PEG between Beauregard and Tanzania at either time-point or both time-points, the full model,~treatment + variety + time_point + variety:time_point + treatment: time_pt + treatment:variety:time_point, was tested against a reduced model of,~treatment + variety + time_point + variety:time_point + treatment:time_pt using a likelihood ratio test implemented in DESeq2 and an adjusted p-value cut-off of 0.01. For K-means clustering, log2 fold changes calculated by DESeq2 were obtained for significant genes and mean-centered by subtracting the arithmetic mean for each gene. K-means clustering was conducted using the "kmeans" function in R with the settings, centers = 16, nstart = 25, iter.max = 100, algorithm = "MacQueen" (MacQueen, 1967). A K of 16 was chosen because relatively little reduction in total within-cluster variation was observed for larger values.

| Gene ontology term enrichment analysis
Groups of DEGs or clusters with similar expression patterns were tested for enriched gene ontology (GO) terms using the "weight01" algorithm and Fisher's exact test implemented in topGO v2.28.0 (Alexa & Rahnenfuhrer, 2016) using all genes with GO terms as the background. p-values were corrected for multiple testing using the "p.adjust" function in R to implement the FDR method (Benjamini & Hochberg, 1995), and filtered at the 0.05 level.

| Global gene expression profiles are conserved between Beauregard and Tanzania
To assess the quality of the biological replicates and the extent of   (Carpena, 2009), these two varieties are likely adapted to different environmental conditions, and shared genetic responses to PEG may be involved in critical drought tolerance or survival mechanisms. At 24 HAS, 2,216 and 1,809 genes were upregulated while 3,798 and 3,318 genes were downregulated in Beauregard and Tanzania, respectively. At 48 HAS, numbers of upregulated genes were comparable to 24 HAS, but downregulated genes decreased to 3,285 and 2,482. Among genes differentially expressed in the same direction in both genotypes, similar GO terms were enriched at 24 and 48 HAS. Consistent with a drought response, "water deprivation," "response to ABA," and "nitrate transport" were among the top biological processes (Figure 2b). Enrichment for the "response to ABA" GO term is expected because drought utilizes ABA signaling to induce stomatal closure in order to reduce water loss via transpiration (reviewed in Osakabe et al., 2014). An important part of this ABA signaling pathway is the trans-   transport" GO term may be involved in the accumulation of free amino acids such as proline that are hypothesized to help maintain cell turgor by decreasing water potential inside cells (Stewart & Lee, 1974). Upregulated genes enriched for the "water deprivation" GO term (Supporting Information Dataset S1) included well-known drought tolerance genes such as DEHYDRATION-RESPONSIVE ELE-MENT-BINDING PROTEIN 2A (DREB2A) which encodes a droughtresponsive transcription factor (Liu et al., 1998;Sakuma et al., 2006) and LATE EMBRYOGENESIS ABUNDANT (LEA), a family of stressinducible genes that encode proteins that have been implicated in diverse drought tolerance mechanisms such as preventing protein aggregation and membrane protection during dehydration (Goyal, Walton, & Tunnacliffe, 2005;Tolleter et al., 2007). proliferation" and the "response to auxin," an important regulator of cell and tissue growth. Leaves, the tissue that RNA was extracted from for this study, are often more susceptible to reduced growth rate compared to roots and it is hypothesized that reduced leaf area helps limit water loss due to transpiration (Aguirrezabal et al., 2006;Pace, Cralle, El-halawany, Cothren, & Senseman, 1999;Xu et al., 2015). These overall expression patterns are consistent with a drought response and indicate that dehydration stress was successfully imposed by the PEG treatment.

| A group of leucine-rich kinases are downregulated at 24 HAS
For both directions of regulation, hundreds of DEGs were unique to a particular time-point in each variety, indicating that dynamic regulatory changes take place between 24 HAS and 48 HAS (Figure 3a,b).
To examine their biological significance, GO term enrichment was tested for the DEGs that were specific to a particular time-point and

| Upregulated candidate genes include a LEA gene and ABA metabolism genes
We focused on the 21 upregulated (itf12g21010 was downregulated at 24 HAS in Beauregard) and 16 downregulated drought candidate genes that were differentially expressed at all four variety-time point combinations, reasoning that these genes are strong candidates to be involved in drought stress responses (Figure 4b; Supporting Information Figure S2). A TAS14 homolog (itf03g07310) was upregulated by >8 log2 fold change across all four variety-time points combinations, substantially higher than any other drought tolerance candidate gene (Figure 4b). TAS14 is a LATE EMBRYOGENESIS ABUNDANT (LEA) protein of the dehydrin subfamily and its expression is induced by salt, ABA or mannitol treatment in tomato (Godoy, Pardo, & Pintor-Toro, 1990 (Kang et al., 2010;Kuromori, Sugimoto, & Shinozaki, 2011;Kuromori et al., 2010). Besides the transport of ABA, its metabolism was also implicated in dehydration response in sweet potato. NCED encodes 9-cisepoxycarotenoid dioxygenase, an ABA biosynthesis enzyme whose induction correlates with increases in ABA during drought stress (Iuchi et al., 2001). Conversely, CYP707A1 and 3 are ABA catabolic enzymes that are induced by both dehydration and rehydration in Arabidopsis, allowing controlled ABA responses and rapid disengagement of ABA responses, respectively (Kushiro et al., 2004;Saito et al., 2004;Umezawa et al., 2006 Jansson, 1999) are two such genes and, intriguingly, these two genes lack paralogs in I. trifida, suggesting that their down-regulation during dehydration stress cannot be compensated by the activity of other genes (Figure 4c). KAT2 was another gene downregulated at all four variety-time point combinations that has no paralogs, but the downregulation of this gene, encoding an inwardly rectifying K + channel, is consistent with the need for net K + efflux from guard cells during stomatal closure (Schroeder, Hedrich, & Fernandez, 1984).
In contrast, the downregulation of LHCB6 and SLAC1 may have detrimental effects on stomatal closure during drought, and restoring their functions by overexpression may improve drought tolerance in sweet potato. LHCSB6 encodes a protein associated with photosystem II (Jansson, 1999) that, in Arabidopsis, inhibits ABA-mediated stomatal closure when expression is knocked down and enhances ABA-mediated stomatal closure when overexpressed (Xu et al., 2012). Due to its role in mediating stomatal closure, the knockdown mutant of LHCSB6 has increased drought sensitivity, with visibly more severe reductions in leaf size and greenness compared to wildtype (Xu et al., 2012). SLAC1 is an anion efflux channel protein that is essential for proper stomatal closure in response to many factors, including ABA (Negi et al., 2008;Vahisalu et al., 2008). In Arabidopsis, leaf expression of SLAC1 is restricted mainly to guard cells (Negi et al., 2008;Vahisalu et al., 2008) and is upregulated 1 to 1.6 log2 fold change by drought (Wang et al., 2016). In contrast, we observed a strong downregulation (−2.8 to −3.6 log2 fold change) of SLAC1 in sweet potato leaves in response to PEG (Supporting Information Figure S2; Supporting Information Table S3).

| Varietal differences in differentially expressed genes under dehydration stress
Differential gene expression analysis for each individual variety-time point combination suggested that there are~2,000 PEG-responsive genes unique to each variety (Figure 2a). We hypothesize that expression differences in some of these genes contribute to the differences in days to permanent wilting point, and drought-induced reductions in chlorophyll content and storage roots observed between the two varieties (Supporting Information Figure S1). To statistically identify genes that are regulated differentially between Beauregard and Tanzania in response to PEG, genes were tested for significant interaction effects between PEG treatment and variety, either as two-way interactions or as three-way interactions with time. The significant genes were classified into co-regulated clusters using a K-means approach in order to uncover expression patterns.

Time point
Beauregard Tanzania F I G U R E 5 Genes differentially regulated between Beauregard and Tanzania in response to polyethylene glycol (PEG). (a) Representative Kmeans clusters for mean-centered (per gene) log2 fold changes (PEG/control) for Beauregard and Tanzania at 24 and 48 hr after stress. The number of genes in each cluster is indicated in parentheses. (b) Top three most significant biological processes gene ontology terms for each K-means cluster 2,502 genes used for clustering, had marked differences in expression pattern between the two varieties consistent with distinct genetic responses to dehydration stress.
Additionally, these expression differences between the two varieties were dramatically different at the two time-points in clusters 11, 13, and 15, emphasizing again that dynamic regulatory changes occur between the two time-points. Cluster 11 was not significantly enriched for any biological process GO term, but contained an ortholog of OST2, which encodes an H + -ATPase proton pump that is involved in ABA-mediated stomatal closure in response to drought (Merlot et al., 2007;Figure 5b; Supporting Information Table S3).
Three OST2 orthologs were significant for interaction between treatment and variety effects (the other two were in clusters 3 and 6) and except for 24 HAS for itf03g23220, higher expression was observed for Beauregard for all data points suggesting that reduced OST2 activity in Tanzania may contribute to its higher days to permanent wilting ( Figure 1a, Supporting Information Figure S4). In contrast to cluster 11, clusters 13, and 15, which had increased expression in Tanzania at 48 HAS compared to 24 HAS but had either similar or weakened expression at 48 HAS compared to 24 HAS in Beauregard, were weakly significant for the biological process GO term "hydrogen peroxide-mediated programmed cell death" (Figure 5b; Supporting Information Dataset S3). This suggests that drought-induced hydrogen peroxide (Noctor, Veljovic-Jovanovic, Driscoll, Novitskaya, & Foyer, 2002) causes more cell death in Tanzania. However, the overall effect of hydrogen peroxide in Tanzania could potentially not be detrimental because hydrogen peroxide is an important signal molecule for proper ABA-mediated stomatal closure (Pei et al., 2000).
Clusters 1, 2, 4, and 6 were enriched for photosynthesis-related GO terms and contained genes that were, in general, expressed higher in Beauregard or at comparable levels in both varieties (Figure 5). Cellular compartment GO term analysis supported an enrichment for genes involved in photosynthesis in these clusters, with significant GO terms for chloroplast thylakoid membrane, stroma and envelope (Supporting Information Figure S5a). Clusters 2, 4, and 6 were also enriched for the pentose-phosphate shunt biological process GO term (Figure 5b). One of the products of the pentose-phosphate shunt is NADPH (Kruger & von Schaewen, 2003), which is required for the anti-oxidative function of glutathione reductase that may help quench reactive oxygen species (ROS) produced during drought that could damage chloroplast membranes (Das & Roychoudhury, 2014). Together, the higher expression of photosynthesis and pentose-phosphate shunt genes in clusters 1, 2, 4, and 6 in Beauregard may contribute to its smaller reduction in chlorophyll content compared to Tanzania plants in response to drought (Supporting Information Figure S1b).
Cluster 8 genes were on average regulated at~3 log2 fold higher in Tanzania than in Beauregard at both 24 and 48 HAS (Figure 5a).
Interestingly, cluster 8 was most significant for enrichment of the GO term, "anthocyanin-containing compound biosynthesis" (Figure 5b). Anthocyanin production is triggered by various biotic and abiotic stresses, including drought stress (Kovinich et al., 2014).
While the function of stress-induced anthocyanin is still debated, overexpression of anthocyanin biosynthesis improves drought resistance and the leading hypothesis is that it is involved in removing ROS (Gould, McKelvie, & Markham, 2002;Kovinich, Kayanja, Chanoca, Otegui, & Grotewold, 2015;Nakabayashi et al., 2014). Because ROS overaccumulation can damage not only cellular membranes but also nucleic acids and proteins, the increased expression of anthocyanin-containing biosynthesis genes in Tanzania may contribute to its extended days to permanent wilting point compared to Beauregard (Supporting Information Figure S1a) by helping to limit levels of damaging ROS during drought stress.

| CONCLUSION S
We present here an expression profiling resource for drought stress experiments in sweet potato and have demonstrated its utility for generating hypotheses for functional genomics. Overall expression patterns were conserved between Beauregard and Tanzania, and consistent with drought responses reported in the literature, with a general upregulation of ABA signaling components and downregulation of tissue growth processes. These observations corroborate the quality of our data and give weight to the other aspects of our analysis, including several interesting findings. We have identified a group of LRR kinases that are down-regulated at 24 HAS but not at 48 HAS suggesting their reduced activities at 24 HAS have important roles in dehydration response. Interestingly, we observed downregulation of single-copy LHCSB6 and SLAC1 orthologs in both varieties at both time-points. These two genes encode effector proteins, a chlorophyll-binding component of PS II and a guard cell anion efflux protein, respectively, that have been shown to be highly important for stomatal closure during drought stress in Arabidopsis and we present these as strong candidates for overexpression experiments in sweet potato. Co-regulated clusters of genes involved in photosynthesis and the pentose-phosphate pathway that is expressed higher in Beauregard may contribute to its chlorophyll content being more resilient to drought than Tanzania. On the other hand, a separate co-regulated gene cluster involved in the production of anthocyanin-containing molecules may help increase the number of days to permanent wilting in Tanzania.