Insights into gene regulation of the halovirus His2 infecting Haloarcula hispanica

Abstract Gene expression in Haloarcula hispanica cells infected with the gammapleolipovirus His2 was studied using a custom DNA microarray. Total RNA from cells sampled at 0, 1, 2, 3, and 4.5 hr postinfection was reverse‐transcribed into labeled cDNA and hybridized to microarrays, revealing temporal and differential expression in both host and viral genes. His2 gene expression occurred in three main phases (early, middle, and late), and by 4.5 hr p.i. the majority of genes were actively transcribed, including those encoding the major structural proteins. Eighty host genes were differentially regulated ≥twofold postinfection, with most of them predicted to be involved in transport, translation, and metabolism. Differentially expressed host genes could also be grouped into early‐, middle‐, and late‐expressed genes based on the timing of their up‐ and downregulation postinfection. The altered host transcriptional pattern suggests regulation by His2 infection, which may reprogram host metabolism to facilitate its own DNA replication and propagation. This study enhances the characterization of many hypothetical viral genes and provides insights into the interaction between His2 and its host.

Identifying and understanding the interplay of viral and host factors during cell entry, replication, and egress is critical to deciphering the events that determine the fate of infection. The majority of the archaeal viruses isolated so far contain dsDNA as the genetic material and infect halophilic or hyperthermophilic host species (Munson-McGee et al., 2018; Prangishvili, Forterre, & Garrett, 2006).
Haloarcula virus His2 (family Pleolipoviridae) infects Haloarcula hispanica (Bath, Cukalac, Porter, & Dyall-Smith, 2006) and is currently the only member of the genus Gammapleolipovirus (Krupovic et al., 2018; Pietilä et al., 2012). Virions are pleomorphic and possess a lipid membrane with two exposed spike proteins (VP28, VP29) and two minor membrane-associated proteins (VP27, VP32) (Pietilä et al., 2012). The virus genome is dsDNA, 16 kb in length with long inverted terminal repeats and terminal proteins, and is predicted to encode a putative type B DNA-dependent DNA polymerase among its 35 annotated ORFs (Bath et al., 2006). Genome replication is most likely by protein-priming.
At the nucleotide level, His2 shows little similarity to other viruses while at the predicted protein level His2 shows mixed relationships, with the DNA polymerase (His2V_gp14) being similar to that of the spindle-shaped virus His1 (Salterprovirus) while the spike protein (VP29; His2V_gp29) and the AAA ATPase (His2V_gp33) share similarity to the corresponding proteins of betapleovirus HHPV3 (Demina, Atanasova, Pietilä, Oksanen, & Bamford, 2016).
In single-step growth studies, virus release begins at around 3 hr postinfection (p.i.) and exit is thought to occur continuously via budding through the cell membrane, as suggested by the retardation of host cell growth concurrent with lipid acquisition by the virus (Bath et al., 2006; Pietilä et al., 2012; Quemin et al., 2016). The lack of cell lysis by His2 (Svirskaite, Oksanen, Daugelavicius, & Bamford, 2016) is a characteristic shared with other haloviruses, such as SH1 (Porter et al., 2005), as well as with members of Fuselloviridae such as SSV1 (Fröls, Gordon, Panlilio, Schleper, & Sensen, 2007) and STSV1 (Porter et al., 2005; Xiang et al., 2005). Its mode of replication appears to be very different from the well-studied lytic infections of model bacterial caudoviruses such as T4 (Desplats & Krisch, 2003) or T3 (Krüger & Schroeder, 1981).
The host species of His2 is the extremely halophilic archaeon Har. hispanica (Class Halobacteria, family Haloarculaceae), which was isolated from a solar saltern in Spain and grows optimally at 25% (w/v) salinity (Juez, Rodriguez-Valera, Ventosa, & Kushner, 1986). It is an aerobic heterotroph, and like many haloarchaea, the cells of this species have a simple cell envelope consisting of the cell membrane enclosed by a thin, paracrystalline protein layer (S-layer). The genome sequence of Har. hispanica has been determined (Ding, Chiang, Hong, Dyall-Smith, & Tang, 2014; Liu, Zhenfang, et al., 2011), and methods for genetic manipulation are available (Liu, Han, Han, Liu, Zhou, & Xiang, 2011), making this species an attractive model for studying the dynamics of virus infections in haloarchaea.
In the ongoing struggle between viruses and hosts, host cells develop mechanisms to defend against virus predation while viruses evolve to evade host defenses. One approach to gaining more insight into virus-host interactions is to measure and analyze differential gene expression using the microarray technique. This has been used to study archaeal viruses of Sulfolobus, such as the fusellovirus SSV1 (Fröls et al., 2007) and the icosahedral virus STIV (Ortmann et al., 2008), allowing the global surveillance of host and virus genes over the infection cycle and revealing differentially regulated gene expression. In more recent studies of Sulfolobus viruses, microarrays were used to examine rudivirus SIRV2 infection (Okutan et al., 2013); the dynamics and interplay between the fusellovirus SSV2, plasmid pSSVi, and host genes (Ren, She, & Huang, 2013); and gene expression in SSV1- and SSV2-lysogens as well as in cells coinfected by both viruses (Fusco, She, Fiorentino, Bartolucci, & Contursi, 2015).
One insight from these studies is that SSV1 infection does not induce major changes in host (Sulfolobus) gene expression (Fröls et al., 2007; Fusco et al., 2015), which is consistent with the continued (but reduced) growth of the host while the virus is constantly shed.
In this study, we monitored changes in the expression of His2 and Har. hispanica genes during the infection cycle using a microarray-based approach. Temporal expression and differential regulation of both viral and host genes were observed and supported the idea that His2 infection, at least over the first 4.5 hr, has a relatively low impact on host gene expression.

| Viral infection
His2 virus stocks were produced by infecting early exponential phase Har. hispanica strain N601 cultures (OD600 = 0.2) with the virus at a multiplicity of 1:10. For preparing infected-cell RNA, early exponential phase Har. hispanica cells grown in 23% MGM medium at 37°C were collected by centrifugation at 5,000 g for 15 min at room temperature, the supernatant discarded, and the pellet resuspended in 18% MGM medium containing His2 virus (10^10 PFU), with a multiplicity of infection (MOI) in the ratio of 10^8:10^9 (cells:virus).
Mixtures were incubated for 15 min at 37°C to enable viral infection, after which the cells were pelleted by centrifugation at 5,000 g for 15 min at room temperature and the supernatant discarded. Cells were then washed twice with fresh 18% MGM medium, and the final pellet was resuspended in 100 ml of 18% MGM medium and incubated at 37°C with slow shaking (100 rpm). Samples of about 1 ml were taken at 0 (after the adsorption, washing, and collection of the samples), 1, 2, 3, and 4.5 hr postinfection (p.i.) for RNA extraction, and additional 1 ml samples were taken at the same times to determine the virus titer by plaque assay (Dyall-Smith, 2009). Culture samples were frozen in liquid nitrogen until further extraction. We used the T0 sample as the reference for this analysis; its advantage is that the cells are identical in every respect except one variable, time. The 15 min infection incubation is short compared with the life cycle of His2, and washes were done at room temperature. In this way, the comparisons were T1/T0, T2/T0, T3/T0, and T4.5/T0.

| Infected host cell RNA extraction
The TRI-reagent method (Dyall-Smith, 2009) was used to extract total RNA from virus-infected cells. Culture samples (1 ml) were centrifuged at 12,000 g (1 min, 4°C), homogenized in 1 ml TRI-reagent solution (Invitrogen), and incubated at room temperature for 5 min, and then centrifuged at 12,000 g (10 min, 4°C). The top (aqueous) layer was transferred to a clean microfuge tube, 200 µl chloroform added, and each mixture vortexed for 15 s, and then incubated at room temperature for 15 min. After incubation, the samples were centrifuged at 12,000 g (10 min, 4°C), and the top (aqueous) layer transferred to a clean microfuge tube, 500 µl isopropanol added, and the tubes vortexed for 10 s, and then incubated at room temperature for 10 min. After centrifugation at 12,000 g (8 min, 4°C) to pellet RNA, the supernatants were discarded. The RNA pellets were washed twice in 1 ml 75% ethanol and centrifuged at 7,500 g for 5 min at 4°C, and then air-dried before being resuspended in nuclease-free water. DNase I (BioLabs) was used to remove residual genomic DNA. Briefly, 2 units DNase I and 5 µl of 10× DNase I buffer were added to 5 µg of RNA sample, and incubated at 37°C for 10 min.
Following the incubation, we added 0.5 M EDTA (final concentration of 5 mM EDTA) and removed DNase I with Amicon Ultra-0.5 ml centrifugal filters (Millipore, Ultra 100k) by centrifuging at 6,000 g for 6 min at 4°C. The RNA quantity was determined using a NanoDrop ND-1000 UV-Vis Spectrophotometer (NanoDrop Technologies) and BioAnalyzer 2100 (Agilent 2100 Bioanalyzer) using the Agilent RNA 6000 Nano kit, and RNA integrity was assessed by electrophoresis on 1% agarose-guanidine thiocyanate gels (Figures A1 and A2).

| Microarray design and hybridization
A microarray chip was designed based on the 3,905 annotated genes of Har. hispanica strain N601 (BioProject: PRJNA227070). We assayed three biological replicates (A, B, and C) of each sample time using a two-color platform array and synthesized the complementary DNA from 15 µg total RNA using the SuperScript™ Plus Indirect cDNA Labeling System (Invitrogen). Reference cDNA samples (0 hr) were synthesized using primers for downstream capture by Cy3, and experimental samples (1, 2, 3, 4.5 hr) were synthesized using primers for downstream capture by Cy5.
We used the Agilent Gene Expression Hybridization Kit for hybridization. The microarrays were scanned on an Agilent Technologies Scanner G2505C using the one-color scan setting for 8 × 15 K array slides. The raw intensity data were then normalized to a global average for each experiment, log2-transformed, and analyzed using GeneSpring GX7.3.1 (Agilent). A twofold change in gene expression relative to the 0 hr sample was used as the minimum threshold for describing differences. Z scores were calculated using the log10-transformed raw gene intensity data for each experiment. Z ratio values for each experiment were then calculated by taking the difference between the averages of the observed gene Z scores and dividing by the SD of all of the differences for that particular comparison (Cheadle, Vawter, Freed, & Becker, 2003). A onefold change in Z ratio gene expression was used to distinguish significant changes in gene expression throughout the experiment (Cheadle et al., 2003). Hierarchical cluster analysis (Gene Cluster 3.0) was used to analyze the gene expression profiles between the three replicates. The raw data for all three biological replicates are provided in Table S1 (https://doi.org/10.6084/m9.figshare.11800872).
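The Z score and Z ratio calculation described above can be illustrated with a minimal Python sketch. It follows the verbal definition given in the text (Cheadle et al., 2003) and is not the code used in this study. The function names, the use of the sample standard deviation, and the toy intensity values are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def z_scores(intensities):
    """Z-transform the log10 raw intensities of one array (one experiment)."""
    logs = [math.log10(x) for x in intensities]
    mu, sd = mean(logs), stdev(logs)  # sample SD is an assumption
    return [(v - mu) / sd for v in logs]


def z_ratios(test_arrays, ref_arrays):
    """Per-gene Z ratio for one comparison (for example T1 versus T0).

    Each argument is a list of replicate arrays with genes in the same order.
    """
    test_z = [z_scores(a) for a in test_arrays]
    ref_z = [z_scores(a) for a in ref_arrays]
    n_genes = len(test_arrays[0])
    # Difference between the replicate-averaged Z scores of each gene.
    diffs = [
        mean(z[g] for z in test_z) - mean(z[g] for z in ref_z)
        for g in range(n_genes)
    ]
    # Divide by the SD of all differences in this comparison.
    sd_diff = stdev(diffs)
    return [d / sd_diff for d in diffs]


# Hypothetical intensities for five genes in three replicates (not real data).
ref = [[100, 200, 400, 800, 1600]] * 3
test = [[1000, 200, 400, 800, 1600]] * 3
ratios = z_ratios(test, ref)
```

In this toy comparison only the first gene changes between reference and test arrays, so it receives the largest Z ratio; genes would then be flagged when the magnitude of the Z ratio exceeds the chosen cutoff.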

| Viral infection microarray
A microarray designed to detect the expression of 3,905 annotated genes of Har. hispanica strain N601 and 35 genes of halovirus His2 was hybridized to labeled cDNA transcripts of virus-infected cells sampled from 0 to 4.5 hr postinfection. Three biological replicates were used, and in all cases, virus release was detected at 3 hr p.i. (Table S2; https://doi.org/10.6084/m9.figshare.11800872). The results for genes showing significant regulatory changes are summarized in Figure 1 and Table 1, and the full compilation of results is given in Table S1A (https://doi.org/10.6084/m9.figshare.11800872). A total of 114 genes (80 from the host and 34 from the virus) showing at least a twofold change among the three biological replicates were detected. The heat map shown in Figure 1 provides a graphical summary of the changes in gene expression for both virus and host (indicated at the right edge). The three replicates for each time point are indicated at the top, and group together as expected for the 1 and 2 hr sample times, while one of the 3 hr samples branches with the 4.5 hr group. Hierarchical clustering of genes based on their expression patterns is shown at the left edge of the map and groups genes into three major phases: early, middle, and late. To extend and more confidently substantiate the findings reported here, future work should employ high-replicate designs based on the protocols developed in this study, which should further improve the understanding of His2 and its host.
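The twofold filter used to select the genes reported above can be sketched as follows. This is an illustrative outline only: the text does not state how the three replicates were combined, so averaging the log2 ratios is an assumption, as are the function names and the placeholder gene labels.

```python
import math
from statistics import mean


def log2_ratios(cy5, cy3):
    """log2(Cy5/Cy3) per gene after scaling each channel to its global mean."""
    cy5_mean, cy3_mean = mean(cy5), mean(cy3)
    return [
        math.log2((t / cy5_mean) / (r / cy3_mean))
        for t, r in zip(cy5, cy3)
    ]


def twofold_genes(gene_ids, replicate_ratios, threshold=1.0):
    """Return genes whose replicate-averaged |log2 ratio| meets the threshold.

    A threshold of 1.0 on the log2 scale corresponds to a twofold change.
    Averaging across replicates is an assumption, not the published rule.
    """
    hits = {}
    for i, gene in enumerate(gene_ids):
        avg = mean(rep[i] for rep in replicate_ratios)
        if abs(avg) >= threshold:
            hits[gene] = avg
    return hits


# Placeholder gene labels and log2 ratios for three replicates (not real data).
genes = ["gene_a", "gene_b", "gene_c"]
replicates = [
    [1.2, 0.1, -1.5],
    [1.0, -0.2, -1.1],
    [1.4, 0.3, -0.9],
]
selected = twofold_genes(genes, replicates)
```

Here `gene_a` (upregulated) and `gene_c` (downregulated) pass the filter, while `gene_b` does not. A stricter rule, such as requiring every replicate to exceed the threshold, would be an equally plausible reading of the text.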

| Regulation of His2 gene expression
Three phases of gene expression were observed: early (0-1 hr p.i.), middle (2-3 hr p.i.), and late (4.5 hr p.i.). An overview of these phases can be seen in Table 1 (virus, upper [...] 4.5 hr p.i. (late phase), most genes are strongly expressed (with the conspicuous exception of the early genes, which remain strongly downregulated).

TABLE 1 Note: The table shows the stage of gene expression (early, middle, or late; gene expression column) and the Z ratio transcription values postinfection, with peak expression shaded. Z scores were calculated using the log10-transformed raw gene intensity data for each experiment. Z ratio values for each experiment were then calculated by taking the difference between the averages of the observed gene Z scores and dividing by the SD of all of the differences for that particular comparison. A onefold change in Z ratio gene expression was used to distinguish significant changes in gene expression throughout the experiment. Values represent the mean values of three replicates.
In the early phase, transcription of the first three viral CDS (His2V_gp01, His2V_gp02, His2V_gp03) was observed (Table 1, Table S1B). [...] or facilitate the expression of middle and late genes from virus-specific promoters (Hinton, 2010; Krüger & Schroeder, 1981). The features of the His2 proteins specified by these CDS (His2V_gp01-gp03) are consistent with these functions.
In the middle phase, the three early phase genes are strongly downregulated and remain so until the last sampling time at 4.5 hr p.i. In contrast, six viral genes (His2V_gp24, His2V_gp31-His2V_gp35) are upregulated at 2 hr p.i. (Table 1, Table S1B; https://doi.org/10.6084/m9.figshare.11800872), with the expression of His2V_gp31-His2V_gp35 remaining upregulated until the late phase. On the viral genome, His2V_gp31-His2V_gp35 are closely spaced, similarly oriented, and located near the right terminal inverted repeat.
Most of these five CDS are overlapping and are probably transcribed as a single mRNA. His2V_gp24 encodes a hypothetical protein of unknown function, while His2V_gp31-His2V_gp35 specify two uncharacterized proteins with transmembrane domains (gp31, gp34), a virus structural protein (VP32), an AAA family ATPase (gp33), and a protein with CxxC motifs (gp35) that suggest a role in DNA binding (Nagel, Machulla, Zahn, & Soppa, 2019; Wang et al., 2007).
They also include a potential packaging ATPase (His2V_gp33). Most of the other six genes specify proteins with membrane domains, and their close genomic location and late expression pattern suggest they are also likely to be involved in the assembly of mature (membrane-containing) virions. The genes in the second upregulated cluster are annotated as hypothetical and their functions have not been determined; however, five specify small proteins that carry one or more CxxC motifs suggestive of DNA binding (Nagel et al., 2019).

| Host cell gene expression changes during virus infection
Only 80 out of 3,905 host genes (2%) showed significant change (≥twofold) in their expression after His2 infection (Table 1, Table S1C; https://doi.org/10.6084/m9.figshare.11800872). Table 1 shows the times of peak upregulation for these genes, while the color changes in Figure 1, and the shading changes in Table S1, indicate that for many of these genes, their differential expression changed over time from 1 to 4.5 hr p.i. These changes allowed genes to be classified by hierarchical clustering into three phases (early, middle, and late; Figure 1, left and right sides) along with the virus genes.
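As a simplified illustration of the three-phase grouping, the sketch below assigns a gene to a phase from the sampling time at which its Z ratio peaks, using the phase boundaries stated in the text (early, up to 1 hr; middle, 2-3 hr; late, 4.5 hr). This is a stand-in for, not a reproduction of, the hierarchical clustering used in the study; the function name and the example profiles are hypothetical.

```python
SAMPLE_TIMES_HR = (1, 2, 3, 4.5)  # postinfection sampling times compared to T0


def phase_of_peak(z_ratios_by_time):
    """Classify a gene as 'early', 'middle', or 'late' by its peak Z ratio time."""
    if len(z_ratios_by_time) != len(SAMPLE_TIMES_HR):
        raise ValueError("expected one Z ratio per sampling time")
    # Index of the sampling time with the highest Z ratio.
    peak_index = max(
        range(len(SAMPLE_TIMES_HR)), key=lambda i: z_ratios_by_time[i]
    )
    peak_time = SAMPLE_TIMES_HR[peak_index]
    if peak_time <= 1:
        return "early"
    if peak_time <= 3:
        return "middle"
    return "late"


# Hypothetical Z ratio profiles at 1, 2, 3, and 4.5 hr p.i. (not real data).
early_like = phase_of_peak([2.1, 0.3, -0.5, -1.0])
middle_like = phase_of_peak([0.1, 1.5, 1.2, 0.8])
late_like = phase_of_peak([-0.2, 0.4, 0.9, 2.0])
```

A peak-time rule of this kind ignores the overall shape of each profile, which is one reason clustering on the full expression pattern can group some genes differently.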

| Early phase host genes
Twenty-one differentially regulated host genes were designated as early expressed because they were upregulated within the first hour of infection and subsequently downregulated (Figure 1, Table S1C; https://doi.org/10.6084/m9.figshare.11800872.v2). The ten most significantly upregulated early genes are shown in Table 1 (blue shading), and of these, seven specify protein components of two different membrane transport systems: ZnuABC (HISP_05835, 05840, 05845), a specific and high-affinity Zn2+ uptake system (Pederick et al., 2015), and PstABCS (HISP_10570, 10575, 10580, and 10585), a specific and high-affinity importer of phosphate. In bacteria, PstA is not only used for phosphate uptake but is also structurally related to PII signal-transduction proteins and can bind the second messenger molecule cyclic-di-AMP (c-di-AMP), thereby influencing many different cellular processes (Müller, Hopfner, & Witte, 2015). The presence and significance of c-di-AMP in the haloarchaeon Hfx. volcanii has recently been described (Braun et al., 2019). Zn2+ is an essential nutrient, and it is tempting to speculate that the numerous potential zinc-finger motifs (CxxC) in many His2 proteins (Nagel et al., 2019), including those encoded by early genes, may be relevant to the upregulation of znuABC.
Of the other four genes, two encode membrane-associated proteins involved in energy production (COG category C); NADH dehydrogenase subunit L (HISP_18845) and V-type ATP synthase subunit B (HISP_02210). The third gene (HISP_03090) encodes a cytosolic enzyme, carbamoylphosphate synthase, which catalyzes the first committed step in pyrimidine and arginine biosynthesis, and the fourth gene is histidinol-phosphatase (HISP_17355).
In summary, most of the upregulated genes were involved in the uptake of zinc and phosphate, while the remainder code for proteins with roles in energy production or arginine/nucleotide synthesis.
The upregulation of these early genes may reflect the cell responses to membrane damage upon virus entry and/or the effects of early virus proteins that enhance the expression of host genes that favor virus replication.

| Middle and late phase host genes (2-4.5 hr p.i.)
A total of 59 differentially regulated host genes were designated as middle- or late-expressed, and as there were only 6 late genes, they will be described together with the middle genes. These genes were upregulated from 2 to 4.5 hr p.i. (Figure 1, [...] Of the two late-expressed genes shown in Table 1, one encodes a signal transduction histidine kinase (HISP_18340) and the other specifies glycine cleavage system T protein (aminomethyltransferase) (HISP_19155).
Although relatively few host genes showed significant differential regulation during His2 infection, it is likely that the virus has evolved to modify and redirect host cell metabolism in order to optimize virus replication, assembly, and exit (Sanchez & Lagunoff, 2015). Other archaeoviruses have previously been shown to have a low impact on host cell metabolism, such as SSV1 (Fröls et al., 2007; Fusco et al., 2015), and in a similar study with the bacterial tectivirus PRD1 and its host E. coli (Poranen et al., 2006), changes at the whole-genome level were described as moderate. In the present study, His2-infected Har. hispanica cells displayed significant upregulation of many genes that are potentially advantageous for the virus, such as those for the pyruvate dehydrogenase (PDH) complex (energy/biosynthesis) and ribosomal proteins (translation). The increased capacity and output of biosynthesis systems could then be redirected into the synthesis of virus components instead of cell growth (Sanchez & Lagunoff, 2015). As many metabolic pathways are interrelated, the exact mechanisms by which His2 gene products achieve optimal growth of the virus will be challenging to understand. This study provides a starting point for future investigations aimed at identifying the roles of specific His2 genes in driving metabolic changes in the host, such as the roles of the many small, zinc-finger motif proteins.

| CONCLUSION
The synchronization of His2 infection of Har. hispanica allowed temporal and differential regulation of viral and host genes to be examined. Eighty host genes were differentially regulated ≥twofold postinfection. Both viral and host genes could be grouped into early-, middle-, and late-expressed genes, according to the times at which their transcripts were upregulated. Infection, replication, and propagation of His2 coincided with the regulation of host genes that were involved in transport, energy production, translation, and metabolism. Further studies will be needed to unravel and better understand the virus transcription program and the roles of individual genes in the interplay and evolution of His2 and its host.

ACKNOWLEDGMENTS
We would like to thank the Biodiversity Research Center, Academia Sinica for their financial and facilities support.

CONFLICT OF INTEREST
None declared.

AUTHOR CONTRIBUTIONS
Sonny Lee equally contributed to data curation, formal analysis, investigation, methodology, project administration, resources, validation, visualization, writing-original draft, and writing-review and editing; Jiun-Yan Ding equally contributed to data curation, formal analysis, investigation, methodology, validation, visualization, and writing-review and editing; Pei-Wen Chiang equally contributed to data curation, formal analysis, investigation, methodology, validation, and visualization; Mike Dyall-Smith equally contributed to data curation, formal analysis, methodology, resources, validation, and writing-review and editing; Sen-Lin Tang equally contributed to conceptualization, data curation, funding acquisition, investigation, methodology, project administration, resources, supervision, validation, and writing-review and editing.

ETHICS STATEMENT
None required.

DATA AVAILABILITY STATEMENT
All data generated or analyzed during this study are included in this published article and the appendices, as well as Tables S1 and S2