Genetic diversity of equine herpesvirus 1 isolated from neurological, abortigenic and respiratory disease outbreaks

Summary Equine herpesvirus 1 (EHV‐1) causes respiratory disease, abortion, neonatal death and neurological disease in equines and is endemic in most countries. The viral factors that influence EHV‐1 disease severity are poorly understood, and this has hampered vaccine development. However, the N752D substitution in the viral DNA polymerase catalytic subunit has been shown statistically to be associated with neurological disease. This has given rise to the term “neuropathic strain,” even though strains lacking the polymorphism have been recovered from cases of neurological disease. To broaden understanding of EHV‐1 diversity in the field, 78 EHV‐1 strains isolated over a period of 35 years were sequenced. The great majority of isolates originated from the United Kingdom and included in the collection were low passage isolates from respiratory, abortigenic and neurological outbreaks. Phylogenetic analysis of regions spanning 80% of the genome showed that up to 13 viral clades have been circulating in the United Kingdom and that most of these are continuing to circulate. Abortion isolates grouped into nine clades, and neurological isolates grouped into five. Most neurological isolates had the N752D substitution, whereas most abortion isolates did not, although three of the neurological isolates from linked outbreaks had a different polymorphism. Finally, bioinformatic analysis suggested that recombination has occurred between EHV‐1 clades, between EHV‐1 and equine herpesvirus 4, and between EHV‐1 and equine herpesvirus 8.

damage can cause abortion or myeloencephalopathy (Edington, Smyth, & Griffiths, 1991; Whitwell & Blunden, 1992). Despite the potential severity of outbreaks and the financial losses incurred as a result, there are no vaccines licensed to protect against neurological disease, and outbreaks, as well as cases of abortion, still occur in highly vaccinated animals. This was recently highlighted by the abortion storm recorded in Hertfordshire (United Kingdom; UK) in 2016 in fully vaccinated animals (http://www.aht.org.uk/cms-display/interim-report16-april2.html).
The linear, double-stranded genomes of these viruses share a common structure. Thus, the EHV-1 genome is approximately 150 kbp in size and consists of long and short unique regions (UL and US, respectively), the former flanked by a small inverted repeat (TRL/IRL) and the latter by a large inverted repeat (TRS/IRS) (Figure 1) (Henry et al., 1981; Pellett & Roizman, 2007; Telford, Watson, McBride, & Davison, 1992; Whalley, Robertson, & Davison, 1981; Yalamanchili & O'Callaghan, 1990). The genome contains 76 open reading frames (ORFs) predicted to encode functional proteins, four of which are duplicated in TRS/IRS. The average nucleotide composition of the genome is 56.7% G+C, although it is significantly higher in TRS/IRS, at 67% (Telford et al., 1992).
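The G+C composition quoted above is a simple base count. The following is a minimal Python sketch of that calculation, assuming ambiguous bases such as N are excluded from the denominator (the source does not state how they were handled). The function name `gc_fraction` and the toy sequence are illustrative only and are not EHV-1 data.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / total


# Toy sequence for illustration only (assumption: N is ignored)
toy = "GGCCATGCNN"
print(f"{100 * gc_fraction(toy):.1f}% G+C")
```

Applying the same function separately to the unique regions and to TRS/IRS would reproduce the kind of regional comparison reported by Telford et al. (1992).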
More recent genetic studies include determination of the whole genome sequences for two well-characterized strains, Ab4 and V592 (Telford et al., 1992;Nugent et al., 2006). Comparison of the sequence of the neuropathogenic strain Ab4 (Crowhurst, Dickinson, & Burrows, 1981;Telford et al., 1992) with that of the abortigenic strain V592 (Mumford et al., 1987;Nugent et al., 2006) identified 43 amino acid residue differences distributed among 31 ORFs (Nugent et al., 2006). Of these, ORF68, which encodes a non-essential, membrane-associated virion component homologous to herpes simplex virus type 1 (HSV-1; species Human alphaherpesvirus 1) gene US2, was shown to be particularly variable and was developed as a target for classifying field isolates into six groups (Meindl & Osterrieder, 1999;Nugent et al., 2006). There was no obvious association of these genotypes with pathogenicity.
A single nucleotide polymorphism (A2254G) in ORF30, which encodes the DNA polymerase catalytic subunit, results in the single amino acid substitution N752D and was shown to be associated at a statistically significant level with neurological disease (Nugent et al., 2006). Numerous studies have demonstrated that differences in pathogenicity correlate with the ability to disseminate to and infect vascular endothelial cells in the uterus and central nervous system (Edington, Bridges, & Patel, 1986; Mumford et al., 1994; Patel, Edington, & Mumford, 1982; Platt, Singh, & Whitwell, 1980). Viruses carrying the A2254G neurological marker are thought to replicate to a higher level and induce a longer-lasting viraemia than those lacking it (Allen & Breathnach, 2006). In Welsh mountain ponies, a mutant virus expressing N752 induced significantly less neurological disease than the wild-type virus, which has the D752 neurological marker (Goodman et al., 2007). The presence of the A2254G marker has been linked to increasing morbidity and mortality in field outbreaks since 2000 (Lunn et al., 2009; Smith et al., 2010).
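The relationship between nucleotide position 2254 and residue 752 follows from codon arithmetic. The Python sketch below shows that mapping and the effect of an A to G change at the first codon position. The source does not state which asparagine codon is present in ORF30, so both standard asparagine codons are shown. The helper names `locate` and `substitute` are hypothetical.

```python
from typing import Tuple

# Only the codons needed here, taken from the standard genetic code
CODONS = {"AAC": "N", "AAT": "N", "GAC": "D", "GAT": "D"}


def locate(nt_pos: int) -> Tuple[int, int]:
    """Map a 1-based ORF nucleotide position to (codon number, position in codon)."""
    if nt_pos < 1:
        raise ValueError("positions are 1-based")
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


def substitute(codon: str, pos_in_codon: int, new_base: str) -> str:
    """Return the codon with one base replaced (pos_in_codon is 1-based)."""
    i = pos_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


codon_no, pos = locate(2254)  # codon 752, first position
for ref in ("AAC", "AAT"):    # assumption: either Asn codon may be present
    alt = substitute(ref, pos, "G")
    print(f"codon {codon_no}: {ref} ({CODONS[ref]}) -> {alt} ({CODONS[alt]})")
```

Either asparagine codon becomes an aspartate codon when its first base changes from A to G, which is consistent with the N752D substitution described by Nugent et al. (2006).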
The EHV-1 pathogenicity determinants that influence abortion are unclear. However, the EHV-1 neuropathogenic strain Ab4 is significantly better at inducing abortion in experimental studies than other strains, including the abortigenic strain V592 and the neuropathogenic strain OH03 (Gardiner et al., 2012). This difference in induced abortion was observed despite the viraemia in infected animals being of similar magnitude, as measured by qPCR (Gardiner et al., 2012). EHV-1 strains that reproducibly cause neurological disease may also cause abortion, due to high levels of endometrial damage during infection (Gardiner et al., 2012; Smith et al., 1992).
FIGURE 1 EHV-1 genome structure showing functional ORFs, inverted repeats and tandem repeats (adapted from GenBank accession NC_001491.2). Inverted repeats are shown as black rectangles, and unique regions as black lines above the ORFs. Regions used for subsequent sequence analysis are indicated with an asterisk (*)
Until recently, only two complete genome sequences have been available for analysis of EHV-1. The objective of this study was to generate a database of sequence information for a large panel of well-documented, historically extensive EHV-1 clinical isolates maintained at the Animal Health Trust (UK), including viruses from respiratory, abortigenic and neurological outbreaks, with the aim of determining whether there are any clear genetic factors involved in the three types of disease. These data would be useful for improving the safety of newly designed modified live vaccines for preventing EHV-1 disease. The sequences of 78 isolates were determined, covering at least 80% of the genome for each. The majority of strains analysed were low passage clinical isolates from outbreaks in the UK. Together with 26 EHV-1 genome sequences sequenced in the USA and Australia (Vaz et al., 2016) and deposited in GenBank during the course of this project, these data provide an invaluable database of genetic information for this important pathogen.

| Preparation of viral DNA
The virus stocks were titrated by plaque assay on RK13 cells as described previously (Tearle et al., 2003). Viral DNA was extracted from semi-purified nucleocapsids as follows. RK13 cells were infected at a multiplicity of infection of 0.1 plaque-forming units per cell and incubated at 37°C in 5% (v/v) CO2 until 80% of cells showed cytopathic effect (4-6 days). Cells were harvested and viral nucleocapsids were enriched using a variation of a protocol described previously. Cell pellets from two 150 cm2 flasks were resuspended in 5 ml LCM buffer (3% sodium deoxycholate, 30 mM Tris-HCl pH 7.4, 123 mM KCl, 0.5 mM EDTA, 3.6 mM CaCl2, 3 nM MgCl2, 3% (v/v) NP-40 and 43 µl 2-mercaptoethanol/100 ml), and lipid envelopes were removed by adding 1 ml 1,1,2-trichloro-1,2,2-trifluoroethane (Freon, Sigma-Aldrich) and mixing by inversion. The phases were separated by centrifugation at 1,500 g for 5 min, and the supernatant was re-extracted with 1 ml Freon. Approximately 2.8 ml supernatant was layered over a two-step glycerol gradient consisting of 1.3 ml 45% (v/v) and 1.3 ml 5% (v/v) glycerol in LCM buffer in a 5-ml thin-walled Ultraclear centrifuge tube (Beckman Coulter). Viral nucleocapsids were pelleted by centrifugation at 96,000 g in an AH-650 rotor for 1 hr at 4°C. The glycerol was removed, and the pellet was drained by inverting the tube and resuspended in 400 µl sterile phosphate-buffered saline. DNA was extracted from the enriched viral nucleocapsids using an Isolate II genomic DNA kit (Bioline) according to the manufacturer's instructions, and eluted in 100 µl water. DNA concentration was quantified using a Nanodrop (Thermo Scientific), and DNA integrity was assessed by agarose gel electrophoresis of intact DNA.

| Preparation of DNA sequencing libraries
Libraries for EHV-1 strains to be sequenced at the Animal Health Trust (UK) were prepared using an Illumina series KAPA library preparation kit (catalogue no. KK8200). An aliquot (200 ng) of viral DNA was sheared by sonication and used as input for the library preparation protocol as described in the manufacturer's instructions.
A MinElute reaction clean-up kit (Qiagen) was used to purify the DNA between steps, and DNA fragments of approximately 500 bp were selected by agarose gel electrophoresis. DNA bands were excised and purified using a QIAquick gel extraction kit (Qiagen).
The Genome Analysis Centre (now the Earlham Institute; UK) prepared viral DNA libraries using a Covaris S2 sonicator and an Illumina TruSeq DNA library preparation kit as described by the manufacturer. Size selection of 500-bp DNA fragments was performed using E-Gel SizeSelect 2% (w/v) agarose gels (Invitrogen). The insert size of the libraries was verified using an Agilent 2100 bioanalyser and a DNA 1000 assay (catalogue no. 5067-1504). DNA concentration was determined using a Qubit dsDNA HS assay kit (catalogue no. Q32854).

| DNA sequencing
Paired-end read sequencing runs of 75-250 nt/read were carried out on an Illumina MiSeq sequencer, using the MiSeq reagent kit v. 2 or v. 3 (300-500 cycles). Following preliminary analysis, the MiSeq reporter programme was used to generate FASTQ-formatted read files for each EHV-1 strain.
For de novo assembly, host cell DNA reads were removed using BWA v. 0.5.9rc1 (Li & Durbin, 2010) (Katoh & Standley, 2013). For deposition of sequences in GenBank, sequence files were assembled with strings of 100 N residues between blocks of contiguous data.
Equivalent blocks of EHV-1 sequences were trimmed to the same length for the analysed strains.
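The GenBank deposition format described above, in which blocks of contiguous data are separated by strings of 100 N residues, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the pipeline actually used: the function names are hypothetical, and the sketch assumes that no block itself contains a run of 100 or more N residues.

```python
import re
from typing import List

SPACER = "N" * 100  # 100 N residues between blocks, as described in the text


def join_blocks(blocks: List[str]) -> str:
    """Concatenate non-empty sequence blocks, separated by the N spacer."""
    return SPACER.join(b for b in blocks if b)


def split_blocks(record: str) -> List[str]:
    """Recover the blocks by splitting on runs of at least 100 N residues."""
    return [b for b in re.split(r"N{100,}", record) if b]


# Toy blocks for illustration only
blocks = ["ACGTACGT", "GGCCGGCC", "TTAA"]
record = join_blocks(blocks)
print(len(record))                     # total length including two spacers
print(split_blocks(record) == blocks)  # round trip recovers the blocks
```

Splitting on the spacer makes it straightforward to compare equivalent blocks between strains without treating the spacer as sequence.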

| DNA sequencing of EHV-1 strains
Seventy-eight representative EHV-1 isolates selected from the archive at the Animal Health Trust (UK) were sequenced (Table 1). A sequencing library was prepared for each strain using one of two approaches, as described in materials and methods, generating paired-end reads (i.e., sequences from the opposing ends of random DNA fragments) of 75-250 nucleotides (nt) each.
Methods involving de novo and reference-guided assembly were then used to create contigs covering the majority of UL and US as indicated in Figure 1. The use of de novo assembly indicated that the genome structure was conserved in the viruses sequenced, in that no large-scale rearrangements were identified.
The number of reads mapped to the reference genome for each virus is shown in Table 1.
As for other herpesvirus genomes, several regions of tandem repeats were identified (Figure 1). The tandem repeat regions were difficult to sequence as they were often longer than the read length. Even when tandem repeats were short enough to be covered within a single read, they were observed to vary in length among strains, and sometimes within strains, as reported previously for EHV-1 (Nugent et al., 2006) and other alphaherpesviruses (Depledge et al., 2014; Hondo & Yogo, 1988 (Figure 2b). Sequence conservation was even higher in US than UL, with 99.89% identity between strains Ab1 (clade 2) and Suffolk/87/2009 (clade 6).
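A pairwise identity figure such as the one quoted above can be derived from two aligned sequences. The Python sketch below is a minimal illustration that assumes alignment columns containing a gap or an N in either sequence are skipped. The source does not state how such columns were treated, so this choice, like the function name `percent_identity`, is an assumption.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of compared alignment columns in which both sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue  # assumption: gap and N columns are not scored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned sequences, not EHV-1 data: one mismatch in ten columns
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```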

| Genetic variability in EHV-1
Analysis of the ORFs from the clinical isolates identified mainly sin- There was no evidence in the sequence data from the viral stocks used for the loss of a BamHI site in ORF21 (ribonucleotide reductase large subunit), as was reported after serial passage of strains Army 183 and Ab1 in cell culture (Bonass et al., 1994).
Unfortunately, the passage number of these stocks was unknown.
To investigate protein sequence divergence, the numbers of variable amino acids in ORF1 to ORF63 (in UL) and ORF69 to ORF76 (in US), normalized to protein length but excluding frameshifts and  Figure 4c onto the three-dimensional structure of the HSV-1 DNA polymerase catalytic subunit (2gv9.pdb) showed that the variant residues were located in the palm, N-terminal, exo and thumb domains.
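Normalizing the number of variable amino acids to protein length can be illustrated with a short Python sketch. It assumes the protein sequences for one ORF are already aligned, counts columns containing more than one residue type, and divides by a reference protein length. Gap and X characters are ignored, which is an assumption standing in for the exclusions mentioned in the text. The function names are hypothetical.

```python
from typing import List


def variable_sites(aligned: List[str]) -> int:
    """Count alignment columns with more than one distinct residue."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    count = 0
    for column in zip(*aligned):
        residues = {r for r in column if r not in "-X"}  # assumption: skip gaps and X
        if len(residues) > 1:
            count += 1
    return count


def variability_percent(aligned: List[str], protein_length: int) -> float:
    """Variable sites as a percentage of the reference protein length."""
    if protein_length <= 0:
        raise ValueError("protein length must be positive")
    return 100.0 * variable_sites(aligned) / protein_length


# Toy alignment for illustration only: columns 2 and 3 vary
toy = ["MKTLV", "MKSLV", "MRTLV"]
print(variable_sites(toy), variability_percent(toy, 5))
```

Expressing the result per unit length allows long and short proteins to be compared on the same scale.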

| Recombination in EHV-1
The SplitsTree program was used to assess whether there was evidence for recombination in EHV-1, as a tree-like splits network can be interpreted as supporting various modes of evolution, including recombination (Huson, 1998; Huson & Bryant, 2006). A splits network was constructed using sequence data from UL for all strains, employing the default NeighborNet method imple-

| DISCUSSION
The apparent association of G2254/D752 strains, rather than A2254/N752 strains, with neuropathogenic disease has been a particular focus of studies of EHV-1 pathogenesis in recent years (Fritsche & Borchers, 2011;Pronost et al., 2010;Stasiak, Rola, Ploszay, Socha, & Zmudzinski, 2015;Tsujimura et al., 2011). Sometimes overlooked is the finding in the field that the association of abortion with the EHV-1 A2254/N752 strain variant was more statistically significant (Nugent et al., 2006). The majority of strains analysed in the present study had been isolated from abortion cases and had the A2254/N752 genotype, and most of the remaining strains that had been isolated from neurological disease outbreaks belonged to the G2254/D752 genotype, similar to published conclusions. However, there are other genetic differences between these genotypes within the DNA polymerase catalytic subunit, and their effects on protein function have yet to be studied.
The previously sequenced EHV-1 strains Ab4 (G2254; clade 1) and V592 (A2254; clade 9) can both cause abortion in an experimental setting, albeit with apparently different efficiencies. Infection of pregnant ponies and horses with strain Ab4 resulted in severe clinical disease, with high levels of viral shedding and pyrexia in all ponies, coupled with abortion, ataxia and quadriplegia (Gardiner et al., 2012; Mumford et al., 1994). Experimental infections with strain V592 induced mild respiratory signs and low numbers of abortions, but not neurological disease. The duration of viraemia and the amount of viral shedding were also significantly lower in horses infected with V592 compared with Ab4 (Smith et al., 2000). Despite the virulence of D752 strains, the majority of disease in equines does not appear to Previously published work has shown no obvious difference in the processivity of either DNA polymerase variant in vitro, except that the G2254 version was sensitive to the drug aphidicolin (Goodman et al., 2007). This may indicate that the two versions of the protein adopt slightly different conformations that affect pathogenicity by a means that has not yet been identified. It has been suggested that G2254 strains are able to infect cell types in vivo that are different from those infected by A2254 strains, allowing these viruses to replicate and disseminate more efficiently (Goodman et al., 2007). However, work carried out in vitro has failed to identify any cellular preference due to this mutation (Ma, Lu, & Osterrieder, 2010). catalytic subunit adjacent to a cluster of conserved, highly charged residues that may interact with RNA (Liu et al., 2006). This substitution was also found in the strain RacL11, along with two further changes in ORF30, the neurological marker N752D and Y753S.
The polymorphism at position 753 has been reported previously in a subset of strains isolated in central Kentucky (Smith et al., 2010) and in the strain RacH used in the live attenuated vaccines Rhinomune (Boehringer Ingelheim Vetmedica) and Prevaccinol (MSD Animal Health) (Nugent et al., 2006).
During latency, alphaherpesvirus genomes exist in a circular, episomal form in the cell nucleus. There is no viral replication and viral transmission does not occur. This process is likely to slow down the evolutionary rate of the virus in individual animals. Moreover, recent work has suggested that host cell-encoded DNA repair mechanisms are able to control genetic damage in latent herpesvirus genomes (Brown, 2014). Sequencing of live attenuated varicella-zoster virus (VZV) vaccine isolates from individuals has further implied that latency slows the evolutionary rate (Weinert et al., 2015). The dynamics of EHV-1 latency and reactivation are not well understood. Latent virus has been identified in lymph nodes and nervous tissue (including trigeminal ganglia), and latency-associated transcripts have been identified in CD5+/CD8+ leucocytes (Baxi et al., 1995; Chesters, Allsop, Purewal, & Edington, 1997; Pusterla, Mapes, & Wilson, 2010). The amount of subclinical reactivation from these sites and the mechanisms of subsequent clearance are unknown. The anatomically distinct nature of these sites of latency raises the possibility that viruses at different sites may reactivate independently, and that the overall population may have been derived from viruses originating from separate shedding episodes, similar to that reported for herpes simplex virus 2 (HSV-2) (Johnston et al., 2014). In the present study, strains Essex/199/05 and Essex/200/05 were isolated from different tissues from an individual abortion and had identical sequences, as expected (Figure 2a, clade 7).
Strains Nottinghamshire/10/04 and Nottinghamshire/70/04 were also isolated from pooled tissue from one abortion, but were genetically different from each other (Figure 2a, clade 7), suggesting that horses can be infected with multiple EHV-1 strains at the same time.
This could create difficulties when assigning a particular disease presentation to a viral strain, as some strains may be coincidental isolations.
Co-infection also gives rise to the possibility of recombination to generate new strains. High-throughput genome sequencing has provided evidence of recombination between strains in other herpesviruses, namely HSV-2, VZV, Marek's disease virus, HSV-1, pseudorabies virus, human cytomegalovirus and infectious laryngotracheitis virus (Hughes & Rivailler, 2007; Kolb, Larsen, Cuellar, & Brandt, 2015; Lee et al., 2013; Norberg et al., 2011, 2015; Sijmons et al., 2015; Szpara et al., 2014; Zell et al., 2012). In the present study, network analysis suggested that recombination has also occurred between EHV-1 strains. In addition, recombination between EHV-1 and other, closely related equine herpesviruses was also detected, indicating that co-infection of the There were four non-synonymous changes in this region when compared to the consensus EHV-1 sequence, but it is not known whether this has any functional outcome, although the C-terminus of HSV-1 ICP4 is thought to enhance the functions of the transactivation domain located at the N-terminus (Bruce & Wilcox, 2002; Wagner, Bayer, & Deluca, 2013). A similar, but not identical, recombination event between EHV-1 and EHV-4 has been reported in the same ORF in an EHV-1 strain isolated in Japan (Pagamjav, Sakata, Matsumura, Yamaguchi, & Fukushi, 2005). In addition, strain Suffolk/123/2005, isolated from an abortion, had a sequence of 620 bp containing the ORF47/44 splice acceptor site and the 5′-end of ORF45 that is identical to the corresponding region in EHV-8 (Figure 6).
EHV-8 was first isolated from donkeys treated with corticosteroids, probably as a result of viral reactivation from latency, and was initially named asinine herpesvirus 3 (Browning, Ficorilli, & Studdert, 1988). EHV-8 strain wh, for which a complete genome sequence is available, was isolated from a horse in China in 2010 (Liu, Guo, Lu, Xiang, & Wang, 2012), suggesting that EHV-8 is not specific to its host species. One EHV-1 strain sequenced in this report, Devon/28/2003, which falls within clade 10 in the ML tree (Figure 2a), was isolated from a donkey with respiratory disease. It has been reported previously that it is difficult to infect donkeys experimentally with EHV-1 derived from a horse (Gupta et al., 2000). However, recent reports from Ethiopia of severe disease in donkeys caused by EHV-1 suggest that these animals can be productively infected naturally and may be able to shed virus in a similar way to that reported for mules, thus posing a threat to other equines in close proximity (Negussie, Gizaw, Tessema, & Nauwynck, 2015; Pusterla et al., 2012).
The viruses sequenced in this report were isolated in cell culture over a period of 35 years. Passage of EHV-1 (Bonass et al., 1994) and other herpesviruses (Dargan et al., 2010; Tyler et al., 2007) in vitro is known to select mutants that have a growth advantage over wild-type virus. The majority of viruses analysed in the present study were sequenced at passage 3 or 4 after their original isolation from the animals, to minimize the extent of adaptation, but it is possible that mutations occurred in vitro.
Analysis of the genome sequences derived in the present study identified four mutations that would truncate the cognate proteins.
The strain Kent/43/1994 had a mutation in ORF63 (ICP0). Changes in a different region of ORF63 have been observed in the cell culture-adapted EHV-1 strain Kentucky A, which has a deletion of residues 319-431 compared to strain Ab4. Despite this, ICP0 in Kentucky A retains its activity (Bowles, Holden, Zhao, & O'Callaghan, 1997), even though mutational studies of HSV-1 ICP0 have indicated that the C-terminus can have an effect on protein intranuclear localization (Everett, 1988). The mutation in strain Buckinghamshire/9/93 ORF34 (protein V32) disrupts the C-terminal part of the protein. The role of this protein in infection is not clear, although it is known that it is degraded via its N-terminus through the ubiquitin proteasome pathway, and that deletion mutants lacking the whole ORF have a significant growth defect in PBMCs at early times post-infection (Said, Damiani, & Osterrieder, 2014). The mutation in strain Cambridgeshire/96/2013 ORF14 (VP11/12) would truncate the protein by 124 residues. In HSV-1, VP11/12 undergoes tyrosine phosphorylation in NK and T cells, but its biological function is unclear (Zahariadis et al., 2008). The strain Suffolk/45/2013 ORF11 was truncated by 99 residues. There was no evidence of the non-deleted virus in the sequence data for this strain. The ORF11 protein (VP22) is highly conserved among alphaherpesviruses. EHV-1 VP22 is required for efficient growth in cell culture, but is not essential for pathogenicity in the hamster model (Okada, Izume, Ohya, & Fukushi, 2015). The C-terminal 89 residues of HSV-1 VP22 contain a signal for cytoplasmic localization, and the C-terminal 128 residues are required for chromatin-binding activity (Martin, O'Hare, McLauchlan, & Elliott, 2002). In bovine herpesvirus 1 (BHV-1) infection, full-length VP22 has been shown to localize to the cell nucleus and interact with histones, and C-terminally truncated forms have been shown to localize exclusively in the cytoplasm (Liang et al., 1995).
Two of these mutations were due to deletions occurring within homopolymer tracts, which may suggest polymerase slippage had a role in protein evolution at these sites.
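Whether a deletion lies within a homopolymer tract can be checked by locating runs of a single base. The following Python sketch illustrates one way to do this. The minimum run length of five is an arbitrary threshold chosen for illustration, since the source does not define the tract length, and the function names and toy sequence are hypothetical.

```python
import re
from typing import List, Tuple


def homopolymer_tracts(seq: str, min_len: int = 5) -> List[Tuple[int, int, str]]:
    """Return (start, end, base) for each single-base run of at least min_len.

    Coordinates are 0-based and half-open, as in Python slicing.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq.upper())]


def in_tract(pos: int, tracts: List[Tuple[int, int, str]]) -> bool:
    """True if a 0-based position falls inside any reported tract."""
    return any(start <= pos < end for start, end, _ in tracts)


# Toy sequence for illustration only: a C6 run and an A5 run
toy = "ATGCCCCCCGTAAAAAT"
tracts = homopolymer_tracts(toy)
print(tracts)
print(in_tract(5, tracts), in_tract(10, tracts))
```

A deletion whose coordinates fall inside such a tract would be consistent with, but would not by itself demonstrate, polymerase slippage.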
In conclusion, the present study describes the sequences of 78 EHV-1 strains, most of which were isolated in the UK. Phylogenetic analysis of UL identified up to 13 circulating viral clades differing from each other mainly by individual nucleotide variations spread throughout the genome. We identified a polymorphism in ORF30 that may be linked to neurological outbreaks and warrants further study. Moreover, the data suggest that intraspecies and interspecies recombinations of equine alphaherpesviruses have occurred. The extensive sequence information obtained adds considerable depth to that reported previously for EHV-1, bringing the number of substantially complete genome sequences to over 100, thus helping to provide a comprehensive resource for fundamental and applied studies of EHV-1.

ACKNOWLEDGEMENTS
This work was supported by the Alborada Trust, the Animal Health