Insights into the phylogenetic and molecular evolutionary histories of Fad and Elovl gene families in Actiniaria

Abstract The biosynthesis of long‐chain polyunsaturated fatty acids (LC‐PUFAs, ≥ C20) is reliant on the action of desaturase and elongase enzymes, which are encoded by the fatty acyl desaturase (Fad) and elongation of very long‐chain fatty acid (Elovl) gene families, respectively. In Metazoa, research investigating the distribution and evolution of these gene families has been restricted largely to Bilateria. Here, we provide insights into the phylogenetic and molecular evolutionary histories of the Fad and Elovl gene families in Cnidaria, the sister phylum to Bilateria. Four model cnidarian genomes and six actiniarian transcriptomes were interrogated. Analysis of the fatty acid composition of a candidate cnidarian species, Actinia tenebrosa, was performed to determine the baseline profile of this species. Phylogenetic analysis revealed lineage‐specific gene duplication in actiniarians for both the Fad and Elovl gene families. Two distinct cnidarian Fad clades clustered with functionally characterized Δ5 and Δ6 proteins from fungal and plant species, respectively. Alternatively, only a single cnidarian Elovl clade clustered with functionally characterized Elovl proteins (Elovl4), while two additional clades were identified, one actiniarian‐specific (Novel ElovlA) and the another cnidarian‐specific (Novel ElovlB). In actiniarians, selection analyses revealed pervasive purifying selection acting on both gene families. However, codons in the Elovl gene family show patterns of nucleotide variation consistent with the action of episodic diversifying selection following gene duplication events. Significantly, these codons may encode amino acid residues that are functionally important for Elovl proteins to target and elongate different precursor fatty acids. In A. tenebrosa, the fatty acid analysis revealed an absence of LC‐PUFAs > C20 molecules and implies that the Elovl enzymes are not actively contributing to the elongation of these LC‐PUFAs. Overall, this study has revealed that actiniarians possess Fad and Elovl genes required for the biosynthesis of some LC‐PUFAs, and that these genes appear to be distinct from bilaterians.

The fatty acyl desaturase (Fad) gene family encodes desaturase enzymes which insert double bonds at different positions of PUFAs. The coordination of multiple functionally different desaturase enzymes is often required to desaturate PUFAs and LC-PUFAs.
Desaturase enzymes are required to have a combination of Δ5 and/ or Δ6 activity; however, alternative pathways also exist which utilize desaturase enzymes with Δ8 activity (Cook & McMaster, 2002;Monroig, Li, & Tocher, 2011;Sprecher, 2000). Genes that encode elongase enzymes are from the elongation of very long-chain fatty acid (Elovl) gene family. In mammals, seven members of the Elovl gene family have been identified, with different genes encoding elongase enzymes that have altered affinity to elongate precursor fatty acids. Specifically, elongase enzymes encoded by Elovl1, 3, 6, and 7 are involved in the elongation of saturated fatty acids (SFAs) and monounsaturated fatty acids (MUFAs), whereas Elovl2, 4, and 5 encode enzymes involved in the elongation of PUFAs to LC-PUFAs (Jakobsson, Westerberg, & Jacobsson, 2006;Leonard, Pereira, Sprecher, & Huang, 2004;Tamura et al., 2009). Despite this research, the distribution and evolution of genes that encode enzymes responsible for the desaturation and elongation of PUFAs remain largely unresolved in many metazoan taxa.
Studies investigating the fatty acid profiles of early diverging metazoan taxa have been focused on the fatty acid profile of cnidarians that rely on an interaction with symbionts, such as Symbiodinium (Garrett, Schmeitzel, Klein, Hwang, & Schwarz, 2013;Harland, Fixter, Davies, & Anderson, 1991, 1992Papina, Meziane, & van Woesik, 2003). From this body of work, there is strong evidence to suggest that the symbionts transfer essential LC-PUFAs to the host. This was evident with the fatty acid profile of sea anemones that were treated to remove symbionts revealing the presence of LC-PUFAs, ARA and EPA, but lacked LC-PUFAS > C 20 such as docosapentaenoic acid (DPA; 22:5n-3) and docosahexaenoic acid (DHA; 22:6n-3) (Garrett et al., 2013;Harland et al., 1991Harland et al., , 1992Papina et al., 2003). The fatty acid profile of early diverging metazoan species that lack a symbiotic relationship, however, remains unclear, and further research investigating the ability of these organisms to elongate and desaturate PUFAs to LC-PUFAs is required.
Using a comparative genomic approach, this study examined the distribution and copy number of Fad and Elovl genes from four cnidarian genomes (Hydra vulgaris, Acropora digitifera, Nematostella vectensis, and Exaiptasia pallida). A further fine-scale comparative transcriptomic analysis was also undertaken, within order Actiniaria, to identify specific candidate genes in this group. Phylogenetic and selection analyses of these data have also been performed to elucidate the molecular evolution of the Fad and Elovl gene families in Cnidarians. The fatty acid profile of candidate cnidarian species, Actinia tenebrosa (Figure 1), which lacks a symbiotic relationship with F I G U R E 1 Australian sea anemone, Actinia tenebrosa. Photograph credit: Jonathon Muller Symbiodinium (Black & Johnson, 1979;Muller, Fine, & Ritchie, 2016;Ottaway, 1978), was investigated using fatty acid analysis to address our lack of understanding of the baseline levels of fatty acids in these organisms. Finally, we examined if the fatty acid composition data were concordant with the Fad and Elovl enzymes found in this species.
Protein sequences generated from both genomic and transcriptomic datasets were then used to identify candidate genes. BLASTp (e value < 1e −05 ) was performed using the nonredundant translated ORFs as queries against the Swiss-Prot database. Potential Fad and Elovl candidates were identified that had a top-blast hit with a functionally characterized protein from the Swiss-Prot database (e value < 1e −05 ). Functionally characterized Fad and Elovl proteins were identified in the Swiss-Prot database by having the essential Pfam domains. For Fads, this required Pfam domains: Cyt-b5 (PF00173) and FA_desaturase (PF00487); and Elovls this required: ELO (PF01151). The respective candidates and functionally characterized Fad and Elovl proteins were aligned using MUSCLE in MEGA 7 (Kumar, Stecher, & Tamura, 2016). Sequences were retained only if they contained essential structural characteristics.
These included an N-terminal cytochrome b5-like binding domain (cyt-b5; PF00173), three histidine boxes (HXXXH, HXXXHH, and QXXHH) located in a fatty acid desaturase domain (FA_desaturase; PF00487), and a hem binding motif (HPGG) in Fads. Furthermore, functionally characterized sphingolipid desaturases that also contain these essential structural characteristics were removed from the alignments. The structural characteristics of Elovls included a diagnostic histidine box motif (HXXHH) and a Pfam ELO domain (PF01151). These structural characteristics are essential for desaturation and elongation, and therefore, transcripts not containing these domains were not considered. Candidate genes from actiniarian transcriptomes were checked for symbiont contamination using PSyTranS (https://github.com/sylvainforet/psytrans). The symbiont proteomes from Symbiodinium microadriaticum, Symbiodinium kawagutii, and Symbiodinium minutum were used as a training dataset to identify potential contamination, while the host proteome used for training was N. vectensis.

| Phylogenetic analyses
The refined list of full-length translated ORFs was used for phylogenetic analyses to determine the distribution of Fad and Elovl proteins within and across Metazoa. Protein sequences were aligned using MUSCLE in MEGA 7 (Kumar et al., 2016) followed by manual curation to remove sequences that lack conserved residues and motifs. Protein alignments were imported into IQ-TREE (v1.4.2) (Nguyen, Schmidt, von Haeseler, & Minh, 2014) to determine best-fit of protein model evolution. Using Bayesian information criterion, a LG+I+G4 model was selected for both Fad and Elovl as the best-fit model of protein evolution. Phylogenetic trees were generated from alignments using 1,000 ultrafast bootstrap iterations.

| Selection analyses
Sequences that encode full-length protein sequences for both Fad and Elovl proteins generated from actiniarian transcriptomes were investigated to detect the action of pervasive diversifying selection. These codon sequences for the respective Fad and Elovl gene families were aligned using MUSCLE within MEGA 7 (Kumar et al., 2016). Codon alignments were imported into IQ-TREE (v1.4.2) (Nguyen et al., 2014) to determine best-fit substitution model (GTR+I+G4), and phylogenetic trees were generated from alignments using 1,000 ultrafast bootstrap iterations. Using these alignments and phylogenetic trees as inputs, the rates of selection TA B L E 1 Fad and Elovl gene copy numbers in cnidarian taxa with sequenced genomes

Fad Elovl
Hydra vulgaris 1 1 Acropora digitifera 0 3 Nematostella vectensis 1 1 Exaiptasia pallida 3 4 could be determined using maximum-likelihood models in the program CODEML in PAML (v4.8) (Yang, 2007) using the protocol of (Fang et al., 2009) and codon frequency F3X4. To accurately determine significance, Bonferroni correction was computed to account for the repeated testing of multiple branches (Fletcher & Yang, 2010;Hunt et al., 2011) where Fad had n = 2 branches and Elovl had n = 4 branches and the adjusted p-value = .05/n. To detect pervasive purifying and diversifying selection, Fast Unconstrained Bayesian AppRoximation (FUBAR) (Murrell et al., 2013) was used from the HyPhy package (Pond, Frost, & Muse, 2005). min, and temperature held for 5 min. The mass spectrometer was equipped with an ion source (250°C). The data were acquired with Q3 scan mode from m/z 50-650. For data collection, the MS spectra were recorded from 4 to 30.5 min.

| Fatty acid analysis
All data were processed using GCMS Postrun Analysis software (Shimadzu, Kyoto, Japan). FAME identification was based on an internal spectral library as well as a series of FAME standards (20-component FAME mix) (Restek, Bellefonte, PA, USA) were used to identify retention times (R t ) of specific m/z profiles associated with known FAs. The data processing included smoothing, peak detection, integration, peak alignment, normalization, and identification. Extraction and solvent blanks were included in the analysis to allow exclusion of ions detected at lipid masses that result from extraction chemical or solvent impurities. Quantification was achieved by comparison of the peak area of individual lipids to the internal standard.

| Identification of candidate genes
The distribution and copy number of Fad and Elovl genes found in cnidarian species with sequenced genomes are shown in  (Tables S1 and S2).
Fad and Elovl gene copies were identified in all transcriptomes (

| Comparative and phylogenetic analyses of Fad and Elovl gene families
Using a phylogenetic framework, we investigated the distribution of Fad genes across Metazoa (Figure 2). A maximum-likelihood tree revealed three distinct clades, which we name A, B, and C. Clades A and B were sisters to each other, with clade C the most divergent.
All bilaterian Fad proteins are found in clade B. Sequences within clades A and C are found to be from phylum Cnidaria as well as non-   Figure 3 and was annotated as Elovl1, Elovl7, Elovl1/7-like, Elovl4, Elovl5, Elovl2, a subclade that include sequences from actiniarian taxa that did not cluster with any functionally characterized sequences (Novel ElovlA), and a subclade that included cnidarian taxa that did not cluster with any functionally characterized sequences

| Selection analysis of the Fad and Elovl gene families
To investigate the selection pressures on both the Fad and Elovl gene families, multiple analyses were performed. Results from the site model selection analysis reveal that both the Fad and Elovl gene families show patterns of nucleotide variation consistent with the action of pervasive purifying selection (  (Table S3) (Table 4).
Finally, the branch-sites model was implemented to test for codons under episodic diversifying selection following gene duplication ( Figure 4). The same foreground branches as previously described were tested ( Figure S1). Foreground branches with T R 4 1 8 9 4 _ c 0 _ g 1 _ i2 _ m .3 0 2 0 6 9 _ A c ti n ia _ te n e b ro s a      Figure 5).
The fatty acid profile of M. alpina has been found to have high levels of ARA and also EPA when conditions are optimal, but lack > C 20 PUFAs, such as DHA (Knutzon et al., 1998;Michaelson et al., 1998).
Furthermore, M. alpina has been shown to encode an additional Δ6 desaturase enzyme; however, this sequence is not present in the Swiss-prot database and therefore was not included in this analysis (Knutzon et al., 1998;Michaelson et al., 1998).
The fatty acid analysis in A. tenebrosa revealed a similar fatty acid profile as M. alpina, with the presence of EPA and ARA, and an absence of DHA. Although A. tenebrosa was starved prior to fatty acid analysis, it is likely that some fatty acids from the diet are incorporated into its lipid profile. This is evident with both A. tenebrosa and prawn sharing similar lipid profiles; however, this inflated concentration of FAME does not account for the lack of DHA found in the fatty acid profile of A. tenebrosa.
In mammals, the Elovl gene family has been comprehensively investigated, revealing repeated rounds of gene duplication, resulting in seven members: Elovl1-7 (Jakobsson et al., 2006;Leonard et al., 2004). Of these seven genes, only the proteins encoded by with only Elovl4 identified in most species investigated. It should be noted that the Elovl4 protein has been shown to elongate LC-PUFAs > C 20 Monroig et al., 2013Monroig et al., , 2016, and therefore, functional characterization of the Elovl4 protein in cnidarians is required.  (Figure 4). From Figure 3, the Elovl genes that clustered to the Novel ElovlA and Novel ElovlB subclades appear to be actiniarian-specific and cnidarian-specific, respectively. In particular, 15 codons were identified to be under episodic diversifying selection on branch 4 of Figure 4, which corresponds to actiniarian Elovl genes orthologs to bilaterian Elovl4 ( Figure 3). As Elovl4 proteins are responsible for elongating PUFAs, these codons may have a role in the targeting and elongating of PUFAs. A study from the genus Drosophila revealed that the Fad gene family in this group is under the influence of pervasive purifying selection but also episodic diversifying selection at specific codons (Fang et al., 2009). This study also found that the majority of the codons under episodic diversifying selection occurred in clades produced by duplication events. The authors suggest that the amino acid residues under positive selection may be responsible for altered substrate selectivity.
To date, research investigating the ability of metazoan taxa to biosynthesis LC-PUFAs has been restricted to bilaterian taxa. Here, we provide the first comprehensive analysis of Fad and Elovl gene families across phylum Cnidaria. Our analysis revealed that lineagespecific gene duplication has played a major role in the distribu- and Elovl genes required for the biosynthesis of some LC-PUFAs, and these genes appear to share a greater similarity to non-metazoan eukaryotes.

ACK N OWLED G M ENTS
The authors would like to thank the Evolutionary and Physiological Genomics Lab (ePGL), in particular Chloé A. van der Burg, for their continual help and support. The authors would also like to thank QUT Marine group for their help and advice caring for the animals. Toledo to study at the QUT and contribute toward this project's work.

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R CO NTR I B UTI O N S
AP, JS, TT, and PP conceived and designed the project. JS performed phylogenetic and selection analysis TT performed fatty acid analysis. AP, JS, TT, and PP wrote and edited the manuscript. All authors read and approved the final manuscript.