A toolkit for Nannochloropsis oceanica CCMP1779 enables gene stacking and genetic engineering of the eicosapentaenoic acid pathway for enhanced long‐chain polyunsaturated fatty acid production

Summary
Nannochloropsis oceanica is an oleaginous microalga rich in ω3 long‐chain polyunsaturated fatty acids (LC‐PUFAs), in the form of eicosapentaenoic acid (EPA). We identified the enzymes involved in LC‐PUFA biosynthesis in N. oceanica CCMP1779 and generated multigene expression vectors aiming at increasing LC‐PUFA content in vivo. We isolated the cDNAs encoding four fatty acid desaturases (FAD) and determined their function by heterologous expression in S. cerevisiae. To increase the expression of multiple fatty acid desaturases in N. oceanica CCMP1779, we developed a genetic engineering toolkit that includes an endogenous bidirectional promoter and optimized peptide bond skipping 2A peptides. The toolkit also includes multiple epitopes for tagged fusion protein production and two antibiotic resistance genes. We applied this toolkit to build a gene stacking system for N. oceanica that consists of two vector series, pNOC‐OX and pNOC‐stacked. These tools were used to test the effects of overproducing one, two or three desaturase‐encoding cDNAs in N. oceanica CCMP1779 and to prove the feasibility of gene stacking in this genetically tractable oleaginous microalga. All FAD overexpressing lines had considerable increases in the proportion of LC‐PUFAs, with the overexpression of Δ12 and Δ5 FAD encoding sequences leading to an increase in the final ω3 product, EPA.


Introduction
Long-chain polyunsaturated fatty acids (LC-PUFAs) are hydrocarbon acyl chains (18-22 carbons) containing multiple cis double bonds. The position of the double bond closest to the methyl (ω) end of a fatty acid, usually three or six carbons from that end, differentiates LC-PUFAs into ω3 or ω6 classes, respectively. Evidence suggests that consuming LC-PUFAs in adequate amounts and in a balanced ratio is essential for human physical and mental health (Chen et al., 2016), while current diets are often low in ω3 LC-PUFAs.
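The ω classification described above can be illustrated with a minimal sketch. It assumes double bonds are given as Δ positions counted from the carboxyl end, so the ω number is the chain length minus the highest Δ position; the function name `omega_class` is hypothetical and not part of the original work.

```python
def omega_class(chain_length: int, delta_positions: list[int]) -> int:
    """Return n for the omega-n class of an unsaturated fatty acid.

    delta_positions are double-bond positions counted from the carboxyl
    end (delta nomenclature). The omega number counts carbons from the
    methyl end to the nearest double bond.
    """
    if not delta_positions:
        raise ValueError("a saturated fatty acid has no omega class")
    return chain_length - max(delta_positions)


# EPA, 20:5 with double bonds at delta 5, 8, 11, 14, 17
epa_class = omega_class(20, [5, 8, 11, 14, 17])
# 20:4 with double bonds at delta 5, 8, 11, 14
ara_class = omega_class(20, [5, 8, 11, 14])
print(f"EPA is omega-{epa_class}; 20:4 is omega-{ara_class}")
```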
Currently, seafood is the major source of ω3 LC-PUFAs, particularly EPA (20:5; number of carbons:number of double bonds) and docosahexaenoic acid (DHA, 22:6). However, overfishing, habitat destruction and pollution have reduced wild fish stocks, while increasing human populations drive a record demand for LC-PUFA nutrients (Betancor et al., 2015; Petrie and Singh, 2011). Although fish are rich in ω3 fatty acids, the majority of LC-PUFA acyl chains originate at the base of the food chain in marine microalgae (Doughman et al., 2007; Martins et al., 2013). Aquaculture of microalgae is an emerging source of LC-PUFAs for fish farming or direct human consumption, and understanding the biosynthetic pathways and factors that influence LC-PUFA metabolism in algae will enable increased production from these organisms (Mühlroth et al., 2013).
Metabolic engineering of multigene pathways has been developed using a variety of strategies (Halpin, 2005;Naqvi et al., 2010). Commonly, each transgene is placed under the regulation of separate promoters and terminators, and single gene expression plasmids are independently introduced into the host or assembled into a larger multigene expression plasmid (Halpin, 2005). However, other strategies exist to express multiple genes. For example, bidirectional promoters allow compact assembly of coregulated gene pairs and have been utilized in a number of transgenic strategies, including in Nannochloropsis species (Kilian et al., 2011;Moog et al., 2015).
An obstacle to compact multigene expression systems in eukaryotes is the requirement of each transgene to be encoded in individual mRNAs. However, introducing short viral 2A peptide coding sequences (Szymczak et al., 2004;Szymczak-Workman et al., 2012) prevents the formation of a peptide bond during translation, allowing multiple proteins to be encoded by a single mRNA molecule, which has increased the efficiency of multiple coding sequence expression in eukaryotic cells. Advantages of using 2A peptides include their compact size of 20-60 amino acids (aa), high efficiency in most eukaryotes tested to date, low toxicity and stoichiometric coexpression of linked proteins (de Felipe et al., 2003;Kim et al., 2011;Sharma et al., 2012;Szymczak et al., 2004;Szymczak-Workman et al., 2012). However, the efficiency of peptide bond skipping varies depending on the 2A peptide and the host combination; therefore, optimization for the respective host is needed for the approach to become practical.
Using state-of-the-art technology, we developed vectors that combine a highly active unidirectional promoter (EF) or a bidirectional promoter (Ribi) with a variety of reporter proteins and epitope tags for optimized transgene expression in N. oceanica CCMP1779. Moreover, we optimized a viral 2A peptide for polycistronic expression to generate a multigene expression system for this oleaginous microalga. This vector toolkit was successfully used to manipulate the EPA biosynthesis pathway in N. oceanica CCMP1779.

Results
Genes encoding enzymes of the eicosapentaenoic acid biosynthesis pathway are coexpressed
The identification of ω6 intermediates (20:4 Δ5,8,11,14) (Figure 1a) points to the presence of an ω6 pathway for EPA biosynthesis in N. oceanica CCMP1779 (Figure 1b). We isolated the cDNAs for the five FADs and a putative Δ6 fatty acid elongase (FAE) proposed to be involved in LC-PUFA biosynthesis in N. oceanica CCMP1779 (Figure 1b). These cDNA sequences served to update the gene models of the Δ9, Δ12, Δ6, Δ5 and ω3 FADs and the Δ6 FAE. Their updated cDNA sequences were deposited at NCBI with the accession numbers KY214449, KY214450, KY214451, KY214453, KY214454 and KY214452, respectively. Using the corrected gene models, we observed that the FAD genes are highly coexpressed under light/dark conditions with a maximum expression 6 h after dawn (Figure 1c).
Computational tools and manual examination were used to predict functional domains, subcellular localization and transmembrane sequences in these proteins in support of their initial functional annotations (Figure 2). The FADs contain typical fatty acid desaturase domains, in particular the crucial three histidine boxes for coordinating a diiron centre in the active site (Broun, 1998; López Alonso et al., 2003). Moreover, the Δ6 and Δ5 FADs contain a cytochrome b5 domain that donates electrons for desaturation, as observed for front-end desaturases of eukaryotic origin, as well as glutamine substitutions in the third histidine box characteristic of front-end desaturases (Domergue et al., 2002; Hashimoto et al., 2008; Meesapyodsuk and Qiu, 2012).
The Δ6 FAE contains an ELO family elongase domain, which is involved in very-long-chain fatty acid elongation and sphingolipid formation (Oh et al., 1997; Tvrdik et al., 2000). Further evidence of a possible elongase function is provided by the presence of FLHXYHH and MYSYY motifs characteristic of Δ6 and Δ5 fatty acid elongases; the first is positioned with an upstream glutamine characteristic of PUFA elongases, and the latter is typically found in microalgal Δ6 and Δ5 fatty acid elongases (Hashimoto et al., 2008; Jiang et al., 2014; Yu et al., 2012) (Figure 2).

N. oceanica CCMP1779 FADs catalyse the production of LC-PUFAs in yeast
To test the biological activity of the putative FADs and FAE from N. oceanica CCMP1779, the pathway was reconstituted in the heterologous host Saccharomyces cerevisiae using a gene stacking strategy (Figure 3a). S. cerevisiae contains a single Δ9 FAD, which produces 16:1 Δ9 and 18:1 Δ9 (Stukey et al., 1990). Therefore, to generate EPA from the endogenous 18:1 Δ9, it is necessary to introduce four additional desaturases and an elongase. Towards this end, we coexpressed cDNAs encoding the Δ12, Δ6, Δ5 and ω3 FADs and the Δ6 FAE under the control of galactose-inducible promoters in the S. cerevisiae strain InvSc1 (Figure 3a).
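The chain of reactions needed to reach EPA from 18:1 Δ9 can be traced with a short sketch. The step order used here (Δ12 desaturation, Δ6 desaturation, elongation, Δ5 desaturation, ω3 desaturation) is an assumption based on the textbook ω6 route and on the 20:4 Δ5,8,11,14 intermediate mentioned in the text; the class and function names are hypothetical and the model ignores acyl carriers and enzyme kinetics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    carbons: int
    bonds: tuple  # delta positions, counted from the carboxyl end

    def label(self) -> str:
        deltas = ",".join(str(p) for p in self.bonds)
        return f"{self.carbons}:{len(self.bonds)} Δ{deltas}"


def desaturate(fa: FattyAcid, position: int) -> FattyAcid:
    """Add a double bond at the given delta position."""
    if position in fa.bonds:
        raise ValueError(f"double bond already present at Δ{position}")
    return FattyAcid(fa.carbons, tuple(sorted(fa.bonds + (position,))))


def desaturate_omega3(fa: FattyAcid) -> FattyAcid:
    """Add a double bond three carbons from the methyl end."""
    return desaturate(fa, fa.carbons - 3)


def elongate(fa: FattyAcid) -> FattyAcid:
    """Add two carbons at the carboxyl end; every delta position shifts by 2."""
    return FattyAcid(fa.carbons + 2, tuple(p + 2 for p in fa.bonds))


# Assumed step order for the omega-6 route (illustrative only).
PATHWAY = [
    ("Δ12 FAD", lambda fa: desaturate(fa, 12)),
    ("Δ6 FAD", lambda fa: desaturate(fa, 6)),
    ("Δ6 FAE", elongate),
    ("Δ5 FAD", lambda fa: desaturate(fa, 5)),
    ("ω3 FAD", desaturate_omega3),
]


def run_pathway(start: FattyAcid) -> list:
    """Apply each enzyme in turn and collect every intermediate."""
    products = [start]
    for _, step in PATHWAY:
        products.append(step(products[-1]))
    return products


products = run_pathway(FattyAcid(18, (9,)))
for (enzyme, _), product in zip(PATHWAY, products[1:]):
    print(f"{enzyme}: {product.label()}")
```

Four desaturation steps and one elongation step appear in the list, matching the count of enzymes that must be added to yeast.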

A vector toolkit for multigene expression in Nannochloropsis species
To facilitate the coexpression of multiple coding sequences and characterization of the respective enzymes, we generated a set of vectors to overexpress multiple FADs in N. oceanica (Figure 4a). For single transcript expression vectors, we chose the elongation factor promoter (EFpro) (NannoCCMP1779_10181) due to its high constitutive activity during light:dark cycles (Figure 4b). We generated a vector series (pNOC-OX) encoding a variety of reporter proteins and epitope tags, including hemagglutinin (HA), green fluorescent protein (eGFP), cyan fluorescent protein (cerulean), yellow fluorescent protein (venus) and the ultra-bright, codon-optimized (Table S2) NanoLuciferase (Nlux) (Hall et al., 2012), flanked by glycine-serine-glycine encoded linkers and a set of compatible restriction sites (AscI/HpaI and MluI/NruI) which enable translational fusion of the epitope tags to either the C or N terminus of the targeted protein (Figure 4a).
To identify candidate bidirectional promoters in N. oceanica, a custom Python script (Data S1) was used to find diverging gene pairs that were coexpressed (Table S3). We selected the intergenic region between two ribosomal subunits as a promising candidate bidirectional promoter (Ribi) due to a high degree of coexpression and moderate expression levels throughout the light:dark cycle of the respective gene pair (NannoCCMP1779_9669, NannoCCMP1779_9668) (Figure 4b). The Ribi promoter (after modification to remove MluI and NruI recognition sites, Figure S2) was assembled with the best performing selection marker P2A cassette coding sequences (BleR-P2A(60) and HygR-P2A(60)) and the toolkit epitope tag coding sequences to generate the pNOC-stacked vector series (Figure 4a).
To test the newly developed vectors, we transformed N. oceanica CCMP1779 with pNOC-OX-CFP, pNOC-OX-Nlux and pNOC-stacked-Nlux. Production of CFP and Nlux was confirmed in selected strains by immunoblotting with α-GFP and α-HA antibodies, respectively (Figure 4c, Figure S3a). To assess the activity of the selected promoters, transformants of N. oceanica carrying pNOC-OX-Nlux or pNOC-stacked-Nlux were screened for their luminescence signals. To compare the Nlux reporter signal from each promoter quantitatively, luminescence was measured from an equal number of cells of pNOC-OX-Nlux and pNOC-stacked-Nlux transformants. Reflecting the higher activity of EFpro relative to Ribi, the luminescence of Nlux in pNOC-OX-Nlux lines was greater than in pNOC-stacked-Nlux lines (Figure S3b).
Viral-derived 2A peptides are used for polycistronic expression of multiple transgenes in eukaryotes and are widely used to tie resistance markers to the production of target proteins. We first determined the ribosomal skipping efficiency of the three most commonly used 2A peptides of ~20 aa, designated 2A peptide (amino acid length). The F2A(24), T2A(18) and P2A(19) coding sequences were appended to the zeocin/bleomycin (BleR) and/or hygromycin (HygR) resistance marker genes, followed by insertion of the HA-tagged firefly luciferase (Flux) coding sequence (Figure 5a). Introduction of these constructs into N. oceanica CCMP1779 resulted in zeocin- or hygromycin-resistant colonies, of which those with high luciferase activity were selected for further study. Immunoblotting detected full-length BleR-2A-Flux protein and only small amounts of released Flux in the first round of screening (Figure 5b, Figure S4). Ribosomal skipping efficiency was less than 10% for the F2A and T2A sequences, while P2A had an efficiency of 10-30% (Figure S4c). To increase ribosomal skipping efficiency, the 2A peptide-encoding sequences were extended at the N terminus: F2A to 58 aa, and P2A to 30 aa, 45 aa and 60 aa. These changes enhanced the ribosomal skipping efficiency for F2A(58) to ~20%, for P2A(30) to ~40%, for P2A(45) to ~50% and for P2A(60) to >50% (Figure 5b, Figure S4). Based on these results, the extended P2A sequence was selected as the most promising 2A peptide for use in N. oceanica CCMP1779.
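A ribosomal skipping efficiency of this kind can be estimated from immunoblot band intensities. The sketch below assumes that efficiency is defined as the released (skipped) product divided by the sum of released and full-length fusion signal, and that both bands are detected equally well through the same epitope tag; the function name and the intensity values are hypothetical and are not taken from the figures.

```python
def skipping_efficiency(released: float, fusion: float) -> float:
    """Fraction of reporter signal found in the released (skipped) band.

    released: band intensity of the free reporter protein
    fusion:   band intensity of the uncleaved marker-2A-reporter fusion
    """
    if released < 0 or fusion < 0:
        raise ValueError("band intensities must be non-negative")
    total = released + fusion
    if total == 0:
        raise ValueError("no signal detected in either band")
    return released / total


# Hypothetical band intensities (arbitrary units) for two constructs.
bands = {
    "short 2A": {"released": 10.0, "fusion": 90.0},
    "extended 2A": {"released": 55.0, "fusion": 45.0},
}
for name, b in bands.items():
    eff = skipping_efficiency(b["released"], b["fusion"])
    print(f"{name}: {eff:.0%} skipping")
```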

Overexpression of EPA biosynthesis genes in N. oceanica CCMP1779
We first generated lines expressing the Δ9, Δ12 and Δ5 FADs under the control of the EF promoter in N. oceanica CCMP1779 (designated DOX9, DOX12 and DOX5 lines, respectively) (Figure 6a). The Δ9 FAD coding sequence was cloned into pNOC-P2A(30) (Figure 5) to generate the pNOC-DOX9 vector. The Δ12 and Δ5 FAD coding sequences were placed in overexpression vectors with C-terminal epitope tags (pNOC-DOX12-HA, pNOC-DOX12-CFP, pNOC-DOX5). Lines with changes in their fatty acid profiles were selected for further studies. The DOX12 and DOX5 lines produced appropriately sized proteins (Figure 6b). However, we did not detect full-length HA-Δ9 FAD in DOX9 lines but only several small molecular weight polypeptide products (Figure 6b), which could be due to degradation of the tagged protein.
Figure 4 Assembly of native promoters, terminators and a range of reporters to generate a transgenic expression toolkit for N. oceanica CCMP1779. (a) The pNOC-OX vector series contains a series of reporters under the control of the EF promoter with the LDSP terminator, and a hygromycin resistance gene under the control of the LDSP promoter and 35S terminator. Epitopes include the cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), green fluorescent protein (GFP), NanoLuciferase (Nlux) and hemagglutinin peptide (HA) encoding sequences. P2A(60) is an extended 2A peptide coding sequence placed 3′ of the zeocin resistance gene (BleR) or hygromycin resistance gene (HygR) for bicistronic expression by ribosomal skipping. The pNOC-stacked vectors utilize a bidirectional promoter (Ribi) for coexpression of a reporter and resistance gene with 2A peptide coding sequence and multicloning site followed by a heat-shock terminator. (b) RNA expression of endogenous genes corresponding to the promoters, NannoCCMP1779_10181 (EFpro), and the gene pair NannoCCMP1779_9669 and NannoCCMP1779_9668 (Ribi promoter) under light:dark cycles (data from Poliner et al., 2015). (c) Transgenic protein confirmation by immunoblot of pNOC-OX-CFP transformants detected with α-GFP. Total protein was stained using the dye DB71.
Confocal microscopy of DOX5 and DOX12 lines revealed that the desaturase CFP fusion proteins have an ER subcellular location, as indicated by overlap with an ER-specific marker dye, supporting the protein annotation and predicted location ( Figure S5).
To test whether overproduction of more than one FAD would improve LC-PUFA accumulation in N. oceanica CCMP1779, we transformed wild-type and the DOX5 strain B3 with a desaturase stacking vector containing the HA-Δ9 FAD and Δ12 FAD-HA coding regions under the control of the Ribi bidirectional promoter (pNOC-stacked-DOX9 + 12) (Figure 6a,c). DOX9 + 12 and DOX5 + 9 + 12 lines with changes in their fatty acid profile were selected for further analyses. Immunoblotting of selected lines detected full-length Δ12 FAD-HA and HA-Δ9 FAD peptides in all lines, and Δ5 FAD-CFP in the triple FAD overexpression lines (Figure 6d).
To determine the level of overexpression, mRNA levels were quantified by qPCR (Figure 7a). Lines transformed with the pNOC-OX vectors (Figure 4a) showed expression up to 3.5-fold higher than the wild type, while DOX9 overexpression using pNOC-P2A(30) led to an approximately eightfold increase in mRNA. In the DOX9 + 12 lines, the Δ12 FAD mRNA content was 10-30 times that of the wild type and displayed large differences between the lines, while the Δ9 FAD expression was increased two- to fourfold.

Increase in LC-PUFA content in FAD overexpressing lines
Fatty acid profiling of wild-type, empty vector controls (EV) and DOX lines showed that an increase in FAD production altered fatty acid proportions (Table 1). While EV controls did not cause alterations to the fatty acid profile compared to the wild type, the overexpression of Δ9 FAD led to a small increase in the mol per cent of its product 18:1 Δ9, and the overexpression of the Δ12 or Δ5 FADs resulted in a higher LC-PUFA fraction. We did not observe a further increase in LC-PUFAs in the stacking lines, DOX9 + 12 and DOX5 + 9 + 12, and these lines had a similar ~25% increase in EPA (20:5 Δ5,8,11,14,17) and 35% increase in LC-PUFA mol ratio as the single FAD overexpressing lines (Table 1).
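The relation between mol per cent values and the relative increases quoted above can be made explicit with a small sketch. The fatty acid amounts below are invented for illustration and do not reproduce Table 1; the function names are hypothetical.

```python
def mol_percent(amounts_nmol: dict) -> dict:
    """Convert per-fatty-acid molar amounts into mol per cent of the total."""
    total = sum(amounts_nmol.values())
    if total <= 0:
        raise ValueError("total fatty acid amount must be positive")
    return {name: 100 * value / total for name, value in amounts_nmol.items()}


def relative_increase(reference: float, sample: float) -> float:
    """Percentage change of a sample value relative to a reference value."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return 100 * (sample - reference) / reference


# Hypothetical molar amounts (nmol) for a wild-type and an overexpression line.
wild_type = mol_percent({"16:0": 40.0, "16:1": 40.0, "20:5": 20.0})
dox_line = mol_percent({"16:0": 37.5, "16:1": 37.5, "20:5": 25.0})

change = relative_increase(wild_type["20:5"], dox_line["20:5"])
print(f"EPA: {wild_type['20:5']:.1f} -> {dox_line['20:5']:.1f} mol% ({change:+.0f}%)")
```

A relative increase of 25% in this sense means the mol per cent value itself rose by a quarter, not that EPA gained 25 percentage points.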
To further assess the impact of FAD overproduction, we measured cell growth over several days (Figure 7b). The growth rate of EV, DOX5, DOX12, DOX9 + 12 and DOX5 + 9 + 12 lines, but not DOX9, was increased with respect to the wild type ( Figure S6). This effect could be related to the decreased average cell sizes of these lines (Figure 7c). Total cellular fatty acid content per cell was likewise decreased in DOX5, DOX12, DOX9 + 12 and DOX5 + 9 + 12 lines, while EV and DOX9 lines were unaffected (Table 2).
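Growth rates from cell counts are commonly expressed as a specific growth rate. The sketch below assumes exponential growth between two time points; the text does not state how growth rate was calculated, so this formula and the cell counts are illustrative assumptions only.

```python
from math import log


def specific_growth_rate(n_start: float, n_end: float, hours: float) -> float:
    """Specific growth rate (per hour), assuming exponential growth."""
    if n_start <= 0 or n_end <= 0 or hours <= 0:
        raise ValueError("cell counts and elapsed time must be positive")
    return log(n_end / n_start) / hours


def doubling_time(mu: float) -> float:
    """Doubling time in hours for a specific growth rate mu (per hour)."""
    if mu <= 0:
        raise ValueError("growth rate must be positive")
    return log(2) / mu


# Hypothetical counts: 1e6 cells/mL growing to 4e6 cells/mL in 48 h.
mu = specific_growth_rate(1e6, 4e6, 48.0)
print(f"mu = {mu:.4f} per hour, doubling time = {doubling_time(mu):.1f} h")
```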

Expanding transgenic techniques in N. oceanica
A strong biotechnological interest has driven development of transgenic tools for Nannochloropsis species in recent years (Kang et al., 2015b; Kilian et al., 2011; Radakovits et al., 2012; Vieler et al., 2012). However, a comprehensive and modular protein production toolkit vector set allowing the engineering of convenient epitope-tagged fusion proteins was lacking. Such a toolkit with multiple genetic reporters, selection markers, two additional promoters and several strategies for multigene expression is now available. We have tested two new promoters for protein overproduction in N. oceanica CCMP1779. To enhance expression of single transgenes in Nannochloropsis species, engineers have used several endogenous promoters including the lipid droplet surface protein (LDSP) (Kaye et al., 2015; Vieler et al., 2012), β-tubulin, heat-shock protein 70 and ubiquitin extension promoters (UEP) (Kang et al., 2015a; Radakovits et al., 2012). We have tested an elongation factor promoter that displays high and stable expression throughout light:dark cycles (Poliner et al., 2015) and has been utilized in diverse organisms (Gill et al., 2001; Kim et al., 1990), including diatoms (Seo et al., 2015). A variety of promoters allows gene expression at different levels and in response to different environmental conditions. Furthermore, repeated use of transgenic elements can lead to genetic instability, and as gene stacking techniques mature in Nannochloropsis species, multiple promoters are desirable to modulate the expression of many genes. Bidirectional promoters are extremely useful for multigene expression. The endogenous bidirectional promoter (VCP2) between a pair of violaxanthin/chlorophyll a-binding proteins (NannoCCMP1779|4698 and NannoCCMP1779|4699) has been used to drive resistance genes (Kilian et al., 2011) and has been paired with coding sequences for fluorescent proteins with subcellular location tags (Moog et al., 2015).
We have tested the ribosomal component S15 and S12 bidirectional promoter (Ribi) and identified several additional bidirectional promoters that may also be suitable for transgenic expression including histones, other VCPs and nitrate reductase (Table S3). These bidirectional promoters can be used for selection of high target transgene expression or for reducing potential silencing effects of transgenes by linking transgene production to a resistance marker.
Transgenic techniques in a number of eukaryotes have exploited 2A peptides, which enable a single transcript to encode multiple discrete protein products. A variety of 2A peptides have been identified and exploited with differing levels of efficiency in a number of eukaryotes, including yeast (Doronina et al., 2008), animals (Kim et al., 2011; Szymczak et al., 2004), plants (Burén et al., 2012), insects (Wang et al., 2015) and algae (Plucinak et al., 2015; Rasala et al., 2012). Applying this approach to stramenopiles, we have successfully developed an extended P2A sequence that is efficient for peptide bond skipping in N. oceanica CCMP1779. When it was placed between a resistance gene and the firefly luciferase reporter, we obtained a skipping efficiency of >60%. Moreover, the overexpression of the Δ9 FAD coding sequence downstream of BleR-P2A(30) resulted in a nonsignificant increase in 18:1 Δ9, indicating that enzymes produced as a P2A-mediated fusion with a resistance gene are functional in N. oceanica CCMP1779. Our extensions of the P2A sequence showed diminishing increases in cleavage efficiency, suggesting that an optimal length has been found (Figure S4). Therefore, the P2A peptide is a promising tool for producing discrete proteins from a single reading frame in N. oceanica CCMP1779, and further mutational studies of the P2A sequences could potentially yield higher performing variants.
We anticipate that the development of the pNOC-OX and pNOC-stacked expression vectors will facilitate research in Nannochloropsis species and lay the foundation for combinatorial genetic engineering in an oleaginous microalga chassis for synthetic biology.

Characterization of the EPA biosynthetic pathway
As an example of multigene metabolic engineering in N. oceanica CCMP1779, we identified and characterized the four fatty acid desaturases and one elongase involved in the production of the LC-PUFA, EPA (20:5). Heterologous expression in S. cerevisiae confirmed the predicted biochemical function of each gene in the pathway, leading to EPA production without the supply of external fatty acids (Figure 3, Table S1). We also showed that the Δ5 and ω3 FADs can both use ω3 or ω6 fatty acids when reconstituted in yeast (Figure S1). Similarly, N. oceanica Δ6 FAD expressed in S. cerevisiae was able to process ω3 or ω6 fatty acids when exogenously supplied (Ma et al., 2011), suggesting substrate preference may be determined by glycerolipid acyl carriers.
We used our gene stacking vector toolset (Figure 4) to increase expression levels of single or multiple endogenous genes involved in EPA synthesis. Strains overexpressing one, two or three FAD encoding genes displayed elevated fractions of LC-PUFAs, notably EPA (Table 1). However, we did not observe a further increase in LC-PUFAs in the lines overproducing more than one enzyme. The isolation of the EPA biosynthetic genes enables further studies into regulation of the pathway and provides tools useful for manipulation of LC-PUFA production in N. oceanica and heterologous hosts.

Metabolic engineering for increased EPA content in N. oceanica
The overexpression of single Δ5 or Δ12 FADs led to an approximate 25% increase in EPA mol ratio (Table 1). [...] control of a fucoxanthin chlorophyll a/c binding protein gene promoter led to a 58% increase in EPA (Peng et al., 2014).
Overproduction of an elongase involved in LC-PUFA production in Thalassiosira pseudonana led to a 40% increase in EPA (Cook and Hildebrand, 2015). It has been previously shown that in N. oceanica CCMP1779, the inducible overproduction of the Δ12 FAD using the LDSP promoter produced a 50%-75% increase in the 20:4 Δ5,8,11,14 mol ratio during nitrogen deprivation and stationary phase (Kaye et al., 2015). The FAD genes in N. oceanica CCMP1779 are coexpressed under diurnal light:dark cycles during exponential growth (Figure 1) (Poliner et al., 2015). Therefore, we had initially hypothesized that high expression of multiple enzymes is necessary to maintain flux through the pathway and further increase end-product concentration. However, the overproduction of multiple desaturase proteins in the DOX9 + 12 and DOX5 + 9 + 12 lines did not have an additive effect. These results suggest that there might be a limit to PUFA content in N. oceanica CCMP1779 before cell physiology is affected under the growth conditions tested.
The cell division rate of DOX lines was not reduced when compared to wild-type cells. However, decreases in total cellular fatty acids of DOX lines expressing the Δ12 or Δ5 FAD coding sequences indicate that the physiology of N. oceanica is compromised by desaturase overproduction (Table 2). In N. oceanica, EPA is a major endogenous LC-PUFA that is associated with polar lipids of plastidic membranes and likely has a functional role in the photosynthetic membrane (Valentine and Valentine, 2004). Typically, in response to environmental conditions, membrane saturation is altered to maintain membrane fluidity, and LC-PUFAs are increased in membranes during lower temperatures and high light. Interference with native membrane composition may have deleterious effects that limit the accumulation of EPA beyond certain levels.
A higher relative increase in some LC-PUFAs has been achieved when the strategy has been to introduce novel LC-PUFAs (Ruiz-Lopez et al., 2014; Xue et al., 2013) or to elevate the level of minor LC-PUFAs initially present in small amounts (Hamilton et al., 2014). For example, the heterologous expression of both a Δ6 FAD and a Δ5 FAE coding sequence in P. tricornutum led to a stronger DHA accumulation than in the single overexpressing lines (Hamilton et al., 2014). However, wild-type P. tricornutum contains only trace amounts of DHA, and in these transgenic lines, the accumulation of DHA correlated with a strong decrease in EPA levels, indicating that these transgenics increased partitioning towards DHA but did not increase overall flux through the LC-PUFA pathway. These observations support the hypothesis of a limit on LC-PUFA content imposed on the cell under specific growth conditions.
In addition to possible negative effects on algal physiology, it is likely that other steps in LC-PUFA biosynthesis, such as fatty acid elongation or Δ6 or ω3 desaturation, are rate limiting and need further enhancement to increase LC-PUFA content. Moreover, it is possible that increased LC-PUFA turnover by β-oxidation is involved in maintaining a balanced LC-PUFA content (Moire, 2004; Xue et al., 2013). Down-regulation of the native EPA biosynthetic pathway at a transcriptional or post-transcriptional level may also compensate for the overproduction of FADs described here.
Strategies for sequestration of LC-PUFAs in TAG could be an option to overproduce LC-PUFAs without compromising cell physiology (Kaye et al., 2015). Although N. oceanica contains little LC-PUFA in TAG under normal conditions, the content of LC-PUFAs in TAG increases following cellular stresses (Liu et al., 2013; Vieler et al., 2012). We observed a decrease in 16:0 and 16:1 per cell in the DOX lines (Table 2); however, the overexpression of a Δ12 FAD using a stress-inducible promoter (Kaye et al., 2015) did not cause such a decrease, indicating that the timing of gene expression is likely to be important for minimizing negative effects of enhanced LC-PUFA content, possibly by sequestering LC-PUFAs in TAG. Diacylglycerol acyltransferases (DGATs) and phospholipid:diacylglycerol acyltransferases (PDATs) that have preferences for 20C fatty acids also offer potential tools for channelling EPA to the TAG stores (Manandhar-Shrestha and Hildebrand, 2015; Xu et al., 2013). N. oceanica contains 12 DGATs and two PDATs that are likely to have different substrate preferences, including for LC-PUFAs (Zienkiewicz et al., 2017). Studies into the functional effects of altering the fatty acid profile, as well as the identification of compensating internal forms of regulation, are needed to identify strategies for further accumulation of LC-PUFAs.

Growth conditions
Axenic cultures were grown in shaking flasks of F/2 medium under 100 μm/s/m² white light, at 22°C and 120 rpm. For protein, metabolite and gene expression analyses, cells were grown under constant light and samples were collected from mid-log cultures (30 × 10⁶ cells/mL).

Cloning of N. oceanica CCMP1779 EPA pathway genes
Axenic cultures were grown under a 12:12 light:dark cycle at 22°C (Poliner et al., 2015). Cell counts and cell size measurements were obtained using a Coulter Counter Z2 (Beckman Coulter) with a profile covering a range of 1.8-3.6 μm. N. oceanica CCMP1779 at mid-log phase was used for RNA isolation as described previously (Poliner et al., 2015). First-strand cDNA synthesis was accomplished using SuperScript III with oligo dT (NEB). cDNAs were amplified using the primers shown in Table S4 and Q5 polymerase (NEB), blunt cloned into pCR-Blunt (Thermo Scientific) and sequenced.

Yeast transformation and expression
EPA pathway genes were cloned into yeast expression vectors containing galactose-inducible promoters. The elongase PCR product was integrated into pYES2.1-topo (Invitrogen). Desaturases were amplified with the addition of C-terminal 6X histidine tails and restriction sites for integration into pESC-his and pESC-leu (Agilent). The PCR products were digested with the noted restriction enzymes (Table S4) and ligated into the yeast expression vectors. InvSc1 yeast cells (Kajiwara et al., 1996) were transformed with the expression vectors using the Frozen-EZ Yeast Transformation II Kit (Zymo Research) and selected on SC (ClonTech) medium with the appropriate auxotrophic dropout selection. Several colonies from each transformation were selected for further experimentation and were grown in 5 mL of SC overnight at 30°C. The overnight cultures were collected by centrifugation at 1000 g for 5 min, thoroughly decanted and resuspended in 5 mL of SC without sugar, histidine, leucine and uracil, and OD600 was measured in duplicate. For the zero-hour time point fatty acid analysis, 0.5 mL of culture was collected by centrifugation at 13 000 g, decanted and frozen in liquid nitrogen. Flasks of 5 mL SC with 2% galactose were inoculated at 0.4 OD600 and grown at 20°C, and 24-h and 48-h time points were collected. For substrate feeding, 0.1% NP-40 (Sigma-Aldrich), as a detergent, and 0.5 mM free fatty acids (Santa Cruz Biotechnology) were included in glucose and galactose SC medium. Fatty acid analysis was conducted with washed cell pellets and 5 μg of pentadecanoic acid as internal standard. LC-PUFA authentic standards were prepared in a separate reaction to confirm LC-PUFA retention times. Fatty acid methyl ester preparation and extraction were performed as described previously (Liu et al., 2013).

Identification of bidirectional promoters in N. oceanica CCMP1779
A custom python script was used to assemble the coding regions of each gene as determined in the genome assembly and annotation of N. oceanica CCMP1779 (Vieler et al., 2012), and only genes with start and stop codons were added to the final list. These coding regions were assessed with the CUSP function of the EMBOSS program (Table S2). A custom python script was used to identify diverging gene pairs with intergenic regions of <1500 base pairs, with suitable gene expression during light:dark cycles (Poliner et al., 2015) (Data S1). Putative gene pairs were manually examined to determine functional annotation (Table S3).
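The search for diverging gene pairs can be sketched as follows. This is not the script provided in Data S1; it is a minimal illustration that assumes a simple gene record (contig, coordinates, strand), defines a diverging pair as a minus-strand gene immediately followed by a plus-strand gene on the same contig, applies the <1500 bp intergenic cutoff from the text, and uses a Pearson correlation threshold (`min_r`, a hypothetical parameter) as a stand-in for "suitable gene expression". Gene names and expression values are invented.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Gene:
    gene_id: str
    contig: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def diverging_pairs(genes, expression, max_gap=1500, min_r=0.8):
    """Return (left_id, right_id, gap_bp, r) for coexpressed diverging neighbours."""
    hits = []
    ordered = sorted(genes, key=lambda g: (g.contig, g.start))
    for left, right in zip(ordered, ordered[1:]):
        if left.contig != right.contig:
            continue
        # Diverging: transcription runs away from the shared intergenic region.
        if not (left.strand == "-" and right.strand == "+"):
            continue
        gap = right.start - left.end - 1
        if gap < 0 or gap >= max_gap:
            continue
        r = pearson(expression[left.gene_id], expression[right.gene_id])
        if r >= min_r:
            hits.append((left.gene_id, right.gene_id, gap, r))
    return hits


# Invented example: geneA/geneB diverge and are coexpressed; geneC is not paired.
genes = [
    Gene("geneA", "contig1", 100, 900, "-"),
    Gene("geneB", "contig1", 1500, 2400, "+"),
    Gene("geneC", "contig1", 2600, 3500, "+"),
]
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
hits = diverging_pairs(genes, expression)
print(hits)
```

Candidate pairs found this way would still need the manual inspection of functional annotation described in the text.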

Construction of Nannochloropsis expression vectors
All constructs were derived from the pNoc-Dlux vector (Data S2), a pGEM-derived plasmid containing a hygromycin resistance cassette (Vieler et al., 2012) and a gateway-firefly luciferase cassette. All of the vectors for expression in S. cerevisiae and N. oceanica are listed in Table S5, and the complete annotated sequences are included in Data S2.

Nannochloropsis transformation
Vectors were linearized by restriction digestion, then purified and concentrated by ethanol precipitation. N. oceanica CCMP1779 transformation was performed according to the method of Vieler et al. (2012) with 3 μg of vector DNA and 30 μg of carrier DNA (Invitrogen UltraPure™ Salmon Sperm DNA Solution). Transformed cells were allowed to recover for 48 h and then plated in top agar with the respective selection. After 3-4 weeks, individual colonies were resuspended in 100 μL F/2. From each transformation, ~20 colonies were screened for increased LC-PUFA content, and two to three colonies were identified as positive.

Nannochloropsis luminescence assays
N. oceanica CCMP1779 culture was mixed with F/2 supplemented with either firefly luciferin (Gold Biotech) or NanoLuciferase substrate (Promega), at a final volume of 200 μL, with 500 μM firefly luciferin or a 10 000× dilution of NanoLuciferase substrate. For normalized measurements, 1 million N. oceanica cells were used. Luminescence was measured with a Centro XS3 LB960 luminometer (Berthold Technologies) over a 0.3-s exposure.

Expression analysis in N. oceanica CCMP1779
For protein expression, frozen pellets from 5 mL culture were ball-milled in 2-mL tubes (30 Hz, 2 min) with a TissueLyser II (Qiagen). After addition of protein extraction buffer (100 mM Tris pH 8.0, 2 mM PMSF, 2% β-mercaptoethanol, 4% SDS), the sample was heated for 3 min (60°C for FADs, 80°C for other proteins) and centrifuged at 13 000 g for 3 min, and the supernatant was transferred to a new tube. Protein content was determined using the RCDC assay (Bio-Rad), and equal quantities of protein were loaded for SDS-PAGE. Proteins were transferred to PVDF membranes (Bio-Rad) overnight at 4°C. Blots were blocked in TBST with 5% milk for 1 h at room temperature and washed six times with TBST. For GFP detection, we used α-GFP antibody (Abcam ab5450) 1:1000 in TBST with 5% BSA for 1 h, and a secondary donkey α-goat HRP antibody (Santa Cruz sc-2020) 1:10 000 in TBST with 5% milk. For HA detection, we used α-HA-HRP antibody solution (Roche 3F10) at 1:1000 in TBST with 5% milk for 1 h. Signals were detected using Clarity chemiluminescence reagent (Bio-Rad). Band quantification was conducted using Image Lab software (Bio-Rad).
RNA isolation, cDNA synthesis and real-time PCR were performed as described previously (Poliner et al., 2015). Real-time PCR primers were checked for efficiency and specificity. The delta-delta Ct method was used to determine gene expression relative to the gene encoding the actin-related protein (ACTR) NannoCCMP1779_1845.
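The delta-delta Ct calculation can be written out in a few lines. This sketch assumes close to 100% amplification efficiency for both primer pairs (so each cycle corresponds to a twofold difference); the Ct values are invented for illustration and the function name is hypothetical.

```python
def fold_change(ct_target_sample: float, ct_ref_sample: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the delta-delta Ct method.

    Each target Ct is first normalized to the reference gene in the same
    sample (delta Ct); the sample delta Ct is then compared with the
    control delta Ct (delta-delta Ct). Assumes a doubling per cycle.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2 ** (-delta_delta)


# Invented Ct values: overexpression line versus wild type,
# each normalized to a reference gene such as ACTR.
fc = fold_change(ct_target_sample=22.0, ct_ref_sample=18.0,
                 ct_target_control=25.0, ct_ref_control=18.0)
print(f"target expression is {fc:.1f}-fold that of the control")
```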

Fatty acid methyl ester extractions in N. oceanica
Cells (2 mL) were collected by filtration through GF/C filters, and filters were stored in screw top tubes at −80°C. Fatty acid methyl ester (FAME) extraction and analysis were carried out as described previously (Liu et al., 2013).

Confocal microscopy
Cerulean detection in transformed N. oceanica CCMP1779 was carried out with Olympus Spectral FV1000 microscope (Olympus, Japan) at the excitation wavelength of 435 nm (a diode laser). For endoplasmic reticulum labelling, 50 nM DiOC6 (Sigma-Aldrich) in F/2 medium was used. Cells were labelled directly before microscopic analysis. An argon (488 nm) laser was used for DiOC6 excitation. Chloroplast autofluorescence was excited using a solid-state (515 nm) laser. CLSM figures represent Z-series images composed using the Olympus FluoView FV1000 confocal microscope software (Olympus).

Supporting information
Additional Supporting Information may be found online in the supporting information tab for this article:
Figure S1 The final two steps of EPA biosynthesis in S. cerevisiae with exogenous supply of substrates.
Figure S2 Modification of the Ribi promoter to remove restriction sites.
Figure S3 Assessment of N. oceanica CCMP1779 promoter strength using NanoLuciferase.
Figure S4 N-terminal extended 2A peptide screening for increased ribosomal skipping efficiency.
Figure S5 CLSM analysis of N. oceanica CCMP1779 wild-type, empty vector and CFP-desaturase overexpressing (DOX) transformants.
Table S1 Fatty acid mole percentage of S. cerevisiae strains.