A genetic basis for the phenotypic differentiation between siscowet and lean lake trout (Salvelinus namaycush)

Authors


  • The present study is part of a larger effort to understand the biology of lake trout ecotypes in Lake Superior and represents a collaborative effort of a number of investigators in different fields. The analysis of growth, morphometry and the ecological significance of these results was a joint effort of several authors including Dan Rosauer, a fishery biologist at the Great Lakes WATER Institute (GLWI), Shawn Sitar, a fishery biologist with the Michigan DNR with interests in population dynamics and Great Lakes deepwater ecology, and Chuck Bronte, a fishery biologist and data analyst at the U.S. Fish and Wildlife Service who has been working for 25 years on the biology and restoration of lake trout in the Laurentian Great Lakes. The bioinformatic analysis of the Roche 454 dataset and the supporting qPCR analysis and interpretation was also a joint effort of several authors including Steven Roberts, an assistant professor at the University of Washington, who is a comparative physiologist using transcriptomic approaches to examine how aquatic organisms respond to changes in environmental conditions, Crystal Simchick, a molecular biologist in the Goetz laboratory, and Giles Goetz, a bioinformaticist working at the GLWI with interests in the analysis of global genome datasets. The lipid analysis and interpretation was accomplished by Ron Johnson, a research chemist at NOAA interested in lipid dynamics and reproduction in fish, and Cheryl Murphy, an assistant professor at Michigan State University interested in how changes in the physiology of an individual fish translate to population and/or community level changes. The lead author, Rick Goetz, is a research scientist at the GLWI working in several areas of fish biology including the molecular basis of phenotypic differentiation in fish. 
The last author, Simon MacKenzie, is a faculty member of the Universitat Autonoma de Barcelona (Spain) with interests in using global genome approaches to address basic biological problems in fish biology. He and the lead author designed and implemented the pyrosequencing approach used in this study.

Frederick Goetz, Fax: +1 414 382 1705; E-mail: rick@uwm.edu

Abstract

In Lake Superior there are three principal forms of lake trout (Salvelinus namaycush): lean, siscowet and humper. Wild lean and siscowet differ in the shape and relative size of the head, size of the fins, location and size of the eyes, caudal peduncle shape and lipid content of the musculature. To investigate the basis for these phenotypic differences, lean and siscowet lake trout, derived from gametes of wild populations in Lake Superior, were reared communally under identical environmental conditions for 2.5 years. Fish were analysed for growth, morphometry and lipid content, and differences in liver transcriptomics were investigated using Roche 454 GS-FLX pyrosequencing. The results demonstrate that key phenotypic differences between wild lean and siscowet lake trout, such as condition factor, morphometry and lipid levels, persist in these two forms when reared in the laboratory under identical environmental conditions. This strongly suggests that these differences are genetic and not a result of environmental plasticity. Transcriptomic analysis involving the comparison of hepatic gene frequencies (RNA-seq) and expression (quantitative reverse transcription–polymerase chain reaction (qPCR)) between the two lake trout forms indicated two primary gene groups that were differentially expressed: those involved in lipid synthesis, metabolism and transport (acyl-CoA desaturase, acyl-CoA binding protein, peroxisome proliferator-activated receptor gamma, and apolipoproteins), and those involved with immunity (complement component C3, proteasome, FK506 binding protein 5 and C1q proteins). The results demonstrate that RNA-seq can be used to identify differentially expressed genes; however, some discrepancies between RNA-seq analysis and qPCR indicate that methods for deep sequencing may need to be refined and/or different RNA-seq platforms utilized.

Introduction

Resource polymorphisms have been recorded in a diverse range of vertebrates, from fish to mammals, and have been instrumental in addressing a variety of questions in competition, phenotypic plasticity, speciation and niche utilization (see Skulason & Smith 1995 for review). A number of fish species contain sympatric populations exhibiting distinct ecological, morphological and genetic differences. Such populations are most common in the Salmonidae (trout, salmon, whitefish, char), Gasterosteidae (sticklebacks) and Osmeridae (smelts) but occur in other families as well (Taylor 1999). These fish have been the focus of many studies, particularly related to speciation (Jonsson & Jonsson 2001; McKinnon & Rundle 2002).

Lakes in the Northern Hemisphere, formed as a result of the last glaciation event, contain a number of fish ecotypes representing several fish taxa. In many of these examples, sympatric populations occupy benthic and limnetic habitats and involve phenotypes specialized for resources and feeding in these habitats (Robinson & Wilson 1994; Skulason & Smith 1995). A large number of these ecotypes fall within the Salmonidae, including the chars, which exhibit enormous phenotypic diversity, especially in lake trout (Salvelinus namaycush) and to an even greater extent in Arctic char (Salvelinus alpinus). Historically, at least 15 separate subspecies of Arctic char were recognized (Adams & Maitland 2006). Microsatellite analyses show a high degree of genetic differentiation among Arctic char populations across many lakes in Europe, as well as among populations within the same lake or body of water (Wilson et al. 2004; Adams et al. 2006). This suggests that reproductive isolation of sympatric Arctic char populations is responsible for maintaining different forms.

Similarly, various lake trout forms have been reported in the Laurentian Great Lakes (Agassiz 1850; Brown et al. 1981; Goodier 1981) and more recently from large Canadian shield lakes (Blackie et al. 2003; Alfonso 2004; Zimmerman et al. 2007). Overharvesting and predation by sea lamprey were responsible for the collapse of lake trout stocks across the Great Lakes and the loss of diversity in all the lakes with the exception of Lake Superior (Krueger & Ihssen 1995; Hansen 1999). This remnant diversity is now used or being considered for reintroduction into the other Great Lakes (Bronte et al. 2008; Markham et al. 2008). In Lake Superior the three principal lake trout forms are the lean, siscowet and humper lake trout (Lawrie & Rahrer 1973; Moore & Bronte 2001). The origins of the three forms are not entirely clear. Based on mitochondrial DNA analysis, it has been suggested that siscowets evolved within the last 14 000 years from mixed origins (Wilson & Mandrak 2004). Microsatellite analysis of Lake Superior lake trout supports the divergence of morphotypes prior to the partitioning of proglacial lakes into the Great Lakes (Page et al. 2004). However, an alternative hypothesis proposes that the evolution of the different forms occurred during or after the formation of proglacial lakes and that the humper diverged from lean lake trout first, followed by the siscowet (Eshenroder 2008). This would make the origins of siscowet postglacial. The shape and relative size of the head, size of the fins, location and size of the eyes and caudal peduncle shape can visually differentiate these forms. Lean, siscowet and humper lake trout have been experimentally delineated on the basis of osteology (Agassiz 1850; Burnham-Curtis & Smith 1994), morphology (Khan & Qadri 1970; Burnham-Curtis 1993; Moore & Bronte 2001) as well as microsatellite analysis (Page et al. 2004). Besides the differences in morphology, siscowet and lean lake trout occupy different habitats in Lake Superior. 
Lean lake trout occupy shallower nearshore waters at depths less than 100 m, whereas siscowet lake trout are most abundant at depths greater than 100 m (Bronte et al. 2003), including the deepest parts of Lake Superior (∼400 m; Sitar et al. 2008). The diets of these forms also differ as a result of their depth distribution: lean lake trout feed mostly on rainbow smelt (Osmerus mordax), cisco (Coregonus artedii) and slimy sculpin (Cottus cognatus), whereas siscowet lake trout in deep water feed on coregonines (Coregonus sp.), deepwater sculpin (Myoxocephalus thompsoni) and burbot (Lota lota) (Conner et al. 1993; Harvey et al. 2003; Ray et al. 2007; Sitar et al. 2008). Siscowet also have a significantly higher percentage of skeletal muscle lipid (Eschmeyer & Phillips 1965). Lean lake trout have been the primary focus of restoration efforts, harvest management and research; however, siscowet lake trout currently make up most of the lake trout biomass in Lake Superior, as they probably did historically (Bronte et al. 2003).

Distinct lake trout forms also occur in other large lakes in North America including Great Bear Lake (Blackie et al. 2003; Alfonso 2004), Great Slave Lake (Zimmerman et al. 2006) and Lake Mistassini (Zimmerman et al. 2007). A deepwater form similar to siscowet and a shallower lean-like form have been reported from Great Slave Lake (Zimmerman et al. 2006), and a humper form was described from Lake Mistassini (Zimmerman et al. 2007). Forms from deep and shallow habitats from these lakes are differentiated by body shape (deep vs. elongate), length of the pectoral fins (long vs. short) and buoyancy (high vs. low) (Zimmerman et al. 2006). Differences in these characters are thought to be related to the energetics of swimming and movement in the water column (Zimmerman et al. 2006).

Page et al. (2004) reported genetic differences between lean and siscowet lake trout, but the selectively neutral markers used may be unrelated to the actual phenotypic variation observed. Theoretically, phenotypic variation could be a result of genetic regulation and/or environmental plasticity. For example, the different habitats occupied by these lake trout forms vary greatly in pressure, light, temperature and diet and could be responsible for some phenotypic variation. A number of studies have addressed the genetic or environmental basis for phenotypic variation in Arctic char by rearing populations of different morphs under identical environmental conditions in the laboratory (Nordeng 1983; Svedang 1990; Hindar & Jonsson 1993; Skulason et al. 1996; Klemetsen et al. 2002; Adams & Huntingford 2004). The results have varied, ranging from predominantly genetic control (Skulason et al. 1996) to predominantly environmental control (Adams & Huntingford 2004). It has been suggested that one reason for the differing emphasis on genetic vs. environmental control in these studies is that different polymorphisms are at different stages of divergence (Adams & Huntingford 2004). Surprisingly, there have been few attempts to look at the control of phenotypic differences between lean and siscowet lake trout. A comparison of lean and siscowet lake trout raised under similar environmental conditions was conducted at the Marquette State Fish Hatchery (Marquette, MI) but only results on lipid levels were published (Eschmeyer & Phillips 1965). The results indicated that cultured siscowet lake trout retained the same high lipid levels observed in wild siscowet, and that hybrids between siscowet and lean lake trout had intermediate lipid levels, which strongly suggested that lipid content was genetically determined. Another study, conducted by the Michigan Department of Natural Resources at the Thompson State Hatchery, cross-bred siscowet, humper and lean lake trout (Stauffer & Peck 1981). The initial results indicated differences in some of the meristic characters measured, ‘but none were definitive’ (Stauffer & Peck 1981). Burnham-Curtis (1993) subsequently reanalysed the Stauffer & Peck (1981) data using principal component analysis and concluded that morphological and meristic differences observed in the progeny had some genetic basis and were not exclusively environmental. However, some caution must be applied because the fish analysed came from crosses of only two males and two females per lake trout form.

While the results of these two studies indicate a genetic basis for some differences observed between wild lean and siscowet lake trout, there has not been a comprehensive study that investigates a wide range of phenotypic differences between these forms within the same individuals reared under tightly controlled environmental conditions. Thus, one objective of the present study was to analyse growth, condition factor, morphometry and lipid content between lean and siscowet lake trout reared under identical conditions. Based on the results of the laboratory rearing studies on Arctic char ecotypes, we hypothesized that some of the phenotypic traits observed in wild lean and siscowet lake trout would be maintained under laboratory rearing conditions, but some would not (i.e. genetic and environmental contributions). Alternatively, the magnitude of the differences in the expression of certain phenotypes between laboratory-reared lake trout forms would be different in comparison to wild lean and siscowet lake trout.

There has been a strong effort recently to utilize global genomic approaches such as DNA microarrays to determine the basis for phenotypic divergence and parallel evolution in fish ecotypes (Goetz & MacKenzie 2008). Microarrays have the advantage over directed gene approaches of looking at the expression of hundreds to thousands of genes simultaneously. As a result, it is possible to define sets of genes related to structural, physiological or metabolic functions that could underlie the differences observed between fish ecotypes. For example, using GRASP (Genomic Research on Atlantic Salmon Project) chips, genes involved in energy, lipid metabolism, muscle contractility and growth were found to be differentially regulated between sympatric whitefish ecotypes that exhibited differences in habitat and life history strategies (Derome & Bernatchez 2006; Derome et al. 2006; St-Cyr et al. 2008). We have proposed that, given the advances in deep sequencing platforms and technologies, deep sequencing should replace the use of hybridization-based approaches (i.e. microarrays) for quantitative comparative transcriptomics, particularly for nonmodel species (Goetz & MacKenzie 2008). There are several advantages of transcriptome sequencing (RNA-seq) compared to hybridization-based approaches. With RNA-seq, there are no preconceived ideas of which genes are important. Hybridization arrays are only as good as the genes they contain, while RNA-seq could theoretically provide an entire transcriptomic profile and thereby enable transcript quantification and gene discovery to occur simultaneously. This is particularly important for nonmodel organisms for which sequenced genomes do not exist. In addition, when studying organisms with duplicated genomes and/or strains with genetic dissimilarity (as in the current study), differences in hybridization efficiency can impact results.

While next-generation sequencing platforms have the capacity to produce millions of reads per sequencing run, some platforms such as the Illumina-Solexa, ABI SOLiD and HeliScope, produce sequences with short read lengths. This makes transcript identification by sequence alignment difficult for nonmodel organisms or species without genomic sequences for comparison. Effective bioinformatics such as de novo assembly and expression profiling also present challenges with short read lengths. In contrast, the Roche 454 Genome Sequencer (GS)-FLX produces relatively long read lengths (>200 bp) compared to other next-generation sequencing approaches, benefiting assembly and annotation. In addition, investigators have demonstrated that 454 sequencing technology has the potential to be effectively used for gene expression profiling (Torres et al. 2008). Thus in the current study, a second objective was to use the Roche GS-FLX 454 sequencing technology to compare the hepatic transcriptomes between lean and siscowet lake trout that were raised under identical environmental conditions and were the same fish used to analyse growth, morphometrics and lipid content. We hypothesized that, given the differences in lipid content in the muscle, we would observe differences in the transcript levels of hepatic genes involved with lipid metabolism, synthesis and binding between lean and siscowet lake trout. Further, given the depth of sequencing provided by pyrosequencing, we would be able to use this technology to quantitatively compare differences in gene read frequencies between the siscowet and lean lake trout livers.

Methods

Animals, husbandry and rearing

On 26 October 2006, siscowet lake trout were sampled at a bottom depth of 109 m using multifilament bottom-set gill nets at an offshore site in Lake Superior northeast of Marquette, MI. Siscowet were chosen based on their morphometry including the shape and relative size of the head, size of the fins and location and size of the eyes (Moore & Bronte 2001), and the fact that they were collected at depths greater than 100 m (Bronte et al. 2003). Lake trout forms designated as siscowets, leans and humpers have been shown to be genetically separated based on microsatellite analysis (Page et al. 2004). Fish averaged 2184 g in weight and a number of females sampled had ovulated eggs in the body cavity and males had easily expressible milt. Eggs of 12 females were fertilized in separate batches with milt from two to three males using the wet method. After washing, eggs were placed in Teflon containers and were transported to the Great Lakes WATER Institute (GLWI) on ice then incubated in a Heath vertical incubator with flow-through water at 10 °C in the dark. All eggs hatched by 7 January 2007.

On 21 December 2006, newly hatched lean lake trout arrived at the GLWI from the Les Voigt Fish Hatchery (Wisconsin DNR). These fry originated from spawning 24 female and 24 male wild lean lake trout caught 13 October 2006 on the Gull Island Shoal in western Lake Superior. Thus, the lean lake trout were approximately 2 weeks older than the siscowet. Lean lake trout fry were incubated in separate trays (with lids) within the same Heath vertical incubator that held the siscowet eggs. Lean lake trout swim-up fry were moved from the incubator to a 4 ft. diameter tank on 18 January 2007 and siscowet fry were moved to a separate 4 ft. diameter tank on 1 February 2007. These tanks were supplied with flow-through water at 10 °C plumbed from the same header tank. Both siscowet and lean lake trout fry were feed habituated to Rangen Trout Starter diet (Rangen Inc.) using automatic vibratory feeders set for 21 feedings/day and were transitioned through Rangen trout crumble and pellets as they grew.

At 6 months posthatch, 500 lean and 500 siscowet lake trout were obtained randomly from the stock tanks and placed into two 1000 litre tanks (250 lean lake trout and 250 siscowet lake trout comingled/tank). Lean lake trout were identified with an adipose fin clip. At 2 years of age, approximately 80 lean and 80 siscowet lake trout were moved from each tank into a third 1000 litre tank to decrease overall fish densities. Fish in all tanks initially received a daily ration of 4% of their body weight (Rangen Trout Production pellets), distributed by three automatic vibratory feeders per tank over 15 feedings/day; the ration was gradually reduced to 0.75% as feed intake decreased with age. Gut analyses on a subset of fish terminally sampled at 1 and 2 years of age indicated that all fish were feeding. Fish were sampled for digital photography, lipid analysis and tissues for genomic assays at 1 and 2 years of age as described below.

Growth

Length (to the nearest 0.25 cm) and weight (to the nearest 0.1 g) were measured on a subsample of 25 fish of each lake trout form per tank when the fish were comingled on 18 June 2007; measurements were repeated every other week from June to August 2007 and once monthly thereafter. For these assays, sampled fish were anesthetized with 100 mg/L MS-222 (Argent Labs), measured, and returned to the tank for recovery. On 16 September 2008 and 5 January 2009, lengths and weights were measured on all the fish in each tank. Trends in growth (weight and length) and condition factor were assessed from the periodic sampling but were only analysed for statistical significance (two-sample t-test) at the two assays where all fish were measured. Condition factor (KTL) was calculated using:

$$K_{TL} = \frac{W}{L^{3}} \times 10^{5}$$

where W is the weight in grams and L is total length in millimetres (Carlander 1950).
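
As a worked illustration of the condition factor calculation, the following Python sketch assumes the Fulton-type form KTL = 10^5 × W / L^3, with W in grams and L in millimetres. The scaling constant and the function name `condition_factor` are assumptions made for illustration. The example fish uses the year 2 lean means reported under Lipid analysis (608.0 g, 39.00 cm), with that length treated here as total length.

```python
def condition_factor(weight_g: float, total_length_mm: float) -> float:
    """Fulton-type condition factor, assuming K_TL = 1e5 * W / L**3.

    weight_g is the weight in grams and total_length_mm the total
    length in millimetres. The 1e5 scaling is an assumption.
    """
    if weight_g <= 0 or total_length_mm <= 0:
        raise ValueError("weight and length must be positive")
    return 1e5 * weight_g / total_length_mm ** 3


# Example: a 608.0 g fish of 39.00 cm (390 mm) total length.
k_lean = condition_factor(608.0, 390.0)
print(f"K_TL = {k_lean:.3f}")
```

Because length enters as a cube, small length errors have a much larger effect on the result than comparable weight errors, which is one reason to compare fish of similar length.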

Truss analysis

The truss protocol (Strauss & Brookstein 1982; Brookstein et al. 1985) was used to describe differences in the shape of specific regions of each fish. Trusses are vertical, horizontal and oblique distances measured between preselected anatomical landmarks (Fig. 2), which are points identified on the basis of local morphological features and chosen to divide the body into functional units (Brookstein et al. 1985). For truss analysis, digital photographs of the whole bodies (left lateral side) of 10 and 15 individuals per lean and siscowet lake trout were taken at 1 and 2 years posthatch, respectively. All landmark and truss measures used were taken from prior studies directed at the delineation of Lake Superior lean and siscowet lake trout (Moore & Bronte 2001; Bronte & Moore 2007). Truss measurements were collected from the photographs using ImageJ (1.32j) and were analysed using the multivariate analysis of variance (manova) function in Minitab (15.1.30.0). If the manova indicated significant differences between lean and siscowet lake trout, univariate analysis of variance was completed to identify the truss elements that were significantly different (P ≤ 0.05).
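
A truss element is simply the straight-line distance between two landmarks, so the measurement step can be sketched in a few lines of Python. This is a minimal illustration and not the ImageJ procedure used in the study: the landmark names, pixel coordinates and the cm-per-pixel scale are all hypothetical.

```python
import math


def truss_length(p, q, cm_per_pixel=1.0):
    """Euclidean distance between two landmarks given as (x, y) pixel
    coordinates, converted to centimetres with a calibration factor."""
    return math.hypot(q[0] - p[0], q[1] - p[1]) * cm_per_pixel


# Hypothetical landmark coordinates (pixels) digitized from one photograph.
landmarks = {
    "dorsal_fin_origin": (400.0, 100.0),
    "pelvic_fin_origin": (430.0, 140.0),
}
scale = 0.05  # hypothetical calibration: centimetres per pixel

# Truss element 3: origin of dorsal fin to origin of pelvic fin.
truss_3 = truss_length(
    landmarks["dorsal_fin_origin"], landmarks["pelvic_fin_origin"], scale
)
print(f"truss 3 = {truss_3:.2f} cm")
```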

Figure 2.

 Upper: Positions of the nine truss elements used to determine potential differences in morphometry between laboratory-reared lean and siscowet lake trout. Elements were the same as those used by Moore & Bronte (2001) to distinguish wild lean and siscowet lake trout. Measurements were 1, posterior end of maxillary to anterior tip of snout; 2, posterior aspect of neurocranium to origin of pectoral fin; 3, origin of dorsal fin to origin of pelvic fin; 4, origin of pelvic fin to origin of anal fin; 5, origin of anal fin to origin of adipose fin; 6, origin of adipose fin to posterior end of anal fin; 7, origin of adipose fin to anterior attachment of ventral membrane of caudal fin; 8, insertion of anal fin to attachment of dorsal membrane of caudal fin; 9, anterior attachment of ventral membrane of caudal fin to distal margin of caudal peduncle. Lower: Average and standard deviation (SD) for truss measurements (as shown in upper figure) at years 1 and 2 for laboratory-reared lean and siscowet lake trout and P-values calculated by multivariate analysis of variance (manova). Asterisks indicate significant difference between lean and siscowet lake trout at P ≤ 0.05.

Lipid analysis

In year 1 of the study, six lean and six siscowet lake trout were sampled as whole fillets minus heads and viscera. Sampled fish were all within 5% of the mean length for each lake trout form (lean = 17.00 ± 0.25 cm and 40.3 ± 2.2 g; siscowet = 16.75 ± 0.25 cm and 43.8 ± 2.3 g). Skin-on fillets were oven dried for 2 days at 104 °C prior to grinding and lipid analysis. In year 2, a 1 in. steak cross-section of muscle was sampled from 20 lean and 20 siscowet lake trout just posterior of the head and pectoral fins. Sampled fish were all within 5% of the mean length for each lake trout form (lean = 39.00 ± 1.00 cm and 608.0 ± 20.9 g; siscowet = 39.25 ± 0.75 cm and 630.4 ± 20.6 g). The steak cross-section was similarly minced and oven dried for 2 days at 104 °C prior to grinding. The high lipid content of the samples in both years necessitated the use of dry ice pellets while grinding to ensure a homogeneous sample and to prevent grinding to a paste. After grinding, samples were dried at 104 °C overnight to remove excess carbon dioxide and moisture that accumulated while grinding.

Lipids were extracted from dried samples via the Soxhlet method (AOCS Method Ba 3-38) with a Büchi 810 Fat Extraction Apparatus. Sample size was approximately 2 g and extractions lasted 2 h at a drip rate of over 200 drips/min. Extracted lipids were dried to constant weight at 104 °C and fat content was determined on a gravimetric basis and expressed as percent by wet or dry weight of sample in the results. For statistical analysis, data were arcsin transformed and analysed by analysis of variance (anova) followed by Tukey’s post-hoc test. Pairwise comparisons with P-values of ≤0.05 were considered significantly different.
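
The gravimetric calculation and the transformation applied before anova can be sketched as follows. The text states only that the data were "arcsin transformed"; the sketch assumes the common arcsine square-root form for percentage data, and the function names and example masses are hypothetical.

```python
import math


def percent_lipid(lipid_mass_g: float, sample_mass_g: float) -> float:
    """Gravimetric fat content as a percentage of sample mass."""
    if sample_mass_g <= 0:
        raise ValueError("sample mass must be positive")
    return 100.0 * lipid_mass_g / sample_mass_g


def arcsine_sqrt(percent: float) -> float:
    """Arcsine square-root transform (radians) of a percentage in [0, 100].

    Assumed form of the 'arcsin' transformation; not confirmed by the source.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must lie between 0 and 100")
    return math.asin(math.sqrt(percent / 100.0))


# Hypothetical example: 0.5 g of extracted lipid from a 2 g dried sample.
pct = percent_lipid(0.5, 2.0)
transformed = arcsine_sqrt(pct)
print(f"{pct:.1f}% lipid -> {transformed:.4f} rad")
```

The transform stretches values near 0% and 100%, which helps stabilise the variance of proportions before parametric tests.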

Fatmeter

At 27 months, 19 siscowet and 19 lean lake trout within 5% of the mean length for each lake trout form (siscowet = 984.4 ± 59.0 g and 44.75 ± 1.00 cm; lean = 856.4 ± 52.0 g and 43.75 ± 1.25 cm) were analysed for lipid content using a microwave fatmeter. Fish were briefly anaesthetized by immersion in MS-222 (100 mg/L), weighed and measured for length. Each fish was then analysed for lipid content on five defined areas on the right side of the body (Fig. 4) using a noninvasive microwave sensor device (Distell Model 692 Fish Fatmeter, Distell Inc.). The device emits a low-powered microwave (2 GHz, power 2 mW) that interacts with water within the somatic tissues and uses the inverse relationship between water and lipid to estimate the lipid concentration in tissue (as a percent) from each of the specified locations. Each reading took less than 30 s to register. The locations on the fish were chosen based on the manufacturer’s recommendation, and prior research suggested that the combination of these readings provided a reasonable representation of total lipid content in the fish (Crossin & Hinch 2005). Previous research also demonstrated no significant differences between measurements taken at the specified location on either side of the fish (C. A. Murphy, unpublished). The manufacturer provides species-specific calibrations that are programmed into the fatmeter; however, the ‘research’ setting was used since there were no programmed calibrations for lake trout. In previous research, calibration against Soxhlet extraction showed a strong linear, log-transformed relationship between device output and lipid content (R² = 0.78; C. A. Murphy, unpublished) and indicated that different calibrations for siscowet and lean lake trout were not required (M. Zimmerman, personal communication).

Figure 4.

 (A) Regions of the body sampled by the fatmeter. Note: the fatmeter was placed on the ventral body wall facing dorsal for region 5. All other regions were on the lateral surface of the right side of the body. (B) Mean fatmeter readings + SEM for each region of the body/lean or siscowet lake trout (N = 19). Asterisks indicate significant (P ≤ 0.05) differences between lean and siscowet lake trout means/sampling region calculated on arcsin transformed data. (C) Matrix of P-values for mean comparisons between the fatmeter measurements made on all regions (1–5) of the body per lean and siscowet lake trout. P-values ≤0.05 are significantly different.

Since we were primarily interested in differences in the relative lipid content between lean and siscowet lake trout and in whether lipid accumulated in different areas of the body, the device readings were not transformed to represent absolute lipid concentrations. In addition, normalization for differences in body size, which might be needed for wild sampled fish, was not necessary since all fish sampled were within 5% of the mean length. Fatmeter readings (% lipid) from each of the locations were first arcsin transformed and compared between the two lake trout forms using the manova function in Minitab (15.1.30.0). If the manova indicated significant differences between the lake trout forms, univariate analysis of variance was completed to identify the sample locations that were significantly different (P ≤ 0.05).

RNA extraction, cDNA preparation and pyrosequencing

For samples taken in years 1 and 2, liver sections were dissected, snap frozen and stored at -80 °C until processed. Following thawing, livers were extracted in Tri Reagent (Molecular Research Center, Inc.) according to the manufacturer’s protocol (Chomcynski & Sacchi 1987; Chomcynski 1993). The RNA was treated with DNase I and cleaned using the RNeasy MinElute Cleanup kit (Qiagen). Pyrosequencing was only conducted on the RNA extracted from liver samples taken at year 1. For cDNA synthesis, equal quantities of total RNA were combined between 10 liver samples per lake trout form (lean and siscowet). Because large quantities (10 μg/morphotype) of cDNA are required for 454 sequencing, the SMART PCR cDNA synthesis kit (Clontech) was used according to the manufacturer’s instructions. This kit initially produces single-stranded cDNA from RNA by reverse transcription followed by PCR amplification to produce double-stranded cDNA. Subsamples of cDNA from lean and siscowet lake trout were examined using agarose gels to ensure that the same relative amounts of cDNA were being produced. Pyrosequencing was conducted on a Roche GS-FLX (454 Life Sciences) at the Interdisciplinary Center for Biotechnology Research genomics core at the University of Florida (Gainesville, FL). Libraries were constructed for leans and siscowets using unique tags for each library; Ligation Multiplex Identifiers (MIDs). The two libraries were combined and titred by sequencing approximately 16 000 sequences. The titres on the initial sequencing run indicated a ratio of 1:5 sequences (MID1 (siscowet):MID2 (lean)). In the final full-scale sequencing run the ratio of sequences between siscowet and lean libraries was 1:1.75 and this value was used to adjust the gene frequencies when comparing sequence reads between siscowet and lean lake trout (Tables 1–3). All of the original Roche 454 GS-FLX sequences for this study have been submitted to the NCBI Short Read Archive for public access (study SRP001186).
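
The library-size adjustment described above can be illustrated with a short Python sketch. It assumes, as the text states, that the lean library contained 1.75 times as many sequences as the siscowet library, so lean-over-siscowet ratios are divided by 1.75 and siscowet-over-lean ratios are multiplied by 1.75. The function names are hypothetical; the read counts are those of the C1q-like adipose specific protein contig (129 lean, 6 siscowet) and the stannin contig (17 siscowet, 1 lean) in Table 1.

```python
LIBRARY_RATIO = 1.75  # lean reads per siscowet read in the full-scale run


def adjusted_lean_over_siscowet(lean_reads, siscowet_reads, ratio=LIBRARY_RATIO):
    """Lean/siscowet read ratio, scaled down for the larger lean library."""
    if siscowet_reads <= 0:
        raise ValueError("siscowet read count must be positive")
    return (lean_reads / siscowet_reads) / ratio


def adjusted_siscowet_over_lean(siscowet_reads, lean_reads, ratio=LIBRARY_RATIO):
    """Siscowet/lean read ratio, scaled up for the smaller siscowet library."""
    if lean_reads <= 0:
        raise ValueError("lean read count must be positive")
    return (siscowet_reads / lean_reads) * ratio


lean_biased = adjusted_lean_over_siscowet(129, 6)     # raw ratio 21.5
siscowet_biased = adjusted_siscowet_over_lean(17, 1)  # raw ratio 17
print(f"{lean_biased:.2f} {siscowet_biased:.2f}")
```

A contig with no reads in one library has no defined ratio, so the sketch raises an error in that case and leaves the handling to the caller.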

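Table 1.   Gene frequency analysis from contiguous assemblies obtained from blastn alignment analysis and qPCR corroboration

Sequences for contigs are provided in Supplemental data with detailed top blast information for gene annotation. ‘Adjusted’ values represent either (S over L) × 1.75 or (L over S)/1.75 to account for differences in the overall number of sequences in lean and siscowet Multiplex Identifiers libraries. qPCR (quantitative reverse transcription–polymerase chain reaction) mean values are normalized (to actin) expression values. In both parts of the table, the columns from Total # seq. to Adjusted belong to the gene frequency analysis and the last four columns to the qPCR analysis.

Leans over siscowets

| Gene annotation | Contig no. | Total # seq. | Lean seq. | Siscowet seq. | Lean (L)/siscowets (S) | Adjusted L over S | Lean mean | Siscowet mean | Difference (L − S) | P-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| C1q-like adipose specific protein | 9821 | 135 | 129 | 6 | 21.5 | 12.3 | 23.31 | 0.81 | 22.50 | 0.039 |
| ovary-specific C1q-like factor | 26546 | 19 | 18 | 1 | 18 | 10.3 | 10.01 | 6.22 | 3.78 | 0.001 |
| similar to complement component C3 | 16894 | 19 | 18 | 1 | 18 | 10.3 | 13.25 | 5.65 | 7.60 | 0.014 |
| proteasome subunit alpha type-2 | 14019 | 19 | 18 | 1 | 18 | 10.3 | 2.89 | 2.14 | 0.75 | 0.008 |
| RING-box protein 1 | 9356 | 19 | 18 | 1 | 18 | 10.3 | 11.92 | 9.72 | 2.20 | 0.039 |
| NADH dehydrogenase 1 | 24752 | 15 | 14 | 1 | 14 | 8.0 | | | | |
| Acyl-CoA desaturase | 22337 | 14 | 13 | 1 | 13 | 7.4 | 875.35 | 642.06 | 233.28 | 0.060 |
| Secreted immunoglobulin domain 4 | 22567 | 14 | 13 | 1 | 13 | 7.4 | 1038.02 | 963.70 | 74.32 | 0.465 |
| Unknown but blasts against EST | 26150 | 13 | 12 | 1 | 12 | 6.9 | 8.55 | 8.59 | −0.05 | 0.982 |
| Unknown (no blast hits) | 24348 | 13 | 12 | 1 | 12 | 6.9 | | | | |
| Solute carrier family 27 | 27857 | 12 | 11 | 1 | 11 | 6.3 | 32.52 | 31.17 | 1.35 | 0.783 |
| Reticulon 4 | 26561 | 12 | 11 | 1 | 11 | 6.3 | 38.24 | 28.87 | 9.36 | 0.034 |
| Apolipoprotein A-IV precursor | 25572 | 12 | 11 | 1 | 11 | 6.3 | 10.62 | 4.76 | 5.85 | 0.136 |
| Catechol-O-methyltransferase | 24744 | 12 | 11 | 1 | 11 | 6.3 | 0.00 | 0.00 | 0.00 | 0.520 |
| Unknown (blasts to ESTs) | 15409 | 12 | 11 | 1 | 11 | 6.3 | 4.39 | 2.33 | 2.06 | 0.001 |
| Mitochondrial ribosomal protein S33 | 11265 | 12 | 11 | 1 | 11 | 6.3 | 8.35 | 5.63 | 2.73 | 0.030 |
| Cytochrome c oxidase polypeptide VIII- heart | 8226 | 12 | 11 | 1 | 11 | 6.3 | 40.41 | 41.08 | −0.67 | 0.890 |
| Hypothetical S. salar protein | 7115 | 12 | 11 | 1 | 11 | 6.3 | 10.44 | 9.14 | 1.30 | 0.150 |
| Diacylglycerol O-acyltransferase 2 | 5276 | 12 | 11 | 1 | 11 | 6.3 | 21.79 | 25.66 | −3.87 | 0.358 |
| Apolipoprotein B | 25963 | 11 | 10 | 1 | 10 | 5.7 | 47.26 | 35.19 | 12.06 | 0.025 |
| glycine C-acetyltransferase | 20462 | 11 | 10 | 1 | 10 | 5.7 | 8.41 | 7.81 | 0.60 | 0.520 |

Siscowets over leans

| Gene annotation | Contig no. | Total # seq. | Siscowet seq. | Lean seq. | Siscowets (S)/leans (L) | Adjusted S over L | Siscowet mean | Lean mean | Difference (S − L) | P-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Stannin | 4651 | 18 | 17 | 1 | 17 | 29.8 | 0.18 | 0.20 | −0.02 | 0.295 |
| Unknown (no blast hits) | 10018-1 | 11 | 10 | 1 | 10 | 17.5 | 23.87 | 26.05 | −2.18 | 0.050 |
| Unknown (no blast hits) | 10018-2 | | | | | | | | | |
| Guanine nucleotide binding protein β2 | 2046 | 21 | 19 | 2 | 9.5 | 16.6 | 68.44 | 61.13 | 7.31 | 0.235 |
| Arginine rich protein (ARMET) | 3316 | 12 | 10 | 2 | 5 | 8.8 | 5.40 | 8.28 | −2.87 | 0.018 |
| Unknown (blasts to ESTs) | 1325 | 11 | 9 | 2 | 4.5 | 7.9 | 1.06 | 1.35 | −0.28 | 0.189 |
| Trafficking protein particle complex subunit 2-like protein | 3108 | 11 | 9 | 2 | 4.5 | 7.9 | 2.49 | 2.61 | −0.12 | 0.753 |
| FK506-binding protein 5 | 4479 | 11 | 9 | 2 | 4.5 | 7.9 | 0.17 | 0.31 | −0.14 | 0.005 |
| Similar to Fibrocystin-L precursor | 766 | 11 | 9 | 2 | 4.5 | 7.9 | 0.04 | 0.04 | −0.01 | 0.782 |
| Arginase type Ib | 2730 | 10 | 8 | 2 | 4 | 7.0 | 3.32 | 3.06 | 0.26 | 0.518 |
| Hypothetical protein | 983 | 14 | 11 | 3 | 3.67 | 6.4 | | | | |
| Replication protein A | 5080 | 14 | 11 | 3 | 3.67 | 6.4 | 0.33 | 0.33 | 0.01 | 0.890 |
| Perforin | 549 | 22 | 17 | 5 | 3.4 | 6.0 | 43.57 | 24.40 | 19.17 | 0.653 |
| ATP synthase lipid-binding protein | 374 | 13 | 10 | 3 | 3.33 | 5.8 | 29.19 | 37.49 | −8.30 | 0.053 |
| Solute carrier family 25 alpha | 1809-1 | 13 | 10 | 3 | 3.33 | 5.8 | 5.37 | 4.85 | 0.52 | 0.245 |
| Zonadhesin-like gene | 2198 | 13 | 10 | 3 | 3.33 | 5.8 | 7.18 | 9.37 | −2.19 | 0.192 |
| Alpha-tectorin-like protein | 2247 | 13 | 10 | 3 | 3.33 | 5.8 | | | | |
| Myosin 1 | 3907 | 13 | 10 | 3 | 3.33 | 5.8 | 25.43 | 29.41 | −3.98 | 0.187 |
| Cytochrome P450 | 2080 | 17 | 13 | 4 | 3.25 | 5.7 | 8.96 | 7.17 | 1.79 | 0.165 |
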
Table 2.   Gene frequency analysis from contiguous assemblies produced by Genomics Workbench (CLCBio) software and qPCR (quantitative reverse transcription–polymerase chain reaction) corroboration

Leans over siscowets (gene frequency analysis: contig no. to adjusted L over S; qPCR analysis: lean mean to P-value)

| Gene annotation | Contig no. | Total # seq. | Lean seq. | Siscowet seq. | Lean (L)/siscowet (S) | Adjusted L over S | Lean mean | Siscowet mean | Difference (L − S) | P-value |
|---|---|---|---|---|---|---|---|---|---|---|
| C1q-like adipose specific protein | 25 | 126 | 120 | 6 | 20.0 | 11.4 | 23.31 | 0.81 | 22.50 | 0.039 |
| Proteasome subunit alpha type-2 | 1404 | 19 | 18 | 1 | 18.0 | 10.3 | 2.89 | 2.14 | 0.75 | 0.008 |
| Secreted immunoglobulin domain 4 | 895 | 18 | 17 | 1 | 17.0 | 9.7 | 91.77 | 32.09 | 59.69 | 0.236 |
| RING-box protein 1 | 1867 | 17 | 16 | 1 | 16.0 | 9.1 | 11.92 | 9.72 | 2.20 | 0.039 |
| Unknown but blasts to ESTs | 1142 | 14 | 14 | 0 | 14.0 | 8.0 | 1.21 | 1.32 | −0.11 | 0.501 |
| Acyl-CoA desaturase | 2062 | 28 | 26 | 2 | 13.0 | 7.4 | 875.35 | 642.06 | 233.28 | 0.060 |
| N-acetylglucosamine-6-sulfatase precursor | 1416 | 13 | 13 | 0 | 13.0 | 7.4 | 1.15 | 1.07 | 0.08 | 0.535 |
| Unknown (no blast hits) | 1575 | 13 | 13 | 0 | 13.0 | 7.4 | 0.28 | 0.17 | 0.10 | 0.032 |
| Microsomal glutathione S-transferase 3 | 1371 | 62 | 57 | 5 | 11.4 | 6.5 | 26.02 | 14.35 | 11.67 | 0.001 |
| C1 inhibitor | 1202 | 32 | 29 | 3 | 9.7 | 5.5 | 501.86 | 453.56 | 48.30 | 0.182 |
| Pyridoxine-5-phosphate oxidase | 1157 | 17 | 15 | 2 | 7.5 | 4.3 | 0.28 | 0.17 | 0.10 | 0.032 |
| Unknown (no blast hits) | 1664 | 41 | 36 | 5 | 7.2 | 4.1 | 1.50 | 2.97 | −1.47 | 0.374 |
| Fatty acid binding protein, heart | 458 | 31 | 27 | 4 | 6.8 | 3.9 | 41.79 | 42.23 | −0.43 | 0.958 |
| Complement C1q-like protein 2 | 1225 | 306 | 265 | 41 | 6.5 | 3.7 | 613.16 | 66.97 | 546.19 | 0.011 |
| Peroxisome proliferator activated receptor | 637 | 321 | 269 | 52 | 5.2 | 3.0 | 26.81 | 19.82 | 6.99 | 0.010 |

Siscowets over leans

| Gene annotation | Contig no. | Total # seq. | Siscowet seq. | Lean seq. | Siscowet (S)/lean (L) | Adjusted S over L | Siscowet mean | Lean mean | Difference (S − L) | P-value |
|---|---|---|---|---|---|---|---|---|---|---|
| CDV3 homolog | 1396 | 28 | 27 | 1 | 27.0 | 47.3 | 1.18 | 1.33 | −0.15 | 0.341 |
| Similar to FBP32 | 732 | 24 | 23 | 1 | 23.0 | 40.3 | 54.97 | 53.32 | 1.65 | 0.818 |
| Stannin | 1147 | 17 | 16 | 1 | 16.0 | 28.0 | 1.08 | 0.96 | 0.12 | 0.401 |
| Ribosomal protein L17 | 1754 | 17 | 16 | 1 | 16.0 | 28.0 | 45.18 | 48.97 | −3.78 | 0.613 |
| Unknown (no blast hits) | 493 | 14 | 14 | 0 | 14.0 | 24.5 | 0.03 | 0.04 | −0.01 | 0.042 |
| Unknown (blasts to ESTs) | 1013 | 11 | 11 | 0 | 11.0 | 19.3 | Ct values too low for calculation | | | |
| Src family associated phosphoprotein 1 | 1180 | 11 | 10 | 1 | 10.0 | 17.5 | 0.39 | 0.54 | −0.15 | 0.034 |
| Unknown (blasts to ESTs) | 1695 | 9 | 8 | 1 | 8.0 | 14.0 | 0.05 | 0.05 | 0.00 | 0.591 |
| Anionic trypsin-1 precursor | 1242 | 9 | 8 | 1 | 8.0 | 14.0 | 5.36 | 3.49 | 1.87 | 0.434 |
| Microsatellite Alu16 sequence | 340 | 9 | 8 | 1 | 8.0 | 14.0 | 0.09 | 0.10 | −0.01 | 0.750 |
| Unknown | 307 | 9 | 8 | 1 | 8.0 | 14.0 | 23.87 | 26.05 | −2.18 | 0.050 |
| Similar to FBP32 | 169 | 35 | 31 | 4 | 7.8 | 13.6 | 10.41 | 9.20 | 1.21 | 0.201 |
| Unknown (blasts to ESTs) | 1093 | 7 | 7 | 0 | 7.0 | 12.3 | 7.60 | 7.82 | −0.23 | 0.863 |
| Unknown (blasts to ESTs) | 908 | 12 | 10 | 2 | 5.0 | 8.8 | 2.63 | 2.77 | −0.15 | 0.784 |
| FK506-binding protein 5 | 1390 | 12 | 10 | 2 | 5.0 | 8.8 | 0.17 | 0.31 | −0.14 | 0.005 |

  1. Sequences for contigs are provided in Supplemental data with detailed top blast information for gene annotation. ‘Adjusted’ values represent either (S over L) × 1.75 or (L over S)/1.75 to account for differences in the overall number of sequences in lean and siscowet Multiplex Identifiers libraries. qPCR means are normalized (to actin) expression values.

Table 3.   Analysis of contiguous assemblies having the highest overall expression values when analyzed by Genomics Workbench (CLCBio) software

Lean over siscowet (gene frequency analysis: contig no. to adjusted L over S; gene expression analysis: lean expression to expression difference; qPCR analysis: lean mean to P-value)

| Gene annotation | Contig no. | Total # seq. | Lean seq. | Siscowet seq. | Lean (L)/siscowet (S) | Adjusted L over S | Lean expression | Siscowet expression | Expression difference | Lean mean | Siscowet mean | Difference (L − S) | P-value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 28S Ribosomal RNA | 1367 | 847 | 669 | 178 | 3.8 | 2.1 | 14 450 | 6752 | 7698 | 110.59 | 82.81 | 27.77 | 0.041 |
| Apolipoprotein AII | 1472 | 1830 | 1352 | 478 | 2.8 | 1.6 | 16 286 | 10 112 | 6174 | 103.82 | 91.47 | 12.35 | 0.179 |
| Cysteine sulfinic acid decarboxylase | 8 | 11576 | 7772 | 3804 | 2.0 | 1.2 | 35 840 | 30 805 | 5035 | 9.71 | 9.12 | 0.59 | 0.751 |
| Mitochondrial genome blast | 1603 | 10671 | 7121 | 3550 | 2.0 | 1.1 | 29 572 | 25 890 | 3682 | 2.85 | 1.80 | 1.05 | 0.005 |
| B-microseminoprotein precursor | 2068 | 786 | 559 | 227 | 2.5 | 1.4 | 11 892 | 8480 | 3412 | 3.06 | 2.17 | 0.89 | 0.027 |
| Coiled-coil transcriptional coactivator b | 1423 | 1536 | 1104 | 432 | 2.6 | 1.5 | 10 685 | 7342 | 3343 | 12.78 | 11.19 | 1.59 | 0.207 |
| Complement C1q-like protein 2 | 1225 | 306 | 265 | 41 | 6.5 | 3.7 | 3968 | 1078 | 2890 | 613.16 | 66.97 | 546.19 | 0.011 |
| Mitochondrial genome blast | 115 | 1915 | 1321 | 594 | 2.2 | 1.3 | 12 862 | 10 156 | 2706 | 17.87 | 16.17 | 1.70 | 0.474 |
| Liver SSH library from warm, cold and hypoxic stress | 1506 | 518 | 394 | 124 | 3.2 | 1.8 | 5220 | 2885 | 2335 | 333.12 | 176.35 | 156.77 | 0.035 |
| Serum albumin 2 | 485 | 6055 | 4025 | 2030 | 2.0 | 1.1 | 19 782 | 17 521 | 2261 | | | | |
| Liver-type fatty acid binding protein | 662 | 1684 | 1118 | 566 | 2.0 | 1.1 | 20 331 | 18 067 | 2264 | 76.56 | 70.90 | 5.66 | 0.475 |
| Cytochrome C oxidase subunit II | 2001 | 1246 | 841 | 405 | 2.1 | 1.2 | 12 609 | 10 664 | 1945 | | | | |
| Apolipoprotein A-I-1 | 1190 | 1118 | 776 | 342 | 2.3 | 1.3 | 7940 | 6145 | 1795 | 105.41 | 96.58 | 8.84 | 0.402 |
| Unknown but blasts strongly to ESTs | 2078 | 513 | 354 | 159 | 2.2 | 1.3 | 7780 | 6137 | 1643 | | | | |
| Catechol-O-methyltransferase domain-containing protein 1 | 1214 | 245 | 203 | 42 | 4.8 | 2.8 | 2466 | 896 | 1570 | 30.49 | 12.56 | 17.93 | 0.001 |
| Alpha 1-antiproteinase-like protein | 1336 | 656 | 485 | 171 | 2.8 | 1.6 | 3816 | 2363 | 1453 | 15.41 | 15.65 | −0.24 | 0.887 |
| Peroxisome proliferator activated receptor gamma | 637 | 321 | 269 | 52 | 5.2 | 3.0 | 2165 | 735 | 1430 | 26.81 | 19.82 | 6.99 | 0.010 |
| 60s Ribosomal protein L36 | 1855 | 130 | 100 | 30 | 3.3 | 1.9 | 2891 | 1523 | 1368 | 1.28 | 1.15 | 0.13 | 0.4 |

Siscowet over lean

| Gene annotation | Contig no. | Total # seq. | Siscowet seq. | Lean seq. | Siscowet (S)/lean (L) | Adjusted S over L | Siscowet expression | Lean expression | Expression difference | Siscowet mean | Lean mean | Difference (S − L) | P-value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Fucolectin-4 precursor | 1973 | 5424 | 2354 | 3070 | 0.8 | 1.3 | 39 701 | 29 484 | 10 217 | 29.04 | 23.11 | 5.93 | 0.317 |
| Saposin-related isoform A/antimicrobial peptide NK-lysin | 1441 | 1385 | 765 | 620 | 1.2 | 2.2 | 17 654 | 8147 | 9507 | 74.78 | 101.76 | −26.98 | 0.070 |
| Saposin-related, isoform A NK-lysin type 3 | 1871 | 1359 | 768 | 591 | 1.3 | 2.3 | 13 849 | 6069 | 7780 | 2734.52 | 2373.41 | 361.10 | 0.635 |
| Type-4 ice-structuring protein LS-12 precursor | 1942 | 827 | 473 | 354 | 1.3 | 2.3 | 13 341 | 5686 | 7655 | 1013.74 | 645.76 | 367.98 | 0.012 |
| RNAse2/angiogenin | 1450 | 1529 | 730 | 799 | 0.9 | 1.6 | 17 713 | 11 040 | 6673 | 6.17 | 7.55 | −1.38 | 0.334 |
| Precerebellin | 2013 | 1667 | 747 | 920 | 0.8 | 1.4 | 19 385 | 13 595 | 5790 | 4.35 | 5.45 | −1.10 | 0.074 |
| Serum albumin 2 | 2049 | 3894 | 1701 | 2193 | 0.8 | 1.4 | 20 971 | 15 396 | 5575 | | | | |
| Hemopexin-like protein | 1733 | 2494 | 1180 | 1314 | 0.9 | 1.6 | 15 025 | 9528 | 5497 | 20.82 | 23.95 | −3.13 | 0.122 |
| Saposin-related antimicrobial peptide NK-lysin | 1856 | 1157 | 562 | 595 | 0.9 | 1.7 | 10 741 | 6475 | 4266 | 820.47 | 833.20 | −12.74 | 0.957 |
| Acyl CoA binding protein | 1466 | 505 | 263 | 242 | 1.1 | 1.9 | 6358 | 3332 | 3026 | 1341.86 | 1031.22 | 310.65 | 0.052 |
| Similar to neurotoxin/CD59/Ly6 | 1148 | 425 | 238 | 187 | 1.3 | 2.2 | 5379 | 2407 | 2972 | 313.01 | 249.88 | 63.13 | 0.085 |
| Secreted phosphoprotein 24 | 1809−2 | 1271 | 549 | 722 | 0.8 | 1.3 | 11 764 | 8810 | 2954 | 9.78 | 9.03 | 0.75 | 0.542 |
| Fucolectin-4 precursor | 1253 | 673 | 307 | 366 | 0.8 | 1.5 | 7228 | 4907 | 2321 | 2384.05 | 2376.82 | 7.23 | 0.988 |
| CDV3 homolog | 1396 | 27 | 27 | 0 | 27.0 | 47.3 | 2147 | 0 | 2147 | 1.18 | 1.33 | −0.15 | 0.341 |
| SE-cephalotoxin | 2102 | 1006 | 490 | 516 | 0.9 | 1.7 | 4280 | 2566 | 1714 | 4.61 | 5.75 | −1.13 | 0.345 |
| Haptoglobin | 713 | 339 | 192 | 147 | 1.3 | 2.3 | 2970 | 1295 | 1675 | 5.24 | 3.04 | 2.20 | 0.044 |
| Complement factor H | 1822 | 93 | 59 | 34 | 1.7 | 3.0 | 2018 | 662 | 1356 | 1.76 | 1.66 | 0.10 | 0.763 |
| Toxin 1 | 1795 | 330 | 140 | 190 | 0.7 | 1.3 | 5739 | 4435 | 1304 | | | | |

  1. For gene frequency analysis, adjusted values represent either (S over L) × 1.75 or (L over S)/1.75 to account for differences in the overall number of sequences in lean and siscowet Multiplex Identifiers libraries. qPCR (quantitative reverse transcription–polymerase chain reaction) means are normalized (to actin) expression values. Sequences for contigs are provided in Supplemental data with detailed top blast information for gene annotation.

Sequence analysis

Sequences from the two libraries were initially separated by their respective MIDs at the genome core facility. At the GLWI, the sequences were further processed to remove linker primer sequences derived from the SMART cDNA synthesis. Poly A/T tails were also trimmed when they occurred at the beginning or the end of a sequence.
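The tail-trimming step can be illustrated with a short sketch. The text does not give the trimming tool or the minimum tail length, so the function name and the `min_run` threshold of 10 bases are assumptions made only for illustration; linker removal is not shown.

```python
import re


def trim_poly_at(seq, min_run=10):
    """Remove a leading or trailing homopolymer run of A or T.

    min_run is a hypothetical threshold; the paper does not state one.
    Only runs at the very start or very end of the read are removed.
    """
    run = r"(?:A{%d,}|T{%d,})" % (min_run, min_run)
    seq = re.sub("^" + run, "", seq)   # tail at the beginning of the read
    seq = re.sub(run + "$", "", seq)   # tail at the end of the read
    return seq


# A made-up read with a 12-base poly-T head and a 15-base poly-A tail.
read = "T" * 12 + "GCGTACGTTC" + "A" * 15
print(trim_poly_at(read))
```

Internal A or T runs are left untouched because both patterns are anchored to the ends of the read.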

For the comparative analysis of gene expression between lean and siscowet lake trout, two bioinformatic approaches were employed, one using blastn alignments and one using assembled contigs. At the beginning of this investigation, programs that could effectively assemble the total number of gene reads produced from the GS-FLX sequencing of unnormalized cDNA were not available. Thus, we developed an approach in which the individual gene reads were compared to each other using blastn and the top 500 hits were recorded per sequence. Sequences were then grouped into gene clusters based on their blast score, and the relative number of sequences from each MID library in each group was calculated. Sequences within each gene cluster were then assembled using CAP3 (Huang & Madan 1999) for subsequent annotation and qPCR primer design (see below). For this analysis, only sequences ≥100 bp after trimming were analysed. This amounted to 91 810 sequences for siscowet lake trout and 160 892 sequences for lean lake trout. Later in our study, commercial platforms were released that contained programs that could assemble the very large sequence sets produced from next-generation sequencing of unnormalized cDNA. Thus, in a second approach, de novo assembly of sequencing reads was performed, followed by comparative expression analyses (Genomics Workbench, CLCBio). Initially, all sequences were trimmed based on quality scores of 0.05 (Phred; Ewing & Green 1998; Ewing et al. 1998) and the number of ambiguous nucleotides (>2 on ends). Sequences smaller than 30 bp were removed, leaving 153 046 sequences for the siscowet library and 272 775 sequences for the lean library. Assembly resulted in 2276 contiguous sequences, which were then used as a reference for comparing expression between the siscowet and lean MID libraries in two ways: (i) ratios of gene reads per MID library, and (ii) relative expression values measured in RPKM (reads per kilobase of exon model per million mapped reads) (Mortazavi et al. 2008) between MID libraries.
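As a worked illustration of the second measure, RPKM follows from its definition: reads mapped to a contig, divided by the contig length in kilobases and by the total mapped reads in millions. The sketch below applies that standard definition; it is not a reproduction of the software's internal calculation, and the function name and the example numbers are hypothetical.

```python
def rpkm(contig_reads, contig_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads.

    RPKM = 10^9 * C / (N * L), where C is the number of reads mapped to the
    contig, N the total number of mapped reads in the library and L the
    contig length in base pairs.
    """
    if contig_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return contig_reads * 1e9 / (total_mapped_reads * contig_length_bp)


# Hypothetical contig: 500 reads on a 2000 bp contig in a library of
# 250 000 mapped reads.
print(rpkm(500, 2000, 250_000))
```

Because both library size and contig length appear in the denominator, the same raw read count yields a lower value in a larger library or on a longer contig, which is why this measure complements the simple read ratios.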

For all bioinformatic approaches, assembled contigs were aligned by blast to the NCBI nr protein, nucleotide and dbEST (‘all others’) databases for annotation, and sequences and detailed blast results are provided in the Supplemental Data (Contig Sequences and Annotation).

Quantitative reverse transcription-polymerase chain reaction

Quantitative reverse transcription-polymerase chain reaction (qPCR) was carried out to assess the results obtained from the various bioinformatic approaches used to analyse the RNA-seq data. Primers for qPCR (Supplemental Data: qPCR primers) were designed against the contiguous assemblies (Supplemental Data: Contig Sequences), and qPCR was performed on individual RNA samples from the same 10 lean and 10 siscowet lake trout that were pooled originally to generate cDNA for 454 sequencing. First-strand cDNA synthesis was performed on total RNA with Improm II (Promega) reverse transcriptase using 0.5 μg RNA, 0.25 μg oligo dT primer and 10 mM dNTP mix. Reactions were carried out for 1 h at 37 °C. All qPCR reactions were prepared from master mixes and individual reactions contained the following: 2.5 μL of a 1:10 dilution of cDNA, 5 pm each of forward and reverse gene primers (Supplemental Data: qPCR primers), and 12.5 μL Power SYBR Green PCR Master Mix (Applied Biosystems). Cycling and fluorescence measurements were carried out in a Stratagene Mx 3000P System (Stratagene) with the following cycling parameters: 1 cycle of 95 °C for 10 min; 40 cycles of 95 °C for 15 s and 58 °C for 1 min. Fluorescence readings were taken at the end of each cycle. Immediately after cycling, a melting curve analysis was run. Amplification products from qPCR primers were analysed initially on agarose gels to ensure the presence of single bands of the correct size, and quality control for qPCR included the analysis of no template controls for the absence of primer dimers, and dissociation curves for the presence of sharp single peaks.

Raw data were processed with Real-time PCR Miner (Zhao & Fernald 2005). Quantification was performed by calculating the relative mRNA concentration (R0) for each gene per individual sample. Briefly, this was calculated using the following equation: R0 = 1/(1 + E)^Ct, where E is the gene efficiency calculated as the average of all individual sample efficiencies across all reactions for a given gene per qPCR plate, and Ct is the cycle number at threshold (Zhao & Fernald 2005). The R0 for each gene was normalized to a control (actin; Supplemental Data: qPCR primers). Average normalized values were calculated for lean and siscowet lake trout per gene and analysed by ANOVA. P-values of ≤0.05 were considered significantly different, though P-values for all reactions are provided in the results and were sometimes used more broadly to interpret qPCR corroboration.
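The quantification step can be sketched directly from the equation above. The function names are illustrative, and the efficiency and Ct values in the usage lines are made-up numbers rather than data from this study; the ANOVA step is not shown.

```python
def relative_concentration(efficiency, ct):
    """R0 = 1 / (1 + E)^Ct for one gene in one sample."""
    return 1.0 / (1.0 + efficiency) ** ct


def normalized_expression(gene_eff, gene_ct, ref_eff, ref_ct):
    """R0 of the target gene divided by R0 of the reference gene (actin here)."""
    return relative_concentration(gene_eff, gene_ct) / relative_concentration(ref_eff, ref_ct)


# Hypothetical sample: target gene crosses threshold at cycle 24, actin at
# cycle 18, with per-gene mean efficiencies of 0.95 and 0.98.
value = normalized_expression(0.95, 24.0, 0.98, 18.0)
print(value)
```

A gene that crosses threshold later than the reference gives a normalized value below 1, and a higher amplification efficiency lowers R0 for the same Ct.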

Results

Growth

When lean and siscowet lake trout began communal rearing (6 months posthatch), lean lake trout were 6.9 ± 1.3 g and 9.75 ± 0.50 cm, and siscowets were 4.6 ± 1.0 g and 8.50 ± 0.50 cm. Lean lake trout were 2 weeks older and initially were slightly larger than siscowets. Even so, the siscowets outgrew the leans (Fig. 1). By 2–2.5 years old, when all fish were weighed, the siscowets were significantly heavier (859.9 g, t = −6.83, d.f. = 810; P < 0.001) and longer (42.50 cm, t = −6.64, d.f. = 773; P < 0.001) than leans (746.9 g and 41.1 cm); however, the difference in length was minor compared to the difference in weight (Fig. 1). As a result, siscowet (KTL = 1.10) also had a significantly (t = −9.36, d.f. = 785, P < 0.001) higher condition factor than lean lake trout (KTL = 1.05).

Figure 1.

Mean weights (A) and lengths (B) of laboratory-reared lean and siscowet lake trout calculated from assays performed semimonthly (first 3 months) and monthly (>3 months) for 2.5 years. Each point represents the mean for 25 individuals/lean or siscowet lake trout/tank. Inset figures indicate the mean weight (g) ± SD and length (cm) ± SD assayed at two time points for all individuals/lean or siscowet lake trout (16 September 2008: Lean N = 412; Siscowet N = 427; 5 January 2009: Lean N = 395; Siscowet N = 393). Asterisks indicate a significant difference between lean and siscowet lake trout at P ≤ 0.05.

Morphometry

At 1 year, three truss elements (1, 3, 7) were significantly different and by 2 years, this increased to six elements (1, 2, 3, 4, 5, 9), mostly associated with head, body depth and caudal peduncle measures (Fig. 2).

Lipids

Lipids were analysed in lean and siscowet lake trout by a chemical method and by a microwave device (fatmeter). While the chemical method is a precise measure of the lipids in the muscle, the microwave device allowed us to quickly compare relative lipid levels at a number of different areas of the body. At both sampling times, siscowet lake trout had a significantly higher percentage of lipid than lean lake trout on both a dry and a wet weight basis (Fig. 3). In addition, there was a significant increase in lipids from year 1 to year 2 for both lean and siscowet lake trout on both a wet and a dry weight basis. At all five locations sampled with the fatmeter, there was a significantly greater reading in siscowet than in lean lake trout, indicating a higher percentage of fat at those locations (Fig. 4B). In many cases, there were significant differences between the fatmeter readings at different body locations within each lake trout form; however, this was observed more often in siscowet lake trout (Fig. 4C). For example, readings at locations 1 and 2 in siscowet lake trout were very different (P < 0.0001) from location 3, whereas in lean lake trout, readings at location 1 were not different from location 3 and readings at location 2 were only marginally significantly (P < 0.045) different from readings at location 3.

Figure 3.

 Mean % lipid + SEM calculated on the basis of wet weight (upper) and dry weight (lower) for N = 6 (2007) and N = 20 (2008) individuals/lean or siscowet lake trout. Asterisks on bars indicate significant (P ≤ 0.05) differences between lean and siscowet lake trout per sampling period and on brackets indicate significant (P ≤ 0.05) differences from year 1 to year 2 within a lake trout form.

Comparative transcriptomics

The analysis of sequences using blastn alignments identified 21 gene clusters containing sequences with a higher (≥5.7) adjusted gene frequency in lean vs. siscowet libraries, and 19 gene clusters containing sequences with a higher (≥5.7) adjusted gene frequency in siscowet compared with lean lake trout (Table 1). qPCR results corroborated 10 (one at P = 0.06) of the lean over siscowet gene frequency comparisons and, with a few exceptions, even the ones not corroborated by qPCR were in the direction expected from the gene frequency analysis (i.e. lean qPCR mean > siscowet qPCR mean; Table 1). However, qPCR did not corroborate the gene frequency results obtained from the blastn groupings in which siscowet lake trout had higher gene frequencies than lean lake trout. In fact, four gene expression profiles were significantly different in the opposite direction (i.e. leans > siscowets) to what was predicted from the gene frequency results (Table 1). Regardless, the differences in expression measured by qPCR were not very large between the lake trout forms for those genes with higher frequencies in siscowet lake trout. In contrast, the mean differences in expression for some of the genes with higher frequencies in lean vs. siscowet lake trout were large (e.g. C1q-like adipose-specific protein).

Gene frequency analysis derived from aligning individual reads to 2276 assembled contiguous sequences (Genomics Workbench, CLCBio) produced similar results to the blastn gene clustering. In this analysis, the 15 assembled contigs with the greatest gene frequency differences for each MID library comparison (lean over siscowet and siscowet over lean) were analysed using qPCR (Table 2). Of the genes exhibiting higher frequencies in lean vs. siscowet lake trout, nine of the comparisons were corroborated by qPCR, with many of them having large fold differences in expression levels (Table 2). Further, several of these qPCR-corroborated genes were the same as those observed in the blastn-based analysis, including C1q-like adipose-specific protein, proteasome subunit alpha type-2, RING-box protein 1 and acyl-CoA desaturase. In addition, the frequency differences for these genes were in the same order for both analyses (Tables 1 and 2). Again, however, qPCR results did not agree with results for genes that exhibited higher gene frequencies in siscowet vs. lean lake trout (Table 2). Similar to the blastn-based analysis, there were a few significant comparisons but in the opposite direction from that predicted by gene frequency. Of these, the FK506-binding protein 5 was significant in both analyses (Tables 1 and 2).

Since many of the clusters or assemblies analysed by gene frequency contained a low total number of sequences (Tables 1 and 2), we also used the RPKM method (Mortazavi et al. 2008) for determining differences in expression values that takes into consideration relative library size and the overall number of gene reads within contigs (Genomic Workbench, CLCBio). Using this approach, 18 genes with the largest expression value differences across libraries were examined (Table 3). Given the way in which the expression values are calculated, these contigs contained some of the largest number of reads within each library. While these transcripts also exhibited differences in gene frequencies between libraries, the differences were lower than those observed with the two gene frequency analyses discussed above. Of the 18 comparisons showing higher expression values in lean lake trout, 6 were corroborated by qPCR and 17 of the expression differences measured by qPCR were in the correct direction (i.e. lean > siscowet) regardless of the P-value (Table 3). Most of the qPCR-corroborated genes also contained the highest frequency differences between gene reads (Table 3). In contrast to the gene frequency analyses, four genes (type-4 ice-structuring protein LS-12 precursor, acyl-CoA binding protein, similar to neurotoxin/CD59/Ly6, and haptoglobin), that showed higher expression levels in siscowet vs. lean lake trout, were corroborated by qPCR (‘similar to neurotoxin/CD59/Ly6’ at P = 0.085).

Discussion

While past studies have reared lean and siscowet lake trout under hatchery conditions (Eschmeyer & Phillips 1965; Stauffer & Peck 1981), this is the first comprehensive study that quantified a wide range of phenotypic differences between these lake trout forms within the same individuals reared under identical environmental conditions. It is also the first study to use pyrosequencing to quantify differences in transcriptome expression between different ecotypes. The results clearly demonstrate that key phenotypic differences that have been observed between wild lean and siscowet lake trout such as condition factor, morphometry and lipid levels, persist in these two forms when reared in the laboratory under identical environmental conditions. The results strongly suggest that these differences are genetic and not a result of plasticity due to environment.

In differentiating general morphology among lean, siscowet and humper lake trout, Moore & Bronte (2001) found that the nine (out of 31) truss elements used in the current study were able to discriminate between lean and siscowet lake trout at a level of 80%. Based on these nine elements, we found significant differences between our communally reared siscowet and lean lake trout in three elements at year 1 and six at year 2. Of these, two elements (1 and 3), associated with length of the jaw and depth of the body respectively, were significantly different in both years. Khan & Qadri (1970), applying more classical morphometric techniques, found that siscowet had smaller heads and shorter jaws than lean lake trout. This is consistent with the shorter length of truss element 1, a measurement of the length of the jaw, that we observed in laboratory-reared siscowet in both years. Similarly, truss element 3 is associated with the depth of the body and this was significantly longer in siscowet as compared to lean lake trout in both years. The difference in this truss element between the lake trout forms correlates well with the higher condition factor of siscowet compared to lean lake trout. In fact, differences in truss element 3 were the most significant (P = 0.006 in year 1 and P = 0.002 in year 2) of all the truss measurements. Wild siscowet lake trout have longer and thicker caudal peduncles than other lake trout forms (Moore & Bronte 2001), and this correlates with the significant increase we observed in truss element 9 in siscowet lake trout in year 2.

The number of significantly different truss elements was lower for our laboratory fish than for wild lean and siscowet lake trout (Moore & Bronte 2001). Sympatric populations of Arctic char reared under common conditions maintained significant morphological differences, but the differences were not as great as those seen in wild fish, indicating some phenotypic plasticity (Adams & Huntingford 2004). The lean and siscowet lake trout reared in this study may also be exhibiting some similar plasticity. Another factor may be the size and age of the fish analysed in this study compared to the wild fish analysed by Moore & Bronte (2001). The nine truss elements used in the present study were found to be informative in differentiating wild lean and siscowet lake trout that were 300–900 mm (Moore & Bronte 2001). Fish in the current study averaged 200 and 390 mm in years 1 and 2, respectively. Truss landmarks are more difficult to determine in smaller fish, and the size difference between the fish used in each study may have influenced our ability to detect significant differences in trusses on fish assayed in year 1 of the study. In addition, the changes in body morphometry may simply increase with size, which is supported by our results showing a greater number of significant truss elements in year 2 than in year 1. Thus, we hypothesize that differences will become greater as the fish become larger and that other truss elements will become significant.

This study found significant differences in weight–length relationships between lean and siscowet lake trout within 1 year of being grown under identical conditions. Laboratory-reared siscowet lake trout were significantly heavier and longer than lean lake trout by 2 years, but the difference in weight was more dramatic. Thus, weight–length growth model parameters were significantly different between laboratory siscowet and lean lake trout (P < 0.005; GLM procedure in R version 2.8.1; Fig. 5). For any given length, laboratory-reared siscowet had higher weights than lean lake trout, and this relationship is also observed between wild siscowet and lean lake trout populations in Lake Superior (Fig. 5).
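A weight–length relationship of this kind is commonly fitted as W = a·L^b by linear regression on log-transformed data. The sketch below assumes that form, which the text does not spell out, and fits each group separately by ordinary least squares; it does not reproduce the GLM comparison of parameters between forms that the authors ran in R. The function name and the example data are hypothetical.

```python
import math


def fit_weight_length(lengths_mm, weights_g):
    """Fit W = a * L**b by least squares on log10-transformed values.

    Returns (a, b). Assumes the standard power-law form; the example
    values below are synthetic, not measurements from this study.
    """
    xs = [math.log10(v) for v in lengths_mm]
    ys = [math.log10(v) for v in weights_g]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope on the log-log scale
    a = 10 ** (mean_y - b * mean_x)      # back-transformed intercept
    return a, b


# Synthetic fish generated from W = 1e-5 * L**3.1, used only to check the fit.
lengths = [200.0, 250.0, 300.0, 350.0, 400.0]
weights = [1e-5 * length ** 3.1 for length in lengths]
a, b = fit_weight_length(lengths, weights)
print(a, b)
```

Under this form, a group that is heavier at any given length shows up as a larger intercept a, a larger exponent b, or both.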

Figure 5.

 Weight–length relationships for laboratory (subscript L) and wild (subscript W) siscowet (subscript S, dashed lines) and lean (subscript L, solid lines) lake trout from Lake Superior populations. Data for wild fish were collected from lake trout surveys conducted in Michigan waters of Lake Superior near Marquette during 2003–2006. Data presented for laboratory lean and siscowet lake trout are reanalysed from that presented in Fig. 1.

We found significantly higher muscle lipid content in siscowet vs. lean lake trout within 1 year of growth under identical environmental conditions, similar to results reported by Eschmeyer & Phillips (1965) for pond-reared siscowet and lean lake trout. Wild siscowet have a higher lipid level than lean lake trout (Eschmeyer & Phillips 1965; Wang et al. 1990) and this difference increases with size (Eschmeyer & Phillips 1965). It is difficult to absolutely compare the lipid levels reported for past studies on wild lake trout to our results since there were differences in the body region sampled and in the lipid analysis itself. However, on a dry weight basis, the laboratory-reared siscowet in our study appeared to have very similar levels in relation to length as reported for wild siscowet lake trout (Eschmeyer & Phillips 1965). Our laboratory-reared lean lake trout, however, appear to have higher lipid content than wild lean lake trout at similar sizes. For example, a wild lean lake trout approximately 40 cm long had a lipid level of approximately 20% (Eschmeyer & Phillips 1965), whereas lipid levels in our cultured lean lake trout at the same size were approximately 32% (Fig. 4). Thus, while the difference in lipid levels observed between lean and siscowet lake trout undoubtedly has some genetic basis, other factors such as diet and activity could also influence the lipid levels both between wild lake trout forms and also between the wild and laboratory-reared fish.

Although the techniques used with the fatmeter and the chemical lipid measurements were very different, the general conclusion drawn from the results of these two approaches was the same: siscowet had a higher lipid level in the skeletal muscle than lean lake trout. We did not perform any direct calibration of the fatmeter with actual lipid levels in muscle, so our measures are relative differences between lean and siscowet lake trout. Fatmeter measurements made on the two sampling positions over the epaxial muscle mass posterior to the head (positions 1 and 2) were higher in both lake trout forms compared to more posterior or ventral positions on the side of the body (Fig. 4). In Pacific salmon, these two regions of the body gave the strongest predictive relationships between gross energy density and fatmeter readings (Crossin & Hinch 2005). In the present study, the highest readings were on the ventral aspect of the body (position 5). However, the body wall is very thin at this point and these readings may be measuring fat that invests the pyloric caeca.

While the adaptive significance of higher fat content in siscowet lake trout is unclear, one hypothesis is that increased lipid is associated with the bioenergetics of vertical migration. Lipid may decrease the costs of maintaining neutral buoyancy at a range of depths (Eshenroder & Burnham-Curtis 1999; Henderson & Anderson 2002) and facilitate vertical and horizontal migration during feeding.

Overall, our study results suggest that there are strong genetic components to the phenotypic differences observed between siscowet and lean lake trout in the wild. Differences in the morphology of various anatomical regions of the rainbow trout body derived from several distinct habitats were also shown to have a strong genetic component in common garden experiments (Keeley et al. 2007). These differences correlated with the habitat (e.g. streams vs. lakes) or feeding (e.g. piscivores vs. nonpiscivores) characteristics of the various rainbow trout ecotypes. In Arctic char, there are many examples of sympatric morphs occurring in postglacial lakes (review Jonsson & Jonsson 2001). Results from various laboratory rearing studies have implicated both genetic and environmental explanations for the differences observed in wild Arctic char populations (Nordeng 1983; Svedang 1990; Hindar & Jonsson 1993; Skulason et al. 1996; Klemetsen et al. 2002; Adams & Huntingford 2004); however, the degree to which genetics or the environment is involved appears to vary among studies. It has been hypothesized that species divergence may initially involve the environmental regulation of alternative discrete phenotypes followed by genetic control (Skulason et al. 1999). If so, the degree to which a phenotypic trait in Arctic char is explained by genetic or environmental control in rearing experiments may be related to the level of diversification in the morphs that are being tested (Adams & Huntingford 2004). Regardless, the typical habitats in which sympatric Arctic char morphs are found are deep lakes that have profundal, pelagic and littoral habitats. Most of the lacustrine Arctic char morphs are separated on the basis of habitat and diet (profundal/zoobenthic feeding and limnetic/zooplankton feeding), though other differences may exist between morphs (e.g. spawning time, coloration, body size; Hindar & Jonsson 1982). Differences, particularly in head and mouth structure, appear to be related to feeding. For example, pelagic morphs have terminal mouths, long and dense gill rakers and short pectoral fins compared with benthos-feeding morphs (Hindar & Jonsson 1982, 1993).

As with Arctic char, deep postglacial lakes also contain lake trout morphs that may be specialized to profundal and pelagic habitats such as seen in Lake Superior with lean and siscowet lake trout. It has been suggested that the longer and thicker caudal peduncle of siscowet lake trout is adaptive to foraging in deeper water where vertical migrations may be extensive and burst swimming necessary for feeding (Moore & Bronte 2001). Similarly, in other large lakes in North America, deepwater and shallow lake trout forms have been reported (Blackie et al. 2003; Alfonso 2004; Zimmerman et al. 2006, 2007) that are differentiated by body shape (deep vs. elongate), length of the pectoral fins (long vs. short) and buoyancy (high vs. low) (Zimmerman et al. 2006). Differences in these characters are also thought to be related to the energetics of swimming and movement in the water column (Zimmerman et al. 2006). Besides salmonids, there are examples of fish from other taxa in which sympatric populations within a species demonstrate morphological specializations correlated with feeding in specific habitats. For example, studies have shown that perch in the littoral zone had a significantly deeper body in comparison to pelagic perch and in the laboratory, had higher capture rates in feeding trials in vegetation as opposed to open water (Svanback & Eklov 2003). The opposite was true for pelagic perch that were significantly more streamlined and had higher capture rates in the laboratory in open water trials as compared to trials in vegetation. Thus, differences in morphometry and physiology (i.e. lipid) between sympatric lake trout forms living in divergent habitats are probably related to feeding efficiencies within a specific habitat. 
As with Arctic char, it is possible that there are other differences between sympatric lake trout forms that are genetically fixed since artificial rearing experiments with nonoverlapping lake trout ecotypes taken from several lakes, have demonstrated a genetic component to egg size and age and size at sexual maturity (McDermid et al. 2007).

In the present study we used Roche 454 sequencing technology to examine differences in transcriptome expression between the livers of lean and siscowet lake trout reared under identical environmental conditions. In an earlier study on yellow perch, we used traditional cDNA library construction and capillary sequencing (ABI 3730) to investigate differences in transcript expression in the livers of yellow perch treated with estradiol-17β in the diet (Goetz et al. 2009). That study identified 28 oestradiol-regulated (75% difference between libraries) genes. Of the 17 of these 28 genes that we examined with qPCR, 14 were confirmed to be regulated by oestrogen. Since very small numbers of sequences were analysed (∼3500/library), we considered that study a successful prelude to the use of pyrosequencing for quantitative transcriptomics through the comparison of gene expression on a much larger scale. Results presented here clearly indicate that sequencing can successfully identify novel genes and delineate differentially expressed transcripts. However, discrepancies between RNA-seq analysis and qPCR are evident. When we started this work, readily available software programs (e.g. CAP3) were incapable of handling the data produced with 454 sequencing on unnormalized cDNA. Thus we initially developed an indirect method to look at gene frequencies by aligning all sequences (from both MID libraries) at the nucleotide level using blastn and then grouping sequences based on blastn scores. In the study on yellow perch livers, we compared the alignment approach with an analysis of assembled contigs using CAP3. We observed nearly the same transcript frequency results with both bioinformatic approaches (Goetz et al. 2009), suggesting that the alignment approach would work with much larger numbers of sequences. However, other algorithms and software packages (e.g. Genomic Workbench, CLCBio) have recently been developed to assemble sequences and examine expression patterns from pyrosequencing.
Clearly both bioinformatic approaches were able to delineate transcripts that are constitutively expressed at higher levels in lean vs. siscowet lake trout livers. Further, many of the same genes were identified in both frequency analyses, particularly the genes that could be corroborated by qPCR. However, delineating genes that were expressed at higher levels in siscowet vs. lean lake trout livers was not possible using either of these gene frequency analyses.
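The alignment-based grouping described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: it assumes, hypothetically, that pairwise blastn hits have already been reduced to (read, read, score) tuples, that each read identifier begins with a library prefix, and that reads are linked whenever a hit meets an arbitrary score cutoff. All names and values are illustrative.

```python
from collections import Counter, defaultdict


def group_reads(hits, min_score):
    """Single-linkage grouping: reads are joined by any hit scoring >= min_score."""
    parent = {}

    def find(read):
        parent.setdefault(read, read)
        while parent[read] != read:
            parent[read] = parent[parent[read]]  # path halving
            read = parent[read]
        return read

    for read_a, read_b, score in hits:
        root_a, root_b = find(read_a), find(read_b)
        if score >= min_score and root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for read in list(parent):
        groups[find(read)].append(read)
    return list(groups.values())


def library_counts(group):
    """Count reads per library, taking the text before the first underscore as the library."""
    return Counter(read.split("_", 1)[0] for read in group)


# Hypothetical hits: (read, read, blastn score); the cutoff of 100 is an assumption.
hits = [
    ("lean_1", "lean_2", 180.0),
    ("lean_2", "sis_1", 175.0),
    ("sis_2", "sis_3", 200.0),
    ("lean_3", "sis_3", 40.0),  # below the cutoff, so these two reads are not linked
]
groups = group_reads(hits, min_score=100.0)
for group in groups:
    print(sorted(group), dict(library_counts(group)))
```

The per-library counts within each group are the raw gene frequencies that would then be compared between the two MID libraries.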

The reason for this is unclear but could be a result of several things. First, an inherent problem with library construction was the large amount of cDNA required. We produced 10 μg of cDNA for each library that was constructed and this required SMART™ (Clontech) technology that uses PCR amplification. The amplification may not have been equal across transcripts, resulting in higher levels of certain genes in a given library when the expression of those genes was not really different in the original tissues. The amplification processes used to produce the libraries for GS-FLX sequencing may have further accentuated differences.
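How unequal amplification could distort apparent abundance can be shown with a small worked example. It assumes an idealized exponential model in which each transcript is multiplied by (1 + efficiency) per cycle. The cycle number and efficiencies are hypothetical values chosen for illustration and were not measured in this study.

```python
def amplified_copies(start_copies, efficiency, cycles):
    """Copies after PCR under an idealized model: start * (1 + efficiency) ** cycles."""
    return start_copies * (1.0 + efficiency) ** cycles


# Two transcripts with identical starting abundance (hypothetical values).
start = 1000.0
cycles = 20  # assumed cycle number, for illustration only

copies_a = amplified_copies(start, efficiency=0.95, cycles=cycles)
copies_b = amplified_copies(start, efficiency=0.90, cycles=cycles)

# Apparent fold difference produced by amplification alone.
bias = copies_a / copies_b
print(f"apparent fold difference: {bias:.2f}")
```

Under these assumptions, a difference of only five percentage points in per-cycle efficiency yields roughly a 1.7-fold apparent difference between transcripts that were equally abundant in the tissue.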

When examining qPCR results, we also noticed that expression for some genes was not continuous across individuals, but was either very high or very low, differing by several orders of magnitude. Thus, for these genes, cDNA produced from the pooled livers of one lake trout form could have very large quantities of the transcript if several of these individuals were included in the pool, even though there would not be a significant mean difference in the level of the transcript when assayed by qPCR and averaged over a number of individuals per lake trout form. In addition, when MID libraries are sequenced within a single 454 run, there can be an unequal number of sequences produced between the libraries even if preliminary titering is done before the final sequencing. In the current study, the final ratio of sequences between the two MID libraries was 1:1.75 (siscowet:lean). We corrected gene frequencies by this factor (Tables 1–3), but the unequal number of sequences between the libraries may still have had an impact that is not fully realized.
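The library-size correction can be sketched as follows. The 1:1.75 (siscowet:lean) ratio is taken from the text. The direction of the scaling (siscowet reads scaled up to lean library depth) and the read counts are assumptions made for illustration, and the function name is hypothetical.

```python
LEAN_PER_SISCOWET = 1.75  # reported siscowet:lean sequence ratio of 1:1.75


def adjusted_fold(lean_reads, siscowet_reads, ratio=LEAN_PER_SISCOWET):
    """Lean/siscowet fold difference after scaling siscowet reads to lean library depth."""
    if siscowet_reads <= 0:
        raise ValueError("siscowet_reads must be positive to form a ratio")
    return lean_reads / (siscowet_reads * ratio)


# Hypothetical gene with 70 lean reads and 20 siscowet reads.
raw_fold = 70 / 20                 # 3.5-fold before correction
corrected = adjusted_fold(70, 20)  # 70 / (20 * 1.75) = 2.0-fold after correction
print(raw_fold, corrected)
```

A gene whose raw counts differ by exactly the library ratio (for example 175 lean reads against 100 siscowet reads) has a corrected fold difference of 1, that is, no difference between forms.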

There could also be problems with the qPCR used to corroborate the RNA-seq results. For example, it has recently been reported that sequence polymorphisms in oysters could lead to large differences in the efficiencies of qPCR reactions between individuals as a result of differences in primer annealing across samples (Taris et al. 2008). Given the combined results of the current study it is possible that genetic differences exist between lean and siscowet lake trout that could be present at primer annealing locations. Further, if there are multiple forms of a gene with similar sequences within the primer regions (e.g. superfamily members, duplicate genes), a single qPCR may amplify very similar genes that are not differentially expressed and would, therefore, mask the results of the differentially expressed gene. However, while problems with qPCR might impact the results observed for isolated genes, they are unlikely to be the basis for the inability to corroborate upregulation of genes in siscowet lake trout in general.
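The potential effect of unequal primer annealing on qPCR estimates can be illustrated with a simple calculation. The sketch assumes an idealized exponential model and the common simplification that fold differences are computed as 2 raised to the difference in threshold cycle (Ct). The amplification factors and copy numbers are hypothetical and do not come from this study.

```python
import math


def ct_value(start_copies, amplification_factor, threshold_copies):
    """Cycles needed to reach the detection threshold under ideal exponential growth."""
    return math.log(threshold_copies / start_copies) / math.log(amplification_factor)


def apparent_fold(ct_sample, ct_reference):
    """Fold difference inferred when perfect doubling (factor 2) is assumed for both."""
    return 2.0 ** (ct_reference - ct_sample)


# Two individuals with the SAME starting template amount (hypothetical values).
start = 1.0
threshold = 2.0 ** 25

ct_matched = ct_value(start, 2.0, threshold)     # primers anneal perfectly
ct_mismatched = ct_value(start, 1.8, threshold)  # assumed reduced efficiency from a mismatch

fold = apparent_fold(ct_mismatched, ct_matched)
print(f"apparent expression relative to matched individual: {fold:.3f}")
```

With these assumed values the mismatched individual appears to express the gene more than 20-fold lower despite identical template abundance, which shows how sequence polymorphism at primer sites could produce spurious differences between individuals.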

Since discrepancies were identified between gene frequency and qPCR in regard to genes expressed at higher levels in the livers of siscowet vs. lean lake trout, we examined the RNA-seq data based on differences in expression values as determined by the RPKM method (Mortazavi et al. 2008). This expression analysis accounts for the number of sequences within an assembled contig and for differences in the number of total reads per library. Large expression values, or even large differences in the expression values, do not necessarily translate to large fold differences in gene frequencies between the libraries (see adjusted frequencies, Table 3). However, we hypothesized that given the large number of reads being analysed overall (>250 000), it might be easier to observe a difference in gene expression by chance alone if small numbers of gene reads were involved rather than large numbers of reads. In addition, problems arising from biased cDNA synthesis might impact small numbers of gene reads more so than large ones. Because of the total number of sequence reads involved, the genes from the expression analysis that were analysed were not the same as those in the gene frequency analyses. Exceptions were the C1q complement protein and the peroxisome proliferator-activated receptor (PPAR) that were present in high copy even in the gene frequency analyses (Table 2). Looking at the contigs containing the greatest differences in expression values, we observed several genes that were expressed at higher levels in lean vs. siscowet lake trout livers that were corroborated by qPCR. Interestingly, most of these genes also contained the highest frequency differences between gene reads (Table 3) even if they were not as great as those in the gene frequency analysis. However, more importantly, we identified four genes that were expressed at higher levels in the livers of siscowet as compared to lean lake trout, and these could be corroborated by qPCR. 
These were the only transcripts observed to be higher in siscowet lake trout livers throughout all of the analyses.
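For reference, the RPKM measure (reads per kilobase of transcript per million mapped reads; Mortazavi et al. 2008) can be written as a short function. The contig length and read counts below are hypothetical, and the sketch is not the software implementation used for the analysis.

```python
def rpkm(reads_in_contig, contig_length_bp, total_reads):
    """Reads per kilobase per million reads: 1e9 * C / (N * L)."""
    if contig_length_bp <= 0 or total_reads <= 0:
        raise ValueError("contig length and total reads must be positive")
    return reads_in_contig * 1e9 / (contig_length_bp * total_reads)


# Hypothetical 1000 bp contig in two libraries whose sizes differ by 1:1.75.
siscowet_value = rpkm(reads_in_contig=20, contig_length_bp=1000, total_reads=100_000)
lean_value = rpkm(reads_in_contig=35, contig_length_bp=1000, total_reads=175_000)
print(siscowet_value, lean_value)
```

Because the measure divides by both contig length and library size, the two hypothetical libraries give the same value even though the raw read counts differ by the library ratio.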

Clearly, a consideration that must be made in using pyrosequencing for comparative transcriptomic analysis is the way in which samples are prepared for sequencing. Specifically, the use of amplification for cDNA production will probably bias some gene frequencies. Thus, some other method should be used to obtain large amounts of cDNA for 454 sequencing. Alternatively, for nonmodel organisms without characterized genomes, other sequencing technologies (i.e. Illumina Genome Analyzer, ABI SOLiD) could be used for specifically determining gene frequencies against an existing EST backbone, or a backbone developed by 454 sequencing of appropriate samples.

Complementary DNA microarrays have been used to determine the differences in transcripts constitutively expressed in liver and muscle between sympatric whitefish (Coregonus clupeaformis) ecotypes (Derome & Bernatchez 2006; Derome et al. 2006; St-Cyr et al. 2008). Genes that are involved in energy production and muscular activity differed between ecotypes and generally reflected their life histories. For example, dwarf whitefish that occupy the pelagic zone and are active swimmers, had higher expression of genes involved in energy production and muscle contractility compared with normal whitefish that forage on benthic prey and are less active (Derome et al. 2006). Lean and siscowet lake trout occupy different habitats; siscowet being deepwater forms while lean lake trout live at shallower depths. As a result of the habitats they are found in, there might be some differences in the overall activity of leans and siscowets as well (e.g. Henderson & Anderson 2002; Hrabik et al. 2006) though differential expression of genes related to metabolism was not readily apparent in the current study. Given the higher lipid levels in siscowet vs. lean lake trout, we expected to see differences in genes related to lipid production, metabolism or transport. From the gene frequency and expression analysis we did observe several lipid-related genes including acyl-CoA desaturase, PPAR gamma (PPARγ), and apolipoprotein B to be expressed at higher levels in lean vs. siscowet lake trout livers. Acyl-CoA desaturase (also called stearoyl-CoA desaturase) is a pivotal enzyme that catalyses the initial oxidation reaction for the desaturation of long-chain saturated fatty acids into monounsaturated fatty acids (Nakamura & Nara 2004). This gene has been studied in several fish species (Hsieh et al. 2004) and the activity has been associated with membrane fluidity because of the different melting temperatures of the monounsaturated products formed by the action of the enzyme (Tocher 2003).
In mammals, the product of acyl-CoA desaturase, oleic acid, is the major fatty acid of adipose triglycerides (Kokatnur et al. 1979) and could, therefore, implicate this enzyme in lipid storage. PPARs are nuclear transcriptional factors that bind fatty acids and eicosanoids (Willson et al. 2000). The contig observed in this study aligned (weakly) at the nucleotide level with Atlantic salmon PPARγ. Salmon PPARγ is expressed in the liver and thought to be involved in peroxisomal β-oxidation of liver fatty acids and its presence in the liver was speculated to be the result of lipid deposition (Ruyter et al. 1997). Elevated acyl-CoA desaturase and PPARγ in lean lake trout could be related to a preferential storage of lipid in the liver as compared to the muscle in siscowet lake trout. Lipid-related genes that were higher in siscowet vs. lean lake trout livers included acyl-CoA binding protein; a highly conserved protein that binds long-chain acyl-CoA esters and acts as an acyl-CoA transporter (Burton et al. 2005; Faergeman et al. 2007). The precise role is unknown but in yeast it is involved in fatty acid chain elongation and sphingolipid synthesis (Faergeman et al. 2007). Another gene that was higher in siscowet vs. lean lake trout livers was annotated as ‘Type-4 ice-structuring protein LS-12 precursor’ from Atlantic salmon (accession no. ACI68824). This protein contains an apolipoprotein AII (apoAII) region. Apolipoproteins bind lipids and are fundamental to the packaging of lipids into lipoproteins for transport through the primary circulatory system in animals. In mammals (Schonfeld et al. 1978) and fish (Babin & Vernieer 1989), apoAII is the second most abundant protein component of high density lipoproteins. ApoAII is associated with increased levels of plasma fatty acids and triglycerides (as summarized in Castellani et al. 2008), and transgenic mice that overexpress apoAII have reduced skeletal muscle fatty acid oxidation and increased triglyceride accumulation (Castellani et al. 2001, 2004). Assuming a similar relationship for apoAII in fish, increased expression of this gene may be associated with decreased fatty acid utilization and increased lipid storage in the siscowet muscle.

In addition to genes related to lipids, we observed a significant number of immune-related genes to be differentially expressed between lean and siscowet lake trout and this was unexpected. In particular, complement component C3, proteasome, FK506 binding protein 5 (immunophilin family member), and several C1q proteins were constitutively expressed at higher levels in lean vs. siscowet lake trout and this was corroborated by qPCR. The gene that was most consistently (across all bioinformatic analyses) and differentially expressed between the two lake trout forms, was the C1q complement protein. In fact, there appeared to be several forms of C1q (C1q-like adipose specific protein, ovary-specific C1q-like factor, complement C1q-like protein 2) that were differentially expressed; some to a greater extent than others based on qPCR (Tables 1–3). C1q is the target recognition protein of the classic complement pathway (Kishore & Reid 2000) and a member of a large family of proteins that contain the C1q domain (Ghai et al. 2007) including precerebellin that was also differentially expressed between siscowet and lean lake trout (Table 3). Because of the unique structure of the molecule, C1q can bind to a number of ligands including LPS, porins, phospholipids, IgG, IgM and DNA to initiate a response (Ghai et al. 2007; Sjoberg et al. 2009). Thus, it plays a pivotal role in complement activation. C1q and other C1q domain containing proteins have been found in other vertebrates including fish (Mei & Gui 2008). Complement component C3 is also a pivotal member of the complement system being at the convergence of all three complement activation pathways: classical, alternative and lectin (Carroll 2004). Interestingly, C1q and C3 have been demonstrated in microarray experiments to be upregulated in the liver during bacterial challenge in channel catfish (Ictalurus punctatus) (Peatman et al. 2007) and a putative C1q homolog was upregulated in rainbow trout (Oncorhynchus mykiss) following stimulation with bacteria (Gerwick et al. 2007). Further, these microarray studies also demonstrated the upregulation of several other genes including haptoglobin (catfish and trout), neurotoxin/C59/Ly6-like protein, and catechol-O-methyltransferase domain containing 1 (catfish) that we also observed to be transcribed differentially between lean and siscowet lake trout livers (though not always higher in lean lake trout). In fact, the catechol-O-methyltransferase domain containing gene has no prior reported relationship with immunity though it was upregulated 14.8-fold in bacterial-challenged catfish (Peatman et al. 2007).

Complement factor proteins circulate in the blood and are recognized as key elements in the innate immune response to pathogens. However, they also enhance the adaptive immune response by indirectly activating B and T cells (Carroll 2004). The primary site for the synthesis of complement proteins is the liver, and in mammals this is thought to be primarily constitutive. However, these factors can be further regulated during infections (Carroll 2004) as reported in catfish and trout (Gerwick et al. 2007; Peatman et al. 2007). If transcript abundance reflects protein synthesis, could it be that lean lake trout have a higher constitutive level of complement than siscowet lake trout? This would suggest that lean lake trout may be more susceptible to pathogen exposure than siscowet, and constitutive elevation in immune factors such as the complement proteins would be adaptive in defending against pathogens. While it has been proposed that siscowet lake trout undergo vertical migrations for feeding (Henderson & Anderson 2002; Hrabik et al. 2006), the frequency of this is unknown. If siscowet lake trout are primarily demersal, living at depths greater than 100 m, then the water temperatures that they experience would be approximately 4 °C throughout the year (Sitar et al. 2008). These low temperatures may be less conducive to the survival and propagation of pathogens. In contrast, lean lake trout are more pelagic, shallow water forms that experience a much wider variation of temperatures (Mattes 2004), and possibly a greater number of pathogens including those from nearshore sources.

Conclusion

This study has demonstrated a strong genetic component to the phenotypic differentiation observed between wild lean and siscowet lake trout including differences in growth, morphology and lipid levels. The results on truss analysis and lipid levels suggest that environmental effects could also influence some of these differences. While the basis for these differences at the gene level is still unknown, the transcriptomic analysis presented here suggests that there are various physiological processes that could be different between these lake trout forms and that we know very little about the extent of these differences. Lake Superior is the only Great Lake with remnant multiple forms of wild lake trout that are adapted to different habitats. Recovery programs in the lower Great Lakes have focused primarily on lean lake trout from shallow water, but now are broadening their management options to consider re-introductions of deepwater forms (Bronte et al. 2008; Markham et al. 2008). Thus, understanding the physiological adaptations to living in deepwater habitats by siscowet lake trout would be very informative to Great Lakes lake trout recovery programs.

Acknowledgements

The authors would like to thank Greg Kleaver, Dawn Dupras, Brandon Bastar and Kevin Rathbun aboard the Michigan DNR ‘Judy’ who helped collect the siscowet lake trout used to derive the laboratory lines; and Jim Barron, Justin Wernecke, Doug Immerman, Adam Gilmore, Matthew Nichols and Erin Weber for their help in maintaining and assaying the lake trout strains reared in the laboratory. We also thank the staff of the Les Voigt Fish Hatchery for supplying lean lake trout fry. This study was supported in part by a grant from the Great Lakes Fishery Commission to F.G., S.S. and C.B., and grant CSD2007-00002 (Consolider-Ingenio 2010, Spanish Ministry of Science and Education, Spain), to S.M. This is contribution P-2009-4 of the U.S. Fish and Wildlife Service, Region 3 Fisheries Program.

Conflicts of interest

The authors have no conflict of interest to declare and note that the sponsors of the issue had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
