Genetic basis of differential opsin gene expression in cichlid fishes


Karen L. Carleton, Department of Biology, 1210 Biology/Psychology Bldg #144, University of Maryland, College Park, MD 20742, USA.
Tel.: 301 405 6929; fax: 301 314 9358; email:


Visual sensitivity can be tuned by differential expression of opsin genes. Among African cichlid fishes, seven cone opsin genes are expressed in different combinations to produce diverse visual sensitivities. To determine the genetic architecture controlling these adaptive differences, we analysed genetic crosses between species expressing different complements of opsin genes. Quantitative genetic analyses suggest that expression is controlled by only a few loci with correlations among some genes. Genetic mapping identifies clear evidence of trans-acting factors in two chromosomal regions that contribute to differences in opsin expression as well as one cis-regulatory region. Therefore, both cis and trans regulation are important. The simple genetic architecture suggested by these results may explain why opsin gene expression is evolutionarily labile, and why similar patterns of expression have evolved repeatedly in different lineages.


There has been considerable debate regarding the molecular mechanisms underlying organismal diversity. King & Wilson (1975) first proposed that changes in gene regulation were critical to phenotypic differences among closely related species. More recent work has suggested that changes in cis-regulatory modules are the most common basis for evolutionary changes in development (Wray, 2007; Carroll, 2008). Others maintain that structural changes in proteins are the most important mechanism of evolutionary change (Hoekstra & Coyne, 2007).

With modern molecular tools, we can study the genetic architecture underlying a broad range of phenotypes. Although many of the previous studies have focused on morphology or pigmentation, adaptive variation exists in many other traits. Variation in sensory systems is critical for foraging, mating and predator avoidance (Endler, 1992). Sensory systems are optimized for particular environments, and organisms that inhabit different environments may diverge in sensitivity. Sensory divergence can lead to speciation through the process of sensory drive (Ryan & Rand, 1993; Boughman, 2002). Recently, we have shown that both coding sequence changes and differences in gene expression can alter visual sensitivities (Hofmann et al., 2009). However, we do not know whether changes in gene regulation are the result of cis- or trans-regulatory elements. Understanding the genetic basis of sensory systems is important to understanding how organismal diversity arises and how rapidly it can change. Further, such diversity is important as it can lead to adaptation as well as help drive speciation.

African cichlid fishes are a model for organismal diversity and rapid speciation (Fryer & Iles, 1972; Kocher, 2004; Seehausen, 2006; Salzburger, 2009). Although they are recognized for the diversity of their trophic morphology (Liem, 1991; Albertson et al., 2003a), dentition (Fraser et al., 2008, 2009) and colour patterns (Seehausen, 1996; Konings, 2007), we have found that they also show incredible diversity in visual sensitivity, which is determined by the visual pigments in their photoreceptors (Carleton & Kocher, 2001; Parry et al., 2005; Jordan et al., 2006; Hofmann et al., 2009).

Visual pigments are composed of an opsin protein bound to a chromophore such as 11-cis retinal (Wald, 1935). Cichlids have seven cone opsin genes which produce visual pigments with sensitivities across the spectrum from ultraviolet to red wavelengths. These cone opsins include short wavelength–sensitive (SWS), rhodopsin-like (RH2) and long wavelength–sensitive (LWS) genes. The peak wavelengths (λmax) of the expressed proteins in the riverine cichlid Oreochromis niloticus are as follows: SWS1 (360 nm), SWS2B (423 nm), SWS2A (456 nm), RH2B (472 nm), RH2Aβ (518 nm), RH2Aα (528 nm) and LWS (561 nm) (Spady et al., 2006). Because the RH2Aα and RH2Aβ genes are genetically and functionally similar, we group them together into the RH2A class, so that the seven opsin genes are organized into six distinct classes.

In previous work, we have found that cichlids utilize two different mechanisms for tuning their visual pigments. First, the cichlid opsin–coding sequences vary to produce small spectral shifts in the corresponding visual pigments. These shifts occur primarily in the genes at the very shortest and longest wavelengths, SWS1 and LWS (Hofmann et al., 2009). Shifts of 5–10 nm in the LWS gene help to drive speciation between red and blue cichlid species in Lake Victoria through sensory drive (Carleton et al., 2005; Terai et al., 2006; Seehausen et al., 2008). Second, cichlid opsin genes are differentially expressed to create large spectral shifts and a diversity of visual palettes. Gene expression differs considerably across the cichlid phylogeny within Lake Malawi with differences between genera and even between species within a given genus (Hofmann et al., 2009). This evolutionary lability suggests that it is genetically easy to alter opsin gene expression to spectrally tune visual sensitivity to species ecology. As opsin sequence tuning seems significant for only SWS1 and LWS, where shifts are 5–10 nm, the majority of spectral shifts occur through changes in opsin gene expression. In Lake Malawi, we have found three gene combinations to be most common. We name these gene sets by the spectral sensitivity of the shortest wavelength gene: UV (SWS1, RH2B, RH2A), violet (SWS2B, RH2B, RH2A) and blue (SWS2A, RH2A, LWS).

The expression of cichlid opsins occurs in two morphologically distinct cones: single and double. These are highly ordered in a retinal mosaic which in cichlids is a central single cone surrounded by four pairs of double cones (Fernald, 1981). Typically, the central cone expresses a SWS opsin (either SWS1, SWS2B or SWS2A). The surrounding double cones contain the longer wavelength pigments from expression of RH2B, RH2A or LWS (van der Meer & Bowmaker, 1995; Parry et al., 2005; Jordan et al., 2006; Carleton et al., 2008). Differential opsin gene expression therefore produces a patterned retina, which varies only in the combination of pigments. This patterning might be controlled by cis-regulatory elements, in a manner similar to the regulation of colour patterns found in Drosophila wings (Gompel et al., 2005; Prud’homme et al., 2006).

Studies in other fish species have identified both cis- and trans-regulatory factors that affect opsin expression. SWS1 opsin expression in salmon responds to retinoic acid and thyroid hormone (Browman & Hawryshyn, 1992, 1994a,b). This is consistent with studies that suggest that thyroid hormone is important in the differentiation of mouse cone photoreceptors (Kelley et al., 1995; Ng et al., 2001; Applebury et al., 2007). The expression of these hormones normally varies during development in fish and amphibians, and so they act as trans-regulatory factors for opsins as well as other genes (Jones et al., 2003; Marchand et al., 2004; Paris & Laudet, 2008). Analysis of cis-regulatory regions has identified several potential transcription factor binding sites in zebrafish (Luo et al., 2004; Tsujimura et al., 2007) and salmon (Dann et al., 2004). However, on the whole, it has been difficult to use genetic approaches because model systems such as zebrafish (Schmitt et al., 1999) and mouse (Neitz & Neitz, 2001) do not show sufficient variation in opsin expression.

To unravel the molecular mechanism by which opsin gene expression changes, we have made two genetic crosses between cichlid species which differ in the expressed opsin gene combinations. We quantify variation in the F2 generation to estimate how many genes control opsin expression. We also test for association between opsin expression and genetic markers located both cis and trans to the opsin genes. This study provides preliminary data for a larger QTL study and so utilizes a smaller set of individuals. As such, our goal is not to identify all the controlling factors. Rather, we parameterize the phenotype to establish whether there are few or many controlling factors and so assess whether it is possible to map them. Further, we test whether such factors are cis to the opsin genes, as has been found for many other phenotypes. Finding partial evidence for cis regulation, we perform a preliminary genome scan and find evidence that trans-acting factors are key to opsin gene expression.

Materials and methods

Genetic architecture of factors controlling opsin gene expression

F2 crosses

Two intergeneric crosses were made between cichlid species with different opsin expression. The first cross was between Dimidiochromis compressiceps and Copadichromis eucinostomus (DC cross). These species use the blue (SWS2A, RH2A, LWS) and the UV (SWS1, RH2B, RH2A) gene sets, respectively. This cross between a predator and a zooplanktivore was difficult to make, and we were only able to rear one F2 family of 33 individuals to 15 days post-fertilization. The second cross was between Tramitichromis intermedius and Aulonocara baenschi (TA cross). These species use the blue (SWS2A, RH2A, LWS) and the violet (SWS2B, RH2B, RH2A) gene sets, respectively. This cross was prolific, hardy and easy to raise. We crossed two male T. intermedius and five full sib female A. baenschi to create five F1 families. The F1 families were then intercrossed to raise numerous F2 families. For this study, we analysed 49 F2 individuals that came from five F2 families derived from two F1 families. All individuals were raised to at least 6 months in age (sexual maturity).


Individuals were euthanized with MS-222, the eyes were removed and the dissected retina was placed in RNAlater. Total RNA was extracted (RNeasy Kit, Qiagen, Valencia, CA, USA) and reverse transcribed (Superscript III; Invitrogen, Carlsbad, CA, USA). This cDNA template was then used in quantitative RT-PCR to measure the relative expression of all six opsin classes (SWS1, SWS2B, SWS2A, RH2A, RH2B and LWS) according to our previous methods (Spady et al., 2006; Carleton et al., 2008). The expression of these six gene classes is normalized to give expression of each class relative to the total:

$$ f_i = \frac{T_i}{\sum_j T_j} = \frac{1/(1 + E_i)^{Ct_i}}{\sum_j 1/(1 + E_j)^{Ct_j}} \qquad (1) $$

where Ti is the relative amount of each gene i, Ei is the PCR efficiency of that gene, and Cti is the gene’s critical cycle number. Here, fi is the fraction that each gene contributes to total opsin expression.
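The normalization can be illustrated with a minimal Python sketch. It assumes that the relative amount of each transcript takes the form Ti = 1/(1 + Ei)^Cti; the function name and the efficiency and cycle values are hypothetical and serve only to show the arithmetic.

```python
def relative_expression(efficiencies, ct_values):
    """Return the fraction f_i of total opsin expression for each gene.

    Assumed form: T_i = 1 / (1 + E_i) ** Ct_i, then f_i = T_i / sum(T).
    """
    amounts = {
        gene: 1.0 / (1.0 + efficiencies[gene]) ** ct_values[gene]
        for gene in ct_values
    }
    total = sum(amounts.values())
    return {gene: amount / total for gene, amount in amounts.items()}


# Hypothetical values: perfect doubling (E = 1) for three genes.
efficiencies = {"SWS1": 1.0, "RH2B": 1.0, "RH2A": 1.0}
ct_values = {"SWS1": 20.0, "RH2B": 21.0, "RH2A": 20.0}
fractions = relative_expression(efficiencies, ct_values)
print(fractions)
```

With equal efficiencies, a gene that crosses threshold one cycle later contributes half as much, so RH2B receives half the share of each of the other two genes.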

Gene expression can also be normalized within the single and double cones because single cones express only SWS1, SWS2B and SWS2A, whereas double cones express RH2B, RH2A and LWS. So, for example, the fraction of SWS1 expression in single cones is given by:

$$ f_{SWS1,SC} = \frac{f_{SWS1}}{f_{SWS1} + f_{SWS2B} + f_{SWS2A}} \qquad (2a) $$

whereas the fraction of LWS expression in double cones is given by:

$$ f_{LWS,DC} = \frac{f_{LWS}}{f_{RH2B} + f_{RH2A} + f_{LWS}} \qquad (2b) $$

These normalized measures were used to make triangle plots of single and double cone gene expression.
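A minimal sketch of the within-cone normalization follows. The gene groupings are those given in the text; the expression fractions are hypothetical.

```python
SINGLE_CONE = ("SWS1", "SWS2B", "SWS2A")
DOUBLE_CONE = ("RH2B", "RH2A", "LWS")


def cone_fractions(f, genes):
    """Renormalize expression fractions f within one cone type."""
    total = sum(f[gene] for gene in genes)
    if total == 0:
        raise ValueError("no expression in this cone type")
    return {gene: f[gene] / total for gene in genes}


# Hypothetical fractions of total opsin expression for one individual.
f = {"SWS1": 0.05, "SWS2B": 0.03, "SWS2A": 0.02,
     "RH2B": 0.27, "RH2A": 0.45, "LWS": 0.18}
single = cone_fractions(f, SINGLE_CONE)
double = cone_fractions(f, DOUBLE_CONE)
print(single["SWS1"], double["LWS"])
```

Each set of three renormalized fractions sums to one, which is what allows an individual to be placed as a single point in a triangle plot.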

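To visualize the expression data, we estimated the relative sensitivity of the single and double cones. This utilized known peak sensitivities of each gene weighted by the relative gene expression in each individual (Carleton et al., 2008):

$$ \lambda_{SC} = \frac{f_{SWS1}\lambda_{SWS1} + f_{SWS2B}\lambda_{SWS2B} + f_{SWS2A}\lambda_{SWS2A}}{f_{SWS1} + f_{SWS2B} + f_{SWS2A}} \qquad (3a) $$

$$ \lambda_{DC} = \frac{f_{RH2B}\lambda_{RH2B} + f_{RH2A}\lambda_{RH2A} + f_{LWS}\lambda_{LWS}}{f_{RH2B} + f_{RH2A} + f_{LWS}} \qquad (3b) $$
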
This is not meant to imply that there are actual single and double cones with a sensitivity equal to these values. Rather, it is a convenient way to convert gene expression into biologically relevant wavelengths, which can then be visualized in a two-dimensional space. Gene specific peak wavelengths are taken from expressed opsin proteins for Metriaclima zebra [λSWS1 = 368 nm, λSWS2B = 423 nm, λRH2B = 484 nm, λRH2Aα = 528 nm (Parry et al., 2005)] and for O. niloticus [λSWS2A = 455 nm, λLWS = 561 nm (Spady et al., 2006)]. These values are very similar to the average obtained from microspectrophotometry (MSP) data for 19 species from Lake Malawi (λSWS1 = 371 ± 8 nm, λSWS2B = 418 ± 5 nm, λSWS2A = 452 ± 3 nm, λRH2B = 482 ± 5 nm, λRH2A = 527 ± 7 nm, and λLWS = 565 ± 5 nm) (reviewed in Carleton, 2009). Although we group the RH2Aα and β together, these have similar protein expression peak wavelengths in both M. zebra and O. niloticus (λRH2Aα = 528 nm, λRH2Aβ = 518 nm). MSP data for the Malawi species show green cones with an average λmax = 527 nm, which suggests that the RH2Aα gene predominates.
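The conversion from expression to wavelength can be sketched as an expression-weighted mean of the peak wavelengths quoted above. The weighted-mean form is our reading of the description, and the expression fractions below are hypothetical.

```python
# Peak wavelengths (nm) quoted in the text for M. zebra and O. niloticus.
PEAK_NM = {"SWS1": 368.0, "SWS2B": 423.0, "SWS2A": 455.0,
           "RH2B": 484.0, "RH2A": 528.0, "LWS": 561.0}
SINGLE_CONE = ("SWS1", "SWS2B", "SWS2A")
DOUBLE_CONE = ("RH2B", "RH2A", "LWS")


def mean_cone_wavelength(f, genes, peaks=PEAK_NM):
    """Expression-weighted mean peak wavelength for one cone type."""
    total = sum(f[gene] for gene in genes)
    return sum(f[gene] * peaks[gene] for gene in genes) / total


# Hypothetical fractions of total opsin expression for one individual.
f = {"SWS1": 0.05, "SWS2B": 0.03, "SWS2A": 0.02,
     "RH2B": 0.27, "RH2A": 0.45, "LWS": 0.18}
lambda_sc = mean_cone_wavelength(f, SINGLE_CONE)
lambda_dc = mean_cone_wavelength(f, DOUBLE_CONE)
print(round(lambda_sc, 1), round(lambda_dc, 1))
```

Each individual then becomes one point in single/double cone wavelength space, which is the representation used to compare crosses.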

Number of genes controlling expression

The Castle–Wright estimator was used to calculate a lower bound to the number of factors controlling expression of each opsin gene (Wright, 1968; Lande, 1981; Lynch & Walsh, 1998). This method makes several assumptions, including that one parent’s alleles increase the trait and the other’s alleles decrease the trait, the alleles act additively, loci are unlinked, and all loci have equal allelic effects. These assumptions may not hold true in outbred crosses, such as those used here, and so this method provides only a lower limit to the number of controlling factors (Zeng et al., 1990; Zeng, 1992). We applied the Castle–Wright equation as formalized by Lande (1981):


where ne is the number of factors controlling each gene, μ is mean gene expression in parental species A or B, and σ is the variance in expression of the parentals (A, B), F1 or F2. We estimated numbers from the fraction of the single or double cone expression (fi, SC and fi, DC). For the DC cross, we phenotyped two individuals of each parental species as well as 33 F2 which were 15 dpf. We did not have any F1 to phenotype and so have used the larger of the parental variances for each gene as an estimate for σF1. This could lead to further underestimates in gene numbers, beyond the caveats already inherent in the estimator (Lynch & Walsh, 1998). For the TA cross, we phenotyped two T. intermedius and 10 A. baenschi individuals, 28 F1 and 48 F2. All were phenotyped at greater than 6 months of age, when they are considered adult.
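As a hedged illustration of the estimator, the sketch below assumes the simplest form of the Castle–Wright equation, in which the segregation variance is the F2 variance minus the F1 variance: ne = (μA − μB)² / [8(σ²F2 − σ²F1)]. This form is an assumption of the sketch and omits any correction for sampling error in the parental means. The inputs are the means and variances listed in Table 3.

```python
def castle_wright(mean_a, mean_b, var_f2, var_f1):
    """Lower-bound estimate of the number of factors, n_e.

    Assumed form: (mean_a - mean_b)^2 / (8 * (var_f2 - var_f1)).
    """
    segregation_variance = var_f2 - var_f1
    if segregation_variance <= 0:
        raise ValueError("F2 variance must exceed F1 variance")
    return (mean_a - mean_b) ** 2 / (8.0 * segregation_variance)


# LWS in the TA cross (Table 3B): parental means, Var F2, Var F1.
ne_lws_ta = castle_wright(49.03, 0.71, var_f2=263.07, var_f1=186.38)
# SWS2A in the DC cross (Table 3A), with Var F1 set to the larger parental variance.
ne_sws2a_dc = castle_wright(88.09, 0.01, var_f2=22.43, var_f1=16.30)
print(ne_lws_ta, ne_sws2a_dc)
```

Under this assumed form, the Table 3 values give roughly four factors for LWS in the TA cross and over 150 for SWS2A in the DC cross. The second value shows how a small segregation variance inflates the estimate.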

We were able to estimate additional parameters for the TA cross where we have a good estimate of F1 variance. First, we estimated the variance of ne from (Lynch & Walsh, 1998; p. 234):




We were also able to estimate the degree of dominance in the TA cross from the means of the F1 and the parentals. Here, dominance is calculated as d = (μF1 − μT)/(μA − μT) − 0.5, where μF1 is the mean of the F1 and μT and μA are the means of the T. intermedius and A. baenschi parentals, respectively. With this orientation, positive values of d indicate F1 expression closer to that of A. baenschi. This was calculated separately for each opsin gene. The Castle–Wright equation assumes additivity of allelic effect. Dominance effects can be included and act to increase the effective number of factors (Wright, 1968, p. 384). If h = d + 0.5, the number of factors with dominance is:


This effect was taken into account in estimates of ne for the TA cross.
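The dominance calculation can be checked against the TA cross means in Table 3. The sketch assumes that the F1 mean is measured from the T. intermedius mean towards the A. baenschi mean, which is the orientation that reproduces the tabulated d values; the function name is hypothetical.

```python
def dominance(mean_f1, mean_t, mean_a):
    """d = (F1 - T)/(A - T) - 0.5; 0 is additive, positive is closer to A."""
    return (mean_f1 - mean_t) / (mean_a - mean_t) - 0.5


# (Ti mean, Ab mean, F1 mean) for each gene, from Table 3B.
ta_means = {
    "SWS2B": (18.47, 98.06, 70.41),
    "SWS2A": (80.85, 0.77, 27.31),
    "RH2B": (3.19, 16.37, 5.96),
    "RH2A": (47.78, 82.92, 52.89),
    "LWS": (49.03, 0.71, 41.15),
}
d_values = {
    gene: round(dominance(f1, t, a), 2)
    for gene, (t, a, f1) in ta_means.items()
}
print(d_values)
```

The rounded values agree with the dominance row of Table 3 (0.15, 0.17, −0.29, −0.35, −0.34).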

Genetic mapping of factors controlling opsin gene expression

Test of cis regulation in both DC and TA crosses

To determine whether opsin gene expression is controlled by cis-regulatory factors, we genotyped microsatellite markers near the opsin genes. The opsin genes are located in three genomic regions (Lee et al., 2005). SWS1 is located on linkage group (LG) 17. RH2B – RH2Aα – RH2Aβ are in a 26-kb tandem array on LG5. SWS2A – SWS2B – LWS is an 18-kb tandem array also on LG5, although approximately 30 cM away from the RH2 array. The location and primers used to amplify each marker are given in Table 1. The UV marker used to genotype the SWS1 gene is in an intron of that gene and the same marker worked for both crosses. The other two opsin clusters (RH2 and SWS2-LWS) required different primer sets in each cross to successfully amplify a closely linked marker in an intergenic region. The most distant marker (redblue4) was 22 kb from the end of the LWS gene and therefore 38 kb from the beginning of the SWS2A – SWS2B – LWS array. These distances are still quite close genetically. The cichlid genetic map is 1300 cM for 10^9 bp such that 1 cM ≈ 750 kb (Lee et al., 2005). In scoring 48 individuals (96 meioses), we expect fewer than 0.05 recombinants between the redblue4 marker and the beginning of SWS2A. These markers are therefore in tight linkage with the entire region.
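The linkage argument is simple arithmetic, sketched below. It assumes that over such short distances 1 cM corresponds to a 1% chance of recombination per meiosis; the function names are hypothetical.

```python
GENOME_SIZE_BP = 1e9      # approximate genome size used in the text
MAP_LENGTH_CM = 1300.0    # genetic map length (Lee et al., 2005)


def kb_per_centimorgan(genome_bp=GENOME_SIZE_BP, map_cm=MAP_LENGTH_CM):
    """Average physical distance per cM, in kb."""
    return genome_bp / map_cm / 1e3


def expected_recombinants(distance_kb, meioses, kb_per_cm=750.0):
    """Expected recombinants, assuming 1 cM = 1% recombination per meiosis."""
    distance_cm = distance_kb / kb_per_cm
    return distance_cm / 100.0 * meioses


scale = kb_per_centimorgan()            # about 770 kb per cM, rounded to 750 in the text
expected = expected_recombinants(38.0, 96)
print(scale, expected)
```

For the most distant marker (38 kb, 96 meioses) the expectation is just under 0.05 recombinants, so a crossover between marker and opsin array is very unlikely in a cross of this size.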

Table 1.   Genotyping primers for markers close to the opsin genes. Primer sequences and location relative to the opsin genes are listed.

Marker    Cross    Location and primer sequences
          Both     Third intron of SWS1
          DC       4.9 kb from end of RH2B and 3.8 kb from beginning of RH2Aα
          TA       11.4 kb from end of RH2Aβ
          DC       22 kb from end of LWS
          TA       7.6 kb from end of LWS

SWS, short wavelength–sensitive; RH2, rhodopsin-like; LWS, long wavelength–sensitive.

Search for trans-acting regions

We performed a low pass search in the DC cross to identify chromosome regions which might control opsin expression. We used a combination of randomly selected markers across the genome, as well as a comparative approach to select fish chromosomes identified as containing candidate genes thought to be important for opsin expression (thyroid hormone receptors, retinoic acid receptors). Markers were selected either from the O. niloticus map (Lee et al., 2005) or from a map made for two Malawi rock-dwelling cichlids (Albertson et al., 2003a). This included 13 markers, three that were cis to the opsin genes and 10 that were located across the genome, in trans to the opsin regions. Additional markers were tested in regions showing positive association, including three on LG4 and 11 on LG13. The opsin markers as well as markers in the regions of association were then tested in the TA cross.

Single marker association

Association between each marker and opsin expression was tested using analysis of variance (anova). Individuals were grouped by genotype, and then an anova was performed to test whether gene expression for each gene differed by genotype. For the DC cross, this was performed for expression of all six genes. In the TA cross, we performed single marker association for expression of all five opsin genes, leaving out SWS1, which was not expressed. Because each gene is a unique data set, those tests are independent. However, each gene is tested for association at multiple markers, and so we need to correct for multiple tests. We make the additional assumption that linked markers are not independent because their genotypes are tightly correlated. The linked markers on LG4 (UNH2126, UNH2156) and LG13 (GM507, UNH871, GM641) were therefore treated as single tests. For each gene, we use sequential Bonferroni correction to test for significance (Hochberg, 1988). The marker tests are ordered by significance value and compared to the significance threshold determined by:


where α = 0.05 and j varies from 1 up to m, the number of markers/regions tested.
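A minimal sketch of the sequential procedure follows. It assumes that the threshold for the j-th smallest P value is α/(m − j + 1) and applies the step-up rule of Hochberg (1988): find the largest rank whose P value meets its threshold, then declare that test and all tests with smaller P values significant. The region names and P values are hypothetical.

```python
def hochberg_significant(p_values, alpha=0.05):
    """Return the set of tests declared significant.

    p_values maps a test name to its P value. The j-th smallest P value
    is compared with alpha / (m - j + 1) (assumed threshold form).
    """
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ranked)
    cutoff_rank = 0
    for j, (_, p) in enumerate(ranked, start=1):
        if p <= alpha / (m - j + 1):
            cutoff_rank = j  # keep the largest rank that passes
    return {name for name, _ in ranked[:cutoff_rank]}


# Hypothetical P values for one gene tested at four markers/regions.
tests = {"region_1": 0.003, "region_2": 0.04, "region_3": 0.20, "region_4": 0.60}
significant = hochberg_significant(tests)
print(significant)
```

Here only region_1 passes, because 0.003 is below 0.05/4 whereas 0.04 exceeds 0.05/3. Treating linked markers as one region reduces m and so relaxes every threshold.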

Map order

For the markers that were associated with opsin expression and located on LG13, we determined the map order for both the DC and the TA cross and compared that with the tilapia map order of Lee et al. (2005). The DC/TA map order was determined from the recombination rates between markers along with the map order giving the largest likelihood.

The likelihood or LOD score for linkage between two markers is given by:


where R is the number of recombinants between the two markers, NR is the number of nonrecombinants, and θ is the recombination fraction = R/(R + NR). The LOD score for a given map order of three markers is just the sum of the LOD score for the two intervals, if interference is neglected (Ott, 1991). For the LG13 region, we are interested in the map order of GM507, UNH871 and GM641. Because there are no differences in the degrees of freedom between marker orders, we cannot use a likelihood ratio test to assess order. However, we can calculate the probability of marker order 2 vs. marker order 1 from

Pr (order2 vs. order1) = exp(LOD2 − LOD1)

where LOD1 and LOD2 are taken as the scores for the tilapia map order and the Malawi cross (TA or DC) map order, respectively (Liu, 1998).
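The map-order comparison can be sketched as follows. The two-point score uses the conventional base-10 form, LOD = log10[θ^R (1 − θ)^NR / 0.5^(R + NR)], which is an assumption of this sketch. The relative probability of two orders follows the expression given in the text. The function names are hypothetical.

```python
import math


def two_point_lod(recombinants, nonrecombinants):
    """Conventional two-point LOD score with theta = R / (R + NR)."""
    n = recombinants + nonrecombinants
    theta = recombinants / n
    lod = n * math.log10(2.0)  # the -log10(0.5 ** n) term
    if recombinants > 0:
        lod += recombinants * math.log10(theta)
    if nonrecombinants > 0:
        lod += nonrecombinants * math.log10(1.0 - theta)
    return lod


def order_lod(intervals):
    """Sum of interval LODs for one marker order (interference neglected)."""
    return sum(two_point_lod(r, nr) for r, nr in intervals)


def order_odds(lod_order1, lod_order2):
    """Pr(order2 vs. order1) = exp(LOD2 - LOD1), as given in the text."""
    return math.exp(lod_order2 - lod_order1)


odds_dc = order_odds(3.6, 12.7)   # rounded LOD scores reported for the DC cross
odds_ta = order_odds(14.6, 24.4)  # rounded LOD scores reported for the TA cross
print(odds_dc, odds_ta)
```

With the rounded LOD scores reported in the Results, the odds come out near 9 × 10^3 and 1.8 × 10^4, in line with the quoted 9700 and 17 700; the small difference reflects rounding of the scores.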


Gene expression in DC cross

Opsin gene expression was highly variable among the 33 F2 of the DC cross, as shown in the triangle plots (Fig. 1). F2 expression for RH2B and RH2A varied between the extreme values of the parentals. F2 expression for SWS1 and SWS2A fell short of the parental range. This was most significant for SWS2A, where F2 expression was very low (max of 6.8%), in spite of strong expression in the D. compressiceps parent (20.15%). In contrast, expression of SWS2B was transgressive. The F2 showed more expression (0.6–28.9%) than either of the parental species (0% and 2.4%). This is seen in the triangle plot of Fig. 1a, where the parentals fall close to either the SWS1 or SWS2A vertices and yet the F2 vary along the SWS1–SWS2B axis.

Figure 1.

 Triangle plots of single (left) and double (right) cones for the DC and TA crosses showing parental and F2 gene expression. For the DC cross, the parental symbols denote Dimidiochromis and Copadichromis (▪). For the TA cross, they denote Tramitichromis and Aulonocara (▪).

Gene expression in the TA cross

Gene expression was also highly variable in the TA cross. The expression of the SWS2A and SWS2B in single cones varied across the full range defined by the parentals. As expected from the lack of SWS1 in the parentals, there was no significant SWS1 expression in either the F1 or F2. Expression for the double cone genes (RH2B, RH2A and LWS) extended somewhat beyond that of the parentals as shown in the triangle plots of Fig. 1b. However, we do not see the transgressive expression in this cross that we observed in the DC cross.

Gene expression in wavelength space

Plots of gene expression were made for each cross in single/double cone wavelength sensitivity space. This space allows us to look at correlations between the single and double cones and to compare the crosses to our previous work (Carleton et al., 2008; Hofmann et al., 2009). As shown in Fig. 2, the F2 of both crosses varied across the range set by the values of the parental species. This is true for both single and double cone wavelengths. This large variation in the F2 is consistent with the idea that a small number of genes control opsin expression.

Figure 2.

 A plot of the average single vs. average double cone wavelengths for the DC and TA crosses including parentals and F2. For the TA cross, F1 are also shown. Average wavelengths are calculated based on individual gene expression using eqns 3a and 3b.

Correlations among genes

Expression of each opsin was normalized to total opsin expression. This normalization will cause some inherent gene correlations, particularly between the highly expressed double cone genes, RH2B, RH2A and LWS. Here, we note correlations between genes in the single and double cones, which are unlikely to result from normalization (Table 2). For the DC cross, there is a positive correlation between expression of SWS1 and RH2B and a negative correlation of SWS1 with SWS2B and LWS. For the TA cross, RH2B is positively correlated with SWS2B and negatively correlated with SWS2A. The single cone–double cone sensitivity plot (Fig. 2) also shows evidence of gene correlations. Individuals with short wavelength single cones tend to have short wavelength double cones, whereas those with long wavelength single cones have long wavelength double cones. This is most obvious for the DC cross, where the regression coefficient for single and double cone wavelength is 0.55 (F1,31 = 38.2; P = 7.3 × 10^−7). However, the regression coefficient for the TA cross is only 0.06 (F1,47 = 3.12; P = 0.084).

Table 2.   Gene correlations. The F statistic and P value for the regressions are listed for each opsin gene combination (*P < 0.01, **P < 10^−4). If the P value is statistically significant, the regression coefficient is also listed.

(A) DC cross (F1,31)
Gene pair           F       P
LWS vs. RH2A        2.7     0.11
LWS vs. RH2B        61.3    7.8e−9**
LWS vs. SWS2A       1.4     0.24
LWS vs. SWS2B       8.2     0.007*
LWS vs. SWS1        16.3    3.3e−4**
RH2A vs. RH2B       1.3     0.26
RH2A vs. SWS2A      0.05    0.83
RH2A vs. SWS2B      0.01    0.92
RH2A vs. SWS1       1.3     0.27
RH2B vs. SWS2A      3.1     0.09
RH2B vs. SWS2B      29.5    6.2e−6**
RH2B vs. SWS1       33.5    2.3e−6**
SWS2A vs. SWS2B     4.0     0.053
SWS2A vs. SWS1      8.2     0.007*
SWS2B vs. SWS1      60.2    9.3e−9**

(B) TA cross (F1,48)
Gene pair           F       P
LWS vs. RH2A        128.2   3.7e−15**
LWS vs. RH2B        0.01    0.94
LWS vs. SWS2A       0.003   0.96
LWS vs. SWS2B       0.4     0.55
LWS vs. SWS1        3.6     0.06
RH2A vs. RH2B       11.7    0.001*
RH2A vs. SWS2A      4.8     0.03
RH2A vs. SWS2B      6.9     0.01
RH2A vs. SWS1       3.1     0.08
RH2B vs. SWS2A      30.0    1.6e−6**
RH2B vs. SWS2B      39.4    9.4e−8**
RH2B vs. SWS1       0.05    0.83
SWS2A vs. SWS2B     161.3   5.8e−17**
SWS2A vs. SWS1      2.1     0.16
SWS2B vs. SWS1      0.8     0.37

SWS, short wavelength–sensitive; RH2, rhodopsin-like; LWS, long wavelength–sensitive.

Castle–Wright estimates of genetic factors

The parentals of both the TA and DC crosses differ in gene expression by substantial amounts. For the TA cross, the parental differences given in standard deviations are as follows: SWS2B 37.8, SWS2A 152, RH2B 7.6, RH2A 16.3, and LWS 41.1. These large differences support the assumption that the alleles of the two parentals work in opposite directions and favour the applicability of the Castle–Wright estimator. However, the other caveats of additivity, linkage and differential parental contributions can still confound these results (Zeng et al., 1990; Zeng, 1992).

We have estimated the number of factors controlling opsin expression using the parental means and the variances of each generation (Table 3). For the DC cross, we use the larger of the F0 variances as an estimate of variances in F1. For the majority of the genes in both crosses, we estimate that 1–2 genetic factors control the differences in expression between species. This estimate is a lower bound, but it supports the idea that a small number of genes control opsin gene expression.

Table 3.   Castle–Wright estimates of number of factors (ne) controlling expression of each of the opsins. For the TA cross, we also list the variance of ne (Var ne), dominance (d) and ne estimated with the effects of dominance (ne,dom).

(A) DC cross
                 SWS1      SWS2B     SWS2A     RH2B      RH2A      LWS
Dc mean          1.25      10.67     88.09     0.45      38.16     38.51
Ce mean          99.98     0.01      0.01      46.73     15.48     0.00
Var Dc           0.00      0.14      16.30     0.00      14.72     0.29
Var Ce           3.61      0.00      0.00      1.69      0.36      0.00
Var F2           958.03    835.65    22.43     141.35    44.68     132.68
Var F1*          3.61      0.14      16.3      1.69      14.72     0.29

*Estimated as the larger of Var Dc or Var Ce.

(B) TA cross
                 SWS2B     SWS2A     RH2B      RH2A      LWS
Ti mean          18.47     80.85     3.19      47.78     49.03
Ab mean          98.06     0.77      16.37     82.92     0.71
Var Ti           5.
Var Ab           4.41      0.28      3.03      4.65      1.38
Var F2           885.28    899.54    40.53     312.23    263.07
Var F1           184.79    176.27    15.01     162.89    186.38
F1 mean          70.41     27.31     5.96      52.89     41.15
Dominance, d     0.15      0.17      −0.29     −0.35     −0.34

SWS, short wavelength–sensitive; RH2, rhodopsin-like; LWS, long wavelength–sensitive.
There are two exceptions which suggest more complex genetic control. The first is the SWS2A gene in the DC cross where we estimate over 150 genes control expression. This is likely an artefact because the F2 do not show significant expression of this gene, leading to a skewed estimate of the number of genes. The second is the LWS gene in the TA cross where we estimate four genes control expression.


There was some evidence of dominance in the control of gene expression. This was estimated in the TA cross where we have expression data for the F1. Values were d = 0.15 (SWS2B), 0.17 (SWS2A), −0.29 (RH2B), −0.35 (RH2A) and −0.34 (LWS). Here, the single cone genes have positive values, and are closer to Aulonocara gene expression levels, whereas the double cone genes have negative values, and are shifted towards the levels of gene expression found in Tramitichromis. This suggests a small but significant effect of dominance in the expression of these genes. Including dominance in the estimate of controlling factors only slightly increases their values, and we still estimate 1–2 factors controlling expression for all but the LWS gene (Table 3).

Genetic mapping of regulatory loci

Single marker association tests for the opsin markers in the DC cross (Table 4) found no association between opsin markers and opsin gene expression. This suggests that cis regulation is not important for this cross. However, this result could be limited by our having only 33 individuals available for this cross. Single marker tests with 10 random markers did identify two trans-acting genomic regions associated with gene expression. The first region was identified with marker UNH871 on LG13 and was associated with SWS1, SWS2B and SWS2A expression. This marker was also linked to LWS expression and was nearly significant for RH2B expression as well. We selected 10 additional markers on LG13 to test for association, although only four of these amplified and showed variation in this cross. Two of these markers were also associated with SWS expression, and all three had P values <0.05. We consider these three markers to be linked and so treat them as a single locus with regard to Bonferroni correction for significance. A second region was identified with marker UNH2156 on LG4 and was associated with RH2A/LWS expression. This association was highly significant (F2,28 = 7.15; P = 0.003). We tested three additional markers on LG4, one of which (UNH2126) also showed significance. After sequential Bonferroni correction, only the region on LG4 is significantly associated with opsin expression.

Table 4.   Test for single marker association between markers at the opsins or elsewhere in the genome and gene expression for each opsin. Linkage group (LG) (and position in cM) is taken from the tilapia map (Lee et al., 2005). The relative marker positions on LG4 are taken from the mbuna map of Albertson et al. (2003a), relative to the known position of UNH911 on the tilapia map. For each marker, we list the number of individuals that were successfully genotyped (N), the degrees of freedom for the F statistic, and then for each gene we list the F statistic and P value. Markers that are statistically significant after sequential Bonferroni correction are shown in bold (P values are given in Supplementary Table S1).

(A) DC cross (**P < 0.01)
                                            F statistic
Marker           LG (cM)    N     d.f.      SWS1      SWS2B     SWS2A     RH2B      RH2A      LWS
Green opsin      5 (18)     26    3, 22     0.699     0.658     0.789     0.750     1.392     0.084
Red-blue opsin   5 (48)     29    2, 26     0.066     0.070     1.160     0.628     0.524     0.502
UV opsin         17         28    2, 25     0.549     0.610     0.113     0.009     2.394     0.631
UNH2008          2          28    2, 25     1.769     1.627     2.935     1.099     1.986     0.403
UNH2016          4 (16)     31    2, 28     0.061     0.046     0.530     0.746     1.461     0.340
UNH2126          4 (19)     29    2, 26     0.481     1.250     3.004     0.627     5.988     3.092
UNH2156          4 (20)     31    2, 28     0.013     0.015     0.424     0.748     7.152**   4.274
UNH911           4 (22)     NS
UNH2003          6          29    2, 26     0.400     0.411     0.094     0.051     0.815     0.055
UNH2164          6          31    2, 28     0.015     0.090     1.237     0.442     0.943     1.332
UNH2153          7          32    1, 30     1.267     1.097     0.867     0.088     0.005     0.060
UNH129           8          32    3, 28     2.073     2.020     2.396     2.143     0.849     1.541
UNH875           9          29    2, 26     0.001     0.008     0.683     0.080     0.087     0.010
UNH2084          12         32    2, 29     1.715     1.579     2.728     1.878     0.204     2.572
UNH934           13 (10)    31    3, 27     0.525     0.393     1.046     0.610     0.118     0.381
GM373            13 (20)    31    1, 29     0.909     0.699     1.209     0.724     0.004     0.722
UNH173           13 (22)    NA
GM641            13 (25)    32    2, 29     4.342     3.800     3.460     2.340     0.436     1.684
UNH997           13 (31)    NV
GM262            13 (32)    32    1, 30     1.033     0.724     2.033     2.307     0.920     0.800
GM507            13 (32)    30    2, 27     0.916     0.883     0.823     0.916     1.969     2.056
UNH954           13 (32)    NV
GM686            13 (33)    28    1, 26     4.779     5.527     0.047     2.574     0.320     1.405
UNH1008          13 (34)    NA
GM299            13 (39)    NV
UNH871           13 (41)    29    2, 26     3.883     3.041     3.703     3.293     0.288     3.587
UNH2166          15         30    3, 27     0.274     0.316     0.638     1.468     0.955     3.072

(B) TA cross (*P < 0.05; **P < 0.01)
                                            F statistic
Marker           LG (cM)    N     d.f.      SWS2B     SWS2A     RH2B      RH2A      LWS
Green opsin      5          47    2, 44     0.500     0.439     3.901     0.633     0.614
Red-blue opsin   5          47    2, 44     4.637     4.756     5.503*    2.033     0.529
UV opsin         17         44    2, 41     1.644     1.729     1.182     0.021     0.063
UNH2156          4          45    2, 42     0.244     0.245     0.504     0.838     1.099
GM373            13 (2)     NS    2, 44
GM641            13 (25)    47    2, 44     6.415**   6.338**   0.750     0.415     0.766
GM262            13 (32)    NV
GM507            13 (32)    47    2, 44     6.998**   7.520**   0.986     1.299     0.765
GM686            13 (33)    NA
GM299            13 (39)    NS
UNH871           13 (41)    47    2, 44     7.638**   7.666**   0.354     0.846     0.999

SWS, short wavelength–sensitive; RH2, rhodopsin-like; LWS, long wavelength–sensitive; NA, no amplification; NS, not scorable; NV, not variable in cross.

Single marker associations were also tested in the TA cross for regions containing the opsin genes as well as the regions positively associated in the DC cross (Table 4). We found positive association for a marker near red opsin with expression of SWS2B (F2,44 = 4.63, P = 0.015), SWS2A (F2,44 = 4.75, P = 0.013) and RH2B (F2,44 = 5.50, P = 0.007). The associations with SWS2B and SWS2A fell just short of significance after sequential Bonferroni correction. We also found a marker near green opsin with positive association (RH2B, F2,44 = 3.90, P < 0.03), although this was not significant after sequential Bonferroni correction. Nevertheless, these loci suggest that cis regulation may be important for this cross. We also tested the two strongest trans-acting regions identified in the DC cross and found strong association for markers on LG13 (P < 0.004). However, there was no association with the marker on LG4. Therefore, both crosses showed strong association to a non-opsin region, LG4 for the DC cross and LG13 for the TA cross.

Figure 3 summarizes the results for the LG13 genomic region using several different maps. In Fig. 3a, we show markers on LG13 ordered by the tilapia reference map, noting the P values for association with SWS1 (DC cross) or SWS2B (TA cross) opsin expression. This figure also shows markers that were tried but either did not amplify (NA), were not variable (NV) or were not scorable (NS). In Fig. 3b, we show the maps determined from the recombination fractions among the three LG13 markers for each cross. There is strong evidence for an inversion in this region, with the Malawi species marker order changed relative to the tilapia reference map. The Malawi marker order is 9700 times more likely than the tilapia marker order for the DC cross (LODMalawi = 12.7; LODtilapia = 3.6) and 17 700 times more likely for the TA cross (LODMalawi = 24.4; LODtilapia = 14.6). For both crosses, GM641 is closest to UNH871 and more distant from GM507. In addition, GM507 is even more distant from UNH871 in the DC cross than in the TA cross, which perhaps explains why GM507 did not show association with opsin expression in the DC cross.
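The relative support for the two marker orders follows from the difference between their log-likelihood scores. The short Python sketch below shows the arithmetic. It assumes the scores are natural-log likelihoods, because that convention approximately reproduces the reported ratios from the rounded values; the base used by the mapping software is not stated in this section, and the function name is ours.

```python
import math


def likelihood_ratio(score_a, score_b, base=math.e):
    """Ratio of two likelihoods given their log scores in the stated base."""
    return base ** (score_a - score_b)


# Scores quoted in the text (rounded to one decimal place).
dc_ratio = likelihood_ratio(12.7, 3.6)    # DC cross, Malawi vs tilapia order
ta_ratio = likelihood_ratio(24.4, 14.6)   # TA cross, Malawi vs tilapia order

# For comparison only: the same difference read as a base-10 LOD score.
dc_ratio_base10 = likelihood_ratio(12.7, 3.6, base=10)
```

With the rounded scores, the natural-log reading gives ratios in the thousands to tens of thousands, in line with the reported values, whereas a base-10 reading would give ratios above 10^9.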

Figure 3. Genomic region on LG13 showing association with opsin expression. In (a), markers are given in the order of the tilapia LG13 map. Markers are labelled with P values for association with SWS1 (DC cross) or SWS2B (TA cross) expression. In (b), maps are based on recombination events for the actual TA or DC cross individuals for scored markers. SWS, short wavelength–sensitive.

In Table 5, we quantify the fraction of variation in opsin expression which is explained by markers showing an association with opsin gene expression. This is determined from the regression coefficient of phenotype (gene expression) on genotype. In the TA cross, the tightly linked markers significantly explain up to 25% of the variation. The fraction of variation explained in the DC cross is also significant with contributions up to 33% (LG4). However, we must note that it is possible that these contributions are overestimates as a result of the Beavis effect whereby QTL studies in small crosses yield larger relative contributions (Beavis, 1998).
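As an illustration of how the fraction of variance explained relates to the F statistic in a single-marker regression, the following Python sketch regresses expression on genotype. The additive 0/1/2 genotype coding, the function name and the data are hypothetical and are not taken from the crosses.

```python
def marker_regression(genotypes, expression):
    """Least-squares regression of expression on genotype.

    Returns the slope, R squared and the F statistic on (1, n - 2) d.f.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(expression) / n
    s_xx = sum((g - mean_g) ** 2 for g in genotypes)
    s_yy = sum((y - mean_y) ** 2 for y in expression)
    s_xy = sum((g - mean_g) * (y - mean_y)
               for g, y in zip(genotypes, expression))
    slope = s_xy / s_xx
    # Fraction of the phenotypic variance explained by the marker.
    r_squared = s_xy ** 2 / (s_xx * s_yy)
    # F test of the slope with 1 and n - 2 degrees of freedom.
    f_stat = r_squared / (1 - r_squared) * (n - 2)
    return slope, r_squared, f_stat


# Hypothetical data: genotype coded as the number of alleles from one parent.
genotypes = [0, 0, 1, 1, 2, 2]
expression = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0]
slope, r_squared, f_stat = marker_regression(genotypes, expression)
```

Rearranging the last step gives R2 = F / (F + d.f.), where d.f. is the residual degrees of freedom. As a rough check, F1,27 = 8.97 corresponds to 8.97 / (8.97 + 27), or about 0.25.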

Table 5.   Regression statistics for markers with positive association, including regression coefficient (R2), F statistic and P value.
(A) DC cross
UNH2126 | 4 | 0.25
F1,27 = 8.97
F1,27 = 6.28
UNH2156 | 4 | 0.33
F1,29 = 14.51
F1,29 = 6.56
F1,30 = 0.027
F1,30 = 0.005
F1,30 = 2.32
F1,26 = 4.78
F1,26 = 5.53
F1,27 = 0.83
F1,27 = 2.2
F1,27 = 0.05
  1. SWS, short wavelength–sensitive; RH2, rhodopsin-like; LWS, long wavelength–sensitive; LG, linkage group.

  2. *P < 0.05; **P < 0.01.

(B) TA cross
Red opsin | 5 | 0.16
F45,1 = 8.34
F45,1 = 8.70
F45,1 = 7.04
F45,1 = 12.99
F45,1 = 12.66
F45,1 = 14.2
F45,1 = 15.2
F45,1 = 14.8
F45,1 = 14.6


Opsin expression is labile

Our previous work suggests that opsin expression is highly variable. It can vary between adults of different species, as observed in cichlid species from Lake Malawi, as well as developmentally, as seen in the riverine tilapia, O. niloticus (Carleton et al., 2008). Opsin gene expression can even differ amongst closely related species within the same genus (Hofmann et al., 2009).

In this work, we examine the genetic architecture underlying opsin gene expression and find that expression is controlled by relatively few genes. Opsin expression in the F2 of both crosses shows large variation, essentially covering the full range defined by the parentals. Castle–Wright estimates suggest that the variance in gene expression is controlled by a small number of genetic factors, typically just 1–2 genes. One possible concern in making these estimates is the small number of parentals that we have for the DC cross and for T. intermedius in the TA cross. However, adding parental individuals (and thereby increasing parental variance) would only decrease the estimated number of genes, because the parental variance is subtracted in the numerator of the Castle–Wright estimator. Therefore, the estimate of a small number of genes is robust to our experimental sampling biases. This simple genetic architecture could facilitate evolutionary change in opsin expression between species as well as contribute to the evolutionary lability of opsin expression observed within the Malawi cichlid flock.
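The dependence of the estimate on parental variance can be seen in a minimal Python sketch of a Castle–Wright-type estimator. The form below is one commonly cited version, in which the sampling variances of the parental means are subtracted in the numerator and the segregation variance is taken as Var(F2) minus Var(F1). The exact form used for our estimates is not reproduced here, and the function name and all data values are hypothetical.

```python
from statistics import mean, variance


def castle_wright(p1, p2, f1, f2):
    """Estimate the effective number of factors from cross data."""
    # Squared difference between the two parental means.
    diff_sq = (mean(p1) - mean(p2)) ** 2
    # Sampling variance of each parental mean (sample variance / n).
    var_mean_p1 = variance(p1) / len(p1)
    var_mean_p2 = variance(p2) / len(p2)
    # Segregation variance, assumed here to be Var(F2) - Var(F1).
    var_seg = variance(f2) - variance(f1)
    return (diff_sq - var_mean_p1 - var_mean_p2) / (8 * var_seg)


# Hypothetical expression values for the two parentals, the F1 and the F2.
p1 = [1.0, 3.0]
p2 = [9.0, 11.0]
f1 = [5.0, 6.0, 7.0]
f2 = [2.0, 4.0, 6.0, 8.0, 10.0]
n_factors = castle_wright(p1, p2, f1, f2)

# Same parental means but a more variable first parental sample.
n_factors_noisier = castle_wright([0.0, 4.0], p2, f1, f2)
```

In this toy example, the estimate is below one factor, and increasing the variance of one parental sample while holding the means fixed lowers it further, which is the direction of the effect described above.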

In addition, we have found that expression of certain opsin genes is highly correlated. We observed positive correlations between SWS1 and RH2B in the DC cross, and between SWS2B and RH2B in the TA cross. We also observed correlations between single and double cone wavelength, particularly in the DC cross. These correlations might explain why a few combinations of opsins are prevalent among cichlid species. Three opsin gene combinations are widespread in the three major cichlid radiations of Lakes Malawi, Victoria and Tanganyika: UV (SWS1, RH2B, RH2A), violet (SWS2B, RH2B, RH2A) and blue (SWS2A, RH2A, LWS). These combinations also occur at different developmental stages, most notably in O. niloticus (Carleton et al., 2008). If certain genes are temporally correlated through development and correlated in different species, it seems possible that expression of the correlated opsin genes is controlled by the same transcription factor binding to a common promoter element for these genes. We are currently searching for commonalities in the promoter regions of correlated genes.

We estimate that there are 1–2 factors controlling expression of most of the opsin genes. As there are six gene classes, this suggests 6–12 factors could be important. However, if transcription factors are indeed shared between genes, the number of genetic factors controlling opsin expression might be smaller. This commonality in transcription would further contribute to rapid changes in gene expression over evolutionary time, as well as to the evolution of a few key visual gene palettes.

Genetic basis of opsin regulation

We found evidence for both cis- and trans-acting factors regulating opsin expression. In the TA cross, there was one marker close to the SWS2A–SWS2B–LWS tandem array on LG5 that was associated with expression of RH2B (and nearly statistically significant for SWS2A and SWS2B expression). This marker explains 13–16% of the opsin variance. We did not find any cis-acting regions in the DC cross, although we had fewer individuals to test for association. There could be small cis-regulatory effects in this cross that we did not have the power to detect.

We found strong evidence for trans-acting regulators of opsin gene expression. The TA cross showed associations with a region on LG13 which explained at least 20% of the variance. In addition, the DC cross showed an association with a region on LG4 which explained at least 30% of the variance. We have not yet fine mapped these regions. Once we have markers at the causative loci, they may explain even more of the variance.

The loci on LG4 and LG13 are trans to the opsin genes, which are located on LG17 (SWS1) and LG5 (SWS2-LWS and RH2 arrays). Therefore, they likely contain transcription factors or other regulatory molecules that modulate opsin expression. These trans-acting factors may interact with multiple opsin promoters and could facilitate correlated change in several opsins.

Work in other species suggests a few trans-acting candidate genes that could be involved. Thyroid hormone is known to affect mouse cone opsin expression (Ng et al., 2001; Applebury et al., 2007) and both thyroid hormone and retinoic acid have been shown to alter opsin expression in salmon (Browman & Hawryshyn, 1994a,b). The receptors of both of these molecules act as transcription factors through characteristic binding sites in gene promoters and often do so by forming heterodimers with each other (Glass, 1994). We hope to locate these hormone receptors on the cichlid genetic map using the forthcoming cichlid genome sequence, and to test them for association with opsin expression.

In the DC cross, we observed transgressive expression of the SWS2B and SWS2A genes, which either exceeded (SWS2B) or did not reach (SWS2A) the values of the two parental species. The SWS1 gene is on LG17 and SWS2A is on LG5. There may be some epistatic interactions between controlling loci which act to determine the balance between these genes. We did not see transgressive expression in the TA cross, which involved SWS2B and SWS2A expression. These two genes are within 5 kb of each other on LG5 and so might be directly regulated by common factors.

Genetic architecture and comparison to other traits

The genetic architecture of several other cichlid traits has been determined for Lake Malawi rock dwellers. Several traits are Mendelian, based on a single gene. These include tooth shape (Streelman et al., 2003b) and female orange-blotch colour patterns (Streelman et al., 2003a; Roberts et al., 2009). Other traits are more genetically complex. Cichlid head and jaw morphology traits are controlled by between 5 and 10 genes (Albertson et al., 2003a,b; Albertson & Kocher, 2005). Blue-yellow coloration of different male colour pattern elements was controlled by 4–7 loci (Barson et al., 2007). These traits were also influenced by dominance and epistasis. Our results for visual sensitivity fall in between the simple Mendelian and the complex traits, as expression of most of the opsins is controlled by a few factors. In contrast to jaw morphology or male colour pattern, facile changes in opsin gene expression may enable visual sensitivity to shift, enabling rapid response to changes in the photic environment or the requirements of different foraging modes.

Models suggest that mating traits might be correlated with sensory traits (Endler, 1992). A mating trait is thereby selected to optimally stimulate the sensory pathway used to detect it. There is a close correlation between visual sensitivities and male mating coloration in Lake Victoria cichlids which is a result of changes in opsin sequence (Carleton et al., 2005; Seehausen et al., 2008). Species with longer wavelength visual sensitivity use longer wavelength mating colours. However, it is unclear whether vision drives colour, colour drives vision, or both are driven by the photic environment. Our previous studies suggest that evolution of visual sensitivity of Lake Malawi cichlids is controlled by differential gene expression. This causes large spectral shifts with finer scale shifts as a result of opsin sequence tuning (Hofmann et al., 2009). Gene expression is driven by fewer loci than male colour, and is therefore potentially more labile. As a result, vision could respond more rapidly to selection from the environment and therefore may drive the evolution of male colour. Testing this hypothesis will require more explicit knowledge of the genetic basis of cichlid vision and colour as well as knowledge of the evolutionary forces driving these traits.


Sensory evolution in cichlid fishes involves a number of different mechanisms. In previous work, we have demonstrated the importance of differential gene expression for producing large shifts in spectral sensitivity. In this work, we demonstrate that gene expression changes are likely controlled by just a few genetic factors. As expression of certain genes is correlated, these controlling factors could drive several of the opsins. This could explain why there are just three common palettes of genes expressed in wild cichlids. Here, we find strong evidence for two trans-regulatory factors, as well as a potential cis-regulatory factor, underlying opsin gene expression changes. Our previous work has shown that gene regulation, as well as coding sequence, plays a role in cichlid visual sensitivity (Hofmann et al., 2009). With our finding that both cis and trans gene regulation are also important, it would seem that all possible mechanisms contribute to the spectral sensitivity of this key sensory system.


This work was supported with funds from NSF (IOS-0841270), NIH (R15 EY016721-01) and the University of Maryland. Additional support was provided by an NSF REU to N.S. and a UNH Summer Undergraduate Research Fellowship to C.K. Thanks to the Kocher lab for comments on this work, Kelly O’Quin for statistical advice, and all the undergraduate fish caretakers.