This paper is dedicated to the memory of Professor Antoni Prevosti.
STRUCTURE AND POPULATION GENETICS OF THE BREAKPOINTS OF A POLYMORPHIC INVERSION IN DROSOPHILA SUBOBSCURA
Article first published online: 19 JUL 2012
© 2012 The Author(s). Evolution© 2012 The Society for the Study of Evolution.
Volume 67, Issue 1, pages 66–79, January 2013
How to Cite
Papaceit, M., Segarra, C. and Aguadé, M. (2013), STRUCTURE AND POPULATION GENETICS OF THE BREAKPOINTS OF A POLYMORPHIC INVERSION IN DROSOPHILA SUBOBSCURA. Evolution, 67: 66–79. doi: 10.1111/j.1558-5646.2012.01731.x
- Issue published online: 4 JAN 2013
- Accepted manuscript online: 2 JUL 2012 11:42AM EST
- Received October 11, 2011, Accepted June 14, 2012
- Chromosomal inversion;
- inversion breakpoints;
- nucleotide polymorphism
Drosophila subobscura is a Palearctic species of the obscura group with a rich chromosomal polymorphism. To further our understanding of the origin of inversions and of how they regain variation, we have identified and sequenced the two breakpoints of a polymorphic inversion of D. subobscura (inversion 3 of the O chromosome) in a population sample. The breakpoints could be identified as two rather short fragments (∼300 bp and 60 bp long) with no similarity to any known transposable element family or repetitive sequence. The presence of the ∼300-bp fragment at the two breakpoints of inverted chromosomes implies its duplication, an indication of the inversion origin via staggered double-strand breaks. Present results and previous findings support that the mode of origin of inversions is neither related to the inversion age nor species-group specific. The breakpoint regions do not consistently exhibit the lower level of variation within and stronger genetic differentiation between arrangements than more internal regions that would be expected, even in moderately small inversions, if gene conversion were greatly restricted at inversion breakpoints. Comparison of the proximal breakpoint region in species of the obscura group shows that this breakpoint lies in a small high-turnover fragment within a long collinear region (∼300 kb).
Structural variation has played an important role in chromosomal evolution as initially revealed by cytogenetic studies (e.g., as reviewed in Powell 1997 for Drosophila) and more recently through the comparison of complete genome sequences from both distantly and closely related species (e.g., Pevzner and Tesler 2003; Bhutkar et al. 2008; Lee et al. 2008; von Grotthuss et al. 2010). Among structural variants, paracentric inversions have greatly contributed to within-chromosome reorganization, whereas translocations and chromosome fusions underlie most between-chromosome reorganizations, with transpositions having played a minor role in this context (Ranz et al. 2001; see also Conceição and Aguadé 2008). At the intraspecific level, chromosomal inversion polymorphism has been extensively studied at the cytogenetic level in Drosophila (Powell 1997), in which it is widespread in some species (e.g., D. melanogaster, D. pseudoobscura, D. subobscura, and D. buzzatii) and absent, or nearly absent, in other species (e.g., D. simulans). Inversion polymorphism is, however, not restricted to Drosophila and other insects, in which its study was first possible owing to the presence of polytene chromosomes in some of their organs. Indeed, the availability of molecular markers has allowed the relatively recent identification and characterization of segregating inversions in such diverse taxa as birds and plants (Thomas et al. 2008; Lowry and Willis 2010). Also, inversions segregating at intermediate frequencies have been detected in human populations (Feuk et al. 2005; Stefansson et al. 2005; Bansal et al. 2007).
As for any mutation, the inversions detected through the comparison of extant taxa—that is, fixed between species and segregating within species—constitute a small subset of all the inversions that originated in the past. Most inversions become lost by drift soon after their origin, independently of their effect on fitness. Those adaptive inversions that escape their loss by drift can rapidly increase in frequency and either become established at an intermediate frequency (balanced polymorphism) or become fixed. In contrast, the frequency of those inversions that do not affect the fitness of their bearers can slowly drift on their way to either fixation or loss. There is evidence in multiple species of Drosophila, as well as in other organisms, for the adaptive character of inversion polymorphism even if the underlying mechanisms are generally unknown (Powell 1997).
Concerning the origin of inversions, two mechanisms have been proposed based on available data (Ranz et al. 2007). Thus, inversions can originate through recombination between distant inverted copies of a particular transposable element or repetitive sequence. Alternatively, inversions can originate via staggered double-strand breaks at distant locations of a particular chromosome. For relatively young inversions, the presence of transposable elements at both inversion breakpoints would support their origin through the first mechanism (henceforth mechanism 1), whereas their absence together with the presence of duplicated fragments at one or both inversion breakpoints would support their origin through the second mechanism (henceforth mechanism 2). There are only a few polymorphic inversions for which both inversion breakpoints have been identified and sequenced. At this time scale, there seems to be a difference between species concerning the origin of polymorphic inversions. Data from Anopheles species (Mathiopoulos et al. 1998; Lobo et al. 2010) and from both D. buzzatii (Cáceres et al. 1999; Casals et al. 2003; Delprat et al. 2009) and D. pseudoobscura (Richards et al. 2005) would point to mechanism 1 due to the presence of transposable elements and other repetitive sequences at the breakpoints. Indeed, polymorphic inversion breakpoints in D. pseudoobscura seem enriched in two different medium-sized repetitive sequences (Richards et al. 2005). In contrast, data from D. melanogaster (Wesley and Eanes 1994; Andolfatto et al. 1999; Matzkin et al. 2005) would favor mechanism 2 because duplications, but no transposable elements, were detected at the characterized breakpoints. At the longer timescale (i.e., for inversions fixed between species), there is little evidence for mechanism 1 because no transposable elements have generally been detected at fixed inversion breakpoints (Cirera et al. 1995; Sharakhov et al. 2006; Prazeres da Costa et al. 2009; Runcie and Noor 2009). Transposable elements could have been lost, however, once inversions became fixed. On the other hand, support for mechanism 2 emerged from the comparison of the D. melanogaster and D. yakuba genome sequences (Ranz et al. 2007), which revealed an enrichment of duplicated fragments at within-chromosome rearrangement breakpoints.
Although the number of polymorphic inversion breakpoints characterized at the sequence level for any particular clade of Drosophila species is too low to draw any conclusion about the different modes of origin, the available data would point to staggered breaks in D. melanogaster and to repetitive sequences in D. pseudoobscura and D. buzzatii (see, however, Calvete et al. 2012). If the observed trend did hold within species, it would also be important to establish whether modes of origin are species specific or clade specific.
In a first effort to address the above-raised question in the obscura group, we have identified and sequenced the two breakpoints of a polymorphic inversion of D. subobscura: inversion 3 of the O chromosome (i.e., of Muller's element E). The breakpoints of this inversion had been mapped at sections 91B/C and 94E/95A of the Ost cytological map (Künze-Mühl and Müller 1958; Fig. 1). According to its physical (∼3.5 Mb) and recombination (27.4 cM) length (Munté et al. 2005), inversion 3 can be considered a rather short inversion. Inversion 3 originated on the ancestral O3 arrangement and gave rise to the Ost chromosomal arrangement (Fig. 1). A second arrangement (O3 + 4), also derived from O3 through a single inversion (Inv 4 in Fig. 1), segregates together with Ost in extant populations of this species. The ancestral O3 arrangement went extinct in the D. subobscura lineage after the origin of the Ost and O3 + 4 chromosomal arrangements. Therefore, we identified the breakpoint regions in a homokaryotypic O3 + 4 line, because the breakpoint regions present in the ancestral O3 arrangement (named AB and CD in Fig. 1) are maintained in extant O3 + 4 chromosomes, except for the reversed order of the distal breakpoint (DC in Fig. 1). Sequence comparison of the two breakpoints of inversion 3 between Ost and O3 + 4 chromosomes indicates that inversion 3 of D. subobscura most likely originated via staggered double-strand breaks, which contrasts with the origin of the arrowhead inversion of D. pseudoobscura via ectopic recombination of repetitive sequences (Richards et al. 2005). This would imply that the association detected in the D. pseudoobscura lineage between inversion breakpoints and two families of repetitive sequences is not a characteristic of the obscura group but is specific to the pseudoobscura lineage.
Moreover, our resequencing of the fragments encompassing each breakpoint in a moderately large sample of Ost and O3 + 4 isochromosomal lines extracted from a Spanish natural population (Rozas et al. 1995) has allowed us to evaluate the evolutionary forces that have shaped variation at the breakpoints of both chromosomal arrangements.
Materials and Methods
Nineteen D. subobscura isochromosomal lines for the O chromosome were used: 10 O3 + 4 and 9 Ost. These lines from El Pedroso (Spain) are a subset of those used in previous studies (Rozas et al. 1995; Navarro-Sabaté et al. 1999; Munté et al. 2005). A highly inbred line of each of D. madeirensis and D. guanche, obtained by over 10 generations of sibmating (Khadem et al. 1998; Pérez et al. 2003), was also used.
PCR AMPLIFICATION AND SEQUENCING
Genomic DNA was extracted from frozen individuals of each isochromosomal line of D. subobscura and from one individual of a highly inbred line of each of D. madeirensis and D. guanche using a modification of protocol 48 in Ashburner (1989). Oligonucleotides for PCR amplification and sequencing were designed based on the comparison of the genome sequences of D. melanogaster and D. pseudoobscura and using D. subobscura sequences whenever available (unpubl. results, Barcelona Subobscura Initiative or BSI). Sequences of the primers used for PCR amplification and sequencing are available from the authors upon request. Different Taq polymerases (TaKaRa DNA polymerase from Takara Bio Inc and GoTaq DNA polymerase from Promega) were used according to the expected length of the fragment to be amplified. The amplified fragments were purified with MultiScreen PCR (Millipore) and used directly as templates for sequencing with the ABI PRISM version 3.2 cycle sequencing kit (Applied Biosystems, Foster City, CA) according to the manufacturer's conditions. Sequencing products were separated on an ABI PRISM 3730 sequencer (PerkinElmer, Norwalk, CT). Sequences were assembled using the DNASTAR package (Burland 2000) and multiply aligned with the MAFFT version 6.864 program (Katoh and Toh 2008). Multiple sequence alignments were edited with the MacClade version 3.06 program (Maddison and Maddison 1992). All sequences were obtained on both strands. The newly obtained sequences have been deposited in the EMBL/GenBank Data Library under accession numbers HE614146-HE614186.
IN SITU HYBRIDIZATION
Polytene chromosome preparations of D. subobscura were performed according to Montgomery et al. (1987). The fragments amplified by PCR using DNA from an isochromosomal O3 + 4 D. subobscura line (J16 in Figs. S1–S5) were gel-band extracted with the QIAquick kit (Qiagen) prior to their labeling for in situ hybridization. Probes were obtained through biotin-16-dUTP labeling by nick translation of purified PCR amplicons. Prehybridization, hybridization, and detection were as described in Montgomery et al. (1987) using the ABC-Elite Vector Laboratories kit for detection and with a hybridization temperature of 37°C. Digital images were obtained at 400× magnification using a phase contrast Axioskop 2 Zeiss microscope and a Leica DFC290 camera. The location of the hybridization signals was determined using the cytological map of D. subobscura (Künze-Mühl and Müller 1958) with the standard arrangement for all chromosomes. The length and location of the different fragments used as probes are given in Table S1.
Sequences of the regions that span the breakpoints of inversion 3 in D. subobscura O3 + 4 chromosomes were compared to the D. pseudoobscura genome sequence through the discontiguous megablast algorithm (http://blast.ncbi.nlm.nih.gov/). Moreover, they were analyzed with RepeatMasker (http://www.repeatmasker.org/) to identify any repetitive sequences.
The DnaSP version 5.10.01 program (Librado and Rozas 2009) was used for most analyses of intraspecific and interspecific variation. Nucleotide polymorphism was estimated as: the number of segregating sites (S), the minimum number of mutations (η), nucleotide diversity (π; Nei 1987), haplotype number (h), and haplotype diversity (Hd) (Nei 1987). Insertion-deletion (indel) polymorphism was characterized by: the number of indels (I), average indel length (IL), and indel diversity per site (ID). The D statistic (Tajima 1989) was used as a summary of the frequency spectrum. The level of genetic differentiation between chromosomal arrangements was estimated as DXY (Nei 1987) and FST (Hudson et al. 1992a), and its significance established using the KS* test statistic (Hudson et al. 1992b). Gene conversion tracts were inferred according to Betrán et al. (1997). Interspecific divergence was estimated as the number of nucleotide substitutions per site (K) using D. madeirensis as outgroup, and correcting for multiple hits according to Jukes and Cantor (Jukes and Cantor 1969). Gene genealogies were reconstructed by the neighbor-joining method as implemented in the MEGA version 5.05 program (Tamura et al. 2011). A goodness-of-fit test (χ2L; Kreitman and Hudson 1991) was used to contrast whether levels of nucleotide variation varied among regions. Computer simulations conditioned on the number of segregating sites and under the conservative assumption of no recombination were used to obtain confidence intervals for nucleotide diversity estimates that were used to assess putative differences between chromosomal arrangements.
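To make the summary statistics named above concrete, the following is a minimal Python sketch of S, π, and Tajima's D for a gap-free alignment. It is not the DnaSP implementation; the function names and the toy alignment are hypothetical, and it assumes equal-length sequences with no missing data.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """S: number of alignment columns showing more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """k: average number of differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """pi: mean pairwise differences per site (Nei 1987)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def tajima_d(seqs):
    """Tajima's D (Tajima 1989); returns None when there is no variation."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if s == 0:
        return None
    k = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment (four sequences, four sites), for illustration only.
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
print(segregating_sites(toy), nucleotide_diversity(toy), tajima_d(toy))
```

A negative D indicates an excess of low-frequency variants relative to neutral equilibrium expectations, which is the direction of the trend discussed in the Results.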
CHARACTERIZATION OF THE REGIONS SPANNING INVERSION 3 BREAKPOINTS IN O3 + 4 CHROMOSOMES
The starting point for the identification of inversion 3 breakpoints of D. subobscura (Fig. 1) consisted of three molecular markers that had been previously mapped close to the inversion breakpoints (Segarra et al. 1996; Munté et al. 2005). Two markers flanked the distal breakpoint on O3 chromosomes (CD in Fig. 1): Acph and S1, located inside (at section 91C) and outside the inverted region (at section 95A), respectively, as represented in Figure 1. The third marker (P2 in Fig. 1) mapped at section 94E and was therefore close to the internal part of the proximal breakpoint (AB in Fig. 1). Comparison of the D. subobscura sequences of markers S1 and P2 with the available genome sequence of D. pseudoobscura (Richards et al. 2005) had revealed that these markers were orthologous to the GA26879 and Abdominal A (AbdA) genes (Munté et al. 2005), respectively.
We used the D. pseudoobscura and D. melanogaster genome sequences (Richards et al. 2005; Tweedie et al. 2009) to anchor our mapping efforts. Sequence comparison revealed that these genomes are collinear to the O3 + 4 chromosome over several hundred kilobases at both inversion breakpoints (AB and CD in Fig. 1). Indeed, the ∼400-kb-long fragment delimited by Acph and GA26879 (S1 marker in Fig. 1), is collinear between these distantly related species, and most likely also between D. pseudoobscura and D. subobscura. In addition, an ∼300-kb-long region spanning the AbdA gene (marker P2 in Fig. 1) is collinear between those species. We obtained probes by PCR amplification using DNA from an O3 + 4 line, in situ hybridized these probes to Ost polytene chromosomes, and determined the cytological signal location using the Ost cytological map (Künze-Mühl and Müller 1958). The sequential strategy depicted in Figure 2 was used to narrow down to each of the two breakpoints of inversion 3.
In the case of the distal breakpoint (CD in Fig. 1), two fragments located ∼270 kb and ∼190 kb from the Acph region in D. pseudoobscura (probes 3a and 3b in Fig. 2, respectively) were used as probes in the first round of in situ hybridizations on Ost polytene chromosomes (Fig. 3A and B). Both markers gave a strong signal at section 95A. Their colocalization with marker S1 indicates that they are also outside the inverted region and, therefore, that the breakpoint is in the fragment flanked by Acph and probe 3b (Fig. 2). In the second round, two new fragments were used as probes, from which probe 3d mapped at section 95A (Fig. 3C) and probe 3e mapped at section 91C (Fig. 3D). Thus, the fragment spanning the inversion breakpoint could be narrowed down to the ∼50-kb-long region that separated probes 3d and 3e (Fig. 2). Three new fragments within this region were used as probes in the third and final round (Fig. 2). Two probes (3g and 3h) gave a single signal at section 91C (Fig. 3E and F), whereas the last probe (named 3f) gave multiple and strong signals (see below) that included the locations of the two cytological breakpoints of inversion 3 (subsections 91B/C and 94E/95A; Fig. 3G). As probe 3f (∼7.5 kb long) covered almost completely the region between probes 3g and 3d, it most likely spanned the distal breakpoint of inversion 3 (Figs. 1 and 2). This was later confirmed using a shorter probe [3f(s)] that clearly hybridized at both inversion breakpoints (Fig. 3H).
In the case of the proximal breakpoint (AB in Fig. 1), we used a similar strategy, although the task was much riskier because we started from a single marker within the inverted region: marker P2 that partly contained the AbdA gene. As previously indicated, an ∼300-kb-long region spanning this gene was collinear between D. pseudoobscura and D. melanogaster. This region spanned the three genes (Ubx, AbdA, and AbdB; Fig. 2) that constitute the bithorax complex (Martin et al. 1995). In the first round of in situ hybridizations on Ost polytene chromosomes (Fig. 3), two fragments (4a and 4b in Fig. 2) located at the extremes of this collinear region were used as probes. Probe 4a (AbdB) mapped at the distal end of the inversion (at section 94E; Fig. 3I), whereas probe 4b mapped at its proximal end (section 91B; Fig. 3J). The breakpoint is, therefore, in the fragment flanked by probes 4a and 4b (Fig. 2). Given the great conservation of the bithorax complex, we assumed that the breakpoint lay between Ubx and fragment 4b (Fig. 2). Therefore, we designed one probe spanning the complete region separating GA16100 and Ubx (fragment 4b1). To confirm our previous assumption, two probes (4a1 and 4a2) were designed between Ubx and AbdA (Fig. 2). As expected, the two probes located between Ubx and AbdA hybridized at section 94E (results not shown) similarly to the AbdA and AbdB probes, whereas the ∼6.0-kb-long fragment corresponding to probe 4b1 hybridized at both ends of inversion 3 (at subsections 91B/C and 94E/95A; Fig. 3K), which confirms that this fragment includes the proximal inversion breakpoint (AB in Fig. 1).
The complete AB and CD fragments (i.e., the ∼6.0 kb and ∼7.5 kb fragments that spanned the breakpoints in the ancestral arrangement) were sequenced for one O3 + 4 line (J16 in Figs. S1–S5). These sequences were compared to the corresponding D. pseudoobscura genome sequence using the discontiguous megablast algorithm. For the AB region (Fig. 4A), this comparison revealed four fragments with high similarity between species (sequence identity between 78% and 92%), and it allowed the identification of the orthologs of genes GA16100 (modSP) and Ubx, as well as of their flanking regions. For the CD region (Fig. 4B), the megablast comparison revealed three high similarity fragments (sequence identity between 72% and 92%), corresponding to the orthologs of gene GA26869 (trp), gene GA20651 (Jon99C) and its flanking region, and an additional intergenic region.
No similarity was detected in the remaining fragments (i.e., in the ∼2.5-kb-long central fragment of the AB region, and in either of the two short intervening fragments of the CD region). It is worth noting that a short coding region (GA26454) is present at the central part of the AB region of D. pseudoobscura (Fig. 4A). This CDS has no homolog in any of the Drosophila species with an available whole genome sequence, with the exception of D. persimilis.
Analysis of the AB and CD regions using the RepeatMasker software revealed a few short low-complexity regions (not shown), whereas the only moderately long region with similarity to known transposable elements was found in the CD region (Fig. 4B). The multiple signals detected in the in situ hybridization using the 3f fragment as probe (Fig. 3G) reflect the presence of these repetitive sequences in this ∼7.5-kb-long fragment spanning the CD breakpoint.
IDENTIFICATION OF THE BREAKPOINTS OF INVERSION 3
Upon sequencing the complete AB and CD fragments for one O3 + 4 line (see above), internal primers were used to isolate the two breakpoints in Ost chromosomes. Primer pairs pA/pB and pC/pD (Fig. 4) did successfully amplify the central part of the AB and CD regions, respectively, in the sequenced O3 + 4 line (yielding ∼1.3-kb- and ∼1.0-kb-long fragments), but amplification failed when using Ost DNA. This result suggested that these fragments spanned the breakpoints in O3 + 4 chromosomes. This was confirmed by using primer pairs pA/pC and pB/pD (Fig. 4) for PCR amplification. As expected, these primer pairs did successfully amplify the AC and BD regions in Ost (yielding ∼1.0-kb- and ∼1.4-kb-long fragments), whereas amplification failed when using O3 + 4 DNA. Fragments spanning the breakpoints in Ost chromosomes were initially sequenced in a single Ost line (J07 in Figs. S1–S5).
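The logic of this PCR assay can be illustrated with a small Python sketch that treats a chromosome as an ordered list of oriented blocks. Everything here is a simplifying assumption made for illustration: the block names follow Figure 1, the block "M" stands for the megabases between the breakpoints, the primer orientations are guessed (Fig. 4 is not reproduced here), and a product is assumed to form only when two primers sit on adjacent blocks and point toward each other.

```python
# Ancestral-like order of the breakpoint regions: A|B ... C|D.
# Each block is (name, strand); +1 is the ancestral orientation.
ANCESTRAL = [("A", +1), ("B", +1), ("M", +1), ("C", +1), ("D", +1)]

# Hypothetical primer layout: (block, direction of synthesis in the
# block's ancestral orientation); +1 extends rightward, -1 leftward.
PRIMERS = {"pA": ("A", +1), "pB": ("B", -1), "pC": ("C", +1), "pD": ("D", -1)}


def invert(chrom, first, last):
    """Invert blocks first..last (inclusive): reverse order, flip strands."""
    flipped = [(name, -strand) for name, strand in reversed(chrom[first:last + 1])]
    return chrom[:first] + flipped + chrom[last + 1:]


def amplifies(chrom, primer1, primer2):
    """True if the primers lie on adjacent blocks and face each other."""
    position = {name: (idx, strand) for idx, (name, strand) in enumerate(chrom)}
    hits = []
    for primer in (primer1, primer2):
        block, direction = PRIMERS[primer]
        idx, strand = position[block]
        hits.append((idx, direction * strand))  # direction on this chromosome
    (i1, d1), (i2, d2) = sorted(hits)
    return i2 - i1 == 1 and d1 == +1 and d2 == -1


INVERTED = invert(ANCESTRAL, 1, 3)  # inversion of the B..C segment

for pair in [("pA", "pB"), ("pC", "pD"), ("pA", "pC"), ("pB", "pD")]:
    print(pair, amplifies(ANCESTRAL, *pair), amplifies(INVERTED, *pair))
```

Under these assumptions, pA/pB and pC/pD yield a product only on the uninverted order, and pA/pC and pB/pD only on the inverted order, which mirrors the pattern of successes and failures described above.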
Comparison of the breakpoint regions in O3 + 4 (AB and DC) and Ost (AC and BD) chromosomes allowed the detailed identification of the breakpoints (Fig. 4). In the initially sequenced O3 + 4 line (J16), the proximal breakpoint (AB) could be delimited to a 309-bp-long fragment (Fig. 4A), which was duplicated during the inversion process, as revealed by its presence at the two breakpoints of the inverted Ost chromosomes (AC and BD in Fig. S6). This fragment is located in the ∼2.5-kb-long central fragment of the AB region, which exhibits no similarity to the corresponding region of D. pseudoobscura (Fig. 4A). In contrast, the distal breakpoint is included at the end of one of the high similarity fragments of the CD region (Fig. 4B). In the initially sequenced Ost line (J07), the distal breakpoint could be delimited to a 63-bp-long fragment (Fig. 4B), which was deleted during the inversion process (see below).
The extended AB and CD regions were also amplified and partially sequenced in the closely related species D. madeirensis and D. guanche. The size of the AB amplicon was ∼2 kb smaller in these species than in D. subobscura. Indeed, sequencing of the fragment spanning the AB breakpoint region revealed an ∼1.8-kb deletion in these species relative to D. subobscura. Interestingly, the deleted fragment includes the entire proximal breakpoint, with its distal end and that of the inversion breakpoint being nearly coincident.
POLYMORPHISM AT THE BREAKPOINT REGIONS AND ALONG THE O3 INVERSION
Variation at the breakpoints was surveyed in a population sample of Ost and O3 + 4 chromosomes through the amplification and sequencing of approximately 1.0–1.4 kb long fragments spanning the breakpoints (Fig. S6). Table 1 shows a summary of nucleotide polymorphism at the A, B, C, and D regions. Based on the comparison between the D. subobscura and both the D. madeirensis and D. guanche sequences, the fragment that was duplicated during the inversion process was considered to be part of the A region (Fig. S6). The size of the regions analyzed varies between 309 nucleotides for the D region, and 666 nucleotides for the B region (after excluding alignment gaps in the complete dataset). Estimates of nucleotide diversity in these regions vary between 0.005 and 0.018 in Ost chromosomes and between 0.009 and 0.046 in O3 + 4 chromosomes. Except for the value estimated at the C region in O3 + 4 chromosomes (0.046), these estimates are within the range of previous estimates at silent sites of eight regions of D. subobscura affected by inversion 3 (0.004–0.022; Munté et al. 2005). Indeed, estimates of variation in this arrangement do not differ significantly (χ2L= 1.60, 2 df, P= 0.449) except when the C region is included (χ2L= 29.97, 3 df, P < 0.0001). Considering each region separately, levels of variation are similar in both arrangements at the A and B regions, whereas they are higher in O3 + 4 than in Ost chromosomes in the C and D regions, even though they only differ significantly between arrangements at the C region (χ2L= 25.16, 1 df, P < 0.0001). For the concatenated dataset, the level of variation is significantly higher in O3 + 4 (0.020) than in Ost chromosomes (0.012). Significance (χ2L= 13.39, 1 df, P= 0.0002) vanishes, however, when only the A, B, and D regions are compared (0.013 in both Ost and O3 + 4 chromosomes; χ2L= 0.743, 1 df, P= 0.389). 
Similar results were obtained by computer simulation under the conservative assumption of no recombination (results not shown). The frequency spectrum of nucleotide polymorphisms, as summarized by Tajima's D statistic, seems in general (although not significantly) shifted toward an excess of low-frequency variants in O3 + 4 chromosomes, with Ost chromosomes exhibiting a weaker trend (Table 1). The A, B, C, and D regions do not only exhibit an appreciable level of nucleotide variation, but also extensive length variation (Table S2). The level of length variation, as measured by the number of indels and indel diversity, is higher at O3 + 4 than at Ost chromosomes.
|Region||A||A||B||B||C||C||D||D||Concatenated||Concatenated|
|Arrangement||O3 + 4||Ost||O3 + 4||Ost||O3 + 4||Ost||O3 + 4||Ost||O3 + 4||Ost|
|L||340 (413)||340 (413)||666 (935)||666 (935)||401 (738)||401 (738)||309 (355)||309 (355)||1814 (2441)||1814 (2441)|
|S||18 (19)||18||28||27||45 (50)||10||12||5||104 (110)||62|
Table 2 gives a summary of the level of genetic differentiation between chromosomal arrangements. Despite some differences in the level of differentiation among regions (with the A and C regions exhibiting higher estimates than the B and D regions), the four fragments exhibit significant genetic differentiation between the two arrangements (as established by Hudson's KS* statistic; Table 2), both when individually and jointly considered.
|Comparison||Fragment||Fixed||P1F2||F1P2||Shared||FST||DXY (sign.)|
|O3 + 4 vs. Ost|
|DUP-AB vs. DUP-AC||2||12||14||1||0.512||0.033 (***)|
|DUP-AB vs. DUP-BD||0||12||11||0||0.312||0.022 (***)|
|DUP-AC vs. DUP-BD||5||13||11||0||0.572||0.035 (***)|
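The two differentiation statistics reported in Table 2 can be illustrated with a short Python sketch. It computes DXY as the mean number of between-group differences per site and FST as 1 − Hw/Hb in the sense of Hudson et al. (1992a), with Hw taken here as the unweighted mean of the two within-group values. That weighting, the function names, and the toy sequences are assumptions made for illustration and may differ from what DnaSP does.

```python
from itertools import combinations, product


def differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(group):
    """Mean pairwise differences among sequences of one arrangement."""
    pairs = list(combinations(group, 2))
    return sum(differences(a, b) for a, b in pairs) / len(pairs)


def mean_between(group1, group2):
    """Mean pairwise differences between sequences of two arrangements."""
    total = sum(differences(a, b) for a, b in product(group1, group2))
    return total / (len(group1) * len(group2))


def dxy(group1, group2):
    """Average number of nucleotide differences per site between groups."""
    return mean_between(group1, group2) / len(group1[0])


def fst_hudson(group1, group2):
    """FST = 1 - Hw / Hb, with Hw the unweighted mean within-group value."""
    hw = (mean_within(group1) + mean_within(group2)) / 2
    hb = mean_between(group1, group2)
    return 1 - hw / hb


# Hypothetical toy data: two sequences per arrangement, four sites.
arrangement_1 = ["AAAA", "AAAT"]
arrangement_2 = ["TTAA", "TTAT"]
print(dxy(arrangement_1, arrangement_2), fst_hudson(arrangement_1, arrangement_2))
```

Because FST is scaled by within-group variation while DXY is not, the two measures can rank regions differently, which is relevant when a region such as C shows unusually high diversity in one arrangement.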
To gain further information on the inversion origin, nucleotide variation was also estimated at the ∼300-bp-long fragment that was duplicated during this process. This fragment is present as a single copy in the O3 + 4 chromosomes (DUP-AB), and as two copies in the Ost chromosomes (DUP-AC and DUP-BD; Fig. S6). Estimates of nucleotide diversity (Table 3) are similar among copies (0.015 in all three cases). Moreover, their pairwise comparison reveals similar and significant levels of genetic differentiation among copies (Table 2). This is also reflected in the reconstructed gene genealogy (whose unrooted character is due to the absence of this region in both D. madeirensis and D. guanche). Indeed, the presence of three clusters, each corresponding to one of the three copies (Fig. S7), points to their independent evolution.
|L||236 (355)||236 (355)||236 (355)|
The level of nucleotide diversity (scaled over divergence, π/K) at inversion 3 breakpoints in O3 + 4 and Ost chromosomes (Table 1) was compared to that previously detected in the same population at eight regions along this inversion (Navarro-Sabaté et al. 1999; Rozas et al. 1999; Munté et al. 2005). Region A was excluded from this analysis given its absence in the outgroup species D. madeirensis. The scaled π/K estimate at the C region in O3 + 4 chromosomes (0.730) far exceeds all other estimates, similarly to the nonscaled π estimate (Table 1 and Fig. 5). This very high level of variation, therefore, cannot be explained by a higher than average mutation rate at this region. The other π/K estimates for inversion breakpoints do not differ greatly from those at the eight other regions (Fig. 5), even if the lowest estimate in each arrangement corresponds to breakpoint regions. Genetic differentiation between O3 + 4 and Ost chromosomes, as measured by the DXY statistic, shows the highest variation at inversion breakpoints, with the estimates for three of the four breakpoint regions (C region included) exhibiting higher values than more internal regions (Fig. 5). Levels of genetic differentiation are more similar among regions when measured as FST (results not shown).
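As a reminder of how the scaled estimate is built, the Jukes and Cantor correction named in Materials and Methods converts the observed proportion of differing sites between species, written here as p (a symbol introduced only for this illustration), into the divergence K, and the scaled diversity is the ratio π/K. The second expression is back-of-the-envelope arithmetic from the two rounded values quoted in the text for the C region in O3 + 4 chromosomes, assuming both refer to the same set of sites; it is not a value reported in the tables shown here.

```latex
K = -\frac{3}{4}\,\ln\!\left(1 - \frac{4}{3}\,p\right),
\qquad
\frac{\pi}{K} = 0.730,\quad \pi = 0.046
\;\Rightarrow\;
K \approx \frac{0.046}{0.730} \approx 0.063
```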
BREAKPOINTS, POPULATION SAMPLE, AND INVERSION ORIGIN
When all O3 + 4 and Ost sequences are considered (Figs. S1–S5), the two regions spanning the breakpoints are generally delimited by the same two 4-bp motifs (5′ATGC3′ and 5′GCAG3′) in inverted orientation (Fig. S6). Although this is reminiscent of repetitive sequences, the delimited fragments differ greatly in length (∼300 bp at AB vs. ∼60 bp at CD) and exhibit little if any similarity. Moreover, they do not exhibit any similarity to any known transposable element family. The A fragment that was duplicated during the inversion process is generally longer in the DUP-AC copy than in copies DUP-AB and DUP-BD (Fig. S6). The CD breakpoint fragment varied from 40 to 79 bp in O3 + 4 chromosomes (Fig. S4), and it was deleted in Ost chromosomes as previously indicated. The presence of a duplicated fragment at the breakpoints of inverted chromosomes (Ost in this case) would support that the inversion arose via staggered breaks as depicted in Figure 6.
ORIGIN OF INVERSION 3
The availability of whole genome sequences of closely related species has definitely provided new insights into the role played by chromosomal inversions in genome evolution (e.g., Pevzner and Tesler 2003; Feuk et al. 2005; Bhutkar et al. 2008; Lee et al. 2008; von Grotthuss et al. 2010). Moreover, it has provided support for the two major molecular mechanisms underlying their origin (e.g., Richards et al. 2005; Ranz et al. 2007). The identification of new polymorphic inversions and of their breakpoints has also benefited from whole genome sequence projects and the associated development of molecular markers. Here we have used a mixed strategy (i.e., taking advantage of whole genome sequences and of molecular markers) to isolate and finely characterize the two breakpoints of a polymorphic inversion of D. subobscura, inversion 3 of the O chromosome, in a population sample of both Ost and O3 + 4 chromosomes. In O3 + 4 chromosomes (with the ancestral configuration for inversion 3; Fig. 1), the inversion breakpoints could be identified as two rather short fragments (∼300 bp and 60 bp long, respectively; Fig. 4) with no similarity to any known transposable element family or repetitive sequence. This observation renders the origin of inversion 3 by ectopic recombination between transposable elements, or other repetitive sequences, highly unlikely. Furthermore, the presence of an ∼300-bp duplicated fragment at the two breakpoints of Ost chromosomes (Figs. 1, 4, and S6) clearly supports the origin of this inversion via staggered double-strand breaks. The fact that the two breakpoints are delimited by the same two 4-bp motifs in inverted orientation facilitates the visualization of the inversion process. Indeed, after the occurrence of the staggered breaks encompassing the two breakpoints and the subsequent inversion of the intervening chromosomal region (Fig. 6), the short single-strand motifs would have mediated the annealing of the inversion ends and, thus, the origin of the ∼300-bp-long duplication through synthesis of the complementary strand at both breakpoints, and the deletion of the shorter single-strand fragment. Inversion 3 breakpoints provide one of the clearest examples of the origin of a segregating inversion through the staggered double-strand breaks mechanism.
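The way a staggered break can leave a duplicated fragment at both ends of an inversion may be easier to follow with a toy model. The Python sketch below is only an illustration under simplifying assumptions that are not part of the paper: a single strand is tracked, the break at one breakpoint is staggered while the other is treated as blunt, and the function name, cut coordinates, and sequences are all hypothetical. It does not model the deletion of the shorter fragment at the CD breakpoint.

```python
# Toy model of an inversion produced by a staggered double-strand break.
# Assumptions (hypothetical, for illustration only): one staggered break,
# one blunt break, and made-up sequences and coordinates.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def invert_with_staggered_break(chrom: str, top_cut: int,
                                bottom_cut: int, distal_cut: int) -> str:
    """Invert chrom[top_cut:distal_cut] after a staggered proximal break.

    The top strand is cut at top_cut and the bottom strand at bottom_cut,
    so chrom[top_cut:bottom_cut] is left single-stranded on both resulting
    ends. Filling in both overhangs before rejoining duplicates that
    fragment: one copy stays at the proximal junction and an inverted copy
    ends up at the distal junction.
    """
    if not 0 <= top_cut <= bottom_cut <= distal_cut <= len(chrom):
        raise ValueError("cuts must satisfy 0 <= top <= bottom <= distal <= len")
    left = chrom[:top_cut]
    overhang = chrom[top_cut:bottom_cut]    # single-stranded after the break
    middle = chrom[bottom_cut:distal_cut]
    right = chrom[distal_cut:]
    left_filled = left + overhang           # overhang filled in on the left end
    segment_filled = overhang + middle      # and on the segment to be inverted
    return left_filled + revcomp(segment_filled) + right


# Hypothetical ancestral arrangement: flank + fragment + inner region + flank.
ancestral = "AAAA" + "GATTACA" + "CCCCGG" + "TTTT"
derived = invert_with_staggered_break(ancestral, top_cut=4,
                                      bottom_cut=11, distal_cut=17)
print(derived)
```

In this toy case the 7-nt fragment plays the role of the ∼300-bp fragment: the derived sequence is longer than the ancestral one by exactly the fragment length, and the fragment is present once in direct and once in inverted orientation.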
The characterization and sequencing of polymorphic and fixed inversions had raised the possibility that the mode of origin of inversions was related to the inversion age (Prazeres da Costa et al. 2009), and also that it might be species or group specific. Our results together with previous results would rather suggest that the mode of origin is neither related to the inversion age—as revealed by the staggered breaks mode in inversion 3, a polymorphic inversion—nor group specific—as revealed by the coexistence of both modes in the obscura lineage, that is, through repetitive sequences in the D. pseudoobscura lineage and through staggered double-strand breaks in the D. subobscura lineage.
INVERSIONS AND NUCLEOTIDE VARIATION
The origin of an inversion constitutes an extreme bottleneck that for successful inversions (i.e., those that are preserved and achieve some frequency in the population) implies an initial depletion of variation. The inverted region regains variation through mutation and also through genetic exchange with the ancestral gene arrangement (Navarro et al. 1997). Levels of variation in a particular inversion would thus be dependent on the inversion age and also on the time of coexistence with the ancestral arrangement. Although both gene conversion and double crossovers can contribute to the genetic exchange in inversion heterokaryotypes, double crossovers would only be likely to have an effect in the more central part of the inverted region. Genetic exchange between inverted and noninverted chromosomes is, therefore, expected to increase with distance from the breakpoints (Navarro et al. 1997). In the inverted region, the level of variation would increase following the differential contribution of double crossovers with distance from the breakpoints.
In the case studied here, extant populations of D. subobscura harbor two chromosomal arrangements (Ost and O3 + 4) that arose independently from the ancestral O3 arrangement through inversions 3 and 4, respectively (Fig. 1). O3 went extinct some time after both inversions occurred, which implies that there was limited time for each of the derived arrangements to exchange information with O3. The location of the regions analyzed, at inversion 3 breakpoints in Ost chromosomes (AC and BD in Fig. 1) but rather distant from inversion 4 breakpoints in O3 + 4 chromosomes (AB and DC in Fig. 1), might have differentially affected this exchange. Moreover, the overlapping character of inversions 3 and 4 would have greatly affected the level of genetic exchange between Ost and O3 + 4 upon their establishment.
In a previous study of nucleotide variation at eight regions distributed along inversion 3 in a sample of Ost and O3 + 4 chromosomes, levels of nucleotide variation within chromosomal arrangement and of genetic differentiation between arrangements were quite uniform across regions (Munté et al. 2005). Double crossovers would thus have contributed little to the genetic exchange along this rather short inversion. Variation in each arrangement would, therefore, be the result either of mutation or of genetic exchange mainly due to gene conversion. Here we have surveyed nucleotide variation at and around the breakpoints themselves, where gene conversion (but not mutation) is expected to be restricted due to mechanical problems in the synapsis of chromosomes carrying alternative gene arrangements (Navarro et al. 1997). Accordingly, a lower level of variation within each arrangement and a higher level of genetic differentiation between arrangements would be expected at the breakpoint regions relative to more internal regions. These expectations are not well supported by present results. Indeed, although for each arrangement a breakpoint region exhibits the lowest scaled diversity estimate, the level of variation at breakpoint regions other than the outlier C region is overall of the same order as in more internal regions (Fig. 5). Also, genetic differentiation, as estimated by DXY, does not differ greatly between breakpoint regions (excluding the C region) and more internal regions, with only two of the three breakpoint regions exhibiting a higher estimated value than internal regions (Fig. 5). This trend is even milder when measuring genetic differentiation with FST. These results, and the detection of two small (4 and 14 nt long) tracts in the C region and two medium-sized (87 and 125 nt long) tracts in the B region, suggest that gene conversion at inversion breakpoints is not as restricted as previously thought. Indeed, if the actual reduction were mild or affected a small region, its effect at the evolutionary time scale might be difficult to detect, at least in some cases.
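For readers less familiar with the summary statistics compared above, the following Python sketch shows how within-arrangement nucleotide diversity, DXY, and FST can be computed from aligned sequences. It is a minimal illustration and not the procedure used for Figure 5: the sequences are hypothetical, sites with gaps or missing data are assumed to be absent, diversity is not scaled, and FST is taken in the form 1 − Hw/Hb, which is one common sequence-based definition and is an assumption here rather than a restatement of the estimator used in the paper.

```python
from itertools import combinations, product


def pairwise_diff(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nucleotide_diversity(seqs: list[str]) -> float:
    """Average per-site pairwise difference within one sample."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def dxy(seqs_x: list[str], seqs_y: list[str]) -> float:
    """Average per-site pairwise difference between two samples."""
    pairs = list(product(seqs_x, seqs_y))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def fst(seqs_x: list[str], seqs_y: list[str]) -> float:
    """FST as 1 - Hw/Hb (assumed definition; see the text above)."""
    h_within = (nucleotide_diversity(seqs_x) + nucleotide_diversity(seqs_y)) / 2
    h_between = dxy(seqs_x, seqs_y)
    return 1 - h_within / h_between


# Hypothetical 8-site alignments standing in for the two arrangements.
ost_sample = ["ACGTACGT", "ACGTACGA", "ACGTACGT"]
o34_sample = ["ACCTACGT", "ACCTTCGT", "ACCTACGT"]

print(nucleotide_diversity(ost_sample))
print(nucleotide_diversity(o34_sample))
print(dxy(ost_sample, o34_sample))
print(fst(ost_sample, o34_sample))
```

Under the expectation discussed above, restricted gene conversion at the breakpoints would appear as lower within-arrangement diversity together with higher DXY and FST at breakpoint regions than at internal regions.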
Nucleotide variation at the eight regions distributed along inversion 3 revealed a higher level of nucleotide diversity in O3 + 4 than in Ost chromosomes, suggesting a more recent origin of Ost (Navarro-Sabaté et al. 1999; Rozas et al. 1999; Munté et al. 2005). The pattern is not so clear at the breakpoint regions studied here. Our results would suggest that, even if Ost originated more recently than O3 + 4 from O3, the time elapsed between these events would not be very long. Indeed, the rather similar level of variation detected at the three copies of the DUP region (Table 3), even though DUP-AC and DUP-BD originated as a result of the inversion process, would point in this same direction.
THE BREAKPOINT REGIONS IN THE OBSCURA GROUP
Our strategy to identify inversion 3 breakpoints relied heavily on the comparison of the D. melanogaster and D. pseudoobscura genome sequences, and more specifically on the collinearity of long regions between or around the orthologs of known cytological markers in the target species D. subobscura (Figs. 1 and 2). The regions spanning the breakpoints were initially identified and sequenced in one O3 + 4 line, which confirmed that these regions were orthologous to those in both D. melanogaster and D. pseudoobscura. In D. subobscura as well as in the other two species of the subobscura species cluster (D. madeirensis and D. guanche), the AB breakpoint was flanked by the orthologs of genes modSP and Ubx, and the CD breakpoint by those of genes trp and Jon99C. Given the strong association previously detected in D. subobscura between peptidase allozyme variants and chromosomal arrangements O3 + 4 and Ost (Fontdevila et al. 1983; Prevosti et al. 1983), it is worth noting that each breakpoint is flanked by at least one gene encoding an enzymatic protein with peptidase activity (modSP and Jon99C, respectively).
The analysis of the extended AB region sequences in species of the obscura group would suggest that the inversion breakpoint is located in a small fragment with a high turnover, even though it lies within a long collinear region (∼300 kb). In the three species of the subobscura species cluster, the extended AB region is highly conserved except for an ∼1.8-kb fragment that is absent in both island species D. madeirensis and D. guanche. Interestingly, in D. subobscura one of the ends of this fragment is the breakpoint itself (i.e., the ∼300-bp fragment). It seems unlikely that both states (i.e., with and without this fragment) segregated in the ancestral D. subobscura populations (with the O3 chromosomal arrangement) during the period of more than 2 million years that separates the origin of the two island species (Ramos-Onsins et al. 1998). It seems more plausible that this fragment was gained after the split of the D. guanche lineage and likely of the D. madeirensis lineage. Moreover, the presence of this fragment in both O3 + 4 and Ost arrangements would require that the variant with this fragment had attained a rather high frequency across most of the D. subobscura O3 distribution area. Indeed, there are some indications from the current geographical frequency distribution of these arrangements that the two inversions originated in different parts of the ancestral distribution area of D. subobscura (Krimbas 1992).
In D. pseudoobscura and D. persimilis, the extended AB region provides further evidence for the high turnover of its central part in the obscura group. Indeed, in the D. persimilis genome sequence, a 5-exon coding region (1454 nt long) has been annotated. In the D. pseudoobscura genome sequence, this CDS is annotated as a 1-exon pseudogene (Fig. 4) due to a donor splice-site mutation and the resulting in-frame stop codon at the beginning of the first intron. This CDS has not been found in any of the other Drosophila species with whole genome sequences (Clark et al. 2007; Tweedie et al. 2009), and it is not found either in the AB fragment of any of the three species of the subobscura species cluster. The origin of this CDS remains an open question and its location in the AB region further suggests the possibility of a short fragment with a high turnover (Fig. 4).
We can therefore conclude that although the AB breakpoint of inversion 3 of the O chromosome could be readily identified and is not associated with any known transposable element family, it is located in a high-turnover region. In contrast, the CD breakpoint region is more stable across the obscura species group, because it has only been affected by inversion 3 in D. subobscura. However, in the extended CD region there is an ∼1-kb-long fragment around the breakpoint that exhibits the lowest level of similarity between D. subobscura and D. pseudoobscura (Fig. 4), and in D. subobscura the breakpoint itself (i.e., the ∼60-bp-long fragment) harbors considerable length variation. The question arises of how often inversion breakpoints occur in short unstable regions. This question has been recently addressed at the level of fixed rearrangements (von Grotthuss et al. 2010). Indeed, the analysis of chromosome reorganization across the Drosophila genus (i.e., across the 12 Drosophila species whose genomes were first sequenced; Clark et al. 2007) revealed that fragile regions have played a more prevalent role than functional constraints in chromosomal evolution (von Grotthuss et al. 2010). Addressing this same question at the level of polymorphic inversions awaits, however, the fine and massive characterization of polymorphic inversion breakpoints. Moreover, there are other open questions, such as those concerning the distribution of polymorphic inversions across the genome, which will also benefit from that detailed information.
We thank D. Salguero for his excellent technical assistance, Servei de Genòmica, Serveis Cientifico-Tècnics, Universitat de Barcelona, for automated sequencing facilities, and three anonymous reviewers for comments. This paper was prepared with full knowledge and support of the Barcelona Subobscura Initiative (BSI). This work was supported by grants BFU2007–63228 from Ministerio de Educación y Ciencia, Spain, and 2009SGR-1287 from Comissió Interdepartamental de Recerca i Innovació Tecnològica, Generalitat de Catalunya, Spain to MA.
- 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–1311.
- 1989. Drosophila: a laboratory handbook. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
- 2007. Evidence for large inversion polymorphisms in the human genome from HapMap data. Genome Res. 17:219–230.
- 1997. The estimation of the number and the length distribution of gene conversion tracts from population DNA sequence data. Genetics 146:89–99.
- 2008. Chromosomal rearrangements inferred from comparisons of 12 Drosophila genomes. Genetics 179:1657–1680.
- 2000. DNASTAR's Lasergene sequence analysis software. Methods Mol. Biol. 132:71–91.
- 1999. Generation of a widespread Drosophila inversion by a transposable element. Science 285:415–418.
- 2012. Segmental duplication, microinversion and gene loss associated with a complex inversion breakpoint region in Drosophila. Mol. Biol. Evol. 29:1875–1889.
- 2003. The foldback-like transposon Galileo is involved in the generation of two different natural chromosomal inversions of Drosophila buzzatii. Mol. Biol. Evol. 20:674–685.
- 1995. Molecular characterization of the breakpoints of an inversion fixed between Drosophila melanogaster and D. subobscura. Genetics 139:321–326.
- 2007. Evolution of genes and genomes on the Drosophila phylogeny. Nature 450:203–218.
- 2008. High incidence of interchromosomal transpositions in the evolutionary history of a subset of or genes in Drosophila. J. Mol. Evol. 66:325–332.
- 2009. The transposon Galileo generates natural chromosomal inversions in Drosophila by ectopic recombination. PLoS One 4:e7883.
- 2005. Discovery of human inversion polymorphisms by comparative analysis of human and chimpanzee DNA sequence assemblies. PLoS Genet. 1:e56. doi:10.1371/journal.pgen.0010056
- 1983. Genetic coadaptation in the chromosomal polymorphism of Drosophila subobscura. I. Seasonal changes of gametic disequilibrium in a natural population. Genetics 105:935–955.
- 1992a. Estimation of levels of gene flow from DNA sequence data. Genetics 132:583–589.
- 1992b. A statistical test for detecting geographic subdivision. Mol. Biol. Evol. 9:138–151.
- 1969. Evolution of protein molecules. Pp. 21–132 in H. N. Munro, ed. Mammalian protein metabolism. Academic Press, New York.
- 2008. Recent developments in the MAFFT multiple sequence alignment program. Brief. Bioinform. 9:286–298.
- 1998. Tracing the colonization of Madeira and the Canary Islands by Drosophila subobscura through the study of the rp49 gene region. J. Evol. Biol. 11:439–452.
- 1991. Inferring the evolutionary histories of the Adh and Adh-dup loci in Drosophila melanogaster from patterns of polymorphism and divergence. Genetics 127:565–582.
- 1992. The inversion polymorphism of Drosophila subobscura. Pp. 127–220 in C. B. Krimbas and J. R. Powell, eds. Drosophila inversion polymorphism. CRC Press, Boca Raton, FL.
- 1958. Weitere Untersuchungen über die chromosomale Struktur und die natürlichen Strukturtypen von Drosophila subobscura Coll. Chromosoma 9:559–570.
- 2008. Chromosomal inversions between human and chimpanzee lineages caused by retrotransposons. PLoS One 3:e4047.
- 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.
- 2010. Breakpoint structure of the Anopheles gambiae 2Rb chromosomal inversion. Malaria J. 9:293.
- 2010. A widespread chromosomal inversion polymorphism contributes to a major life-history transition, local adaptation, and reproductive isolation. PLoS Biol. 8:e1000500. doi:10.1371/journal.pbio.1000500
- 1992. MacClade: analysis of phylogeny and character evolution. Sinauer, Sunderland, MA.
- 1995. Complete sequence of the bithorax complex of Drosophila. Proc. Natl. Acad. Sci. USA 92:8398–8402.
- 1998. Cloning of inversion breakpoints in the Anopheles gambiae complex traces a transposable element at the inversion junction. Proc. Natl. Acad. Sci. USA 95:12444–12449.
- 2005. The structure and population genetics of the breakpoints associated with the cosmopolitan chromosomal inversion In(3R)Payne in Drosophila melanogaster. Genetics 170:1143–1152.
- 1987. A test for the role of natural selection in the stabilization of transposable element copy number in a population of D. melanogaster. Genet. Res. 49:31–41.
- 2005. Chromosomal inversion polymorphism leads to extensive genetic structure: a multilocus survey in Drosophila subobscura. Genetics 169:1573–1581.
- 1997. Recombination and gene flux caused by gene conversion and crossing over in inversion heterokaryotypes. Genetics 146:695–709.
- 1999. The relationship between allozyme and chromosomal polymorphism inferred from nucleotide variation at the Acph-1 gene region of Drosophila subobscura. Genetics 153:871–889.
- 1987. Molecular evolutionary genetics. Columbia Univ. Press, New York.
- 2003. Nucleotide polymorphism in the RpII215 gene region of the insular species Drosophila guanche: reduced efficacy of weak selection on synonymous variation. Mol. Biol. Evol. 20:1867–1875.
- 1997. Progress and prospects in evolutionary biology: the Drosophila model. Oxford Univ. Press, New York.
- 2009. Cloning and sequencing of the breakpoint regions of inversion 5g fixed in Drosophila buzzatii. Chromosoma 118:349–360.
- 2003. Human and mouse genomic sequences reveal extensive breakpoint reuse in mammalian evolution. Proc. Natl. Acad. Sci. USA 100:7672–7677.
- 1983. Association between allelic isozyme alleles and chromosomal arrangements in European populations and Chilean colonizers of Drosophila subobscura. Pp. 171–191 in M. C. Ratazzi, J. G. Scandalios, G. S. Whitt, eds. Isozymes: current topics in biological and medical research vol. 10: genetics and evolution. Alan R. Liss, New York.
- 2001. How malleable is the eukaryotic genome? Extreme rate of chromosomal rearrangement in the genus Drosophila. Genome Res. 11:230–239.
- 2007. Principles of genome evolution in the Drosophila melanogaster species group. PLoS Biol. 5:1366–1381.
- 1998. Molecular and chromosomal phylogeny in the obscura group of Drosophila inferred from sequences of the rp49 gene region. Mol. Phylogenet. Evol. 9:33–41.
- 2005. Comparative genome sequencing of Drosophila pseudoobscura: chromosomal, gene, and cis-element evolution. Genome Res. 15:1–18.
- 1995. Nucleotide polymorphism at the rp49 region of Drosophila subobscura: lack of geographic subdivision within chromosomal arrangements in Europe. J. Evol. Biol. 8:355–367.
- 1999. Molecular population genetics of the rp49 gene region in different chromosomal inversions of Drosophila subobscura. Genetics 151:189–202.
- 2009. Sequences signatures of a recent chromosomal rearrangement in D. mojavensis. Genetica 136:5–11.
- 1996. Differentiation of Muller's chromosomal elements D and E in the obscura group of Drosophila. Genetics 144:139–146.
- 2006. Breakpoint structure reveals the unique origin of an interspecific chromosomal inversion (2La) in the Anopheles gambiae complex. Proc. Natl. Acad. Sci. USA 103:6258–6262.
- 2005. A common inversion under selection in Europeans. Nat. Genet. 37:129–137.
- 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.
- 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood. Mol. Biol. Evol. 28:2219–2229.
- 2008. The chromosomal polymorphism linked to variation in social behavior in the white-throated sparrow (Zonotrichia albicollis) is a complex rearrangement and suppressor of recombination. Genetics 179:1455–1468.
- 2009. FlyBase: enhancing Drosophila gene ontology annotations. Nucleic Acids Res. 37:D555–D559.
- 2010. Fragile regions and not functional constraints predominate in shaping gene organization in the genus Drosophila. Genome Res. 20:1084–1096.
- 1994. Isolation and analysis of the breakpoint sequences of chromosome inversion In(3L)Payne in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 91:3132–3136.
Table S1. Probe length and distance to either marker P2 or Acph.
Table S2. Insertion–deletion polymorphism at the fragments (A, B, C, and D) spanning the breakpoints of inversion 3.
Figure S1. Fragment A sequence alignment.
Figure S2. Fragment B sequence alignment.
Figure S3. Fragment C sequence alignment.
Figure S4. Fragment D sequence alignment.
Figure S5. Fragments DUP sequence alignment.
Figure S6. Detailed scheme of the two breakpoints of inversion 3.
Figure S7. Neighbor-joining genealogy of the different copies of the proximal breakpoint region.
Supporting info item: EVO_1731_sm_suppmat.pdf (2101K).