Copy number variation in potato – an asexually propagated autotetraploid species


  • Marina Iovene,

    1. Department of Horticulture, University of Wisconsin-Madison, Madison, WI, USA
    2. CNR–Institute of Plant Genetics, Bari, Italy
  • Tao Zhang,

    1. Department of Horticulture, University of Wisconsin-Madison, Madison, WI, USA
  • Qunfeng Lou,

    1. Department of Horticulture, University of Wisconsin-Madison, Madison, WI, USA
    2. State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Jiangsu, Nanjing, People's Republic of China
  • C. Robin Buell,

    Corresponding author
    1. Department of Plant Biology, Michigan State University, East Lansing, MI, USA
    • Department of Horticulture, University of Wisconsin-Madison, Madison, WI, USA
  • Jiming Jiang

    Corresponding author
    • Department of Horticulture, University of Wisconsin-Madison, Madison, WI, USA



Copy number variation (CNV) has been revealed as a significant contributor to genetic variation in humans. Although CNV has been reported in several model animal and plant species, the presence of CNV and its biological impact in polyploid species have not yet been documented. We conducted a fluorescence in situ hybridization (FISH)-based CNV survey in potato, a vegetatively propagated autotetraploid species (2n = 4x = 48), using 18 randomly selected potato bacterial artificial chromosome (BAC) clones in a set of 16 potato cultivars with diverse breeding backgrounds. Six BACs (33%) with insert sizes of 137–145 kb were found to be associated with large CNV events detectable at the cytological level. We demonstrate that the large CNVs associated with two specific BACs (RH102I10 and RH83C08) were widespread among potato cultivars developed in North America and Europe. We measured the transcript abundance of four genes associated with the CNV spanned by BAC RH102I10. All four genes displayed a dosage effect in transcription. Although potato is vegetatively propagated, we observed that female gametes lacking the RH102I10-associated CNV were inferior to those with at least one copy of this CNV, indicating that the RH102I10-associated CNV can affect the growth and development of potato plants. Our results show that CNV is highly abundant in the potato genome and may play a significant role in the genetic variation of this important food crop.


Copy number variation (CNV) is defined as stretches of DNA from 1 kilobase (kb) to several megabases (Mb) that display different copy numbers in populations (Feuk et al., 2006). Presence/absence variation (PAV) is an extreme example of CNV, in which DNA sequences are present in one genome yet completely absent in another genome (Springer et al., 2009). CNV was discovered initially in the human genome (Iafrate et al., 2004; Sebat et al., 2004), and has been documented in other model animal species, including Drosophila melanogaster (Emerson et al., 2008) and mouse (Yalcin et al., 2011). The human Database of Genomic Variants currently contains a collection of >15 900 CNV loci, altogether covering 35% of the human genome. More importantly, many human CNVs are associated with diseases or susceptibility to diseases, either through dosage of a single gene, a contiguous set of genes, or in the case of complex diseases, allelic combinations (Henrichsen et al., 2009; Craddock et al., 2012). Thus, CNV has been demonstrated to be an important contributor to genetic variation in humans (Henrichsen et al., 2009; Stankiewicz and Lupski, 2010).

Copy number variation has recently been reported in several plant species, including maize (Springer et al., 2009; Swanson-Wagner et al., 2010), Arabidopsis thaliana (DeBolt, 2010; Cao et al., 2011), rice (Yu et al., 2011), and soybean (Haun et al., 2011; McHale et al., 2012). Analyses of multiple genotypes in Arabidopsis and maize suggest that CNV may play a significant role in phenotypic diversity and hybrid heterosis in plant species (Swanson-Wagner et al., 2010; Cao et al., 2011). In addition, several recent reports showed that duplication of a single or an array of multiple genes in plants can have a dramatic impact on growth and development (Pearce et al., 2011; Li et al., 2012; Wingen et al., 2012) or generate novel resistance to pests (Cook et al., 2012). These results suggest that CNV may contribute to genetic variation in plants similarly to humans.

Polyploidization plays a more significant role in the diversification and evolution of plants than of animals, as approximately 70% of species within the angiosperms are polyploids (Masterson, 1994). The presence of multiple copies of homologous and/or homoeologous DNA sequences in both autopolyploids and allopolyploids presents a technical challenge for CNV analysis. CNV has not been documented in any of the classical polyploid plant species. Here, we report a FISH-based CNV survey in potato (Solanum tuberosum, 2n = 4x = 48), a vegetatively propagated autotetraploid species. We demonstrate that CNV is highly abundant in the potato genome. Cultivated potato, due to its vegetative propagation through tubers and autopolyploid nature, has accumulated a large number of CNV loci. Understanding the role of CNVs in potato genetic variation may hold the key for future breeding of this important food crop.


Copy number variation of BAC-sized DNA fragments in potato revealed by FISH

We used a BAC FISH-based approach to explore potential CNV in potato. A set of 18 BAC clones with insert sizes ranging from 110 to 163 kb that mapped to the euchromatic regions of potato chromosome 6 (Iovene et al., 2008) were hybridized to the metaphase chromosomes of two tetraploid potato cultivars, Atlantic and Katahdin (Table S1). If the potato genomic DNA within the BAC clone does not contain duplicated sequences, it will hybridize to a specific position on all four copies of chromosome 6. Indeed, most BACs generated the expected four FISH foci on all four chromosomes in both Atlantic and Katahdin (Figure 1). However, five BACs generated only 1–3 FISH foci in either Atlantic and/or Katahdin (Table 1), suggesting a deletion of the corresponding fragment on the chromosome(s) that lacks the FISH signal. One additional BAC, RH94G20, hybridized to the long arm of all chromosome 6 homologs yet this BAC consistently hybridized to the pericentromeric region of a second chromosome (Figure S1), suggesting a duplication of the RH94G20-associated sequences elsewhere in the genome.

Table 1. Potato chromosome 6-specific BACs associated with copy number variations (CNVs) in Katahdin and Atlantic

                                                           No. of chromosomes carrying FISH foci (b)
BAC clone   Size (kb)   Location on linkage map (cM) (a)   Katahdin   Atlantic
RH69B12     144.0       10.7                               3          2 + 2
RH160C14    137.2       15.2                               2 + 2      1 + 3
RH83C08     144.7       37.3                               1 + 3      1 + 1

  a  Potato linkage group 6 contains a total of 53 cM (Iovene et al., 2008).
  b  Signal counts reported as 2 + 2, 1 + 3, etc. stand for two major signals plus two minor ones, one major signal plus three minor ones, and so on.

Figure 1.

Examples of FISH-based CNV survey in Atlantic and Katahdin. (a) FISH mapping of BAC RH88J22 (red) together with reference BAC RH60H14 (green) on metaphase chromosomes of Atlantic. FISH signals from BAC RH88J22 were detected on all four chromosome 6 homologs. (b) FISH mapping of BAC RH160K03 (red) together with reference BAC RH60H14 (green) on metaphase chromosomes of Katahdin. FISH signals from BAC RH160K03 were detected on all four chromosome 6 homologs. Bar = 10 μm.

Three of these six BACs generated FISH signals that were consistently different in size and intensity on the four chromosome 6 homologs (Table 1), suggesting that a partial deletion of the fragment is associated with the chromosome having a weak/low FISH signal (see below for BAC RH83C08). However, quantification of the size and intensity of the minor FISH signals is difficult as most BACs contain some repetitive sequences that generate background FISH signals. Thus, a weak/low FISH signal may not be unambiguously and/or consistently distinguished from the background signals.

CNVs associated with BACs RH102I10 and RH83C08 are widespread in potato cultivars

BAC RH102I10 (138 kb) generated only two FISH foci in both Atlantic and Katahdin. To determine the extent of this CNV within cultivated potato, we conducted FISH analysis of RH102I10 in 16 potato cultivars developed by different breeding programs in North America and Europe and in three diploid potato clones (2n = 2x = 24) (Table 2). Only one of the 16 cultivars, Juanita, showed four FISH foci; the other 15 cultivars showed 1–3 foci, and all three diploid clones showed two foci (Table 2 and Figure 2).

Table 2. FISH survey of BACs RH102I10 and RH83C08 among 16 potato cultivars

                                                       No. of FISH signals from
4x potato cultivar (abbreviation)   Origin             RH102I10   RH83C08 (b)
Atlantic (Atl)                      North America      2          1 + 1
Atzimba (Atz)                       Mexico             3          2 (c)
Freedom Russet (FR)                 North America      1          3
Hindenburg (Hind)                   Germany            2          2 + 1
Juanita (Juan)                      Mexico             4          2
Katahdin (Kat)                      North America      2          1 + 3
Kennebec (Kenn)                     North America      2          4 (c)
Kerr's Pink (KP)                    United Kingdom     3          2 (c)
MegaChip (MC)                       North America      2          na
Norkotah (NHK)                      North America      2          3
Ranger Russet (RR)                  North America      1          2 + 1
Roslin Eburu (RE)                   United Kingdom     3          2
Russet Burbank (RB)                 North America      2          na
Snowden (Snowd)                     North America      2          1
Superior (Sup)                      North America      3          1 + 1
White Pearl (WP)                    North America      2          1 + 1

2x clones
DM (a)                                                 2          2
RH                                                     2          1
SH                                                     2          1

  a  DM is a homozygous doubled monoploid clone that has been fully sequenced (The Potato Genome Sequencing Consortium, 2011).
  b  na: not available.
  c  Atzimba: two signals in 73% of the observations (n = 37); Kennebec: four signals (or 3 + 1) in 67% of the observations (n = 30); Kerr's Pink: two signals in 56% of the observations (n = 16).

Figure 2.

Copy number variation associated with BAC RH102I10 in four different potato cultivars. (a) Juanita with four copies. (b) Superior with three copies. (c) Snowden with two copies. (d) Ranger Russet with one copy. Red arrows point to the FISH signals derived from BAC RH102I10. Green FISH signals were derived from reference BAC RH60H14 hybridized to all four chromosome 6 homologs in all four cultivars. Bar = 5 μm.

A second BAC clone, RH83C08 (145 kb), generated one normal and three very weak signals in Katahdin, and one normal and one very weak signal in Atlantic (Table 1). RH83C08 generated high background FISH signals and, as a consequence, the size and intensity of the minor signals on some chromosome 6 homologs were similar to background signals. However, many of these minor signals can be identified based on the reference FISH signals derived from RH60H14, an overlapping BAC from the chromosome 6 BAC tiling path (Figure 3). A partial deletion of the sequences within RH83C08 is most likely associated with the chromosome 6 homologs that exhibit a minor FISH signal. Only a single potato cultivar, Kennebec, had four major RH83C08 foci; the other cultivars exhibited combinations of 1–3 major signals and 1–3 minor signals (Figure 3 and Table 2). In the two diploid clones, RH and SH, only a single major FISH signal was observed (Table 2), whereas in the doubled monoploid, DM, two foci were observed. These results show that the CNV loci spanned by BACs RH102I10 and RH83C08 are present widely in tetraploid potato germplasm.

Figure 3.

Copy number variation associated with BAC RH83C08. (a) FISH mapping of BAC RH83C08 (red) together with reference BAC RH60H14 (green) on metaphase chromosomes of Katahdin. (b) Digitally separated FISH signals from BAC RH83C08. One major signal (arrow) and three minor signals (arrowheads) are marked. (c) Digitally separated FISH signals from BACs RH83C08 and RH60H14. (d) FISH mapping of BAC RH83C08 (red) together with reference BAC RH60H14 (green) on metaphase chromosomes of Ranger Russet. (e) Digitally separated FISH signals from BAC RH83C08. Two major signals (arrows) and one minor signal (arrowhead) are marked. (f) Digitally separated FISH signals from BACs RH83C08 and RH60H14.

Copy number variant associated with BAC RH160C14 spans >200 kb DNA

BACs RH69B12, RH160C14, RH102I10 and RH83C08 were associated with CNVs in Katahdin and Atlantic (Table 1). We selected several BACs that partially overlap with these four BACs in the chromosome 6 tiling path to investigate if the overlapping BACs extend the original CNVs. BACs RH6L15 (overlapping with RH69B12), RH147M20 (overlapping with RH160C14 on one end), RH134F01 (overlapping with RH102I10), and RH127J15 (flanking RH83C08 with an estimated gap of 23 kb) produced four FISH foci in Katahdin and Atlantic (Table S1). Thus, these BACs do not extend the CNVs (Figure 4(a)). However, BAC RH87P14 (overlapping with RH160C14 on the opposite end) showed a FISH pattern identical to RH160C14 (Figure S2), indicating that RH87P14 and RH160C14 span the same CNV (Figure 4(b)). Based on the overlap between RH87P14 and RH160C14, this CNV spans approximately 216 kb. Notably, RH87P14 generated only a single FISH signal in the diploid clone RH (Figure 4(b)), whereas RH160C14 produced the expected two FISH signals in RH, suggesting that the two chromosome 6 homologs of the RH clone contain different variants of this CNV locus.

Figure 4.

Schematic illustration of FISH verification of extension of the CNV loci represented by BACs RH102I10 and RH160C14. (a) BAC RH134F01, which overlaps with RH102I10, does not extend the CNV represented by RH102I10. (b) BACs RH147M20 and RH87P14 flank BAC RH160C14. RH147M20 does not extend the CNV. By contrast, RH87P14 extends the CNV in both Atlantic and Katahdin. In the diploid clone RH, only RH87P14 is associated with CNV.

Transmission of RH102I10-associated CNV in potato

We were interested in the transmission of the CNVs identified in Katahdin and Atlantic. The BAC clone RH102I10 hybridized to only two of the four chromosomes in both cultivars. We conducted FISH analysis with this BAC in 17 Atlantic haploids (2n = 2x = 24) and four Katahdin haploids. These haploid clones were derived from the female gametes of the corresponding tetraploid potato genotypes by crossing with ‘haploid extraction clones’ of diploid Solanum species (Hougas and Peloquin, 1957; Hougas et al., 1964). Thus, each haploid receives two of the four copies of chromosome 6 from its parental clone. If the four chromosome 6 homologs segregate randomly and are transmitted equally to the female gametes, three types of haploids are expected, with 0, 1, or 2 RH102I10 FISH signals, in a ratio of 1:4:1. Strikingly, the observed ratio among the 21 haploids was 0:16:5 (Table 3). Thus, the two chromosome 6 homologs lacking the RH102I10 DNA fragment were not simultaneously transmitted to any of the 21 haploids analyzed, a frequency significantly lower than expected from the predicted ratio of 1 (haploids with no signal) : 5 (haploids with signal) (binomial test, P < 0.02174).

Table 3. Transmission of RH102I10-associated copy number variants to haploid Atlantic and Katahdin clones

                         Numbers of haploids with
Haploid origin           0 FISH signals   1 FISH signal   2 FISH signals
Atlantic haploids (17)   0                13              4
Katahdin haploids (4)    0                3               1
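
The expected 1:4:1 ratio and the probability of recovering no zero-signal haploid can be checked with a short enumeration. The following is a minimal sketch, not the authors' analysis code: it assumes that each haploid receives any two of the four chromosome 6 homologs with equal probability, and the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

# Four chromosome 6 homologs of a cultivar with two RH102I10 foci:
# 1 = homolog carries the RH102I10 fragment, 0 = homolog lacks it.
homologs = [1, 1, 0, 0]

# Assumption: every pair of homologs is equally likely to reach a haploid.
pairs = list(combinations(range(len(homologs)), 2))
counts = {0: 0, 1: 0, 2: 0}
for i, j in pairs:
    counts[homologs[i] + homologs[j]] += 1

# Haploids with 0 : 1 : 2 FISH signals.
expected_ratio = (counts[0], counts[1], counts[2])

# Probability that none of n independent haploids falls in the 0-signal class.
p_zero_signal = Fraction(counts[0], len(pairs))
n_haploids = 21
p_none_observed = float((1 - p_zero_signal) ** n_haploids)

print("expected ratio:", expected_ratio)
print("P(no 0-signal haploid among 21):", p_none_observed)
```

Under these assumptions the zero-signal class has probability 1/6, so the chance of missing it in all 21 haploids is (5/6)^21, which is about 0.0217 and agrees with the value reported for the binomial test.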

We also analyzed the pedigrees of four of the cultivars examined by FISH using RH102I10. For Kennebec, MegaChip, Ranger Russet, and White Pearl, FISH data were available for the parental or other clones within the pedigree (Figure S3). All sampled clones had at least one copy of RH102I10, suggesting that a minimum copy number of one is a requisite for the level of vigor permitted in cultivated tetraploid potato.

Transcriptional analysis of genes associated with the CNV locus spanned by BAC RH102I10

The BAC clone RH102I10 aligned to superscaffold PGSC0003DMB000000461 of the DM reference genome (The Potato Genome Sequencing Consortium, 2011) and a total of 19 genes were annotated within the region spanning the 138-kb insert of RH102I10. We selected four single copy genes from this region (Table S2, PGSC0003DMG400017574 (P574), PGSC0003DMG400017575 (P575), PGSC0003DMG400017577 (P577), PGSC0003DMG400017582 (P582)) to examine the impact of CNV on gene expression. All four of these genes are broadly expressed across a wide range of developmental, abiotic stress, and biotic stress tissues (The Potato Genome Sequencing Consortium, 2011) and we used quantitative RT-PCR (qRT-PCR) of leaf tissues from 18 haploid/diploid (2x) potato clones and 16 tetraploid (4x) potato cultivars to assess the impact of CNV on transcript abundance. Transcripts of the α-tubulin and actin-97 genes were used separately to normalize the relative abundance of each target transcript in each potato line using six replications (two biological replications, each with three technical replications). Normalized transcript levels for the four genes are shown in Figures 5, S4 and S5. The non-parametric Kolmogorov–Smirnov (K–S) test was used to evaluate differences in the gene expression levels between genotypes with the same ploidy level (2x or 4x) but with different copy numbers of RH102I10. Specifically, within the 2x genotypes, the comparison was performed between genotypes with two copies of RH102I10 and genotypes with only a single copy of RH102I10. Similarly, for 4x lines, comparisons were made between genotypes with four copies versus three copies, three copies versus two copies, and two copies versus one copy of RH102I10. The results were considered significant when the P-values were less than 0.01 in both datasets that were normalized by the α-tubulin and actin-97 genes, respectively.
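
As an illustration of the comparison described above, the two-sample K–S statistic D is the largest gap between the empirical cumulative distribution functions of two groups. The sketch below is an assumption-laden illustration rather than the authors' code (the study used the R environment): it computes only D, not a P-value, and the expression values are invented for demonstration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D: the largest vertical gap
    between the empirical cumulative distribution functions of the samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The largest gap between two step functions occurs at a data point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical normalized expression values (invented, not from the study).
two_copies = [0.90, 1.10, 1.00, 1.20, 0.95, 1.05]
one_copy = [0.45, 0.55, 0.50, 0.60, 0.40, 0.52]

d_separated = ks_statistic(two_copies, one_copy)
d_identical = ks_statistic(two_copies, two_copies)
print(d_separated, d_identical)
```

D ranges from 0 (identical empirical distributions) to 1 (no overlap between the groups). Converting D to a P-value depends on the sample sizes and is left to a statistics package.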

Figure 5.

Examples of normalized expression levels of genes associated with BACs RH102I10 and RH60H14. (a) Expression of gene P582 associated with RH102I10 in diploid (2x) lines. The expression level of the group of genotypes with two copies of BAC RH102I10 (purple boxes) is significantly higher than that of the group with one copy (pink boxes; K–S test, P-value < 7.9 × 10⁻¹⁸). (b) Expression of gene P582 in tetraploid (4x) lines. Yellow, green, light blue and red boxes are tetraploid lines with four, three, two and one copy of RH102I10, respectively. Groups containing genotypes with more copies of BAC RH102I10 showed a higher expression level than groups containing genotypes with fewer copies. (c) Expression of gene P963 associated with RH60H14 in 2x lines. (d) Expression of gene P963 in 4x lines. The expression levels between groups in (c) and (d) are not significantly different. Gene expression was normalized to the α-tubulin gene. For each potato line, the individual box plot, mean, and standard deviation were obtained using normalized CT values of six data points (two biological replicates, three technical replicates each). The Y-axis represents normalized expression level. The full names of the 4x genotypes are listed in Table 2.

The K–S tests indicate that, at each ploidy level, genotypes with more copies of RH102I10 consistently showed significantly higher transcript levels of all four genes than genotypes with fewer copies. For example, 2x potato clones with two copies of RH102I10 showed a significantly higher expression level of P582 than the 2x clones with only a single copy of RH102I10 (Figure 5(a)). Similarly, the 4x potato cultivars with a single copy of RH102I10 showed the lowest expression level of P582 among the 4x potato cultivars (Figure 5(b)). Similar results were obtained for the other three genes (Figures S4 and S5, see Table 4 for a list of P-values). Based on linear regression analysis, we found that the normalized transcript levels of these four genes correlated positively and significantly with RH102I10 copy number in different genotypes (Figure 6). This pattern was consistent for both reference genes used for normalization (Table 4).

Table 4. P-values from Kolmogorov–Smirnov (K–S) statistical tests (a)

Column headings (b)
Gene   2x_2 vs.1 tub  2x_2 vs.1 act  4x_4 vs.3 tub  4x_4 vs.3 act  4x_3 vs.2 tub  4x_3 vs.2 act  4x_2 vs.1 tub  4x_2 vs.1 act
P574   1.27E-11       6.77E-08       0.000313828    6.77E-05       5.92E-06       1.10E-07       2.97E-09       4.88E-08
P575   2.81E-14       4.44E-07       0.000642592    0.001272634    2.79E-06       1.31E-12       1.32E-07       2.18E-06
P577   9.79E-16       1.67E-06       1.45E-06       4.79E-06       3.45E-07       1.80E-06       3.61E-07       4.05E-06
P582   7.88E-18       2.19E-11       0.001511349    0.002940765    4.08E-06       1.32E-08       5.19E-08       5.19E-08
P963   0.825052967    0.740462144    0.732209714    0.614459053    0.106455109    0.01082814     0.033071251    0.001903027
P047   0.012644588    0.11330165     0.548811636    0.765928338    0.04003072     0.01045358     0.819860816    0.575955926
P960   0.54513506     0.513820908    0.094690381    0.45777143     0.303530368    0.203722257    0.010544103    0.014864875

  a  Comparisons of the level of gene expression were made between groups of potato lines with the same ploidy but different copy numbers of the RH102I10 DNA fragment. The first four genes (P574 to P582) are located within the RH102I10 DNA fragment. The last three genes, which map to a non-CNV region (BAC RH60H14), are used as controls. Expression levels were obtained using alpha-tubulin and actin-97 as reference genes. Differences in the expression level of the different groups were considered statistically significant when the obtained P-values were <0.01 for both delta Ct data sets calculated using alpha-tubulin and actin-97 for normalization.
  b  2x = diploid lines; 4x = tetraploid lines; 2 vs. 1, …, 4 vs. 1 = two copies vs. one copy of the RH102I10 DNA fragment, …, four copies vs. one copy of the RH102I10 DNA fragment; tub = Ct values normalized using alpha-tubulin; act = Ct values normalized using actin-97.

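
The significance rule stated for Table 4 (P < 0.01 under both the alpha-tubulin and the actin-97 normalization) can be applied mechanically. The sketch below is illustrative only: the dictionary layout and function name are assumptions, and the P-values are copied from the Table 4 rows for one CNV gene (P582) and two control genes (P963, P960).

```python
ALPHA = 0.01  # threshold stated for both normalizations

# (tub, act) P-values per comparison, copied from Table 4.
table4 = {
    "P582": {
        "2x 2 vs 1": (7.88e-18, 2.19e-11),
        "4x 4 vs 3": (0.001511349, 0.002940765),
        "4x 3 vs 2": (4.08e-06, 1.32e-08),
        "4x 2 vs 1": (5.19e-08, 5.19e-08),
    },
    "P963": {
        "2x 2 vs 1": (0.825052967, 0.740462144),
        "4x 4 vs 3": (0.732209714, 0.614459053),
        "4x 3 vs 2": (0.106455109, 0.01082814),
        "4x 2 vs 1": (0.033071251, 0.001903027),
    },
    "P960": {
        "2x 2 vs 1": (0.54513506, 0.513820908),
        "4x 4 vs 3": (0.094690381, 0.45777143),
        "4x 3 vs 2": (0.303530368, 0.203722257),
        "4x 2 vs 1": (0.010544103, 0.014864875),
    },
}


def significant(p_tub, p_act, alpha=ALPHA):
    """A comparison counts only if both normalizations fall below alpha."""
    return p_tub < alpha and p_act < alpha


calls = {
    gene: {comparison: significant(*p) for comparison, p in rows.items()}
    for gene, rows in table4.items()
}

for gene, rows in calls.items():
    print(gene, rows)
```

Requiring agreement between the two reference genes guards against a call driven by one normalizer alone; for example, P963 in the 4x two-versus-one comparison passes with actin-97 (0.0019) but not with alpha-tubulin (0.033). The P960 values for the same comparison (0.0105 and 0.0149) sit just above the 0.01 cut-off, so the strict rule does not flag them.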
Figure 6.

Correlation between the copy numbers of the RH102I10-associated CNV and the expression levels of gene P582 in groups containing 4x genotypes with different copy numbers of BAC RH102I10 (four copies, n = 1; three copies, n = 4; two copies, n = 9; one copy, n = 2). The copy number of the region is represented on the X-axis; expression level (normalized by actin-97) is reported on the Y-axis. Box plots of the normalized expression level of this gene were obtained by combining the expression levels of tetraploid lines with the same copy number of RH102I10. The expression level of P582 is significantly and linearly correlated with the copy number of the RH102I10 DNA fragment (R² = 0.7108).

As a control, we also analyzed the expression level of three genes (Table S2, PGSC0003DMG400026963 (P963), PGSC0003DMG400027047 (P047), PGSC0003DMG401026960 (P960)) that mapped to the genomic region corresponding to RH60H14, a BAC clone that hybridized to all four chromosome 6 homologs in all potato cultivars examined and therefore shows no CNV (Figure 1). The same genotypes were used to perform similar K–S tests within each ploidy level (Figures 5(c,d), S6 and S7) and no significant differences were detected in the 2x clones (Table 4). Among the 4x cultivars, a single case of significant difference was found, for the expression level of P960 between cultivars with one copy and cultivars with two copies of RH102I10 (P < 0.015; Table 4). No significant differences were detected in any other comparisons (Table 4).


There is limited information on the extent of CNVs in plant genomes. Sequencing of multiple A. thaliana lines revealed 1059 CNV loci covering 2.2 Mb of the reference Col-0 genome (Cao et al., 2011). These CNV loci account for approximately 2% of the A. thaliana genome with only 393 CNVs overlapping with coding sequences (Cao et al., 2011). A microarray-based survey among 19 diverse maize inbreds and 14 genotypes of teosinte, the progenitor of maize, revealed that approximately 8% of the maize genes are associated with CNVs (Swanson-Wagner et al., 2010). Our FISH-based survey revealed that six of the 18 (33%) randomly selected potato BACs were associated with CNVs. However, this survey most likely underestimates the extent of CNVs in the potato genome for several reasons. First, the FISH technique is limited in both resolution and sensitivity. As a consequence, a CNV associated with only a portion of the cloned BAC fragment may not be revealed by the FISH method. Small CNVs, such as those detected in A. thaliana using re-sequencing approaches, will be missed by the FISH method. Second, our estimates of the length of each CNV locus most likely fall short of the true CNV size, as the FISH probes were limited to the fragment length within the BAC clones. Indeed, examination of an overlapping BAC from the chromosome 6 minimal tiling path extended the size of one estimated CNV locus. Third, the 18 BACs used in the CNV survey are all located within the euchromatic regions of potato chromosome 6 (Iovene et al., 2008). Low-recombination regions in maize showed elevated frequencies of CNV (Swanson-Wagner et al., 2010), and the pericentromeric heterochromatin of potato, in which recombination is suppressed, may contain more CNVs than we observed by FISH using BACs derived from euchromatic regions. Thus, although we have demonstrated an extensive degree of CNV in terms of frequency and size of the CNV loci, we have most likely underestimated the number and diversity of CNVs in the potato genome.

A haploid-based transmission study revealed that female gametes with fewer copies of the RH102I10-associated CNV may be inferior to those with more copies of the CNV (Table 3). The Atlantic and Katahdin haploids used in the analysis were selected by the Wisconsin potato breeding program from a large number of haploids based mainly on their vigor and fertility. All 21 Atlantic and Katahdin haploids analyzed contain at least one copy of RH102I10. Thus, haploids lacking both copies of RH102I10 may never be produced. The RH102I10-associated CNV may contain genes important to female gamete function, so that gametes may need at least one copy of RH102I10 to be functional. Alternatively, haploids lacking any copy of RH102I10 may be inferior in vigor and may all have been eliminated during selection. Interestingly, qRT-PCR analysis showed that all four genes associated with this CNV display a dosage effect in transcript levels (Figure 6). In general, more copies of these genes result in more transcripts, which may ultimately affect gamete fitness or plant vigor. If a gene(s) in such large CNVs plays a regulatory role, then the CNV, which is similar to a segmental aneuploid, may have a stronger impact on growth and development because the CNV will affect the expression of other genes in the same regulatory pathway (Birchler and Veitia, 2007).

The CNVs detected in A. thaliana and maize by DNA sequencing and microarrays were primarily a few kb in length (Swanson-Wagner et al., 2010; Cao et al., 2011). In contrast, the CNVs detected by FISH using BAC clones in potato are large, greater than 100 kb. The autotetraploid nature of the potato genome provides great potential for retention of deleterious and dysfunctional mutations and deletions compared to diploid species. Furthermore, vegetative propagation enhances the retention rate of mutations that negatively impact gamete development and transmission. The accumulation of mutations in potato is also reflected by the fact that inbreeding of most potato cultivars results in dramatic yield depression and, primarily, in sterile and weak progeny. We hypothesize that large CNVs are less likely to survive in sexually reproducing diploid species and predict that vegetatively propagated, autopolyploid species such as potato have a higher frequency of CNVs than the majority of diploid and allopolyploid plant species.

Cultivar improvement was the most important contribution in the twentieth century to yield increases in the USA for most major crops, including maize, wheat, barley, and cotton (Fehr, 1984). However, an assessment of potato cultivars developed in the nineteenth century compared to those developed in the USA from 1932 to 1991 revealed no yield difference under modern field management practices (Douches et al., 1996). The six-fold yield increase of potato in the USA during the twentieth century was attributed to the use of disease-free seed, nitrogen fertilizer, irrigation, and pest management (Douches et al., 1996). It is intriguing that modern plant breeding did not result in significant yield improvement in potato during the same period while it led to advances for nearly every other major crop. In fact, Russet Burbank, a clonal selection of ‘Burbank’ released in 1874, still accounts for nearly half of the United States potato acreage.

The vigor and yield of potato cultivars rely on breeders' efforts to maintain the ‘maximum heterozygosity’ in the genome (Carputo et al., 2003). However, the molecular basis of the maximum heterozygosity has been obscure. An early study of enzyme polymorphism in 94 North American potato cultivars revealed an average of 2.13 alleles per locus (Douches et al., 1989); this allelic complexity was supported in subsequent sequence-based analyses (Simko et al., 2004; Kuang et al., 2005). These early studies suggest that the maximum heterozygosity is probably composed of the maximum number of alleles at each locus and possibly non-additive genetic variance. Our current study shows that CNV, including extreme CNV in which three of the four copies of a locus are absent, may be another significant contributor to potato maximum heterozygosity. A significant number of deletions have accumulated in the potato genome, as shown at the genome level in the analysis of the doubled monoploid DM and the heterozygous diploid RH (The Potato Genome Sequencing Consortium, 2011) and now in this study by FISH with cultivated potato. These deletions, although not lethal, can have a negative effect on plant vigor, especially if genes associated with CNV have dosage effects. Thus, a high-yielding potato cultivar may need to combine not only superior alleles, but also advantageous CNV loci. This finding may explain why modern potato breeding has been largely ineffective in developing high-yielding cultivars. Genome-wide mapping and characterization of CNVs will play an important role in future breeding of potato.

Experimental Procedures


In total, 16 potato cultivars developed by different breeding programs in North America and Europe (Table 2) were chosen for a FISH-based CNV survey. Haploids of Atlantic and Katahdin were developed and maintained by the University of Wisconsin potato breeding program. The homozygous doubled monoploid clone DM was derived from anther culture (Lightbourn and Veilleux, 2007) and is the source for the potato reference genome (The Potato Genome Sequencing Consortium, 2011). Diploid clones RH and SH were parental clones for a potato genetic mapping population (van Os et al., 2006). The BAC clones (Table S1) used in CNV survey have been mapped previously to the euchromatic regions of potato chromosome 6 (Iovene et al., 2008).


Chromosome preparation and FISH were performed following published protocols (Dong et al., 2000; Iovene et al., 2008). BAC DNA was labeled with either biotin-16-dUTP or digoxigenin-11-dUTP (Roche Diagnostics, Indianapolis, IN, USA) using a standard nick translation reaction. Chromosomes were counterstained with 4′,6-diamidino-2-phenylindole (DAPI) in VectaShield antifade solution (Vector Laboratories, Burlingame, CA, USA). The FISH images were processed using Meta Imaging Series 7.5 software. The final contrast of the images was adjusted using Adobe Photoshop CS3 software.

Quantification of transcripts of genes associated with CNVs

Using annotation of the DM genome (The Potato Genome Sequencing Consortium, 2011), primers were designed for four annotated DM genes that were collinear with RH102I10 and three genes collinear with RH60H14 (Table S2). All primer efficiencies were estimated using template dilutions and the equation E = 10^(−1/slope); all were in the 1.8–2.2 range. qRT-PCR was performed using two biological replicates and three technical replicates. For each genotype, two or three tubers were planted in the greenhouse. Seven to 12 days after emergence, one or two fully expanded terminal leaflets from the top of each plant were collected and immediately frozen in liquid nitrogen. Leaf tissue from two or three plants of the same genotype was pooled and RNA was extracted using the Qiagen plant RNeasy kit (Valencia, CA, USA) with on-column DNase digestion according to the manufacturer's instructions. RNA was then treated with Turbo DNA-free (Ambion/Applied Biosystems, Houston, TX, USA), and RNA quality/quantity and integrity were evaluated using a NanoDrop spectrophotometer and agarose gel electrophoresis, respectively. SuperScript III reverse transcriptase (Invitrogen, Carlsbad, CA, USA) and oligo(dT) primers were used to generate the first-strand cDNA. All RT-PCR reactions and subsequent amplicon melting curves were performed in triplicate using Dynamo SYBR Green master on an Opticon 2 instrument (MJ Research, Waltham, MA, USA).
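
The efficiency estimate can be sketched as follows: Ct is regressed on log10 of the template amount, and the fitted slope is converted to an efficiency with E = 10^(−1/slope). This is a minimal illustration under assumptions, not the authors' procedure in detail; the dilution series and Ct values are invented, and the function names are hypothetical.

```python
import math


def slope_of_line(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def primer_efficiency(dilutions, ct_values):
    """E = 10^(-1/slope), with slope from Ct versus log10(template amount)."""
    log_amounts = [math.log10(d) for d in dilutions]
    return 10 ** (-1.0 / slope_of_line(log_amounts, ct_values))


# Hypothetical ten-fold dilution series; Ct values invented for illustration.
dilutions = [1.0, 0.1, 0.01, 0.001]
ct_values = [18.0, 21.4, 24.8, 28.2]

efficiency = primer_efficiency(dilutions, ct_values)
within_range = 1.8 <= efficiency <= 2.2
print(efficiency, within_range)
```

An efficiency of 2 corresponds to perfect doubling per cycle (a slope of about −3.32 per ten-fold dilution); the 1.8–2.2 window quoted in the text brackets that ideal.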

The comparative Ct method and the reference genes α-tubulin and actin-97 were used to calculate the normalized expression level, according to the formula 2^(−ΔCT), where ΔCT = CT (target gene, using the CT of a single technical replicate) − CT (actin-97 or α-tubulin; for each genotype the CT value of each reference gene was obtained by averaging the CT values of the three technical replicates). Thus, for each genotype, six ΔCT values (three for each biological replicate), normalized either by α-tubulin or actin-97, were obtained and used in the subsequent analysis. Statistical analysis was performed in the R statistical analysis environment (Yuan et al., 2006). Within each ploidy group (2x and 4x), the non-parametric K–S statistical test was used to test differences in the gene expression pattern between groups of genotypes with different copy numbers of RH102I10. Specifically, within the diploid lines, the comparison was carried out between the expression pattern of the genotypes with two copies of the RH102I10 DNA fragment and genotypes with a single copy. Similarly, for the tetraploids, comparisons were made between genotypes with four copies versus three copies, three copies versus two copies, and two copies versus one copy. Differences in the expression pattern of different ploidy-CNV groups were considered statistically significant when the obtained P-values were < 0.01 for both ΔCT datasets calculated using α-tubulin and actin-97 for normalization. As a control, a similar analysis using the same ploidy-CNV groups was performed with expression patterns of genes mapping outside the CNV region.
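
The normalization described above can be sketched in a few lines. This is an illustrative reading of the text, not the authors' script: each target technical replicate is compared with the mean reference Ct of the same biological replicate, giving three values per biological replicate and six per genotype. All Ct values and names below are hypothetical.

```python
from statistics import mean


def normalized_expression(target_cts, reference_cts):
    """Return 2^(-dCt) for each target technical replicate, where
    dCt = Ct(target replicate) - mean Ct(reference technical replicates)."""
    ref_mean = mean(reference_cts)
    return [2.0 ** -(ct - ref_mean) for ct in target_cts]


# Hypothetical Ct values for one genotype: two biological replicates with
# three technical replicates each (numbers invented for illustration).
bio_reps = [
    {"target": [24.0, 24.5, 23.5], "reference": [20.0, 20.5, 19.5]},
    {"target": [25.0, 24.0, 24.5], "reference": [20.5, 20.5, 20.5]},
]

values = []
for rep in bio_reps:
    values.extend(normalized_expression(rep["target"], rep["reference"]))

print(values)  # six normalized expression values for this genotype
```

Running the same function once with α-tubulin and once with actin-97 reference Ct values would give the two parallel datasets on which the significance rule is evaluated.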


This work was supported by grants DBI-0923640 and ISO-1237969 from the National Science Foundation to C.R.B. and J.J.