• Open Access

Selectable marker genes and unintended changes to the plant transcriptome


  • Brian Miki,

    Corresponding author
    1. Eastern Cereals and Oilseeds Research Centre, Agriculture and Agri-Food Canada, Ottawa, ON, Canada, K1A 0C6
      * Correspondence (fax 613 7591701; e-mail mikib@agr.gc.ca)
  • Ashraf Abdeen,

    1. Eastern Cereals and Oilseeds Research Centre, Agriculture and Agri-Food Canada, Ottawa, ON, Canada, K1A 0C6
  • Yuzuki Manabe,

    1. Eastern Cereals and Oilseeds Research Centre, Agriculture and Agri-Food Canada, Ottawa, ON, Canada, K1A 0C6
  • Phil MacDonald

    1. Biotechnology Environmental Release Assessments, Canadian Food Inspection Agency, Room 111, 159 Cleopatra Drive, Ottawa, ON, Canada, K1A 0Y9



The intended effect of a selectable marker gene is to confer a novel trait that allows for the selection and recovery of transgenic plants. Unintended effects may also occur as a result of interactions between the selectable marker gene or its regulatory elements and genetic elements at the site of insertion. These are called position effects. Other unintended effects may occur if the selectable marker gene has a range of pleiotropic effects related to the functional and regulatory domains within the coding region or the regulatory elements used to drive expression. Both pleiotropic and position effects may generate unpredictable events depending on the process used for transgenesis and the state of knowledge associated with the selectable marker gene. Although some selectable marker genes, such as the neomycin phosphotransferase type II gene (nptII), have no pleiotropic effects on the transcriptomes of transgenic plants, others, such as the bialaphos resistance gene (bar), have pleiotropic effects. These must be clearly understood and accounted for when evaluating the expression patterns conferred by other co-transforming transgenes under study. The number and kinds of selectable marker genes are large. A detailed understanding of their unintended effects is needed to develop transgenic strategies that will minimize or eliminate unintended and unpredictable changes to plants with newly inserted genes.


The adoption of transgenic technologies in basic research and crop development has been very rapid and extensive. Consequently, many of the early experiments with transgenic plants were performed without a complete understanding of the kinds of changes that could be induced by the transformation process, and those directly attributable to the transgene. The terminology used to describe the specific effects of transgenes on plants varies in the literature and can be confusing. For both the generation of basic knowledge and pre-commercialization research, it is important to have clarity on this subject.

In risk assessments of transgenic plants, the terms intended vs. unintended and predictable vs. unpredictable (Table 1) are often used to describe the traits conferred by the transgenes. The goal of the terminology is to describe whether the transgenic plants differ from their traditional counterparts or whether they are substantially equivalent. The concept of substantial equivalence was developed as a guiding principle to start a risk assessment process for genetically modified foods, and utilizes a comparator that has been employed in practice and is accepted as being safe [Organisation for Economic Co-operation and Development (OECD), 1993]. The technologies used to measure the extent of these effects usually focus on traits that are important to the crop under development and the intended effect of the trait that is introduced. The use of substantial equivalence involves assessments of both the agronomy and biology of the transgenic crop, with an emphasis on life history traits as well as the key nutrients and anti-nutrients in the food. Comparisons are made directly between the modified plant and a closely related unmodified counterpart, as well as with the normal range for the specific compound or characteristic among current varieties of that crop. Data are generated using plant materials grown in the environments in which the crop will be produced, to obtain results from a number of different environmental conditions. The parameters measured provide an assessment of the outcomes of numerous metabolic pathways that result in the phenotype of the modified plant. The aim is to determine whether the crop and the foods derived from it are substantially equivalent with the exception of the introduced trait.
In Canada, the risk assessment process that is required prior to commercialization is triggered by the novelty of the trait and not by the genetic process used to introduce it into a crop; therefore, novel traits introduced by mutation or wide crosses would undergo the same level of risk assessment (Smyth and McHughen, 2008).
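
The comparison against the normal range of current varieties can be illustrated with a minimal sketch. It assumes a simple minimum-to-maximum range; real assessments rely on replicated trials and statistical comparison. The function name and the measurements are hypothetical and are not taken from any study cited here.

```python
def within_reference_range(value: float, reference_values: list[float]) -> bool:
    """Return True if value lies inside the min-max span of the reference varieties.

    Simplifying assumption: the "normal range" is taken to be the observed
    minimum and maximum among current varieties.
    """
    if not reference_values:
        raise ValueError("at least one reference value is required")
    return min(reference_values) <= value <= max(reference_values)


# Hypothetical measurements of one compound (arbitrary units), illustration only.
current_varieties = [4.1, 4.8, 5.3, 5.9]
modified_line = 5.0

print(within_reference_range(modified_line, current_varieties))
```

A value outside the range would not by itself indicate a hazard; it would flag the compound or characteristic for closer examination.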

Table 1. Terminology used to describe the effects of transgene insertion and expression on the plant transcriptome
Effects | Definition and meaning
Phenotypic | Effects that are directly attributable to transgene expression and are transgene locus independent*
Pleiotropic | The diversity of phenotypic effects attributable to transgene expression that are transgene locus independent*
Position | All phenotypic effects that are attributable to the transgene insertion and expression that are transgene locus specific*
Intended | ‘. . . targeted to occur from the introduction of the gene(s) in question and that fulfil the original objectives of the genetic transformation process.’
Unintended | ‘. . . represent a statistically significant difference in the phenotype, response or composition of the genetically modified plant compared with the parent from which it is derived, but taking the expected effect of the target gene into account.’
Predictable, unintended | ‘. . . unintended changes that go beyond the primary expected effect(s) of introducing the target gene(s), but which may be explicable in terms of our current knowledge of plant biology and metabolic pathway integration and interconnections.’
Unpredictable, unintended | ‘. . . changes falling outside our present level of understanding.’
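
The categories in Table 1 can be read as a small decision procedure, sketched below. This is an illustrative reading only: the class, field names and labels are hypothetical, and deciding whether an effect recurs across independent lines or is explicable by current knowledge is an experimental judgement that the code simply takes as input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservedEffect:
    """One observed difference between a transgenic line and its comparator."""
    recurs_across_independent_lines: bool  # seen regardless of insertion site
    targeted_by_design: bool               # part of the original objective
    explicable_by_current_knowledge: bool  # fits known plant biology


def classify(effect: ObservedEffect) -> tuple[str, str]:
    """Return (origin, intent) labels following the wording of Table 1."""
    # Locus-independent effects are attributed to transgene expression itself;
    # locus-specific effects are attributed to the insertion site.
    if effect.recurs_across_independent_lines:
        origin = "phenotypic or pleiotropic"
    else:
        origin = "position"

    if effect.targeted_by_design:
        intent = "intended"
    elif effect.explicable_by_current_knowledge:
        intent = "predictable, unintended"
    else:
        intent = "unpredictable, unintended"
    return origin, intent


# An effect seen in only one line, not targeted, but explicable in hindsight.
print(classify(ObservedEffect(False, False, True)))
```

The two labels are kept separate because the table treats origin (locus independent vs. locus specific) and intent (intended, predictable or unpredictable) as independent axes.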

The recent growth in plant systems biology (Yuan et al., 2008) has elevated the precision and capacity for detail in characterizing the novel features of cloned genes as reflected in transgenic plants. For example, it may reveal evolutionary and functional relationships among diverse genes that could be responsible for unexpected pleiotropic effects (Table 1; for example, Rijpkema et al., 2007). Some may be intended effects, whereas others may be unintended (Figure 1). Depending on the extent of current knowledge, some of these effects may be unpredictable, but become predictable as new information is acquired. Furthermore, each gene insertion varies in expression because of position effects, which result from interactions of the gene with elements at the site of insertion (Table 1; Kim et al., 2007). These may lead to unpredictable, unintended effects if insertion occurs randomly, or predictable, intended effects if a gene is being targeted to a specific site in the genome (Figure 1). To understand the impact of a gene insertion on transgenic plants, it is important to first understand the range of effects that can occur, and then develop experimental strategies that eliminate unexpected position effects and pleiotropic effects, so that the phenotype of the gene can be identified by measurable changes to gene expression patterns, development, composition and responses of the host plant.

Figure 1.

The relationship of the diverse effects of transgene insertion and expression described in the literature. A specific phenotype is the most frequent intended effect conferred by a transgene in a plant. The transgene may also impart a range of phenotypes which constitute the pleiotropic effects of the transgene. These differ from the position effects that modify the phenotype because of the interactions that are induced by processes specific to each insertion site. Both the pleiotropic and position effects may be the unintended effects that are revealed through experimentation with transgenic plants. These need to be understood in order to determine the true phenotype of the transgene. With increased knowledge of the gene and the development of technologies that eliminate or minimize the potential for pleiotropic and position effects, the predictability of achieving the intended phenotype increases and the risk of unintended effects decreases.

In plants, tools for generating and assessing transgenic plants have become pivotal for our understanding of the functional relationships among unknown genes emerging from genomics studies. Consequently, large-scale profiling techniques have been recommended and undertaken to provide detailed measurements on the various effects of transgenesis for the risk assessment of transgenic crops (Kuiper et al., 2001; Cellini et al., 2004). In this article, we attempt to merge the concepts and terminology used in risk assessment research with recent developments in our understanding of transgenesis emerging from systems biology studies. Particular attention is paid to selectable marker genes, as they vary greatly and usually remain linked to the transgenes undergoing evaluation and assessment; however, the principles that apply to selectable marker genes also apply to other transgenes.

Heritable changes to the transcriptome are not introduced by transgenesis

The question of whether transgenic plants are equivalent to non-transgenic isogenic lines at the level of global gene expression now appears to be clear. Microarray analysis has shown that the transcriptome of transgenic Arabidopsis can be unchanged by Agrobacterium-mediated transformation, and that it may undergo exactly the same reprogramming as untransformed plants in response to abiotic stress (El Ouakfaoui and Miki, 2005). Similar results have been demonstrated in transgenic wheat produced by biolistics transformation (Baudo et al., 2006). There is no evidence to support the suspicion that the insertion of transgenes activates or introduces cryptic genetic processes that generate heritable changes to the transcriptome. The stability of the transcriptome to transformation processes supports the use of transgenic plants for the metabolic engineering of pathways leading to biosynthetic products, such as dhurrin, without significant disturbance to the transcriptome (Kristensen et al., 2005). These conclusions are conditional on the use of an appropriate experimental design and knowledge of the genes, processes and downstream effects.

Position effects are unintended effects generated in the plant genome

Position effects result from interactions between the transgene and processes occurring at the site of insertion, together with its downstream effects. Position effects are therefore specific to the transgene locus (Table 1). They can vary widely and include the inadvertent creation of transcriptional or translational gene fusions, misexpression or knock-out mutations, induction of gene silencing and chromatin remodelling. T-DNA insertion without selection occurs randomly throughout the plant genome, including heterochromatic regions, and at frequencies that reflect the proportion of exons, introns and 5′ upstream and 3′ downstream regions (Kim et al., 2007). The generation of many independent transgenic lines by this method should therefore create a range of position effects and allow for the eventual identification of lines in which unfavourable position effects are absent.

Once identified, these loci may be used as preferred sites for gene targeting strategies designed for the nuclear genome (Day et al., 2000). Bacterial and yeast recombinases have proven to be useful for gene stacking (Ow, 2005) and gene replacement (Nanto et al., 2005; Louwerse et al., 2007) at specific transgene loci targeted for site-specific recombination. The use of these new technologies encourages the generation of plants with predictable expression patterns, and should therefore reduce the potential for unintended position effects (Figure 1). In chloroplast transformation, the insertion of transgenes normally occurs by homologous recombination without the need for exogenous recombinases, thereby eliminating position effects among transplastomic plants (Verma and Daniell, 2007).

Genes within the transforming DNA interact with elements at the site of insertion. This is illustrated by the finding that selection pressure will favour the recovery of transgenic plants with insertions in transcriptionally active chromatin regions (Kim et al., 2007). This knowledge has been exploited in promoter trap and gene trap/activation tagging strategies to identify regulatory elements and genes at insertion sites (Koncz et al., 1989; Weigel et al., 2000). Promoter traps have also revealed the existence of cryptic promoter elements in the tobacco genome, which may be tissue specific (T218; Fobert et al., 1994) or constitutive (tCUP; Foster et al., 1999), yet not associated with genes. In activation tagging, elements of the strong 35S promoter are often used to activate the expression of neighbouring genes at the insertion site. Examples include the Arabidopsis gene CKI1 (cytokinin-independent 1), which is selectable in tissue culture (Kakimoto, 1996), and the tomato gene ANT1 (anthocyanin 1), which can be screened by the generation of purple colour (Mathews et al., 2003). The 35S promoter is also commonly used for the expression of the neomycin phosphotransferase type II gene (nptII) to confer high levels of kanamycin resistance. This can interfere with the regulation of co-transforming transgenes, such as the root-specific promoter lateral root primordium 1 (LRP1) (Yoo et al., 2005) or sequentially transformed genes (Daxinger et al., 2007).

Vectors can be designed to reduce the gene interactions within the T-DNA and between the T-DNA and adjacent genomic DNA at the insertion site. For example, the tCUP promoter (Malik et al., 2002; Coutu et al., 2007) provides a useful alternative, as it does not interact with a number of tested promoters (Gudynaite–Savitch et al., in press). Furthermore, vectors can be designed with promoters positioned away from the T-DNA borders and separated from each other by coding regions or spacer DNA. These technical measures significantly increase the likelihood that transgenes will be expressed faithfully and predictably by the regulatory elements fused to them, so that the phenotype is not altered (Miki, 2008). In chloroplasts, homologous recombination allows the replacement of coding regions without changing the composition of the regulatory regions (Verma and Daniell, 2007).

The development of transgenic material with new traits differs from breeding strategies which introgress traits by crossing, followed by backcrossing to eliminate unwanted variation. In transgenic plants, variations introduced by position effects are linked to the transgene at the transgene locus, and may be difficult or impossible to eliminate by backcrossing once they have been generated. By understanding position effects and the development of emerging technologies, such as gene targeting, the unintended effects may be largely eliminated (Figure 1).

Research in transgenic wheat and soybean has demonstrated that the level of variation in the transcriptomes of transgenic and non-transgenic lines is lower than among conventionally bred material (Baudo et al., 2006; Cheng et al., 2008). This was supported by studies on transgenic potato and tomato proteomes (Corpillo et al., 2004; Lehesranta et al., 2005), and transgenic potato and wheat metabolomes (Catchpole et al., 2005; Baker et al., 2006). The accumulated data clearly indicate that transgenic crops can easily be considered to be substantially equivalent to non-transgenic crops, as the extent of variation revealed through various profiling techniques falls well within and usually below the range of naturally occurring levels of variation. The opportunity exists with transgenic crops to reduce the level of variation associated with the introduction of new traits to much lower levels than through breeding.

Pleiotropic effects are related to the complexity of the transgene phenotype

The full range of recurring locus-independent changes induced by transgenes constitutes the pleiotropic effects (Table 1). The study of pleiotropic effects in transgenic plants provides an excellent model for identifying functional aspects of a cloned gene. If the gene originates from a distant biological source and is novel to plants, it may simply introduce a new phenotype or novel trait that is isolated from interactions with other plant processes. A good example is the nptII gene from Tn5 which confers resistance to aminoglycoside antibiotics. It was the first chimeric gene construct developed for use as a selectable marker in plants (reviewed in Miki and McHugh, 2004). Other cloned genes may alter plant processes inadvertently because of functional redundancies within their structures, in addition to imparting the new intended trait. The evolution of multicellular organisms has resulted in the functional separation of duplicated genes by spatial separation of gene expression. Genes may be very tissue or species specific in situ, but may yield broad pleiotropic effects when expressed as a transgene outside of the usual expression domain. Such experiments can reveal functional similarities and evolutionary relationships among genes that were not previously understood. The members of the large MADS-box family of transcription factors, which coordinate many plant transcription networks (reviewed by Causier et al., 2005; Rijpkema et al., 2007), provide an excellent example of how the above events have shaped important plant processes, such as flowering (Soltis et al., 2007).

Another interesting example of the importance of the expression domain is the spinach betaine aldehyde dehydrogenase (BADH) gene, which generates extensive pleiotropic effects when expressed in the nuclear genome, but acts as a selectable marker without pleiotropic effects when used as a chloroplast selectable marker (Verma and Daniell, 2007).

In nuclear transformation, a desired trait or an intended effect may be identified in the phenotype of a transgene; however, pleiotropic effects may generate a range of unintended effects (Figure 1). The unintended pleiotropic effects, if understood, may be eliminated through technologies such as directed evolution or engineering of nucleotide and protein sequences (Figure 1). This is possible because many important genes with regulatory roles are modular in nature and have acquired and lost different functional domains during evolution. Interesting examples include the receptor kinases (Shiu and Bleecker, 2001) and transposase-derived transcription factors (Lin et al., 2007).

The enzyme acetohydroxyacid synthase, which catalyses a regulatory step in branched-chain amino acid biosynthesis, appears to have retained a non-functional ubiquinone-binding domain that is the target for several herbicides, including the sulphonylureas, imidazolinones and triazolopyrimidines. At least five herbicide classes act as non-competitive and highly specific inhibitors of the enzyme (reviewed by Singh and Shaner, 1995; Tan et al., 2005). Selective resistance and/or cross-resistance to the herbicides can be achieved by targeting mutations to specific amino acids in the domain, thus providing a range of activities for use as selectable marker genes (Olszewski et al., 1988; Aragao et al., 2000).

Pleiotropic effects of selectable marker genes

The phenotype of a selectable marker gene is basically man-made. There are very few studies on the pleiotropic effects of selectable marker genes on the plant transcriptome despite the critical need for this information when interpreting the functions and roles of co-transforming genes of interest. It is a commonly held belief that the most highly used selectable marker genes, such as nptII, the hygromycin phosphotransferase gene (hpt) and the bialaphos resistance gene (bar), are not associated with pleiotropic effects. Early research provided extensive indirect data to support this conclusion (reviewed by Nap et al., 1992), but uncertainty existed without the use of comprehensive profiling technologies, such as microarray analysis, which emerged later. Only recently has this been confirmed for nptII using large-scale profiling analysis (El Ouakfaoui and Miki, 2005).

Data on the chlorsulfuron resistant 1 (CSR1) gene, which codes for acetohydroxyacid synthase, have been published recently (Manabe et al., 2007). A substitution mutation (Ser653Asn) in the Arabidopsis csr1-2 gene confers resistance to imidazolinones. The transcriptome of csr1-2, grown in the presence or absence of herbicide, is identical to that of wild-type lines, indicating the absence of pleiotropic effects resulting from the mutation. Furthermore, the data show that potential targets for herbicide action, other than the enzyme coded by the gene, are unlikely. In both of the above examples, the specificity of the enzyme substrate or enzyme inhibitor probably plays a role in limiting the potential for pleiotropic effects. These kinds of selectable marker genes fall into the category of positive conditional selectable marker genes, because a specific external substrate is required for the selection system to perform (Miki and McHugh, 2004).

The specificity of the external substrate or its derivatives and the target process that mediates the selection event vary among systems. For example, a recent study of the bar gene as a selectable marker revealed the existence of pleiotropic effects when transgenic Arabidopsis lines were examined using microarray analysis (Abdeen and Miki, 2009). Although the presence of the bar gene in Arabidopsis resulted in the differential expression of only a very small number of genes (four genes), the application of the herbicide glufosinate altered the expression of at least 80 genes, 29 of which were specific to the transgenic plants. Phenotypic alterations were not apparent and the genes appeared to represent a stress-related detoxification response specific to the derivatives of the herbicide. Although the number of genes was clearly smaller than the number altered in the early response of wild-type plants to glufosinate, over one-third appeared to differ from those altered in wild-type plants. It is interesting that other studies have shown a fitness cost associated with the expression of the bar gene in specific barley lines, Oregon Wolfe Barley Dominant hybrids (Bregitzer et al., 2007), suggesting that pleiotropic effects associated with bar and pat (phosphinothricin-acetyltransferase) need to be studied in a range of crops and varieties, as they may occur in certain circumstances.
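
The proportion quoted above can be checked with a short calculation. This sketch assumes that 'over one-third' refers to the 29 transgenic-specific genes out of the 80 altered by glufosinate, and it treats 'at least 80' as exactly 80; both are interpretations, not statements from the cited study.

```python
from fractions import Fraction

# Figures quoted in the text; "at least 80" is treated as exactly 80 (assumption).
altered_by_glufosinate = 80
specific_to_transgenic = 29

share = Fraction(specific_to_transgenic, altered_by_glufosinate)

print(float(share))            # 0.3625
print(share > Fraction(1, 3))  # True
```

If more than 80 genes were altered in total, the share would be correspondingly lower, so 36% is best read as an upper bound under these assumptions.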

Conditional positive selection systems that lack specificity have also been considered (Yuan et al., 2006). These may involve mechanisms for sequestering selective agents into inactive forms or subcellular compartments. An example is the Arabidopsis Atwbc19 gene, which codes for an ATP binding cassette (ABC) transporter. Expression of the ABC transporter will result in the translocation of kanamycin, but not other antibiotics, to the vacuole, thus conferring kanamycin resistance (Mentewab and Stewart, 2005). This type of selection system might have pleiotropic effects, as the substrates of most ABC transporters are unknown and are in need of characterization. For tonoplast-targeted transporters, potential pleiotropic effects could vary among environments, and the transporters are likely to have a broad range of sequestration activities or unintended pleiotropic effects, such as the accumulation of unexpected compounds that might compromise product quality (see Rommens, 2006).

The presence of limited numbers of pleiotropic effects may not necessarily influence the risk assessment of a transgenic plant, as variation in expression is a natural condition that reflects the dynamic nature of the transcriptome in nature. The concept of substantial equivalence is a starting point in risk assessment, and allows for variation within the limits of what is naturally occurring (reviewed in Cellini et al., 2004). However, the detailed pleiotropic effects need to be understood, as they may influence the interpretation of scientific results when co-transforming genes of interest are being examined in transgenic plants in which selectable markers, such as bar or Atwbc19, are retained at the transgene locus.

Implications for new selectable marker genes

The potential for the generation of new selectable marker genes is great. Over 50 different kinds have been described, which vary widely in the manner in which selection is achieved (Miki and McHugh, 2004). In the development of alternative selectable marker genes, the extent and kinds of pleiotropic effects that the gene may create in plants have not generally been studied. At this time, it is not known with certainty whether bacterial genes that introduce novel and specific mechanisms for selection, such as nptII, more strongly limit the extent of pleiotropic effects than do plant genes, such as Atwbc19 (Mentewab and Stewart, 2005; Yuan et al., 2006). It is assumed that substrate specificity may be extremely important in limiting the unintended pleiotropic effects (Rommens, 2006). Other examples of plant and non-plant genes with similar intended effects or phenotypes include the isopentenyl transferase gene (ipt) from Agrobacterium (Endo et al., 2001) and the Arabidopsis plant growth activator 22 gene (pga22) (Zuo et al., 2002). Both encode similar enzyme activities that alter cytokinin levels and generate significant selectable changes in plant development without the need for toxic substrates. Because the pleiotropic effects cannot be separated from the selection system in this case, elimination of the marker gene is essential to generate substantially equivalent transgenic plants. It is understood that technologies for creating marker-free plants (Sugita et al., 2000; Zuo et al., 2002; Darbani et al., 2007) are essential if these are going to be used for practical purposes.


A considerable body of knowledge has been accumulated on the most commonly used selectable marker genes, for example nptII, hpt and bar/pat (reviewed by Miki and McHugh, 2004; Ramessar et al., 2007). Yet, only recently have the unintended pleiotropic effects generated by them been examined, owing largely to the recent emergence of molecular and biochemical profiling technologies (El Ouakfaoui and Miki, 2005; Kristensen et al., 2005; Baudo et al., 2006; Manabe et al., 2007; Cheng et al., 2008). The knowledge generated has provided an insight into our understanding and use of the concept of substantial equivalence for the risk assessment of transgenic crops. To date, the transgenic crops examined are substantially equivalent to existing non-transgenic crops, except for the introduced novel trait, because the variation falls well within the naturally occurring variation levels.

For basic research in functional genomics and systems biology, the use of transgenic plants is an important complement to mutants to obtain an understanding of the functions of unknown genes and how they integrate within the regulatory networks of the plants. For these studies, the degree of similarity between the transgenic line and the non-transgenic line must exceed the standard for substantial equivalence. Ideally, the comparator should be an isogenic line, for example, a non-transgenic sibling. Otherwise, subtle or narrow phenotypes of an introduced gene may be lost within the variation existing among different genotypes. Furthermore, the variation induced by transgenesis, including the co-transforming marker genes, must be understood or eliminated, as it could result in the misinterpretation of gene function. Research has shown that nptII has no pleiotropic effects on the transcriptome, whereas bar and glufosinate treatments may induce a number of them. This knowledge is essential in the analysis of data intended to isolate the effects of the inserted gene of interest.

A variety of new genes with novel phenotypes are being developed as selectable marker genes. To be effective replacements for well-studied selectable markers, such as nptII, it is important that the full range of unintended pleiotropic effects be understood. If the pleiotropic effects cannot be eliminated from the intended phenotype, technologies will need to be adopted that create marker-free plants or isolate the impacts of the pleiotropic effects from the phenotype. Many technologies are under development that may address these issues and could be adapted into general transformation protocols.


The coauthors are grateful to Nick Tinker (Agriculture and Agri-Food Canada, Ottawa, ON, Canada) and Marina Steele (Canadian Food Inspection Agency, Ottawa, ON, Canada) for reviewing the manuscript before submission. This research was supported by the CFIA.