Molecular methods for strain typing of Candida albicans: a review



Jihene Ben Abdeljelil, Laboratoire de Parasitologie-Mycologie, Hôpital Farhat Hached 4000, Sousse, Tunisie. E-mail:


Candida albicans is one of the most medically important fungi because of its high frequency as a commensal and pathogenic microorganism causing superficial as well as invasive infections. Strain typing and delineation of the species are essential for understanding its biology, epidemiology and population structure. A wide range of molecular techniques have been used for this purpose including non-DNA-based methods (multi-locus enzyme electrophoresis), conventional DNA-based methods (electrophoretic karyotyping, random amplified polymorphic DNA, amplified fragment length polymorphism, restriction enzyme analysis with and without hybridization, rep-PCR) and DNA-based methods called exact typing methods because they generate unambiguous and highly reproducible typing data (including microsatellite length polymorphism and multi-locus sequence typing). In this review, the main molecular methods used for C. albicans strain typing are summarized, and their advantages and limitations are discussed with regard to their discriminatory power, reproducibility, cost and ease of performance.


Candida albicans is a common commensal microorganism in humans and some warm-blooded animals. It colonizes mucosal surfaces of healthy subjects and is considered a component of the normal flora of the digestive and genitourinary tracts (Kam and Xu 2002). The standard genome of C. albicans is diploid and composed of eight pairs of chromosome homologs, ranging in size from about 3·3 to 0·95 Mb (Chibana et al. 2000). Candida albicans has a predominantly clonal mode of reproduction (Pujol et al. 1993; Lockhart et al. 1995; Mata et al. 2000). It was the first medically important pathogenic fungus for which the complete genome sequence was determined (Jones et al. 2004). The sequenced strain, the clinical isolate SC5314, showed a high degree of heterozygosity, including more than 55 700 single-nucleotide polymorphisms (SNPs) in the 32-Mb diploid genome (Jones et al. 2004). Despite its diploidy and clonal reproduction, C. albicans is able to achieve a high degree of genetic diversity in several ways, including recombination, chromosomal polymorphisms, gene replacement and cryptic mating, reflecting the plasticity of the genome (Chibana et al. 2000; Selmecki et al. 2010).

Over the last three decades, C. albicans has become a medically important pathogen, responsible for superficial as well as deep infections. Most infections are opportunistic depending on the immune defence of the host and changes of the environment of the yeast in the organism (Hajjeh et al. 2004). Indeed, since the early 1980s, invasive candidiasis, mainly caused by C. albicans, has emerged as a prominent problem (Pfaller 1996; Pfaller and Diekema 2007), because of the increasing number of immunocompromised patients and the advances in medical intensive care. Consequently, C. albicans has become a frequent cause of nosocomial infections and represents a serious threat as invasive candidiasis is associated with high mortality rates that exceed 30% in many reports, despite availability of antifungal treatments, including novel azoles and echinocandins (Boucher et al. 2004; Pfaller and Diekema 2007; Horn et al. 2009).

Although most Candida infections appear to originate from an endogenous source, nosocomial transmission is not uncommon and may occur either by cross-infection or by exposure to a common infecting source (Huang et al. 1998; Pfaller et al. 1998; Ben Abdeljelil et al. 2011). This underlines the need for appropriate control measures, which in turn require a thorough knowledge of the biology and epidemiology of Candida species, including C. albicans, both of which are recognized to be particularly complex. Indeed, a healthy individual can harbour the same strain at different body locations or carry unrelated strains at the same or different body sites (Kam and Xu 2002). Strains can replace each other in recurrent infections (Odds et al. 2006), undergo microevolution (minor changes in genotype over a relatively small number of cell generations) (Lockhart et al. 1995; Shin et al. 2004; Bougnoux et al. 2006) and substrain shuffling (changes in subpopulations within individuals over time) (Lockhart et al. 1995; Lott and Scarborough 2008), and be transferred from one individual to another (Vazquez et al. 1993; Schmid et al. 1995; Ben Abdeljelil et al. 2011). In addition, specific strains may predominate in particular geographical areas, and some strains may be endemic in some hospitals and undergo microevolution in the hospital setting (Pfaller et al. 1998; Viviani et al. 2006). This highlights the need for efficient methods for typing and delineation of strains. The earliest methods used in the typing of C. albicans were based on phenotypic characteristics including serotyping, biotyping, morphotyping, resistance to various chemicals and toxins and antifungal susceptibility profiles (Warnock 1984). However, phenotypic techniques have a very low degree of discrimination and reproducibility, which limits their value for reliable epidemiological analysis.
The advent of the molecular DNA-based techniques revolutionized the knowledge on the biology and epidemiology of C. albicans. Since then, various molecular approaches have been described that target various levels of polymorphism within a species (Odds and Jacobsen 2008).

Herein, the main molecular techniques used so far in C. albicans strain typing are reviewed, including techniques that have been the subject of previous reviews as well as those described more recently. The review aims (i) to describe the principles of these molecular approaches, (ii) to highlight their strengths and their drawbacks and (iii) to discuss their input in different types of C. albicans molecular investigations.

General Considerations

Genotyping of C. albicans strains allows the following: (i) investigation of nosocomial candidiasis to identify outbreak-related strains, distinguish epidemic from endemic or sporadic strains and determine the origins of infection, the routes of acquisition and transmission of strains; (ii) assessing the diversity of isolates within a carrier and investigation of the recurrent infections to recognize particularly virulent strains if any; (iii) monitoring of the emergence of drug-resistant strains; (iv) studying of the population structure, diversity and dynamics of the species.

The choice of the method for strain typing is contingent upon the nature and the defined objectives of each molecular investigation. The performance of each typing technique should be assessed in terms of discriminatory power, reproducibility and ease of performance and interpretation. Discriminatory power refers to the ability of the technique to identify the same strain or highly related strains in independent isolates, to assess microevolution within the same strain, to group or cluster moderately related strains and to discriminate between unrelated strains (Clemons et al. 1997; Chen et al. 2005; Chowdhary et al. 2006).

Interassay reproducibility is assessed by repeat testing of the same strain, which is expected to yield an identical result. Reproducibility between laboratories, which allows comparison of data, is essential for any standardization of the typing approach and for the construction of a general data bank for use by the international scientific community.

Ease of performance and feasibility refer to the cost of specialized equipment and reagents, the ease of implementing the technique in routine use, the technical complexity of the procedure and the ease of result interpretation. Moreover, data should be suitable for computerized analysis and storage, which is needed for the construction of databases (Schmid et al. 1990; Singh et al. 2006).

Typing methods used for molecular characterization of C. albicans can be separated into two classes based on different approaches. The first class includes non-DNA-based methods that assess the genetic intraspecies polymorphism indirectly by using comparisons of protein fingerprints [multi-locus enzyme electrophoresis (MLEE)]. The second class includes DNA-based techniques that directly analyse the polymorphism within various DNA markers and are separated into conventional typing methods and methods called exact DNA-based techniques because they generate unambiguous and highly reproducible typing data. The comparison of the performance characteristics of these methods is shown in Table 1.

Table 1. Characteristics of the most widely used techniques for genotyping of Candida albicans

| Technique | Discriminatory power | Reproducibility | Ease of use | Ease of interpretation | Setup cost | Cost/isolate | Time required (days) | Comments |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Non-DNA-based techniques | | | | | | | | |
| MLEE | Good | Good | Very difficult | Difficult | Low | Low | Several days | Protein-based method. Time required depends on the number of enzymes to be assessed |
| Conventional DNA-based techniques | | | | | | | | |
| EK | Low | Good | Very difficult | Very easy | High | Low | 4 | |
| Macrorestriction | High | Good | Very difficult | Easy | High | Moderate | 4 | |
| REA with Southern blotting | High | High | Very difficult | Easy | High | Moderate | 2 | |
| RAPD | Good | Low | Very easy | Easy | Moderate | Low | 1 | |
| AFLP | High | Good | Difficult | Easy | Moderate to high | Moderate | 2 | |
| Rep-PCR | Moderate | Good | Easy | Very easy | Moderate | Low | 1 | |
| Exact DNA-based techniques | | | | | | | | |
| MLP | Very high | Very high | Very easy | Very easy | Very high | Moderate | 1 | Unambiguous data |
| MLST | Very high | Very high | Moderate | Very easy | Very high | Very high | 2 | Unambiguous data. Online databases |

AFLP, amplified fragment length polymorphism; EK, electrophoretic karyotyping; MLEE, multi-locus enzyme electrophoresis; MLP, microsatellite length polymorphism; MLST, multilocus sequence typing; RAPD, random amplified polymorphic DNA; REA, restriction enzyme analysis.

Non-DNA-Based Typing Methods

Multi-locus enzyme electrophoresis

MLEE characterizes enzymatic proteins and assesses their polymorphism by analysing their electrophoretic mobility on gels after specific enzyme staining procedures. The migration of an enzyme is influenced by its molecular size and its net charge. Changes in the mobility of an enzyme protein reflect a change in its amino acid sequence and, thus, by inference, the encoding DNA sequence. Therefore, if the enzyme banding patterns of two isolates differ, such differences are assumed to be DNA based and heritable. In addition, it is assumed that in diploid organisms, enzyme expression is co-dominant, that is, every allele at a locus is expressed, alleles being rarely missing. MLEE may provide valid measures of genetic distance because it assesses defined multilocus differences and information relevant to population genetic studies as well as to epidemiology (Pujol et al. 1993; Arnavielhe et al. 1997).

MLEE is one of the earliest methods used for genetic and epidemiological studies of C. albicans and provided the first insight into the population structure of the species (Caugant and Sandven 1993; Pujol et al. 1993; Arnavielhe et al. 1997). It has been further used to fingerprint other Candida species (e.g. C. tropicalis, C. lusitaniae, C. haemulonii, C. parapsilosis and C. guilliermondii) (Merz et al. 1992; Doebbeling et al. 1993; Lehmann et al. 1993; Lin et al. 1995; San Millan et al. 1997).

MLEE is a reliable technique with a relatively high discriminatory power in the distinction between unrelated strains and shows good reproducibility (Arnavielhe et al. 2000; Boriollo et al. 2006). It outperforms many of the DNA-based methods and remains a useful tool for molecular investigation of natural populations of C. albicans (Santos et al. 2012).

The major drawback of MLEE is that it assays the genome indirectly and evaluates variations that accumulate very slowly in the species. Moreover, MLEE does not detect all variations at the nucleotide level, as nucleotide substitutions do not necessarily lead to a change in the amino acid composition of the enzyme. In addition, it may not be sensitive enough to measure rapid microevolutionary changes and, thus, may not be an adequate method for revealing substrain shuffling or microevolution in an infecting population over time. Indeed, when compared with other typing methods for their capacity to assess microevolution within a strain, MLEE showed a modest degree of resolution power (Boriollo et al. 2010). Finally, MLEE is time-consuming because it requires the combination of results from at least 10 enzymes to provide reliable typing data and detection of variability among isolates (Boriollo et al. 2006, 2010).

Conventional DNA-Based Typing Methods

Pulsed-field gel electrophoresis

Electrophoretic karyotyping

The advent of the pulsed-field gel electrophoresis (PFGE) technique in 1984 revolutionized the study of the genome organization of eukaryotic organisms (Schwartz and Cantor 1984). In this technique and its variants [orthogonal-field alternation gel electrophoresis (OFAGE), field inversion gel electrophoresis (FIGE), contour-clamped homogeneous electric field (CHEF) or transverse alternating field electrophoresis (TAFE)], intact DNA molecules migrate through an agarose gel matrix under the influence of pulsed fields, which permit easy separation of DNA molecules of several megabases. As the sizes of C. albicans chromosomes range from around 1 to 4 Mb, this technique is ideal for the separation of chromosome-sized DNA molecules, the analysis of the chromosomal banding patterns, known as electrophoretic karyotypes, and the detection of karyotypic variations within the species.

Briefly, cells are mixed with enzymes to remove the cell wall and then embedded in an agarose plug, which protects the large DNA molecules from shearing forces. Protease and detergent are added to remove membranes and digest proteins. The yeast chromosome-sized DNA fragments are separated according to size and then visualized by ethidium bromide staining. Assessment of differences in banding patterns can be performed visually or using computer-assisted methods (Sangeorzan et al. 1995).

The earliest use of electrophoretic karyotyping (EK) demonstrated the extent of variation in the karyotypes of unrelated C. albicans isolates (Magee and Magee 1987; Lasker et al. 1989; Iwaguchi et al. 1990; Pittet et al. 1991; Sangeorzan et al. 1995). Indeed, despite the occurrence of a standard karyotype that consists of eight pairs of homologous chromosomes, variant karyotypes are very common among clinical isolates. Differences in the number of bands detected and their mobility patterns were presumed to be due to chromosome-length polymorphisms (different-sized chromosome homologs), reciprocal chromosome translocation or missing chromosomes (Bougnoux et al. 2007; Selmecki et al. 2010).

EK has been extensively used to fingerprint C. albicans and other Candida species (e.g. C. lusitaniae, C. parapsilosis, C. tropicalis and C. glabrata) (Espinel-Ingroff et al. 1999; Trtkova and Raclavsky 2006). It has a moderate discriminatory power when used for typing unrelated C. albicans isolates and allows neither the discrimination between moderately related isolates nor the detection of microevolutionary changes within a strain. However, EK shows good reproducibility, and the interpretation of the karyotypes is straightforward (Pittet et al. 1991; Sangeorzan et al. 1995).

With respect to the technical aspects, the major limitations associated with EK relate to the initial cost of the specialized equipment and the prolonged turnaround time. A single experiment can involve electrophoretic separation times of several days, and the preparation of plugs is both tedious and time-consuming (Pittet et al. 1991; Sangeorzan et al. 1995).

Currently, EK analysis is usually associated with other techniques, and it is shown to be the least discriminatory method when used for typing related isolates (Chen et al. 2005; Ben Abdeljelil et al. 2010, 2011). Nevertheless, the resolution power of EK analysis can be improved by the prior digestion of chromosome-length DNA with endonucleases before PFGE (Voss et al. 1995; Clemons et al. 1997; Shin et al. 2004; Ben Abdeljelil et al. 2012b).

Chromosomal restriction fragment patterns

In this method, the intact genomic DNA, embedded in the agarose plug, is separated by PFGE after digestion with a restriction enzyme with relatively few recognition sites. The restriction of the chromosome-length DNA with rare-cutting endonucleases, known as macrorestriction analysis, provides many variable large fragments, so that the patterns generated are complex enough to allow a quantitative measure of genetic relatedness between two isolates (Chu et al. 1993). SfiI and BssHII have been the most used endonucleases due to their higher resolving power (Voss et al. 1995; Poikonen et al. 2001; Shin et al. 2004; Ben Abdeljelil et al. 2012b).

Macrorestriction analysis is a well-established method in C. albicans typing because of its stability and good reproducibility and its ability to differentiate between clinical isolates. It has been successfully used to investigate outbreaks and performed well when compared with other typing techniques (Krawczyk et al. 2009). In addition, it is a reliable tool for detecting microevolution among sequential isolates of C. albicans. Using macrorestriction analysis of genomic DNA with SfiI and BssHII, Shin et al. demonstrated that some C. albicans strains isolated from patients with catheter-related candidemia undergo microevolution during catheter colonization. They also showed good agreement between macrorestriction analysis and Southern hybridization with the C1 fragment of Ca3 as a probe, C1 fingerprinting being an excellent indicator of microevolutionary events. In macrorestriction analysis, microevolution within a strain refers to minor changes with a one- or two-band difference in the patterns (Shin et al. 2004). Macrorestriction analysis raises the same technical difficulties as EK, but it has a higher resolving power.
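The band-based comparison of macrorestriction patterns can be illustrated with a minimal sketch. It assumes that each pattern has been reduced to a list of fragment sizes (in kb) and that two bands are considered the same when their sizes agree within a relative tolerance. The use of the Dice coefficient as the similarity measure, the function names, the tolerance and the fragment sizes are all illustrative assumptions, not values taken from the cited studies.

```python
def match_bands(a, b, tol=0.02):
    """Count bands of pattern a that pair with a still-unused band of b.

    Two bands match when their sizes differ by at most `tol` (relative).
    """
    unused = sorted(b)
    shared = 0
    for size in sorted(a):
        for i, other in enumerate(unused):
            if abs(size - other) <= tol * max(size, other):
                shared += 1
                del unused[i]  # each band of b can be paired only once
                break
    return shared


def dice_similarity(a, b, tol=0.02):
    """Dice coefficient: 2 * shared bands / total bands in both patterns."""
    if not a and not b:
        return 1.0
    return 2 * match_bands(a, b, tol) / (len(a) + len(b))


def band_differences(a, b, tol=0.02):
    """Number of bands present in one pattern but not in the other."""
    return len(a) + len(b) - 2 * match_bands(a, b, tol)


# Hypothetical fragment sizes (kb) for two sequential isolates.
isolate_1 = [1900, 1450, 1100, 820, 640, 410, 250]
isolate_2 = [1900, 1450, 1100, 820, 600, 410, 250]

similarity = dice_similarity(isolate_1, isolate_2)
differences = band_differences(isolate_1, isolate_2)
print(f"Dice similarity: {similarity:.3f}, differing bands: {differences}")
```

In this sketch a single shifted band is counted as two differing bands (one unique to each pattern), so the threshold used to call a minor change would have to be chosen with that convention in mind.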

Restriction enzyme analysis

Restriction enzyme analysis (REA) was widely used in the earliest epidemiological investigations of C. albicans infections (Matthews and Burnie 1989; Smith et al. 1989; Pfaller et al. 1990; Hunter 1991; Pearce and Howell 1991; Vazquez et al. 1991; Pfaller 1992; Robinson et al. 1993; Carlotti et al. 1994; Romano et al. 1994).

In this technique, total genomic DNA is purified and subsequently cleaved by a frequent-cutting restriction endonuclease (e.g. EcoRI, MspI, BglII, HinfI or HindIII) that produces a large number of short fragments resulting in a sequence-dependent restriction fragment length polymorphism (RFLP). The generated fragments are separated using common agarose gel electrophoresis and visualized after staining with ethidium bromide. Variation between strains is evidenced by different banding patterns. These variations occur as a result of changes in or deletion of restriction site sequences, or of DNA deletions and insertions between the recognition sites (Smith et al. 1989; Carlotti et al. 1994).
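How the loss of a single recognition site changes a restriction pattern can be shown with a small in silico digestion. This is only a sketch: the sequences are short invented strings rather than C. albicans DNA, the function name is hypothetical, and the molecule is treated as linear and single-stranded for simplicity. The only biological fact relied on is that EcoRI recognizes GAATTC and cuts after the first G.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site.

    `cut_offset` is the position of the cut within the recognition site
    (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


ECORI = "GAATTC"

# Hypothetical toy sequences; strain B has lost the second EcoRI site
# through a single substitution (GAATTC -> GAGTTC).
strain_a = "ATGC" * 5 + ECORI + "CCGGTTAA" * 4 + ECORI + "TTAGGC" * 3
strain_b = "ATGC" * 5 + ECORI + "CCGGTTAA" * 4 + "GAGTTC" + "TTAGGC" * 3

fragments_a = sorted(digest(strain_a, ECORI, 1), reverse=True)
fragments_b = sorted(digest(strain_b, ECORI, 1), reverse=True)
print(fragments_a)  # three fragments
print(fragments_b)  # two fragments: two of strain A's fragments are fused
```

The two smaller fragments that flank the lost site in strain A appear as one larger fragment in strain B, which is the kind of band difference read from the gel.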

REA is straightforward, rapid and inexpensive. However, it may generate complex patterns with a large number of bands of unequal intensities, thereby making their objective interpretation and the differentiation of strains very difficult, whether visually or using computer-assisted methods. Unique sequences have a low degree of resolution as they are poorly stained by ethidium bromide, in contrast to repeated sequences, which mainly correspond to rDNA and mitochondrial DNA sequences occurring in multiple copies in fungal genomes. Such repeated sequences are not polymorphic enough to assess the genetic relatedness of moderately related isolates, yet for convenience, comparison of REA patterns is often based on differences between the highly stained bands. The low discriminatory power, together with difficulties in the interpretation of patterns, has been considered the major limitation of REA (Vazquez et al. 1991).

To simplify the banding patterns obtained by restriction endonuclease digestion and to increase the discriminatory power of REA, generated fragments can be hybridized with species-specific radiolabelled or biotinylated DNA probes after transfer to nitrocellulose or nylon membranes (Southern blotting). Probes recognize repetitive sequences dispersed throughout the genome as a result of sequence homology. In contrast to ethidium bromide staining, hybridization probes allow the selective visualization of a limited number of fragments and generate profiles that are easier to interpret but, at the same time, complex enough to provide an accurate and sensitive measure that reflects the relatedness of isolates (Vazquez et al. 1991; Olive and Bean 1999). The most commonly used probes are the related (but nonidentical) 27A and Ca3 probes containing repetitive genomic sequences (Soll et al. 1987, 1989; Sadhu et al. 1991; Mercure et al. 1993). Ca3 is a moderately repetitive 11-kb gene fragment successfully used in many epidemiological studies. It contains an additional repetitive sequence, the B sequence, and generates a more complex hybridization pattern than the one generated by the 27A probe (Anderson et al. 1993). Its effectiveness is due to the fact that it hybridizes to: (i) repetitive sequences dispersed throughout the genome, identifying interstrain variability at a variety of dispersed loci, (ii) some hypervariable sequences, revealing microevolution within a strain and (iii) additional less variable sequences.

When used to probe Southern blots of EcoRI-digested DNA, Ca3 identifies over 20 bands of various intensities that include invariable (monomorphic), moderately variable and hypervariable bands (Schmid et al. 1990). The Ca3 probe has been demonstrated to be highly effective in the analysis of C. albicans populations and has identified five major clades (named I, II, III, SA and E) of closely related strain types (Schmid et al. 1995; Pujol et al. 1997, 2002; Blignaut et al. 2002; Soll and Pujol 2003).

Clades exhibit different geographical specificities and phenotypic characteristics. Strains from SA and E predominate in South Africa and Europe, respectively. Strains from clade I have been reported from all investigated geographical areas. In vitro resistance of C. albicans to flucytosine was found to be nearly restricted to clade I isolates (Pujol et al. 2004).

The complex Ca3 probe has been used in a large range of epidemiological studies involving significant numbers of strains and was found to be highly effective in assessing microevolution within clonal populations over time, due primarily to the hypervariability of genomic sequences homologous to the C1 fragment of the probe (Soll et al. 1991; Anderson et al. 1993; Schmid et al. 1993; Schröppel et al. 1994; Lockhart et al. 1995, 1996). The C1 region contains the repetitive DNA sequence (RPS element), which is present at different locations in the C. albicans genome (Iwaguchi et al. 1992; Lockhart et al. 1995). The microevolutionary changes identified by Ca3 are due to the insertion and deletion of the full-length RPS elements at specific sites dispersed throughout the genome and thus can be detected after hybridization of genomic DNA with the C1 fragment of Ca3 (Lockhart et al. 1995; Pujol et al. 1999). Moreover, Ca3 hybridization has proven to be reproducible and amenable to computer-assisted analysis. Databases have been established for subsequent retrospective and comparative studies (Schmid et al. 1990; Schröppel et al. 1994; Lockhart et al. 1996).

Inherent drawbacks of Southern hybridization techniques include several laborious and time-consuming steps and the need for DNA probes, a transfer procedure to a solid support and adequate detection systems, all of which make the method difficult to implement in a routine medical laboratory. In addition, fingerprint data do not lend themselves to interlaboratory data exchange.

Random amplified polymorphic DNA

The random amplified polymorphic DNA (RAPD) technique is based on the amplification of genomic DNA with single short (typically 10 bp) primers of arbitrary sequence. Primers bind at random to the target DNA, resulting in the amplification of fragments of unknown sequence. The amplification reaction is carried out under conditions of low stringency (typically 35–40°C, 2·5 mmol l−1 MgCl2). Amplified products are separated on an agarose gel and stained with ethidium bromide. The interpretation of RAPD patterns is based on the number and the size of the amplified fragments. Overall, the RAPD assay generates relatively complex patterns that greatly vary among unrelated isolates. RAPD has been extensively used for typing of C. albicans, other Candida species (e.g. C. dubliniensis, C. parapsilosis, C. lusitaniae, C. tropicalis, C. glabrata and C. krusei) (Lehmann et al. 1992; Lott et al. 1993; Zeng et al. 1996; Coleman et al. 1997) and other fungi (e.g. Aspergillus fumigatus, Aspergillus flavus, Cryptococcus neoformans and Blastomyces dermatitidis) (Buffington et al. 1994; Yates et al. 1995; Anderson et al. 1996; Boekhout and van Belkum 1997) because it offers many advantages, including simplicity, rapidity and cost-effectiveness, and does not require any DNA sequence information (Lehmann et al. 1992). The discriminatory power of RAPD is moderate; nevertheless, if a greater level of discrimination is required, different primers can be used in independent runs and the data combined for the final analysis (composite DNA type). This approach allows the discrimination between unrelated isolates, clustering of related isolates, identification of identical or highly related isolates and assessment of microevolution within the same strain. Indeed, Pujol et al. demonstrated a high concordance between RAPD analysis, MLEE and Southern blotting using the Ca3 probe. The Ca3 probe exhibited the greatest resolving power, mainly in assessing microevolution within the strain (Pujol et al. 1997).
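The idea of a composite DNA type can be sketched as follows, assuming that the banding pattern obtained with each primer has already been assigned a pattern label. The primer names, isolate names, labels and the function name are hypothetical and serve only to show how combining primers can separate isolates that a single primer does not.

```python
def composite_types(patterns_by_primer):
    """Map each isolate to a tuple of its pattern labels, one per primer."""
    primers = sorted(patterns_by_primer)
    isolates = sorted(patterns_by_primer[primers[0]])
    return {
        isolate: tuple(patterns_by_primer[primer][isolate] for primer in primers)
        for isolate in isolates
    }


# Hypothetical pattern labels from two independent RAPD runs.
patterns_by_primer = {
    "primer_1": {"isolate_1": "a", "isolate_2": "a", "isolate_3": "b"},
    "primer_2": {"isolate_1": "a", "isolate_2": "b", "isolate_3": "b"},
}

composite = composite_types(patterns_by_primer)
types_primer_1 = len(set(patterns_by_primer["primer_1"].values()))
types_composite = len(set(composite.values()))
print(types_primer_1, types_composite)
```

Here each primer alone distinguishes two types among the three isolates, whereas the composite type distinguishes all three.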

Although the RAPD technique has been used successfully in several clinical studies, it raises the well-known problem of reproducibility and data comparison between laboratories. Indeed, banding patterns are easily affected by slight changes in the experimental conditions (Mg2+ concentration, primer-to-template concentration ratio, Taq polymerase concentration and source, the model of thermal cycler, thermal cycling parameters, etc.) (Meunier and Grimont 1993; Howell et al. 1996). Good interassay reproducibility can be achieved under rigidly controlled conditions, but the poor interlaboratory reproducibility of the technique remains insurmountable. For this reason, the RAPD method has been criticized by many investigators, but it is still widely used in investigating C. albicans epidemiology at a local level, mainly because of its satisfactory discriminatory power in this setting, its simplicity, its accessibility under minimum molecular laboratory conditions and the moderate investment it requires (Lehmann et al. 1992; Steffan et al. 1997).

Amplified fragment length polymorphism

Amplified fragment length polymorphism (AFLP) was developed in the mid-1990s (Vos et al. 1995). Briefly, it involves digestion of genomic DNA with two restriction enzymes (usually a frequent cutter and a rare cutter) followed by ligation of oligonucleotide adaptors to the sticky ends of the restriction fragments (Ball et al. 2004). Adaptors containing restriction site sequences serve as targets for primer annealing, and the ligated products are then amplified under high stringency conditions. The procedure allows the selection of a subset of the restriction fragments. Typically, 50–100 amplified fragments are generated. To be visualized, these fragments need to be separated in high-resolution electrophoresis systems (denaturing polyacrylamide gels). Fluorescent dye-labelled primers can be used, allowing the detection of amplified fragments in gel-based or capillary DNA sequencers. This variant technique, referred to as fluorescent amplified fragment length polymorphism (FAFLP), allows the highest resolution of fragments of different sizes (Kantardjiev et al. 2006; Levterova et al. 2010).

AFLP usually involves two PCR steps. The first one consists of the preselective amplification using unlabelled primers with a single selective nucleotide in the primer. The reaction product is then diluted to obtain the adequate template concentration for the second PCR amplification in which additional selective nucleotides are added in the primer to improve specificity. During this latter amplification, the labelled primers are used.
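The effect of selective nucleotides on the number of amplified fragments can be illustrated with a small simulation. It is a sketch under simplifying assumptions: each adaptor-ligated fragment is represented only by the bases lying just inside its two adaptors, the four bases are taken to be equally frequent, and the chosen selective nucleotides ("A"/"C", then "AC"/"CA"), the pool size and the function names are arbitrary illustrations rather than a published primer design.

```python
import random


def amplifies(fragment_ends, selective_left, selective_right):
    """Return True if both fragment ends match the primers' selective bases.

    Each end is the sequence lying just inside an adaptor, read inward from
    that adaptor, so a primer extension of k selective nucleotides must
    match the first k bases of the corresponding end.
    """
    left_end, right_end = fragment_ends
    return left_end.startswith(selective_left) and right_end.startswith(selective_right)


def amplify(fragments, selective_left, selective_right):
    """Keep only the fragments that the primer pair can amplify."""
    return [f for f in fragments if amplifies(f, selective_left, selective_right)]


# Hypothetical pool of adaptor-ligated restriction fragments, each reduced
# to the three bases inside either adaptor (equal base frequencies assumed).
rng = random.Random(0)
pool = [
    ("".join(rng.choices("ACGT", k=3)), "".join(rng.choices("ACGT", k=3)))
    for _ in range(20000)
]

# Preselective PCR: one selective nucleotide per primer (about 1/16 kept).
preselective = amplify(pool, "A", "C")
# Selective PCR on the preselective product: two selective nucleotides per
# primer (about 1/256 of the original pool kept).
selective = amplify(preselective, "AC", "CA")

print(len(pool), len(preselective), len(selective))
```

Under these assumptions each added selective nucleotide reduces the amplified subset roughly fourfold, which is how the number of bands is brought down to a range that can be resolved by electrophoresis.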

AFLP is a high-resolution typing technique. Like RAPD, AFLP is a universal and multilocus marker technique that can be applied to genomes of any source without requiring any prior sequence information. However, AFLP is more reproducible than RAPD as it uses specific primers, and amplification is achieved under high stringency conditions (Vos et al. 1995; Ball et al. 2004).

Although AFLP has proved to be reliable and reproducible as a genotyping method, it has rarely been used for C. albicans typing, mainly because it involves multiple steps, is fairly expensive and requires a relatively high level of expertise (Ball et al. 2004; Lopes et al. 2007; Levterova et al. 2010).

Repetitive sequence-based PCR

The repetitive sequence-based PCR (rep-PCR) method uses primers that target noncoding repetitive sequences interspersed throughout the fungal genome (Stern et al. 1984; Versalovic et al. 1991). The amplified DNA fragments, when separated by conventional electrophoresis, constitute a genomic fingerprint that can be employed for strain typing (Versalovic and Lupski 2002). The development of a commercially available, semi-automated rep-PCR assay system, the DiversiLab System (Bacterial Barcodes, Inc., Houston, TX, USA), provided easier standardization and higher reproducibility than the manual, gel-based rep-PCR (Chau et al. 2004; Healy et al. 2005). The system allows for archiving of fingerprint patterns and creation of databases.

Rep-PCR is easy to carry out and to implement in clinical laboratories at relatively low cost. Many studies using rep-PCR for genotyping of C. albicans (Chau et al. 2004; Chen et al. 2005; Abaci et al. 2011) and other members of the genus have been reported (Redkar et al. 1996; Healy et al. 2005).

The discriminatory power of rep-PCR is moderate. It has been improved by combining rep-PCR with PFGE-based typing methods (Chen et al. 2005), but such an approach makes the molecular investigation more difficult and increases its cost.

Exact DNA-Based Typing Methods

Most of the techniques described above have proven to be effective and reliable typing methods for the investigation of the C. albicans epidemiology at local level. These techniques are suitable for clinical studies in individual laboratory settings, but the exchange and comparison of data between laboratories are difficult, if not impossible. At present, only two techniques lend themselves well to this exchange because they generate unambiguous results with an excellent reproducibility. For this reason, these techniques are called exact DNA-based typing methods. They include microsatellite length polymorphism (MLP) and multilocus sequence typing (MLST) (Klaassen 2009).

Microsatellite length polymorphism

MLP typing is a PCR-based system that exploits the high variability in the repeat number of microsatellite sequences, defined as tandemly repetitive stretches of two to six nucleotides. Microsatellite markers consist of a defined primer pair flanking a specific microsatellite region in the genome. The PCR fragments amplified differ in length according to the number of repetitions of the microsatellite stretch. Microsatellites display a high level of polymorphism and Mendelian codominant inheritance and thus are excellent candidates for genetic analysis. For each isolate, MLP typing identifies the presence of one (homozygous) or two (heterozygous) different fragments, or alleles, at a given locus. As microsatellites are highly mutable sequences, one concern with the MLP approach is that alleles may be identical not by inheritance but by mutation, resulting in homoplasy (Field et al. 1996; Ortí et al. 1997).

Fluorescently labelled primers are used to amplify specific loci, and the length of the alleles is measured by high-resolution electrophoresis of the PCR products on an automated sequencer (Botterel et al. 2001; Sampaio et al. 2005). The lengths of the alleles are numeric data and can be easily compared. MLP is rapid, easy to perform and amenable to automation and high throughput.
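Because allele lengths are numeric, comparing two isolates reduces to comparing small sets of integers locus by locus. The following minimal Python sketch illustrates the idea; the isolate names, allele sizes (in bp) and function names are invented for illustration and are not taken from any of the cited studies.

```python
# Hypothetical MLP genotypes: for each locus, the two allele lengths (bp) of an isolate.
# A homozygous locus carries one distinct length, a heterozygous locus two.

def normalise(genotype):
    """Store each locus as a sorted tuple so that allele order does not matter."""
    return {locus: tuple(sorted(alleles)) for locus, alleles in genotype.items()}

def is_heterozygous(alleles):
    """A locus is heterozygous when its two alleles differ in length."""
    return len(set(alleles)) == 2

def differing_loci(genotype_a, genotype_b):
    """Return the shared loci (sorted by name) at which two isolates differ."""
    a, b = normalise(genotype_a), normalise(genotype_b)
    return sorted(locus for locus in a.keys() & b.keys() if a[locus] != b[locus])

# Illustrative values only.
isolate_1 = {"CDC3": (117, 125), "EF3": (120, 129), "HIS3": (162, 162)}
isolate_2 = {"CDC3": (125, 117), "EF3": (120, 129), "HIS3": (162, 166)}

print(differing_loci(isolate_1, isolate_2))  # loci at which the genotypes differ
print(is_heterozygous(isolate_1["HIS3"]))    # homozygous locus in isolate_1
```

In this toy example the two isolates differ only at HIS3, where the first is homozygous and the second heterozygous.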

Overall, MLP is one of the most discriminative methods for C. albicans typing. However, its resolving power depends on the microsatellite marker used. Several polymorphic microsatellite loci (e.g. EF3, CDC3, HIS3, ERK1, 2NF1, CCN2, CPH2, EFG1, CAI and CAIII to CAVII) have been identified in the C. albicans genome, with discriminatory power ranging from 0·57 (for CAV) to 0·97 (for CAI) (Field et al. 1996; Bretagne et al. 1997; Lunel et al. 1998; Metzgar et al. 1998; Botterel et al. 2001; Sampaio et al. 2003, 2005). The choice of microsatellite markers should be based on their typing performance, mutation rate and discriminatory power. Combining several markers located on different chromosomes in the same typing system allows more accurate characterization of C. albicans strains by MLP analysis. Multiplex PCR systems co-amplifying several microsatellite markers are possible when primers are labelled with different dyes, allowing rapid, high-resolution strain typing (Botterel et al. 2001; Sampaio et al. 2005). By combining the three polymorphic microsatellite markers CDC3, EF3 and HIS3 in a multiplex PCR system, Botterel et al. (2001) reported a discriminatory power of 0·97. A higher value, 0·998, was obtained by Sampaio et al. (2005) with the multiplex system based on the CAI, CAIII and CAVI markers. Recently, high-resolution melting (HRM) technology was added to microsatellite marker typing of C. albicans strains and shown to increase the discriminatory power of MLP analysis (Costa et al. 2010; Ben Abdeljelil et al. 2012a).
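The text does not state how the discriminatory power values are calculated. In typing studies this figure is commonly Simpson's index of diversity in the form proposed by Hunter and Gaston, D = 1 - [sum of n_j(n_j - 1)] / [N(N - 1)], where N is the number of isolates and n_j the number of isolates of type j. Assuming that definition, the sketch below shows why combining markers raises D; the genotype labels are invented and the function name is hypothetical.

```python
from collections import Counter

def discriminatory_power(types):
    """Simpson's index of diversity (Hunter-Gaston form) for a list of type labels.

    This is the probability that two isolates drawn at random without
    replacement are assigned to different types.
    """
    n = len(types)
    if n < 2:
        raise ValueError("at least two isolates are needed")
    counts = Counter(types).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))

# Invented genotype labels for four isolates at two markers (illustration only).
marker_1 = ["a", "a", "a", "b"]
marker_2 = ["x", "y", "y", "y"]
# The combined genotype of an isolate is the pair of its single-marker genotypes.
combined = list(zip(marker_1, marker_2))

for label, types in [("marker 1", marker_1), ("marker 2", marker_2), ("combined", combined)]:
    print(label, round(discriminatory_power(types), 3))
```

With these toy data each marker alone gives D = 0·5, whereas the combined genotype gives D of about 0·83, because isolates that share a genotype at one marker are split by the other.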

In contrast to conventional techniques, where the number of specimens tested in the same run is limited by the constraints of conventional electrophoresis, MLP analysis has proved very suitable for large-scale epidemiological studies, as up to 96 samples can be analysed in a single run (Bretagne et al. 1997; Dalle et al. 2000, 2003; Botterel et al. 2001; Sampaio et al. 2005; Eloy et al. 2006).

Reproducibility of sequential tests is high (Botterel et al. 2001). However, interlaboratory exchange of microsatellite data can be difficult, because the use of different equipment and electrophoresis conditions can interfere with the allocation of a given length to a specific PCR product (Wenz et al. 1998; Garcia-Hermoso et al. 2010). To standardize microsatellite analysis for C. albicans and to assess the interlaboratory reproducibility of results, Garcia-Hermoso et al. (2010) developed an allelic ladder including the most frequent alleles of the CDC3 microsatellite marker and tested it as an internal reference in six laboratories. Despite variations in size determination, alleles were correctly assigned, making data transportable between laboratories. However, the discriminatory power of CDC3 (0·77) is too low for it to be used as the sole locus in MLP typing of C. albicans strains (Botterel et al. 2001). Hence, further allelic ladders need to be designed for the other microsatellite markers, especially those exhibiting higher discriminatory power such as CAI (Sampaio et al. 2005). Unfortunately, with current technological limitations, the development of a global database for MLP analysis remains impossible.
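The principle of an allelic ladder can be sketched in a few lines of Python: each fragment is assigned to the ladder allele whose peak, measured in the same run, lies closest to it, so that a systematic sizing offset in one laboratory cancels out. This is only an illustration of the idea, not the procedure of Garcia-Hermoso et al. (2010); the allele names, measured sizes, tolerance and function name are all assumptions.

```python
def assign_allele(observed_size, ladder_run, tolerance=0.5):
    """Assign a fragment to the closest ladder allele measured in the same run.

    ladder_run maps an allele name to the size (bp) at which that ladder peak
    was measured in this run. Returns None when no ladder peak lies within
    `tolerance` bp of the observed fragment.
    """
    name, size = min(ladder_run.items(), key=lambda item: abs(item[1] - observed_size))
    return name if abs(size - observed_size) <= tolerance else None

# Hypothetical ladder measurements in two laboratories with different sizing offsets.
ladder_lab_a = {"CDC3-117": 116.8, "CDC3-121": 120.7, "CDC3-125": 124.9}
ladder_lab_b = {"CDC3-117": 117.9, "CDC3-121": 121.8, "CDC3-125": 125.8}

# The same allele is sized differently by the two laboratories ...
size_in_lab_a, size_in_lab_b = 120.6, 121.9
print(round(size_in_lab_a), round(size_in_lab_b))  # naive rounding disagrees

# ... but is given the same name once it is read against the ladder.
print(assign_allele(size_in_lab_a, ladder_lab_a))
print(assign_allele(size_in_lab_b, ladder_lab_b))
```

A fragment that matches no ladder peak within the tolerance is returned as None, which mirrors the practical limitation that a ladder only covers the alleles it contains.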

MLP is a robust technique and is strongly recommended for epidemiological studies of C. albicans and other fungi such as Aspergillus fumigatus and Cryptococcus neoformans (Bart-Delabesse et al. 2001; Karaoglu et al. 2008). However, inherent drawbacks of MLP include the high cost of specialized equipment and the difficulty of implementing the technique in routine use, especially in developing countries. To overcome this difficulty, Li and Bai described single-strand conformation polymorphism (SSCP) analysis of the microsatellite CAI and showed that it is a powerful (discriminatory power of 0·993) and cost-effective approach for rapid strain typing of C. albicans in clinical laboratories, especially for the detection of microevolutionary changes (Li and Bai 2007). This technique was later used for accurate typing of C. albicans isolates from vulvovaginitis (Fan et al. 2008; Liu et al. 2009).

Multilocus sequence typing

MLST is typically based on the analysis of nucleotide sequence polymorphisms within the sequences of internal fragments of six to eight independent genes (loci). Genes chosen for MLST analysis are generally those with housekeeping functions that are subject to stabilizing selection. In addition, selected loci must provide as much sequence diversity as possible to allow high levels of allelic discrimination (Bougnoux et al. 2002).

MLST involves amplification of DNA fragments (400–500 bp) by PCR followed by DNA sequencing. For each housekeeping locus, different sequences are considered as distinct alleles. Each isolate is therefore characterized by a series of alleles at the different loci that corresponds to the multilocus sequence type. Data generated by this DNA sequence analysis are unambiguous and can be stored in readily accessible databases (Bougnoux et al. 2002; Tavanti et al. 2005a). In contrast to MLP, MLST has proved to be highly reproducible between laboratories and is therefore amenable to standardization and data portability (Tavanti et al. 2005a). It allows the exchange of genotyping data and the construction of international databases accessible via the Internet. Global databases for many microorganisms are currently available online, including data from epidemiological studies carried out worldwide. This permits global epidemiological and population analysis.
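The bookkeeping behind a sequence type can be sketched as follows: every distinct sequence at a locus receives an allele number, and every distinct combination of allele numbers receives a sequence type number. In practice these numbers are issued by a curated database; the local numbering, locus names, class name and very short toy sequences below are assumptions made only for illustration.

```python
class Registry:
    """Give consecutive integer identifiers to distinct items in order of first appearance."""

    def __init__(self):
        self._ids = {}

    def identifier(self, key):
        if key not in self._ids:
            self._ids[key] = len(self._ids) + 1
        return self._ids[key]

def type_isolates(isolates, loci):
    """Return {isolate: (allele profile, sequence type number)}."""
    allele_registries = {locus: Registry() for locus in loci}
    type_registry = Registry()
    result = {}
    for name, sequences in isolates.items():
        # Distinct sequences at a locus are distinct alleles.
        profile = tuple(allele_registries[locus].identifier(sequences[locus]) for locus in loci)
        # Distinct allele profiles are distinct sequence types.
        result[name] = (profile, type_registry.identifier(profile))
    return result

# Toy data: real MLST fragments are hundreds of base pairs long.
loci = ["locus_A", "locus_B"]
isolates = {
    "isolate_1": {"locus_A": "ACGT", "locus_B": "GGCA"},
    "isolate_2": {"locus_A": "ACGT", "locus_B": "GGTA"},
    "isolate_3": {"locus_A": "ACGT", "locus_B": "GGCA"},
}

typed = type_isolates(isolates, loci)
for name, (profile, sequence_type) in typed.items():
    print(name, profile, sequence_type)
```

Because the comparison is made on exact sequences rather than on measured band or fragment sizes, two laboratories that sequence the same fragment obtain the same allele, which is what makes the data portable.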

MLST methodology was first developed and used for typing pathogenic bacteria and later some pathogenic fungi (e.g. Cryptococcus gattii, Fusarium solani and Batrachochytrium dendrobatidis) (Zhang et al. 2006; Bovers et al. 2007; Morgan et al. 2007). The technique was applied to C. albicans in the early 2000s. The first protocol, based on the analysis of six loci, and a second one, based on the analysis of eight loci, were developed by Bougnoux et al. in 2002 and Tavanti et al. in 2003, respectively. Later, an optimized protocol based on a set of seven loci was proposed as an international consensus unifying the MLST scheme for C. albicans (Bougnoux et al. 2003). Results from strain typing using this system are shared through a public Internet-linked database, where data have been accumulated from different geographical locations. This database provides a valuable resource for evaluating the worldwide diversity of C. albicans and the relationships of isolates identified at various locations. It is worth mentioning that MLST is the only typing method that has a public database, not only for C. albicans but also for C. tropicalis, C. glabrata, C. dubliniensis and C. krusei. In contrast, C. parapsilosis shows too little sequence diversity to be typeable by MLST (Dodgson et al. 2003; Tavanti et al. 2005a,b,c; Jacobsen et al. 2007; McManus et al. 2007). Because C. albicans is diploid, nucleotide sequences generated by the MLST analysis are likely to show heterozygosity at polymorphic sites, and therefore, strains are unambiguously characterized by a diploid sequence type (DST) (Bougnoux et al. 2002).

MLST analysis has been successfully applied to population genetics and molecular phylogeny studies of C. albicans (Tavanti et al. 2005a; Bougnoux et al. 2002, 2006; Chowdhary et al. 2006; Odds et al. 2006, 2007). In population genetic studies, MLST data have confirmed previous studies in the field and, in some instances, refined our understanding of the epidemiology of C. albicans (Bougnoux et al. 2006; Chen et al. 2006; Odds et al. 2006).

Population genetic analysis of C. albicans MLST data concerning isolates obtained from separate sources showed that the species could be divided into a large number of clades and that clades differed in the proportions of isolates they included according to the geographical origin and the anatomical sources. However, the link between the geographical origin and the clade is not absolute, probably because of movement and migration of human populations (Odds 2010). Clade I, found to be the most prevalent MLST cluster of related C. albicans strains, has been associated with resistance to flucytosine and terbinafine. In addition, an association has been demonstrated between clade and the lengths of tandem repeats in some cell surface proteins, but not between clade and virulence or type of infection (Odds 2010).

The population structure of C. albicans as determined by MLST correlated well with the clustering obtained by Ca3 fingerprinting, with clades I, II, III and SA delimited by the Ca3 probe (Robles et al. 2004; Tavanti et al. 2005a; Odds et al. 2007). However, clade E as identified by Ca3 fingerprinting has been separated by MLST analysis into various clades (Tavanti et al. 2005a; Odds et al. 2007).

As data from more and more C. albicans isolates from different geographical sources are deposited in the database, data need to be regularly reviewed to provide a more robust reference basis for definition of MLST clades and to further characterize the genetic diversity of C. albicans.

Many epidemiological studies on C. albicans using MLST have been reported. Viviani et al. (2006) used MLST analysis to investigate a suspected outbreak of C. albicans candidemia cases that occurred in the same hospital ward between 1987 and 1991 and showed that eight cases were caused by the same endemic strain, over a 4-year period.

By comparing Ca3 Southern hybridization and MLST for their ability to discriminate between C. albicans isolates (n = 37) obtained from recurrent cases of oropharyngeal candidiasis in ten HIV-positive patients, Chowdhary et al. (2006) showed that MLST was at least as efficient as Ca3 Southern hybridization for defining genetic relatedness of sequentially isolated strains. Similarly, Robles et al. (2004) demonstrated that MLST was at least as effective as RAPD, MLEE and Ca3 Southern hybridization techniques. Chen et al. (2006) also demonstrated that MLST was superior to the restriction analysis of genomic DNA by BssHII (REAG-B) in discriminating epidemiologically related strains.

Additional reports suggest that MLST may have some limitations in the discrimination of unrelated strains. Shin et al. performed MLST to investigate the genetic relatedness among C. albicans bloodstream isolates (n = 156) recovered from 10 Korean hospitals. The isolates generated 112 unique DSTs, among which 17 DSTs were shared by 61 isolates (39·1%). In addition, the authors carried out REAG-B to characterize isolates indistinguishable by MLST analysis. They showed that isolates with identical DSTs, but which originated from different hospitals, had different REAG-B patterns (Shin et al. 2011). Similarly, Myoung et al. reported that, despite its high discriminatory power in distinguishing between epidemiologically related strains and detecting microevolution within the same strain, MLST may fail to distinguish unrelated isolates. Indeed, the authors showed that some unrelated isolates were identical or similar according to the MLST analysis, but exhibited different REAG-B and C1 fingerprinting patterns. Thus, isolates with identical MLST types may need further genotypic characterization to assess clonal relationships (Myoung et al. 2011).
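The counts reported for the study of Shin et al. (2011) can be cross-checked with simple arithmetic: if 17 of the 112 DSTs are shared, the remaining DSTs are each represented by a single isolate, and these singletons plus the 61 isolates in shared DSTs should add up to the 156 isolates typed. The short Python check below uses only the figures quoted above; the variable names are ours.

```python
total_isolates = 156
unique_dsts = 112
shared_dsts = 17
isolates_in_shared_dsts = 61

# DSTs that are not shared are each represented by exactly one isolate.
singleton_dsts = unique_dsts - shared_dsts
isolates_accounted_for = singleton_dsts + isolates_in_shared_dsts

# Percentage of isolates that MLST alone could not distinguish from at least one other isolate.
share_percent = round(100 * isolates_in_shared_dsts / total_isolates, 1)

print(singleton_dsts, isolates_accounted_for, share_percent)
```

The figures are internally consistent: 95 singleton DSTs plus 61 isolates in shared DSTs account for all 156 isolates, and 61 of 156 corresponds to the 39·1% quoted.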

Limitations of MLST in the characterization of unrelated strains may be explained by the fact that (i) the method analyses the sequences of only seven 300–400 bp loci, so that isolates with identical DSTs may differ substantially through large genomic rearrangements in regions that do not encompass the sequenced loci and (ii) the diploid nature of C. albicans can result in two strains yielding identical DSTs, even though they may differ in the organization of the heterozygous bases at the polymorphic sites (Bougnoux et al. 2002, 2003). An additional problem raised by the C. albicans MLST analysis is that three of the chromosomes of C. albicans (3, 5 and 7) are not represented in the consensus scheme. Lott and Scarborough developed SNP analysis by microarray to complement MLST typing and to extend the SNPs to all the yeast chromosomes. The array consists of multiple replicates of 79 SNPs, derived from 19 loci located on all eight chromosomes, including the seven genes (57 SNPs) that comprise the MLST consensus scheme. The remaining 22 SNPs are from 12 additional loci located on the remaining chromosomes (Lott and Scarborough 2008). Further studies are needed to assess the contribution of this system to epidemiological and population genetic studies of C. albicans.
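Point (ii) can be made concrete with a small Python sketch. It assumes that heterozygous sites in a diploid sequence are recorded with the standard IUPAC ambiguity codes (for example R for A/G and Y for C/T); the five-base haplotypes and the function name are invented for illustration. Two strains whose haplotypes arrange the same heterozygous bases differently collapse to the same unphased sequence.

```python
# IUPAC codes for single bases and for the six possible two-base combinations.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

def diploid_sequence(haplotype_1, haplotype_2):
    """Collapse two phased haplotypes into the unphased sequence of a diploid isolate."""
    if len(haplotype_1) != len(haplotype_2):
        raise ValueError("haplotypes must have the same length")
    return "".join(IUPAC[frozenset((a, b))] for a, b in zip(haplotype_1, haplotype_2))

# Invented haplotypes: both strains are heterozygous at positions 1 and 4,
# but the bases are linked differently on the two chromosome homologs.
strain_x = ("ACGTA", "GCGCA")
strain_y = ("ACGCA", "GCGTA")

print(diploid_sequence(*strain_x))
print(diploid_sequence(*strain_y))
```

Both calls print the same string, so the two strains would receive the same allele at this locus even though none of their four haplotypes is shared.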

According to Garcia-Hermoso et al. (2007), MLP and MLST are similarly efficient in typing and grouping C. albicans strains, indicating that MLST is a suitable technique for clustering analysis. It is important to note, however, that typing a single isolate by MLST according to the consensus scheme requires seven PCRs, whereas MLP analysis requires only one multiplex PCR per isolate, making MLP analysis faster, less tedious, less expensive and better suited to clinical studies than MLST analysis. Nevertheless, MLST analysis remains more appropriate than MLP-based approaches for investigating the population structure of C. albicans.


Both conventional and exact DNA-based methods have provided useful insights into the epidemiology and population structure of C. albicans. The major drawback of the conventional methods is their lack of standardization, which limits interlaboratory comparisons and therefore global population studies; these methods nevertheless remain well suited to investigating epidemiological trends at a local level. Exact DNA-based methods, including MLP and MLST, have emerged as very efficient typing tools. Their main advantage is that the data generated are unambiguous and highly reproducible and can be stored in databases, offering an unprecedented degree of portability and accessibility to all interested users. Such techniques are much more appropriate for global epidemiology. At present, MLST is the only typing method that has a public database and represents the most powerful approach for phylogenetics of C. albicans, whereas MLP analysis needs further standardization. Both MLST and MLP are costly and require specialized equipment, which is an obstacle to their implementation in clinical mycology laboratories, especially in low-income developing countries. The choice of the appropriate typing method depends on the purpose of the investigation. With the technical resources currently available, MLST remains the more suitable technique for C. albicans population genetic studies that aim to assess the population structure, diversity and dynamics of the species. However, in clinical studies (investigation of nosocomial candidiasis or recurrent infection, distinction of epidemic from endemic or sporadic strains, and determination of the origins of infection and of the routes of acquisition and transmission of strains), the selection of the technique depends on the local facilities.

MLP analysis outperforms virtually all conventional DNA-based typing methods but requires an expensive, specialized platform, which limits its implementation in developing countries. In such settings, the platform may be centrally managed and shared by several investigators for a variety of molecular studies (bacterial, fungal or other), which reduces the cost of these studies and makes the platform cost-effective. In the absence of such platforms, other techniques may be recommended, such as SSCP analysis of the CAI polymorphic microsatellite marker, AFLP and PFGE after chromosomal restriction digest.

Conflict of Interests

The authors declare that they have no conflicts of interest.