Mutations and/or overexpression of various transporters are known to confer drug resistance in a variety of organisms. In the malaria parasite Plasmodium falciparum, a homologue of P-glycoprotein, PfMDR1, has been implicated in responses to chloroquine (CQ), quinine (QN) and other drugs, and a putative transporter, PfCRT, was recently demonstrated to be the key molecule in CQ resistance. However, other unknown molecules are probably involved, as different parasite clones carrying the same pfcrt and pfmdr1 alleles show a wide range of quantitative responses to CQ and QN. Such molecules may contribute to increasing incidences of QN treatment failure, the molecular basis of which is not understood. To identify additional genes involved in parasite CQ and QN responses, we assayed the in vitro susceptibilities of 97 culture-adapted cloned isolates to CQ and QN and searched for single nucleotide polymorphisms (SNPs) in DNA encoding 49 putative transporters (total 113 kb) and in 39 housekeeping genes that acted as negative controls. SNPs in 11 of the putative transporter genes, including pfcrt and pfmdr1, showed significant associations with decreased sensitivity to CQ and/or QN in P. falciparum. Significant linkage disequilibria within and between these genes were also detected, suggesting interactions among the transporter genes. This study provides specific leads for better understanding of complex drug resistances in malaria parasites.
We hypothesized that parasite transporters play an important role in the P. falciparum response to antimalarial drugs, including CQ and QN, and that the levels of the responses result from additive and/or interacting contributions of multiple proteins. Genetic variations and/or changes in gene expression in different parasite transporters should therefore affect the parasite response to antimalarial drugs. The P. falciparum genome sequencing project provides an excellent opportunity to study the potential contributions of transporters to drug resistance. The majority of the parasite genes are available in genome databases (Gardner et al., 2002). A comprehensive search of the parasite genome for genetic changes in putative transporters provides a unique approach to identify candidate transporters involved in drug transport or otherwise contributing to drug resistance.
To identify genes contributing to QN sensitivity as well as genes that may modulate parasite response to CQ, we collected single nucleotide polymorphisms (SNPs) from 49 genes that encode predicted or known transporters and transport regulatory proteins available from public databases. We then genotyped a total of 97 culture-adapted isolates (34 from Africa, 42 from Asia, 16 from the Americas and five from Papua New Guinea) and measured the dose–responses to CQ and QN in vitro. We show that SNPs from multiple transporters are associated with elevated levels of the parasite response to CQ and/or QN and provide evidence of co-selection of SNPs from the associated genes, supported by linkage disequilibria (LD) between genes on different chromosomes.
Results and discussion
SNPs from 49 putative transporter genes
We searched the P. falciparum genome databases at the websites of the genome sequencing consortium (websites of TIGR, Sanger Center and Stanford University) for protein motifs similar to known transporters. A total of 113 kb of DNA, 99 kb coding and 14 kb non-coding, containing 49 putative transporter sequences, was amplified from four parasite isolates (Hb3 of central America, Dd2 of south-east Asia, D10 of Papua New Guinea and 7G8 of South America) and sequenced. SNPs and polymorphic microsatellite (MS) sites were identified after alignment of the DNA sequences from five isolates (including 3D7 from the genome sequencing project). Two hundred and thirty-one polymorphic sites, including 67 MS and 164 SNPs, were obtained from 42 of the 49 genes (Table 1). Of the SNPs, 130 are in coding regions (cSNPs) and 34 are in non-coding regions. This gives an overall frequency of one SNP per 690 bp DNA (nucleotide polymorphism, θ= 7.3 × 10−4), one SNP per 764 bp in coding regions (θ = 6.3 × 10−4) and one SNP per 412 bp in non-coding regions (θ = 1.2 × 10−3) respectively. Among the 130 cSNPs, 96 are non-synonymous substitutions (74%) and 34 are synonymous. MS are mostly in introns and are present at a frequency of one in ≈ 1.7 kb. These results confirm that the P. falciparum genome is highly polymorphic when parasites from around the world with different CQ selection histories are scored, with a polymorphic site occurring every 0.49 kb DNA in just five isolates, which is consistent with that reported for chromosome 3 (Mu et al., 2002).
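The densities quoted above follow from simple arithmetic on the counts in this paragraph. The short Python sketch below reproduces them and, as an assumption not stated in the text, also computes Watterson's estimator of θ for a sample of five isolates. The source says only that θ was calculated with DNASP, so small differences from the reported values are to be expected, particularly because the rounded kb totals are used here rather than exact sequence lengths.

```python
# Counts taken from the paragraph above; lengths are the rounded kb totals.
SNPS = {"all": 164, "coding": 130, "non_coding": 34}
LENGTH_BP = {"all": 113_000, "coding": 99_000, "non_coding": 14_000}
N_ISOLATES = 5  # Hb3, Dd2, D10, 7G8 and 3D7


def bp_per_snp(n_snps, length_bp):
    """Average spacing between SNPs in base pairs."""
    return length_bp / n_snps


def watterson_theta(n_snps, length_bp, n_samples):
    """Watterson's estimator per site (an assumption; the source does not name the estimator)."""
    a_n = sum(1.0 / i for i in range(1, n_samples))
    return n_snps / (a_n * length_bp)


for region in SNPS:
    spacing = bp_per_snp(SNPS[region], LENGTH_BP[region])
    theta = watterson_theta(SNPS[region], LENGTH_BP[region], N_ISOLATES)
    print(f"{region}: one SNP per {spacing:.0f} bp, theta ~ {theta:.1e}")
```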
Table 1. Summary of genes encoding 49 putative transporters and single nucleotide polymorphisms (SNPs).
[Table body not recovered. Surviving column headings: PlasmoDB link (4.0); Non-coding (kb); Total Seq (kb); Nucleo. Poly. (θ).]
Indicates genes that were mapped to chromosomes by typing polymorphic sites in or near the genes in 35 progeny from a genetic cross (Su et al., 1999). The chromosomal locations for other genes are according to the assignment of genome databases.
The average nucleotide polymorphism from five isolates.
The gene names are according to blast search hits or annotations from the genome sequencing centres.
The SNPs are not evenly distributed among the genes. There are genes that contain no SNPs, possibly reflecting functional constraints (Table 1). However, 14 genes are quite variable, having three or more non-synonymous substitutions, but few or no synonymous changes. This pattern suggests the possibility of positive selection, although the high frequency of non-synonymous substitutions partly reflects the paucity of synonymous sites (Mu et al., 2002).
Parasite responses to CQ and QN
To investigate whether SNPs from the transporter genes are associated with drug sensitivity, we determined in vitro IC50s to both CQ and QN for 97 cloned isolates (Table 2). Plots of descending CQ and QN IC50 values showed continuous distributions (Fig. 1A and B) characteristic of multigenic, quantitative traits. One obvious gap in the distribution of CQ IC50 (dashed line in Fig. 1A) can be attributed to mutations in pfcrt: all the parasites below this interval carry a wild-type pfcrt allele, whereas parasites above this gap have mutant alleles (data not shown). Significantly, isolates carrying an identical pfcrt mutant allele displayed a wide range of IC50 values, indicating the participation of additional genes in modulating the level of response to CQ. A similar, but smoother continuous distribution of QN IC50 from the isolates was observed (Fig. 1B), suggesting the lack of a major genetic determinant. Although the majority of the CQ-sensitive (CQS) parasites carrying the wild-type pfcrt allele also have lower QN IC50 values, parasites carrying the mutant pfcrt alleles vary widely for QN IC50 (Fig. 1C). Best-fit curves of IC50 values (Fig. 1C) showed approximately parallel lines for CQ and QN, indicating the likely involvement of common genes, especially pfcrt, in levels of sensitivity to both drugs (Fig. 1C).
Table 2. Parasite isolates, origins and responses to CQ and QN.
The IC50 of each isolate for CQ and QN, representing median values from at least five independent assays, was normalized to that of control parasite Dd2 (mean IC50: 404.1 nM for CQ and 315.9 nM for QN) included in each assay to account for day-to-day assay variation.
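One plausible reading of this normalization can be sketched as follows. The sketch assumes that each assay yields one IC50 for the test isolate and one for Dd2, that the per-assay ratio is taken, and that the median ratio is rescaled by the long-run Dd2 mean. The function name and the example values are hypothetical, and the authors' exact procedure may differ.

```python
from statistics import median


def normalized_ic50(isolate_ic50s, control_ic50s, control_mean):
    """Median of per-assay isolate/control ratios, rescaled to the control mean (nM).

    isolate_ic50s and control_ic50s hold values measured in the same assays,
    in the same order. This pairing is an assumption of the sketch.
    """
    if len(isolate_ic50s) != len(control_ic50s):
        raise ValueError("each assay needs one isolate and one control value")
    ratios = [s / c for s, c in zip(isolate_ic50s, control_ic50s)]
    return median(ratios) * control_mean


# Hypothetical CQ measurements (nM) from five assays of one isolate.
isolate = [95.0, 110.0, 102.0, 88.0, 120.0]
dd2 = [390.0, 420.0, 404.0, 380.0, 430.0]
print(round(normalized_ic50(isolate, dd2, control_mean=404.1), 1))
```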
SNPs from multiple transporter genes associated with CQ and QN responses
To identify genes that may contribute to responses to CQ and QN, we analysed the relationship between the SNPs and the quantitative parasite drug responses by direct genotype–phenotype association. When all 97 isolates were tested for associations (columns headed ‘All’, Table 3), 15 SNPs from six genes (pfcrt, G2, G7, G25, G30 and G49) were strongly associated with the quantitative CQ responses at P < 0.001, and four additional SNPs, from pfmdr1, G47 and G70, gave more marginal P-values (Table 3). Similar P-values were obtained for associations of these SNPs with the QN responses of the 97 isolates (Table 3). These P-values give an intriguing hint of multiple drug susceptibility determinants, as the strong CQ associations include most SNPs from pfcrt, and four of the associated genes encode putative ABC transporters (pfmdr1, G2, G7 and G49). However, the associations in the worldwide cohort of 97 parasites may reflect co-ancestry of lineages in different geographical regions in addition to drug response associations. Accordingly, we also analysed the parasites by different continental regions, where they have distinct CQ selection histories and CQR origins (Wootton et al., 2002).
Table 3. SNPs from 11 P. falciparum putative transporter-encoding genes associated with CQ and QN responses.
P-values are emphasized if significant association was determined by permutation analysis (bold) or linear regression/ANOVA analysis (P < 0.05, underlined): see Experimental procedures for these two independent statistical strategies. Some strong P-values for ‘All’ regions may reflect geographical subdivision (indicated in the comments column) rather than CQ or QN associations, as confirmed by direct geographical association tests (data not shown): in these cases, only weak or non-significant P-values were obtained for isolates from individual continents.
Informativeness index (I, see Experimental procedures) was used to exclude SNPs with low minor allele frequency: loci with I < 0.5 are denoted non-informative (NI). Cases with marginal I-values of 0.5–0.7 should also be viewed with care because of relatively low minor allele frequency or parasite sample sizes. &1 represents a trinucleotide insertion and &2 is also a microsatellite polymorphism 700 bp downstream of the stop codon.
Indicates a singleton third allele.
Putative transporter 10 TM segments with CEGA motif
The separate associations for Asia, Africa and the Americas (Table 3) show that 14 SNPs from six genes (pfcrt, pfmdr1, G2, G30, G49 and G55) are strongly associated with CQ responses in at least one geographical region (P < 0.01), and four additional SNPs from G7, G25 and G49 have P-values < 0.022 (cut-off threshold from permutation analysis). Interestingly, different SNPs from pfmdr1 are significantly associated with CQ responses among different parasite populations; the SNPs at amino acid positions 86 and 1034 are significantly associated with CQ responses in parasites from Africa and South America respectively (Table 3), a finding consistent with results reported previously (Volkman and Wirth, 1998; Foote et al., 1990; Reed et al., 2000; Adagut and Warhurst, 2001; Babiker et al., 2001). For the South American parasites, which are all CQ resistant, most of the SNPs in pfcrt are not informative for quantitative drug response associations because of the absence of the ancestral pfcrt allele in this region. The majority of the pfcrt SNPs (except amino acid positions 72 and 97) from African and Asian parasites were very significantly associated (P < 0.0001) with CQ response (Table 3).
Additionally, 12 SNPs from five genes showed evidence of association with the higher QN IC50 in the continental subpopulations (P < 0.022, Table 3). These included seven strongly significant (P < 0.001) pfcrt SNP associations in both Asian and African populations and, to a lesser degree, SNPs in G30 (Africa), G54 (Asia) and G70 (Asia). Also, two SNPs from G2 and one from G49, which have marginally non-significant P-values in the permutation tests, are significant by the regression analysis.
The strong association between pfcrt SNPs and responses to both CQ and QN is consistent with a scenario in which pfcrt may physically interact with both drugs, which is supported by various observations: (i) substitutions of K with I or N at amino acid position 76 of pfcrt changed the parasite response to both CQ and QN simultaneously (Cooper et al., 2002); (ii) the majority of CQS parasites carrying the wild-type pfcrt also have low QN IC50 (Fig. 1C); and (iii) the antimalarial effects of CQ and QN are antagonistic (Skinner-Adams and Davis, 1999). The results also agree with historical observations that no QN failures were reported before CQR (Peters, 1987) and are consistent with the proposal that the use of CQ may have led to background mutations contributing to steady decreases in QN potency (Knowles et al., 1984). The majority of the associations with both CQ and QN are corroborated by significant P-values (P < 0.05) from analysis of variance and linear regression (underlined in Table 3).
Negative controls and SNPs from 39 putative housekeeping genes
Of the 39 putative transporter genes that have SNPs, 28 show no significant associations with CQ or QN responses. These SNPs/genes can serve as controls for false associations due to population structures or other unknown factors. Additionally, we also searched for SNPs from 39 putative housekeeping genes (or partial genes) on chromosome 3 among five isolates (Dd2, 3D7, Hb3, D10 and 7G8), totalling 33.6 kb coding and 3.9 kb non-coding sequences (Mu et al., 2002) (Table 4). These SNPs were then assayed in all 97 isolates. Only 13 of these 39 putative housekeeping genes (compared with 39 of 49 putative transporter genes) showed nucleotide substitutions, a significant difference between the transporter and housekeeping classes in the number of polymorphic genes (χ2 test, P < 0.001). Nucleotide substitution rates within these 13 housekeeping genes are also significantly lower (χ2 test, P < 0.0001), suggesting purifying selection acting on the housekeeping genes and/or positive selection on the transporters (θ = 2.0 × 10−4 for the housekeeping genes compared with θ= 7.3 × 10−4 for the transporter genes). Of the 13 genes with nucleotide substitutions, only two satisfied our informativeness index criterion of I≥ 0.5 among the 97 isolates (guanine nucleotide-binding protein I = 0.59 and 40S ribosomal protein S12 I = 0.66). The SNPs in these two genes are not significantly associated with either CQ or QN responses according to our P-value thresholds from permutation analysis (data not shown).
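The comparison of polymorphic gene counts (13 of 39 housekeeping genes versus 39 of 49 transporter genes) can be checked with a 2 × 2 χ2 test. The sketch below assumes a Pearson χ2 without continuity correction, which the text does not specify, and obtains the one-degree-of-freedom tail probability from the complementary error function.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its upper-tail P-value on 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: housekeeping, transporter. Columns: polymorphic, non-polymorphic.
chi2, p = chi_square_2x2(13, 39 - 13, 39, 49 - 39)
print(f"chi2 = {chi2:.2f}, P = {p:.1e}")
```

Under these assumptions the statistic is well above the 1 d.f. critical value for P < 0.001, in agreement with the significance level reported in the text.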
Table 4. Single nucleotide polymorphisms in 39 putative housekeeping genes.
[Table body not recovered. Surviving column heading: Total seq. (bp).]
Indicates averaged frequency and nucleotide polymorphism respectively.
The gene names are according to annotations in PlasmoDB (4.0).
Co-selection of pfcrt amino acid T76 and pfmdr1 Y86 positions by CQ treatment has been reported recently (Adagut and Warhurst, 2001; Babiker et al., 2001; Djimde et al., 2001), suggesting either that the two genes may work in concert in determining CQ resistance levels or that the mutation at pfmdr1 Y86 may compensate for deleterious pfcrt mutations. To detect potential co-selection, we evaluated LD (D) between pairs of SNPs within a geographical region, including Africa (A), Asia (B) and South America (C) (Fig. 2). In addition to strong LD within pfcrt, the most notable finding was the strong LD detected between pfcrt and pfmdr1 (position 86), G7, G30 and G55 in African parasites, and between pfcrt and G2, G7, G47 and G49 in Asian parasites (P < 0.00001, red, Fig. 2A and B). Additionally, strong LD is present in African isolates within G2 and G54 and between G25 and G49 (Fig. 2A), and in Asian parasites within G2, G49 and G54 and between the following pairs: G7/G47, G7/G49, G47/G49 and G54/G70 (Fig. 2B). For parasites from South America, strong LD is detected between pfcrt and pfmdr1 (position 1034 only), pfmdr1/G47 and G49/G54 in addition to LD within pfcrt, G2 and G54 (Fig. 2C) and, to a lesser extent (P = 0.001–0.00001), between pfcrt and pfmdr1 (position 1034). Some SNPs do not show LD in the South American population because many of the positions (for example in pfcrt) have only one allele. These LDs were corroborated using estimates of R2 implemented in DNASP (Rozas and Rozas, 1999; data not shown).
The LD between pfcrt and pfmdr1 in parasites from Africa (pfmdr1 position 86) and South America (pfmdr1 position 1034) supports previous reports suggesting co-selection or involvement of both pfcrt and pfmdr1 in CQR (Adagut and Warhurst, 2001; Babiker et al., 2001; Djimde et al., 2001; Chen et al., 2002). It is interesting that pfcrt is linked to different SNPs in pfmdr1 in parasites from Africa and South America. This implies that specific genetic backgrounds are associated with various pfcrt mutant alleles and that patterns in LD reflect distinct drug selection histories in Africa and South America (Wootton et al., 2002). Lack of LD between pfcrt and pfmdr1 in the Asian population could also be a result of extensive mefloquine use that may counter the selective effect of CQ (Duraisingh et al., 1997). Strong LD between the genes located on different chromosomes (Table 1) provides indirect evidence that some genes work in concert with pfcrt in CQ and QN responses.
Of course, many unknown factors may also contribute to the observed LD, including other antimalarial agents not tested in this study; therefore, strong LD between putative transporters does not constitute formal proof that they are linked as a result of CQ and QN selection. Similarly, the strong associations between SNPs and drug response phenotypes do not formally prove that drug selection acted historically on these SNPs per se rather than on closely linked loci. However, our results as a whole provide support for the hypothesis that multiple drug response determinants have acted in different combinations on different continents.
All the proteins encoded by the genes associated with CQ and QN responses are predicted to be membrane-spanning transporters or transport regulators that function in either the plasma or the organellar membrane (Table 1). Four of the genes (G2, G7, G49 and G55) encode ABC transporter/ATPases similar to pfmdr1, consistent with the hypothesis that these genes are generally involved in drug transport. G30 encodes a GTPase predicted to be a translation factor (Leipe et al., 2002) that may affect protein synthesis. The majority of substitutions observed are unlikely, however, to cause drastic distortions in the structures of the encoded proteins. It is possible that polymorphisms in non-coding regions, such as those in G7, G30 and G54, affect the expression levels of some of these genes.
Results from this study show that, in addition to pfcrt and pfmdr1, SNPs from several putative transporters are significantly associated with elevated responses to CQ and/or QN, providing evidence that the level of CQ and QN response is a multigenic phenomenon and that mutations in different transporter genes may impact the response to antimalarial compounds. Additionally, we provide strong evidence that pfcrt is also involved in responses to QN. There appear to be shared genes underlying CQ and QN responses, and mutations in pfcrt are probably necessary, but not sufficient, to confer QN resistance. With the near saturation of mutant pfcrt alleles in south-east Asia and South America and an increasing prevalence in Africa, the genetic background for QN treatment failure may exist in current parasite populations. Overlapping, but not identical sets of genes, including many encoding unknown proteins, may provide an explanation for why a parasite can be resistant to CQ but highly sensitive to QN. Although the number of parasite isolates tested in this study is relatively small, identification of these genes implicated in drug responses provides a foundation for their further functional characterization and allows for a better understanding of the genetic basis of drug resistance in malaria parasites. The molecular roles of these candidate genes and transporter proteins in CQ and QN responses can now be tested rigorously using transfection and gene knock-out experiments.
Oligonucleotide primers were designed to amplify and sequence DNA of the transporter genes from four isolates first (Dd2, Hb3, D10 and 7G8). These isolates have been genotyped previously with 342 MS markers and shown to have diverse genetic backgrounds (Wootton et al., 2002). Primers, 20–25 bp long, were synthesized in a DNA synthesizer (Applied Biosystems) or obtained through a commercial supplier (Invitrogen). Polymerase chain reaction (PCR) set-ups included 4 µl of DNA (≈ 5 ng), 0.5 µl of each primer (50 pM) and 45 µl of PCR mix containing 5 µl of 10× PCR buffer, 1.0 µl of dNTPs (10 mM) and 0.1 µl (5 U µl−1) of Taq polymerase (Invitrogen). All the amplifications were performed with one cycling condition: 94°C for 2 min, 35 cycles of 94°C for 20 s, 52°C for 10 s, 48°C for 10 s and 60°C for 1–4 min, and 60°C for 5 min. Five microlitres of each PCR product was run on a 1% agarose gel to check the quality of amplification. If there was a single band, and no obvious ‘primer–dimer’ was present, the PCR product was treated with 1 µl of ExoSAP-IT (USB) at 37°C for 15 min and 80°C for another 15 min. The PCR product (2–5 µl) was used in a sequencing reaction using dichlororhodamine or BigDye terminator chemistry on ABI377 or ABI3100 (Applied Biosystems).
SNP discovery and verification
DNA sequences were aligned using SEQUENCHER 3.1 (Gene Codes Corporation) or ASSEMBLYLIGN software (Oxford Molecular). All potential SNPs and each of the ambiguities were verified by visual inspection. Polymorphic MS sequences were aligned with the software first, then with visual assistance to minimize artificial SNPs.
SNP association and linkage disequilibrium analyses
Genetic analysis was adapted to the haploid inheritance system of P. falciparum. We used two independent methods, namely permutation analysis and quantitative trait locus (QTL) regression analysis, to evaluate association between the multilocus alleles and the drug susceptibility phenotypes. Permutation analysis used refinements of methods described by Churchill and Doerge (1994), Long and Langley (1999) and Zhao et al. (2000). The association statistic, Ai for the ith polymorphic site, used the discrete small-sample equivalent of the t-test statistic based on the direct distance between the binary-encoded genotype vector and the scaled (0,1) phenotype vector of log IC50 values. The significance of Ai was determined from the distribution of simulated values obtained from 1000 random permutations of the scaled phenotype vector on the fixed indices of the genotypes. If Ai was within the range of values from permutations, an exact, distribution-independent P-value was obtained from the proportion of permuted samples more extreme than Ai. If Ai was more extreme than the computed permutation distribution, its P-value was estimated by extrapolation using Fisher's Z-score approximation. These unadjusted P-values are presented in Table 3; however, as long recognized (Good, 1953), such estimates do not translate directly into conventional significance levels. Accordingly, empirical significance thresholds were determined using second-order nested permutations of the permuted phenotype vectors. Intervals estimated by this test were: ‘marginal’, P < 0.022 (one false positive expected by chance in the entire multilocus association analysis); and ‘highly significant’, P < 0.001. Such thresholds are sample size independent and account for both the non-normality of the permutation distribution and the multiple tests of the multilocus analysis, thus enabling us to minimize the false discovery rate of associations.
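The core of the permutation procedure can be illustrated with a minimal Python sketch. It does not reproduce the statistic Ai described above: as a stand-in, it uses the absolute difference in mean log IC50 between the two allele groups, and it covers only the case in which the P-value is read directly from the permutation distribution (no extrapolation and no nested thresholds). Function names and the example data are hypothetical.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def association_stat(genotypes, phenotypes):
    """Absolute difference in mean phenotype between allele 1 and allele 0 (a stand-in statistic)."""
    group0 = [p for g, p in zip(genotypes, phenotypes) if g == 0]
    group1 = [p for g, p in zip(genotypes, phenotypes) if g == 1]
    if not group0 or not group1:
        raise ValueError("both alleles must be present")
    return abs(mean(group1) - mean(group0))


def permutation_p_value(genotypes, ic50s, n_perm=1000, seed=0):
    """Shuffle log IC50 values over fixed genotypes; return (observed, P)."""
    rng = random.Random(seed)
    log_ic50 = [math.log(v) for v in ic50s]
    observed = association_stat(genotypes, log_ic50)
    shuffled = list(log_ic50)
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if association_stat(genotypes, shuffled) >= observed:
            n_extreme += 1
    return observed, n_extreme / n_perm


# Hypothetical data: ten wild-type (0) and ten mutant (1) isolates.
genotypes = [0] * 10 + [1] * 10
ic50s = [20, 25, 18, 30, 22, 27, 19, 24, 21, 26,
         300, 410, 280, 350, 390, 320, 450, 310, 370, 400]
print(permutation_p_value(genotypes, ic50s))
```

With 1000 permutations the smallest non-zero P-value that can be read directly is 0.001, which is why more extreme statistics require the extrapolation described above.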
We also used the same 1000 permutations to make an empirical estimate of the ‘informativeness index’, Ii, of each ith SNP locus for the four geographical sets of isolates analysed (‘All regions’, ‘Asia’, ‘Africa’ and ‘Americas’). Ii is the ratio Ti/Tmax, where Ti is the range (Aupper–Alower) of the association statistic obtained from the 1000 permutations for locus i, and Tmax is the corresponding range from 1000 random permutations for a maximally informative model locus with equal allele frequencies. Ii is independent of sample size and provides a consistent criterion to exclude the less informative SNPs from significance tests: loci with I < 0.5 are denoted ‘NI’ (not informative) in Table 3.
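The ratio Ti/Tmax can be illustrated with the following sketch. Two assumptions go beyond the text: the association statistic is taken to be the squared Euclidean distance between the 0/1 genotype vector and the (0,1)-scaled phenotype vector, and the range is taken as maximum minus minimum over the permutations. Under these assumptions a locus with a rare minor allele gives a much narrower permutation range than a balanced locus, and therefore a low index.

```python
import random


def scale01(values):
    """Rescale phenotypes linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def distance_stat(genotypes, scaled):
    """Squared distance between 0/1 genotypes and scaled phenotypes (assumed statistic)."""
    return sum((g - p) ** 2 for g, p in zip(genotypes, scaled))


def permutation_range(genotypes, scaled, n_perm, rng):
    """Max minus min of the statistic over random phenotype permutations."""
    shuffled = list(scaled)
    stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        stats.append(distance_stat(genotypes, shuffled))
    return max(stats) - min(stats)


def informativeness(genotypes, phenotypes, n_perm=1000, seed=0):
    """Ratio of the locus range to that of a model locus with equal allele counts."""
    rng = random.Random(seed)
    scaled = scale01(phenotypes)
    n = len(genotypes)
    balanced = [1] * (n // 2) + [0] * (n - n // 2)
    t_i = permutation_range(genotypes, scaled, n_perm, rng)
    t_max = permutation_range(balanced, scaled, n_perm, rng)
    return t_i / t_max


# Hypothetical phenotypes for 20 isolates and a locus with one minor allele.
phenotypes = [float(v) for v in range(1, 21)]
rare_locus = [1] + [0] * 19
print(informativeness(rare_locus, phenotypes))
```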
Population-based quantitative trait loci (QTL) regression association was performed as described previously (Zhao et al., 2001). Briefly, each locus tested is assumed to have two alleles, A (wild-type) and a (mutant), with frequencies PA and Pa respectively. Let yi be the IC50 value of the ith parasite response to CQ and QN and xi be an indicator variable of the allele at the locus, defined as xi = 1 if the allele is a and xi = −1 if the allele is A. QTL analysis was modelled by the following regression:
yi = µ + xiα + ei
where µ is overall population mean, α is genetic additive effect and ei is an error with E[ei] = 0 and Var (ei) = σ2e. The generalized likelihood ratio test statistic, which follows an F1,n−2 distribution, was used to test the null hypothesis H0: α = 0. P-values for the F statistics were assessed using regression ANOVA tests.
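The regression and its F statistic can be written out in a few lines. The sketch below is an ordinary least-squares fit of the model above with the ±1 allele coding. It returns the F statistic on (1, n − 2) degrees of freedom and leaves the P-value to a statistics library (for example `scipy.stats.f.sf`), which is not imported here. The function name and the example data are hypothetical.

```python
def qtl_regression(y, x):
    """Fit y_i = mu + x_i * alpha + e_i with x_i = +1 (allele a) or -1 (allele A).

    Returns (mu, alpha, F), where F tests H0: alpha = 0 on (1, n - 2) d.f.
    Requires both alleles to be present and a non-zero residual variance.
    """
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    alpha = sxy / sxx
    mu = y_bar - alpha * x_bar
    ss_regression = alpha * sxy
    ss_residual = sum((yi - mu - alpha * xi) ** 2 for xi, yi in zip(x, y))
    f_stat = ss_regression / (ss_residual / (n - 2))
    return mu, alpha, f_stat


# Hypothetical phenotypes for three wild-type and three mutant isolates.
alleles = [-1, -1, -1, 1, 1, 1]
response = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
print(qtl_regression(response, alleles))
```

With this coding, α is half the difference between the mean responses of the two allele classes, so 2α estimates the allele substitution effect.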
In view of caveats about inferences from single LD measures (Hedrick and Kumar, 2001), we used two independent methods to calculate the amount of LD between alleles (nucleotides) at different polymorphic sites. The basic LD measure, D (Lewontin and Kojima, 1960), was computed for all pairs of loci from the numerically encoded allele vectors of loci A and B, and its significance was evaluated by permutation analysis. The observed D-value was compared with the distribution computed from 400 random permutations of the allele values of locus B on the fixed indices of the locus A vector. P-values and confidence intervals were determined from these permutation tests, as described above for genotype–phenotype associations: the thresholds for unadjusted P-values (Fig. 2) were P < 0.001 (marginal) and P < 0.00001 (highly significant). LD was also calculated using a pairwise R2 estimate implemented in the DNASP package (Rozas and Rozas, 1999). A concatenated sequence (haplotype) of the associated genes for each isolate was created and aligned according to geographical origins and imported into DNASP. The significance levels of R2 were evaluated by the χ2-test option of this package. Nucleotide polymorphism (θ) was also calculated using DNASP.
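For haploid genotypes, both LD measures reduce to simple functions of allele and haplotype frequencies. The sketch below computes D and R2 for two 0/1-encoded loci and a direct permutation P-value for |D| from 400 shuffles of locus B. It is an illustration under these assumptions and not the DNASP implementation. P-values below 1/400 cannot be read directly from such a test and would require the extrapolation described for the association analysis.

```python
import random


def ld_d(a, b):
    """D = p_AB - p_A * p_B for haploid 0/1 allele vectors of equal length."""
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_ab = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    return p_ab - p_a * p_b


def ld_r2(a, b):
    """Squared correlation between alleles; both loci must be polymorphic."""
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    return ld_d(a, b) ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))


def ld_permutation_p(a, b, n_perm=400, seed=0):
    """Proportion of shuffles of locus B with |D| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(ld_d(a, b))
    shuffled = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(ld_d(a, shuffled)) >= observed:
            hits += 1
    return hits / n_perm


# Hypothetical alleles at two loci in 12 isolates.
locus_a = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
locus_b = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(ld_d(locus_a, locus_b), ld_r2(locus_a, locus_b), ld_permutation_p(locus_a, locus_b))
```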
We greatly appreciate the valuable discussions with Drs T. J. C. Anderson and Y.-F. Wang. We also thank Ms Brenda R. Marshall for editorial assistance. This study would not have been possible without the efforts of the P. falciparum genome sequencing consortium that includes The Institute for Genomic Research, Sanger Center and Stanford University.