Keywords:

  • Human malaria;
  • selection signatures;
  • pyruvate kinase-deficiency;
  • PKLR;
  • molecular markers

Summary

  1. Top of page
  2. Summary
  3. Material and methods
  4. Results
  5. Discussion
  6. Acknowledgements
  7. References
  8. Supporting Information

The genetic component of susceptibility to malaria is both complex and multigenic and the better-known protective polymorphisms are those involving erythrocyte-specific structural proteins and enzymes. In vivo and in vitro data have suggested that pyruvate kinase deficiency, which causes a nonspherocytic haemolytic anaemia, could be protective against malaria severity in humans, but this hypothesis remains to be tested. In the present study, we conducted a combined analysis of Short Tandem Repeats (STRs) and Single Nucleotide Polymorphisms (SNPs) in the pyruvate kinase-encoding gene (PKLR) and adjacent regions (chromosome 1q21) to look for malaria selective signatures in two sub-Saharan African populations from Angola and Mozambique, in several groups with different malaria infection outcomes. A European population from Portugal, including a control and a pyruvate kinase-deficient group, was used for comparison. Data from STR and SNP loci spread along the PKLR gene region showed a considerably higher differentiation between African and Portuguese populations than that usually found for neutral markers. In addition, a wider region showing strong linkage disequilibrium was found in an uncomplicated malaria group, and a haplotype was found to be associated with this clinical group. Altogether, these data suggest that malaria selective pressure is acting in this genomic region.

According to the World Malaria Report 2008 (World Health Organization, WHO, 2008), 109 countries are currently endemic for malaria, 45 of which are within the African region, and 247 million malaria cases were estimated among the 3·3 billion people at risk in 2006. These cases resulted in nearly a million deaths, mostly of children under 5 years old. Despite this disastrous picture, the current combination of tools and methods to combat malaria, including long-lasting insecticidal nets and artemisinin-based combination therapy (ACT), supported by indoor residual spraying of insecticide and intermittent preventive treatment in pregnancy, is leading to a significant reduction of cases in some countries, such as Gambia (Ceesay et al, 2008), Kenya (O’Meara et al, 2008) and São Tomé and Príncipe (unpublished observations). However, both Anopheles mosquito and Plasmodium parasite have developed resistance to insecticides (Anto et al, 2009) and new drugs (Noedl et al, 2008), which clearly shows that the fight against the disease continues to be a difficult challenge.

Malaria has been reported as one of the strongest known forces for evolutionary selection in the recent history of the human genome. The genetic component of susceptibility to malaria is complex and multigenic, with a variety of genetic polymorphisms reported to influence both pathogenesis and host response to infection (Kwiatkowski, 2005; Min-Oo & Gros, 2005; Williams, 2006). The identification of these variants might, therefore, help to improve the development of therapeutic and disease-prevention strategies.

The most common and best characterised malaria protective polymorphisms are those involving erythrocyte-specific structural proteins and enzymes, such as sickle cell disease and glucose-6-phosphate dehydrogenase (G6PD)-deficiency. More recently, pyruvate kinase (PK)-deficiency has also been reported as protective against malaria in murine models (Min-Oo et al, 2003) and two studies have reported the in vitro culturing of P. falciparum in PK-deficient blood with a significant decrease in parasite replication (Ayi et al, 2008; Durand & Coetzer, 2008). However, the possibility that PK-deficiency may affect susceptibility to malaria in humans remains to be confirmed.

Apart from results in murine models and in vitro cultures, there are no population data supporting a positive association between PK-deficiency and malaria protection. Given the differences in selection pressure that mice and humans have been exposed to over tens of millions of years, the major susceptibility genes in the two species are unlikely to be the same (Hill, 1998). This, together with the possibility that any crucial insufficiency of the erythrocytes, besides PK-deficiency, may influence the development of the parasite, makes clear the need for additional studies to clarify this question. Moreover, until now, in contrast to G6PD-deficiency or sickle cell disease, elevated frequencies of PK-deficiency have not been recorded in malaria-endemic areas; however, a systematic analysis has never been carried out, and even the information about the frequency of PK-deficiency in African populations is clearly limited (Manco et al, 2001; Mateu et al, 2002).

The first study including a population genetic approach concerning the possible association between the PKLR gene (PK-encoding gene) and malaria was carried out on the island of Santiago, Cabo Verde (Alves et al, 2010). Although no association was then found between any PKLR polymorphism and infection status, a strong linkage between distant loci in the gene and adjacent regions was reported only in non-infected individuals. This linkage could mean that there is a more conserved gene region that is selected if protective against the infection and/or disease. The present study aimed to further analyse this preliminary result by looking at the PKLR gene and adjacent regions in individuals belonging to different population groups (from Angola and Mozambique, both malaria-endemic countries, and from Portugal, a country with no malaria transmission) and with different malaria status (asymptomatic infection, mild and severe malaria), with the goal of identifying potential selection signatures imprinted by malaria in this genomic region.

Material and methods

Study areas

Angola and Mozambique are both sub-Saharan countries. Angola (capital Luanda, 8°50′ 18″S, 13°14′ 4″E) is located in south-western Africa and is bordered by the Atlantic Ocean to the west; Mozambique (capital Maputo, 25°57′ 55″S, 32°35′ 21″E) is in south-eastern Africa with its east coast on the Indian Ocean. Both have a tropical climate with two seasons, one wet and warm from September to May, and the other dry and cold from June to August. Malaria, predominantly caused by Plasmodium falciparum, is endemic in both countries (Cuamba et al, 2006; Mabunda et al, 2008). Portugal (39°30′N, 8°00′W) is in south-western Europe. Malaria transmission was interrupted in nearly all parts of the country by 1958 and eradication was confirmed by WHO in 1973 (Bruce-Chwatt, 1977).

Sampling

A total of 417 DNA samples were analysed in this study. There were 316 collected from both uninfected and infected non-related children with a different malaria outcome: 166 from Luanda, Angola (ANG) [44 with severe malaria, 43 with uncomplicated malaria, 37 from asymptomatic infected individuals and 42 from healthy aparasitaemic individuals (uninfected)] and 150 from Maputo, Mozambique (MOZ) (51 with severe malaria and 99 with uncomplicated malaria). The pooling of all samples from Angola (ANG) and Mozambique (MOZ) constituted the African group (AFR). Two groups from Portugal were also analysed: there were 80 samples from healthy individuals (control Portuguese group, PT-C) (described in Alves et al, 2007) and 21 belonging to individuals with PK-deficiency (PT-PKD) (described in Manco et al, 1999, 2000).

Malaria outcome was defined as follows: (i) Severe malaria (SM): slide positive for blood-stage asexual P. falciparum at any parasite density, fever (axillary temperature ≥37·5°C), haemoglobin level of Hb≤50 g/l and/or other symptoms, such as coma, prostration or convulsions; (ii) Uncomplicated malaria (UM): slide positive for blood-stage asexual P. falciparum at any parasite density, fever (axillary temperature ≥37·5°C) and haemoglobin level of Hb>50 g/l; and (iii) Asymptomatic infection (AI): slide positive for blood-stage asexual P. falciparum at any parasite density in the absence of fever or other symptoms of clinical illness. The additional group of uninfected children (NI) was defined as slide negative and the absence of fever or other symptoms of clinical illness. Slide negativity was afterwards confirmed by Polymerase Chain Reaction (PCR). The illness group (ILL) comprised all the individuals expressing clinical disease: SM plus UM.

Blood collection and DNA extraction

Blood sample collections by finger-prick were carried out in Angola in August 2005 and in Mozambique during 2006 from children aged 3 months to 15 years who reported to the Emergency Services of the Paediatric Hospital David Bernardino, Luanda (Angola) or to the Paediatric Emergency Services of Central Hospital of Maputo, Health Centre of Bagamoyo or Health Centre of Boane (Mozambique). The blood was drawn after the clinician examination (malaria was considered to be the primary diagnosis if Plasmodium parasites were found in the peripheral blood and if other likely causes of the clinical presentation could be excluded at the admission) but before the administration of any anti-malarial therapeutics and/or blood transfusion. The registration of symptoms, axillary temperature, haemoglobin level and history of malaria was done for all individuals.

The investigation was approved by both the Ministry of Public Health of Angola and Mozambique and by the local Ethical Committees at the institutions involved in the study. Each individual and parent/tutor of the children was informed of the nature and aims of the study and told that participation was voluntary; informed consents were obtained from all individuals.

DNA was extracted using standard phenol-chloroform or chelex procedures from peripheral blood. In the case of infected individuals, human and Plasmodium DNA were extracted simultaneously.

Genotyping

A section of chromosome 1q21, including the PKLR gene and adjacent regions, with a total length of ≈ 95 kb, was genotyped for 4 Short Tandem Repeats (STRs) and 15 Single Nucleotide Polymorphisms (SNPs). Samples were also genotyped for 32 Ancestry Informative Insertion/Deletion polymorphisms (AI-INDELs) distributed throughout the genome. The localization of polymorphisms in chromosome 1 is represented in Fig 1.

Figure 1.  The 95 kb fragment analysed in this study, including the PKLR gene. (A) Localization in chromosome 1q21; (B) The 4 STR loci (PKV, PKA, IVS11 and IVS3) genotyped in the present study and genes near PKLR; (C) The 15 SNP loci analysed, spread along a region closer to the PKLR gene. Adapted from http://www.hapmap.org.

STRs

The STRs used were IVS3 (in intron 3), IVS11 (intron 11), PKA (≈ 25 kb upstream from the PKLR gene) and PKV (≈ 65 kb upstream from the gene) and were genotyped after multiplex PCR as described in Alves et al (2010).

SNPs

SNPs localised in a region closer to PKLR than the abovementioned STRs were genotyped using a SNaPshot (Applied Biosystems, Foster City, CA, USA) multiplex reaction.

The DNA sequence of chromosome 1q21, including the PKLR gene and flanking regions, was screened for SNPs in the HapMap database (http://hapmap.ncbi.nlm.nih.gov/). A total of 13 SNPs were selected in a region of 40,970 bp that spanned the PKLR gene (chr1:153515199..153556169; data source: HapMap Data Rel 22/phaseII Apr07, on NCBI B36 assembly, dbSNP b126), starting at 18 334 bp upstream and extending to 11 055 bp downstream of the gene. All the SNPs described for the PKLR gene were selected for genotyping, except rs3020781, which had amplification difficulties. SNPs outside of the gene that showed variation in the reference African population (Yoruba, Nigeria), with a minor allele frequency above 15% and distances between contiguous SNPs greater than 1600 bp, were included in the study.
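
The selection criteria for SNPs outside the gene can be sketched in a few lines of Python. This is only one possible reading of the rule (minor allele frequency above 15% and more than 1600 bp between contiguous selected SNPs), applied greedily from the lowest coordinate; the function name, the SNP identifiers and the candidate positions are hypothetical and are not taken from HapMap. The region length is the only value taken from the text.

```python
# Length of the screened region, from the coordinates quoted in the text.
REGION_START = 153515199
REGION_END = 153556169
region_length = REGION_END - REGION_START  # matches the 40,970 bp stated above


def select_snps(candidates, min_maf=0.15, min_gap=1600):
    """Greedy filter (an assumed reading of the selection rule).

    candidates: list of (snp_id, position_bp, minor_allele_frequency).
    Keeps a SNP if its MAF exceeds min_maf and it lies more than
    min_gap bp from the previously kept SNP.
    """
    selected = []
    last_pos = None
    for snp_id, pos, maf in sorted(candidates, key=lambda s: s[1]):
        if maf <= min_maf:
            continue  # not variable enough in the reference population
        if last_pos is not None and pos - last_pos <= min_gap:
            continue  # too close to the previously selected SNP
        selected.append(snp_id)
        last_pos = pos
    return selected


# Hypothetical candidates, for illustration only.
toy_candidates = [
    ("snp_a", 1000, 0.30),
    ("snp_b", 1500, 0.40),
    ("snp_c", 3000, 0.10),
    ("snp_d", 4000, 0.20),
]
toy_selection = select_snps(toy_candidates)
```

A greedy pass is the simplest way to enforce a minimum spacing; other orderings (for example, preferring the SNP with the highest frequency in each window) would give a different but equally valid panel.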

Two additional mutations were investigated in the PKLR gene: 1456C>T, because it is the most common mutation in South Europe, namely in Portugal (Manco & Abade, 2001) and the only one described in PK-deficient Afro-American individuals (Beutler & Gelbart, 2000), and 1614A>T, identified in São Tomé and Príncipe (Manco et al, 2009).

Primers were designed for the flanking regions of each of the 15 SNPs in the GenBank database sequence AY316591 with Primer 3 software v.0.4.0 (Rozen & Skaletsky, 2000; primer sequences in Table SI). Primers were first tested in singleplex and then multiplex reactions were carried out according to Goios et al, 2008, using the Qiagen Multiplex PCR Kit (Qiagen, Hilden, Germany).

For each SNP, an SBE-Primer was designed with Primer 3 software (Table SII). Amplified products were purified with ExoSAP-IT (Amersham Biosciences, Uppsala, Sweden) and SNaPshot reactions were then performed using the SNaPshot Multiplex Kit (Applied Biosystems) in a reaction volume of 5 μl with primer concentrations as indicated, under the following conditions: 96°C for 10 s, 55°C for 5 s, and 60°C for 30 s, repeated for 27 cycles. The final products were purified with SAP (Amersham Biosciences) and run on an ABI PRISM 3130 Genetic Analyzer. Allele assignment was performed using GeneMapper 4.0 (Applied Biosystems).

Ancestry informative INDELs

The high levels of genetic substructure in Africa, even within small geographic regions, require the determination of individual ancestry and proper correction for substructure in association studies (Campbell & Tishkoff, 2008). To look into the structure of our African groups and to investigate if our PT-PKD group could have a relevant African genetic component, which would suggest that PK-deficiency could be frequent in that region, 32 INDEL polymorphic regions localised throughout the genome were genotyped as described in Santos et al (2010). In this work, we used only a subset of the original assay, comprising the INDELs that are especially informative of African and European ancestry. An additional reference Portuguese group (PT-REF) that was previously typed for these INDEL loci (Santos et al, 2010) was also used in this analysis.

Statistical analysis

Analysis was performed by comparing population groups (ANG, MOZ, PT-C, PT-PKD) and malaria status groups (SM, UM, AI, NI, ILL). STR and SNP results were explored with Arlequin 3.1 (Excoffier et al, 2005): determination of the allele frequencies, expected and observed heterozygosity and population pairwise FST values, Hardy–Weinberg equilibrium tests, Linkage Disequilibrium (LD) tests, haplotype frequency estimation and analysis of molecular variance (AMOVA). When there were multiple tests, Bonferroni’s correction was applied, dividing 0·05 by the number of tests to obtain the actual cut-off for significance. The allelic association of SNPs and STRs with malaria status groups was assessed by Pearson’s chi-square test on 2 × 2 contingency tables using Simple Interactive Statistical Analysis (SISA, http://www.quantitativeskills.com/sisa/). Odds ratios (OR) and 95% confidence intervals (CI) were estimated using SISA. Allelic richness with rarefaction of private alleles was calculated with HP-Rare (Kalinowski, 2005). Bayesian clustering analysis as implemented by Structure 2.2 (Pritchard et al, 2000) was used to infer population substructure/ancestry from the INDEL data set, without prior information on sampling groups, under the admixture model with correlated allele frequencies. Ten independent runs with 10⁵ burn-in steps and 10⁵ iterations were performed for each value of K (K = 1 to 5 clusters). For INDELs, Arlequin 3.1 (Excoffier et al, 2005) was also used for FST calculations.
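
The allelic association test described above can be illustrated with a minimal Python sketch. It computes Pearson's chi-square for a 2 × 2 table (1 degree of freedom, no continuity correction) and an odds ratio with a log-scale (Woolf) 95% confidence interval. Whether SISA uses exactly these formulas is an assumption, and the counts below are hypothetical, not data from this study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]].

    Rows: group of interest vs. comparison group.
    Columns: allele present vs. allele absent (counts of gene copies).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    Assumes all four cells are non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, for illustration only.
chi2, p = chi_square_2x2(12, 30, 8, 120)
or_value, ci_low, ci_high = odds_ratio_ci(12, 30, 8, 120)
```

An interval built this way is symmetric around the odds ratio on the log scale, which is why the upper bound lies much further from the estimate than the lower bound when counts are small.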

Results

STRs

The allele frequencies for the four STR loci found in ANG, MOZ, PT-C and PT-PKD are shown in Table SIII. The IVS3 locus presented the greatest diversity indices in all groups, with the highest number of alleles and expected heterozygosity. In both African groups, the observed genotype frequencies were in accordance with Hardy-Weinberg expectations for all loci except IVS3, which revealed a heterozygosity significantly below that expected (P ≤ 0·000). In the Portuguese groups, all loci were in Hardy-Weinberg equilibrium in the control PT-C (P = 0·378 for IVS3) but not in the PT-PKD group, which showed a strong deviation from the expected values for IVS3 (P ≤ 0·000) and IVS11 (P = 0·006).
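
The comparison between observed and expected heterozygosity that underlies these Hardy-Weinberg results can be sketched as follows. The sketch uses Nei's unbiased estimator of gene diversity for the expected value; it only illustrates the two quantities and is not the exact test run in Arlequin. The genotypes are hypothetical repeat numbers, not data from the study.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele1, allele2) tuples, one per individual.
    Expected heterozygosity uses Nei's unbiased gene diversity:
    (2n / (2n - 1)) * (1 - sum of squared allele frequencies).
    """
    n = len(genotypes)
    observed = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n  # number of gene copies sampled
    sum_sq = sum((count / total) ** 2 for count in counts.values())
    expected = (total / (total - 1)) * (1.0 - sum_sq)
    return observed, expected


# Hypothetical STR genotypes, for illustration only.
toy_genotypes = [(34, 34), (35, 35), (34, 35), (36, 36), (34, 34), (35, 36)]
h_obs, h_exp = heterozygosity(toy_genotypes)
```

In this toy sample the observed value falls below the expected one, which is the direction of the deviation reported for IVS3 (an excess of homozygotes).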

When FST values were calculated, no significant differentiation was obtained for the pair ANG vs. MOZ (FST = 0·002; P = 0·189). When Portuguese groups were compared, significant values were obtained, as expected: FST(PT-C vs. PT-PKD) = 0·025; P ≤ 0·000. Since no differentiation was found between Angola and Mozambique, a single group was formed for all of the African samples (AFR) and it was compared to Portuguese groups to investigate if African and Portuguese PK-deficient individuals were genetically closer in this genomic region than African and Portuguese controls. If so, we could hypothesise that PK-deficiency could be frequent in Africa (because of some kind of selective advantage conferred by the disease). The FST values obtained were as follows: FST(AFR vs. PT-C) = 0·102 and FST(AFR vs. PT-PKD) = 0·153 (P ≤ 0·000 for both tests).
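
To make the meaning of a pairwise FST value concrete, the sketch below computes a simple heterozygosity-based estimate, FST = (HT − HS) / HT, for two populations at a single locus with equal weighting. This is a textbook simplification, not the variance-component estimator implemented in Arlequin, so it will not reproduce the values reported here. The allele frequencies are hypothetical.

```python
def fst_two_populations(freqs1, freqs2):
    """Heterozygosity-based FST for two populations at one locus.

    freqs1, freqs2: dicts mapping allele -> frequency (each summing to 1).
    HS is the mean within-population gene diversity; HT is the gene
    diversity of the pooled (equally weighted) population.
    """
    alleles = set(freqs1) | set(freqs2)
    hs1 = 1.0 - sum(freqs1.get(a, 0.0) ** 2 for a in alleles)
    hs2 = 1.0 - sum(freqs2.get(a, 0.0) ** 2 for a in alleles)
    hs = (hs1 + hs2) / 2.0
    ht = 1.0 - sum(
        ((freqs1.get(a, 0.0) + freqs2.get(a, 0.0)) / 2.0) ** 2 for a in alleles
    )
    return (ht - hs) / ht if ht > 0 else 0.0


# Hypothetical allele frequencies, for illustration only.
pop_a = {"A": 0.9, "G": 0.1}
pop_b = {"A": 0.5, "G": 0.5}
fst_different = fst_two_populations(pop_a, pop_b)
fst_identical = fst_two_populations(pop_a, pop_a)
```

Identical populations give a value of zero, and the value grows as allele frequencies diverge, which is the sense in which the African versus Portuguese comparisons indicate stronger differentiation than ANG versus MOZ.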

No significant differentiation was found between the several malaria status groups, whether considering each of the four STR loci separately or all together. As FST was not significant when comparing ANG and MOZ, UM and SM samples from both countries were pooled into two larger groups, but still no significant values were found between these groups. No STR or SNP allele was associated with any malaria status group (P > 0·05) and OR values were non-significant for all groups. Moreover, when private allelic richness was calculated for the STRs (considering 42 gene copies for all groups, as PT-PKD only included 21 samples), private alleles were not identified, supporting the previous result. However, allele 16 of locus IVS11 (χ² = 10·918; P < 0·001 and OR = 6·200 with 95% CI 1·858–20·685) and allele 36·2 of locus IVS3 (χ² = 13·265; P < 0·001 and OR = 5·961 with 95% CI 2·072–17·154) were significantly associated only with PT-PKD. These two specific alleles were not associated with any particular malaria status group.

The African groups ANG and MOZ showed a marked LD for all pairs of loci (P ≤ 0·000). Conversely, the group PT-C only showed LD for the closer loci (PKV/PKA and PKA/IVS11), while the PT-PKD group only showed LD for PKV/IVS11. However, when the African malaria status groups were analysed separately, only UM sets from both Angola and Mozambique had significant results for all pairs of loci (P ≤ 0·008), i.e. significant LD for a region spanning ≈ 75 kb (IVS3 was not considered for this test as it was not in Hardy-Weinberg equilibrium). Furthermore, when UM samples from Angola and Mozambique were pooled in one single larger group, the previous result was reinforced: P ≤ 0·000 for all LD tests between locus pairs. Therefore, we searched for a haplotype (PKV/PKA/IVS11/IVS3) that could be associated with this larger UM group and 9/11/13/34 revealed this association, although it was borderline (χ² = 5·898, P = 0·015; OR = 5·267; 95% CI: 1·188–23·355).
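
For readers less familiar with LD measures, the sketch below computes the standard coefficients D, D′ and r² for one pair of alleles from haplotype and allele frequencies. This is the textbook biallelic formulation and assumes known (phased) haplotype frequencies; the LD tests reported here were run in Arlequin on multi-allelic STR genotypes, so this is an illustration of the concept rather than of that procedure. The input frequencies are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D_prime, r_squared) for alleles A (locus 1) and B (locus 2).

    p_ab: frequency of the A-B haplotype.
    p_a, p_b: frequencies of alleles A and B (assumed strictly between 0 and 1).
    """
    d = p_ab - p_a * p_b
    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


# Hypothetical frequencies, for illustration only.
d, d_prime, r2 = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.5)
d0, d_prime0, r2_0 = ld_stats(p_ab=0.25, p_a=0.5, p_b=0.5)  # linkage equilibrium
```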

The population groups studied all revealed a large number of low frequency inferred haplotypes. The most common haplotypes were: in ANG, 10/14/12/38, 11/12/15/35, 11/11/17/35 and 10/13/12/34, with an approximate frequency of 3% each; in MOZ, haplotype 9/11/13/34 was prominent (6·3%, from which 5·5% were in UM) and four additional haplotypes were also frequent (≈ 3%): 10/13/14/35, 11/9/17/37·2, 10/13/12/35 and 10/14/12/38; in PT-C, the most frequent haplotype was 9/9/14/40·2 (5·6%), followed by 10/9/14/38·2, 10/9/14/39·2 and 9/9/14/37·2 (about 4%); and in PT-PKD, the most frequent haplotypes were 10/9/14/38·2 (23·8%), 9/9/15/36·2 (19·0%) and 9/9/16/38·2 (11·9%). These last two were not detected in PT-C and 9/9/15/36·2 was exclusively found in PT-PKD.

An AMOVA that considered these four loci for comparison in the following three populations, Africa (NI, AI, UM and SM from Angola, UM and SM from Mozambique), Portugal – control (PT-C) and Portugal – PK-deficiency (PT-PKD), resulted in a significant percentage of variation between the three populations (10·92%, P ≤ 0·000) and within each group (88·97%, P ≤ 0·000). A non-significant value was obtained between groups within each population (0·12%, P = 0·512).

SNPs

Overall, 15 SNPs were analysed in this study: 13 were identified in the HapMap database and two were mutations previously described to be associated with PK-deficiency. These mutations were not identified in any of the African groups studied or in the control Portuguese individuals. Mutation 1456C>T was identified in eight Portuguese PK-deficient individuals, two of whom were homozygous for the T allele (Manco et al, 1999, 2000). The allele frequencies found in the studied population groups are shown in Table SIV.

No significant differentiation was found between ANG and MOZ or between PT-C and PT-PKD, whether considering all 13 loci simultaneously or separately. A significant differentiation was found between African and Portuguese groups: FST(AFR vs. PT-C) = 0·239, FST(AFR vs. PT-PKD) = 0·341, P ≤ 0·000 for both tests.

Comparing NI, AI, SM and UM from Angola and Mozambique, FST values were not significant for any pairs of groups tested. Given that there were no differences between the two African populations, UM and SM from both countries were pooled into larger groups for comparison, but still no differences were found. The same result was obtained when these groups were compared to NI and AI.

The observed heterozygosity was in accordance with the Hardy-Weinberg expected frequencies in all population groups. Strikingly, however, when the malaria status groups from Angola were analysed, all loci in UM and SM that were localised in exon 12 (pk_177, pk_176 and pk_972) or downstream (pk_276, pk_184, pk_352 and pk_355) deviated from Hardy-Weinberg equilibrium (P < 0·050), with an excess of heterozygotes (as seen in Fig 2). When Bonferroni’s correction was applied (P < 0·004 for significance), none of these results remained statistically significant; nevertheless, when individuals of SM and UM were combined into the single ILL group, the deviation was significant even under Bonferroni’s correction. These results were not obtained for the Mozambican groups, where the observed heterozygosity was similar to expectation.
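
The Bonferroni cut-off quoted above can be reproduced with simple arithmetic. The sketch assumes one test per HapMap SNP locus (13 tests); the number of tests is our reading and is not stated explicitly in the text. The P values in the usage example are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cut-off under Bonferroni's correction."""
    return alpha / n_tests


def significant_after_correction(p_values, alpha=0.05):
    """Flag each P value as significant (True) or not after correction."""
    cutoff = bonferroni_threshold(alpha, len(p_values))
    return [p < cutoff for p in p_values]


# Assumption: 13 tests, one per HapMap SNP locus.
cutoff_13 = bonferroni_threshold(0.05, 13)  # about 0.0038, i.e. 0.004 when rounded

# Hypothetical P values, for illustration only.
flags = significant_after_correction([0.030, 0.001, 0.020])
```

A result with P < 0·050 can therefore fail the corrected threshold, as happened for the separate UM and SM groups, while a pooled group with more samples may still pass it.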

Figure 2.  Observed (A) and expected (B) heterozygosity of the SNP loci in Portuguese groups and malaria status groups from both Angola and Mozambique. ANG-UM and ANG-SM revealed a heterozygote excess for all loci included between pk_276 and pk_176. ANG-NI: Angola – non-infected; ANG-AI: Angola – asymptomatic infection; ANG-UM: Angola – uncomplicated malaria; ANG-SM: Angola – severe malaria; MOZ-UM: Mozambique – uncomplicated malaria; MOZ-SM: Mozambique – severe malaria; PT-C: Portugal – control group; PT-PKD: Portugal – PK-deficiency group.

African populations showed higher haplotype diversity than the Portuguese. The five main inferred haplotypes (pk_276/pk_184/.../pk_361, ordered as in Fig 1) were identified in both ANG and MOZ and also observed in the malaria status groups from each country. No specific haplotype was associated with any group. In PT-C, two main haplotypes, already identified in the African groups, were observed: G/G/T/C/G/A/G/T/C/G/A/C/A/T/A (frequency of 76%) and A/A/C/G/A/G/T/T/C/C/A/G/C/C/C (frequency of 18%). In PT-PKD, two main haplotypes were identified: one was the most common in PT-C, whereas the other was exclusive to this group, because of the mutation 1456T (G/G/T/C/G/A/G/T/T/G/A/C/A/T/A), which was in complete LD with all adjacent loci (Fig 3). No evidence of selective sweeps was found in the African groups in this genomic segment: in general, the expected heterozygosity in loci from ANG and MOZ was higher but followed the trend observed in PT-C and PT-PKD (Fig 2).

Figure 3.  Estimated frequencies of inferred haplotypes in the studied population groups. ANG: Angola; MOZ: Mozambique; PT-C: control Portuguese; PT-PKD: Portuguese with PK-deficiency. The segment between pk_276 and pk_176 was extremely conserved in all haplotypes with only two possible allelic combinations, indicated by different greys in the lower panel.

Similarly to the AMOVA using the STRs, an AMOVA using all of the SNP loci resulted in significant percentages of variation between the populations [Africa (NI, AI, UM and SM from Angola, UM and SM from Mozambique), Portugal – control (PT-C) and Portugal – PK-deficiency (PT-PKD)] and within each group (25·47%, P ≤ 0·000 and 74·52%, P ≤ 0·000, respectively). The percentage of variation between groups within each population was not significant (≤0·00%, P = 0·481).

A combined analysis was performed using all STR and SNP loci, and the results supported those reported above: significant FST values were obtained when African groups were compared to Portuguese groups. A significant differentiation was also obtained between the two Portuguese groups, PT-C and PT-PKD.

Ancestry informative INDELs

The structure of African and Portuguese (PT-PKD and PT-REF) groups was examined through the genotyping of 32 INDELs. K = 2 was, undoubtedly, the most likely number of clusters, corresponding to the African and Portuguese samples. Even when K = 3 to K = 5 were tested, the division between African and Portuguese clusters was obvious (Fig 4). A clear differentiation was achieved between African and PT-REF (FST = 0·392; P ≤ 0·000) and African and PT-PKD (FST = 0·423; P ≤ 0·000) groups. MOZ and ANG could be slightly differentiated (FST = 0·003; P ≤ 0·000) by genetic distance analysis but not when using Structure 2.2 software, even when only the two African groups were considered (data not shown). No differentiation was achieved between PT-REF and PT-PKD, or between malaria status groups within MOZ or within ANG under any circumstance.

Figure 4.  Estimated population structure determined with Structure 2.2 (no prior information on sampling groups, under the admixture model with correlated allele frequencies; ten independent runs with 10⁵ burn-in steps and 10⁵ iterations). Each bar represents a single individual and is partitioned into K different grey-shaded segments that represent the individual’s estimated coefficients of ancestry. K = 2 is the most suitable division, with clusters corresponding to the Portuguese (mainly light grey) and African (mainly dark grey) samples. 1- PT-REF [reference group from Portugal (Santos et al, 2010)]; 2- PT-PKD (individuals with PK-deficiency from Portugal); 3- ANG (Angola); 4- MOZ (Mozambique).

Discussion

A combined analysis with STR and SNP data was used to search for malaria selection signatures in the PKLR gene region. Two different approaches were performed: inter-population analysis, opposing two populations from malaria endemic regions (Angola and Mozambique) to a Portuguese population with no malaria, and an intra-population analysis, comparing malaria status groups within populations.

STR and SNP allelic frequencies in ANG and MOZ were similar and quite different from PT-C and PT-PKD, reflecting structural differences. In fact, when sample structure was tested using ancestry informative INDEL markers, two clusters were clearly formed: one with all ANG and MOZ samples and one including all PT-PKD and PT-REF samples.

FST among human populations from major geographical regions, based on more than 370 STRs, was estimated to be 0·05 (Rosenberg et al, 2002), and it was estimated to be 0·10 when based on 600,000 SNPs (Li et al, 2008). Moreover, an AMOVA using the same STR loci (Rosenberg et al, 2002) showed 3·6% to 5·2% variation between major regions of the world and 3·1% variation between populations within Africa. In this study, FST values obtained between African and Portuguese groups were considerably higher, varying between 0·102 and 0·153 for STRs and between 0·239 and 0·341 for SNPs. In addition, an AMOVA for STR loci had a significant outcome of 10·92% variation between Africans and Portuguese, whereas variation between groups within each population was 0·12%. In a typical multilocus sample, it is reasonable to assume that all autosomal loci have experienced the same demographic history and the same rates and patterns of migration. Loci showing unusually large amounts of differentiation may indicate regions of the genome that have been subject to diversifying selection (Holsinger & Weir, 2009), of which malaria could have been the cause. The AMOVA results show that, whereas variation between Africa and Portugal more than doubled in this study, the opposite occurred in the degree of variation between groups within populations, suggesting that some (selective) force is homogenising this genomic fragment in African regions and, at the same time, extending the differences between Africa and other global areas. Curiously, the FST value for Africans versus PT-PKD was higher than for Africans versus PT-C, suggesting that, even if PK-deficiency is frequent in sub-Saharan Africa, the mutations involved should be different from those found in the Portuguese.

Concerning the Portuguese groups, differentiation was only significant when STR data were used, which may be explained by the different molecular resolution of SNPs and STRs: in humans, the average nucleotide mutation rate is assumed to be 2·5 × 10⁻⁸ and the STR mutation rate has been estimated to be 10⁻² to 10⁻⁵ per generation (Tishkoff & Verrelli, 2003). Thus, SNPs are best used for inferring human evolutionary history over longer time scales and STRs can be used to trace recent demographic events (Agrafioti & Stumpf, 2007). Therefore, we can presume that Portuguese PK-deficiency variants have emerged recently, which is supported by the lower diversity found within this group.

No differentiation was ever obtained between malaria status groups, either using SNPs or STRs, although insufficient sampling of each group may be influencing this result. Of all the STR loci, IVS3 in the PKLR gene was the only one with frequencies that were out of Hardy-Weinberg equilibrium in the African groups, with a significant excess of homozygotes. This had already been observed in a previous study with African samples from Cabo Verde (Alves et al, 2010). Conversely, as expected, the control group PT-C had a heterozygosity similar to that expected. These data suggest that IVS3 homozygosity is being promoted in some manner. Possible causes for the deviation from Hardy-Weinberg equilibrium include admixture and substructure or non-random mating patterns. However, as this deviation was observed in several African populations, it is possible that it is caused by selection pressures from environmental conditions (e.g. infectious diseases like malaria). IVS3 is in intron 3, a critical functional location as it is where the splicing of exon 2 occurs for the production of PKL mRNA, and as it is not a simple polymorphic locus (it includes eight contiguous variation regions), it should be carefully analysed.

The LD test for the STRs showed a significant LD along the entire studied region for UM. This is interesting as it suggests an association between this conserved genomic block and a mild malaria outcome. Moreover, this LD emphasises the result previously found in Cabo Verde, where an LD test revealed an association of these same loci but in non-infected individuals (Alves et al, 2010). Additionally, this LD outcome is not expected under neutrality, which also supports our results: several datasets show differences in haplotype structure between African and non-African samples, where blocks are significantly smaller in African samples and extend longer and are less diverse in non-Africans (Tishkoff & Verrelli, 2003). Reinforcing the LD result, a haplotype was identified as associated with this group: 9/11/13/34. This association must be further analysed since it is not robust (P = 0·015), but we believe that insufficient sampling may be the cause for this deficiency, as this association was identified only when UM and SM samples from both Angola and Mozambique were pooled together in a larger group.

The LD test for the SNPs gave a significant result in all groups and populations for all pairs of SNP loci in exon 12 and upstream (between loci pk_276 and pk_176). Curiously, the ILL group from Angola had a significant SNP heterozygote excess in exactly the same region. Three of these loci are located in exon 12 of PKLR, and the remaining loci are in the HCN3 gene. This gene, coding for a hyperpolarisation-activated cyclic nucleotide-gated potassium channel 3, is a voltage-gated channel performing ionic, potassium and sodium transport (Uniprot database/Swiss-Prot Q9P1Z3) and is highly expressed in early erythroid cells (Su et al, 2004), which produce mature erythrocytes. Heterozygosity in this genomic fragment seems to be associated with clinical malaria in Angola but not in Mozambique, suggesting that, in addition to malaria, some geographic factor may be involved in this scenario.
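For readers less familiar with pairwise LD between SNP loci, the following minimal sketch computes the standard coefficients D, D' and r² for two biallelic loci from phased haplotypes. It is an illustration only and not the test applied in this study: the function name, the 0/1 allele coding and the example haplotypes are hypothetical, and real data would first require haplotype inference and a significance test.

```python
def pairwise_ld(haplotypes):
    """Return (D, D_prime, r2) for two biallelic loci.

    haplotypes: list of (a, b) pairs, each allele coded 0 or 1,
    one pair per phased chromosome.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n   # frequency of allele 1 at locus 1
    p_b = sum(h[1] for h in haplotypes) / n   # frequency of allele 1 at locus 2
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient
    # Normalise D by its maximum possible magnitude given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

# Hypothetical example: only two haplotypes observed (complete LD)
print(pairwise_ld([(1, 1)] * 5 + [(0, 0)] * 5))
```

A conserved segment with only two allelic combinations, as described for the region between pk_276 and pk_176, corresponds to D' and r² values close to 1 for the locus pairs involved.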

Five main inferred SNP haplotypes were identified in ANG and MOZ and only two in PT-C (contained within those five) and two in PT-PKD. These results were expected as African populations are older and have maintained a larger N whereas non-African populations have experienced a bottleneck event during the expansion of modern humans out of Africa within the past 100 000 years (Tishkoff & Verrelli, 2003). The high mutation rate of STRs explains why the same STR haplotype diversity is present in both African and non-African regions. Haplotype 6 was exclusive to PT-PKD, differing from haplotype 3 (the most common in PT-C) only at the pk_1456 locus. As a result of its strong LD, the segment between pk_276 and pk_176 was extremely well-conserved in all haplotypes, with only two possible allelic combinations. The remaining segment revealed strong recombination. Neither of the two mutations that were potentially associated with PK-deficiency in Africa (as indicated in previous reports) was identified in our African samples.

Previous studies have also examined this particular genomic fragment, seeking other disease-associated variants. Multiple studies in populations from diverse origins have shown linkage of type 2 diabetes (T2D) to chromosome 1q over a broad region, and the PKLR gene arises as the first candidate (Wang et al, 2002, 2009; Das & Elbein, 2007). A search for the prevalence of T2D in the African continent revealed that African Americans have a two-fold increase in risk for T2D compared to other populations in the United States, but its prevalence is lower in Africa (1–2%) than among people of African descent in industrialised nations (11–13%) (Rotimi et al, 2004). In addition, this region includes the GBA gene, coding for the housekeeping enzyme beta-glucocerebrosidase, which has mutations causing Gaucher disease; however, especially high frequencies of this disease have not been detected in Africa (Goldblatt & Beighton, 1979). Therefore, the probability that these diseases are acting selectively on this genomic region is lower than it is for malaria, which makes relevant selective confounding factors unlikely.

In summary, several results obtained in this study support the hypothesis that malaria is acting as a selective force in the PKLR gene region. Firstly, FST values between African and Portuguese populations using STR and SNP data from this specific fragment were considerably higher than those found using neutral STR and SNP markers, and the same was observed with AMOVA, indicating that this genomic section is under selection. Secondly, the LD block included a more extensive region in the mild malaria group and a haplotype was found to be associated with this clinical group, suggesting that this conserved genomic block is associated with some protection against malaria severity. Thus, the output of this work, using human population data, seems to be in agreement with the results previously obtained with murine models and in vitro Plasmodium culturing. For future work, a larger number of samples from each malaria status group should be used and locus IVS3 should be carefully analysed. More extensive field work with deeper phenotype discrimination and identification of PK abnormal alleles is currently under way.
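As a simple illustration of how FST measures differentiation between two populations, the sketch below computes Wright's FST, (HT − HS)/HT, for a single biallelic locus from two allele frequencies. It assumes equal sample sizes and is not the AMOVA-based estimator used in this study; the function name and the example frequencies are hypothetical.

```python
def wright_fst(p1, p2):
    """Wright's FST for one biallelic locus in two equally sized populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    """
    # Mean expected heterozygosity within populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity in the pooled population
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0

# Hypothetical frequencies: strongly differentiated versus identical populations
print(wright_fst(0.1, 0.9))
print(wright_fst(0.3, 0.3))
```

Comparing such values for a candidate region against those from markers assumed to be neutral is the logic behind the first line of evidence summarised above.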

Acknowledgements

We thank all individuals and parents/guardians of children who participated in this study, and all health technicians working at the Emergency Services of the Paediatric Hospital David Bernardino (Luanda, Angola), the Paediatrics Department of the Central Hospital of Maputo and the Health Centres of Bagamoyo and Boane (Maputo, Mozambique) for all technical support.

This study was supported by ‘Financiamento Programático do Laboratório Associado CMDT.LA/IHMT’ and POCI/SAU-ESP/55110/2004 (Fundação para a Ciência e Tecnologia/Ministério da Ciência, Tecnologia e Ensino Superior, FCT/MCTES, Portugal). P. Machado, R. Pereira and A. P. Arez were funded by FCT/MCTES Portugal (SFRH/BD/28236/2006, SFRH/BD/30039/2006 and SFRH/BPD/1624/2000—until 2007, respectively).

References

  • Agrafioti, I. & Stumpf, M.P. (2007) SNPSTR: a database of compound microsatellite-SNP markers. Nucleic Acids Research, 35 (Database issue), D71–D75.
  • Alves, C., Gomes, V., Prata, M.J., Amorim, A. & Gusmão, L. (2007) Population data for Y-chromosome haplotypes defined by 17 STRs (AmpFlSTR YFiler) in Portugal. Forensic Science International, 171, 250–255.
  • Alves, J., Machado, P., Silva, J., Gonçalves, N., Ribeiro, L., Faustino, P., Do Rosário, V.E., Manco, L., Gusmão, L., Amorim, A. & Arez, A.P. (2010) Analysis of malaria associated genetic traits in Cabo Verde, a melting pot of European and sub Saharan settlers. Blood Cells, Molecules and Diseases, 44, 62–68.
  • Anto, F., Asoala, V., Anyorigiya, T., Oduro, A., Adjuik, M., Owusu-Agyei, S., Dery, D., Bimi, L. & Hodgson, A. (2009) Insecticide resistance profiles for malaria vectors in the Kassena-Nankana district of Ghana. Malaria Journal, 8, 81.
  • Ayi, K., Min-Oo, G., Serghides, L., Crockett, M., Kirby-Allen, M., Quirt, I., Gros, P. & Kain, K.C. (2008) Pyruvate kinase deficiency and malaria. The New England Journal of Medicine, 358, 1805–1810.
  • Beutler, E. & Gelbart, T. (2000) Estimating the prevalence of pyruvate kinase deficiency from gene frequency in the general white population. Blood, 95, 3585–3588.
  • Bruce-Chwatt, L.J. (1977) Malaria eradication in Portugal. Transactions of the Royal Society of Tropical Medicine and Hygiene, 71, 232–240.
  • Campbell, M.C. & Tishkoff, S.A. (2008) African genetic diversity: implications for human demographic history, modern human origins, and complex disease mapping. Annual Review of Genomics and Human Genetics, 9, 403–433.
  • Ceesay, S.J., Casals-Pascual, C., Erskine, J., Anya, S.E., Duah, N.O., Fulford, A.J.C., Sesay, S.S.S., Abubakar, I., Dunyo, S., Sey, O., Palmer, A., Fofana, M., Corrah, T., Bojang, K.A., Whittle, H.C., Greenwood, B.M. & Conway, D.J. (2008) Changes in malaria indices between 1999 and 2007 in The Gambia: a retrospective analysis. Lancet, 372, 1545–1554.
  • Cuamba, N., Choi, K.S. & Townson, H. (2006) Malaria vectors in Angola: distribution of species and molecular forms of the Anopheles gambiae complex, their pyrethroid insecticide knockdown resistance (kdr) status and Plasmodium falciparum sporozoite rates. Malaria Journal, 5, 2.
  • Das, S.K. & Elbein, S.C. (2007) The search for type 2 diabetes susceptibility loci: the chromosome 1q story. Current Diabetes Reports, 7, 154–164.
  • Durand, P.M. & Coetzer, T.L. (2008) Pyruvate kinase deficiency protects against malaria in humans. Haematologica, 93, 939–940.
  • Excoffier, L., Laval, G. & Schneider, S. (2005) Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online, 1, 47–50.
  • Goios, A., Gusmão, L., Rocha, A.M., Fonseca, A., Pereira, L., Bogue, M. & Amorim, A. (2008) Identification of mouse inbred strains through mitochondrial DNA single-nucleotide extension. Electrophoresis, 29, 4795–4802.
  • Goldblatt, J. & Beighton, P. (1979) Gaucher disease in the Afrikaner population of South Africa. South African Medical Journal, 55, 209–210.
  • Hill, A.V. (1998) Host genetics of infectious diseases: old and new approaches converge. Emerging Infectious Diseases, 4, 695–697.
  • Holsinger, K.E. & Weir, B.S. (2009) Genetics in geographically structured populations: defining, estimating and interpreting FST. Nature Reviews Genetics, 10, 639–650.
  • Kalinowski, S.T. (2005) HP-Rare: a computer program for performing rarefaction on measures of allelic diversity. Molecular Ecology Notes, 5, 187–189.
  • Kwiatkowski, D.P. (2005) How malaria has affected the human genome and what human genetics can teach us about malaria. The American Journal of Human Genetics, 77, 171–192.
  • Li, J.Z., Absher, D.M., Tang, H., Southwick, A.M., Casto, A.M., Ramachandran, S., Cann, H.M., Barsh, G.S., Feldman, M., Cavalli-Sforza, L.L. & Myers, R.M. (2008) Worldwide human relationships inferred from genome-wide patterns of variation. Science, 319, 1100–1104.
  • Mabunda, S., Casimiro, S., Quinto, L. & Alonso, P. (2008) A country-wide malaria survey in Mozambique. I. Plasmodium falciparum infection in children in different epidemiological settings. Malaria Journal, 7, 216.
  • Manco, L. & Abade, A. (2001) Pyruvate kinase deficiency: prevalence of the 1456C-->T mutation in the Portuguese population. Clinical Genetics, 60, 472–473.
  • Manco, L., Ribeiro, M.L., Almeida, H., Freitas, O., Abade, A. & Tamagnini, G. (1999) PK-LR gene mutations in pyruvate kinase deficient Portuguese patients. British Journal of Haematology, 105, 591–595.
  • Manco, L., Ribeiro, M.L., Máximo, V., Almeida, H., Costa, A., Freitas, O., Barbot, J., Abade, A. & Tamagnini, G. (2000) A new PKLR gene mutation in the R-type promoter region affects the gene transcription causing pyruvate kinase deficiency. British Journal of Haematology, 110, 993–997.
  • Manco, L., Oliveira, A.L., Gomes, C., Granjo, A., Trovoada, M., Ribeiro, M.L., Abade, A. & Amorim, A. (2001) Population genetics of four PKLR intragenic polymorphisms in Portugal and São Tomé e Princípe (Gulf of Guinea). Human Biology, 73, 467–474.
  • Manco, L., Trovoada, M.J. & Ribeiro, M.L. (2009) Novel Human Pathological Mutations. Gene Symbol: PKLR Disease: pyruvate kinase deficiency. Human Genetics, 125, 340.
  • Mateu, E., Perez-Lezaun, A., Martinez-Arias, R., Andres, A., Vallés, M., Bertranpetit, J. & Calafell, F. (2002) PKLR-GBA region shows almost complete linkage disequilibrium over 70 kb in a set of worldwide populations. Human Genetics, 110, 532–544.
  • Min-Oo, G. & Gros, P. (2005) Erythrocyte variants and the nature of their malaria protective effect. Cellular Microbiology, 7, 753–763.
  • Min-Oo, G., Fortin, A., Tam, M.F., Nantel, A., Stevenson, M.M. & Gros, P. (2003) Pyruvate kinase deficiency in mice protects against malaria. Nature Genetics, 35, 357–362.
  • Noedl, H., Se, Y., Schaecher, K., Smith, B.L., Socheat, D. & Fukuda, M.M. (2008) Evidence of artemisinin-resistant malaria in western Cambodia. The New England Journal of Medicine, 359, 2619–2620.
  • O’Meara, W.P., Bejon, P., Mwangi, T.W., Okiro, E.A., Peshu, N., Snow, R.W., Newton, C.R. & Marsh, K. (2008) Effect of a fall in malaria transmission on morbidity and mortality in Kilifi, Kenya. Lancet, 372, 1555–1562.
  • Pritchard, J.K., Stephens, M. & Donnelly, P. (2000) Inference of population structure using multilocus genotype data. Genetics, 155, 945–959.
  • Rosenberg, N.A., Pritchard, J.K., Weber, J.L., Cann, H.M., Kidd, K.K., Zhivotovsky, L.A. & Feldman, M.W. (2002) Genetic structure of human populations. Science, 298, 2381–2385.
  • Rotimi, C.N., Chen, G., Adeyemo, A.A., Furbert-Harris, P., Parish-Gause, D., Zhou, J., Berg, K., Adegoke, O., Amoah, A., Owusu, S., Acheampong, J., Agyenim-Boateng, K., Eghan, Jr, B.A., Oli, J., Okafor, G., Ofoegbu, E., Osotimehin, B., Abbiyesuku, F., Johnson, T., Rufus, T., Fasanmade, O., Kittles, R., Daniel, H., Chen, Y., Dunston, G. & Collins, F.S. (2004) A genome-wide search for type 2 diabetes susceptibility genes in West Africans: the Africa America Diabetes Mellitus (AADM) Study. Diabetes, 53, 838–841. Erratum in: Diabetes, 53, 1404.
  • Rozen, S. & Skaletsky, H.J. (2000) Primer 3 on the WWW for general users and for biologist programmers. In: Bioinformatics Methods and Protocols: Methods in Molecular Biology (ed. by S. Krawetz & S. Misener), pp. 365–386. Humana Press Inc, Totowa, New Jersey, USA.
  • Santos, N.P., Ribeiro-Rodrigues, E.M., Ribeiro-dos-Santos, A.K., Pereira, R., Gusmão, L., Amorim, A., Gerreiro, J.F., Zago, M.A., Matte, C., Hutz, M.H. & Santos, S.E. (2010) Assessing individual interethnic admixture and population substructure using a 48 insertion-deletion ancestry-informative marker panel. Human Mutation, 31, 184–190.
  • Su, A., Wiltshire, T., Batalov, S., Lapp, H., Ching, K.A., Block, D., Zhang, J., Soden, R., Hayakawa, M., Kreiman, G., Cooke, M.P., Walker, J.R. & Hogenesch, J.B. (2004) A gene atlas of the mouse and human protein-encoding transcriptomes. Proceedings of the National Academy of Sciences of the United States of America, 101, 6062–6067.
  • Tishkoff, S.A. & Verrelli, B.C. (2003) Patterns of human genetic diversity: implications for human evolutionary history and disease. Annual Review of Genomics and Human Genetics, 4, 293–340.
  • Wang, H., Chu, W., Das, S.K., Ren, Q., Hasstedt, S.J. & Elbein, S.C. (2002) Liver pyruvate kinase polymorphisms are associated with type 2 diabetes in northern European Caucasians. Diabetes, 51, 2861–2865.
  • Wang, H., Hays, N.P., Das, S.K., Craig, R.L., Chu, W.S., Sharma, N. & Elbein, S.C. (2009) Phenotypic and molecular evaluation of a chromosome 1q region with linkage and association to type 2 diabetes in humans. Journal of Clinical Endocrinology & Metabolism, 94, 1401–1408.
  • WHO. (2008) World Malaria Report 2008. http://apps.who.int/malaria/wmr2008/malaria2008.pdf
  • Williams, T.N. (2006) Red blood cell defects and malaria. Molecular and Biochemical Parasitology, 149, 121–127.

Supporting Information

Table SI. SNP loci selected for analysis (ordered according to localization), allelic frequencies and primers used for multiplex PCR.

Table SII. Single Base Extension (SBE) primers used for SNaPshot reaction.

Table SIII. STR loci allele frequencies found in Angola (ANG), Mozambique (MOZ), control Portuguese (PT-C) and PK-deficient Portuguese (PT-PKD).

Table SIV. SNP loci allelic frequencies observed in Angola, Mozambique and Portuguese groups.

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

Supporting file: BJH_8165_sm_Tables.doc (266K), Supporting info item