• heterozygosity
• malaria
• microsatellite
• Plasmodium falciparum
• stepwise mutation


Microsatellite loci are generally assumed to evolve via a stepwise mutational process, and a battery of statistical techniques based on this or related mutation models has been developed in recent years. It is therefore important to investigate the appropriateness of these models in a wide variety of taxa. We used two approaches to examine mutation patterns in the malaria parasite Plasmodium falciparum: (i) we examined sequence variation at 12 trinucleotide repeat loci; and (ii) we analysed patterns of repeat structure and heterozygosity at 114 loci using data from 12 laboratory parasite lines. The sequencing study revealed complex patterns of mutation in five of the 12 loci studied. Alleles at two loci contain indels of 24 bp and 57 bp in flanking regions, while in the other three loci, blocks of imperfect microsatellites appear to be duplicated or inserted; these loci essentially consist of minisatellite repeats, with each repeat unit containing four to eight microsatellites. The survey of heterozygosity revealed a positive relationship between repeat number and microsatellite variability for both di- and trinucleotides, indicating a higher mutation rate in loci with longer repeat arrays. Comparisons of levels of variation in different repeat types indicate that the mutation rate of dinucleotide-bearing loci is 1.6–2.1 times higher than that of trinucleotide-bearing loci, consistent with the lower mean number of repeats in trinucleotide-bearing loci. However, despite the evidence that microsatellite arrays themselves are evolving in a manner consistent with the stepwise mutation model in P. falciparum, the high frequency of complex mutations precludes the use of analytical tools based on this mutation model for many microsatellite-bearing loci in this protozoan.
The results call into question the generality of models based on stepwise mutation for analysing microsatellite data, but also demonstrate the ease with which loci that violate model assumptions can be detected using minimal sequencing effort.
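
As an illustration of the variability measure referred to above, the following minimal sketch computes expected heterozygosity (gene diversity) for a single locus from allele calls in a small sample of haploid parasite lines. It assumes the standard sample-size-corrected estimator, H = n/(n-1) × (1 - Σp²), which may differ in detail from the estimator used in the study; the function name and the allele sizes are hypothetical and are not data from this work.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Sample-size-corrected expected heterozygosity for one locus.

    `alleles` holds one allele call per haploid parasite line
    (for example, a PCR product size in bp).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    counts = Counter(alleles)
    # Sum of squared allele frequencies (probability of drawing two identical alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical allele sizes (bp) for two loci scored in 12 lines
variable_locus = [120, 120, 123, 126, 126, 126, 129, 132, 132, 135, 138, 141]
conserved_locus = [200] * 10 + [203, 203]

h_variable = expected_heterozygosity(variable_locus)
h_conserved = expected_heterozygosity(conserved_locus)
print(f"variable locus:  H = {h_variable:.3f}")
print(f"conserved locus: H = {h_conserved:.3f}")
```

Computed per locus, values of this kind can be plotted against repeat number to look for the positive relationship between array length and variability described above.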