The genetic differentiation of populations is a key parameter in population genetic investigations. Wright’s FST (and its relatives such as GST) has been a standard measure of differentiation. However, the deficiencies of these indexes have been increasingly recognized in recent years, leading to the proposal of new measures such as Jost’s D (Molecular Ecology, 2008; 17, 4015). The existence of these new metrics has stimulated considerable debate and caused some confusion about which statistic should be used for estimating population differentiation. Here, we report a simulation study with neutral microsatellite DNA loci under a finite island model to compare the performance of GST and D, particularly under nonequilibrium conditions. Our results suggest that there are fundamental differences between the two statistics, and that neither GST nor D performs satisfactorily in all situations for quantifying differentiation. D is very sensitive to the mutation model, whereas GST is noticeably less so, which limits D’s utility in population parameter estimation and in comparisons across genetic markers. In addition, the initial heterozygosity of the starting populations has important effects both on the individual behaviours of GST and D and on their relative behaviours in early differentiation, and this effect is much greater for D than for GST. In the early stages of differentiation, when initial heterozygosity is relatively low (<0.5, if the number of subpopulations is large), GST increases faster than D; the opposite is true when initial heterozygosity is high. Therefore, the state of the ancestral population appears to have a lasting impact on population differentiation. In general, GST can measure differentiation fairly well when heterozygosity is low, whatever the cause; however, when heterozygosity is high (e.g. as a result of either a high mutation rate or high initial heterozygosity) and gene flow is moderate to strong, GST fails to measure differentiation.
Interestingly, when population size is not very small (e.g. N ≥ 1000), GST tracks differentiation almost linearly with time over a long period when gene flow is absent or very weak, even if the mutation rate is not low (e.g. μ = 0.001). In contrast, D, as a differentiation measure, performs rather robustly in all these situations. In practice, both indexes should be calculated, and the relative levels of heterozygosity (especially HS) and gene flow should be taken into account. We suggest that a comparison of the two indexes can generate useful insights into the evolutionary processes that influence population differentiation.
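The dependence of GST on heterozygosity can be illustrated with a minimal sketch. It assumes the standard single-locus definitions, GST = (HT − HS)/HT and D = [(HT − HS)/(1 − HS)] × [n/(n − 1)], applied to known allele frequencies in n equally weighted subpopulations, with no sampling-bias correction. The function name `gst_and_d` and the toy frequencies are hypothetical and are not taken from the simulations reported here.

```python
def gst_and_d(freqs):
    """Return (GST, D) for one locus.

    freqs: one row per subpopulation, each row a list of allele
    frequencies that sums to 1. Assumes equal subpopulation weights,
    HT > 0 and HS < 1 (otherwise the ratios are undefined).
    """
    n = len(freqs)
    # HS: mean within-subpopulation expected heterozygosity
    hs = sum(1.0 - sum(p * p for p in row) for row in freqs) / n
    # HT: expected heterozygosity of the pooled (mean) allele frequencies
    n_alleles = len(freqs[0])
    mean_freqs = [sum(row[j] for row in freqs) / n for j in range(n_alleles)]
    ht = 1.0 - sum(p * p for p in mean_freqs)
    gst = (ht - hs) / ht
    d = (ht - hs) / (1.0 - hs) * n / (n - 1)
    return gst, d


# Two subpopulations that share no alleles, at low and at high diversity.
# Low diversity: each subpopulation is fixed for a different allele (HS = 0).
low = [[1.0, 0.0],
       [0.0, 1.0]]
# High diversity: ten equally frequent private alleles each (HS = 0.9).
high = [[0.1] * 10 + [0.0] * 10,
        [0.0] * 10 + [0.1] * 10]

gst_low, d_low = gst_and_d(low)
gst_high, d_high = gst_and_d(high)
print(f"low HS:  GST = {gst_low:.3f}, D = {d_low:.3f}")
print(f"high HS: GST = {gst_high:.3f}, D = {d_high:.3f}")
```

In both toy cases the subpopulations share no alleles, so D is approximately 1 in each. GST is 1 in the low-diversity case but only about 0.05 in the high-diversity case, because GST cannot exceed 1 − HS when differentiation is complete, in line with the recommendation to interpret GST together with HS.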