FST outlier tests are a potentially powerful way to detect genetic loci under spatially divergent selection. Unfortunately, the extent to which these tests are robust to nonequilibrium demographic histories has been understudied. We developed a landscape genetics simulator to test the effects of isolation by distance (IBD) and range expansion on FST outlier methods. We evaluated the two most commonly used methods for identifying FST outliers (FDIST2 and BayeScan, which assume that samples are evolutionarily independent) and two recent methods (FLK and Bayenv2, which estimate and account for evolutionary nonindependence). Parameterization with a set of neutral loci (‘neutral parameterization’) always improved the performance of FLK and Bayenv2, whereas it caused FDIST2 to perform worse under IBD or range expansion. BayeScan performed better when the prior odds on neutrality were increased, regardless of the true odds in the data. Even at their best, however, the widely used methods had high false-positive rates under IBD and range expansion and were outperformed by the methods that accounted for evolutionary nonindependence. In addition, default settings in FDIST2 and BayeScan produced many false positives that suggested balancing selection. However, all methods performed very well when a large set of neutral loci was available to create empirical P-values. We conclude that in species that exhibit IBD or have undergone range expansion, many of the published FST outliers based on FDIST2 and BayeScan are probably false positives, but FLK and Bayenv2 show great promise for accurately identifying loci under spatially divergent selection.
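
To illustrate what creating empirical P-values from neutral loci can look like, the following is a minimal Python sketch. It is not the procedure implemented in any of the four programs, and it is only one common way to define an empirical P-value. It assumes that a per-locus test statistic (here called FST for concreteness) has already been computed for a set of loci known to be neutral and for each candidate locus. The function name `empirical_p_values`, the upper-tail definition with a +1 correction, and all numbers are hypothetical and chosen for illustration.

```python
from bisect import bisect_left


def empirical_p_values(candidate_fst, neutral_fst):
    """Upper-tail empirical P-values against a neutral reference set.

    For each candidate statistic x, the P-value is
    (1 + number of neutral loci with statistic >= x) / (1 + n_neutral).
    The +1 terms keep the P-value above zero (an assumption of this sketch).
    """
    if not neutral_fst:
        raise ValueError("at least one neutral locus is required")
    null = sorted(neutral_fst)  # empirical null distribution
    n = len(null)
    p_values = []
    for x in candidate_fst:
        # bisect_left counts neutral values strictly below x,
        # so n minus that count is the number of values >= x.
        n_at_least = n - bisect_left(null, x)
        p_values.append((n_at_least + 1) / (n + 1))
    return p_values


# Hypothetical statistics, for illustration only.
neutral = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20]
candidates = [0.50, 0.05]
p_values = empirical_p_values(candidates, neutral)
```

With this definition, the smallest attainable P-value is 1/(n + 1), where n is the number of neutral loci, so the resolution of the test depends directly on the size of the neutral set.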