Comparative researchers increasingly use multilevel models to test the effects of country-level factors on individual behavior and preferences. However, the asymptotic justification of widely employed estimation strategies presumes large samples, whereas applications in comparative politics routinely involve only a small number of countries. Thus, researchers and reviewers often wonder whether these models are applicable at all. In other words, how many countries do we need for multilevel modeling? I present results from a large-scale Monte Carlo experiment comparing the performance of maximum likelihood and Bayesian estimation of multilevel models when few countries are available. I find that maximum likelihood estimates and confidence intervals can be severely biased, especially in models including cross-level interactions. In contrast, the Bayesian approach proves far more robust and yields considerably more conservative tests.