An effect size of about .70 (or .40–.70) is often claimed for the efficacy of formative assessment, but is not supported by the existing research base. More than 300 studies that appeared to address the efficacy of formative assessment in grades K-12 were reviewed. Many of the studies had severely flawed research designs that yielded uninterpretable results. Only 13 of the studies provided sufficient information to calculate relevant effect sizes. A total of 42 independent effect sizes were available. The median observed effect size was .25. Using a random effects model, a weighted mean effect size of .20 was calculated. Moderator analyses suggested that formative assessment might be more effective in English language arts (ELA) than in mathematics or science, with estimated effect sizes of .32, .17, and .09, respectively. Two types of implementation of formative assessment, one based on professional development and the other on the use of computer-based formative systems, appeared to be more effective than other approaches, yielding mean effect sizes of .30 and .28, respectively. Given the wide use and potential efficacy of good formative assessment practices, the paucity of the current research base is problematic. A call for more high-quality studies is issued.
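For readers unfamiliar with how a random-effects weighted mean is formed, the following is a minimal sketch and not the authors' procedure. It assumes the DerSimonian-Laird estimator of between-study variance, which the abstract does not name, and it treats the effect sizes as independent. The function name `random_effects_mean` and the numbers in the usage lines are hypothetical and are not data from the reviewed studies.

```python
import math


def random_effects_mean(effects, variances):
    """Random-effects weighted mean (DerSimonian-Laird; an assumed estimator).

    effects   -- list of observed effect sizes, one per independent estimate
    variances -- list of their sampling variances (same length as effects)
    Returns (weighted_mean, standard_error, tau_squared).
    """
    k = len(effects)

    # Fixed-effect (inverse-variance) weights and mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))

    # Method-of-moments estimate of between-study variance, floored at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return mean, se, tau2


# Hypothetical inputs for illustration only.
example_effects = [0.10, 0.20, 0.30]
example_variances = [0.04, 0.04, 0.04]
mean, se, tau2 = random_effects_mean(example_effects, example_variances)
print(f"mean={mean:.3f}, se={se:.3f}, tau2={tau2:.3f}")
```

When the estimated between-study variance is zero, as in the hypothetical inputs above, the random-effects weights reduce to the ordinary inverse-variance weights. Larger heterogeneity pulls the weights toward equality, so that small studies count for relatively more.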