How Credible Is the Evidence, and Does It Matter? An Analysis of the Program Assessment Rating Tool


  • Carolyn J. Heinrich

    Corresponding author
    1. University of Texas at Austin
    • Carolyn J. Heinrich is the Sid Richardson Professor of Public Affairs and director of the Center for Health and Social Policy in the Lyndon B. Johnson School of Public Affairs at the University of Texas at Austin. She previously served as director and professor in the La Follette School of Public Affairs at the University of Wisconsin–Madison, as associate director of the Institute for Research on Poverty, and as assistant professor at the University of North Carolina at Chapel Hill. She holds a doctorate in public policy studies from the University of Chicago.



This research empirically assesses the quality of evidence that agencies provided to the Office of Management and Budget in the application of the Program Assessment Rating Tool (PART). PART was introduced in 2002 to assess public program effectiveness more rigorously, systematically, and transparently and to hold agencies accountable for results by tying them to the executive budget formulation process and program funding. Evidence submitted by 95 programs administered by the U.S. Department of Health and Human Services for the PART assessment is analyzed using measures that capture the quality of evidence and methods used by programs, together with information on characteristics of agencies that might relate to program results and government funding decisions. The study finds that, among programs offering some evidence, most of the evidence was internal and qualitative, and about half of these programs did not assess how their performance compared with other government or private programs with similar objectives. Programs were least likely to provide externally generated evidence of their performance relative to long-term and annual performance goals. Importantly, overall PART and results scores were (statistically) significantly lower for programs that failed to provide quantitative evidence and did not use long-term measures, baseline measures or targets, or independent evaluations. Although the PART program results ratings and overall PART scores had no discernible consequences for program funding over time, the PART assessments appeared to take seriously the evaluation of evidence quality, a positive step forward in recent efforts to base policy decisions on more rigorous evidence.