Two recent papers have quantified long-term ozone (O3) changes observed at northern midlatitude sites that are believed to represent baseline conditions (here understood as representative of continental to hemispheric scales). Three chemistry-climate models (NCAR CAM-chem, GFDL-CM3, and GISS-E2-R) have calculated retrospective tropospheric O3 concentrations as part of the Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP) and the Coupled Model Intercomparison Project Phase 5 (CMIP5). We present an approach for quantitatively comparing model results with measurements of seasonally averaged O3 concentrations. There is considerable qualitative agreement between the measurements and the models, but there are also substantial and consistent quantitative disagreements. Most notably, the models (1) overestimate absolute O3 mixing ratios, on average by ~5 to 17 ppbv in the year 2000, (2) capture only ~50% of the O3 changes observed over the past five to six decades, and little of the observed seasonal differences, and (3) capture ~25 to 45% of the rate of change of the long-term changes. These disagreements are large enough to indicate that only limited confidence can be placed in estimates of the present-day radiative forcing of tropospheric O3 derived from modeled historical concentration changes, and in predicted future O3 concentrations. Evidently our understanding of tropospheric O3, or the incorporation of chemistry and transport processes into current chemistry-climate models, is incomplete. Modeled O3 trends approximately parallel estimated trends in anthropogenic emissions of NOx, an important O3 precursor, while measured O3 has increased more rapidly than these emission estimates.
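As an illustration of the kind of quantitative comparison described above, the following is a minimal sketch, not the method of this work. It assumes that each seasonally averaged O3 time series can be summarized by a quadratic fit in time referenced to the year 2000; the abstract does not specify the functional form. The function names (`fit_quadratic`, `compare_series`), the dictionary keys, and all numerical values are hypothetical and chosen by hand for illustration only; none are measurements or model output.

```python
import numpy as np


def fit_quadratic(years, o3, ref_year=2000.0):
    """Least-squares fit o3 ~ a + b*t + c*t**2 with t = year - ref_year.

    Returns (a, b, c): value at ref_year (ppbv), slope at ref_year
    (ppbv/yr), and curvature coefficient (ppbv/yr^2).
    The quadratic form is an assumption made for this sketch.
    """
    t = np.asarray(years, dtype=float) - ref_year
    c, b, a = np.polyfit(t, np.asarray(o3, dtype=float), 2)
    return a, b, c


def compare_series(years, obs, mod, ref_year=2000.0):
    """Compare a modeled series with an observed one via their fits."""
    years = np.asarray(years, dtype=float)
    a_o, b_o, c_o = fit_quadratic(years, obs, ref_year)
    a_m, b_m, c_m = fit_quadratic(years, mod, ref_year)

    t0 = years[0] - ref_year
    t1 = years[-1] - ref_year

    def fitted_change(b, c):
        # Change of the fitted curve between first and last year.
        return (b * t1 + c * t1**2) - (b * t0 + c * t0**2)

    return {
        # Model minus observation at the reference year (ppbv).
        "bias_ref_year": a_m - a_o,
        # Fraction of the observed change over the record that the model captures.
        "fraction_of_change": fitted_change(b_m, c_m) / fitted_change(b_o, c_o),
        # Ratio of fitted slopes at the reference year.
        "slope_ratio": b_m / b_o,
    }


# Synthetic, hand-chosen series (NOT real data), one value per year.
years = np.arange(1950, 2011)
t = years - 2000.0
obs = 40.0 + 0.20 * t - 0.0050 * t**2   # hypothetical "observed" seasonal mean
mod = 50.0 + 0.10 * t - 0.0025 * t**2   # hypothetical "modeled" seasonal mean

if __name__ == "__main__":
    for name, value in compare_series(years, obs, mod).items():
        print(f"{name}: {value:.3f}")
```

By construction, the synthetic model series here is biased high by 10 ppbv in 2000 and captures half of the synthetic observed change and slope. In practice one such comparison would be made per site and per season, and the three diagnostics only loosely correspond to items (1) to (3) above.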