The purpose of this study was to evaluate 10 process-based terrestrial biosphere models that were used for the IPCC Fifth Assessment Report. The simulated gross primary productivity (GPP) is compared with flux-tower-based estimates by Jung et al. [Journal of Geophysical Research 116 (2011) G00J07] (JU11). The apparent sensitivity of net primary productivity (NPP) to climate variability and atmospheric CO2 trends is diagnosed from each model's output using statistical functions. The temperature sensitivity is compared against results from ecosystem field warming experiments. The CO2 sensitivity of NPP is compared with the results from four Free-Air CO2 Enrichment (FACE) experiments. The simulated global net biome productivity (NBP) is compared with the residual land sink (RLS) of the global carbon budget from Friedlingstein et al. [Nature Geoscience 3 (2010) 811] (FR10). We found that models produce a higher GPP (133 ± 15 Pg C yr−1) than JU11 (118 ± 6 Pg C yr−1). In response to rising atmospheric CO2 concentration, modeled NPP increases on average by 16% (5–20%) per 100 ppm, a slightly larger apparent sensitivity of NPP to CO2 than that measured at the FACE experiment locations (13% per 100 ppm). Global NBP differs markedly among individual models, although the mean value of 2.0 ± 0.8 Pg C yr−1 is remarkably close to the mean value of RLS (2.1 ± 1.2 Pg C yr−1). The interannual variability in modeled NBP is significantly correlated with that of RLS for the period 1980–2009. Both model-to-model and interannual variations in modeled GPP are larger than those in modeled NBP, because strong coupling produces a positive correlation between ecosystem respiration and GPP in the models. The average linear regression slope of global NBP vs. temperature across the 10 models is −3.0 ± 1.5 Pg C yr−1 °C−1, within the uncertainty of that derived from RLS (−3.9 ± 1.1 Pg C yr−1 °C−1). However, 9 of 10 models overestimate the regression slope of NBP vs.
precipitation, compared with the slope of the observed RLS vs. precipitation. Because most models lack processes that control GPP and NBP in addition to CO2 and climate, the agreement between modeled and observation-based GPP and NBP may be fortuitous. Carbon–nitrogen interactions (only separable in one model) significantly influence the simulated response of the carbon cycle to temperature and atmospheric CO2 concentration, suggesting that nutrient limitations should be included in the next generation of terrestrial biosphere models.
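
As an illustration of how a regression slope such as that of NBP vs. temperature might be diagnosed from annual time series, the sketch below fits an ordinary least-squares slope to linearly detrended series. It is a minimal example, not the procedure used in the study: the detrending step, the function names (`ols_slope`, `detrend`, `apparent_sensitivity`), and the synthetic 30-year series are all assumptions made for illustration.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def detrend(series):
    """Remove the mean and a linear trend in time from an annual series."""
    n = len(series)
    t = list(range(n))
    trend = ols_slope(t, series)
    mean_t = sum(t) / n
    mean_y = sum(series) / n
    return [y - (mean_y + trend * (ti - mean_t)) for ti, y in zip(t, series)]


def apparent_sensitivity(response, driver):
    """Slope of detrended response vs. detrended driver.

    Detrending both series first is an assumption of this sketch, intended
    to isolate interannual variability from long-term trends.
    """
    return ols_slope(detrend(driver), detrend(response))


# Synthetic data only: 30 years of temperature anomalies (deg C) and an
# NBP series (Pg C per year) built with a prescribed slope of -3.0 plus
# a linear trend in time. These are not values from any model or dataset.
random.seed(0)
temp_anomaly = [random.gauss(0.0, 0.2) for _ in range(30)]
nbp = [2.0 - 3.0 * temp + 0.02 * year for year, temp in enumerate(temp_anomaly)]

gamma_t = apparent_sensitivity(nbp, temp_anomaly)
print(f"Diagnosed NBP vs. temperature slope: {gamma_t:.2f} Pg C yr-1 degC-1")
```

Because the synthetic NBP series is constructed without noise, the diagnosed slope recovers the prescribed value; with real model output or RLS estimates, the slope would carry an uncertainty that should be reported alongside it.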