Budburst models have mainly been developed to capture the processes of individual trees, and they vary in their complexity and plant physiological realism. We evaluated how well eleven models capture the variation in budburst of birch and Norway spruce in Germany, Austria, the United Kingdom and Finland. The comparison was based on the models' performance in relation to their underlying physiological assumptions under four different calibration schemes. The models were not able to accurately simulate the timing of budburst. In general, the models overestimated the temperature effect, so that budburst was simulated too early in the United Kingdom and too late in Finland. Among the better performing models were three models based on the growing degree day concept, with or without day length or chilling, and an empirical model based on spring temperatures. These were also the models least influenced by the calibration data. For birch, the best calibration scheme was based on multiple sites in either Germany or Europe; for Norway spruce, the best scheme included multiple sites in Germany or cold years from all sites. Most model and calibration combinations showed greater bias at higher spring temperatures, mostly simulating budburst earlier than observed.
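To illustrate the growing degree day concept referred to above, the following is a minimal sketch of a generic thermal-time model and not one of the eleven models evaluated here. The function name `gdd_budburst_day`, the base temperature of 5 °C, the threshold of 100 degree days and the start day are all illustrative assumptions rather than calibrated values from this study.

```python
from typing import Optional, Sequence


def gdd_budburst_day(
    daily_mean_temp_c: Sequence[float],
    base_temp_c: float = 5.0,      # assumed base temperature, not a calibrated value
    gdd_threshold: float = 100.0,  # assumed forcing requirement in degree days
    start_day: int = 1,            # assumed day of year at which accumulation begins
) -> Optional[int]:
    """Return the first day of year on which accumulated degree days
    reach the threshold, or None if the threshold is never reached."""
    accumulated = 0.0
    for day, temp in enumerate(daily_mean_temp_c, start=1):
        if day < start_day:
            continue
        # Only temperatures above the base contribute to forcing.
        accumulated += max(0.0, temp - base_temp_c)
        if accumulated >= gdd_threshold:
            return day
    return None


# Hypothetical temperature series: 60 cold days followed by a constant warm spell.
cool_spring = [0.0] * 60 + [10.0] * 60
warm_spring = [0.0] * 60 + [15.0] * 60
print(gdd_budburst_day(cool_spring), gdd_budburst_day(warm_spring))
```

In this sketch a warmer spring reaches the threshold sooner, so the simulated budburst date depends entirely on temperature; variants with day length or chilling would add further conditions before or during the accumulation.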