The European summer of 2003 was exceptionally warm, and there is evidence that human influence has at least doubled the risk of such a hot summer. It is possible that by the 2040s, summers over southern Europe will be as warm or warmer 50% of the time. Because of the related socioeconomic impacts, there is growing interest in investigating how climate extremes across the world have changed and how they may change in the future. We examine observed and simulated summer temperatures over a set of regions covering the Northern Hemisphere. Simulated changes are consistent with observed changes over the vast majority of regions when the climate simulation includes changes in anthropogenic and natural influences. We detect the dominant influence of anthropogenic factors on observed warming in almost every region, which has led to a rapidly increasing risk of hot summers. We show that hot summers that were infrequent 20–40 years ago are now much more common, and our projections indicate that the current sharp rise in the incidence of hot summers is likely to continue.
 In large parts of Europe, the summer of 2003 was exceptionally warm, with large areas some 3 K (up to 5 standard deviations) above the 1961–1990 mean [Schär et al., 2004]. The summer of 2003 was “very likely,” with 95% confidence, the warmest since 1500 A.D. across Europe [Luterbacher et al., 2004] and the larger Mediterranean land area [Xoplaki et al., 2006] and the warmest since 755 A.D. across the Alps [Büntgen et al., 2006]. There were a large number of excess deaths across Europe, ∼30,000 [Koppe et al., 2004], and in France the increased mortality has been associated with the extremely warm temperature anomaly that lasted 2 weeks in August over the country [Trigo et al., 2005]. The heat wave and associated dry conditions also had ecological impacts, with excessive wildfires and damage to forests and freshwater ecosystems. Crop yields across much of Europe declined, and industry and societal infrastructure were affected; for instance, in France there was increased demand for electricity at the same time as some power stations were shut down [Intergovernmental Panel on Climate Change (IPCC), 2007a]. Recently, there has been an understandable increase in interest in investigating whether anthropogenic influences are affecting such heat waves and whether the frequency or severity of the events will increase in the future.
 Globally, annual mean temperatures increased by 0.74 K between 1906 and 2005 [Trenberth et al., 2007] and by 0.13 K/decade over the last 50 years. Various studies have attributed most of this warming to changes in anthropogenic emissions [International ad hoc Detection and Attribution Group, 2005; Hegerl et al., 2007]. For instance, one study [Tett et al., 2002] found that during the 20th century greenhouse gas emissions increased temperatures by about 1 K, offset by about −0.4 K from aerosols, while the natural influences of solar and volcanic activity caused smaller compensating warming/cooling intervals. More recent studies, using a range of climate models, have largely supported those results [Hegerl et al., 2007], also suggesting that it is likely that the warming from greenhouse gas emissions alone would have been larger than the observed increases between 1950 and 1999. With an increase in global temperatures, extreme events associated with temperature are likely to change in frequency and intensity [Meehl et al., 2007]. Indeed, changes in a number of indices of climate extremes have been observed over the globe during the last 50 years [Alexander et al., 2006] and over Europe during the 20th century [Moberg et al., 2006], such as decreases/increases in occurrences of cold/warm days and nights, increases in lengths of warm spells, and increases in heavy precipitation events. The anthropogenic influence on changes in temperatures of the warmest night and coldest night and day of the year has been detected over the last 50 years [Christidis et al., 2005], and the trend in the warmest night is projected to increase significantly during the 21st century. Observed mean climate changes have already been noted to have had effects on many natural systems, with some evidence of impacts on managed and human systems [IPCC, 2007b], and mean changes in climate are likely to have varying degrees of positive and negative impacts in the future.
 Further changes in extremes may have large socioeconomic impacts beyond those of the underlying changes. The Intergovernmental Panel on Climate Change report on climate impacts stated that ecosystems, agriculture, and human health are likely to be adversely affected by future changes in climate extremes. In particular, increased frequency of warm spells/heat waves may lead to decreases in water quality, crop yields, and quality of life for vulnerable people and increases in water demand, wildfires, and heat-related mortality [IPCC, 2007b]. However, some climate changes associated with extremes (e.g., warmer and more frequent hot days) may also lead to positive impacts such as increased crop yields, reduced mortality, and reduced demand for heating in colder environments. Understanding what may have contributed to such extreme event changes in the past and how they may change in the future is thus an important subject of study.
 The European heat wave and hot summer of 2003 have been extensively investigated [Trenberth et al., 2007; IPCC, 2007a]. Changes in atmospheric circulation patterns and Mediterranean sea surface temperatures are suggested to have had an influence on European summer temperatures and heat waves [Xoplaki et al., 2003; Della-Marta et al., 2007b], with the length of heat waves over western Europe doubling since 1880 [Della-Marta et al., 2007a]. The 2003 European heat wave and hot summer have been associated with a persistent blocking high pressure system [Black et al., 2004] and soil moisture changes from reductions in precipitation and increased drying [Fischer et al., 2007]. However, it is not possible to state definitively that one extreme event, such as a hot summer, is associated with any underlying climate change unless the event can be shown to be unprecedented. While mean changes in temperatures may be detectable over internal natural variability and the main attributable contributions estimated, it is not possible to do the same for extreme events if the incidence of such events is so rare that one cannot observe a change in their frequency over a period of time, for instance if the event has occurred only once. However, one approach to associating a cause with an extreme warm event is to investigate the change in the probability of an extreme under different circumstances, using a model to simulate the changes. One study [Stott et al., 2004] examined the southern Europe region (Mediterranean Basin) and deduced that the likelihood of an extremely warm summer, similar to the exceptionally warm summer of 2003, occurring in the absence of anthropogenic warming was near 1 in 1000 years. With anthropogenic warming this likelihood at least doubled, with a best estimate of 1 in 250 years. A climate model projection following the A2 SRES scenario [Nakicenovic and Swart, 2001] indicates that the warmth of the 2003 summer may be common in the Mediterranean Basin by 2040, happening every other year, and may appear anomalously cold by 2080 [Stott et al., 2004].
 Any changes in the frequency and strength of heat waves will have significant ecological and socioeconomic impacts [IPCC, 2007b]. As increases in mean summer temperatures are also associated with changes in heat waves [e.g., Della-Marta et al., 2007a], the examination of changes in the frequency of warm/hot summers has obvious scientific and policy-related implications. This interest in how the frequency of extreme seasonal events like hot summers may change in the future has been reflected in many other studies. Heat waves like those in Europe during August 2003 are projected to be a “taste of things to come” across Switzerland [Beniston, 2004; Schär et al., 2004; Beniston and Diaz, 2004], and there could be increases in the intensity and number of heat waves (of 10 d length) over areas of North America and Europe by 2100 [Meehl and Tebaldi, 2004]. A study of the equilibrium responses to a doubling of CO2 in a physics parameter ensemble of climate models (atmosphere coupled to a slab ocean) found a similar result over much of the globe [Clark et al., 2006]. A different study using the same model ensemble [Barnett et al., 2006] deduced that mean summer temperatures occurring only 1 year in 20 in a control climate could happen almost every year across much of the globe with a doubling of CO2.
 A similar study examining a range of atmosphere-ocean coupled model simulations of 21st century climate with different emission scenarios found that 1 in 20 year warm summers could increase in frequency by between 30% and 61% for different regions by 2081–2100 [Weisheimer and Palmer, 2005]. Using the same technique but with a wider range of models, a study found increases across regions between 85% and 100% for the A1B climate scenario [Christensen et al., 2007]. Another study examined three atmosphere-ocean coupled model simulations of 21st century climate [Baettig et al., 2007]. They found increases, by the 2071–2100 period, in the number of events relative to the 1 in 20 year frequency deduced from the 1961–1990 period, for several climate indicators including warmest/wettest/driest year/summer/winter. By the end of the 21st century 1 in 20 year warm summers would occur almost every year across the models examined. An advantage of these latter studies is that they examined the change in frequency of extreme warm events using data gathered over large regions across the globe rather than just one region. Examining a solitary region after an event may introduce a selection bias which may overemphasize its global significance. For instance, a very rare warm outlier in one area (e.g., a 1 in 1000 year event) may be more common when ten areas are considered (e.g., a 1 in 100 year event). However, a disadvantage of these same studies is that the models' ability to simulate past changes in the frequency of warm events has not been validated against past observed changes.
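The arithmetic behind this selection effect can be sketched as follows. This is an illustration only: it assumes the regions are statistically independent and share the same per-region event probability, which is a simplification not made in the text, and the function name is hypothetical.

```python
def prob_any_region(p_single: float, n_regions: int) -> float:
    """Annual probability that at least one of n independent regions
    sees an event whose annual probability in each region is p_single."""
    return 1.0 - (1.0 - p_single) ** n_regions


# A 1 in 1000 year event in one region, viewed across ten regions.
p_any = prob_any_region(1.0 / 1000.0, 10)
return_period = 1.0 / p_any

print(f"annual probability in any of 10 regions: {p_any:.5f}")
print(f"effective return period: about {return_period:.0f} years")
```

Under these assumptions the effective return period falls to roughly a tenth of the single-region value, which is why picking out one region after the event can overstate how unusual it is.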
 In this paper we aim to address these issues by comparing observed historic Northern Hemisphere (NH) summer temperature variations with simulated temperatures from an atmosphere-ocean coupled climate model. We examine summer land temperatures across different regions to minimize any possible selection effects. We use an optimal detection analysis to quantify anthropogenic and natural influences on changes in mean summer temperatures across the regions. We track the observed change in frequency of warm summers over the 20th century to determine whether uncommon warm events are becoming increasingly frequent. Finally, we investigate how well these changes are simulated and what the implications are for future changes in the frequency of warm summers.
 Northern Hemisphere summer (JJA) means of land near-surface temperatures from CRUTEM3v [Brohan et al., 2006], which covers the period 1850 to 2006 inclusive, are compared with simulations using the most recent generation Met Office Hadley Centre coupled atmosphere-ocean general circulation model, HadGEM1 (see Stott et al. for further details about HadGEM1). We use simulations of the 20th century (1860–2006), incorporating anthropogenic only factors (ANTHRO), anthropogenic and natural factors combined (ALL), and natural only factors (NATURAL). We also use 21st century simulations (SRES scenarios A1B and A2) and a multicentury control simulation (CONTROL). The 20th century historic transient simulations comprise ensembles of three members, each with different initial conditions sampled from the CONTROL. Full details of these simulations can be found in Table 1.
Table 1. Models Used in Analysis and the Different Experimental Setups
ANTHRO: 1860–2009; 3 initial condition (taken from CONTROL) ensemble members. Up to 1999, historic variations in well-mixed greenhouse gases, sulphate aerosols (direct and indirect effects), ozone (troposphere and stratosphere), industrial black carbon emissions, biomass burning aerosols, and land use changes. 2000–2010 using A1B scenario for above variations.
ALL: 1860–2009; 3 initial condition (taken from CONTROL) ensemble members. Up to 1999, historic variations in well-mixed greenhouse gases, sulphate aerosols (direct and indirect effects), ozone (troposphere and stratosphere), industrial black carbon emissions, biomass burning aerosols, land use changes, solar irradiance changes, and stratospheric volcanic sulphate aerosols. 2000–2010 using A1B scenario for anthropogenic emissions, an average 11 year solar cycle, and exponentially decaying volcanic aerosol.
NATURAL: No runs. Deduced using difference between ALL and ANTHRO. The variability of the resultant NATURAL will be twice as large as the variability of equivalent ALL or ANTHRO.
A2: 2000–2099; 1 ensemble member, with initial condition taken from an ensemble member of ANTHRO. Variations in anthropogenic forcings, as in ANTHRO, but following A2 SRES scenario.
A1B: 2000–2099; 1 ensemble member, with initial condition taken from the same ensemble member of ANTHRO as used by A2. Variations in anthropogenic forcings, as in ANTHRO, but following A1B SRES scenario.
 The simulated NH summer land 1.5 m temperatures are extracted and regridded to the CRUTEM3v spatial grid (5° × 5° resolution) and masked with the equivalent dated observational mask such that the simulated fields have the same coverage as the observations. The simulated fields dated after 2006 are masked by the observational coverage of 2006. We process the simulated data by finding the gridded anomalies with respect to the 1961–1990 mean, as was done with CRUTEM3v. We calculate the area means of the NH, continental regions of North America, Europe, and Asia and 14 so-called Giorgi regions which are wholly in the Northern Hemisphere [Giorgi and Francisco, 2000] as defined in Table 2.
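The processing steps in this paragraph (anomalies relative to 1961–1990, masking the model to the observed coverage, then regional averaging) can be sketched roughly as below. This is a minimal illustration, not the authors' code: the cosine-of-latitude weighting is our assumption (the text does not state how area means are weighted), `None` stands in for a missing grid box, and all function names and data values are hypothetical.

```python
import math


def anomalies(series, years, base=(1961, 1990)):
    """Subtract the base-period mean from one grid box's time series."""
    base_vals = [v for v, y in zip(series, years)
                 if v is not None and base[0] <= y <= base[1]]
    clim = sum(base_vals) / len(base_vals)
    return [None if v is None else v - clim for v in series]


def apply_mask(sim, obs):
    """Blank simulated boxes wherever the observations are missing."""
    return [s if o is not None else None for s, o in zip(sim, obs)]


def area_mean(values, lats_deg):
    """cos(latitude)-weighted mean over the unmasked grid boxes."""
    num = den = 0.0
    for v, lat in zip(values, lats_deg):
        if v is None:
            continue
        w = math.cos(math.radians(lat))  # assumed area weight
        num += w * v
        den += w
    return num / den if den > 0.0 else None


# Hypothetical single-summer example with three 5-degree grid boxes.
lats = [32.5, 47.5, 62.5]
obs = [0.5, None, 1.0]   # middle box has no observation
sim = [0.4, 0.9, 1.2]
sim_masked = apply_mask(sim, obs)
print(area_mean(obs, lats), area_mean(sim_masked, lats))
```

Masking before averaging means the simulated regional mean is built from exactly the same boxes as the observed one, so differences in coverage do not masquerade as differences in temperature.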
 We limit our analysis to data after 1900 as the overall spatial coverage of the observations across the Northern Hemisphere prior to 1900 is below 40% and several regions have no data at times. The coverage of the available observations with respect to the total land area in each region is shown in Figure 1. The coverage of available data varies over time and across the regions. The Greenland and Sahara regions have the least amount of data coverage having maximum coverages of 51% and 65%, respectively, in the midcentury and minimum coverages of 16% and 3% in 1900. Other regions have much higher coverage, such as the western, central, and eastern North America regions with coverages not lower than 80%. Generally, coverage across the regions has increased between 1900 and 1950 and then either remained constant or slightly decreased.
3. Mean Summer Temperature Variations
 We first examine the mean summer temperatures for the NH (Figure 2). The ensemble means for ALL and NATURAL are shown, together with the individual initial condition realizations. For clarity the ensemble means for ANTHRO, NATURAL, and ALL are also each shown with the observed NH temperatures in Figure 3. Overall the correspondence between the ALL ensemble mean and CRUTEM3v is quite remarkable over the whole period, with NATURAL unable to match the variations after the 1970s, suggesting that anthropogenic influence is needed to explain the change. Much of the multidecadal variability is captured by the ALL simulations, including the 1 K warming observed since the start of the 20th century. ANTHRO does not capture the early century warming, 1900–1940, but NATURAL does, consistent with this warming being predominantly naturally driven [Tett et al., 2002; Stott et al., 2006]. On shorter annual timescales, the cooling following the Santa Maria (1902), Katmai/Novarupta (1912), Agung (1963), and Pinatubo (1991) volcanic eruptions is simulated by ALL. The simulated cooling following El Chichón (1982) is not obvious in the observations, perhaps because the warming from the large El Niño event in 1983 offset any volcanic cooling [Jones et al., 2004]. The observations rise above the estimated internal climate variability after 1990, showing that this rise is significant with respect to the 1961–1990 reference period. Temperatures are also colder before 1920, again showing that there has been significant change over the 20th century as a whole.
 Correlations of the observed annual mean summer temperatures with the simulations are shown in Table 3. To calculate the significance of the correlations, samples of the CONTROL are processed in the same way as the 20th century historic transient simulations such that three sampled segments, each 106 years long, are averaged to make up a surrogate ensemble mean simulation of the 1900–2006 period which is then correlated with the observations. This is repeated in a Monte Carlo analysis 1000 times, resampling the control, to produce a probability distribution of correlations of the observations with unforced simulations. The simulated correlation is then compared with the one-sided 95th percentile of the distribution to deduce significance. The ALL simulation has a correlation value of 0.84, which corresponds to explaining 70% of the observed annual variability. Whilst NATURAL can only explain 13% of the variability by itself, ANTHRO explains 49%. Inspection of Figure 2 suggests that most of the correlation with NATURAL is likely from the variability following the major volcanic eruptions and the coincident warming in the early part of the 20th century, which in the model is a combination of solar irradiance increases and lack of volcanic eruptions compared to the activity at the end of the 19th century. However, the warming seen in ANTHRO and ALL since the 1970s is the dominant feature explained in the observations.
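The Monte Carlo significance test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the control run and observations are replaced by synthetic white noise and a synthetic trend, segments are drawn at random start points and may overlap, and the function names are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def null_correlation_threshold(obs, control, n_members=3, n_trials=1000,
                               percentile=95.0, rng=None):
    """One-sided percentile of correlations between the observations and
    surrogate ensemble means built from randomly sampled control segments."""
    rng = rng or random.Random(0)
    n = len(obs)
    corrs = []
    for _ in range(n_trials):
        starts = [rng.randrange(len(control) - n + 1) for _ in range(n_members)]
        surrogate = [sum(control[s + t] for s in starts) / n_members
                     for t in range(n)]
        corrs.append(pearson(obs, surrogate))
    corrs.sort()
    idx = min(len(corrs) - 1, math.ceil(percentile / 100.0 * len(corrs)) - 1)
    return corrs[idx]


# Synthetic stand-ins (assumptions): an unforced control and a warming series.
rng = random.Random(42)
control = [rng.gauss(0.0, 0.3) for _ in range(1090)]
obs = [0.01 * t + rng.gauss(0.0, 0.3) for t in range(107)]

threshold = null_correlation_threshold(obs, control, rng=rng)
print(f"95th percentile of unforced correlations: {threshold:.2f}")
print(f"variance explained by r = 0.84: {0.84 ** 2:.3f}")
```

A simulated correlation is counted as significant when it exceeds `threshold`; the last line shows how a correlation converts to explained variance by squaring.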
Table 3. Correlations of Summer Regional Mean Temperatures Between Ensemble Mean Simulations and Observed Regional NH Summer Means^a
^a Numbers in bold pass a statistical significance test at the 95% level.
Figure 4 shows 10 year moving averages of the ensemble means of ALL and NATURAL for each of the regions. All regions, apart from Central North America, show warming since 1900 that has accelerated since around 1980. The degree of the warming differs across the regions; for instance, for Europe the warming is ∼1 K, while across North America it is nearer 0.5 K. Overall the temperature changes in the continental regions of Europe, Asia, and North America are simulated well by the model and show clear increases by the start of the 21st century. Summer temperatures in most of the 14 smaller-scale Giorgi regions show a very good comparison between the model and the observations. There is similar interdecadal variability and increasing warming since the 1960s, although some regions across North America have less warming near the turn of the 20th and 21st centuries than simulated. Additionally, observed temperatures in Central North America are warmer than simulated around the 1930s, albeit within the internal variability as estimated by the model. An examination of the sensitivity of the observed changes to the inclusion of observational uncertainty [Brohan et al., 2006] (Appendix A.2) finds that the conclusions are very robust to this source of uncertainty.
 Correlations of the simulated annual regional means with the observed summer temperatures are given in Table 3 with their significances. In almost all the regions the highest correlations are when the anthropogenic influences are combined with the natural influences. The amount of correlation between the annual values varies from region to region. For instance, ALL can explain 42% of the variability in annual observed variations in the Mediterranean Basin but only 8% in Northern Europe. Almost all the regions have higher correlations of observations with ANTHRO than with NATURAL. The only region examined that has a correlation that is largest when NATURAL is compared with the observations is Central North America, where the results suggest that NATURAL can still only explain 8% of the observed interannual variability. As stated above, this regional decadal variability has some discrepancies with ALL. By inspection of Figure 4 some of the correlation of NATURAL with the observations may be due to the increased temperatures during the early part of the century, in the model due to solar irradiance increases and quieter volcanic activity following the larger eruptions at the end of the 19th century, and in the interannual variation following volcanic eruptions at the end of the 20th century (not shown). In most of the areas examined the changes in the observations over the last 30 years are not simulated by NATURAL, including the warming since 1980.
 As the correlations in Table 3 are on annual variations, the results may not give accurate representations of what may be contributing to the longer timescale climate variations. An optimal detection analysis is carried out in section 4 and provides a more thorough examination of the attribution of the climate variations.
 In all regions, under the two future emission scenarios examined here, summer temperatures are projected to continue rising at a rapid rate. The likelihood of a summer in the Mediterranean Basin warmer than that of the 2003 summer (2.30 K with respect to 1961–1990) has been estimated, using another model, HadCM3 (Appendix A.1), to be more than 50% by the 2040s [Stott et al., 2004]. These HadGEM1 results suggest that the date could be up to a decade earlier (Figure 4). For Europe as a whole, 2003 was also exceptionally warm (1.64 K with respect to 1961–1990) and this threshold is estimated to be exceeded every other year by 2030, regardless of which of the two emission scenarios is followed. However, this result does not fully represent a wider range of uncertainties, as it is drawn from a solitary model, HadGEM1. We partially examine the sensitivity of the results to the use of a different model, an earlier generation Hadley Centre model, HadCM3 (Appendix A.1), and find similar results to those from the HadGEM1 model. Spanning a wide range of different models [e.g., Weisheimer and Palmer, 2005] is outside the scope of this study.
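One simple way to make "exceeded every other year" concrete is to count, in a sliding window, the fraction of summers above the threshold. The sketch below does this for a purely synthetic temperature series; the 10 year window, the series itself, and the function names are our assumptions for illustration and do not reproduce the projections discussed above.

```python
def exceedance_fraction(temps, threshold):
    """Fraction of values strictly above the threshold."""
    return sum(1 for t in temps if t > threshold) / len(temps)


def first_window_exceeding(years, temps, threshold, min_fraction=0.5, window=10):
    """Start year of the first window in which at least min_fraction
    of the summers exceed the threshold, or None if there is none."""
    for i in range(len(years) - window + 1):
        if exceedance_fraction(temps[i:i + window], threshold) >= min_fraction:
            return years[i]
    return None


# Synthetic anomalies (K): a steady trend plus an alternating warm/cool year.
years = list(range(2000, 2100))
temps = [0.04 * (y - 2000) + (0.3 if y % 2 == 0 else -0.3) for y in years]

# 1.64 K is the 2003 European anomaly quoted in the text.
start = first_window_exceeding(years, temps, threshold=1.64)
print("first 10 year window with exceedance every other year starts in", start)
```

With real model output, `temps` would be the regional summer anomalies from a scenario run, and the answer would depend on the model, the scenario, and the chosen window length.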
4. Detection Analysis
 An observed change in climate can be said to be detected when the changes are statistically significant relative to some reference period. Because of the lack of long-term observations that are not influenced by external forcing factors (e.g., anthropogenic emissions or natural changes), a model simulation of internal variability is used to create the statistics to test the change against. We examine the internal natural variability of the model by comparing its power spectra with those of the observed summer temperatures [Stott et al., 2004, 2006; Hegerl et al., 2007] (Figure 5). For all regions the model appears to capture most of the observed variability over the range of timescales examined. This suggests that the model has an adequate representation of observed internal variability across the regions, an important requirement for the following detection analysis. As shown in section 3, many of the observed changes in mean summer temperatures are detectable over an estimate of internal climate variability, with respect to the 1961–1990 reference period. A formal attribution analysis involves examining model simulations driven by various external forcing factors and estimating the contributions to the observed change whilst taking into account estimates of internal climate variability [International ad hoc Detection and Attribution Group, 2005].
 One common approach that will be applied here is based on a technique, called optimal detection, presented in other studies that have attributed large proportions of the observed warming to increases in well-mixed greenhouse gases with offsetting contributions from sulphate aerosols, ozone, and natural influences [Tett et al., 2002; Allen and Stott, 2003; Stott, 2003]. The technique can be applied to climate indices other than annual surface temperature [Hegerl et al., 2006, 2007], such as indices related to warm summers, that is, trends in warm nights [Christidis et al., 2005], growing season length [Christidis et al., 2007], or area burned by forest fires in Canada [Gillett et al., 2004].
 We apply an optimal detection analysis to 10 year regional means for the period 1907–2006 using the same methodology as applied by Stott et al. to Mediterranean Basin summer temperatures but here extending the analysis to the Northern Hemisphere, three continents, and 14 Giorgi regions. We limit the analysis to these relatively large spatial scales as studies have shown that models may underestimate variability on scales smaller than 2000 km [Stott and Tett, 1998]. However, recent studies [Karoly and Stott, 2006] suggest that analyzing smaller scales is possible and that this may be an interesting area for future investigation.
 The optimal detection analysis is in essence a linear regression of the simulated temperatures against the observed temperatures. The methodology is built around multiple regression [Allen and Tett, 1999; Tett et al., 2002; Allen and Stott, 2003], where the observed decadal mean temperatures (y) can be represented as a linear sum of I simulated signals (xi), where xi is the ensemble average of the simulations for the given forcing factor combination, such that
$$ y = \sum_{i=1}^{I} \left( x_i - \nu_i \right) \beta_i + \nu_0 \qquad (1) $$
where βi are the unknown scaling factors to be estimated in the regression, ν0 is the estimate of the noise in the observations, and νi is an estimate of the noise in the simulated temperatures [Allen and Stott, 2003; Stott, 2003; Stott et al., 2004]. The observations and simulations are pre-whitened by projection onto the leading eigenvectors calculated from an estimate of the inverse of the covariance matrix of internal variability (in this case deduced from the CONTROL), representing the dominant orthogonal modes of internal variability. This improves the signal to noise ratios by normalizing the patterns to the internal variability of the climate. For further information, see Tett et al., Allen and Stott, and Hegerl et al.
 One important aspect of this method is that any uncertainty in the magnitude of the forcing and climate response is compensated for by scaling the model responses (xi) by the signal amplitudes (βi) [Tett et al., 2002]. For example if, because of uncertainty in the forcing magnitude, the climate response is smaller than observed, then the scaling factor (βi) may allow for this by scaling up the response. However, uncertainties in the forcing/response space-time patterns themselves are not taken account of here as this requires sampling different models with a range of different simulated responses [Huntingford et al., 2006].
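The role of the scaling factor can be illustrated with a deliberately simplified, single-signal regression. The sketch below is not the optimal detection method itself: it uses ordinary least squares, ignores noise in the simulated signal, and assumes independent noise of known variance rather than projecting onto eigenvectors of the control covariance. All names and numbers are hypothetical.

```python
import math


def scaling_factor(y, x, noise_var):
    """Least-squares estimate of beta in y = beta * x + noise, with an
    approximate 5-95% range, assuming independent Gaussian noise of
    known variance noise_var in y and a noise-free signal x."""
    sxx = sum(v * v for v in x)
    beta = sum(a * b for a, b in zip(x, y)) / sxx
    half_width = 1.645 * math.sqrt(noise_var / sxx)  # 95th normal percentile
    return beta, beta - half_width, beta + half_width


# Hypothetical decadal-mean signal (K) from a model that responds too weakly:
# the "observations" are 1.5 times the simulated response.
x_sim = [-0.35, -0.30, -0.20, -0.15, -0.10, -0.05, 0.00, 0.10, 0.30, 0.55]
y_obs = [1.5 * v for v in x_sim]

beta, low, high = scaling_factor(y_obs, x_sim, noise_var=0.01)
print(f"beta = {beta:.2f}, 5-95% range = ({low:.2f}, {high:.2f})")
```

Here the fitted factor scales the model response up to match the observations, which is the compensation for amplitude errors described above; errors in the shape of the pattern would not be corrected this way.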
 To test whether the best-estimate combination of signals is consistent with the regression model, the residual variability is compared with the variability deduced from 2450 years of HadCM3 CONTROL (Table 4) and an F-test is used to reject the result if the probability value is outside the 0.05–0.95 range [Tett et al., 2002].
Table 4. Models Used in Analysis and the Different Experimental Setups
ANTHRO: 1860–2002; 4 initial condition (taken from CONTROL) ensemble members. Historic variations in well-mixed greenhouse gases, sulphate aerosols (direct and indirect effects), and ozone (troposphere and stratosphere).
ALL: 1860–2002; 4 initial condition (taken from CONTROL) ensemble members. Historic variations in well-mixed greenhouse gases, sulphate aerosols (direct and indirect effects), ozone (troposphere and stratosphere), solar irradiance changes, and stratospheric volcanic sulphate aerosols.
NATURAL: 1860–2002; 4 initial condition (taken from CONTROL) ensemble members. Historic variations in solar irradiance changes and stratospheric volcanic sulphate aerosols.
B2: 1980–2099; 3 initial condition (taken from ensemble members of ANTHRO) ensemble members. Variations in anthropogenic forcings, as in ANTHRO, but following B2 SRES scenario.
A2: 1980–2099; 3 initial condition (taken from ensemble members of ANTHRO) ensemble members. Variations in anthropogenic forcings, as in ANTHRO, but following A2 SRES scenario.
 When the simulations used are ALL and ANTHRO, the scaling factors βALL and βANTHRO are produced by the analysis. The linear combination of two signal scaling factors produces the individual components. In the case of the combination of ALL and ANTHRO, the climate response of NATURAL, xNATURAL, can be deduced as the difference between xALL and xANTHRO. From equation (1) [Tett et al., 2002] the scaling factors for the individual forcing contributions can be deduced,
$$ \beta_{\mathrm{anthropogenic}} = \beta_{\mathrm{ALL}} + \beta_{\mathrm{ANTHRO}}, \qquad \beta_{\mathrm{natural}} = \beta_{\mathrm{ALL}} \qquad (2) $$
where βanthropogenic and βnatural are the scaling factors for the anthropogenic only components and the natural only components, respectively.
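The step from the fitted factors to the per-forcing factors can be checked numerically. Because xALL contains both responses (xALL = xANTHRO + xNATURAL), a fit of y on xALL and xANTHRO implies βnatural = βALL and βanthropogenic = βALL + βANTHRO; this is our reading of the relation, stated here as an assumption. The sketch uses noise-free synthetic signals and plain least squares, so it shows only the algebra, not the full method.

```python
def fit_two_signals(y, x1, x2):
    """Least-squares fit of y = b1*x1 + b2*x2 via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(a * a for a in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * b for a, b in zip(x1, y))
    s2y = sum(a * b for a, b in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2


# Hypothetical decadal responses (K) to anthropogenic and natural forcing.
x_anthro = [0.00, 0.05, 0.10, 0.10, 0.15, 0.20, 0.30, 0.45, 0.60, 0.80]
x_nat = [-0.10, -0.15, 0.00, 0.10, 0.10, 0.05, -0.10, 0.00, -0.05, -0.10]

# Synthetic "observations": true factors 0.8 (anthropogenic), 1.2 (natural).
y = [0.8 * a + 1.2 * n for a, n in zip(x_anthro, x_nat)]

# The analysis only sees the ALL and ANTHRO simulations.
x_all = [a + n for a, n in zip(x_anthro, x_nat)]
beta_all, beta_anthro_run = fit_two_signals(y, x_all, x_anthro)

beta_anthropogenic = beta_all + beta_anthro_run
beta_natural = beta_all
print(f"beta_ALL = {beta_all:.2f}, beta_ANTHRO = {beta_anthro_run:.2f}")
print(f"anthropogenic = {beta_anthropogenic:.2f}, natural = {beta_natural:.2f}")
```

Note that the factor fitted to the ANTHRO run can be negative even when the anthropogenic contribution is positive, because it only measures the difference between the anthropogenic and natural scalings.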
 For the period 1907–2006 we calculate 10 year regional means and project them onto estimates of the internal noise covariance calculated from 1090 years of the HadGEM1 control (the Control simulation from an earlier generation model, 2450 years of HadCM3, is used for significance testing). We calculate the scaling factors for each of the forced simulations and their respective 5–95% uncertainty ranges. We calculate the number of degrees of freedom to truncate the optimized eigenvector space, limited by the number of elements in the patterns (in this case just 10 decadal mean temperatures) to be 9.
 The results of the optimal detection analyses are shown in Figure 6, where the bars represent the 5–95% uncertainty ranges of the scaling factors for each forcing, for each region considered. A signal is defined as being “detected” when the 5–95% range of the scaling factor does not cross zero, that is, when the null hypothesis that the observed response to a particular forcing is zero is rejected. Similarly, if the 5–95% range overlaps with the value of 1, then it implies the model is overall a good match with the observations. However, a scaling factor significantly smaller or larger than 1 suggests that the model is either over- or under-responding, or that the forcing factor is too big or too small, or both. Alternatively, important forcing factors may not have been included or a mode of internal variability is not being simulated.
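These interpretation rules can be written down compactly. The sketch below simply encodes the definitions in the paragraph above; the function name and the example ranges are hypothetical and are not values read from Figure 6.

```python
def interpret_scaling_range(low, high):
    """Classify a 5-95% scaling-factor range (low, high)."""
    detected = low > 0.0 or high < 0.0  # range does not cross zero
    if not detected:
        return "not detected"
    if low <= 1.0 <= high:
        return "detected, consistent with 1"
    if high < 1.0:
        # Model response (or forcing) too large: it must be scaled down.
        return "detected, model response scaled down"
    # Model response (or forcing) too small: it must be scaled up.
    return "detected, model response scaled up"


# Made-up ranges for four hypothetical regions.
examples = {
    "region A": (0.6, 1.3),
    "region B": (-0.2, 0.9),
    "region C": (0.3, 0.8),
    "region D": (1.2, 2.0),
}
for name, (low, high) in examples.items():
    print(name, "->", interpret_scaling_range(low, high))
```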
Figure 6 shows that the anthropogenic influence, ANTHRO, is detected over the Northern Hemisphere as a whole, over all three continental regions examined, and over all of the 14 Giorgi regions except for the Central North America (CNA) region. Similarly, the combined anthropogenic and natural influence, ALL, is detected in all regions apart from CNA. Where detected, ALL has a scaling factor (βALL) consistent with 1 for many of the regions, that is, the 5–95% range encompasses 1. In these regions ALL has a good match with the observations and requires no significant scaling. For the NA, AS, WNA, and NAS regions the 5–95% range of βALL is less than 1, suggesting that the model is warming more than is observed in those regions and has to be scaled down. For the SAH region the 5–95% range of βALL is greater than 1, suggesting the model is warming less than observed and so has to be scaled up.
 The detection of anthropogenic influences across the Northern Hemisphere is consistent with results by Christidis et al., where the anthropogenic influence is detected in the change of the JJA seasonal mean of nighttime temperatures over the last 50 years of the 20th century.
 Anthropogenic influence is detected separately in combination with natural factors over the northern hemisphere as a whole, and over all three continental scale regions. For the smaller-scale Giorgi regions, similarities between the patterns of temperature change estimated by the ALL and ANTHRO simulations mean that it is more difficult to separate the anthropogenic and natural components of the observed temperature changes. However, anthropogenic influence is detected in combination with natural factors in 8 of the 14 Giorgi regions.
 Anthropogenic factors in the Mediterranean basin are detected when in combination with natural factors, similar to the result in the work of Stott et al. , but contrary to that study, NATURAL is also detected. NATURAL is also detected in combination with ANTHRO in several regions and in the large regions of the Northern Hemisphere, North America, and Europe.
 The detection of anthropogenic and natural factors in the western and eastern North America regions is in contrast to the lack of detection of these factors separately in central North America. As shown earlier, the central North America region does not have as much warming at the end of the 20th century as is simulated and there are much larger observed temperature variations in the midcentury than simulated. Investigations of the possible cause of the increased variability in the midcentury across central North America [Schubert et al., 2004; Sutton and Hodson, 2005; Kunkel et al., 2006] indicate that internal variability, perhaps associated with ocean multidecadal variations such as the Atlantic Multidecadal Oscillation, which would not deterministically be simulated by models, and land/atmosphere processes are dominating the observed changes. Additionally, HadGEM1 does not simulate ENSO variability very well [Johns et al., 2006], which can influence North American continental temperatures. However, as Figure 5 shows, the model is not underestimating the internal variability in the 2–9 year period band, associated with ENSO variability, across North America.
 To help separate the contributions to observed climate variations from externally forced changes and internal variability, a number of methods could be considered in future research. These include increasing the number of ensemble members or using multimodel ensembles [Hegerl et al., 2007] to improve the signal-to-noise ratios, or alternatively applying a multivariable analysis in which other variables, such as other temperature fields [Jones et al., 2003], precipitation [Karoly and Braganza, 2005], or circulation patterns [Wu and Karoly, 2007], are incorporated into the analysis.
5. Warm Summers
 As described in section 1, previous studies have used climate models to examine the change in likelihood of extreme warm summers in the future. Future changes in this statistic are important as it is closely related to mortality rates due to excessive summer heat [e.g., Trigo et al., 2005; Garcia-Herrera et al., 2005] and other socioeconomic impacts [IPCC, 2007b]. However, model simulations of such extreme events are more difficult to validate or compare with the past than mean changes. Since extreme warm events are rare by definition, comparison with model simulations is more prone to uncertainties.
 One way of exploring this is as described by Stott et al. . An analysis, as described in section 4, provides attributed warming for different forcing agents, which are then used with estimates of seasonal variability to calculate the change in frequency of future extreme events. However, this method effectively only validates modeled changes in long term means (in this case 10 year long means) against the observed changes and not changes in the frequencies of the extremes over time. Here we describe a method that, while it does not directly examine very extreme warm events and how they change, nonetheless compares changes in the frequencies of more common warm summers between model and observations.
 We calculate how the frequency of an “historic” 1 in 10 year warm summer changes over time. For the observations and the model simulations we use ordered statistics to estimate the 90th percentile of summer temperatures at each grid point for the 1961–1990 period, provided at least 25 of the possible 30 data points are available. This gives the threshold for a 1 in 10 year warm summer in that period for that grid point. We then calculate the number of times that threshold is crossed, again for each grid point and per number of available data points, during a moving 10 year window for all the available remaining data between 1900 and 2006. The average number of threshold crossings per 10 years is then calculated for each of the regions. We calculate the ensemble average for the ALL and NATURAL simulations. This statistic gives the average probability of a particular grid point in a region exceeding its “historic” 1 in 10 warm summer threshold. By averaging across an area, we increase the number of points available to calculate the exceedance statistics, a methodology similar to techniques used previously [Clark et al., 2006; Barnett et al., 2006; Weisheimer and Palmer, 2005].
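The exceedance statistic described above can be illustrated with a short sketch. It is not the code used in this study: the function names and the data layout (one list of summer means per grid point, with None for missing values) are hypothetical, the percentile uses linear interpolation between ordered values, and grid points are averaged without area weighting, all of which are assumptions.

```python
import math


def percentile(values, q):
    """Percentile q (0-100) of a non-empty list, by linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def exceedance_per_decade(series, years, base=(1961, 1990),
                          min_base=25, window=10):
    """Regional mean number of threshold crossings per `window` years.

    series: one list of summer temperatures per grid point, aligned with
            `years`; None marks a missing value.
    Returns a list of (window_end_year, crossings_per_window) pairs.
    """
    # Threshold per grid point: 90th percentile over the base period,
    # only where enough base-period data are available.
    thresholds = []
    for temps in series:
        base_vals = [t for t, y in zip(temps, years)
                     if base[0] <= y <= base[1] and t is not None]
        enough = len(base_vals) >= min_base
        thresholds.append(percentile(base_vals, 90) if enough else None)

    result = []
    for start in range(len(years) - window + 1):
        rates = []
        for temps, thr in zip(series, thresholds):
            if thr is None:
                continue
            vals = [t for t in temps[start:start + window] if t is not None]
            if vals:
                # Crossings per available data point, scaled to the window.
                crossings = sum(t > thr for t in vals)
                rates.append(window * crossings / len(vals))
        mean_rate = sum(rates) / len(rates) if rates else None
        result.append((years[start + window - 1], mean_rate))
    return result
```

By construction a stationary record gives a value close to 1 crossing per 10 years, so values well above 1 indicate that the “historic” threshold is being exceeded more often than in 1961–1990.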
Figure 7 shows the average exceedance per moving 10 year window for the observations and model simulations. For the large-scale continental regions the model compares favorably with the observed rate of exceedances. For the NH as a whole, observed “historic” 1 in 10 summers are now occurring 4 years in 10 with ALL simulating the changes since 1900 very well and NATURAL unable to explain the changes since the 1970s. The observed continental means are also captured very well by the model with “historic” 1 in 10 summers now occurring between 3 and 5 years in every 10 years. For all except one of the 14 Giorgi regions, observed “historic” 1 in 10 year events have now become, in the last 10 years, at least 3 in 10 year events whilst such events have become 6 in 10 year events in the Sahara and 7 in 10 year events in the Mediterranean basin. In central North America, ALL overestimates the relatively small increase in observed exceedances and does not match the observed changes in the 1930s and 1940s, although the changes are mostly within the estimates of internal variability. For most of the 14 Giorgi regions the observed increases in exceedances are well captured by the model.
 The future projections suggest the possibility of warm summers exceeding the “historic” 1 in 10 year threshold every other year within the next decade. By the mid-21st century many of the regions could see summers rarely being colder than the “historic” 1 in 10 year warmest summer threshold.
 With only two realizations of future climate projections by HadGEM1, there are unquantified uncertainties in the exact rates of future increase in exceedance, but similar conclusions are found when the analysis is repeated with the HadCM3 model [Stott et al., 2000] (Appendix A1). Similarly, the sensitivity of the observed changes to the inclusion of observational uncertainty [Brohan et al., 2006] (Appendix A2) is not large compared to the overall changes during the period examined.
 Overall, HadGEM1 successfully captures the variations of past summer temperatures in the great majority of NH regions examined, and projects an increasing incidence of very warm summers in the future, a conclusion supported by other modelling studies [Schär et al., 2004; Weisheimer and Palmer, 2005; Barnett et al., 2006; Meehl and Tebaldi, 2004]. Surface-atmosphere coupling is likely to play a role, with feedbacks in soil moisture, temperature and precipitation particularly important in summers [Seneviratne et al., 2006; Fischer et al., 2007; Rowell and Jones, 2006], and further improvements in data and modelling for these factors could be important for increasing the accuracy of projections of future summer temperatures. The dominant influence of anthropogenic factors has been detected in almost all of the regions. This study has shown that historically infrequent very warm summers have already become much more frequent as a result of anthropogenic increases in well mixed greenhouse gases and other anthropogenic factors, not only in Europe but also across much of the Northern Hemisphere. The risk of hot summers is currently increasing rapidly, raising the likelihood of record-breaking heat waves around the world, as seen in Europe in 2003 and 2006 and in North America in 1995 and 2006.
A1. Results Using HadCM3
 There may be differences between climate models that affect the historic simulations and their future projections. This can be demonstrated by looking at the results from HadCM3 [Johns et al., 2003; Stott et al., 2000] model simulations (Table 4), which were processed in the same way as for HadGEM1, as described in section 2. Ten year running means of regional summer temperatures are shown in Figure A1. The overall results are very similar to those for HadGEM1 (Figure 4), with only the HadCM3 ALL simulation matching the observed changes over the 20th century for many of the regions, and NATURAL unable to match the observed warming over the last 20 years. Similar to the results above, the HadCM3 ALL simulation does not capture the warming in the 1930s and overestimates the warming after about 1980 in Central North America. It also underestimates the warming over the last 20 years in the Mediterranean basin and the Sahara, while HadGEM1 is closer. The projections of change into the 21st century are broadly similar between the two models, with the difference in the Mediterranean basin perhaps more related to differences in the late 20th century. The B2 scenario demonstrates that the spread between different scenarios may be wider than that between the models. Comparing more models [e.g., Weisheimer and Palmer, 2005] is outside the scope of this study.
A2. Observational Uncertainty
 To investigate the impact of observational uncertainty, we use estimates of uncertainties for the nonvariance corrected observational data set CRUTEM3 [Brohan et al., 2006] to create a number of realizations of the observations that sample the uncertainties. We use CRUTEM3 here instead of CRUTEM3v, which is used in the rest of this study, because the uncertainties are provided for this data set. However, the differences between the two data sets are not significant for this analysis. Uncertainties in three major groups (sampling, station, and bias) are given in the study by Brohan et al. . A further uncertainty, coverage uncertainty, is not incorporated because it concerns the uncertainty in the mean temperature of an area due to not having total area coverage for all time points (e.g., Figure 1). The observational coverage uncertainty is most important when comparing area averages of temperatures with different data coverages [Brohan et al., 2006]. However, as we are here comparing with model results that have the same coverage, by construction, as the observations, it is not appropriate to include that uncertainty. We create 100 realizations of CRUTEM3, each one sampling the distribution space of the uncertainties. For each realization the regional means and the 1 in 10 summer exceedance statistics are calculated. From the resulting realizations we find the 5–95% range, from ordered statistics.
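The realization procedure can be sketched as follows. This is a simplified illustration and not the procedure applied to CRUTEM3: it assumes independent Gaussian errors with a known standard deviation for each data point, whereas some of the uncertainty components described by Brohan et al. are correlated, and the function names and arguments are hypothetical.

```python
import random


def percentile(ordered, q):
    """Percentile q (0-100) of an already sorted list, linear interpolation."""
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def uncertainty_range(obs, sigma, statistic, n_real=100, seed=0):
    """5-95% range of `statistic` over perturbed copies of `obs`.

    obs:       list of observed values
    sigma:     list of standard deviations, one per observed value
               (assumed independent and Gaussian in this sketch)
    statistic: function mapping a list of values to a single number,
               e.g. a regional mean or an exceedance count
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    stats = []
    for _ in range(n_real):
        # One realization: each value perturbed within its own uncertainty.
        realization = [o + rng.gauss(0.0, s) for o, s in zip(obs, sigma)]
        stats.append(statistic(realization))
    stats.sort()  # ordered statistics of the realizations
    return percentile(stats, 5), percentile(stats, 95)
```

Passing the exceedance calculation as `statistic` would give a range for the exceedance statistics in the same way as for the regional means.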
Figure A3 shows that the inclusion of observational uncertainty has only a limited impact on the overall changes of temperature in the regions. For the largest spatial scale regions the uncertainties only have a noticeable effect before the 1940s. Figure A4, obtained by repeating the methodology described in section 5 for changes in exceedances of “historic” 1 in 10 summers, shows the uncertainties in the observed changes in exceedances of 1 in 10 warm summers and again shows that the impact of observational uncertainty is limited.
 We thank Philip Brohan for providing detailed advice on the use of the CRUTEM3v data set and the many colleagues of the Hadley Centre, Met Office who developed and ran the climate models. We would also like to thank the anonymous reviewers and Juerg Luterbacher for their very helpful comments and suggestions regarding this paper. This research was funded by the UK Department of the Environment, Food and Rural Affairs under contract PECD 7/12/37.