How well do atmospheric reanalyses reproduce observed winds in coastal regions of Mexico?

Atmospheric reanalyses are widely used for understanding the past and present climate. They are increasingly used within the renewable energy sector, in conjunction with observations, for assessing wind and solar resources in different regions of the globe. Mexico is a country with considerable potential for wind energy production, especially around coastal sites, and the characterization of the wind resource in these areas is therefore imperative for the most beneficial use of these resources. In this study, we assess how well three global reanalyses, namely ERA-Interim, ERA5 and MERRA-2, can reproduce wind observations at a number of key sites across the country. We find that the reanalyses' ability to reproduce these observations is highly variable between different regions in Mexico. Correlation coefficients are around 0.9 in the south of the country where the winds are strongest, but much lower (around 0.5) in Baja California Sur due to the complex coastal topography of the region. ERA5 outperforms ERA-Interim and MERRA-2 consistently across the vast majority of sites and so this reanalysis is recommended for local wind power studies.


| INTRODUCTION
Atmospheric reanalyses are an important tool to investigate the past and present climate, particularly in regions where meteorological observations are sparse. They consist of a combination of comprehensive observations (e.g., satellites, in situ weather stations, radiosondes) assimilated into a state-of-the-art numerical weather model (Fujiwara et al., 2017). In this study, we investigate how well atmospheric reanalyses can reproduce the wind speeds observed across a variety of sites in Mexico, and the consequences of the results for the wind power industry. The three reanalyses we investigate are ERA5 (Copernicus Climate Change Service, 2017; Hersbach et al., 2020) and ERA-Interim (Dee et al., 2011), both from the European Centre for Medium-range Weather Forecasts (ECMWF), and MERRA-2 (Molod et al., 2015) from NASA. We utilize the observations collected during a field campaign that took place between 2005 and 2007, funded by the United Nations Development Programme Global Environmental Finance (UNDP-GEF) Unit and implemented by Mexico's Instituto de Investigaciones Eléctricas (now Instituto Nacional de Electricidad y Energías Limpias) (UNDP, 2012).
Reanalyses are frequently used for studies of potential wind power generation and its associated weather conditions, owing to their global coverage and the length of time they cover, and thus it is important to investigate how well they are able to reproduce localized wind observations. The wind energy sector is one of the biggest users of reanalyses (Gregow et al., 2015). This is because it is hugely important for the wind energy sector to understand the impacts of hourly to inter-annual variability on potential and existing wind power generation for the country of interest (Brower et al., 2013). Examples of the use of reanalyses for wind energy purposes are therefore growing rapidly. In the United Kingdom, Kubik et al. (2013) and Cannon et al. (2015) used the MERRA reanalysis to investigate 33 years of wind power generation. Cannon et al. (2015) noted a high correlation with observations, but the reanalysis systematically produced lower winds than those observed across the country. Furthermore, the performance of the reanalysis was statistically weaker over high orography.
Recent studies have also used atmospheric reanalyses to establish the key weather patterns that drive winds and wind power generation in the United Kingdom and Europe (e.g., Bloomfield, Brayshaw, & Charlton-Perez, 2020a; Bloomfield, Suitters, & Drew, 2020b; Collins et al., 2018; Grams et al., 2017; van der Wiel et al., 2019) and are thus dependent on the ability of the reanalysis to accurately reproduce both large- and small-scale meteorological patterns. To look at higher-resolution wind power patterns, González-Aparicio et al. (2017) downscaled the MERRA reanalysis over Europe. However, using reanalyses to study small-scale variability has limitations, as some important meteorological phenomena, such as mesoscale convective systems, can be poorly represented by reanalyses (Rivas & Stoffelen, 2019). Reanalyses have also been used in areas with sparse observations, such as the ocean surrounding Antarctica, to quantify bias in ship anemometric observations (Landwehr et al., 2019). Recently, reanalyses have begun to be used for studies of wind power in Mexico. Thomas et al. (2020) investigated the key drivers of wind power generation in Mexico using ERA5, and Morales-Ruvalcaba et al. (2020) used MERRA-2 to deduce capacity factors at different sites across the country. As atmospheric reanalyses are not designed for use on small scales, such studies for individual wind farm locations are very sensitive to how well each reanalysis is able to reproduce observations at these sites.
A number of studies have attempted to quantify which atmospheric reanalysis has the ability to best simulate different meteorological variables in different regions of the globe (e.g., Kaiser-Weiss et al., 2015; Kumar & Hu, 2012) and some used interpolation to re-grid observations and compare them to reanalyses' output for different variables for the entire globe (e.g., Donat et al., 2014; Ramon et al., 2019). For the United States, Rose et al. (2012) investigated how several reanalyses performed at reproducing observed winds and wind power metrics. They found spatial variability in the performance of the reanalyses, with some areas much better represented than others. A few studies have focused on investigating whether surface wind observations can be reproduced by reanalyses. Carvalho (2019) compared surface winds for MERRA-2 with a number of other global reanalyses including ERA-Interim and found that each reanalysis had its own strengths and weaknesses in different areas of the globe. MERRA-2 was found to have less error nearer the poles, but its coarser resolution at lower latitudes means that it may not be the most reliable for the sub-tropics. Rivas and Stoffelen (2019) investigated how both ERA-Interim and ERA5 compared with ASCAT satellite wind vector observations over the oceans, finding that ERA5 was a much better match globally with the observations. Some studies have focused on how well wind speeds are reproduced regionally. Olauson (2018) compared winds at turbine height for MERRA-2 for Sweden with ERA5 data, finding that for Sweden, ERA5 performed better in all metrics. Alvarez et al. (2014) investigated the performance of MERRA-2, ERA-Interim and NCEP Reanalysis II in reproducing ocean-surface wind speeds in the Southern Bay of Biscay. NCEP reanalyses were also compared with observations in the United Kingdom by Sharp et al. (2015). A full description of how ERA5 and MERRA-2, along with two high-resolution models, performed at reproducing winds and wind power in France was given by Jourdier (2020), who found ERA5 was very highly skilled at reproducing observations in the country, but with large negative biases over mountainous regions. On a global scale, Ramon et al. (2019) explored how well five different atmospheric reanalyses can reproduce wind speed observations at a large spread of sites (however with none in Mexico). They concluded that ERA5 performs the best for short-term wind events, but on longer time-scales, no one reanalysis stood out. Furthermore, tropical cyclones (Hodges et al., 2017; Malakar et al., 2020) and extratropical cyclones (Wang & Isaac, 2016) have been assessed in terms of how well they are represented in widely used atmospheric reanalyses, finding good agreement in locations, but underestimates in the intensities of the cyclones depending upon the dataset used.
None of these comparisons have focused on long-term wind observations in Mexico, except for a short period covered by intense anemometric observations between 2005 and 2007 (Morales-Ruvalcaba et al., 2020). Morales-Ruvalcaba et al. (2020) used linear interpolation to map the reanalysis data to the locations of their observations, and a logarithmic profile to interpolate from the reanalysis output heights to the height of the anemometers. Using this method, they found that for most of Mexico, MERRA-2 is able to reproduce local observations well (after bias correction), but this skill is not reproduced at all sites. As Mexico is a country characterized by a wide variety of geographical conditions, and one that has recently been investing in the wind power industry, it is important to deduce the feasibility of using reanalyses for long-term wind studies. Furthermore, with new reanalyses being released, such as ERA5 and MERRA-2, it is important to include these newer datasets in these studies. In this study, we expand this analysis to include the ERA-Interim and ERA5 reanalyses, only for the period between 2005 and 2007 when a wind energy focused observation campaign took place in Mexico. We investigate whether any of these reanalyses improve our ability to reproduce observations at each of the sites across the country.
The rest of the article is organized as follows. In Section 2, we outline the methodology, including the data and reanalyses utilized in this study. In Section 3, we outline our results: the statistical correlations and mean-square errors of the atmospheric reanalyses with respect to the observations, how changing the temporal resolution influences these correlations, and an investigation into why the Veracruz observation site has very low statistical correlation. Finally, in Section 4, we summarize our findings.

| Observations
In this study, we utilize a set of anemometric observations commissioned by the UNDP-GEF unit and implemented by Mexico's Instituto de Investigaciones Eléctricas (now Instituto Nacional de Electricidad y Energías Limpias), which took place between 2005 and 2007 (UNDP, 2012). The selected eight weather stations were those in continuous operation at a high temporal resolution for the whole of 2006, allowing for direct comparisons for the same time period and hence meteorological conditions (see Morales-Ruvalcaba et al., 2020; Thomas et al., 2020). They represent a reasonable spread of geographic locations across the country, as shown in Figure 1. Most of the sites included here are close to sea level, meaning that they represent the coastal regions of the country well, but the mountainous regions in the centre of the country less well. Table 1 gives further details about the stations shown in Figure 1, as well as the station codes used later in this study, and the regions the stations sit within. The observations are not assimilated by any of the reanalyses included in this study, and therefore they are independent of the reanalysis data sets.
FIGURE 1 The altitude from the ERA5 reanalysis across Mexico, where darker colours represent high altitude and lighter colours show near sea level. The locations of the anemometric stations used in this study are shown by the blue diamonds and bold labels (see Table 1 for details on stations).
The anemometric stations use cup anemometers that take wind speed measurements every 10 min at either one, two or three separate heights ranging between 10 and 50 m above ground level. At the same heights, wind vanes are mounted to provide wind directions. Whilst the data are not quality controlled for tower wake effects, the measurements are tailored for the wind energy industry and so each weather station is well clear of any obstructions that could disrupt the fetch in any direction. Further information regarding these weather stations and associated data can be found in Morales-Ruvalcaba et al. (2020) and references therein.

| Atmospheric reanalyses
In this study, we utilize the ERA-Interim and ERA5 atmospheric reanalyses from the ECMWF. ERA-Interim makes use of the four-dimensional variational analysis (4D-Var) assimilation system (Dee et al., 2011). More recently, ERA5 has been released with higher spatial and temporal resolution than ERA-Interim. It also uses the 4D-Var data assimilation scheme and goes back to 1979 (very recently extended back to 1950). Whilst ERA5 has both a greater temporal and spatial resolution, ERA-Interim has been widely used in the meteorology community and hence is included in this study. The comparison between these data sets also gives a good example of the effect of the change in resolution, as the underlying models have not been significantly updated between versions. The fields used in this study are the 10 and 100 m wind vectors from ERA5, and the 10 m wind vectors and the wind vectors on the model level closest to 100 m from ERA-Interim. The height of this model level at the stations corresponds closely to 100 m, to match the analysis of the ERA5 data.
The Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) (Gelaro et al., 2017) is the successor of the previous reanalysis from NASA, MERRA. It is an improvement on version 1 as it adds considerably more satellite data, an improved data assimilation scheme for ozone and aerosol, improved representation of both the stratosphere and the land surface, as well as a reduction in the number of jumps associated with the addition of local observations (Molod et al., 2015). Most notably for this study, wind speeds in MERRA-2 are much more in line with other reanalyses and observations around the globe (Carvalho, 2019). Therefore, in this study, we will only include the latest version for comparison. MERRA-2 data run from 1980 to near-present with an hourly temporal resolution, similarly to ERA5, but with a coarser spatial resolution. It is one of the few global reanalyses that assimilates data from the entire constellation of NASA EOS satellites. The fields used from MERRA-2 are the 10 and 50 m wind vectors.
Table 2 shows that the three reanalyses run over approximately the same period, but have different spatial and temporal resolutions. ERA5 has the highest spatial resolution, about 28 km at the latitudes spanned by Mexico, which compares favourably to about 60 km for MERRA-2 and about 70 km for ERA-Interim. It also has an hourly temporal resolution, compared with 6-hourly for ERA-Interim. Furthermore, it has 137 terrain-following model levels compared with 72 for MERRA-2 and 60 for ERA-Interim, resulting in a higher vertical resolution (Ramon et al., 2019). Wind-component output at 100 m is also useful when interpolating parameters to the height of the observations. The data assimilation scheme used to construct each reanalysis differs too. MERRA-2 uses a 3D-VAR-FGAT (First Guess at Appropriate Time) scheme (Gelaro et al., 2017), whereas ERA-Interim and ERA5 use a 4D-VAR scheme.
For most of the selected anemometric stations, wind speed and direction measurements are taken every 10 min at two heights: 20 and 40 m. For direct comparison with these observations, each atmospheric reanalysis has to be interpolated from the nearest model grid points to the observation location and then extrapolated to the height of the observation. For the horizontal interpolation to the station location, a bi-linear scheme is used from the four closest grid points in each reanalysis. The logarithmic profile given in Equations (1)-(3) is used to extrapolate the wind speed data (U) to the height (z) of the observation, where z1 is the height of the lower anemometer and z2 is the height of the higher.
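The horizontal step described above can be sketched as follows. This is a minimal illustration of bi-linear interpolation from the four grid points surrounding a station, not the authors' code; the function name, the argument order and the example coordinates and wind values are all hypothetical, and the source does not state whether the scheme is applied to the wind components or to the wind speed.

```python
def bilinear_interpolate(lon, lat, lon0, lon1, lat0, lat1, f00, f10, f01, f11):
    """Bi-linear interpolation of a gridded field to the point (lon, lat).

    The point is assumed to lie inside the grid cell bounded by
    lon0 <= lon <= lon1 and lat0 <= lat <= lat1. Corner values are
    f00 at (lon0, lat0), f10 at (lon1, lat0),
    f01 at (lon0, lat1), f11 at (lon1, lat1).
    """
    # Fractional position of the station inside the cell (0 to 1).
    tx = (lon - lon0) / (lon1 - lon0)
    ty = (lat - lat0) / (lat1 - lat0)
    # Interpolate along longitude on the two latitude rows ...
    south = f00 * (1.0 - tx) + f10 * tx
    north = f01 * (1.0 - tx) + f11 * tx
    # ... then along latitude between the two rows.
    return south * (1.0 - ty) + north * ty


# Hypothetical 0.25-degree cell and wind values (m/s), for illustration only.
value_at_station = bilinear_interpolate(
    -94.95, 16.55, -95.0, -94.75, 16.5, 16.75, 6.0, 8.0, 7.0, 9.0
)
```

Because the weights depend only on the station's position inside the cell, a coarser grid spreads the same four-point average over a much larger area, which is one reason coastal sites are sensitive to resolution.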
For ERA5 and MERRA-2, two near-surface wind components are available, at 10 and 100 m in ERA5 and at 10 and 50 m in MERRA-2. These values are used to interpolate to the 40 m observation (or the single observation at 15 m for BCS1 and BCS2) via Equation (1). However, for ERA-Interim, there are only 10-m wind components and then 60 model levels. We thus use model level 57, whose nominal height is 100 m (Berrisford et al., 2011), and convert from geopotential to geometric height, which accounts for the altitude of the anemometric station. Whilst this interpolation method has limitations due to turbulence in the wind induced by land use changes and upstream vegetation (Kent et al., 2018), the observation sites were chosen to have long fetches clear of obstruction and thus the logarithmic assumption is reasonable in this case. An unavoidable consequence of the different output heights of the atmospheric reanalyses is that any errors introduced by the logarithmic fit have the potential to produce some small differences between the reanalyses. For example, MERRA-2 has output at 50 m (but not 100 m), meaning that the fit is constructed from two points that are much closer together than the 10 and 100 m values from ERA5, which could provide a better fit when reproducing 40 m wind values.
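As Equations (1)-(3) are not reproduced in this text, the vertical step is sketched here using a common two-level form of the logarithmic wind profile, in which wind speed varies linearly with the logarithm of height between two known levels. Treat this as an assumption about the form of the fit rather than the authors' exact equations; the function name and the example values are hypothetical.

```python
import math


def log_profile_speed(u1, z1, u2, z2, z):
    """Wind speed at height z from speeds u1 at z1 and u2 at z2 (z2 > z1 > 0).

    Assumes U(z) = a * ln(z) + b, which is the neutral logarithmic profile
    with the friction velocity and roughness length absorbed into a and b.
    """
    slope = (u2 - u1) / math.log(z2 / z1)  # change in speed per unit ln(height)
    return u1 + slope * math.log(z / z1)


# Hypothetical speeds (m/s) at the output heights named in the text.
u40_from_era5_levels = log_profile_speed(5.0, 10.0, 7.0, 100.0, 40.0)   # 10 m and 100 m
u40_from_merra2_levels = log_profile_speed(5.0, 10.0, 6.4, 50.0, 40.0)  # 10 m and 50 m
```

With levels at 10 and 50 m the 40 m target sits close to the upper fitting point, whereas with levels at 10 and 100 m it lies further from both, which is the sensitivity to output heights noted above.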
Once the observed and modelled wind speeds are established at the same location and height, the observations are averaged to 6-hourly, 12-hourly and daily means for direct comparison with each of the reanalyses. The same averaging is applied to the reanalyses, giving equivalent time series at these resolutions for each of the three reanalyses as well as the observations.
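The averaging step can be illustrated with a short sketch that groups a regularly sampled series into fixed-length blocks. The names and the synthetic data are hypothetical, and the treatment of missing samples (skipped, with an all-missing block returned as None) is an assumption, since the text does not say how gaps were handled.

```python
def block_means(values, samples_per_block):
    """Mean of each consecutive block of samples; None marks a missing sample."""
    means = []
    for start in range(0, len(values), samples_per_block):
        block = [v for v in values[start:start + samples_per_block] if v is not None]
        # Assumption: a block with no valid samples has no defined mean.
        means.append(sum(block) / len(block) if block else None)
    return means


SAMPLES_PER_HOUR = 6  # one observation every 10 min

# Two synthetic days of 10-min wind speeds (m/s), for illustration only.
ten_minute_speeds = [float(i % 12) for i in range(2 * 24 * SAMPLES_PER_HOUR)]

six_hourly = block_means(ten_minute_speeds, 6 * SAMPLES_PER_HOUR)
twelve_hourly = block_means(ten_minute_speeds, 12 * SAMPLES_PER_HOUR)
daily = block_means(ten_minute_speeds, 24 * SAMPLES_PER_HOUR)
```

The same function applies to hourly reanalysis output by passing 6, 12 or 24 samples per block.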

| Satellite scatterometer data
The NASA Quik Scatterometer (QuikSCAT) mission is used in this study to investigate times when the interpolated reanalysis data diverged from the observations, by providing large-scale wind observations against which to compare the reanalysis fields. QuikSCAT was launched on 19 June 1999 and aimed to accurately detail surface wind speeds over the oceans. Whilst the mission therefore does not observe winds at the observation sites themselves, the sites are coastal and so QuikSCAT does provide evidence of winds near the observations. QuikSCAT accurately observes wind speeds in all conditions except for heavy rain (Hoffman & Leidner, 2004). It also has a higher spatial resolution than ERA5, and whilst its 1800 km wide swath allows for excellent global coverage, there are some areas each day that are not covered by the swaths of the satellite. Previous work has compared satellite observations of winds with ERA reanalyses (e.g., Rivas & Stoffelen, 2019), but in this study, we instead use these data to investigate times when there is a disagreement between the observations and the reanalyses.

| Correlations between observations and reanalyses
We now compare the daily mean interpolated atmospheric reanalyses and observations for the whole of 2006. Figure 2 shows a tendency for the reanalyses to underestimate the wind speeds at most stations, particularly in the case of ERA5 (red), an observation also noted in Thomas et al. (2020). This is particularly clear for Chiapas, Oaxaca and Veracruz, as shown by the histograms on the right-hand side. ERA5 consistently underestimates wind speeds at the majority of sites. This is most striking in the histogram of wind speeds at OA01, where a second peak at higher wind speeds is seen in the observations but not in any of the reanalyses. The second peak is caused by the strong, northerly Tehuanos winds that pass through the Isthmus of Tehuantepec, particularly during the winter. The fact that the reanalyses do not reproduce this peak is likely due to their relatively low spatial resolution, which gives a poor representation of these winds through the mountains. This also explains the underestimates of wind speeds at VZ02 and CI01, situated in that same region. Underestimation of the wind speed is seen at most sites across Mexico with the exception of YC01, which has a clear ordering that persists throughout the year: ERA-Interim overestimates wind speeds, MERRA-2 underestimates them and ERA5 is rather accurate. At some sites, there are wind speed events that are either picked up by some reanalyses but not all (e.g., around day 210 at San Hilario, where a brief period of stronger wind is only seen in MERRA-2, but is overestimated), or seen by no reanalyses (e.g., early autumn in Veracruz, where observed sustained stronger winds are not reproduced by any reanalysis).
To quantitatively investigate how well each reanalysis reproduces the observations, we first compute Pearson correlation coefficients (R) between each reanalysis and the observations for each of the anemometric stations. The results are shown in Table 3. For most of the stations, there is a good correlation between each of the reanalyses and the observations, with most having R > 0.8. Chiapas has the largest correlation coefficient for each of the three reanalyses. ERA5 has the strongest correlation with the observations in comparison with ERA-Interim and MERRA-2 for all stations apart from Chiapas, where its correlation is only slightly lower than that of ERA-Interim. In some cases, the improvement from using ERA5 is as much as a 0.25 increase in the correlation coefficient (Baja California Sur sites). This is likely to be because of the increased spatial resolution of ERA5 compared with the other reanalyses. The coastal locations of these sites mean that reanalyses with a coarser resolution may have a large error in where the coastline lies and may treat the site as being either over ocean or further inland than it is, increasing the error.
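For reference, the Pearson correlation coefficient used throughout this section can be computed as in the sketch below. The function name and the example series are hypothetical; the example is chosen to show that a reanalysis can be biased low and still correlate perfectly with the observations, which is why correlation and error are reported separately.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily mean wind speeds (m/s): the model is always too low
# but rises and falls with the observations.
observed = [4.0, 6.0, 8.0, 10.0]
modelled = [3.0, 4.0, 5.0, 6.0]
r = pearson_r(observed, modelled)
```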
The two Baja California Sur stations have the smallest correlation coefficients between reanalyses and observations. This is likely due to the complex geography of the peninsula. The peninsula itself is only between 40 and 240 km wide, and so in parts may not even be resolved by ERA-Interim or MERRA-2. Furthermore, the centre of the peninsula has somewhat complex orography with a mountain range running down the length of the region, as can be seen in Figure 1. Thus, it is unsurprising that reanalyses might have some difficulty in reproducing winds in this region.
These stations are the only two that contain a period where a tropical cyclone was present in the region. The first, on 24 July 2006, was tropical storm Emilia, whose track took it close to the west coast of Baja California Sur. The second, on 3 September, was Hurricane John, which made landfall as a category 2 hurricane at the southern tip of Baja California before moving northwards towards BCS1 and BCS2 and weakening (Pasch et al., 2009). Whilst tropical cyclone locations are generally well represented by reanalyses (Hodges et al., 2017), the intensities are often underestimated (Schenkel & Hart, 2012), and thus we test the effect of removing these events from the data series before computing correlations. When these two tropical cyclones are removed from the data, the correlation does not increase, and so the representation of tropical storms by the reanalyses plays little or no role in the low correlation with the observations in this region for these particular storms. However, the improvement of the correlation coefficients for ERA5 is evidence that improving the resolution of reanalyses goes a long way towards improving the representation of meteorological variables in this region.
We next investigate how the performance of each atmospheric reanalysis changes firstly with the temporal resolution used and secondly with the time period investigated. Table 3 shows how the period over which the data are averaged influences the Pearson correlation coefficient between each reanalysis and the observations. For each anemometric station and reanalysis, the correlation becomes greater with lower temporal resolutions. There is much less of a difference in the correlation coefficients between ERA5 and the observations across the different time resolutions than there is for the other two reanalyses. ERA-Interim shows the biggest improvement from the 6-hourly resolution to the daily, which is most obvious for the Sinaloa (SI01) station where R changes from approximately 0.4 to 0.8. Indeed, for the 6-hourly resolution, ERA-Interim has R < 0.6 for all but Chiapas and Oaxaca, situated in the Isthmus of Tehuantepec, which is known for experiencing sustained wind patterns (Thomas et al., 2020) due to the Tehuanos winds that blow from the north through the gap in the mountains (Hurd, 1929; Prósper et al., 2019; Steenburgh et al., 1998). The low 6-hourly correlations are likely due to the fact that ERA-Interim data have a 6-h resolution, and so only a single instantaneous value, rather than a mean of several, is being compared with the mean of the observations over that period. The increase in correlation also suggests that there is not so much of a diurnal cycle in the reanalyses at the sites. Being able to accurately model the diurnal cycle in reanalyses would be useful for observing how wind power generation could vary on sub-daily time-scales.
TABLE 3 Pearson correlation coefficients, R, between the observations and ERA-Interim (left), ERA5 (centre), and MERRA-2 (right) for data in the time resolutions given in the subscripts of the column headers.
Table 4 shows the mean-squared error (MSE; in percent) between each of the reanalyses and the observations for each of the eight anemometric stations. For five of the eight stations, the MSE is below 1 for all reanalyses. No reanalysis has consistently lower MSEs than any other. However, for the sites to the north or west of the country (BCS1, BCS2, TM02, SI01), ERA5 does have a tendency to have lower errors than ERA-Interim or MERRA-2. The highest MSEs are generally found in the Isthmus of Tehuantepec (CI01 and OA01). The OA01 station has MSEs greater than 1 for all three of the reanalyses, with the highest error between ERA-Interim and the observations. For CI01, only ERA5 has a high MSE, whereas ERA-Interim and MERRA-2 perform much better in this metric. In this region, ERA5 has a very large negative bias in wind speed (not shown), which is likely to be the cause of the high MSE. This bias is observed across most of the country, but is much smaller at the majority of the sites.
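The link between a large bias and a large MSE follows from the identity MSE = bias² + variance of the error. The sketch below illustrates this with hypothetical values and function names. It works in raw units of (m/s)² because the normalization used to express the MSE in percent in Table 4 is not specified in this text, so no attempt is made to reproduce the tabulated values.

```python
def mean_bias(model, obs):
    """Mean of (model - observation); negative when the model is too low."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def mean_squared_error(model, obs):
    """Mean of the squared differences between model and observation."""
    return sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs)


def error_variance(model, obs):
    """Population variance of the errors about their mean (the bias)."""
    bias = mean_bias(model, obs)
    return sum(((m - o) - bias) ** 2 for m, o in zip(model, obs)) / len(obs)


# Hypothetical daily means (m/s): perfectly correlated but biased low.
observed = [4.0, 6.0, 8.0, 10.0]
modelled = [3.0, 4.0, 5.0, 6.0]

bias = mean_bias(modelled, observed)
mse = mean_squared_error(modelled, observed)
spread = error_variance(modelled, observed)
```

In this example most of the MSE comes from the squared bias rather than from the scatter of the errors, which is the situation described for ERA5 at CI01.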

| Seasonal dependence of correlation
As wind patterns across Mexico can vary by time of year, depending on the source of the winds (e.g., trade winds, tropical cyclones or cold surges from the north; Maldonado et al., 2018), it is important to look at how the correlation between each reanalysis and the observations varies with season. Figure 3 shows the variation in R with each season for each anemometric station. Each station has a section of nine coloured blocks representing the correlation between the data shown on the vertical axis against that shown on the horizontal axis. The bottom-left three blocks in each section are black as the data would be repeated from the blocks to the right of them. The colour gives a visual representation of the value of the correlation coefficient, where white or pale yellow is a high correlation coefficient of R > 0.85, whereas red is a low coefficient of R < 0.35, as shown on the colour bar at the bottom of the figure.
Firstly, examining the first row of each section of Figure 3, we observe how the observations compare to ERA-Interim (left), ERA5 (centre) and MERRA-2 (right). The correlation coefficients for most stations and seasons are relatively high (most blocks are yellow, indicating correlation coefficients of 0.8-0.9), and a seasonal dependence of the correlation coefficients at each station is observed. There are also times when the correlation falls away for some stations during particular seasons. BCS1, which displayed the lowest correlation coefficients in general, has a much stronger correlation with the reanalyses during summer and autumn (0.6-0.8) than during the winter or spring (0.2-0.5). This difference is most apparent with MERRA-2 and less pronounced in ERA5; from Figure 2d, it appears to be due to some strong wind events in the time series that are seen in the observations but are poorly represented by the reanalyses. BCS2, located less than 100 km from BCS1, exhibits much higher correlation coefficients for spring, suggesting that the source of the differences between the reanalyses and the observations here is more local in nature.
FIGURE 3 The progression of Pearson correlation coefficients for each station (y-axis) through each season (x-axis). High correlations are shown in white, whereas low correlations are shown in red. Each rectangle represents the correlation coefficient, during the season given at the top of the plot, between the observations on the y-axis and the reanalysis on the x-axis.
The greatest difference in correlation with season is seen at the VZ02 anemometric station. In winter and spring, the correlation coefficients are found to be rather high (0.8-0.9). However, this correlation decreases substantially in the summer and autumn, down to approximately 0.1-0.3. Figure 2b shows that the observations clearly have higher wind speeds for this site through the autumn than any of the reanalyses. In the next section we investigate this difference further.
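A seasonal breakdown of this kind can be sketched by grouping the daily series by season before correlating. The meteorological season boundaries used here (winter as December to February, and so on) are an assumption, as the text does not define them, and the function names and synthetic data are hypothetical. The synthetic model series is built to track the observations in winter and spring and to oppose them in summer and autumn, loosely mimicking the contrast described for VZ02.

```python
import math
from datetime import date, timedelta

# Assumed meteorological seasons for the Northern Hemisphere.
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def seasonal_correlations(days, obs, model):
    """Correlation between obs and model computed separately for each season."""
    groups = {}
    for day, o, m in zip(days, obs, model):
        obs_list, model_list = groups.setdefault(SEASON_OF_MONTH[day.month], ([], []))
        obs_list.append(o)
        model_list.append(m)
    return {season: pearson_r(o, m) for season, (o, m) in groups.items()}


# Synthetic daily means for 2006 (m/s), for illustration only.
days = [date(2006, 1, 1) + timedelta(days=i) for i in range(365)]
observed = [5.0 + math.sin(i / 3.0) for i in range(365)]
modelled = [
    o - 1.0 if SEASON_OF_MONTH[d.month] in ("winter", "spring") else 10.0 - o
    for d, o in zip(days, observed)
]
by_season = seasonal_correlations(days, observed, modelled)
```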

| Why have reanalyses underestimated winds at Veracruz?
The clear degradation of the correlation between reanalyses and observations during autumn at the Punta Delgado site in Veracruz (VZ02) requires further explanation when investigating the representation of in situ observations by reanalysis data sets. Figure 4 shows time series of both the observations and each reanalysis for autumn 2006. Figure 4b shows the difference in wind speed. Throughout the period, the wind speed is higher in the observations than in any of the reanalyses. In particular, there are several events where strong winds were recorded at the observation site but were not reproduced by ERA5, centred around 22 August, 6 September and 21 September.
Figure 4c shows the wind direction through the same period. Generally, the wind direction gives a much better match between the time series. Both the observed and ERA5 wind directions turn northerly at the same time for each of the high wind speed events. For the second and third of these events, the wind direction moves away from the north 1-2 days earlier in ERA5 than in the observations, but this does not correspond to any major changes in the ERA5 wind speed at these times. Given this, and the fact that the VZ02 anemometric station has a long fetch in the northerly direction, local effects on the observations are unlikely to be a cause of this inconsistency in wind speed.
To investigate the cause of this difference, and how widespread this area of disparity is, we make use of satellite wind speed data from the QuikSCAT mission. As described in Section 2, QuikSCAT provides wind components globally across the oceans. Thus, as VZ02 is a coastal observation site, we are able to compare ERA5 to the satellite observations in the locality of the weather station. We interpolate the QuikSCAT observations to the spatial resolution of ERA5 using linear interpolation. The interpolated QuikSCAT winds for 4 September 2006 are shown in the top-left panel of Figure 5, with the ERA5 wind speeds shown in the top-right panel. This date was chosen as it has the largest difference between the observations and the reanalyses, as displayed in Figure 4b. We then subtract the QuikSCAT data from the ERA5 data to produce the difference plot shown in the bottom panel of Figure 5, where blue regions represent where ERA5 is overestimating the wind speeds, and red shows the reverse. Areas in grey are those not covered by QuikSCAT.
FIGURE 4 ERA5 (red), ERA-Interim (blue), MERRA-2 (orange) and anemometric weather station observations (black) time series of wind speed (a) and wind direction (c) for the Punta Delgado site in Veracruz, VZ02. (b) and (d) show the differences between the observations and the reanalyses for the wind speed and direction, respectively. Green highlighted regions display times when there is a large discrepancy between the reanalyses and observations.
The difference image at the bottom of Figure 5 shows regions of much stronger observed winds than those modelled by ERA5 that match with the regions of strong winds in Figure 4. The difference in wind speed between the datasets is greater than 5 m/s in the Gulf of Mexico near VZ02. By investigating how these regions change with time over the 5-day period surrounding 4 September, we find that the location and size of the differences vary with time. During 5 September, the difference is still present close to VZ02, but the spatial extent is much smaller and limited to within 50 km of the coastline. One possibility for the cause of these differences is gust fronts generated by mesoscale convective systems that develop on the day and are not well represented by ERA5, as has previously been suggested by Rivas and Stoffelen (2019). Satellite imagery from GOES-12 (not shown) reveals organized convective cells in the regions where the largest wind differences between the observations and ERA5 are found. Whilst it is unclear whether this discrepancy is due to bias in ERA5 or error in the satellite surface wind observations below high clouds, the in situ observations from VZ02 match well with the local QuikSCAT wind speed in the region.

| CONCLUSIONS
In this study, we have compared, for the first time, interpolated atmospheric reanalyses with observations from coastal anemometric sites across Mexico throughout 2006, when detailed wind observations were available. Although reanalyses were not designed for this use, we have found that they reproduce observations remarkably well at the majority of sites, with correlation coefficients greater than 0.8 found at five of the eight sites for daily observations. The main exceptions are the two observation sites in Baja California Sur (BCS1 and BCS2), which have lower correlation coefficients of 0.5-0.8, and Veracruz (VZ02), where the correlation coefficient varies widely throughout the year, from 0.9 in the spring to 0.2 in the autumn. The lower correlations at BCS1 and BCS2 are considered to be due to the complex topography of the region, which is much better represented by the higher spatial resolution of ERA5. High correlations at a site imply that the reanalyses largely reproduce the highs and lows in wind speed, meaning that the general wind patterns are well represented. CI01 and OA01 (in the Isthmus of Tehuantepec) have relatively high mean percentage errors compared with other stations, owing to a bias in the reanalysis wind speeds. Low errors are preferable in order to provide wind energy users with accurate estimates of power generation at a particular site for long-term resource assessment. VZ02 is somewhat more complicated. By analysing the time series, the low autumn correlation was found to be due to several strong wind events at VZ02 that were not captured at all by ERA5. From analysis of QuikSCAT data, it is hypothesized that these events were caused by mesoscale convective systems that affected the area at those times but were not well reproduced in any reanalysis.
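The distinction drawn above between correlation and bias can be illustrated with a short sketch. The definitions below are assumptions made for illustration: the correlation is taken to be the Pearson coefficient, and the mean percentage error is taken to be the mean of (reanalysis minus observation) divided by observation, expressed as a percentage; the exact formulation used in this study may differ. The function names and the wind speed values are hypothetical.

```python
import numpy as np


def _paired(obs, model):
    """Return both series as float arrays with missing (NaN) pairs removed."""
    obs = np.asarray(obs, dtype=float)
    model = np.asarray(model, dtype=float)
    valid = ~(np.isnan(obs) | np.isnan(model))
    return obs[valid], model[valid]


def pearson_r(obs, model):
    """Pearson correlation coefficient between observations and model."""
    o, m = _paired(obs, model)
    return float(np.corrcoef(o, m)[0, 1])


def mean_percentage_error(obs, model):
    """Assumed definition: mean of (model - obs) / obs, in percent.

    Observations equal to zero are skipped to avoid division by zero.
    """
    o, m = _paired(obs, model)
    nonzero = o != 0.0
    return float(np.mean((m[nonzero] - o[nonzero]) / o[nonzero]) * 100.0)


# Made-up wind speeds (m/s): the "reanalysis" follows every high and low
# of the observations but is uniformly 20% too weak.
obs = np.array([4.0, 6.0, 8.0, 10.0, np.nan, 7.0])
model = 0.8 * obs

r = pearson_r(obs, model)
mpe = mean_percentage_error(obs, model)
print(f"r = {r:.2f}, mean percentage error = {mpe:.1f}%")
```

In this constructed case the correlation is essentially 1 while the mean percentage error is about -20%, which is the kind of situation described for CI01 and OA01: the wind patterns are well reproduced, yet the magnitudes are biased.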
ERA5 consistently outperforms ERA-Interim and MERRA-2 across the vast majority of the coastal sites in this study, regardless of the resolution and metric used. On average, the improvement in correlation coefficient was found to be around 0.1 compared with either MERRA-2 or ERA-Interim. One interesting exception was Chiapas (CI01), where ERA-Interim outperformed ERA5. At this site, ERA5 also had a very large negative bias with respect to the observations, which was not present in either of the other reanalyses. When the temporal resolution of the data is reduced, ERA-Interim shows the largest improvement, from very low correlation coefficients across all sites at its 6-h resolution to values more comparable with the other reanalyses at daily resolution. All reanalyses show sufficiently high correlation with observations at daily resolution to be useful tools for wind energy studies, although for modelling on sub-daily timescales, ERA5 is recommended as the best reanalysis to use. Generally, this implies that the improved resolution of ERA5 relative to ERA-Interim results in a better match to in situ observations, although this is not the case at every location.
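The effect of temporal resolution discussed above can be explored by averaging both series into coarser blocks before computing the correlation. The sketch below assumes gap-free, regularly sampled hourly series and uses non-overlapping block means; the aggregation method is an assumption made for illustration and is only one possible approach. The helper name `block_mean` and the synthetic data are hypothetical.

```python
import numpy as np


def block_mean(series, block):
    """Average consecutive samples in non-overlapping blocks of `block`."""
    series = np.asarray(series, dtype=float)
    n_full = (series.size // block) * block  # drop an incomplete final block
    return series[:n_full].reshape(-1, block).mean(axis=1)


# Synthetic hourly wind speeds (m/s) for 30 days: a slowly varying signal
# plus independent noise in the "observed" and "reanalysis" series.
rng = np.random.default_rng(0)
hours = np.arange(30 * 24)
signal = 8.0 + 3.0 * np.sin(2.0 * np.pi * hours / (24 * 10))
obs_hourly = signal + rng.normal(0.0, 2.0, hours.size)
model_hourly = signal + rng.normal(0.0, 2.0, hours.size)

# Correlation at hourly, 6-h and daily resolution.
for label, block in [("hourly", 1), ("6-h", 6), ("daily", 24)]:
    o = block_mean(obs_hourly, block)
    m = block_mean(model_hourly, block)
    r = np.corrcoef(o, m)[0, 1]
    print(f"{label:>6}: r = {r:.2f}")
```

Averaging tends to suppress short-period variability that is uncorrelated between the two series, which is one reason why correlations can rise as the averaging window lengthens.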
This study covers only a short period, as we utilize the specialist anemometric stations that were in place for a study of wind speeds for wind power applications. It would be possible to extend this work using Mexico's weather service stations, which would provide more sites across the country and a longer period of study. However, these sites are often located in urban areas, with wind observations only at the standard height of 10 m a.g.l. An investigation into utilizing more of these observations (not shown) yielded very low observed winds and hence large biases. The AEOLUS satellite, launched in 2018, could also provide useful direct comparisons for Mexico (although some independence between the reanalyses and the observations might be lost).
Generally, we conclude that for most of the coastal regions of Mexico, atmospheric reanalyses replicate observations well. The consequence for wind power applications is that we recommend the use of ERA5 for wind resource assessments in Mexico over ERA-Interim or MERRA-2. Nevertheless, reanalyses should be used with caution for these purposes. Care should be taken in areas with more complex topography, such as Baja California Sur, as even the higher resolution of ERA5 is unable to accurately reproduce local observations. For all other coastal locations in the study, the correlation coefficients between ERA5 and the observations are very high, meaning that ERA5 could be an adequate substitute for investigating local wind variability over time for current and future wind farms.
Figure 2 shows time series of each of these for each of the eight anemometric stations, with the observations shown in black, ERA-Interim in blue, ERA5 in red and MERRA-2 in orange. Histograms are shown to the right of each panel to display the distribution of wind speeds from each dataset.

FIGURE 2 Time series (left) and histograms (right) of wind speeds for the eight anemometric stations across Mexico. The black lines show the observations, whereas the blue, red and orange show the ERA-Interim, ERA5 and MERRA-2 reanalyses, respectively, interpolated to the location and height of the observations

FIGURE 5 Wind fields of (a) QuikSCAT wind observations, and (b) ERA5 wind speeds on 4 September 2006, when large differences are observed between in situ observations and reanalyses. Blue is low wind speeds and white is high. Regions with no data are coloured grey. Panel (c) shows the ERA5 minus QuikSCAT wind speeds, where red represents areas of stronger winds in QuikSCAT and blue vice versa. The green star shows the location of the anemometric station, VZ02

TABLE 1 A summary of the anemometric stations used in this study, including the codes used to reference the stations throughout this paper

TABLE 2 A summary of the three reanalyses utilized in this study