A historical surface climate dataset from station observations in Mediterranean North Africa and Middle East areas

Authors

  • Manola Brunet,

    Corresponding author
    1. Department of Geography, Centre for Climate Change, University Rovira i Virgili, Tortosa, Spain
    2. Climatic Research Unit, School of Environmental Sciences, University of East Anglia, Norwich, UK
    • M. Brunet, Correspondence: Department of Geography, Centre for Climate Change, University Rovira i Virgili, Campus Centre, URV, 43071 Tarragona, Spain, E-mail: manola.brunet@urv.cat

  • Alba Gilabert,

    1. Department of Geography, Centre for Climate Change, University Rovira i Virgili, Tortosa, Spain
  • Phil Jones,

    1. Climatic Research Unit, School of Environmental Sciences, University of East Anglia, Norwich, UK
    2. Department of Meteorology, Center of Excellence for Climate Change Research, King Abdulaziz University, Jeddah, Saudi Arabia
  • Dimitrios Efthymiadis

    1. Department of Geography, Centre for Climate Change, University Rovira i Virgili, Tortosa, Spain
    2. Climatic Research Unit, School of Environmental Sciences, University of East Anglia, Norwich, UK
    • Correction added 13 February 2015 after original online publication: Dimitrios Efthymiadis has been added to the author list.

  • This study was supported by the European Union EURO4M project (FP7-EC Cooperation Theme 9, SPACE, grant no. 242093).

Abstract

Historical climatic data from station observations taken in North African and Middle East Mediterranean countries since the second half of the 19th century have been digitized and quality-controlled in the framework of the EU-funded European Reanalysis and Observations for Monitoring (EURO4M) project. Daily maximum and minimum temperatures and precipitation totals, along with sub-daily data for surface air pressure, have been recovered from historical data sources, namely book and logbook collections archived in national and international data centres. The new dataset comprises climatic time series for 79 stations that have operated in southern and eastern Mediterranean countries. While the developed time series have data gaps, every effort has been made to infill these gaps, to improve assessments of the long-term changes in climate variability in the region.

Dataset

Identifier: doi:10.5281/zenodo.7531

Creator: Centre for Climate Change (C3), Department of Geography, University Rovira i Virgili

Title: C3-EURO4M-MEDARE Mediterranean historical climate data

Authors: Centre for Climate Change/URV

Publisher: ZENODO

Publication year: 2013

Resource type: Book

Version: 1.0

Introduction

A better understanding of the physical mechanisms of the Mediterranean's climate variability is crucial for developing advanced projections of future climate. To achieve this, basin-wide knowledge of historical climate variations, at high temporal resolution and over long time scales, is required to assess climate model simulations. Such knowledge requires long-term and high-quality climate time series, whose current availability is uneven in the Mediterranean: the northern-basin countries (belonging to Europe) enjoy good data coverage, whereas the southern part (North Africa and Middle East) is a data-sparse region (Brunet et al., 2013). In addition, station-based data are needed as input to more accurate reanalysis and gridded datasets. The southern data paucity is not the result of a lack of measurements, since meteorological observations have been taken since the mid-19th century, initially by former colonial endeavours. Although deserts dominate the area, individual stations or networks were deployed in the populated parts of the southern and eastern Mediterranean, and the meteorological records taken were often published in periodical publications.

In this context, the EURO4M project (http://www.euro4m.eu/index.html), in connection with the World Meteorological Organization (WMO) MEditerranean DAta Rescue (MEDARE: http://www.omm.urv.cat/MEDARE/), has set, among other objectives, the recovery of historical climate data from North African and Middle Eastern Mediterranean countries, namely Morocco and Spanish enclaves, Algeria, Tunisia, Libya, Egypt, Cyprus, Lebanon and Syria. This data rescue (DARE) effort has been carried out in coordination with other relevant DARE initiatives and projects to avoid duplication and maximize resources. These include the French historical climate and weather observations rescue project entitled Access to climate Archives despite Asbestos (AAA; Jourdain & Dandin, 2011), the international Atmospheric Circulation Reconstructions over the Earth initiative (ACRE; http://www.met-acre.org/Home; Allan et al., 2011) and the European ReAnalysis of Global CLIMate Observations project (ERA-CLIM; http://www.era-clim.eu/).

The chosen climatic variables are daily minimum (TN) and maximum (TX) air temperature, daily precipitation total (RR) and sub-daily air pressure (PP), all observed at meteorological stations that have operated in these countries. For air pressure, especially in the historical records, the values often represent air pressure adjusted to sea level (SLP). Although the period of interest was pre-1950, the spatial and temporal span of the data recovered was ultimately dictated by the data sources located and accessed, which are described in section 'Data sources used and rationale for meteorological station selection'. Details on the quality controls (QC) applied to the digitized data are given in section 'Data digitization and quality control', while section 'Dataset structure and future prospects' concludes by outlining the structure of the new dataset and providing some notes on future prospects.

1 Data sources used and rationale for meteorological station selection

The data sources used were sought in worldwide online repositories and at national archives containing historical climate data document collections, such as meteorological logbooks, yearbooks, or weather charts. In most cases, these data sources are series of scanned volumes containing data at different time scales, covering either the station network of a country (often including data from adjacent countries) or a specific observatory/station. Before the Second World War, they were published mainly by French, British and Italian colonial authorities, whereas national authorities have administered the meteorological services and organized the respective data publications since independence. These publications are secondary data sources, since they are transcriptions of original meteorological logbooks gathered from various stations. They have the advantage of having passed a data quality screening (as indicated by comments found next to the values and by monthly summaries of data corrections), but they may also include transcription errors that occurred during the transfer from the original to the secondary source.

Most of the data sources used were located in the online repository of the Central Library of the US National Oceanic and Atmospheric Administration (NOAA) which comprises digital (scanned) versions of many meteorological data collections from all over the world (developed in the framework of NOAA/NCDC Climate Database Modernization Program, 2000–2011; http://docs.lib.noaa.gov/rescue/data_rescue_home.html). Climatological departments of other national meteorological agencies also provided digital and scanned data documents from their archives. Météo-France provided Tunisian daily data series digitized in the framework of the CIRCE project (http://www.circeproject.eu), and also scanned copies of French data publications. The UK Met Office made available scanned copies of British colonial-era data collections (ACRE initiative) through the British Atmospheric Data Centre (BADC; http://badc.nerc.ac.uk/browse/badc/corral/images/metobs). The Libyan National Meteorological Center (LNMC) made available data catalogues for various stations from the country. The Spanish meteorological agency (AEMet) provided scanned copies of bulletins including data for stations in North Africa. Finally, at the library of the Ebro Observatory (Tortosa, Spain), supplementary data books were located which filled in data gaps within the overall climatic data series being recovered. Table 1 provides a list of the various climatic data books/collections used from all these data centres for recovering Mediterranean historical climate data. NOAA's Central Library was the main data source for the dataset development (71% of the total data), whereas the imaged data acquired from Météo-France (11%), LNMC (11%), UK Met Office (3%), AEMet (2%) and Ebro Observatory (2%) also played an important role, especially for specific countries and stations.

Table 1. Collections of climatic data sources used

| Abbreviated name | Data source | Countries covered | Data centres | Year range |
|---|---|---|---|---|
| ABCM-France | Annales du Bureau Central Météorologique de France | Algeria, Egypt, Lebanon, Tunisia | NOAA, Météo-France, Ebro observatory | 1884–1914 |
| AO-Ksara | Annales de l'Observatoire de Ksara | Lebanon | NOAA | 1921–1971 |
| ASM-France | Annuaire de la Société Météorologique de France | Algeria | Météo-France | 1852–1867 |
| AULO-Beirut | American Univ. (Syrian Protestant College) – Lee Observatory. Beirut | Syria | NOAA | 1914–1915 |
| BCM-Liban | Bulletin Climatologique Mensuel du Liban | Lebanon | NOAA | 1928–1970 |
| BM-Algérie | Bulletin Météorologique de l'Algérie | Algeria, Morocco, Tunisia | NOAA | 1877–1938 |
| BM-Cirenaica | Bolletino Meteorologico della Cirenaica | Libya | NOAA | 1928–1931 |
| BM-Maroc | Bulletin de Météorologique du Maroc | Algeria, Morocco, Spain | NOAA | 1953–1978 |
| BMA-Italiana | Bollettino Meteorologico dell'Africa Italiana | Libya | NOAA | 1932–1936 |
| BMD-España | Boletín Meteorológico Diario de España | Morocco, Spain | AEMet | 1899–1948 |
| Cairo-MR | Cairo. Meteorological Reports | Egypt | NOAA | 1904–1941 |
| CIRCE | CIRCE-project digital data files | Tunisia | Météo-France | 1899–1961 |
| Egypt-DWR | Egypt. Daily Weather Reports | Egypt | NOAA | 1907–1957 |
| Helwan-MR | Helwan Observatory Meteorological Reports | Egypt | NOAA | 1942–1944 |
| Libyan-NMC | Libyan National Meteorological Center Archives | Libya | LNMC | 1916–2008 |
| MCD-Syria | Monthly Climatological Data. Syria | Syria | NOAA | 1955–1975 |
| SM-Tunis | Service Météorologique de Tunis | Tunisia | NOAA | 1907–1932 |
| UK-CR | UK Climatological Returns | Cyprus | UK Met Office | 1881–1922 |
| UK-DWR | UK Daily Weather Reports | Egypt | UK Met Office | 1900–1904 |

For each EURO4M/MEDARE-targeted country, the meteorological stations selected for data digitization followed this rationale:

  • Stations that have the longest and most complete historical records, either on their own or in combination with other records from different sources.
  • Stations whose data could potentially be merged with digitized series held in national and international climate databanks (spanning recent and current decades) and could, therefore, lead to the development of long-term climate time series.
  • Stations that form a network covering the Mediterranean part of each country, i.e. within a zone extending no more than ~200 km from the coastline (only a few exceptions were made, the most prominent being the remote El Golea station in the Algerian Sahara), and having a roughly even spatial distribution.

The 79 stations selected are listed in Table 2, while the location of their sites is shown in Figure 1.

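Table 2. List of stations, climatic variables and data periods recovered

| Country | Location/Station name | WMO code | Latitude | Longitude | Altitude (m) | Variables | Data period |
|---|---|---|---|---|---|---|---|
| Morocco | Tangier city | 60100 | 35.78°N | 5.82°W | 86 | TN, TX, RR, PP | 1912–1961 |
| Morocco | Tangier airport | 60101 | 35.73°N | 5.90°W | 15 | TN, TX, RR, PP | 1961–1978 |
| Morocco | Al Hoceima | 60107 | 35.18°N | 3.85°W | 12 | TN, TX, RR, PP | 1965–1978 |
| Morocco | Oujda | 60115 | 34.78°N | 1.93°W | 478 | TN, TX, RR, PP | 1910–1978 |
| Morocco | Tetuan | 60318 | 35.58°N | 5.33°W | 10 | TN, TX, RR | 1920–1978 |
| Spain | Ceuta | 60320 | 35.89°N | 5.35°W | 87 | TN, TX, RR | 1933–1939 |
| Spain | Melilla | 60338 | 35.28°N | 2.96°W | 47 | TN, TX, RR | 1899–1962 |
| Algeria | Skikda-Cap Bougarouni | 60355 | 37.08°N | 6.47°E | 195 | TN, TX | 1931–1938 |
| Algeria | Annaba-Cap de Garde | 60357 | 36.97°N | 7.79°E | 161 | TN, TX, RR | 1909–1937 |
| Algeria | La Calle (El Kala) | 60367 | 36.90°N | 8.44°E | 10 | PP | 1877–1938 |
| Algeria | Algiers-Ville/Université | 60369 | 36.78°N | 3.07°E | 59 | TN, TX, RR, PP | 1877–1938 |
| Algeria | Algiers-Bouzareah | 60372 | 36.80°N | 3.03°E | 344 | TN, TX, RR | 1893–1920 |
| Algeria | Algiers-Cap Caxine | 60374 | 36.80°N | 3.04°E | 38 | TN, TX, RR | 1878–1879 |
| Algeria | Tizi Ouzou | 60395 | 36.72°N | 4.05°E | 222 | TN, TX, RR, PP | 1879–1838 |
| Algeria | Fort National | 60395 | 36.63°N | 4.20°E | 942 | TN, TX, RR, PP | 1884–1938 |
| Algeria | Bejaia-Cap Carbon | 60400 | 36.78°N | 5.10°E | 225 | TX, TN | 1926–1938 |
| Algeria | Bejaia-Bougie (Port) | 60401 | 36.75°N | 5.10°E | 9 | TN, TX, RR | 1909–1926 |
| Algeria | Constantine | 60419 | 36.37°N | 6.62°E | 660 | TN, TX, RR, PP | 1880–1938 |
| Algeria | Orleansville (Chlef) | 60425 | 36.17°N | 1.34°E | 112 | TN, TX, RR, PP | 1879–1838 |
| Algeria | Setif | 60445 | 36.18°N | 5.40°E | 1081 | TN, TX, RR, PP | 1878–1938 |
| Algeria | Oran | 60461 | 35.70°N | 0.65°W | 53 | TN, TX, RR, PP | 1852–1966 |
| Algeria | Oran-Cap Falcon | 60485 | 35.77°N | 0.80°W | 78 | TN, TX, RR | 1896–1938 |
| Algeria | Tebessa | 60475 | 35.42°N | 8.12°E | 863 | TN, TX, RR, PP | 1879–1938 |
| Algeria | Nemours (Ghazaouet) | 60517 | 35.10°N | 1.85°W | 83 | TN, TX, RR, PP | 1878–1938 |
| Algeria | Sidi-Bel-Abbés | 60520 | 35.20°N | 0.63°W | 476 | TN, TX, RR | 1880–1938 |
| Algeria | Biskra | 60525 | 34.85°N | 5.72°E | 125 | TN, TX, RR, PP | 1880–1938 |
| Algeria | Laghouat | 60545 | 33.80°N | 2.89°E | 767 | TN, TX, RR | 1888–1938 |
| Algeria | Geryville (El-Bayadh) | 60550 | 33.68°N | 1.00°E | 1320 | TN, TX, RR | 1888–1938 |
| Algeria | El-Golea | 60590 | 30.55°N | 3.07°E | 394 | TN, TX, RR, PP | 1892–1938 |
| Tunisia | Bizerte Cap Blanc | 60714 | 37.33°N | 09.84°E | 264 | TN, TX, RR, PP | 1899–1961 |
| Tunisia | Bizerte Karouba | 60714 | 37.23°N | 09.82°E | 6 | TN, TX, RR, PP | 1920–1959 |
| Tunisia | Tunis | 60715 | 36.80°N | 10.17°E | 36 | TN, TX, RR | 1886–1938 |
| Tunisia | Tunis-el-Aouina | 60715 | 36.83°N | 10.23°E | 4 | TN, TX, RR, PP | 1925–1957 |
| Tunisia | Kelibia | 60720 | 36.84°N | 11.11°E | 82 | RR | 1907–1932 |
| Tunisia | Jendouba Souk-el-Arba | 60725 | 36.48°N | 08.80°E | 144 | TN, TX, RR, PP | 1946–1957 |
| Tunisia | Kairouan | 60735 | 35.67°N | 10.10°E | 65 | TN, TX, RR, PP | 1930–1957 |
| Tunisia | El Djem | 60743a | 35.33°N | 10.70°E | 112 | TN, TX, RR | 1900–1932 |
| Tunisia | Sfax | 60750 | 34.72°N | 10.72°E | 23 | TN, TX, RR, PP | 1886–1957 |
| Tunisia | Tozeur | 60760 | 33.95°N | 08.11°E | 50 | TN, TX, RR, PP | 1897–1938 |
| Tunisia | Gabes | 60765 | 33.89°N | 10.11°E | 4 | TN, TX, RR, PP | 1887–1957 |
| Tunisia | Djerba | 60769 | 33.88°N | 10.85°E | 4 | TN, TX, RR | 1898–1912 |
| Libya | Nalut | 62002 | 31.87°N | 10.98°E | 621 | TN, TX, RR | 1932–1953 |
| Libya | Zuara | 62007 | 32.88°N | 12.08°E | 3 | TN, TX, RR | 1920–1955 |
| Libya | Trípoli Airport | 62010 | 32.67°N | 13.15°E | 81 | TN, TX, RR | 1943–1955 |
| Libya | Trípoli Sidi El Mesri | 62010 | 32.87°N | 13.22°E | 25 | TN, TX, RR | 1916–2008 |
| Libya | Tripoli City | 62010 | 32.90°N | 13.18°E | 25 | TN, TX, RR | 1925–1974 |
| Libya | Misurata | 62016 | 32.32°N | 15.05°E | 32 | TN, TX, RR | 1925–1956 |
| Libya | Sirte | 62019 | 31.20°N | 16.58°E | 13 | TN, TX, RR | 1925–1955 |
| Libya | Benghazi Benina | 62053 | 32.08°N | 20.27°E | 132 | TN, TX, RR | 1944–1955 |
| Libya | Benghazi Regima (Ragma) | 62053 | 32.07°N | 20.07°E | 322 | TN, TX, RR | 1922–1935 |
| Libya | Agedabia | 62055 | 30.72°N | 20.17°E | 7 | TN, TX, RR | 1924–1955 |
| Libya | Shahat | 62056 | 32.80°N | 21.88°E | 648 | TN, TX, RR | 1921–1955 |
| Libya | Derna | 62059 | 32.76°N | 22.66°E | 10 | TN, TX, RR | 1928–1955 |
| Egypt | Salloum | 62300 | 31.55°N | 25.18°E | 4 | TN, TX, RR, PP | 1919–1957 |
| Egypt | Mersa Matruh | 62306 | 31.33°N | 27.22°E | 25 | TN, TX, RR, PP | 1920–1957 |
| Egypt | Port Said | 62333 | 31.28°N | 32.23°E | 6 | TN, TX, RR, PP | 1884–1957 |
| Egypt | Cairo Abbassia | 62371 | 30.08°N | 31.29°E | 30 | TN, TX, RR | 1900–1908 |
| Egypt | Cairo Ezbekiya | 62374 | 30.05°N | 31.25°E | 20 | TN, TX, RR | 1909–1957 |
| Egypt | Giza (Cairo) | 62375 | 30.03°N | 31.21°E | 28 | TN, TX, RR | 1924–1957 |
| Egypt | Helwan (Cairo) | 62378 | 29.86°N | 31.34°E | 116 | TN, TX, RR, PP | 1904–1957 |
| Egypt | Siwa | 62417 | 29.20°N | 25.48°E | −15 | TN, TX, RR, PP | 1912–1957 |
| Egypt | Ismailia | 62441 | 30.60°N | 32.23°E | 10 | TN, TX, RR, PP | 1884–1956 |
| Egypt | El Suez | 62450 | 29.93°N | 32.55°E | 10 | TN, TX, RR, PP | 1907–1957 |
| Cyprus | Paphos | 17600 | 34.77°N | 32.43°E | 30 | TN, TX, RR | 1901–1922 |
| Cyprus | Nicosia | 17607 | 35.19°N | 33.37°E | 152 | TN, TX, RR, PP | 1881–1922 |
| Lebanon | Rayack | 40102 | 33.85°N | 36.00°E | 920 | RR | 1928–1970 |
| Lebanon | Trípoli | 40103 | 34.45°N | 35.82°E | 20 | RR | 1931–1970 |
| Lebanon | Les Cedres (Al Arz) | 40105 | 34.25°N | 36.05°E | 1925 | RR | 1939–1964 |
| Lebanon | Ksara | 40106 | 33.82°N | 35.89°E | 918 | TN, TX, RR, PP | 1912–1971 |
| Lebanon | Hermes | 40108a | 34.40°N | 36.38°E | 700 | RR | 1932–1970 |
| Lebanon | Rachaya | 40109a | 33.50°N | 35.85°E | 1235 | RR | 1933–1970 |
| Syria | Jarablus | 40005 | 36.82°N | 38.00°E | 350 | TN, TX, RR | 1928–1975 |
| Syria | Aleppo | 40007 | 36.18°N | 37.22°E | 390 | TN, TX, RR | 1955–1975 |
| Syria | Lattakia | 40022 | 35.50°N | 35.78°E | 7 | RR | 1928–1975 |
| Syria | Tartous | 40050 | 34.90°N | 35.87°E | 5 | RR | 1928–1975 |
| Syria | Homs | 40055 | 34.75°N | 36.72°E | 487 | TN, TX, RR | 1914–1959 |
| Syria | Palmyra | 40061 | 34.55°N | 38.30°E | 404 | TN, TX, RR | 1928–1975 |
| Syria | Damascus | 40079 | 33.48°N | 36.23°E | 720 | TN, TX, RR | 1928–1955 |
| Syria | Dara'a | 40095 | 32.60°N | 36.10°E | 532 | TN, TX, RR | 1928–1933 |

TX, daily maximum temperature; TN, daily minimum temperature; RR, daily precipitation amount; PP, sub-daily air pressure observations.
a WMO pseudo-code.

Figure 1.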

Location of sites for which data rescue was carried out; some sites comprise more than one station (see Table 2 for details).

2 Data digitization and quality control

Using the data sources described above, the data for the selected stations were key-entered with special care. The varying quality of the handwritten or typed data pages and of their scanned copies posed many difficulties: scanned pages were sometimes too dark or too faded, which affected the readability not only of the meteorological data but also of the corresponding dates. Date identification was therefore crucial and time-consuming, since the data books contain missing pages, duplicate or triplicate copies of the same page, and pages that deviate from ascending chronological order. All these cases were potential sources of error in the digitized data files. Without visual cross-checking, such mistakes could not have been avoided and could have introduced non-systematic biases, compromising the reliability of the data for future applications.

Data QC was the next step and comprised three stages:

  1. Visual cross-comparison between the data source and the digitized data to verify the fidelity of digitization (transcription accuracy): sample data were examined across the overall data period to check if the correct station was indeed used (especially in the case of multi-station data pages), if the dates were correctly assigned and if the targeted climatic variables were correctly transcribed.
  2. Automatic QC to identify non-systematic errors in the time series: the RClimDex software package (Zhang & Yang, 2004), reinforced with the ‘extraQC’ software (Aguilar & Prohom, 2011), was employed to identify potential temperature and precipitation data errors. The latter tool is an improved version of the standard ‘RClimDex’ software and performs a series of additional tests to further ensure internal consistency (e.g. consecutive identical values and rounded values) and temporal coherency (large inter-daily differences), in addition to the usual gross-error and tolerance tests. Suspicious values were labelled and examined against the data sources in order to validate or reject them and, accordingly, either retain them or set them to missing. For air pressure data QC, various statistical tests were developed to identify cases of extremely low/high air pressure records and also cases of zero variance (‘consecutive identical values’) or high variance (‘jumps’ or ‘outliers’) for consecutive-day observations (and also for consecutive intra-day observations, if available).
  3. Cross-station data checks by plotting, in parallel, data from two or more nearby stations to examine the inter-station consistency and ensure spatial coherency. Digitization and potential data source errors were identified as in the previous stage.
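The kinds of automatic tests named in stage 2 can be illustrated with a minimal Python sketch. It is not the RClimDex or extraQC implementation: the function name `flag_suspicious` and all thresholds are hypothetical, chosen only to show a tolerance test, a temporal coherency test and an internal consistency test on one daily series that uses the dataset's missing-value code.

```python
MISSING = -99.9  # missing-value code used in the dataset


def flag_suspicious(values, lower, upper, max_jump, max_run):
    """Return {index: set of test names} for values failing simple QC tests.

    All thresholds are illustrative assumptions, not the settings used by
    RClimDex or extraQC.
    """
    flags = {}

    def add(i, name):
        flags.setdefault(i, set()).add(name)

    run_start = 0  # index where the current run of identical values began
    for i, v in enumerate(values):
        if v == MISSING:
            run_start = i + 1
            continue
        # Tolerance (gross-error) test: value outside a plausible range.
        if not lower <= v <= upper:
            add(i, "tolerance")
        prev = values[i - 1] if i > 0 else MISSING
        # Temporal coherency: large difference from the previous day.
        if prev != MISSING and abs(v - prev) > max_jump:
            add(i, "temporal_coherency")
        # Internal consistency: long run of consecutive identical values.
        if prev == MISSING or v != prev:
            run_start = i
        if i - run_start + 1 >= max_run:
            for j in range(run_start, i + 1):
                add(j, "internal_consistency")
    return flags


# Example with hypothetical daily maximum temperatures (degrees C).
tx = [20.0, 21.0, 21.0, 21.0, 21.0, 35.0, MISSING, 75.0, 22.0]
flags = flag_suspicious(tx, lower=-30.0, upper=60.0, max_jump=10.0, max_run=4)
```

Flagged values are only candidates: as described above, each was then checked against the data source before being retained or set to missing.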

To deal with potential data source errors, ancillary data and information were sought in the data books: data from nearby stations, the general weather setting (e.g. cloudiness, rainfall, wind direction/strength, weather charts), and reports of extreme meteorological events. If the information gathered supported the credibility of an unusual or suspicious datum, the datum was left unchanged. Otherwise, it was set to a missing value (−99.9), unless the correct original value could be deduced from the ancillary information using expert judgement. The latter correction was made in certain cases, such as the swapping of TX and TN data, the adjustment of temperature values by multiples of 10°C, and the derivation of the correct pressure value from the isobars drawn on the accompanying weather charts (if available). It should be noted that the data source error correction scheme was a conservative one: changes were made only when the data values appeared to be clearly unrealistic, and replacement values were inserted only when there was strong certainty about them (based on the consultation of ancillary information). Overall, 0.5% of the data digitized were eventually corrected through the multi-stage QC procedure, with ~10% of them corresponding to data source error correction (half of these corrections involved substitution with missing values, while for the rest a new, corrected value was introduced, as explained above). A summary of the automatic QC results is provided in Table 3, and their traceability is ensured by the documentation accompanying the dataset, provided as supporting information to this article (C3-EURO4M-MEDARE_documentingQC.txt).
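Two of the correction types mentioned above, swapped TX/TN values and temperatures that are off by a multiple of 10°C, can be sketched as follows. This is only an illustration under stated assumptions: the corrections in the dataset relied on expert judgement and ancillary information, not on an automatic rule, and the function names, the `reference` value (for example, a value from adjacent days or a nearby station) and the tolerance are all hypothetical.

```python
MISSING = -99.9  # missing-value code used in the dataset


def swap_if_inverted(tx, tn):
    """Return (tx, tn), swapped when both are present and TX < TN."""
    if tx != MISSING and tn != MISSING and tx < tn:
        return tn, tx
    return tx, tn


def tens_correction(value, reference, tolerance=3.0, max_shift=3):
    """Suggest `value` shifted by a multiple of 10 degrees C.

    A shifted value is accepted only if exactly one shift lands within
    `tolerance` of `reference`; otherwise the conservative choice is MISSING.
    The reference and tolerance are illustrative assumptions.
    """
    candidates = [
        value + 10 * k
        for k in range(-max_shift, max_shift + 1)
        if k != 0 and abs(value + 10 * k - reference) <= tolerance
    ]
    return candidates[0] if len(candidates) == 1 else MISSING


# Hypothetical cases: an inverted TX/TN pair and a value 10 degrees too high.
swapped = swap_if_inverted(12.0, 25.0)
shifted = tens_correction(34.5, reference=25.0)
unresolved = tens_correction(34.5, reference=29.5)
```

Falling back to the missing-value code when no single shift is supported mirrors the conservative scheme described in the text, where a replacement value was inserted only when there was strong certainty about it.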

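Table 3. Summary of the results of the quality controls (QC) applied to the daily minimum temperature (TN), daily maximum temperature (TX), daily precipitation (RR) and hourly surface air pressure (PP) series

| Parameters | Suspicious values: total | Suspicious values: tolerance test | Suspicious values: temporal coherency | Suspicious values: internal consistency | Corrected values: total | Corrected values: transcription errors | Corrected values: data source errors (total) | Data source errors: new values | Data source errors: missing values |
|---|---|---|---|---|---|---|---|---|---|
| TN, TX, RR | 5563 | 52% | 10% | 38% | 3908 | 2244 | 1662 | 864 | 798 |
| PP | 7987 | 16% | 84% | | 5772 | 5727 | 45 | 42 | 3 |
| All data | 13550 | | | | 9680 | 7971 | 1707 | 906 | 801 |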

Figure 2 shows the data volumes recovered per climatic variable and year. Oran, the station with the earliest recovered data, has years with data from the 1850s onwards (Figure 3). Several other stations have data series starting in the late 1870s and 1880s, and the yearly amount of data then increases until the mid-1930s. The data recovery is limited over the Second World War period, increases again in the 1950s and is only modest in the 1960s–1970s. Only for one station in Libya, Tripoli (Sidi El Mesri), did the data recovery extend into the 1980s and 2000s (Figure 4). Missing volumes in the data source collections used led to distinct minima in the amount of data for some years within the recovery period (as shown in Figure 2). The right-most column in Table 2 provides the temporal range of the data recovered for every station. Despite the use of multiple data sources to achieve as complete a recovery of station data as possible, the station series still have missing daily data, and certain multi-year gaps exist within their temporal span (see Figures 3 and 4). Much of the missing data is from the last 3–4 decades, and these data are likely to have been digitized in National Meteorological Service (NMS) archives. Unfortunately, at present, these daily data series are not freely available.
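Data volumes of the kind plotted in Figures 2 and 3 amount to counting, for each year, the daily values that are not missing. A minimal sketch in Python, assuming the records have already been read into (year, month, day, value) tuples; the function name `yearly_counts` and the sample values are hypothetical:

```python
from collections import Counter

MISSING = -99.9  # missing-value code used in the dataset


def yearly_counts(records):
    """Count non-missing daily values per year.

    `records` is an iterable of (year, month, day, value) tuples; this
    layout is an assumption made for the example.
    """
    counts = Counter()
    for year, _month, _day, value in records:
        if value != MISSING:
            counts[year] += 1
    return counts


# Hypothetical excerpt of a daily maximum temperature series.
sample = [
    (1852, 1, 1, 14.2),
    (1852, 1, 2, MISSING),
    (1852, 1, 3, 15.0),
    (1853, 1, 1, 13.1),
]
counts = yearly_counts(sample)
```

Years with no recovered data simply do not appear in the result, and `counts[year]` returns 0 for them.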

Figure 2.

Data volumes (number of daily data per climatic variable per year) recovered.

Figure 3.

Data recovery (number of daily data per year) of daily maximum temperatures (TX) for Oran, showing data gaps both at the annual and interannual scales.

Figure 4.

Time series of recovered daily maximum temperatures (TX), daily minimum temperatures (TN) and daily precipitation totals (RR), for Tripoli (Sidi el Mesri) in Libya.

3 Dataset structure and future prospects

The quality-controlled dataset comprises daily data for the 79 stations selected. It consists of four data files in ASCII format, one for each of the climatic variables targeted (TN, TX, RR and PP), each including the time series of all stations. Data values in these time series run continuously from the starting year (1852) to the final year (2008) of the data recovery period, even if no data were recovered in some intervals (days, months, or years): the missing-value code (−99.9) is used for those data gaps. The minimum/maximum temperature and precipitation data are accompanied by the respective date (year, month, day of month); for air pressure data, the observational hour is additionally provided.
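The continuous layout described above, with every day from 1852 to 2008 present and gaps filled with −99.9, can be sketched in Python as follows. The function name `continuous_series` and the dictionary input are assumptions made for the example; the actual file format is documented in the dataset's ‘readme’ file.

```python
from datetime import date, timedelta

MISSING = -99.9  # missing-value code used in the dataset


def continuous_series(observations, first_year=1852, last_year=2008):
    """Return a gap-free list of (date, value) pairs.

    `observations` maps datetime.date to a recovered value; days without a
    recovered value receive the missing-value code.
    """
    day = date(first_year, 1, 1)
    end = date(last_year, 12, 31)
    series = []
    while day <= end:
        series.append((day, observations.get(day, MISSING)))
        day += timedelta(days=1)
    return series


# Hypothetical example: one recovered value in the leap year 1852.
series = continuous_series({date(1852, 1, 2): 15.5}, 1852, 1852)
```

Keeping every calendar day in the series lets all stations share the same time axis, at the cost of long runs of missing values for stations with short records.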

The dataset is accompanied by a ‘readme’ file with information on the data file format and the station meta-data: station names, approximate WMO codes, geographical coordinates, climatic variables recovered, data period ranges, data sources used, time coordinates of observational times (local or UTC) and the periods with original (unadjusted to sea level) air pressure data.

All the dataset files are available from the ZENODO repository (http://www.zenodo.org/), while the station time series are also available from the ECA&D website (http://www.ecad.eu/).

The new dataset aims to cover a major data gap which has limited our knowledge of long-term climate variability in the southern and eastern Mediterranean regions. Although one of the station records recovered goes back to the mid-19th century, most of the time series start in the late 19th or early 20th century and are far from continuous, with many data gaps remaining to be filled. It is expected that, after merging the time series included in the C3-EURO4M-MEDARE dataset, combining them with additional digital data from other databanks (principally NMS databases, to cover the data gaps in recent decades), and subjecting the temperature and precipitation series to homogenization, the time series will provide an advanced insight into the history of the Mediterranean climate.

Acknowledgements

The dataset development was funded by the European Union, Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 242093 (European Reanalysis and Observations for Monitoring – EURO4M project). CIRCE was a project funded by the European Union (FP6/2002-2006, no. 036961). Olivier Mestre and Sylvie Jourdain provided the CIRCE Tunisian data. Khalid Ibrahim El Fadli provided the LNMC data for Libya. David Mallol, Clara Lopez, Gisela Ponce, Nilo Nagera, Alberto Fernández, Victor Vidal, Juan Jose Ferreras, Mireia Sánchez, Nolia Tomás, Roger Dobon and Sara Barceló, all of them students at URV, contributed to the dataset development by digitizing data books and performing the initial data quality control.
