A global data set of archeomagnetic and paleomagnetic data covering the past 7000 years has been compiled. It consists of 16,085 results of inclination, 13,080 of declination, and 3188 of intensity for the time span 5000 BC to 1950 AD. Declination and inclination data come partly from existing databases and partly from original literature. A new global compilation of intensity data for the millennial scale is included. Data and dating uncertainties are discussed as we attempted to obtain an internally coherent data set. The global distribution of the data is very inhomogeneous in both time and space. All the data are compared to predictions from the previous 3000 year global model, CALS3K.1. This collection of data will be useful for global secular variation studies and geomagnetic field modeling, although southern hemisphere data are still underrepresented. In particular, we will use it in a further study to update and extend the existing global model, CALS3K.1. The huge increase in data compared to the previous compilation will result in significant changes from current models. As we might have missed some suitable data, we encourage the reader to notify us about any data that have not been included yet and might fit in, as improving our global millennial scale models remains our aim for the future. The data files described in this paper are available from the EarthRef Digital Archive (ERDA) at http://earthref.org/cgi-bin/erda.cgi?n=331.
 The main geomagnetic field originating in the Earth's core changes significantly, particularly on long timescales from centuries to millions of years, over which complete reversals occur. Systematic worldwide direct measurements of the geomagnetic field have been carried out for only about two centuries [e.g., Alexandrescu et al., 1997]. Many declination and fewer inclination data, mainly shipboard measurements made for navigational purposes, are available for the last four centuries [e.g., Jonkers et al., 2003]. For earlier times, knowledge of the field evolution has to be obtained from indirect measurements on remanently magnetized sediments, lavas, or archaeological artifacts. Acquiring paleosecular variation data from lake sediment cores, a series of lava flows, or a set of magnetized archaeological materials takes considerable effort. Usually, the focus is on individual time series or time series from limited regions. Enough data now exist, however, that first attempts at compiling and modeling global data sets on the millennial scale have been carried out. Several databases are stored at the World Data Center Boulder. Daly and Le Goff compiled a global set of regional secular variation curves from archeomagnetic data. S. P. Lund and C. G. Constable (Global geomagnetic secular variation for the past 3000 years, manuscript in preparation, 2005; hereinafter referred to as Lund and Constable, manuscript in preparation, 2005) [see also Constable et al., 2000] similarly created a data set named PSVMOD1.0 of 24 globally distributed secular variation curves. Hongre et al. developed a low-degree spherical harmonic model for the past 2000 years based on the Daly and Le Goff secular variation curves, complemented by sediment records from two lakes (Argentina, New Zealand) and volcanic data from Hawaii and Sicily. Constable et al. presented global field snapshots in 100 year intervals for the past 3000 years based on PSVMOD1.0, which Korte and Constable improved to a temporally continuous model, CALS3K.1 (Continuous Archeomagnetic and Lake Sediment 3k years model, version 1). For the latter models only directional data, i.e., declination and inclination, were used because no global compilation of archeomagnetic intensity data existed, although CALS3K.1 included a constraint on axial dipole variation.
 Here we present a global data set which was compiled for the purpose of improving model CALS3K.1. We assembled a new global collection of archeomagnetic intensity data, fully described by A. Genevey et al. (A new archeointensity database for the past 10 millennia, manuscript in preparation, 2005; hereinafter referred to as Genevey et al., manuscript in preparation, 2005), and tried to create an internally coherent data set of directional data from both existing databases and literature. A priori smoothing, or reduction of regionally distributed data to one location, as in the data sets by Daly and Le Goff and by Lund and Constable (manuscript in preparation, 2005), is avoided here to allow maximum spatial and temporal resolution of the global model. The global data distribution in time and space is presented, measurement errors and dating uncertainties are discussed, and finally all the data are compared with CALS3K.1 model predictions.
 While we made every effort to include all data that we were aware of and could obtain from any source at the time of starting our modeling efforts, we might have missed some data and new data are constantly being produced by colleagues from all over the world. We encourage the reader to notify us about any data that we might have missed so that we can update the data set presented here in the future.
2. General Data Sources and Treatment
 The collection consists of two substantially different kinds of data: lake sediment paleomagnetic data and archeomagnetic data (including lava flows). First, the two kinds of data differ in how their magnetization was acquired: detrital remanent magnetization (DRM) for sediments and thermal remanent magnetization (TRM) for archeomagnetic materials and lavas. Second, lake sediment data provide time series for a specific location, while archeomagnetic data are mostly spatially scattered individual results. Only rarely are archaeological artifacts of enough different ages available at one location to give a secular variation series, as in the case of a German bread-oven with several successive floors spanning 450 years [Schnepp et al., 2003a]. Both sediments and archaeological artifacts can provide directional data, i.e., inclination and declination. However, while intensity results exist for a large number of archeomagnetic sites, for lake sediments only relative intensities can be determined. Only absolute intensities have been included in this compilation in order to avoid calibration biases. For lake sediments, two or more long cores are often taken from one lake. The results generally are stacked and frequently smoothed and plotted as a continuous curve. Our data set is not homogeneous in this respect. It contains unsmoothed stacks as well as smoothed data, depending on the data source. We always tried to use the original, unsmoothed data when available, but several of the sediment records were digitized from published curves. In those cases not even the original sampling interval is preserved.
 Uncertainties in archeomagnetic and paleomagnetic results are much higher than in direct field measurements and have several sources, which have been discussed in detail by Lanos et al. To obtain declination values, the azimuthal orientation of the sample has to be known. With sediment cores in particular this often is not the case. Sediment declination records therefore are often not oriented, or are simply oriented to give an average of zero over the time span they cover. This clearly is not a satisfactory assumption, as the average magnetic field declination is not zero over an arbitrary time span. To assess orientation problems in declination records, we compared all sediment time series to CALS3K.1 model predictions [Korte and Constable, 2003] and, where a clear offset existed, reoriented the data series by comparing the averages of data and model prediction over their overlapping time spans. It turned out, however, that only the three records of Lakes Eacham, Barrine (both Australia) and Biwa (Japan) were unoriented and had to be adjusted in this way. The angles by which data series were rotated are given in the individual descriptions below. Inclination errors in sediments arise if a core is not taken exactly vertically, or through inclination flattening for certain sediment types or grain shapes. As these are systematic errors similar to declination orientation errors, they should also be recognizable in a comparison to model predictions. We did not find evidence for such inclination errors in any of the time series used.
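The reorientation of an unoriented declination record described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually coded for this study: the function names are hypothetical, the data and model declinations are assumed to be sampled at the same epochs within the overlapping time span, and a circular mean is used so that values near ±180° average correctly.

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction of a list of angles in degrees, in (-180, 180]."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c))

def wrap180(angle_deg):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0

def declination_offset(data_decl, model_decl):
    """Rotation to add to the data so that its mean matches the model mean.

    Both lists are assumed (hypothetically) to cover the same epochs.
    """
    return wrap180(circular_mean_deg(model_decl) - circular_mean_deg(data_decl))

def reorient(data_decl, offset_deg):
    """Apply a constant rotation to every declination value."""
    return [wrap180(d + offset_deg) for d in data_decl]

# Illustrative values only (not taken from any record in the compilation).
model = [4.0, 1.5, -2.0, -6.5, -3.0]
data = [m - 37.4 for m in model]          # record that reads 37.4 deg too low
offset = declination_offset(data, model)
corrected = reorient(data, offset)
```

A single constant rotation is the only correction applied here, so the shape of the declination curve is left untouched and only its zero level changes.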
 Determining the age of archeomagnetic and paleomagnetic samples is another source of uncertainty. Very good dating can be achieved for those archeomagnetic samples for which the age is well determined by archeology. For some (archeological or volcanic) sites directly related to historical events the age can be accurate to within one or a few years. In most cases, the age uncertainties range from a few decades to a few centuries, depending on the area and on how well its archaeological history is known. For sediments, but also for some archeomagnetic samples, radiocarbon (14C) dating is commonly used. Some sediments are varved and can be dated by varve counting, which is more accurate and contains fewer sources of uncertainty than radiocarbon dating.
 Several sources of uncertainty are inherent to radiocarbon dating. Radiocarbon ages are not equal to calendar ages; they have to be calibrated. Calibration curves have been compiled by different authors; the most commonly used ones are by Clark and, more recently, by Stuiver and Reimer, Stuiver et al. [1998a], and Stuiver et al. [1998b]. For a global data set it is clearly desirable to use consistent calibration on all data. We did not completely achieve this. For some of the data published before the improved calibration curve of Stuiver and Reimer, only calibrated ages were available and it was impossible to recalibrate those. Wherever radiocarbon ages were available that had not been calibrated by the more recent method, we calibrated those data with the program CALIB by Stuiver and Reimer, version 4.3, which is based on the 1998 international calibration data sets by Stuiver et al. [1998a] and Stuiver et al. [1998b]. We always used the following parameter settings of this program: 1998 atmospheric decadal variation curve, smoothed with a moving average over 5 points, which corresponds to 50 years, probability method, and no further corrections or assumptions. The method used generates a probability distribution of calibrated ages compatible with the radiocarbon age and its Gaussian age distribution. For details, see Stuiver and Reimer or the CALIB manual on their Web page (http://depts.washington.edu/qil/calib/). Radiocarbon age calibration was necessary for most of the lake sediment data.
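The idea behind a probability-method calibration can be illustrated with a short sketch. It is not the CALIB program and does not reproduce its interface or its smoothing step. It assumes the usual Gaussian comparison between the measured radiocarbon age and a calibration curve, and the function name and the synthetic curve below are hypothetical stand-ins for a real calibration data set.

```python
import math

def calibrate_probability(c14_age, c14_sigma, curve):
    """Return [(calendar_age, probability), ...] normalized to sum to 1.

    curve: list of (calendar_age, curve_c14_age, curve_sigma) tuples.
    Each calendar age is weighted by a Gaussian in the difference between
    the measured radiocarbon age and the curve value at that calendar age.
    """
    weights = []
    for _cal, mu, sigma_curve in curve:
        var = c14_sigma ** 2 + sigma_curve ** 2   # combined variance
        weights.append(math.exp(-0.5 * (c14_age - mu) ** 2 / var) / math.sqrt(var))
    total = sum(weights)
    return [(cal, w / total) for (cal, _mu, _s), w in zip(curve, weights)]

# Hypothetical toy curve (NOT real calibration data): calendar ages in years BP,
# with artificial wiggles so that the mapping is nonlinear.
toy_curve = [(cal, 0.9 * cal + 150.0 + 40.0 * math.sin(cal / 80.0), 15.0)
             for cal in range(2000, 3001, 10)]

posterior = calibrate_probability(2400.0, 50.0, toy_curve)
most_likely_age = max(posterior, key=lambda pair: pair[1])[0]
```

Because the result is a distribution rather than a single number, a calibrated age and its uncertainty have to be read off from it, for example as the mode or median and a probability interval.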
 One further problem is that radiocarbon dates are originally only available for a few tie-points along the depth of the sediment core. An interpolation is necessary to obtain ages for the whole secular variation curve. The easiest method is a linear interpolation, but methods like interpolation by splines and using additional age information like, e.g., correlation to other cores or pollen ages are also used. An assumption for linear interpolation is a constant sedimentation rate. However, as the relation between radiocarbon ages and calendar ages is not linear, the tie point ages should be calibrated and the interpolation done with the calendar ages. Instead, it is common practice to interpolate the radiocarbon ages first, one reason being that sometimes no calibration is done and only secular variation curves against radiocarbon ages are published. The difference will probably be negligible with respect to all other factors of uncertainty in radiocarbon dating, in particular as the assumption of a constant sedimentation rate over centuries itself is not usually valid. We mention it here, however, to emphasize the point that it is almost impossible to obtain truly consistent ages from different samples or sediment cores by radiocarbon dating. We have to keep this problem in mind when comparing different records and interpreting features of global models. For a while we thought about trying to redate all available lake sediments by going back to the original data versus depth, calibrating the radiocarbon dated tie-point ages and doing linear or spline interpolations. Very soon we understood, however, that this would be a major project of its own for several reasons. First, the necessary information can be hard to obtain for older studies. Often secular variation curves are published only either against depth or against age and tie point depths and ages are not always given in the literature. 
Moreover, for several of the sediment cores adjustments have been made using additional information such as pollen ages or correlation. As long as these adjustments are not based on correlation of paleosecular variation features they are likely to be an improvement. Correlation of paleosecular variation features can be misleading if done over large distances, because the actual magnetic field can be quite variable over distances as short as a few hundred km. For global field modeling we seek data with independent age control, a requirement that is violated if adjustments are made on the basis of correlation in paleosecular variation features. We leave the task of obtaining a more consistent data set of lake sediments to further studies and hope that the assumptions made by the original authors about dating their data are sufficiently similar that our global compilation is reasonably consistent within the dating uncertainties, which can be quite large. The discussion about those uncertainties and how we assessed them for global modeling is given in section 5. First we will describe the individual data and sources in more detail.
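The point made above about the order of interpolation and calibration can be illustrated with a small sketch. It assumes a piecewise-linear age-depth model between tie points. The function names, the tie-point values, and in particular the nonlinear `toy_calibrate` mapping are hypothetical and stand in for a real calibration curve.

```python
from bisect import bisect_right

def interpolate_age(depth, tie_depths, tie_ages):
    """Piecewise-linear age at `depth`; tie_depths must be strictly ascending."""
    if not tie_depths[0] <= depth <= tie_depths[-1]:
        raise ValueError("depth outside the tie-point range")
    # Index of the tie point at the lower end of the enclosing segment is j - 1.
    j = min(bisect_right(tie_depths, depth), len(tie_depths) - 1)
    d0, d1 = tie_depths[j - 1], tie_depths[j]
    a0, a1 = tie_ages[j - 1], tie_ages[j]
    return a0 + (a1 - a0) * (depth - d0) / (d1 - d0)

def toy_calibrate(c14_age):
    """Hypothetical nonlinear radiocarbon-to-calendar mapping (NOT a real curve)."""
    return c14_age + 1.0e-5 * c14_age ** 2

tie_depths = [0.0, 100.0, 200.0]      # cm, hypothetical
tie_c14 = [0.0, 1000.0, 3000.0]       # radiocarbon years, hypothetical

# Common practice: interpolate radiocarbon ages first, then calibrate.
interp_then_cal = toy_calibrate(interpolate_age(50.0, tie_depths, tie_c14))

# Order argued for in the text: calibrate tie points, then interpolate.
cal_then_interp = interpolate_age(
    50.0, tie_depths, [toy_calibrate(a) for a in tie_c14])
```

With these made-up numbers the two orders differ by a few years at 50 cm depth, which is small compared with typical radiocarbon uncertainties, in line with the remark that the difference is probably negligible.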
 There are two databases with directional data for recent millennia available at World Data Center Boulder (http://www.ngdc.noaa.gov/seg/geomag/): the paleosecular variation database Secvr00 [McElhinny and Lock, 1996] and the archeomagnetic directional database Archeo00 by Don Tarling, University of Southampton, UK. Two others, the paleosecular variation database Psvrl00 [McElhinny and McFadden, 1997] and the paleointensity database Pint00 [Perrin et al., 1998] mainly contain data much older than the time span we are interested in. All of our archeointensity data therefore were collected from the original literature or directly from the authors and put together in a MS Excel sheets collection (Genevey et al., manuscript in preparation, 2005). Although for some archeomagnetic samples both direction and intensity measurements exist, we treat directions and intensity separately here. Generally if directional measurements were made on archeomagnetic samples or lake sediments both inclination and declination results are available, with some exceptions which are given in the following description.
 The Secvr00 database contains 14 regional records, which are mostly composite secular variation curves of archeomagnetic data, all compiled by Daly and Le Goff , but in four cases are a combination of results from sediments of different lakes. Composite archeomagnetic curves for one location are obtained by reducing data from different locations to an average location for that region. The reduction only takes into account the axial dipole part of the field and therefore introduces additional errors in the reduced results. For global modeling we do not need time series for specific locations, but can deal with data which are scattered in both space and time. We therefore prefer to use the original data and did not consider any of these composite curves. Many of those original data were present in the Archeo00 database. Likewise, we did not use the combined lake sediment results as all the individual lake records were also available.
3. Detailed Data Description by Region
 In this section we describe the lake sediment time series, and archeomagnetic directional data and intensity data grouped by continent or region and give their references. More detailed information about the compilation of intensity data is given by Genevey et al. (manuscript in preparation, 2005). The data files described below are available in the EarthRef Digital Archive (ERDA), with the assignment of references to all individual data (http://earthref.org/cgi-bin/erda.cgi?n=331). Table 1 summarizes name, geographic location, our abbreviation code and time span of data series for the lake sediments.
In Age Range, >5000 BC means the time series goes further back than the time we are interested in. In Nr. of Data only data since 5000 BC are considered, and declination and inclination of one age and location are counted as one data point.
Table 1 (body incomplete in this copy). Each line below is one surviving age-range cell, shown together with the lake name or location cell that immediately precedes it in the extraction, where one was preserved. The remaining names, locations, codes, and numbers of data are missing.
350 BC – 1810 AD
525 AD – 1675 AD
2110 BC – 1445 AD
>5000 BC – 1358 AD
lake Barombi Mbo: >5000 BC – 616 AD
North Queensland, Australia: >5000 BC – 863 AD
3280 BC – 1950 AD
2447 BC – 1927 AD
>5000 BC – 1683 AD
Lac du Bourget: 250 AD – 1930 AD
Western Victoria, Australia: >5000 BC – 1644 AD
>5000 BC – 1349 AD
>5000 BC – 1662 AD
North Queensland, Australia: 3815 BC – 1445 AD
>5000 BC – 1453 AD
>5000 BC – 1799 AD
>5000 BC – 1839 AD
Western Victoria, Australia: >5000 BC – 770 AD
Great Lakes, USA: >5000 BC – 1457 AD
Western Victoria, Australia: >5000 BC – 1850 AD
>5000 BC – 2293 BC
>5000 BC – 1664 AD
2773 BC – 1830 AD
>5000 BC – 1839 AD
British Columbia, Canada: 3529 BC – 1861 AD
>5000 BC – 1826 AD
110 BC – 1590 AD
>5000 BC – 1393 AD
>5000 BC – 1380 AD
1291 BC – 1950 AD
North Island, New Zealand: 637 BC – 1734 AD
Lake St. Croix: >5000 BC – 1868 AD
Great Lakes, USA: >5000 BC – 1854 AD
4687 BC – 1899 AD
743 BC – 1969 AD
Laguna el Trebol: 3967 BC – 1567 AD
4837 BC – 1377 AD
4770 BC – 1900 AD
539 BC – 1877 AD
2487 BC – 1855 AD
>5000 BC – 1837 AD
Northern England, UK: >5000 BC – 614 AD
 In the corresponding directional data tables the number of data is not the sum of individual declination and inclination results, but if both directions are present for one time and location they are counted as one so that the number of data in combination with the age range gives an idea of the temporal density. Note that the numbers of data points per century vary considerably between the records depending on original sampling rate or digitizing rate. Table 2 gives comparable information for archeomagnetic directional data grouped by country or region and Table 3 summarizes that information for the intensity data. The codes for the intensity data are two letter abbreviations reflecting general location with “I” for intensity as third letter.
 The distribution of all the archeomagnetic data is strongly dominated by European data, while for other regions, particularly the whole southern hemisphere, data are scarce. Consequently, the region files in which the data are grouped do not cover equal areas. For example, we chose to have individual files for several parts of Europe, while there is only one file for both Australia and New Zealand, in order to find a compromise between the geographic area covered and the amount of data in one file. We want the number of data in one file to be neither too large nor too small for easy visual comparison. Figure 1 shows the locations of the lakes and average locations of the archeomagnetic regions given in the tables, with contour plots of the concentration of data according to location.
3.1. North and Meso-America
 Seven lake sediment records from the Secvr00 database cover the last millennia and were or could be dated, making them useful for our collection. They come from Fish Lake [Verosub et al., 1986], Mara Lake [Turner, 1987], Lake Huron [Mothersill, 1981], Lake Superior [Mothersill, 1979], Lake St. Croix [Lund and Banerjee, 1985], Kylen Lake [Lund and Banerjee, 1985], and a combination named Minnesota lakes [Creer and Tucholka, 1982]. The latter is in fact a combination of Kylen Lake and Lake St. Croix records and therefore has not been considered further. The individual records are available and the combination does not contain any independent information. Data for Lake St. Croix are given only against depth, not against (radiocarbon) ages in the database. Lund and Banerjee  document the tie-point depths and radiocarbon ages for this lake. The original radiocarbon ages are obviously much too old and Lund and Banerjee  give arguments why they chose to shift the 14C timescale by 980 years. We used these adjusted ages, interpolated linearly between the tie-points and calibrated this timescale like those of other lake sediments with the CALIB program as described in the previous section. Data from Lake LeBoeuf were digitized from the work of King . One Hawaiian sediment record, Lake Waiau [Peng and King, 1992], is also available in Secvr00.
 Archeomagnetic directional data for Arizona [Sternberg, 1989a], New Mexico, Colorado and Utah, which we will refer to as Southwestern U.S. data, can be found in the Archeo00 database (cited there by personal communication from J. L. Eighmy). Some data from Mexico [Eighmy and Sternberg, 1990] are also in that database. We added data from a Mexican stalagmite by Latham et al. Compilations of Northwest American and Hawaiian lava flow data have been added from tables given by Hagstrum and Champion and Hagstrum and Champion, respectively. For the latter, only the virtual geomagnetic pole positions were listed, not the actual directions from which they were obtained. The directions were calculated by inverting the calculation of the virtual geomagnetic pole position from sample position and geomagnetic direction, using the equations given in paleomagnetic textbooks [e.g., Merrill et al., 1996]. The data set for Arkansas has been published by Wolfman. In the Archeo00 database there are also a few data from Guatemala and El Salvador [Eighmy and Sternberg, 1990]. Together with the Mexican data we refer to them as the region of Meso-America. A few data from Martinique [Genevey et al., 2002] were added.
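Recovering a direction from a virtual geomagnetic pole (VGP), as was done for the Hawaiian lava flow data, can be sketched as below. This is an illustration based on the standard textbook relations for a geocentric dipole (tan I = 2 cot p, with p the angular distance from site to pole, and declination taken as the azimuth of the great circle from site to pole). The function name and the example coordinates are hypothetical.

```python
import math

def direction_from_vgp(site_lat, site_lon, pole_lat, pole_lon):
    """Return (declination, inclination) in degrees at a site for a given VGP.

    All inputs are in degrees; declination is returned in (-180, 180],
    positive eastward.
    """
    ls, lp = math.radians(site_lat), math.radians(pole_lat)
    dlon = math.radians(pole_lon - site_lon)

    # Cosine of the magnetic colatitude p (angular distance site -> pole).
    cos_p = (math.sin(ls) * math.sin(lp)
             + math.cos(ls) * math.cos(lp) * math.cos(dlon))
    cos_p = max(-1.0, min(1.0, cos_p))
    sin_p = math.sqrt(1.0 - cos_p * cos_p)

    # Dipole formula tan I = 2 cot p, written with atan2 to handle p = 0 or 180.
    inclination = math.degrees(math.atan2(2.0 * cos_p, sin_p))

    # Declination is the initial bearing of the great circle from site to pole.
    declination = math.degrees(math.atan2(
        math.sin(dlon) * math.cos(lp),
        math.cos(ls) * math.sin(lp)
        - math.sin(ls) * math.cos(lp) * math.cos(dlon)))
    return declination, inclination

# Hypothetical example: a site on the equator and a pole tilted 10 degrees
# away from the rotation axis toward 90 degrees east.
decl, incl = direction_from_vgp(0.0, 0.0, 80.0, 90.0)
```

For a pole at the geographic north pole the function reduces to the axial dipole case, with zero declination and tan I = 2 tan(latitude).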
 Directional data for South America are sparse. In the Secvr00 database there are time series from three Argentinian lakes: Brazo Campanario, Lago Morenito and Laguna el Trebol [Creer et al., 1983]. Data from Palmer Deep, Antarctic Peninsula, come from a recent publication by Brachfeld et al. At such a high southern latitude the declination data from that site are of dubious quality and probably should be rejected for global modeling.
 There are no directional archeomagnetic data from South America.
 Europe is the region most densely covered with data. Secvr00 contains sediment records from three British, three French, and three Greek lakes and one German lake. Those are Loch Lomond [Turner and Thomson, 1979], Lake Windermere and Llynn Geirionydd [Turner and Thomson, 1981], Lac Morat, Lac d'Annecy and Lac du Bourget [Hogg, 1978], Lakes Volvi, Trikhonis and Begoritis [Creer et al., 1981] and Meerfelder Maar [Brown, 1981]. For the French and Greek lakes by Hogg and Creer et al., the problem is that only calibrated ages are given in the database, and we were not able to gather sufficient information (depths, tie point radiocarbon ages) to recalibrate them with the more recent calibration curve. We decided to keep them anyway, as comparisons on other sediment records, for which calibrations by both the Clark and the Stuiver et al. [1998a] curve were available, led us to estimate that the difference in calibration generally was less than 100 years. Vukonjärvi [Huttunen and Stober, 1980] and Pohjajärvi [Saarinen, 1998], which is dated by varve counting, are records from two Finnish lakes. The former has been digitized from the original publication, while the latter has been obtained by personal communication directly from the author. For Vukonjärvi an age scale had been developed from pollen ages. On the basis of the resolution of this age model and the sedimentation rate, the age errors were estimated as 50 years. Data for three sediment cores from Vatndalsvatn in Iceland have been digitized from Thompson and Turner. We used only the data from cores VDVS2 and VDVS3, for which radiocarbon ages were available. On the basis of the table of Thompson and Turner, an age model of linear sections was developed. The data from each core were dated and calibrated individually, and then the two cores were stacked.
 Extremely few data exist from Africa. Data for Lake Barombi Mbo in Cameroun come from Thouveny and Williamson . Data from Lake Victoria, Uganda, were digitized from Mothersill . For both lakes the radiocarbon ages were calibrated as described in section 2. An inclination only record of Lake Turkana, Kenya was compiled by Lund and Constable (manuscript in preparation, 2005) from Barton and Torgersen  and unpublished work of J. King. As we did not want to use the 100 year interval data of Lund and Constable (manuscript in preparation, 2005), we redated the data series by interpolating the tie-point radiocarbon ages and calibrating them.
 A few data from Morocco [Kovacheva, 1984; Najid, 1986] can be found in the Archeo00 database. A few data from the Canary Islands by Soler et al.  found in the Archeo00 database have been included in the African regional file.
 For intensity from Africa we only know of a small number of data from Tunisia [Thellier and Thellier, 1959], which have been added to the Southern Europe file.
 Secvr00 contains records of several Australian lakes: the two north Australian and closely adjacent Lakes Barrine and Eacham [Constable and McElhinny, 1985] and Lakes Keilambete, Bullenmerri and Gnotuk [Barton and McElhinny, 1981] in southern Australia and also in close vicinity of each other. Declinations from Lake Barrine and Lake Eacham are not oriented and proved to be too low in comparison to CALS3K.1. They were adjusted by +37.4° and +10.0°, respectively. A record from Lake Pounui, New Zealand by Turner and Lillis  also is available in the database.
 A few Australian archeomagnetic directions are published by Barbetti . Calibration of the radiocarbon ages given there was done using CALIB in the same way as for the lake sediments. Locations for the sites were read from the map published by Barbetti and Polach . Mostly the site averages given by Barbetti  were used, but only if they truly came from one site with one date. In some cases composite site averages for two or three sites with nearly the same age are given. We calculated averages for the individual sites for those. A few data points were digitized from Barbetti . Some archeomagnetic directions from New Zealand were added from Robertson .
 A few statistics about the number of data between 5000 BC and 1950 AD are tabulated in Table 4. Sediment data dominate archeomagnetic data by more than a factor of two. More than four times as many directional data per component as intensity data could be compiled, the reason being the lack of absolute intensity information from sediments. The southern hemisphere is poorly represented in particular in all archeomagnetic data, both directional and intensity.
Table 4. Numbers of Compiled Data Between 5000 BC and 1950 AD
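[Table 4 rows, under the heading "Component and Hemisphere": Inclination, N hemisphere; Inclination, S hemisphere; Declination, N hemisphere; Declination, S hemisphere; Intensity, N hemisphere; Intensity, S hemisphere. The other column headings and all numerical values are missing from this copy.]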
 It is obvious that the data are distributed evenly neither in space nor in time. The numbers of data in Tables 1 to 3 together with the time spans give a general idea of temporal distribution, and the figures in section 6 will show this for the individual files. To obtain a clearer picture of global coverage through time, histograms of the number of data in 100 year bins are shown for the individual continents in Figure 2. The increase of archeomagnetic data with time is obvious. Even more striking is the imbalance between very scarce southern hemisphere data and northern hemisphere data, with a particularly high concentration in the European region (note the different scales of the y axes). A comparable plot for the intensity data is given by Genevey et al. (manuscript in preparation, 2005).
Figure 3 gives more specific information about how the spatial distribution of observations varies with time. Each panel shows the log of the concentration C(λ, ϕ) of data at latitude λ and longitude ϕ: in effect a kind of density function for observations. The concentration is calculated by representing each datum location by a Fisherian probability density function centered on the data location (λi′, ϕi′) with precision parameter κ = 1000, corresponding to an angular standard deviation of about 2.6 degrees. If θi measures the angular deviation from the data position (λi′, ϕi′) at the location (λ, ϕ), then the contribution to the data concentration is f(θi). See Merrill et al. for a description of the Fisher distribution. Thus for N data within a 500 year interval the total data concentration is given by
$$C(\lambda, \phi) = \sum_{i=1}^{N} f(\theta_i).$$
It is readily seen that (since each individual density function integrated over the sphere returns a value of 1) the integral of the function C over the whole of the Earth's surface returns the total number of data points within each 500 year interval. Consequently the differences among the panels in Figure 3, each of which shows log10C(λ, ϕ) for the specified time interval, also reflect the changes in number of data with time.
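The data concentration described above can be sketched numerically as follows. The sketch assumes the standard form of the Fisher density per unit solid angle, f(θ) = κ/(4π sinh κ) exp(κ cos θ), rewritten so that it does not overflow for κ = 1000. The function names and the example locations are hypothetical.

```python
import math

def fisher_density(theta, kappa=1000.0):
    """Fisher probability density per steradian at angular distance theta (rad).

    Algebraically equal to kappa / (4 pi sinh kappa) * exp(kappa cos theta),
    but written with exp(kappa (cos theta - 1)) to avoid overflow.
    """
    norm = kappa / (2.0 * math.pi * (1.0 - math.exp(-2.0 * kappa)))
    return norm * math.exp(kappa * (math.cos(theta) - 1.0))

def angular_distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in radians between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlon)
    return math.acos(max(-1.0, min(1.0, c)))

def concentration(lat, lon, data_locations, kappa=1000.0):
    """Sum of Fisher densities of all data locations, evaluated at (lat, lon)."""
    return sum(fisher_density(angular_distance(lat, lon, la, lo), kappa)
               for la, lo in data_locations)

# Hypothetical data locations (latitude, longitude) within one time interval.
sites = [(48.0, 2.0), (48.5, 2.5), (-37.0, 145.0)]
log10_c = math.log10(concentration(48.2, 2.2, sites))
```

Since each density integrates to one over the sphere, integrating `concentration` over the whole surface would return the number of entries in `sites`, which is the property used in the text to interpret the panels.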
5. Uncertainty Estimates
 In global modeling the reliability of individual data is taken into account by weighting the data according to their uncertainty estimates. Magnitude and internal consistency of uncertainty estimates therefore are an important issue for the global data set. The original error estimates from the individual data are not consistent.
 Archeomagnetic and lava flow data often are site mean results averaged from several individual sample measurements. Values for the 95 per cent confidence cone about the mean direction, α95, are often given for directional site results. Whenever α95 was given we calculated one standard deviation errors for declination and inclination in the following way [see, e.g., Piper, 1989]: for N > 4, α95 can be approximated as
$$\alpha_{95} \approx \frac{140^\circ}{\sqrt{kN}}.$$
The precision estimator k is given by
$$k = \frac{N-1}{N-R},$$
with N the number of samples averaged in the site result and R being the magnitude of the site result vector. Uncertainties of site mean inclination and declination, δI and δD, are related to α95 by
$$\delta I = \alpha_{95}, \qquad \delta D = \frac{\alpha_{95}}{\cos I}.$$
However, α95 does not correspond to one standard deviation (σ) of a normal distribution; α63 does. As
$$\alpha_{63} \approx \frac{81^\circ}{\sqrt{kN}},$$
we have to rescale the uncertainties and obtain σ uncertainties by
$$\sigma_I = \frac{81}{140}\,\alpha_{95}, \qquad \sigma_D = \frac{81}{140}\,\frac{\alpha_{95}}{\cos I}.$$
For some sites, however, only one sample has been measured or no α95 uncertainty is provided by the authors for other reasons. Moreover, α95 frequently is calculated using the number of specimens, which often come from a much smaller number of independent samples [Schnepp et al., 2003b], so that α95 is underestimated. We therefore will utilize minimum uncertainty estimates for global modeling that were assigned to all archeomagnetic results whose original uncertainty estimates were smaller than these fixed values. We adopt the previously used values from Constable et al.  and Korte and Constable  of 2.5° for inclination and 3.5° for declination. Moreover, with one exception none of the lake sediment data in the Secvr00 database, nor digitized data or data provided in digital form directly by the original authors came with uncertainty estimates. Here we also adopted 3.5° for inclination and 5.0° for declination as used by Constable et al.  and Korte and Constable  for lake sediments. All of these estimates were originally justified by Constable et al.  as result of a comparison between archeomagnetic and lake sediment data to UFM [Bloxham and Jackson, 1992] (predecessor of GUFM by Jackson et al. ) for overlapping times. UFM/GUFM are models for the past centuries based on historical and recent direct measurements of the field, i.e., models totally independent of our data set and of higher accuracy. The estimates prove to be the right order of magnitude and were judged to be too small rather than too large by the modeling attempts of Constable et al.  and Korte and Constable . The only exception where uncertainty estimates were given for sedimentary data are the Palmer Deep (PAD) data by Brachfeld et al. . The maximum angular deviations (MAD) from principal component analysis are available, and we used those as error estimates for inclination together with a minimum of our 3.5°, used for all other sediments. 
This choice is important in this case because there are a few data with very high MAD values around 4000 BC that would suggest an excursion if they were reliable.
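The minimum uncertainty rule described above can be illustrated with a minimal Python sketch. The floor values are those quoted in the text; the function name, the dictionary layout, and the treatment of a missing uncertainty (passed as `None` and replaced by the floor) are assumptions made for illustration, not part of the original processing.

```python
# Minimum uncertainty floors in degrees, as quoted in the text.
MIN_SIGMA_DEG = {
    ("archeomagnetic", "inclination"): 2.5,
    ("archeomagnetic", "declination"): 3.5,
    ("sediment", "inclination"): 3.5,
    ("sediment", "declination"): 5.0,
}


def apply_minimum_uncertainty(sigma_deg, source, component):
    """Return the uncertainty to use for modeling.

    sigma_deg is the original uncertainty in degrees, or None if the
    source gave none (ASSUMPTION: a missing value receives the floor).
    """
    floor = MIN_SIGMA_DEG[(source, component)]
    if sigma_deg is None:
        return floor
    # Keep the original estimate unless it is smaller than the floor.
    return max(sigma_deg, floor)
```

For example, an archeomagnetic inclination with an original uncertainty of 1.2° would be raised to 2.5°, whereas a Palmer Deep inclination with a MAD of 12° would keep that larger value.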
 Archeointensity data are more difficult to determine reliably than directional information because of the diversity of the experimental protocols used to derive the results. Results obtained from thermal methods dominate our new intensity data compilation (about 90% were obtained using classical thermal methods derived from the procedures of Thellier and Thellier or Wilson). In detail, the procedures employed differ notably between authors. For instance, tests to detect magnetic alteration were performed in some studies but not in others. Effects due to thermoremanent magnetization (TRM) anisotropy and the cooling rate dependence of TRM acquisition were also not systematically taken into account. In addition, the number of samples studied per site, and even the definition of an intensity site, vary between authors. The reliability of the available results is therefore clearly not uniform. The original archeointensity data errors range from less than 1% to 20%, but errors of less than 5% seem unlikely given the reproducibility of measurements. Comparisons of 19th and 20th century results with GUFM [Jackson et al., 2000] also suggest that a minimum uncertainty estimate of 5 to 10% is more appropriate. Moreover, we need consistent uncertainty estimates to obtain a consistent global model and so cannot depend only on the original data errors to rank the data. Instead, we assign uncertainties in categories according to several criteria. These criteria, which will be fully described by Genevey et al. (manuscript in preparation, 2005), rely on the method used (Thermal, Original Shaw, Modified Shaw, Microwave, etc.), the original relative dispersion of the mean, and several further parameters: whether tests for alteration were performed, whether the TRM anisotropy effect was evaluated, the number of samples per site, and the number of subsamples analyzed per fragment. As a result, three categories were defined. 
The first category corresponds to the data judged to be more reliable. A relative dispersion between 6 and 10% was given to these data. The second category comprises those data that fulfill all our selection criteria for being ranked as highly reliable, except that their original dispersion is too high (greater than 10%). We keep the original dispersion of these results to weight them. The third category corresponds to the data judged to be less reliable. We attribute a relative dispersion of 20% to these data when their original dispersion is less than 20%; otherwise we keep the original dispersion.
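As a rough illustration, the three intensity categories could be mapped to relative uncertainties as in the following Python sketch. The rules for categories 2 and 3 follow the text directly. How a value between 6 and 10% is chosen for category 1 is not specified here (the criteria are deferred to Genevey et al.), so clamping the original dispersion into that interval is purely an assumption, as is the function name.

```python
def intensity_relative_uncertainty(category, original_pct):
    """Relative intensity uncertainty (percent) assigned per category.

    category: 1 (more reliable), 2 (reliable criteria met, but original
    dispersion above 10%), or 3 (less reliable).
    original_pct: original relative dispersion in percent.
    """
    if category == 1:
        # ASSUMPTION: clamp the original dispersion into [6, 10] percent;
        # the text only states that a value in this range was given.
        return min(max(original_pct, 6.0), 10.0)
    if category == 2:
        # Original dispersion is kept.
        return original_pct
    if category == 3:
        # 20% unless the original dispersion is already larger.
        return max(original_pct, 20.0)
    raise ValueError("category must be 1, 2, or 3")
```

Under these assumptions, a category 3 result with an original dispersion of 12% would be assigned 20%, while one with 25% would keep 25%.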
 Age uncertainties, as discussed above, can be large, and they are not provided in all of the original studies. We used the errors from the original literature wherever possible. When we (re)calibrated 14C ages with the method by Stuiver and Reimer, we obtained age uncertainties in the following way. If no uncertainties were available for the 14C ages, we assigned values of 50 years to them as input for the calibration. The probability method of CALIB version 4.3 produces the median probability calendar age and lists the one standard deviation ranges around the calendar ages, with the respective probabilities, on which this median probability age is based. We estimated a simplified uncertainty from this by taking the difference between the highest and lowest age of all the ranges and dividing it by two times the number of ranges. We did not take into account the probabilities given for the individual ranges. This applies mostly to the lake sediments. For those lake sediment data where we had only calibrated ages, we assigned uncertainties of 25 years for data calibrated by modern methods and 50 years for the few European ones calibrated with the older curve by Clark. The value of 25 years is slightly smaller than the average estimate resulting from our calibrations, but it mostly falls into the same category in our scheme for treating age uncertainties as part of the measurement errors, described below. Pohjajärvi was dated by varve counting. We believe this method to be of higher accuracy than radiocarbon dating and assigned age uncertainties of only 5 years to all of the ages. An additional problem with sediment data is that the sediments might carry a PDRM (postdepositional remanent magnetization), which means that the field is recorded with some delay (lock-in time) with respect to the sediment deposition, making the magnetization age somewhat younger than the sediment age. 
Estimates of such a lock-in time can be on the order of several tens to hundreds of years [e.g., Stockhausen, 1998]. To maintain independent age control, we did not try to adjust any sediment records for PDRM age effects a priori but waited to see how well our first global models based on all the data fit the different time series. When no age uncertainties were given for archeomagnetic data, we assigned values in four categories depending on the age and on the accuracy of the given age. Arguing that the ages of older artifacts or lava flows will mostly not be as well known as those of younger ones, we assigned 250 years of uncertainty to all results older than 0 AD. For younger results, we assigned uncertainties of 100 years, 50 years, or 10 years if the accuracy of the given age seemed to be a century, a decade, or one year, respectively. For the archeointensity data this is a slight modification of the first uncertainty estimates assigned by Genevey et al. (manuscript in preparation, 2005), where the categories were defined purely by the age of the samples if no uncertainties were given by the authors. The reason for this adjustment was that our dating error estimates needed to be coherent with the many existing uncertainty estimates or age range guesses from the original authors. We recognize, however, that they might be overoptimistic in some cases. For our current global modeling approach it is the internal coherence of the uncertainty estimates that is most important.
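The two age uncertainty rules above can be sketched in Python as follows. The arithmetic follows the text: the spread of all one standard deviation ranges divided by twice the number of ranges, and the four default categories for archeomagnetic data. The function names, the representation of ranges as `(start, end)` tuples, the use of negative years for BC, and the `precision` labels are assumptions for illustration only.

```python
def simplified_age_uncertainty(ranges):
    """Simplified uncertainty (years) from calibrated one-sigma ranges.

    ranges: non-empty list of (start, end) calendar-age ranges in years.
    The probabilities attached to the individual ranges are ignored,
    as stated in the text.
    """
    if not ranges:
        raise ValueError("at least one calibrated range is required")
    lowest = min(min(r) for r in ranges)
    highest = max(max(r) for r in ranges)
    return (highest - lowest) / (2 * len(ranges))


def default_archeomagnetic_age_uncertainty(age_year, precision):
    """Default age uncertainty (years) when the source gives none.

    age_year: calendar year (ASSUMPTION: negative values denote BC).
    precision: apparent accuracy of the quoted age, one of
    "century", "decade", or "year" (hypothetical labels).
    """
    if age_year < 0:
        # Results older than 0 AD receive 250 years regardless of precision.
        return 250
    return {"century": 100, "decade": 50, "year": 10}[precision]
```

For two calibrated ranges spanning 1000 to 1060 and 1080 to 1120, the simplified uncertainty is (1120 - 1000) / (2 × 2) = 30 years.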
 So far our temporally continuous global modeling method does not allow us to take dating uncertainties directly into account. Instead we increased the data uncertainties in relation to the dating uncertainties, because through secular variation an erroneous age has an effect equivalent to an error in the field component data. These errors are independent of the measurement errors, so they have to be added according to the rule that the square of the overall error is the sum of the squares of the individual errors. We used a simple, nonlinear scheme with four categories, based on average temporal variations of the geomagnetic field. The values are listed in Table 5. The uncertainty of data with dating errors of less than 10 years was not modified, and data with age uncertainties larger than 500 years were rejected. Measurement and dating uncertainties are given individually in the files we provide at EarthRef.
Table 5. Increase of Magnetic Uncertainty Estimates in Categories Depending on Age Uncertainties
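The combination rule for measurement and dating uncertainties can be written as a short Python sketch. The quadrature sum and the two thresholds (no change below 10 years, rejection above 500 years) come from the text. The category values of Table 5 are not reproduced here, so they are supplied by the caller through `dating_increment`, a hypothetical callable; the function name and the use of `None` to mark rejected data are likewise assumptions.

```python
import math


def total_uncertainty(sigma_measurement, age_uncertainty_years, dating_increment):
    """Combine measurement and dating-related uncertainty in quadrature.

    dating_increment: caller-supplied function mapping an age uncertainty
    in years to the additional magnetic uncertainty (the Table 5 values,
    which are NOT hard-coded in this sketch).
    Returns None for data that would be rejected.
    """
    if age_uncertainty_years > 500:
        # Age uncertainties larger than 500 years: datum rejected.
        return None
    if age_uncertainty_years < 10:
        # Dating errors below 10 years: uncertainty left unmodified.
        return sigma_measurement
    extra = dating_increment(age_uncertainty_years)
    # Independent errors add as the square root of the sum of squares.
    return math.hypot(sigma_measurement, extra)
```

With a purely illustrative increment of 4° and a measurement uncertainty of 3°, the combined uncertainty would be 5°.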
 For some of the archeomagnetic results, the publications give only a region or locations plotted on a map instead of an exact location. If the locations could not be determined with an accuracy of 0.1 degrees, the uncertainties of location were estimated and used to increase the uncertainty of the respective magnetic results in a similar way to the age uncertainties. We used linear relations between changes in latitude and longitude and gradients of the magnetic field elements. The factors were determined from average spatial gradients of recent global model predictions and are given in Table 6.
Table 6. Linear Increase of Magnetic Uncertainty Estimates per Degree of Uncertainty in Latitude and/or Longitude
6. Comparison of Compiled Data to CALS3K.1 Model Predictions
 In Figures 4 to 12 we present the declination and inclination time series of the lake sediments, the directions of the archeomagnetic data of our regional files, and the archeomagnetic intensities. Error bars have been plotted for the archeomagnetic data on the basis of our uncertainty estimates. These error bars are the estimated measurement errors, which in some cases are still smaller than the minimum errors we will use for further global studies, and they do not include dating or location uncertainties. As described above, the lake sediment data do not have individual error estimates, and the fixed values of 5° and 3.5° for declination and inclination, respectively, have not been plotted. For the archeomagnetic data grouped in regions we have to keep in mind that the data in these time series do not come from exactly the same locations. Consequently, for regions with a large latitudinal or longitudinal extent, e.g., South and Meso-America (SAI) for intensity data or China and Mongolia (CHA) for directional data, a deviation of data points from a smooth curve might be due to real field gradients between the sites, not data errors. For several of the lake sediments, for example Lake Eacham (EAC, Figure 8) or Vatndalsvatn (VAT, Figure 4), it is obvious that the time series consists of data from more than one core and that these cores are not really coherent. In using the data for global modeling it might be worth experimenting with data from only one of the cores to see whether a better fit of model and data can be obtained from any of them. Closely adjacent lakes do not always show good agreement between their data series, as in the particularly striking case of Lakes Superior (SUP) and Huron (HUR) in North America (Figure 7). Again we did not decide a priori which data we regard as more reliable but include both in their original form in our data compilation.
 Continuous CALS3K.1 model predictions for the sediment or average region locations and GUFM predictions have been added to all the plots for comparison. The smoothed time series of the data set PSVMOD1.0 [Constable et al., 2000; Lund and Constable, manuscript in preparation, 2005], which are the basis for CALS3K.1, have also been added to the plots at locations for which they exist. Note again that for the regional data those might be reduced to a slightly different location than the one for which the model predictions are given. The PSVMOD1.0 curves from both Australia and New Zealand are shown in the Australia (AUS) plot of Figure 11, which includes New Zealand data. Likewise the PSVMOD1.0 curves from Mongolia and PRC are shown in the China and Mongolia (CHA) plot of Figure 10. Reasonable general agreement exists between model predictions and most of the data, confirming the quality of CALS3K.1, which was constructed from a strongly smoothed subset of the data presented here. In some cases the agreement is very good even though no data from that location were used in constructing CALS3K.1, as seen in the declination of Birkat Ram (BIR, Figure 6). On the other hand, the agreement between data and model may look surprisingly bad for locations whose data had been used in constructing CALS3K.1. This is mostly because the PSVMOD1.0 curve and our data do not agree for these locations, as in the example of Fish Lake (FIS, Figure 7). For Fish Lake this discrepancy arises because a temporal adjustment had been applied to the 100-year interval time series of PSVMOD1.0 (Lund and Constable, manuscript in preparation, 2005). This adjustment seems to be based on paleomagnetic comparisons and thus is not independent. Moreover, comparing the declination of Fish Lake with that of Mara Lake (MAR) further north did not seem to justify this compression of the timescale. 
Therefore in this case, as in all other cases where no arguments for adjusting ages existed that were independent of paleomagnetic comparisons, we kept the data on the original timescale. This is also a concern particularly for the Icelandic Vatndalsvatn (VAT) and Finnish Vukonjärvi (VUK). We leave it to our further work on global modeling to experiment with adjustments of timescales based on model fits to the data.
 Good agreement between data and model predictions is obtained for intensity (Figure 12), although no intensity data were used directly in constructing CALS3K.1 and the early epochs of GUFM. This is no surprise but confirms the validity of the theory that the archeomagnetic field can be described completely by the directional information and a scaling factor [Hulot et al., 1997]. The scaling factor in the case of CALS3K.1 was a simplified axial dipole change, for which the global virtual axial dipole moment (VADM) change determined by McElhinny and Senanayake  from a smaller set of intensity data has been used as a template. A closer look reveals that the agreement is worse in the first few centuries of CALS3K.1 (1000 BC to about 500 BC) and that the model predictions are mostly too high as seen clearly in Europe or India (INI, Figure 12). This, however, is consistent with more recent global dipole results by Yang et al. . As already noticed by Korte and Constable , those results suggest that the dipole strength assumption of CALS3K.1 is slightly too high in the model's early centuries. VADMs for the new intensity compilation have been calculated by Genevey et al. (manuscript in preparation, 2005) and will be discussed in more detail there.
 We have compiled a data set of both directional and intensity paleomagnetic and archeomagnetic data from the past 7000 years, consisting of 16,085 values of inclination, 13,080 values of declination, and 3188 values of intensity. The data were compiled from existing databases and original literature. Data series have not been smoothed beyond the final results given in the original literature. All radiocarbon ages have been calibrated in a consistent way with version 4.3 of the CALIB program by Stuiver and Reimer. Ages calibrated by older methods have been recalibrated wherever original 14C ages were available. To maintain independent age scales, adjustments to timescales based on paleomagnetic comparisons have been avoided. Special thought has been given to data uncertainty estimates. Data and dating uncertainties given in the data sources were considered and either adopted or replaced by new estimates in an attempt to obtain consistent uncertainty estimates for the whole data set.
 Data distribution is significantly inhomogeneous in time and space, but seems to be sufficient for global modeling attempts. In a subsequent paper we will use this data set for improving CALS3K.1 and developing a CALS7K model for the past 7000 years [see Korte and Constable, 2005]. As we might have missed suitable data for this work and new data might already have become available, we encourage the reader to let us know, so that we can update the data compilation and further improve our global millennial scale models in the future.
 We wish to thank all the colleagues who collected paleomagnetic and archeomagnetic samples and cores, carried out the original analyses, and made their results publicly available. Without their work this data compilation and global geomagnetic models on archeomagnetic and paleomagnetic timescales would not be possible. Figure 1 was produced using the GMT software by Wessel and Smith. Funding for this project was partly provided by NSF grant EAR 0112290. We thank Richard Holme and two anonymous reviewers for their constructive and encouraging reviews of the original manuscript.