Digitizing weather observations from World War II US naval ship logbooks

The number and coverage of weather observations over the oceans were considerably reduced during World War II (WW2) due to disruptions to normal trade routes. The observations that do exist for this period are often unavailable to science as they are still only available as paper records or scanned images. We have rescued the detailed hourly weather observations contained in more than 28,000 logbook images of the US Navy Pacific Fleet stationed at Hawai'i during 1941–1945 to produce a dataset of more than 630,000 records. Each record contains the date


| INTRODUCTION
Accurate weather observations over the ocean are vital for global assessments of climate change (IPCC, 2019). In particular, sea surface temperature (SST) and marine air temperature (MAT) are considered essential climate variables (Bojinski et al., 2014). SST observations are also used as boundary conditions for atmospheric reanalyses, hence any uncertainty in SSTs affects our confidence in estimates of global changes (Kent & Kennedy, 2021). Ship type, on-board instruments and methods of observation have evolved considerably over the last 200 years, which means that amassed SST observations are a heterogeneous mixture of observations from different ships, instruments and practices, which can generate artificial variability if not corrected (Kent et al., 2017; Thompson et al., 2008).
All reconstructions show that the global oceans have warmed since the start of the 20th Century, but there is anomalous warmth in global mean SSTs during the World War II (WW2) period (between 1941 and 1945) when compared to the preceding and following 5-year periods (Chan & Huybers, 2021). Also, the uncertainty in the estimated anomaly for this period is several times larger than for more recent periods.
Several possible explanations have been put forward to account for this anomaly, referred to as the WW2 warm anomaly (WW2WA) by previous studies, such as the reduced number of observations (Chan & Huybers, 2021; Freeman et al., 2017) and changes in the types of SST measurement (Cornes et al., 2020; Kent et al., 2017). When WW2 commenced, trade routes were severely disrupted, limiting observations taken by voluntary observing merchant ships (VOS) which usually criss-cross the global oceans. This caused a large drop (58%; Freeman et al., 2017) in the number of marine observations available for the duration of WW2.
More crucially, poorly documented changes in the observing practices may have led to large biases and errors. For example, the preference for taking SST measurements from the inlet water pipes used to cool engines (known as Engine Room Intake, ERI), in contrast to hauling canvas/wooden buckets onboard, resulted in a warm bias in the aggregated SSTs (Kennedy et al., 2019). The rapid rate of these transitions is not always well documented and the measurement method can be mislabelled, which impedes the correct adjustments being applied to the observations (Chan & Huybers, 2020). Another change in practice during WW2 was that more observations were taken during daytime than at night. Both of these changes are assumed to be due to the need to reduce exposure to enemy ships and avoid being detected (Chan et al., 2019; Chan & Huybers, 2021). Without additional data and documentation of prevailing practices, disentangling the reasons for the WW2WA is very difficult.
Other types of weather observation are equally useful. For example, observations are assimilated in a similar process to making weather forecasts to produce long-term reanalyses such as the 20th Century Reanalysis (20CRv3; Compo et al., 2011; Slivinski et al., 2019). The 20CRv3 spans from 1863 to 2015 and assimilates surface pressure observations to produce an 80-member ensemble of 3-hourly estimates for surface and upper-air parameters. Due to their spatial completeness and temporal continuity, these reanalyses have become datasets of choice to quantify climate variability over decades to centuries. However, the quality of reanalyses is often poor in times and places where coverage is sparse in observational datasets (ISPD v4; Compo et al., 2019 and ICOADS; Freeman et al., 2017) such as the WW2 period. These reanalyses can be improved by assimilating newly rescued observations (e.g. Hawkins et al., 2023).
With this context, we present weather data rescued from WW2-era US Navy ships' logbooks. Observations from naval vessels are the primary sources of marine observations for the WW2 period but many were destroyed as an act of war, or simply forgotten due to the length of time they were considered classified. To fill gaps in observational coverage and contribute to improving metadata regarding observing practices, the NOAA-funded project 'Old Weather: World War II' gathered thousands of volunteers to transcribe weather observations from logbooks of US destroyers and other naval ships which were part of the US Pacific fleet based at Hawai'i. These ships saw action in the Indo-Pacific and Far-East, taking observations at times and places where few or no other digitized observations exist.
These new observations and metadata will be invaluable for improving reconstructions of past climate. In Section 2, we describe the format of the logbooks, accompanying metadata such as important features of ships, instrument types, their placement on the ships and instructions for the observers. In Section 3, we describe the citizen-science project and steps taken to process raw data into a quality-controlled observational dataset. In Section 4, we explore the resulting dataset by analysing its spatio-temporal features. We examine existing datasets (ICOADS) to highlight the impact that new data will make. We then compare observations contained in the new dataset with 20CRv3, to demonstrate how the uncertainty in 20CRv3 might be reduced. Finally, in Section 5, we offer some lessons learnt from designing the citizen science project through to producing the final dataset and discuss the potential of our new dataset for understanding the WW2WA.

AND OBSERVATION METADATA
In 2017, the National Declassification Center (NDC) at the National Archives and Records Administration (NARA) released nearly 200,000 pages of formerly classified U.S. Navy Command Files from the WW2 era. The files consist primarily of records from the Pacific Theatre between 1941 and 1946. The files contain many kinds of documents, maps, ship logbooks, photographs, etc. Here we focus on the ship logbooks containing meteorological observations.

| Logbook format
The weather logbook page for each day was divided into two parts: Navigation and Meteorological observations, and Remarks, facing each other. The left leaf contained a number of columns for position and weather observations (see example in Figure 1). The right leaf consisted of free text descriptions of relevant events that occurred onboard the ship (see example in Figure S1).
The Navigation and Meteorological sheets contain spaces for the name and/or hull number of the vessel, date and detailed meteorological, hydrographic and navigational data. These data include wind speed and direction, barometric pressure, air and water temperature, visibility and overall weather conditions, as well as latitude and longitude. Meteorological observations were taken every hour, and positional observations were recorded three times per day at 8 am, 12 pm and 8 pm.
The entries on the Remarks sheet generally appear in four-hour blocks that correspond to the major "watches": 0000-0400, the middle watch; 0400-0800, the morning watch; 0800-1200, the forenoon watch; 1200-1600, the afternoon watch; 1600-2000, the dog watch; and 2000-2400, the first watch. Each block of entries was signed off by the officer of the deck, usually a junior officer (often an ensign or lieutenant [junior grade]), and approved by the navigation officer (National Archives, 2016).
The logbook was usually a standard printed form supplied to all USN ships (as shown in Figure 1), but as the war progressed an amendment was passed requiring meteorological observations and navigation information to be written separately. This was probably done to deny the enemy information about fleet movements should one of the ships be captured. In these documents, called War Diaries, positional information was written alongside remarks and kept confidential (see example in Figure 2). General directions to personnel on how to complete the logbook are shown in Figure S5. All the position and meteorological information was typeset rather than hand-written.

| Ship type, placement of instruments and measurement techniques
Logbooks from three USN battleships, one aircraft carrier, eight destroyers, six cruisers and one gun boat, for a total of 19 ships, were used in this project (Table 1). Each class of vessel has a different design and dimensions, which affects the placement of instruments and therefore the potential offsets in the measurements taken. For example, barometric pressure observations depend on the height of the measurement above sea level. In addition, the cruising speed of each ship type differs, which may have an impact on the SST measurements if measured by the bucket method.
Many contemporary photographs show portable Stevenson-type screens hung off the island aft of the bridge (Figure S2). The barometer would normally be hung in the bridge room. Ships of this period usually had their bridge structures placed relatively high above the waterline to provide good visibility for the crew to observe and engage in combat. Except for gun boats, the height of the bridge in all ships was around 50-80 ft above the water line (Table 1, US Intelligence, 1945), but it is important to note that bridge height varied slightly depending on the specific class and design of the ship, as well as any modifications made to the ship during its service or specifically for wartime.
All ships in the current collection were fitted with Kew or mercurial barometers. The barometer was recommended to be placed "as far from a heat source as possible" (Weather Bureau, 1938, 1941), alongside an attached thermometer to enable a temperature correction to be made. As a secondary source of observations, aneroid barometers were also placed onboard the ship (Figure S3). Both dry bulb and wet bulb thermometers were recommended to be housed in a Stevenson-type screen to minimize the effects of insolation and conduction of heat, and shelters were to be made of wood and painted white. However, no suggestion is made as to where the screen must hang on the superstructure of the ship; instead moveable shelters were recommended. These portable screens were to be hung from the weather side before the observations were taken, in a position where they were freely exposed to the wind and unaffected by artificial sources of heat. Contemporary photographs show that this advice was not always followed. Some of the screens were affixed near galleys, absorbing extra heat from the ship. Some were even painted black or grey to match the ships' camouflage (Figure S4). It was recommended to use clean water to wet the muslin wrapped around the wet-bulb and to replace the muslin frequently to maintain accuracy of the observations.
For measuring SSTs the ERI method was recommended over the bucket method (Weather Bureau, 1938, 1941). Pulling a bucket filled with water from the side of a fast-moving ship often resulted in large amounts of water being spilled, leaving insufficient water in the bucket for effective measurement. The thermometer for the ERI method was recommended to be placed between the centrifugal pump used to circulate water and the ship's side.

| Instructions for the observers
The navigation officer was in charge of preparing and keeping ship logbooks in the required format (USN Regulations 1920). The Weather Bureau, U.S. Dept. of Commerce, published 'Instructions for Marine Observers' (Weather Bureau, 1938, 1941) to standardize and facilitate taking weather observations at sea as safely and accurately as possible. The main instructions pertain to recording the day and time, ship's local time, format of ship's latitude and longitude to be entered and meteorological observations.
Wind direction is recommended to be recorded as the true direction from which the wind is blowing with options consisting of calm (no direction) and directions of a 32-point compass. For wind force, a 12-point Beaufort scale was recommended to be used, with zero being calm, up to 12 for hurricane force winds. Similarly, pressure observations are recommended to be recorded in inches of mercury after applying necessary corrections. Observations from the attached thermometer were requested to be recorded in whole degrees F. For air temperatures, a portable thermometer with an outer screen placed on the weather side is recommended to be used, with measurements also in whole degrees F.
For SST observations, if the bucket method is employed, it is recommended that the bucket is dry, or at least empty of all residual water, before a throw. Water should be hauled as far away as possible from the ship's discharges. Observers are instructed to haul up the bucket as quickly as possible without spilling too much water and to carry the bucket immediately to a sheltered place to avoid strong winds and direct sunlight. The thermometer is recommended to be left in the bucket sufficiently long to acquire the water temperature accurately, recorded to the nearest whole degree F.
If the ERI method is used, observations must be reported from the engine room at the observed time, ensuring that the temperature reported is the current temperature of the water entering the ship rather than the temperature of water circulating in the system. Temperature should be recorded to the nearest whole degree F. Importantly, if weather conditions or other factors hinder taking observations by the bucket method, the observations should not be substituted with ERI method observations unless clearly labelled.

| Number of logbooks used
Figure 3 shows the number of ships and number of logbook images (including War Diaries) for each year (1941-1945) in the collection used here. Each ship's logbook usually starts from 1 January and ends on 31 December each year but the number of images per ship per year can vary because not all logbook pages survive or because not all surviving pages are scanned. Sometimes duplicate pages exist, and the extra War Diary pages are included for 1943 and 1944. We can further divide each year's collection into constituent ships (Figure 4). Many ships listed here were present at Pearl Harbour during the attack by Japanese bombers on 7th of December 1941. However, all ships listed here saw action in the Pacific theatre at some point during the war. Out of these, USS Hull and USS Monaghan sank in 1944 when hit by Typhoon Cobra in the Philippine Sea, hence there are no observation-days for 1945 for these ships (Cressman, 2000).

| RESCUE OF LOGBOOK DATA
For the volume of data contained in the collection described here, a traditional manual transcription approach would have taken many person-years of effort. Instead, the availability of scanned images of the ship logbooks enabled the creation of a citizen science project to ask volunteers to transcribe the observations into digital form more efficiently.
The Zooniverse platform (www.zooniverse.org) offers a flexible framework upon which various citizen science projects have been built. Many different themes are represented on the platform, from astronomy, biology, ecology and conservation, to historical documents. The original Old Weather project was one of the first projects to extract historical weather observations contained in ship logbooks from an extended period around WW1. Since then many projects have successfully used Zooniverse to digitize historical weather observations, e.g. WeatherRescue.org (Craig & Hawkins, 2020; Hawkins et al., 2019), RainfallRescue.org (Hawkins et al., 2022), SouthernWeatherDiscovery.org (Lorrey et al., 2022), Climate History Australia (Gergis et al., 2022) and Meteorologum ad Extremum Terrae (Lakkis et al., 2022). We adopted a similar approach to recover the USN ship data in a project called Old Weather WW2.

| Old weather WW2
The 'Old Weather WW2' project was designed keeping the columnar structure of the original logbooks in mind. The navigational and observations page (Figure 1) is divided into a number of columns for each weather parameter observed and positional-ancillary information. The questions to the volunteers were arranged into workflows, where each workflow is logically self-contained and asks for partial information about the logbook page. Combining transcriptions from all workflows gives full information about the logbook page.
We chose to prioritize weather and positional information, and so the routine day-to-day information and remarks were omitted from this data rescue exercise. Five workflows were defined: Navigation, Barometer (AM and PM) and Temperature (AM and PM). The navigation workflow asked about date, position (8 am, Noon, 8 pm), place (if known) and zone. The barometer workflow asked for transcriptions of the readings in the barometer column and for the thermometer attached to the barometer. As these variables were observed every hour of the day, two near-identical workflows for AM and PM were defined to shorten each task. Similarly, the Temperature workflow asked about dry-bulb (Tdry), wet-bulb (Twet) and sea-surface temperature (Twater, or SST) readings in the various columns. As for the barometer, two near-identical workflows for AM and PM were defined.
For 1943 and 1944, when positional information and meteorological information were recorded in separate sheets, the locations were taken from War Diaries (Figure 2). A War Diaries workflow was defined to ask for transcriptions of the place and location of the ship three times a day at 8 am, 12 pm and 8 pm.
Each logbook image is passed through all workflows (except the War Diary images which are used only for the War Diary workflow) to extract complete positional and meteorological information. A volunteer could do any workflow of their choice, and their transcription response is saved on the Zooniverse server. Each task is shown to at least three independent volunteers to produce triplicate transcriptions, which allows for subsequent error checking. In total, 4,050 volunteers contributed their time to this project.

| Infilling, error checking and correction
After the transcription phase, the data were consolidated, corrected and standardized to form a completed dataset.
The most common kind of error in the transcription process is a typographical error made by a volunteer. To detect and correct these errors, a consensus check of all transcribed values for each text field is performed. Each text field is transcribed three times and if all three values match then the value is accepted as correct. If only two out of three values match, the value with more matches is accepted. If all three values are different, the field is kept empty and flagged to indicate inconsistency. There are more mismatches between values when the task asks volunteers to type free text, as compared to entering a single observation.
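The three-way consensus rule described above can be sketched as follows. This is a minimal illustration, not the project's actual processing code; the function name `consensus` and the agreement labels are assumptions introduced here, and only surrounding whitespace is stripped before comparison.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def consensus(values: Sequence[str]) -> Tuple[Optional[str], str]:
    """Apply the three-way consensus rule to one transcribed field.

    Returns (accepted_value, agreement), where agreement is one of
    "full", "two-thirds" or "none". The labels are hypothetical.
    """
    if len(values) != 3:
        raise ValueError("expected exactly three transcriptions")
    # Ignore leading/trailing whitespace so "30.02 " matches "30.02".
    cleaned = [v.strip() for v in values]
    value, count = Counter(cleaned).most_common(1)[0]
    if count == 3:
        return value, "full"
    if count == 2:
        return value, "two-thirds"
    # All three differ: leave the field empty and flag it.
    return None, "none"
```

For instance, `consensus(["30.02", "30.02 ", "30.12"])` accepts the majority value with a "two-thirds" label, while three different inputs yield an empty, flagged field.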
For example, for the Date variable only about 53% of records have transcribed texts that all exactly match each other, 42% have two-thirds matching and 5% have no matches (Table 2). Dates can be written in different formats (e.g. "Jul, 10th 1941", "1941-07-10", "07/10/1941", "Wed 10/07/41"), which are all valid, but strict string matching will flag them as disagreeing. To deal with such situations, all date texts were parsed into a standard date format before the consensus check was performed. After this processing, about 94% of the dates have full agreement, 3% of records have a two-thirds match and 3% have no match. This demonstrates the very high quality of the volunteer transcriptions of what is written on the pages. For positions, a number of delimiters such as ', " or . were used by the volunteers to separate the position into degrees, minutes and seconds (if present) and a hemisphere (N/S) flag. Reformatted and decimalised positions are compared to find matches. Place-name texts are also compared to find matches, and 81% of records have full agreement, 16% have a two-thirds match and 2% have no match. Once a consensus value is confirmed, place names are looked up in a reference table that stores all variations of names and corresponding positions. The place names are then replaced with known positions.

FIGURE 4 The total number of observation-days for each ship separated by year.

TABLE 2 Percent of transcriptions that show full match (three out of three), two-thirds and no match (all three text values are different) between text inputs for each variable before processing.
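The date normalisation step that precedes the consensus check might look like the sketch below. The list of accepted formats, the function name `normalise_date` and the handling of two-digit years are assumptions made for illustration. In particular, slash-separated dates are read month-first, so the day/month ambiguity visible in the examples above is not resolved here.

```python
import re
from datetime import date, datetime
from typing import Optional

# Hypothetical set of formats; a real pipeline would likely need more.
FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%b %d %Y", "%B %d %Y")


def normalise_date(text: str) -> Optional[date]:
    """Parse a transcribed date string into a date, or None if unparseable."""
    cleaned = text.strip()
    # Drop a leading weekday such as "Wed" or "Thursday,".
    cleaned = re.sub(r"^(Mon|Tue|Wed|Thu|Fri|Sat|Sun)[a-z]*\.?,?\s+", "",
                     cleaned, flags=re.IGNORECASE)
    # Drop ordinal suffixes: "10th" -> "10".
    cleaned = re.sub(r"(\d)(st|nd|rd|th)\b", r"\1", cleaned,
                     flags=re.IGNORECASE)
    # Treat commas as spaces and collapse repeated whitespace.
    cleaned = " ".join(cleaned.replace(",", " ").split())
    # Assumption: two-digit years belong to the 1900s (the logbooks are WW2-era).
    cleaned = re.sub(r"^(\d{1,2}/\d{1,2}/)(\d{2})$", r"\g<1>19\2", cleaned)
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    return None
```

Once every transcription of a field is mapped to the same `date` object (or to `None`), the three values can be compared exactly, which is what lifts the full-agreement rate in the way the text describes.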

Similarly, for meteorological observations all input text is read and separated into values for each hour. Due to variations in the formats of the input text, all possible combinations of white space and delimiters are considered to isolate values for each hour. Then the level of consensus is checked and exactly matching values are kept, whilst uncertain values are flagged. Meteorological values suffer from mismatches to a lesser extent than free text: less than 1% of records have no match, with 95% having full agreement.
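A minimal sketch of separating one transcribed string into hourly values is given below. The delimiter set, the function name `split_hourly` and the default of 12 values per string (one AM or PM workflow of hourly readings) are assumptions for illustration; the decimal point is deliberately not treated as a delimiter because pressure readings contain it.

```python
import re
from typing import List, Optional


def split_hourly(text: str, expected: int = 12) -> Optional[List[float]]:
    """Split a transcribed column into one numeric value per hour.

    Returns None when the number of values is not the expected one or
    when any token is not numeric, so the record can be flagged.
    """
    # Any run of whitespace, commas, semicolons, pipes or slashes separates values.
    tokens = [t for t in re.split(r"[\s,;|/]+", text.strip()) if t]
    if len(tokens) != expected:
        return None
    try:
        return [float(t) for t in tokens]
    except ValueError:
        return None
```

Each parsed hourly value can then go through the same three-way consensus check as the text fields.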
There are many instances when even after such consensus checking, obviously erroneous values persist, possibly due to the original observer incorrectly reading or recording the observation; we use statistical methods to detect and correct these where possible. For positional information, a reasonable threshold for the difference between consecutive positions is set, and whenever this threshold is crossed, the values are flagged for checking. Generally, the position of the ship was recorded three times a day at 8 am, 12 noon and 8 pm, although the availability of position information is not complete on the scanned sheets and so there are occasional longer gaps. The average speed of a USN ship is assumed to be 15 kn (Table 1), which means that, on average, a ship would travel 60 NM between 8 am and 12 noon, 120 NM between 12 noon and 8 pm and 180 NM between 8 pm and 8 am the next day, assuming the ship stays mobile throughout. However, distance covered could be twice as much if a ship travels at or near its maximum speed (~30 kn, Table 1).
We set the threshold as 3° in latitude and longitude (as 1° of arc length at the equator is 60 NM), allowing for up to 180 NM between consecutive positions, which is enough for average speeds. To accommodate higher speeds, a position is flagged as erroneous and made null only if the next position recorded in the logbook exceeds this threshold by a factor of 2 or more. Also, if the gap between adjacent positions is less than 100 h (a possible offset distance of 1500 NM), the position is interpolated using cubic spline curve fitting to imitate a curved path travelled by the ship instead of the straight line produced by a linear interpolation.
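The position check can be sketched as below, under one possible reading of the rule: a fix is made null when it differs from the last accepted fix by at least twice the 3° threshold in either latitude or longitude. The function names, the comparison against the last accepted fix and the longitude wrap-around at the dateline are assumptions for illustration, and the cubic spline infilling is not attempted here.

```python
from typing import List, Optional, Tuple

Position = Tuple[float, float]  # (latitude, longitude) in decimal degrees

THRESHOLD_DEG = 3.0  # from the text: 3 degrees, about 180 NM at the equator
FACTOR = 2.0         # from the text: null only when exceeded twofold


def lon_difference(lon_a: float, lon_b: float) -> float:
    """Signed longitude difference wrapped to [-180, 180) degrees."""
    return (lon_b - lon_a + 180.0) % 360.0 - 180.0


def is_jump(prev: Position, curr: Position) -> bool:
    """True when the step between two fixes is implausibly large."""
    limit = THRESHOLD_DEG * FACTOR
    dlat = abs(curr[0] - prev[0])
    dlon = abs(lon_difference(prev[1], curr[1]))
    return dlat >= limit or dlon >= limit


def null_jumps(track: List[Position]) -> List[Optional[Position]]:
    """Replace implausible fixes with None, comparing to the last accepted fix."""
    cleaned: List[Optional[Position]] = []
    last_good: Optional[Position] = None
    for fix in track:
        if last_good is not None and is_jump(last_good, fix):
            cleaned.append(None)
        else:
            cleaned.append(fix)
            last_good = fix
    return cleaned
```

Wrapping the longitude difference matters for this dataset because the ships repeatedly crossed the International Date Line, where a naive subtraction would report a jump of nearly 360°.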
Often, positions are not recorded and place names are used instead. In such cases, the port of the named place is considered as the ship's position. When both place name and positions are present, the place names are disregarded. In the final dataset, two versions are made available: one with these infilling corrections made and one with just the raw positional information. This allows other choices to be made if required.
Weather observations were taken hourly throughout the day. All values are checked for physically improbable values against the range 27.32-31.13 inHg (925-1054 mb) for sea-level pressure and 20-120°F for temperature observations, reflecting the usual conditions encountered at sea. Values outside these limits are then made null.
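The range screening and the unit conversions implied by the limits above can be sketched as follows. The limits are taken from the text; the function names and the choice to keep values that sit exactly on a limit are assumptions for illustration.

```python
from typing import Optional, Tuple

INHG_TO_HPA = 33.8639  # 1 inch of mercury in hPa (equivalently mb)

PRESSURE_LIMITS_INHG = (27.32, 31.13)  # from the text
TEMPERATURE_LIMITS_F = (20.0, 120.0)   # from the text


def screen(value: Optional[float], limits: Tuple[float, float]) -> Optional[float]:
    """Return the value if it lies within the limits, otherwise None."""
    if value is None:
        return None
    low, high = limits
    return value if low <= value <= high else None


def inhg_to_hpa(pressure_inhg: float) -> float:
    """Convert pressure from inches of mercury to hPa."""
    return pressure_inhg * INHG_TO_HPA


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0
```

Converting the two pressure limits with `inhg_to_hpa` and rounding to the nearest whole number reproduces the 925 and 1054 mb bounds quoted in the text.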

| Dataset
More than 630,000 unique records have been rescued, where each record contains the date and time, positional information and one observation each of dry-bulb temperature (Tdry), wet-bulb temperature (Twet), Twater (SST), barometer-attached thermometer temperature (Baro At. therm.) and pressure. There are 611,223 observations of air pressure, 197,716 observations of Baro At. therm., 601,978 observations of Tdry, 604,155 observations of Twet and 314,713 observations of SST. There is an average of 7,000 records per ship per year, and each ship logbook has observations for around 300 days per year on average (Figure 5). Due to the additional effort required to observe SSTs, fewer SSTs are recorded in the logbooks.
We note that the total number of observation-days per ship in Figure 4 does not translate into the number of records per ship in Figure 5. This is because some records have meteorological observations but not positional observations, and those meteorological observations are not included in the final dataset. The logbooks containing the positions were either destroyed or misplaced and were not included in the transcription process. For example, USS Aylwin, Dale, and Pennsylvania have observations present for particular years in Figure 4 but not in Figure 5. If positional information becomes available in the future, we could make the meteorological data available for those years.

1943, 1944 and 1945 were long-distance trips, first to Aleutian Islands, then Fiji, Marshall Islands and Philippines. 1945 started from the Naval Shipyard in Washington and travelled to the southern coast of Japan via Hawaii, and also included multiple trips to Chinese coasts. Starting from Japan, the ship then visited Taiwan, Singapore, Sri Lanka, Cape Town, finally reaching New York, completing a circumnavigation.
All ship tracks are supported by documentary evidence about the ships' movements from other sources (Cressman, 2000). Over the 5-year period, the various ships travelled across the Pacific, Indian and Atlantic oceans, providing a rich dataset all across the globe.

| Spatio-temporal characteristics of the dataset
We next consider how the rescued data could improve our understanding of climate variations by comparing the distribution of new observations with those already available. The spatio-temporal distribution of pressure observations in the new dataset is shown in the left column of Figure 7, grouped by each year of the dataset, and binned into a 2° × 2° regular grid. In 1941, the observations were distributed mainly near Hawai'i (around 10,000 observations are concentrated at Pearl Harbour) and the West coast of the US. As WW2 progressed, the ships moved towards the Aleutian Islands, Micronesia, South America, Australia and New Zealand in 1942. By 1943, the ships spent longer periods of time in fewer places such as the Aleutian Islands, Hawai'i and Fiji. In 1944 the ships moved west with observations concentrated around Guam and other Pacific Islands. By 1945, the ships were reaching the coasts of the Philippines, China and Japan, covering the whole of the South China Sea from Hong Kong to the International Date Line.
In the ICOADS dataset (Freeman et al., 2017), existing MSLP observations over the WW2 period are relatively scarce, especially in the Pacific (middle column of Figure 7). The number of observations in ICOADS suddenly drops in 1941 when compared to 1940 (not shown), presumably due to the start of WW2. The distribution of existing observations is almost static over the 1941-1945 period. Large areas of the Atlantic Ocean and along the major shipping routes do have observations but never more than 100 observations per grid cell per year.
The new dataset produced here will fill in vital gaps in the ICOADS dataset. The right column of Figure 7 shows that in some areas of the Pacific Ocean, there will be an increase of 100% when this WW2 dataset is added. The effect is even more profound in 1942 and 1943 when the new observations are added in the Western Pacific where the ICOADS dataset is virtually empty. For 1944, observations from many previously unobserved areas in the equatorial Pacific Ocean are added and for 1945 the WW2 dataset fills in many gaps in large areas of the South China Sea.
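The gridding and comparison behind this kind of map can be sketched as follows. The 2° cell size is from the text; the cell indexing scheme, the function names and the definition of percentage change (new observations as a percentage of the existing count, undefined where the existing dataset has none) are assumptions for illustration and may differ from how Figure 7 was produced.

```python
from collections import Counter
from math import floor
from typing import Dict, Iterable, Optional, Tuple

CELL_DEG = 2.0  # grid spacing from the text
Cell = Tuple[int, int]  # (row counted from 90S, column counted from 180W)


def cell_index(lat: float, lon: float) -> Cell:
    """Map a position in decimal degrees to its 2 x 2 degree grid cell."""
    lon_wrapped = (lon + 180.0) % 360.0 - 180.0
    row = floor((lat + 90.0) / CELL_DEG)
    col = floor((lon_wrapped + 180.0) / CELL_DEG)
    return row, col


def bin_counts(positions: Iterable[Tuple[float, float]]) -> Counter:
    """Count observations per grid cell."""
    return Counter(cell_index(lat, lon) for lat, lon in positions)


def percent_increase(new: Counter, existing: Counter) -> Dict[Cell, Optional[float]]:
    """Percentage increase per cell when `new` is added to `existing`.

    Cells with no existing observations get None, since the ratio is undefined.
    """
    change: Dict[Cell, Optional[float]] = {}
    for cell, n_new in new.items():
        n_old = existing.get(cell, 0)
        change[cell] = 100.0 * n_new / n_old if n_old else None
    return change
```

Cells that come back as `None` correspond to the previously unobserved areas highlighted above, where any new observation is an addition rather than a relative increase.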

| Inter-comparison of convoy ships
The quality of the rescued observations can be assessed by comparing simultaneous but independent observations from ships travelling in a convoy. Figure 8a shows one such convoy when USS Detroit and USS Macdonough travelled together from San Francisco to Pearl Harbour starting in June 1941. It can be seen that the daily mean air pressures measured on the two ships closely track each other whenever the distance between the two ships (blue line, in nautical miles) is less than 500 NM (Figure 8b). There is a correlation of 0.85 (p < 0.05) measured over the period covered by the red bar, which ends in December 1941 when the two ships separate. Similar agreement can be seen in the measurements of Tdry (Figure 8c).
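A convoy comparison of this kind can be sketched as below: compute the great-circle separation of the two ships for each day, keep only the days when they are closer than 500 NM, and correlate the paired daily means. The haversine formula, the mean Earth radius, the function names and the input layout are assumptions for illustration; the significance test reported in the text is not reproduced.

```python
from math import asin, cos, radians, sin, sqrt
from typing import List, Sequence, Tuple

EARTH_RADIUS_NM = 3440.065  # mean Earth radius (6371 km) in nautical miles

Position = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def distance_nm(a: Position, b: Position) -> float:
    """Great-circle distance between two positions (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_NM * asin(sqrt(h))


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def convoy_correlation(pos_a: List[Position], pos_b: List[Position],
                       obs_a: List[float], obs_b: List[float],
                       max_nm: float = 500.0) -> float:
    """Correlate paired daily means on days when the ships are within max_nm."""
    pairs = [(oa, ob) for pa, pb, oa, ob in zip(pos_a, pos_b, obs_a, obs_b)
             if distance_nm(pa, pb) < max_nm]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])
```

The distance mask is the important design choice: once the ships separate, their pressures sample different weather systems and would dilute the correlation without saying anything about observation quality.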

| Comparison with 20CRv3
Another approach to assessing the quality of observations and the potential to improve reconstructions of climate variability is to compare with a reanalysis of the same period; here we use the 20th Century Reanalysis v3 (Slivinski et al., 2019). Although the observer instructions state that observed air pressure should be corrected for temperature, gravity, the instrument's height above sea level and any instrumental offset (Weather Bureau, 1941, p. 16), we do not have documentary evidence to be certain whether this was done. Figure 9 shows daily mean air pressure observations taken onboard USS Detroit, compared with the 20CRv3 daily mean MSLP fields interpolated to the observation locations and times over the 5-year period from 1941 to 1945.
The grey shading at the bottom of the figure shows the 20CRv3 ensemble spread at those locations and times. The average ensemble spread is 1.3 hPa, and the first peak in the ensemble spread occurs around December 1941, coinciding with the Pearl Harbour attack, and throughout the war period the ensemble spread is relatively large. However, the correlation between the 20CRv3 data and USS Detroit observations is high (r = 0.84), which adds to the confidence in the rescued dataset. It would be expected that the inclusion of the new observations in a future version of the reanalysis would reduce the ensemble spread (e.g. Hawkins et al., 2023).
Figure 10 shows the same analysis for air pressure observed by USS Salt Lake City. The correlation with 20CRv3 is lower than for USS Detroit (r = 0.63), with periods of apparent bias. Note also that there are times when the ensemble spread is low but the difference between ship observations and 20CRv3 is large. For example, during 1942, the difference between ship observations and 20CRv3 was more than 5 hPa, but the ensemble spread was close to average at around 1.4 hPa. From Figure S8 we can see that USS Salt Lake City covered many regions of the Pacific Ocean where very few observations exist in ICOADS (which was used to generate 20CRv3). This suggests that there is a bias either in the reanalysis in the data-poor region or in the ship observations. To examine this further, we compared ERA5 (Hersbach et al., 2020) for the same ship tracks and found it to be similar to 20CRv3 (not shown), which suggests that an observational bias is more likely in this particular example.

FIGURE 7
The number of air pressure observations binned into a 2° × 2° regular grid for this dataset (left column) and for ICOADS (middle column). The right column shows the percentage change in the number of air pressure observations available if the new dataset was added to ICOADS.

| Lessons learnt
We share here some of the lessons learned from this Zooniverse project. The design of transcription workflows in this project reflected the tabular structure of the logbook page. Providing context about the logbook pages, the purpose of the project, and where the data would be used motivated the volunteers. Similar information requiring transcription was grouped together into workflows, e.g. positions, zones and dates were asked in one single workflow, and temperature (both AM & PM) and barometer (both AM & PM) were asked in separate workflows. Specifically, we could improve on the following points for future projects:
1. Transcription or classification tasks should be broken down into smaller chunks as it helps reduce long input strings. Longer input strings are more prone to errors and string mismatch than shorter strings.
2. Navigation and meteorological tasks asked volunteers to copy and paste pre-written text in the form to input the data. It was used to give structure to input data, but it often created confusion among the volunteers and resulted in many mis-shaped strings being submitted (Figure 11).
3. Storing all images used in the project at a publicly accessible location is helpful to see images in sequence, as in the Zooniverse platform the pages are served in a random order.

| Summary
A large tranche of WW2-era (1941-1945) US Navy ships' logbooks has been transcribed, bringing millions of previously unseen weather observations to light and enabling those observations to be used for historical climate research and to improve reanalyses. Over 28,000 logbook images, including war diaries, were used in the project.
In total, 4,050 volunteers participated in this process, typing 13.35 million keystrokes over a period of 1 year. Relevant metadata such as ship dimensions and method of observation have been extracted and collated from observer manuals and regulations. These data, together with ancillary information about observing methods and the usual location of instruments on board, should help estimate offsets and biases in the raw observations and correct for environmental factors (Carella et al., 2018; Kent & Kennedy, 2021). By applying the required corrections to these raw observations and ingesting the corrected observations into existing datasets, such as ICOADS, the uncertainty in the amplitude of climatic variations during WW2 should be reduced (Chan & Huybers, 2021).
The resulting dataset is systematic and consistent, with more than 3.7 million unique weather observations covering large parts of the Pacific, Atlantic and Indian Oceans. This dataset fills spatial and temporal gaps in ICOADS at times and places where no other sources of observations exist.
The construction and publication of such an enormous dataset have only been possible by harnessing the collective transcription efforts of thousands of willing volunteers using the Zooniverse platform. Many past data-rescue studies have utilized similar citizen science approaches (Craig & Hawkins, 2020; Hawkins et al., 2019, 2022; Lorrey et al., 2022). Every project is as different as the research questions expected to be answered and the format of the data source. Every such project adds to the body of

FIGURE 10 Same as Figure 9, but for USS Salt Lake City.

ACKNOWLEDGEMENTS
While we mourn the loss of our colleague, Kevin Wood, a great scholar and mentor, his contributions to the recovery of historical marine weather observations were substantial and will continue to influence us and many others. We thank the 4,050 volunteers who contributed so much of their spare time to the transcription of the observations. We thank the U.S. National Archives and Records Administration (NARA) for making these images publicly available, and NOAA for funding the initial setup of the Zooniverse project. We gratefully thank the Zooniverse team for making their platform freely available for citizen-science projects. PT and EH were funded by the NERC GloSAT project

FIGURE 1 A typical US Naval ship logbook 'Navigation & Observations' page used during WW2. Information about the ship's name, passage to/from, date, zone and commanding officer is noted at the top. Meteorological and navigation information is recorded in their respective columns. This page is from USS Farragut on the day of the attack on Pearl Harbour (7 December 1941).

FIGURE 2 A typical War Diary used later during WW2. Positional information is written alongside the remarks and other operational information.

FIGURE 3 Number of ships (left axis) and corresponding number of logbook images available (right axis) for each year.

20496060, 0, Downloaded from https://rmets.onlinelibrary.wiley.com/doi/10.1002/gdj3.222 by University of Reading, Wiley Online Library on [18/09/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License

Detailed sub-daily analysis of the observations (e.g. Tdry) shows that they were recorded in the local time zone (Figures S6 and S8). The observation times have also been converted into UTC, depending on the longitude of the ship at the time of observation. Both local and UTC times are made available in the dataset. As an example of the data available, Figure 6 shows the track of USS Pennsylvania during the 1941-1945 period. During 1941 and 1942, the ship travelled between San Francisco and Pearl Harbour. In 1943, it made trips to the Aleutian Islands near Alaska, the Marshall Islands, and Guam in the Pacific. For the year 1944, meteorological observations are present but navigation data were missing, hence the track for that year is empty. In 1945, it travelled from Pearl Harbour to Papua New Guinea, the Philippines and other islands in the South China Sea. It then reached Puget Sound Naval Shipyard in Washington towards the end of 1945. The meteorological observations of pressure and Tdry closely reflect the regions travelled.

FIGURE 5 The number of records in the dataset by ship and by year.

FIGURE 6 Ship tracks of USS Pennsylvania (left) and USS Tennessee (right), including Tdry and SLP observations during the 1941-1945 period.
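The conversion of local observation times into UTC from the ship's longitude can be illustrated with a minimal sketch. It assumes idealized nautical time zones, each 15° wide and centred on a multiple of 15° of longitude, with whole-hour offsets. The function names and the example time are hypothetical, and the zones actually kept on board (recorded in the logbooks) may have differed from this idealization.

```python
from datetime import datetime, timedelta
from math import floor

def nautical_zone_offset(lon_deg):
    """Whole-hour offset from UTC (east positive) for an idealized 15-degree zone."""
    lon = (lon_deg + 180.0) % 360.0 - 180.0  # wrap longitude into [-180, 180)
    return floor((lon + 7.5) / 15.0)

def local_to_utc(local_time, lon_deg):
    """Convert a naive local ship time to UTC using the idealized zone offset."""
    return local_time - timedelta(hours=nautical_zone_offset(lon_deg))

# Made-up example: an 08:00 local observation taken at 150 degrees West.
obs_local = datetime(1943, 5, 1, 8, 0)
obs_utc = local_to_utc(obs_local, -150.0)
```

Using `floor((lon + 7.5) / 15)` rather than `round(lon / 15)` avoids Python's round-half-to-even behaviour at zone boundaries.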

Figure 6 also shows the track of USS Tennessee over the 1941-1945 period. During 1941, the ship travelled to Pearl Harbour from San Francisco, reaching Puget Sound Naval Shipyard in Washington at the end of the year. The year 1942 was spent completing various exercises off California and in the seas around Hawaii. The years 1943, 1944 and 1945 were spent on long-distance trips, first to the Aleutian Islands, then to Fiji, the Marshall Islands and the Philippines. In 1945, the ship started from the Naval Shipyard in Washington and travelled to the southern coast of Japan via Hawaii, also making multiple trips to the Chinese coast. Starting from Japan, the ship then visited Taiwan, Singapore, Sri Lanka and Cape Town, finally reaching New York and completing a circumnavigation. All ship tracks are supported by documentary evidence about the ships' movements from other sources (Cressman, 2000). Over the 5-year period, the various ships travelled across the Pacific, Indian and Atlantic oceans, providing a rich dataset covering much of the globe.

FIGURE 8 (a) Ship tracks of USS Detroit and USS Macdonough from June 1941 to January 1942. (b) Air pressure observations from both ships compared over that period; the blue line at the bottom indicates the distance between the two ships in nautical miles (NM). The red bar at the top indicates the period used to calculate the correlation coefficient. (c) Same as (b) but for Tdry, without the distance shown.

FIGURE 9 Daily mean MSLP observations recorded onboard USS Detroit (blue) and 20CRv3 MSLP (black) at the same locations and times over the 1941-1945 period. The bottom part of the figure shows the 20CRv3 ensemble spread at the same locations and times.

knowledge about strategies for best designing, building and running similar efforts for climate data rescue campaigns.
(Grant Number NE/S015574/1) and were supported by the UK National Centre for Atmospheric Science. This publication is partially funded by the Joint Institute for the Study of the Atmosphere and Ocean (JISAO) under NOAA Cooperative Agreement NA15OAR4320063 and by the Cooperative Institute for Climate, Ocean, and Ecosystem Studies (CICOES) under NOAA Cooperative Agreement NA20OAR4320271, Contribution No. 2023-1268. KW was partially funded by the NOAA Arctic Research Program (ARP) for his work. PMEL contribution is 5171.

OPEN RESEARCH BADGES
This article has earned an Open Data badge for making publicly available the digital data necessary to reproduce the reported results. The data are available at https://zenodo.org/record/7781108. Learn more about the Open Practices badges from the Centre for Open Science: https://osf.io/tvyxz/wiki.

ORCID
Praveen Teleti https://orcid.org/0000-0003-2691-8488
Ed Hawkins https://orcid.org/0000-0001-9477-3677

TWITTER
Praveen Teleti @PraveenTeleti

TABLE 1 Classification of ships present in the current collection.
Columns: Name; Type; Hull no.; Class; Max speed (knots); Bridge height (approx. ft).