An Estimation of Human‐Error Contributions to Historical Ionospheric Data

Ground‐based radar sounders are used to characterize the dynamics and chemistry of Earth's upper atmosphere by using measurements of ionospheric peak electron density (NmF2) and its associated altitude (hmF2). Continuous sounder observations of the E and F regions of the ionosphere have been carried out regularly at dozens of stations worldwide since the midtwentieth century. A deep understanding of short‐ and long‐term upper atmospheric variability depends on a fundamental understanding of these observational data. The manual analysis of historical analog (predigital age) ionograms to derive the plasma frequency profiles and the ionospheric parameters hmF2 and NmF2 is a tedious procedure and susceptible to human error. In order to better understand this human error, a study is conducted in which ionograms from vertical sounders are manually scaled by a team of ionospheric researchers. The results of the study are then used to estimate the variability of the hand‐scaled ionospheric parameters foF2 and foE. Those results are then used to estimate the downstream impact on ionospheric models that use foF2 and foE as input. The results demonstrate that there can be large variability in the manual scaling of foF2 and fmaxE from vertical incidence ionograms. However, the participants did typically better than 5% uncertainty for benign ionograms. A long‐term analysis of hmF2 modeling exhibits low sensitivity to statistical errors imposed on foF2 and foE, but a short‐term analysis showed that modeled hmF2, neutral winds, and electron densities can be very different when small adjustments are made to foF2 and fmaxE.


Introduction
Vertical incidence sounding of the ionosphere using high-frequency (HF) radio wave transmissions is a fundamental technique that goes back to the earliest days of radio science (Breit & Tuve, 1925). This technique has been routinely used to understand the ionospheric plasma below the ionospheric F region peak electron density. However, the technique only provides ionospheric layer peak densities directly; determining other state parameters of interest (e.g., layer heights and profile shapes) requires solution of an inverse problem (e.g., Davies, 1990;Rishbeth & Garriott, 1969). Solution of this inverse problem is termed "scaling," named for the popular manual (human analyst), graphical approach used for decades. Guidelines for deriving ionospheric parameters from ionograms can be found in the URSI Handbook of Ionogram Reduction, known as UAG-23A (Piggot & Rawer, 1972) (see also Davies, 1990, section 6.6.2).
As interest in Earth's climate variability has grown over the last few decades, there has also been increased interest in possible subsequent changes to Earth's upper atmosphere, in the thermosphere and ionosphere (Bremer, 1992(Bremer, , 1997(Bremer, , 1998. Various investigators have attempted to predict and confirm the existence of specific changes in ionospheric characteristics that have been predicted by various global atmospheric models. Rishbeth (1997) listed four possible causes of such variations: (1) variations of solar activity, (2) the secular variation of the geomagnetic field, (3) global cooling of the upper atmosphere (vs. warming of the lower atmosphere), and (4) changes in minor atmospheric constituents. Following ideas from Roble and Dickinson (1989) that expected increases in the atmospheric mixing ratios of mesospheric carbon dioxide and methane should cool the thermosphere by about 50°K, Rishbeth (1990) examined the consequences for Earth's ionosphere. He showed that the atmospheric cooling and associated composition changes, as described by Roble and Dickinson (1989), would lower the ionospheric E layer and F 2 layer peaks by about 2 and 20 km, respectively but that changes in the ionospheric E and F 2 layer electron densities would be smaller, on the order of 1%. ©2020. The Authors. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
As such, there is a continuing need for accurate models of upper atmospheric variability of electron density as a function of height for use with basic research. Ideally, such models would be based on homogeneous and well-understood data series and specified in terms of only those ionospheric characteristics that are routinely and accurately scaled from ionograms with a high level of confidence. However, ionospheric characteristics for use with long-term trend studies are manually scaled from "analog" film ionograms that were generated by older ionosondes which preceded modern, computer-based, digital ionosondes. Experts who have analyzed manually scaled ionospheric characteristics for limited scenarios have appreciated the value of having the ionograms all come from the same ionosonde and, possibly, manually scaled by the same person (McNamara, 2008). It is, however, necessary to check the underlying data very carefully for artificial changes due to technical alterations, possible modifications to the data analysis techniques, and the data preservation history. It is not clear that investigators obtaining historical data from World Data Centers (WDC) are aware of the idiosyncrasies of the data derived from the ionograms generated from the older analog ionosondes. As described in Araujo-Pradere et al. (2019), most of the data in long-term databases of analog ionosonde data contain no supporting information that describes the calibration characteristics of the ionosondes, the experience level of the manual scalers, or a description of the data's provenance.
While manually scaling ionograms has been accomplished for many decades, few investigations have been conducted that quantitatively assess the role human error has on the estimation of ionospheric parameters derived from vertical incidence sounding. The goal of this effort is to estimate the human error in manually scaled ionograms and determine its potential impact on research, with a focus on the peak height of the F 2 layer of Earth's ionosphere, known as h m F 2 . In our investigation, we use historical ionosonde data from the National Geophysical Data Center (NGDC) database, data from a recent Intelligence Advanced Research Projects Activity (IARPA) ionospheric experiment, and the two most widely used empirical models of h m F 2 .

Difficulties that Arise in Manually Scaling Ionograms
The conversion of a vertical incidence (VI) ionogram from the observed atmospheric plasma frequency into the corresponding electron density profile is variously known as true-height analysis or ionogram scaling (e.g., Davies, 1990;Rishbeth & Garriott, 1969). Bilitza et al. (1979) drew attention to two fundamental issues that arise with ionogram inversions. The first issue, known as the "starting" problem, arises because there are usually no observations below 1.5 MHz, defined as the top of the medium-frequency (MF) broadcast band. The second issue, known as the "valley" problem, is that no echoes are returned directly from within the E region to F region valley of the ionosphere. These issues have been addressed in the literature over the last few decades, notably by Titheridge (1986Titheridge ( , 1988Titheridge ( , 2003, and summarized by McNamara (2006) and McNamara et al. (2007).
One general problem of ionospheric radio sounding is the lack of observational data that describe the plasma density distribution below the minimum plasma frequency detected by the ionosonde (by echoes of either ordinary (O) or extraordinary (X) polarization) and within the E-F valley. Information about such "unseen" plasma densities is necessary for an accurate inversion of the parts of the plasma density profile that are directly observed. Very few parameters of unseen regions can be obtained from the observations (including combined use of oppositely polarized echoes), far fewer than the plasma profile requires. The practical solution to this problem is to use empirical models of the underlying ionization (i.e., below the altitude corresponding to a plasma frequency of 1.5 MHz) and of the E-F valley.
Although full profiles (at least of the bottom side) are desirable or essential for some studies or applications, frequently a few key parameters are sufficient, such as the peak density and height of the F 2 region (N m F 2 and h m F 2 , respectively). The two most widely used empirical models for estimating the underlying ionization and h m F 2 altitude are the Dudeney (1974Dudeney ( , 1983 formulation and the Bilitza et al. (1979) formulation which are described in the next section. The strengths and weaknesses of each approach are well described in McNamara (2008). To estimate h m F 2 , the scaled ionospheric characteristics are required, which include an estimate of the critical frequency of the E region (f o E), the critical frequency of the F 2 layer (f o F 2 ), and the maximum usable frequency over a distance of 3,000 km divided by f o F 2 ([M3000]F 2 ). The f o F 2 and M(3000)F 2 (Shimazaki, 1955) are two of the most important characteristics scaled from ionograms. Their product, designated by MUF(3000)F 2 , is the highest ordinary-mode frequency that would support a 3,000 km HF communications circuit centered over the ionosonde at that time (e.g., Davies, 1990). M(3000) F 2 is variously known as the M factor, or the obliquity factor.
Modern techniques for automated electron density inversion of digital ionograms (Zabotin et al., 2006) make use of modern physics models of the twilight/night E region and the daytime E valley developed by Titheridge (2003). The older models use input data that were hand-scaled from analog "film" ionograms. When the traditional analog ionograms are used, it is a very tedious manual process. Data from ionosondes that have been hand-scaled by experienced operators are generally expected to have height errors lower than 10 km (McNamara, 2008). However, the determination of key parameters by visual inspection of historical analog ionograms is susceptible to human error and different experts may also introduce subjective "human" variability into the data (Araujo-Pradere et al., 2019). Modern ionosondes use computers to automate the scaling process. However, due to a variety of ionospheric conditions (such as absorption and ionospheric disturbances) and RF conditions (such as the presence of interference or inability to sound at restricted frequencies) ionogram quality is often less than ideal. Consequently, the autoscaling software itself has significant quality control issues and human checking is often required.
The procedure for scaling M(3000)F 2 is described in section 1.5 of Conventions for Determining MUF Factors of the URSI Handbook of Ionogram Reduction, known as UAG-23A (Piggot & Rawer, 1972) (see also Davies, 1990, section 6.6.2). The procedure for hand-scaling f o F 2 (and other parameters) from ionograms is described below. This study is an attempt to quantify the amount of human error from manual scaling of ionograms and consider how that error may subsequently affect basic research of the Earth's ionosphere/thermosphere.

Models of F 2 Layer Height
The analysis in the remainder of the paper is based on two empirical formulations for estimating the F 2 layer height and density. The two most widely used models for estimating the underlying ionization and h m F 2 altitude are the Dudeney (1974Dudeney ( , 1983 formulation and the Bilitza et al. (1979) formulation. The general formulations for the Dudeney and Bilitza models of the h m F 2 altitude (in km) are shown in Equations 1 and 2 below.
1. The Dudeney model of h m F 2 : where: where: where R is the sunspot number and ϕ is the geomagnetic latitude.

Earth and Space Science
Both models of h m F 2 use f o F 2 and f o E, which can be manually scaled from ionograms. The models also depend on solar variability, but in different ways. The Dudeney model uses it implicitly, because the f o F 2 plasma frequency correlates directly with solar activity. The Bilitza model also uses it explicitly, via four numerical functions F 1 -F 4 (not shown here for brevity).

The Ionogram Data Used for Manual Scaling
The ionogram data used for this study were part of the Intelligence Advanced Research Projects Activity (IARPA) (n.d.) Passive Ionospheric Noncharacterized Sounding (PINS) Challenge (https://www.iarpa. gov/challenges/pins.html). The challenge was created to crowdsource the development of new algorithms for autoscaling ionograms. For this study, a collection of 10 VI ionograms from the PINS Challenge were selected and ionospheric scientists at JHU/APL and Clemson University were trained to use software to manually scale each ionogram, one at a time. The observations were collected during August 2017 at a site located in Melrose, Florida (29.7°N, 82.0°W). The transmitter was a Digisonde DPS-4D (https://www.digisonde.com/). The receiver was an Ettus Research USRP X300 (https://www.ettus.com/all-products/x300kit/) with a BasicRx card hooked up to a custom HF preconditioner and an active loop antenna.
The ionogram manual scaling software was developed by the authors in the Python 3.x programming language and consists of a simple point-and-click graphical user interface (GUI) that is very similar to other manual-scaling tools in the community. Every participant used the exact same preconfigured 2017 MacBook Pro laptop for the manual scaling process. The laptop used a 30-bit color depth (10 bits per channel), and we optimized the ionogram images by utilizing 24-bit color PNG files and by using the maximum DPI for each image.

The Variability of Hand-Scaled f o F 2 , f max E, and h′F 2
Each of the 10 ionograms used in this study were manually scaled 20 times: twice, by each of the 10 participants. The ionograms were chosen randomly from the PINS Challenge data, and only ionograms with missing blocks of data were excluded. The team of 10 manual scalers was composed of professional scientists and engineers in JHU/APL's Space Physics Research Group (SRG). The participants were already familiar with the basics of ionospheric physics. They were trained by the authors to analyze ionograms and how to use the Python manual scaling software to determine f o F 2 , h′F 2 , and f max E (if an E layer is present, f max E is the value of its maximum plasma frequency) over a wide range of ionograms. They practiced using the software in multiple sessions before generating the data for this study. Each user could easily select the value of the f o F 2 , h′F 2 , and f max E in each ionogram, and their final selections were written to disk. The software allowed the participants to "undo" and/or "redo" a selection if they felt that they had made an incorrect choice during the manual scaling process.
The graphical results of manually scaling the 10 ionograms 20 times each are shown in Figures 1 and 2. On each ionogram image, the solid yellow lines represent the mean estimated h′F 2 , f o F 2 , and f max E from all the participants who hand-scaled the ionogram. The dashed white lines represent the minimum and maximum selected h′F 2 , f o F 2 , and f max E for each ionogram from all the participants. Pixel noise is the random splotches and speckles in an ionogram and is due to HF interference. Note that there is no ionospheric E layer present in ionogram #1 but it is present in the other nine ionograms.
There is substantial variability in the range of manually selected parameters from the ionogram traces. Similarly, the range of h′F 2 for ionogram #3 spans from 175 to 373 km. The mean value of h′F 2 is at the correct virtual height (as seen on the ionogram at 357 km) for the F 2 layer but the incorrect minimum value at 175 km is due to the fact that, instead of choosing the h′F 2 value, of one of the participants selected the altitude at which the F 1 trace becomes horizontal (h′F 1 ). The incorrect value of for h′F 1 at 175 km is shown as the lowest horizontal dotted line that spans the image from left to right.
Lastly, for ionogram #3, the ranges for the E region are shown in the ionogram as a tall rectangle. The values of f max E span from 3.6 to 4.4 MHz, and the mean value of f max E is at 3.8 MHz which is located near the rightmost pixels in the E layer signal. However, there is pixel noise in all the ionograms and manually determining what is noise versus what is f max E-signal can be very subjective. In the case of ionogram #3, pixelated noise in the image led one participant to choose an E region maximum frequency that is substantially larger than the other participants, out at 4.4 MHz. The result is the E region rectangle with the minimum and mean frequencies located on the left-hand side of the rectangle, and the maximum outlier defining the right-hand side of the box.

Earth and Space Science
Similar unexpected selections are visible in other ionograms. In ionogram #4, the values of the f o F 2 frequency span from 4.3 to 6.3 MHz with a mean value of 6.1 MHz. In this case, one of the participants accidentally chose the peak frequency of the F 1 layer X mode trace (at 4.3 MHz) rather than the peak frequency of the F 2 layer O mode trace. In ionogram #10, the values of the h′F 2 virtual height range from 239 to 298 km, with a mean value of 291 km. In this case, one of the participants accidentally chose the local minimum of the F 1 layer X mode trace (at 239 km) rather than the local minimum of the F 2 layer O mode trace.
In some cases, pixel noise, spread F, and overlapping traces in an ionogram can make the manual selection of a parameter very challenging. For example, due to the spread F in ionograms #5 and #8, it is difficult to determine the exact virtual height h′F 2 at which the F 2 O mode trace becomes horizontal. In ionogram #5, the estimated h′F 2 spans from 276 to 302 km, which, given the amount overlap in the radar traces and the

10.1029/2020EA001123
Earth and Space Science amount of spread F, provides a reasonable range of estimates. In ionogram #8, the trace overlap and spread F are even more extreme, making the selection of h′F 2 somewhat subjective. The estimated h′F 2 spans from 307 to 347 km. The minimum value (307 km) appears to be too low, but a reasonable estimate of h′F 2 could range from the mean (326 km) to the maximum value (347 km).
A summary of the manually selected f o F 2 values from all the participants for each of the 10 ionograms is shown in Figure 3. Figure 3a shows the mean error and standard deviation of all the manually selected f o F 2 values for each ionogram as black circles with error bars. The maximum selected f o F 2 values for each ionogram are shown as green squares, and the minimum selected f o F 2 values for each ionogram are shown as red triangles. Figure 3b is like Figure 3a but expressed as percent error from the mean value. The f o F 2 variability for most of the ionograms is lower than 5%, but ionograms #3, #4, and #8 have significantly larger errors. The large errors in those ionograms were the result of users incorrectly selecting the foF 1 frequency trace in the ionograms rather than the f o F 2 frequency trace. Figure 4 has the same layout as Figure 3 but describes the range of manually selected h′F 2 values. The h′F 2 variability for most of the ionograms is lower than 10%, but the error in Ionogram #3 exceeds 50%. That outlier from Ionogram #3 is due to one of the participants estimating the virtual height for the F 1 layer (h′F 1 ) rather than estimating the virtual height for the F 2 layer (h′F 2 ).
Figure 5 also has the same layout as Figure 3 but describes the range of manually selected f max E values. The f max E variability for most of the ionograms is lower with exceptions of ionograms #3, #5, and #6. The f max E outliers in these ionograms are due to participants subjectively choosing background noise pixels as the peak  Table 1. These values will be used in section 7.

Parameter Estimation Based on Repeated Scaling of Ionograms
While the above results make it clear that there can be fairly large human errors in the ionospheric parameter estimation, one would expect that as the experience of the ionogram scaler increases, so should the precision of the estimation of ionospheric state parameters. To test the reproducibility of the ionospheric parameter estimation, a set of 61 ionograms were repeatedly rescaled 16 times by one person. Each scaling occurred on a different day and the person typically scaled the ionograms toward the end of their day when they were tired. Figure 6 shows the estimation of f o F 2 for the 61 ionograms that were repeatedly rescaled in a similar format to the figures above. In general, the estimation of f o F 2 was consistent and repeatable. As can be seen in the lower panel of Figure 6, standard deviation from the mean was less than 5% for 60/61 ionograms. There was one notable difference for ionogram 35, which includes odd features, which may be spread F. While the precision of the estimation is quite good for many of the ionograms, there are approximately five ionograms (i.e., ionograms #21, 31, 32, 44, and 50) that show large departures from the mean estimate, as indicated by the maximum or minimum estimate. Figure 7 shows the estimation of h m F 2 in a similar format to the previous figures. For this case, again most of the ionograms show that the error in the estimation of h m F 2 was less than 5%. In addition, the error from overestimation and underestimation was typically less than 5%. However, in approximately five instances (i.e., ionograms #44, 50, 58, 60, 61), the uncertainty of the h m F 2 parameter is larger. In the two worst cases, the error is greater than 30%. Figure 8 shows a similar figure for f max E, as Figures 6 and 7. Again, for most of the ionograms, the error estimates were < 5%. However, there were also cases for which it was difficult to obtain good parameter estimates; in particular, the error in f max E in ionograms 55, 56, and 57 was high. In these cases, the error was in excess of 15%, and the error from an overestimation of the parameter was as high as 40%.
Taken together, these results suggest for fairly well-behaved ionograms, an operator can manually handscale ionograms to reproduce highly consistent estimates of the ionospheric parameters, to better than 5% uncertainty. However, the results also show that in some cases ionospheric parameters can be difficult to reproducibly determine, thus resulting in large uncertainties. Given that ionograms are typically not repeatedly rescaled by the same operator, this demonstrates that human error can introduce significant deviations in estimated parameters for more disturbed ionograms (i.e., spread F, etc.).

Relevance to Basic Research
Understanding the variability of manually scaled ionospheric data is critical for using them for basic research.   The resulting maximum value curves (labeled as "Overestimate") and minimum value curves (labeled as "Underestimate") after implementing the errors to the inputs to N m F 2 and h m F 2 are shown in Figure 9.
The solar flux over the 30-year period is shown in Figure 9a and the two N m F 2 curves are shown in Figure 9b. The difference between the two N m F 2 curves increases with solar flux and peaks during maximum solar activity. The long-term changes in the modeled median monthly h m F 2 over the 30-year period are shown in Figures 9c and 9d

10.1029/2020EA001123
Earth and Space Science height versus time (Bremer, 1992) related to atmospheric cooling must first remove the solar and geomagnetic influenced variation patterns from the data. We do not remove that variability from these data. However, it is important to note that measuring errors could potentially mask long-term trends, and the findings in this research address that aspect of such analyses. Dandenault (2018) used the Field Line Interhemispheric Plasma (FLIP) model and the 30-year NGDC database of global h m F 2 observations to develop the Magnetic Meridional Neutral Thermospheric (MENTAT) model of thermospheric horizontal meridional neutral winds. As described by Richards and Torr (1988) and by Torr et al. (1990), the first-principles FLIP model calculates plasma temperatures and densities using the continuity, momentum, and energy equations, which are solved along entire magnetic flux tubes. Using the FLIP model, Miller et al. (1986) determined the constant of proportionality for the approximate linear relationship between changes in h m F 2 and the horizontal neutral wind speed, which along with the peak height in the absence of a neutral wind, provides a means for ionospheric modelers to reproduce the observed height with reasonable precision. Richards (1991) and Richards et al. (1995) introduced a new approach that reduced the complexity and computational requirements and more accurately reproduced the observed ionospheric heights.
It is also worth looking at how human error may affect modeled short-term variability of the ionosphere. In order to do this, three sets of h m F 2 values are fed into the FLIP model to determine how human error may affect the output of a model that can be driven by ionospheric data sets. In this case, we look at the effects on modeled N m F 2 and modeled neutral winds. To accomplish this, the NGDC values of h m F 2 from Boulder for the year 1975 are used by the model as the altitude constraint for the ionospheric F 2 layer height. The first set of data input to FLIP are Bilitza h m F 2 calculated using the raw, unmodified f o F 2 and f o E data for Boulder  Figure 10a. The differences in h m F 2 over the three days are significant. The "No Error" layer height is significantly lower than the "+Error" and "−Error" layer heights in the evening and postmidnight periods on each day.
The resulting modeled meridional winds in Figure 10b are also significantly different. The "No Error" winds are significantly more northward and stronger than the "+Error" and "−Error" winds in the afternoon, evening, and premidnight periods on each day, which is the expected behavior due to the flow of winds from dayside to nightside.
The modeled N m F 2 values are shown in Figure 10c. The reported NGDC observations of N m F 2 from the ionosonde at Boulder are as added ground truth and shown on the plot as gray circles. The "No Error" modeled N m F 2 does a better job of estimating the morphology of the ionosonde observations than the "+Error" and "−Error" modeled N m F 2 . In particular, the "No Error" N m F 2 does a much better job of estimating the nighttime N m F 2 observations. This is due to using the correct "No Error" h m F 2 to drive the FLIP model, which allows FLIP to estimate the correct meridional neutral winds which result in the correct nighttime F region production and loss rates.

Discussion
Two experiments are conducted to estimate the variability of manually scaling ionograms using a simple point-and-click graphical user interface. In the first experiment, 10 VI ionograms are hand-scaled 20 times by a small team of ionospheric researchers. This speculative exercise is done to demonstrate the potential variability of manual scaling from a group, but this approach is not necessarily representative of short time periods since the ionograms in the NGDC database were not manually scaled repeatedly by teams. This method is, however, representative of long-time periods since it is probable that many different people manually scaled the ionograms at any one ionosonde site during the 30-year span of the NGDC database.
The second experiment consisted of one expert repeatedly scaling a set of 61 ionograms, 16 separate times. The goal was to how the ionogram parameters may vary when manually scaled over and over again by the same person. This was done to demonstrate the variability from a single individual over a shorter time period, which is also useful since one person may have been doing all of the manual scaling at one ionosonde site for days or weeks at a time, and may have scaled all of the ionograms during a geomagnetic storm event. The results of both experiments are used to estimate the overall variability of the hand-scaled ionogram parameters,  and then used to estimate the downstream impact on models that use the manually scaled parameters as input data.
The results of hand-scaling the ionograms in sections 5 and 6 show there can be large variability in the manual scaling of f o F 2 , f max E, and h′F 2 from VI ionograms. In some cases, the pixel noise on an image made the determination of the maximum frequency of the E region (f max E) very tricky. In other cases, spread F on an image made it very difficult to isolate the signal trace of the F 1 region from the signal trace of the F 2 region. It is worth noting that sometimes the maximum estimate is outside of the standard deviation around the mean. Spread F can also make isolation of the F 2 X mode trace from the F 2 O mode trace very challenging. In other cases, human error was observed; for example, accidental selection of the wrong region of trace-in particular, by choosing the h′F 1 layer trace rather than choosing the h′F 2 layer trace. It also appears that an operator's mind can simply wander from the task at hand. Even though an incorrect value was chosen for a parameter, the operator did not bother to undo the selection before moving on to the next step.
The repeated scaling of ionograms demonstrated that for fairly benign ionograms, it was possible to determine ionospheric parameters, such as f o F 2 , h m F 2 , and f max E. However, it was found that careless mistakes are made which can produce large errors. Also, as mentioned, in cases of spread F or other unusual features, the estimation of ionospheric parameters produces larger variability.
Section 7 demonstrated how manually scaling parameters from ionograms can affect models that use the parameters downstream as part of the basic research process. The long-term trend analysis of h m F 2 from 1961 to 1990 was not significantly sensitive to the impact of errors imposed on the ionosonde parameters f o F 2 and f o E from Boulder. In fact, the error provided a consistent long-term numerical difference between both empirical models of h m F 2 . On the other hand, the short-term term analysis showed that the modeled winds and electron density can be very different if small adjustments are made to f o F 2 and f max E. The errors imposed on the input to h m F 2 lead to incorrect internal dynamic of the model, which results in modeling errors. This information is very valuable for ionospheric modeling.

Conclusions
These findings suggest that (1) there will always be an unknown amount of error in some historical data sets of the ionosphere, (2) the impact of the variability can be substantial during short time periods (hours/days/ storms) and must be considered before relying on the data for basic research, and (3) steps should be taken to minimize unknown uncertainties in modern data sets. Any time that humans (or computers) generate secondary products from lower-level data, extreme care should be taken to characterize and document how those data were derived. Modern data sets should always include comprehensive and historical documentation of instrument calibration; all the lowest-level data products should be recorded and maintained for posterity; and a historical record should be maintained of all the experts who contributed to the generation of each data set.

Data Availability Statement
The historical ionosonde data are available from the National Geophysical Data Center (NGDC: https://catalog.data.gov/dataset/ionospheric-digital-database). The data used for this study are available for download at this site (https://doi.org/10.5281/zenodo.3525038).