Wintertime Formaldehyde: Airborne Observations and Source Apportionment Over the Eastern United States

Formaldehyde (HCHO) is generated from direct urban emission sources and secondary production from the photochemical reactions of urban smog. HCHO is linked to tropospheric ozone formation, and contributes to the photochemical reactions of other components of urban smog. In this study, pollution plume intercepts during the Wintertime INvestigation of Transport, Emissions, and Reactivity (WINTER) campaign were used to investigate and characterize the formation of HCHO in relation to several anthropogenic tracers. Analysis of aircraft intercepts combined with detailed chemical box modeling downwind of several cities suggests that the most important contribution to observed HCHO was primary emission. A box model analysis of a single plume suggested that secondary sources contribute 21 ± 10% of the observed HCHO. Ratios of HCHO/CO observed in the northeast US, from Ohio to New York, ranging from 0.2% to 0.6%, are consistent with direct emissions combined with at most modest photochemical production. Analysis of the nocturnal boundary layer and residual layer from repeated vertical profiling over urban-influenced areas indicates a direct HCHO emission flux of 1.3 × 10¹⁴ molecules cm⁻² h⁻¹. In a case study in Atlanta, GA, nighttime HCHO exhibited a ratio to CO of 0.6%–1.8% and was anti‐correlated with O3. Observations were consistent with mixing between urban air masses carrying direct HCHO emissions and those influenced by more rapid HCHO photochemical production. The HCHO/CO emissions ratios determined from the measured data are 2.3–15 times greater than those in the NEI 2017 emissions database. The largest observed HCHO/CO was 1.7%–1.8%, located near co‐generating power stations.

Constraints on the anthropogenic contribution to HCHO are best obtained in the absence of significant biogenic contribution, but research grade observations in the winter are uncommon. Zhu et al. (2017) noted that existing EPA monitoring measurements from sites near urban and rural areas may not accurately resolve the concentration of HCHO in the stratified winter atmosphere. Other studies that utilize three-dimensional atmospheric chemistry transport models, such as Community Multiscale Air Quality (CMAQ) and other Air Quality Models (AQMs), cite that underestimation in HCHO is related to the underestimation of VOCs or a failure to capture and quantify VOC precursors of HCHO from anthropogenic activities (Luecken et al., 2012;Luecken et al., 2018;Warneke et al., 2007;Zhu et al., 2017). Furthermore, aircraft missions investigating regional air quality in the lower atmosphere have most commonly been conducted in the summer when photochemical secondary HCHO production is most active (Parrish et al., 2009a;Ryerson et al., 2013;Warneke et al., 2016).
The Wintertime INvestigation of Transport, Emissions, and Reactivity (WINTER) mission was designed to understand the processes controlling atmospheric composition in the wintertime boundary layer of the Eastern U.S. A recent analysis of these observations has revealed that HCHO can be as much as 50% of the daily integrated radical source (Haskins et al., 2019). Furthermore, Jaeglé et al. (2018) found a bias of −46% in GEOS-Chem simulated HCHO concentrations relative to WINTER observations at altitudes < 1 km. This bias was eliminated in the model by increasing direct HCHO emissions from residential wood combustion and mobile sources by a factor of 5. Justification for the adjustment factor is based on the study by Jobson and Huangfu (2016), which suggested that wintertime cold start HCHO emissions are underestimated by a factor of 5, and the study by VanderSchelden et al. (2017), which found that residential wood combustion accounts for a higher percentage of HCHO than expected from the National Emissions Inventory (NEI). Underestimation of wintertime HCHO concentrations and VOC precursors in modeling studies that utilize EPA monitoring data has also been reported by Zhu et al. (2017) and Luecken et al. (2012). Uncertainties in emission inventories are thus large and have important consequences for simulating air quality-relevant photochemistry.
Here, we present a detailed analysis based on WINTER observations to investigate anthropogenic sources of HCHO near urban areas and power plants in the Eastern U.S. We use several analysis tools to segregate primary emissions and secondary production of HCHO, including a linear combination model, a detailed chemical box model constrained by observations downwind of urban areas, observations of HCHO accumulation rates in the nocturnal boundary layer, and examination of enhancement ratios relative to carbon monoxide (CO). Results are compared to several emissions inventories, global model simulations results, and other historical data to provide insight and constrain uncertainties in the HCHO budget.

WINTER Campaign Flight Description
The WINTER campaign was a 6-week field campaign conducted from midday-into-night flights (RF03 and RF04), and two night flights (RF07 and RF10). These flights sampled a variety of potential HCHO sources, including urban areas with motor vehicle activity. Several plumes with apparent HCHO emissions from natural gas and diesel fuel co-generating power stations were sampled and quantified.

Instrumentation
The NASA In Situ Airborne Formaldehyde (ISAF) instrument provided fast measurements of HCHO (Cazorla et al., 2015). ISAF has a nominal accuracy of 10% based on standard additions from a calibrated compressed-gas standard and a precision of ∼30 pptv for an integration time of 1 s. On-board instrumentation has been described in previous work (Fibiger et al., 2018; Green et al., 2019). Sparks et al. (2019) provided information about nitrogen species that were measured simultaneously with HCHO. The instruments used to measure each chemical species are described in Table 1.
Oxidized nitrogen, NO x (NO + NO 2 ), reactive nitrogen species NO y (NO x , NO 3 , N 2 O 5 , and other species), and O 3 were measured by a six-channel cavity ring-down spectrometer (CRDS) (Brown et al., 2017;Wild et al., 2014). A thermal dissociation laser induced fluorescence (TD-LIF) instrument (Day et al., 2002;Sparks et al., 2019) measured NO 2 and speciated reactive nitrogen, including alkyl and peroxy nitrates and HNO 3 . A chemiluminescence instrument (CL) also provided data for O 3 , NO, and NO y (Sparks et al., 2019;Weinheimer et al., 1994). A modified, rack mounted TECO Model 43C pulsed fluorescence SO 2 analyzer provided SO 2 gas measurements as described in Green et al. (2019). Other measurements not listed in Table 1 include the measurement of the actinic flux and photochemical rates of various photochemical processes by the HIAPER Atmospheric Radiation Package (HARP) (Laursen et al., 2006), the measurement of CO 2 and CH 4 using the Picarro CRDS, the measurement of CO using commercial Aero-Laser AL-5002 VUV resonance fluorescence instrument (Salmon et al., 2018;Yuan et al., 2015), and the measurement of volatile organic compounds by the TOGA instrument (Apel et al., 2015

Pollution Plume Identification, Tracking, and Analysis
Enhancements of HCHO relative to other pollutants were determined for plumes intercepted aloft from both point sources (i.e., electric power generating stations) and urban areas. Power station plumes were identified using the molar ratio of SO 2 /NO y for coal fired power stations, and SO 2 /NO y , CO 2 /NO y , and CO/ NO y for co-generating power stations using a combination of natural gas, bio-diesel, wood, coal, and other fuels. These molar ratios were determined from orthogonal distance regression (ODR) fits to plots of SO 2 versus NO y , SO 2 versus CO, and CO 2 versus NO y . SO 2 /NO y ratios were compared to the ratios of the corresponding emissions of SO 2 and NO x provided by the Environmental Protection Agency (EPA) continuous emission monitoring (CEMS) database to identify plumes from coal fired power stations, with CO 2 /NO y used to identify co-generating stations. Emissions ratios calculated using the aggregate data from the NEI database from 2014 and 2017 in the form of yearly averages were also used for comparison, since HCHO data were not available in the CEMS database. Additionally, CO/NO y was used to separate urban plumes, where urban air masses typically have a CO/NO y ratio between 4 and 10, and power stations using diesel fuels have a CO/ NO y ratio greater than 20 (Hassler et al., 2016;C. Li et al., 2010;Parrish et al., 2009b). Techniques for curve fitting, plume aging, and tracking analysis, and the use of chemical ratios to differentiate urban from power station plumes are described in earlier work (Green et al., 2019).
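The ratio-based plume screening described above can be illustrated with a minimal sketch. It assumes equal weighting of errors in both variables, for which the orthogonal distance regression slope of a straight line has a closed form; the function names, the synthetic CO and NOy values, and the "unclassified" label are hypothetical and are not taken from the campaign analysis. Only the CO/NOy thresholds (4 to 10 for urban air, greater than 20 for diesel-fueled power stations) come from the text.

```python
import math


def odr_slope(x, y):
    """Slope of the orthogonal-distance (total least squares) line through (x, y).

    Assumes equal error variance in x and y, which is an illustrative choice.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; the slope is undefined")
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)


def classify_by_co_noy(ratio):
    """Apply the CO/NOy thresholds quoted in the text (hypothetical labels)."""
    if 4 <= ratio <= 10:
        return "urban"
    if ratio > 20:
        return "diesel power station"
    return "unclassified"


# Synthetic plume transect (ppbv); these values are invented for illustration.
noy = [2.0, 4.0, 6.0, 8.0, 10.0]
co = [130.0 + 6.0 * v for v in noy]

ratio = odr_slope(noy, co)
label = classify_by_co_noy(ratio)
print(f"CO/NOy = {ratio:.2f} -> {label}")
```

A weighted ODR routine would be preferable when the measurement uncertainties of the two species differ substantially; the closed form is used here only to keep the sketch dependency-free.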
Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) back trajectory analysis, using the NAMS hybrid sigma-pressure archived data set, was performed from a given plume intercept to 8 h prior to the intercept time to trace the point of origin back to a specific power station as described in Green et al. (2019). HYSPLIT was used in combination with the wind field measurements taken by aircraft instrumentation to confirm the path and the approximate time carried aloft.
In addition to the techniques previously described, a linear model was applied for HCHO source apportionment using several chemical tracers: CO, CH4, SO2, and NOx. The technique was described by Rawlings et al. (2001) and used by Friedfeld et al. (2002) and Garcia et al. (2006). The multivariable linear regression model assumes that HCHO variability can be described as a linear combination of variability in other correlated observations. The linear model is described by Equation 1 (Garcia et al., 2006; Rawlings et al., 2001), where Y and X1…Xm are matrices of dimension n × 1, and where n is the number of data points for each chemical tracer used. The coefficients are solved for each tracer species data set that was used for a given linear model, as described in Section 3.1.1. The solution for these coefficients is used to determine the percent correlation of a chemical species X in relation to the [HCHO] input into Y by using Equation 2 (Garcia et al., 2006), in which the chemical species of interest is represented in the numerator. The r² value is related to the percent variability in HCHO that can be attributed to changes in these species. Before each chemical species was used in the linear model, it was correlated to HCHO using ODR fitting to justify its use. Species that were individually correlated with HCHO were then used in the linear model, and the data ranges from the ODR correlation were used to find the coefficients for an equation similar in form to Equation 1. Each coefficient in the set predicts the association with the corresponding chemical tracer, and the percent covariance of a chemical species to Y follows from Equation 2 (Garcia et al., 2006).
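As a rough illustration of this kind of tracer-based apportionment, the sketch below fits a generic multivariable linear model of the form Y = b0 + b1·X1 + … + bm·Xm by ordinary least squares and then expresses each fitted term as a percentage of the total fitted HCHO. Both the model form and the percentage formula are assumptions made for illustration; they are one plausible reading of the approach and may differ from Equations 1 and 2 as published. All data values and function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_model(y, tracers):
    """Least-squares coefficients [b0, b1, ..., bm] via the normal equations."""
    cols = [[1.0] * len(y)] + [list(t) for t in tracers]
    ata = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    aty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(ata, aty)


def percent_attribution(beta, tracers):
    """Percent of fitted HCHO carried by the intercept and each tracer term."""
    n = len(tracers[0])
    terms = [beta[0] * n] + [b * sum(t) for b, t in zip(beta[1:], tracers)]
    total = sum(terms)
    return [100.0 * t / total for t in terms]


# Hypothetical background-subtracted tracers (ppbv) across one plume transect.
co = [10.0, 30.0, 20.0, 50.0, 40.0, 60.0]
nox = [5.0, 2.0, 8.0, 3.0, 9.0, 4.0]
# Synthetic HCHO built from known coefficients so the fit can be checked.
hcho = [0.2 + 0.005 * c + 0.02 * n for c, n in zip(co, nox)]

beta = fit_linear_model(hcho, [co, nox])
shares = percent_attribution(beta, [co, nox])
print("coefficients:", [round(b, 4) for b in beta])
print("percent (background, CO, NOx):", [round(s, 1) for s in shares])
```

Because the tracers are often strongly correlated with one another, the individual shares from such a fit should be read with caution; the split between collinear tracers is poorly constrained even when the overall fit is good.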

10.1029/2020JD033518
an approximately perpendicular path to the direction of transport, which was determined using the plane's direction relative to the measured and HYSPLIT calculated wind field. Within the hourly time scale, the measurement time for the aircraft is about 3–7 minutes; with a horizontal wind speed of 14–20 km/hr, the change in distance (0.7–2 km) and the time elapsed remain small and well within uncertainty. We anticipate that Equation 2 would only be valid during winter, when primary HCHO associated with co-emission of other pollutants would be the dominant source. Secondary HCHO and, to a lesser extent, CO, would degrade the validity of this approach in summer. The concentrations for CO and CH4 in the time series plot are the amounts above the background levels of 122 ppbv for CO and 1,896 ppbv for CH4, with the background level of each chemical species defined as the concentration measured in the absence of urban and power station pollution plumes. The resulting stacked plot at the bottom of Figure 2 illustrates only the amount of [HCHO] that is attributed to the chemical tracers input into the model, along with the portion of [HCHO] attributed to the background concentration of HCHO or to HCHO production from unaccounted sources of a chemical species, which is labeled B0. The black line is the time series of HCHO (ppbv) measured over the intercepted plume, as shown in the top of Figure 2. Portions of the data set where there was a gap in the recorded data of one or more of the chemical species were excluded from the linear model. The area under the black line is separated into regions that represent the percentages of each co-emitted compound attributed to the production of HCHO, with an r² value of 0.67.

Daytime Urban Emissions and Sources
The examination of daytime emissions of HCHO from the urban environment during the winter focuses on data obtained from flights RF02 and RF03. Figure  photochemical oxidation of NO x had occurred. The CO/NO x ratio of 5.92 is consistent with analysis of urban emissions from major cities in the U.S. (Bai et al., 2017;Hassler et al., 2016). We can also compare the value for HCHO/CO of 0.59% to the expected mixing ratio enhancements from primary emissions of HCHO and CO from the on-road vehicle fleet. Parrish et al. (2012) report an HCHO/CO ratio of 0.3%, while Rappenglück et al. (2010) report a ratio of 0.6% based on summertime observations. This comparison indicates that the ratio of 0.6% is an upper limit to the direct emission of HCHO given that there is likely some contribution of photochemical/secondary production. In the absence of strong photochemistry, as seen in the winter, a lower limit for HCHO/CO ratio of 0.2% has been observed in urban plumes. The value is just slightly higher than the lowest value of 0.17% found during the summer morning peak traffic periods. The chemical box model analysis below provides a more quantitative estimate of the secondary contribution to observed HCHO.
For the urban areas investigated in RF02 and RF03, transport times calculated with observed winds agreed to within 1-2 h with the HYSPLIT trajectories passing through the urban centers. HYSPLIT was used to estimate transport times for urban plumes 1-4 and 7, which share the same path of transport as indicated by the circles and wind barbs, originate from Cincinnati, Ohio, and pass over the urban area of Columbus, OH.

Analysis of Urban Plumes over the Columbus and Cincinnati Urban Areas
Scatter plots of NO x versus NO y for urban plumes that were measured near cities in RF02 exhibited strong correlation (r 2 > 0.95) with ratios of NO x /NO y of 0.99, decreasing to 0.94 for urban plumes that were carried aloft to 240 km downwind from the city center of Cincinnati, OH. In general, the plumes belonging to the Cincinnati and Columbus, Ohio metropolitan areas were encountered at nearly the same altitude and correspond to CO concentrations in the range of 180-240 ppbv. The ratios of HCHO/CO in RF02 ranged between 0.26% and 0.50%, and HCHO and CO were moderately correlated. Linear fits of HCHO to CO and other tracers give r 2 > 0.44 for HCHO to CO, r 2 > 0.49 for HCHO to NO x and NO y , r 2 > 0.40 for HCHO to CH 4 , and r 2 > 0.30 for HCHO to SO 2 . CO had very good correlations to methane with r 2 > 0.89, CO to NO x and CO to NO y , r 2 > 0.95 for both, with NO x /NO y ratios close to 1.
GREEN ET AL.
Multiple linear regression analysis was applied to determine the percentage of HCHO that is connected to the presence of each of the tracers, or co-emitted compounds: CO, CH4, SO2, and NOx. A correlation plot of HCHO against the sum of PNs (peroxynitrates) from the TD-LIF instrument (Day et al., 2002) is a test of possible secondary HCHO production. The correlation was poor, with r² < 0.12 for plumes 1-4 and 7 in RF02. The low correlation of HCHO with reported PNs and NOx/NOy ratios close to 1 are consistent with the slow photochemistry observed by Sparks et al. (2019). The reaction of CH4 with OH, which results in the production of HCHO, is slow compared to that of NO2 with OH. CH4 has a lifetime of 9.6 years and reacts 4 orders of magnitude slower than NO2 + OH, which has a pressure and temperature dependent rate constant (at 278 K and 943 mbar) of 1.13 × 10⁻¹¹ cm³ molecule⁻¹ s⁻¹. Evidence of slow NO2 + OH photochemistry therefore implies an even slower oxidation of CH4 by OH. Any addition of HCHO by CH4 oxidation would cause an anti-correlation of −0.007% (a slope of −7.0 × 10⁻⁵), which is well within the standard deviation of our correlations between HCHO and CH4, as shown in Figure 3. The results from the linear model, shown in Figure 5, have an r² > 0.67 for the multivariate fit for all four tracers in plume 1 over Cincinnati, and an r² > 0.60 for plume 4 over Columbus. When these results are considered with the linear fits discussed in the previous paragraph, this suggests that a secondary source of HCHO does not correlate with any of the measured tracers.
Secondary production of HCHO from AVOCs, though reduced during winter, may also contribute to the observed enhancements in the urban plumes. The CO, CH4, NOx, and SO2 tracers can also be co-emitted with AVOCs from a variety of sources that undergo photochemical oxidation to produce HCHO. The linear model analysis suggests that primary emission explains most of the HCHO, accounting for just over 60% of HCHO variability, at least close to the source region. The remainder of the variability may arise from a variety of sources, including primary production that correlates to other tracers not included in this model, measurement uncertainty, and secondary production. The evidence for a contribution from secondary production is as follows. First, although there is a strong correlation of CO to NOx along the path of transport for plumes 1-4, there is also a decrease in the percentage of HCHO apportioned to NOx from plume 1 to plume 4, with the tracers not following the evolution of HCHO along the path of transport.
The NOx/CO enhancement ratios are 0.29 for plume 1 and 0.15 for plume 4, similar to enhancement ratios attributed to vehicle traffic from within an urban region according to emissions inventories conducted by Hassler et al. (2016). Second, the r² value of 0.67 in the linear fit suggests either a missing primary emission or a secondary production of HCHO that is not captured by the tracers used in this analysis. The amount of secondary HCHO will be estimated using a box model analysis given below.
To assess the relative contribution of primary and secondary HCHO sources, we examine the evolution of the Cincinnati and Columbus plumes from RF02. Figure 6 shows the progression of the HCHO-to-CO enhancement ratio, ΔHCHO/ΔCO, as a function of Lagrangian plume age for each of the five plumes circled in Figure 4. This ratio is calculated as the slope of an ordinary least-squares fit of HCHO versus CO for each plume. Following the first transect (∼0.4 h downwind of Cincinnati), ΔHCHO/ΔCO increases by ∼33% before decreasing gradually in the next three plumes. Toluene and benzene, which do not have secondary sources, exhibit similar behavior ( Figure S1). This is loosely consistent with primary emissions driving the observed changes in HCHO, as opposed to strong secondary production.
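The enhancement ratio calculation for a single transect can be sketched as follows. This is a minimal illustration of an ordinary least-squares slope of HCHO against CO; the mixing ratios are invented, chosen only so that the result falls inside the 0.26% to 0.50% range reported for RF02, and the function name is hypothetical.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical 1 Hz samples across one plume (ppbv); not campaign data.
co_ppbv = [180.0, 195.0, 210.0, 225.0, 240.0]
hcho_ppbv = [0.60, 0.66, 0.72, 0.78, 0.84]

# The slope is dimensionless (ppbv/ppbv); multiply by 100 to express in percent.
enhancement_ratio_percent = 100.0 * ols_slope(co_ppbv, hcho_ppbv)
print(f"dHCHO/dCO = {enhancement_ratio_percent:.2f} %")
```

Using the slope rather than a ratio of means removes the need to specify background values explicitly, since a constant background shifts the intercept and leaves the slope unchanged.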
Though oxidation is slow in the winter, it is likely that some fraction of HCHO is produced through oxidation of VOC, especially alkenes. To illustrate this point, we approximate the evolution of the urban plume with a 0-D photochemical box model. Here we employ the Framework for 0-D Atmospheric Modeling (F0AMv4, available at https://github.com/AirChem/F0AM) (Wolfe et al., 2016) , 2003). The model is initialized with mixing ratios from the peak of the first plume downwind of Cincinnati. Chemical constraints include CH 4 , CO, O 3 , NO, NO 2 , HONO, and a range of hydrocarbons and oxidized VOC (including HCHO). The TOGA instrument does not measure small alkenes, so we estimate initial mixing ratios of ethene, propene, c-2-butene, t-2-butene, and 1,3-butadiene by rescaling emission ratios for Los Angeles reported by de Gouw et al. (2017a). Meteorology (P, T, RH) and photolysis frequencies are fixed at observed mean values for the five plumes. Dilution is treated as first-order with a dilution constant of 1 day −1 and background concentrations taken from observations just south of the plume intercepts. The model is integrated forward in time for 6 h. After t = 0, all chemical concentrations are determined by the model except for HONO, which is constrained to observed values (50-100 pptv) throughout the simulation to improve representation of OH (the model does not include heterogeneous HONO sources). HCHO enhancement ratios are calculated as the ratio of background-subtracted model HCHO and CO mixing ratios.
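To give a sense of scale for the dilution term alone, the sketch below applies first-order relaxation toward background, C(t) = C_bg + (C0 − C_bg)·exp(−k·t), with the 1 day⁻¹ dilution constant and 6 h integration time stated above. It treats a chemically inert tracer and is not a substitute for the box model. The starting CO value is hypothetical; the 122 ppbv background is the CO background quoted earlier in the text, used here only as an example.

```python
import math

K_DIL_PER_DAY = 1.0  # first-order dilution constant stated in the text


def diluted(c0, c_bg, hours, k_per_day=K_DIL_PER_DAY):
    """Mixing ratio after first-order dilution toward background, no chemistry."""
    return c_bg + (c0 - c_bg) * math.exp(-k_per_day * hours / 24.0)


co_initial = 240.0     # ppbv, hypothetical plume peak
co_background = 122.0  # ppbv, CO background quoted in the text

co_after_6h = diluted(co_initial, co_background, 6.0)
fraction_remaining = (co_after_6h - co_background) / (co_initial - co_background)
print(f"CO after 6 h: {co_after_6h:.1f} ppbv")
print(f"enhancement remaining: {100 * fraction_remaining:.1f} %")
```

Under these assumptions roughly 78% of an inert enhancement survives 6 h of dilution, so a ratio of two co-diluting enhancements such as ΔHCHO/ΔCO is only weakly affected by dilution itself.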
This simulation is meant to illustrate the relative contribution of secondary production to the HCHO budget. As such, we divide model HCHO into two classes: "initial" HCHO (which we presume is mainly from direct emissions), and "secondary" HCHO produced within the plume after t = 0. This model is not designed to accurately capture all aspects of Lagrangian plume evolution. Sensitivity simulations, however, indicate that results presented here are insensitive to assumed photolysis frequencies or dilution rates. The median OH concentration for the simulation is 4 × 10 5 cm −3 , within the range of observations under similar conditions (Stone et al., 2012). We estimate an uncertainty of ±50% in model OH due to constraints and assumptions.
Model results indicate moderate photochemical production of HCHO ( Figure 6). Integrating over the 6-h simulation and accounting for the uncertainty in OH, we estimate that secondary HCHO comprises 21 ± 10% of the model HCHO budget. It is difficult to directly compare modeled and observed HCHO as aircraft sampling is only pseudo-Lagrangian (6 h of age sampled in 1 h of flying), but results are consistent with our expectation of missing HCHO sources in the model. Potential explanations include additional HCHO precursors or direct HCHO emissions. Given the slow oxidative conditions of the wintertime atmosphere, an additional photochemical source of HCHO is a less likely explanation. Assuming a mixed layer depth of ∼1 km, a regional HCHO emission flux of 0.5-3 × 10 14 molecules cm −2 h −1 could explain the range of observed ΔHCHO/ΔCO. The low end of this estimate is consistent with emissions derived from analysis of nocturnal boundary layers in a different region (Section 3.2).
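The link between a surface emission flux and the resulting change in mixing ratio within a well-mixed layer can be sketched with simple arithmetic. The example below assumes the ∼1 km mixed layer depth given above and an air number density computed from the ideal gas law at 278 K and 943 mbar; those conditions are quoted elsewhere in the text for the rate constant, and applying them here is an assumption. The function names are hypothetical.

```python
BOLTZMANN = 1.380649e-23  # J/K


def air_number_density(pressure_mbar, temperature_k):
    """Air number density in molecules cm^-3 from the ideal gas law."""
    per_m3 = pressure_mbar * 100.0 / (BOLTZMANN * temperature_k)
    return per_m3 * 1e-6


def flux_to_ppbv_per_hour(flux, depth_km, m_air):
    """Convert a flux (molecules cm^-2 h^-1) to a mixing ratio tendency (ppbv/h)."""
    depth_cm = depth_km * 1e5
    return flux / depth_cm / m_air * 1e9


m_air = air_number_density(943.0, 278.0)  # assumed conditions
low = flux_to_ppbv_per_hour(0.5e14, 1.0, m_air)
high = flux_to_ppbv_per_hour(3.0e14, 1.0, m_air)
print(f"air number density: {m_air:.3e} molecules cm^-3")
print(f"HCHO tendency: {low:.3f} to {high:.3f} ppbv per hour")
```

With these inputs the stated flux range corresponds to roughly 0.02 to 0.12 ppbv of HCHO added per hour throughout a 1 km layer, before any loss or dilution.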

Analysis of Urban Plumes Over the New York Metropolitan Area
Plumes near the New York metropolitan area in RF03 had NOx/NOy ratios that were very close to 1, as in RF02. If plume 1 in RF03 (shown in Figure 7) was carried aloft from the New York area, then the start of emission was approximately 3:00 p.m. EST, which was 2 h before sunset and 1 h before the evening rush hour period. Urban plumes 1-3 show good correlations in the individual linear fits, with r² values of 0.7-0.85 for HCHO/CO and r² > 0.81 for HCHO/CH4. For plume 1, r² = 0.94 for CO/CH4 and r² = 0.96 for CO/NOx. New York plumes have an average HCHO/CO ratio of 0.6%, which is twice the value for the primary emission ratio cited by Cowling et al. (2007) and Parrish et al. (2012), and equal to the value of this ratio for urban emissions during rush hour in the summertime Houston area (Rappenglück et al., 2010). Figure 8 shows the results from the linear model for plume 1 (Figure 7) in RF03, which was closest to New York City (r² ranging from 0.85 to 0.89 for the multivariable fit). Combined with the timing of when these plumes exited the urban area, the linear model shows that 43% of HCHO production is linked to methane, 23.4% to CO, 27.1% to NOx, and 6.5% to SO2 emissions. The New York City correlation plots for plume 1, shown in Figure 3, are distinct from those of the cities in Ohio, showing a much larger contribution of CH4, possibly from vehicles and/or home heating, as it corresponds to the highest NOx mixing ratios of 10-20 ppbv over New York City. Unlike the urban region in the vicinity of Columbus, the r² > 0.81 for HCHO to CH4 shows that HCHO emissions in New York were strongly correlated with CH4. However, the contribution of NOx is also much higher than what was seen in either urban plume in RF02. SO2 measurements for the region range from 3 to 5 ppbv, with no apparent evidence of coal fired power plant emissions.
The majority of combustion products from the vehicular fleet in NY originate from ethanol enriched fuels (E10-E85) during the wintertime (Brito et al., 2015; Clairotte et al., 2013; He et al., 2009; Knoll et al., 2009). These products of combustion include formaldehyde, acetaldehyde, NOx, other AVOCs, and unburned ethanol, which can photochemically react and decompose into acetaldehyde, peroxyacetylnitrate (PAN), formaldehyde, and ozone (Carr et al., 2011; Dardiotis et al., 2015; Dominutti et al., 2016; Timonen et al., 2017). These combustion products are also generated by fuels used in domestic heating and can result in photochemical products. In this work, observations do not include the tracers needed to unambiguously separate the contribution of HCHO from vehicles from the contribution from home heating.

Point Sources of HCHO
Direct emissions of HCHO may explain the presence of two large plumes in Southern Ohio. In Figure 4 (RF02), two plumes showed evidence of emissions from a point source in a region where there was no power generating station and where no associated tracers (NOx, NOy, SO2, and CH4) were correlated with the plume or associated with an EPA reporting facility. At the initial point of detection, the plume had an average [HCHO] of 1.0 ppbv and a maximum [HCHO] of 3.4 ppbv. The plume was encountered again over a broader area, 146 km from the initial point of detection, with an average [HCHO] of 1.2 ppbv, a maximum [HCHO] of 2.3 ppbv, and an estimated travel time of 3.6 h. The correlation of HCHO to CO, CH4, SO2, NOx, and NOy produced r² < 0.15 in plumes 5 and 9 (see Figure 4), with the exception of CO in plume 9, which had an HCHO/CO correlation of r² = 0.46; methane was near background levels, in the range of 1.89-1.9 ppmv, at both points. The lack of correlation to the tracers associated with urban emissions and the low concentration of methane are not consistent with urban areas or areas where combustion of fossil fuels is occurring. This shows some evidence of a missing source of AVOCs or direct emissions contributing to HCHO.
Compared to other studies, the 0.2%-0.5% HCHO/CO ratios found in RF02 overlap the range of 0.18%-0.30% for urban emissions by vehicular traffic observed by Cowling et al. (2007) and are higher than the 0.10%-0.14% for urban emissions reported by Anderson et al. (1996). These values approach the summertime mixing ratio enhancements expected from primary emissions of HCHO and CO from vehicular traffic, namely 0.3% given by Parrish et al. (2012) and 0.6% given by Rappenglück et al. (2010). Given that the chemical box model used in this work suggests that secondary HCHO accounts for 21 ± 10% of the observed HCHO well downwind of urban areas in winter conditions with low BVOC, the observed HCHO/CO ratios may be regarded as modest upper limits to the primary emission ratios in the presence of unknown point sources.

Night Time Urban Emissions
The data collected during RF07 provide an opportunity to constrain regional HCHO emission using a budget-based approach for the nocturnal boundary layer (NBL). For this analysis, we assume that the residual layer (RL) represents NBL conditions at sunset (time = 0), and that any difference between the NBL and RL is due to surface emissions of HCHO. In addition, several assumptions were made in the analysis: (1) the residual layer and the nocturnal layer are stable and are mostly uncoupled, (2) nighttime chemical production/loss and dry deposition of HCHO are negligible, and (3) there is relatively minor variability in horizontal advection, such that we can treat the nocturnal boundary layer as a well-mixed and uniform box. As shown in Figure S2, several small airports were selected to perform multiple missed approaches (MAs). These missed approaches were used to ascertain the altitude of the boundary, residual, and nocturnal layers and to determine if these layers remained stable through the night, as shown in Figure 9. During RF07, the vertical thickness of the nocturnal layer ranged between 150 and 200 m, and that of the residual layer was 500-700 m. Relative humidity through the nocturnal and residual layers remained around 30%-40% and rapidly decreased above the boundary layer. Characteristics of each of the missed approaches with stable layers allowed for further analysis to estimate the emissions from the ground into the nocturnal layer.
In Equation 3, the difference in the concentration of HCHO, [HCHO] (molecules cm⁻³), between the nocturnal and residual layers was defined as Δ[HCHO]. This difference was then combined with the number density of air (M) and the nocturnal boundary layer height (H) for each missed approach to give the column-integrated concentration of HCHO within the nocturnal layer (Ω_HCHO, molecules cm⁻²), as defined in Equation 4. The resulting value of Ω_HCHO represents a column measurement of [HCHO] emitted from the ground to the nocturnal layer compared to the amount in the residual layer. When these values are plotted versus time since sunset, the slope gives the flux of HCHO emitted from the area in molecules cm⁻² h⁻¹.
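The budget calculation can be sketched as follows, under explicit assumptions: the layer difference is taken here as a mixing ratio in ppbv, which is converted to a column by multiplying by an assumed air number density and the nocturnal layer height, and the flux is taken as the least-squares slope of the column excess against time since sunset. The exact form of Equations 3 and 4 may differ. All input values are invented, apart from the layer height, which is chosen inside the 150 to 200 m range reported for RF07.

```python
AIR_NUMBER_DENSITY = 2.5e19  # molecules cm^-3, assumed near-surface value


def column_excess(delta_ppbv, height_m, m_air=AIR_NUMBER_DENSITY):
    """Column excess (molecules cm^-2) of a well-mixed layer of given height."""
    return delta_ppbv * 1e-9 * m_air * height_m * 100.0


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical missed-approach results: hours since sunset and the HCHO
# difference between the nocturnal and residual layers (ppbv).
hours_since_sunset = [2.0, 4.0, 6.0]
delta_hcho_ppbv = [0.1, 0.2, 0.3]
layer_height_m = 175.0

omega = [column_excess(d, layer_height_m) for d in delta_hcho_ppbv]
flux = ols_slope(hours_since_sunset, omega)  # molecules cm^-2 h^-1
print(f"estimated emission flux: {flux:.3e} molecules cm^-2 h^-1")
```

The slope-based estimate depends on the three assumptions listed for the nocturnal budget, in particular that chemistry, deposition, and advection contribute negligibly to the growth of the column excess.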
Using the data collected from the ascending and descending legs of MA 1, 2, and 7 (shown in Figure S2), Ω_HCHO is plotted against time since sunset to determine the HCHO emission flux into the NBL, as shown in Figure  Output from the reference and improved GEOS-Chem (v10-01) 3D model simulations sampled along the flight tracks was compared to the results found in Figure 10. Ω_HCHO values from GEOS-Chem were determined in a similar manner as the experimental measurements based on missed approach data, and both used the observed boundary layer height, since the vertical resolution of GEOS-Chem was insufficient to determine the height. The maximum HCHO emission flux from area sources was input into GEOS-Chem, using the NEI 2011 database scaled to 2015 and the improved simulation set from Jaeglé et al. (2018) featuring the 5× scaling for HCHO based on observational data around the three areas of the missed approaches, with 5.1 × 10¹³ molecules cm⁻² h⁻¹ at point 1 and 1.3 × 10¹⁴ molecules cm⁻² h⁻¹ at points 2 and 7 (see Figure S2) using a 50 km × 50 km grid. Compared to the average Ω_HCHO of 1.1 × 10¹⁴ molecules cm⁻² h⁻¹ and the max Ω_HCHO of 1.3 × 10¹⁴ molecules cm⁻² h⁻¹ found using the flight data, the scaled NEI input at point 1 is about a factor of 2 lower, while at points 2 and 7 the input is close to both the average and maximum Ω_HCHO. When emissions into the nocturnal boundary layer based on the GEOS-Chem output along the flight track were examined, the average emissions flux was (9.9 ± 1.7) × 10¹⁴ molecules cm⁻² h⁻¹, nine times larger than the measured average Ω_HCHO and 7.6 times larger than the max Ω_HCHO. However, this is an inaccurate comparison for several reasons.
In GEOS-Chem, the space within 0-3 km altitude contains 17 layers to compute the concentration of the chemical species, where the average concentration is calculated in a volume using the vertical height of the individual layer and the grid dimensions as a base. Each of the vertical heights is determined by the pressure in the atmospheric column from sea level to 80 km (a total of 72 layers) based on the GEOS-FP ("forward processing") meteorological data products. The pressure altitude range of 0-1,000 m, or up to the ceiling of 900 m altitude in the profile in Figure 9, would contain only 7 of these layers in GEOS-Chem. Consequently, there are fewer data points available in GEOS-Chem for averaging compared to the 50-100 points in the aircraft data. The result is still an improvement compared to the reference simulation done by Jaeglé et al. (2018) without any adjustment. The Ω_HCHO taken from the GEOS-Chem reference simulation output was (5 ± 2) × 10¹² molecules cm⁻² h⁻¹, 22 times smaller than the measured average Ω_HCHO of 1.1 × 10¹⁴ molecules cm⁻² h⁻¹ and 26 times smaller than the max Ω_HCHO of 1.3 × 10¹⁴ molecules cm⁻² h⁻¹. It is apparent that work is still needed to reduce the underestimation of the NEI emissions fluxes of HCHO to produce results that match observed [HCHO] in finer detail.

[Figure caption: The thick black line/circles show the observed enhancement ratio. Colored area shows the model-simulated HCHO enhancement due to secondary production (purple) and initial HCHO (blue). Initial HCHO likely includes both primary and secondary HCHO, as the first plume encounter was 0.4 h downwind of city center.]
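The model-to-measurement factors quoted in this comparison follow directly from the central values given in the text, as the short check below shows. The variable names are hypothetical, and the stated uncertainties on the model values are ignored here.

```python
# Central values quoted in the text (molecules cm^-2 h^-1).
measured_avg = 1.1e14
measured_max = 1.3e14
geos_improved = 9.9e14   # improved GEOS-Chem simulation, along-track output
geos_reference = 5.0e12  # reference GEOS-Chem simulation

improved_over_avg = geos_improved / measured_avg
improved_over_max = geos_improved / measured_max
avg_over_reference = measured_avg / geos_reference
max_over_reference = measured_max / geos_reference

print(f"improved / measured average: {improved_over_avg:.1f}")
print(f"improved / measured max:     {improved_over_max:.1f}")
print(f"measured average / reference: {avg_over_reference:.0f}")
print(f"measured max / reference:     {max_over_reference:.0f}")
```

These are ratios of central values only; propagating the ±1.7 × 10¹⁴ and ±2 × 10¹² uncertainties would widen each factor considerably, particularly for the reference simulation.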

Nocturnal Emissions Near Atlanta, GA
Atlanta, GA, lies in the southern U.S., a different region from that sampled by the majority of WINTER research flights in the northeast U.S. Unlike the northeast, the southern U.S. has significant biogenic emissions, such as monoterpenes, even during the winter months (Hagerman et al., 1997). RF10 sampled the residual layer over and downwind of Atlanta, with a series of profiles through the nighttime boundary layer structure obtained during missed approaches to airfields. Formaldehyde sampled in the residual layer showed evidence for mixing between direct emissions from the urban area and secondary production. Due to the presence of monoterpenes during winter in the southern U.S., a larger secondary component to HCHO may be expected, though such chemistry is likely oxidant-limited.
The plumes in the residual layer on RF10 can be separated into two categories. The first category (plumes 1, 3, 4, 5, 9, and 10 in RF10; numbered in Figure S3) contains plumes in which HCHO is highly correlated with CH4 and CO. For HCHO versus CO, r² ranges from 0.88 to 0.94, and for HCHO versus CH4, r² ranges from 0.79 to 0.89. The individual contributions of co-emission of HCHO with either CH4 or CO are difficult to separate because CH4 is itself strongly correlated with CO. While we retain this separation in the analysis, we recognize that it may not accurately provide a separate attribution to each. The HCHO/CO ratio ranges between 0.6% and 1.8%, averaging 1.7%, and the HCHO/CH4 ratio ranges between 0.5% and 1.2%, averaging 1.1%. Common to this group are NOx mixing ratios between 0.5 and 6 ppbv, NOy between 1 and 8 ppbv, and peak HCHO of 0.65-1.1 ppbv. These plumes are also characterized by O3 mixing ratios of 30-40 ppbv and CO of 120-160 ppbv. Plumes 1 and 3 in RF10 contained NOx < 0.5 ppbv and NOy < 1.5 ppbv, with plume 1 intercepted at an altitude of 120 m above ground during a missed approach. In the second category, HCHO is only weakly correlated with these tracers: r² ranges from 0.03 to 0.15 for HCHO versus CO, from 0.01 to 0.26 for HCHO versus CH4, and from 0.19 to 0.44 for HCHO versus NOy. These plumes were intercepted at 430-580 m above ground. They contain CO in the range of 160-200 ppbv, with peak HCHO mixing ratios of 0.7-1.1 ppbv, similar to the first category. The region containing plumes 9-12 shows evidence of significant mixing of plumes from both categories. The region containing plumes 2-4 shows a similar pattern of mixing, but it is less evident than the example in Figure 11.
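The separation of plumes by the strength of the HCHO versus CO correlation can be illustrated with a short sketch. This is a hypothetical example rather than the classification procedure used in the study: the threshold of 0.5 on r² is an assumption chosen only because it lies between the two reported r² ranges, and the plume data are synthetic.

```python
import numpy as np


def r_squared(x, y):
    """Square of the Pearson correlation coefficient between x and y."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r * r)


def plume_category(co_ppbv, hcho_ppbv, threshold=0.5):
    """Return 1 for strongly correlated plumes, 2 otherwise.

    The threshold is a hypothetical value, not one given in the text.
    """
    return 1 if r_squared(co_ppbv, hcho_ppbv) >= threshold else 2


# Synthetic plume in which HCHO tracks CO linearly (category 1 behaviour).
co_a = np.linspace(120.0, 160.0, 20)
hcho_a = 0.40 + 0.017 * (co_a - 120.0)

# Synthetic plume in which HCHO alternates independently of the CO trend
# (category 2 behaviour).
co_b = np.linspace(160.0, 200.0, 20)
hcho_b = 0.90 + 0.10 * (-1.0) ** np.arange(20)

category_a = plume_category(co_a, hcho_a)
category_b = plume_category(co_b, hcho_b)
print(category_a, round(r_squared(co_a, hcho_a), 3))
print(category_b, round(r_squared(co_b, hcho_b), 3))
```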
The plots in Figure 11 combine plumes 9-12 to show the change in the slopes of HCHO/CO, HCHO/NOy, and O3/HCHO as the aircraft sampled a series of plumes that exhibit either predominantly direct emissions or direct emissions influenced by photochemistry from the previous day. The mixing between these two air masses is marked by increased mixing ratios of CO and NOy and a decreased mixing ratio of O3.
Figure 9. Vertical profile of potential temperature (black), relative humidity (blue), and several chemical tracers obtained during a nocturnal missed approach near Chambersburg, PA, on RF07. Suitable data for this analysis were obtained at two locations: airports outside Chambersburg, PA, and Harrisburg, PA.

At high values of CO (180-230 ppbv), the slope of formaldehyde versus CO is 0.2%, similar to the value quoted for direct urban emissions in other studies (Parrish et al., 2012), indicating that at these more concentrated levels the primary contribution to observed HCHO is direct emission. At lower values of CO (125-150 ppbv) the corresponding slope is much steeper, at 1.7%. This higher value is likely dominated by secondary, photochemical production of HCHO in more dilute urban pollution that was emitted the previous day and is present in the residual layer on this nighttime flight. Mixing between these two air masses produces a range of points at intermediate CO (150 < CO < 180 ppbv) that fall below the extrapolated fits to either the high or the low CO range but form a mixing line connecting the two populations. The distribution of HCHO against NOy (an approximately conserved tracer for the emission of NOx) shows similar behavior, with slopes given in the figure. Evidence for this interpretation can be seen in the relationships of O3 versus CO and O3 versus HCHO in panels (c) and (d) of Figure 11. Although O3 and CO are anti-correlated overall, the slope of O3 versus CO is steeper at high CO than at low CO. A similar relationship exists in the plot of O3 versus NOy. The overall anti-correlation indicates that the urban emissions are, on balance, ozone-destroying in this winter environment, likely dominated by NOx titration. However, the points at lower CO have a reduced O3 versus CO slope, indicating that the destruction is moderated by photochemical O3 production in these air masses, consistent with the higher HCHO versus CO slope from secondary HCHO production in this population of points. Similarly, O3 and HCHO are anti-correlated overall, but with a steeper slope at high CO than at low CO.
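The two-regime behavior of the HCHO versus CO slope can be illustrated by fitting separate straight lines within the low and high CO ranges. The sketch below is a hypothetical illustration using synthetic data constructed to have the slopes quoted in the text (1.7% and 0.2%); it is not the fitting procedure used to produce Figure 11, and the function name `slope_percent` is an assumption.

```python
import numpy as np


def slope_percent(co_ppbv, hcho_ppbv, co_min, co_max):
    """Ordinary least-squares slope of HCHO vs. CO within a CO window.

    Both inputs are in ppbv, so the slope is dimensionless; it is returned
    as a percentage to match the HCHO/CO ratios quoted in the text.
    """
    co = np.asarray(co_ppbv, dtype=float)
    hcho = np.asarray(hcho_ppbv, dtype=float)
    mask = (co >= co_min) & (co <= co_max)
    slope, _intercept = np.polyfit(co[mask], hcho[mask], 1)
    return 100.0 * float(slope)


# Synthetic low-CO population: steep slope (aged, photochemically influenced).
co_low = np.linspace(125.0, 150.0, 26)
hcho_low = 0.60 + 0.017 * (co_low - 125.0)

# Synthetic high-CO population: shallow slope (fresh, direct emission).
co_high = np.linspace(180.0, 230.0, 51)
hcho_high = 0.90 + 0.002 * (co_high - 180.0)

co_all = np.concatenate([co_low, co_high])
hcho_all = np.concatenate([hcho_low, hcho_high])

low_slope = slope_percent(co_all, hcho_all, 125.0, 150.0)
high_slope = slope_percent(co_all, hcho_all, 180.0, 230.0)
print(f"low-CO slope: {low_slope:.2f}%  high-CO slope: {high_slope:.2f}%")
```

Points at intermediate CO are deliberately excluded from both windows here, since the text attributes them to mixing between the two populations.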

Point Source Emissions of HCHO
Several instances of elevated HCHO (0.75-6.6 ppbv) were observed in close proximity to power stations on RF04 and RF07. The area covered by RF04 and RF07 (Figures S5 and S6) includes several coal, natural gas, and diesel co-generating power stations. Generating stations that burn both natural gas and diesel (including various bio-diesel) fuel types, separately or simultaneously, generate HCHO through the combustion of diesel oil (Basha et al., 2009; He et al., 2009; Lehto et al., 2014). Other co-generating facilities burn wood, refuse, and coal, and they may also emit HCHO.
In RF04 (Figure S5), the L.V. Sutton power station was identified as a combined cycle power station using natural gas and diesel, producing a direct emission of HCHO. The top of Figure 12 shows that the power station emissions at ∼1:54 a.m. overlap, with the spike of HCHO coming from an exhaust stack dedicated to diesel and the other profile coming from natural gas. One source measured at 1:52 UTC showed significant quantities of CO and HCHO, while a larger peak in HCHO measured at 1:53 UTC was co-emitted with significant quantities of CO and CO2. HCHO and CO were very well correlated (r² = 0.91), with an HCHO/CO ratio of 1.8% from the power station (bottom panel of Figure 12). This ratio is higher by a factor of ∼2.3 than the NEI (2014) aggregate emissions ratio of 0.80% HCHO/CO and the NEI (2017) value of 0.76%. A maximum of 6.126 ppbv HCHO was measured in the L.V. Sutton plume, with the accompanying CO2 coming from another combustion source at the same power station, since the plant uses both natural gas and diesel fuels.

Another co-generating power station in RF07 (Figure S6), Va. Renewable Portsmouth LLC, had a similar HCHO/CO ratio of 1.7% (r² = 0.71), but was more obscured by other emissions from either the same power station or the Elizabeth River power station. Because there is no direct measurement of CO in the CEMS database, CO2 must be relied upon to identify the power plant. Given the wind direction, several power plants could have produced plume 1 (see Figure S6), which reached a maximum of 0.75 ppbv HCHO. Because the Va. Renewable Portsmouth station reported by far the largest CO2 emissions in the CEMS database, consistent with the measured CO2, it is the likely source of plume 1 and the only operating station to which the measured 0.75 ppbv HCHO can be attributed. The NEI (2014) and NEI (2017) aggregate HCHO/CO emissions ratios for Va. Renewable Portsmouth LLC were 0.0035 and 0.11%. The measured ratio is 15 times greater than the NEI (2017) value for HCHO/CO. Sunrise occurred during RF07 at 11:43 a.m. UTC (centered on Norfolk, VA), 7 min after the measurement, so it is unlikely that photochemistry reduced the HCHO mixing ratio during the plume's time aloft. The mixing ratios observed from both power generating stations are well in excess of typical urban plumes from the WINTER campaign, with levels of CO approaching 400 ppbv.
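The comparison between measured and inventory emission ratios reduces to simple division, shown here as a worked check. The helper `underestimation_factor` is a hypothetical name; the input percentages are the HCHO/CO ratios quoted in the text, and the results reproduce the quoted factors only approximately because the text rounds them.

```python
def underestimation_factor(measured_pct, inventory_pct):
    """Ratio of a measured HCHO/CO emission ratio to the inventory value.

    Both arguments are percentages; the result is dimensionless.
    """
    if inventory_pct <= 0:
        raise ValueError("inventory ratio must be positive")
    return measured_pct / inventory_pct


# L.V. Sutton: measured 1.8% vs. NEI 2014 (0.80%) and NEI 2017 (0.76%).
sutton_vs_2014 = underestimation_factor(1.8, 0.80)
sutton_vs_2017 = underestimation_factor(1.8, 0.76)

# Va. Renewable Portsmouth LLC: measured 1.7% vs. NEI 2017 (0.11%).
portsmouth_vs_2017 = underestimation_factor(1.7, 0.11)

print(f"Sutton vs NEI 2014: {sutton_vs_2014:.2f}")
print(f"Sutton vs NEI 2017: {sutton_vs_2017:.2f}")
print(f"Portsmouth vs NEI 2017: {portsmouth_vs_2017:.1f}")
```

The Sutton factors fall between roughly 2.2 and 2.4, consistent with the ∼2.3 quoted, and the Portsmouth factor is about 15.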
As mentioned earlier, HCHO scaling was needed to bring the NEI input to the GEOS-Chem model close to the in situ observations from the aircraft. Here we find a similar degree of underestimation when the NEI is compared directly with observations under conditions without secondary production (i.e., nighttime emissions). The emissions ratios determined from the measured data are 2.3-15 times greater than the annual estimates from the NEI 2017 emissions database. In addition, the CEMS and NEI databases do not provide daily or monthly estimates of HCHO emissions during facility operation for comparison, so they cannot capture variation on short time scales. Figure S7 displays the NEI 2017 yearly estimated emissions level per state in the US. Table S1 shows the HCHO/CO ratio for all the identified plumes. Models that use the NEI database may underestimate HCHO emissions from power stations, or they may be limited by the information reported in the database. For example, the NEI has no category for diesel-fueled combined power stations, whereas other types such as wood, coal, and natural gas are covered. This may explain part of the HCHO not accounted for in models, as diesel combustion, hydrocarbon products of diesel combustion, and other possible AVOC emissions are not included.

Figure 11. Plots of the combined plumes 9-12 for RF10. The color scale (bottom right) runs from orange to blue with increasing data point number as the aircraft passed through the group of air masses encountered during RF10 between 4:40 and 5:10 a.m. EDT (9:40 and 10:10 UTC). (a) The combined plot of HCHO versus CO describes the distinct air masses in terms of the HCHO/CO ratio, showing a steep slope at low CO and a shallow slope at high CO, as described in the text. (b) The combined plot of HCHO versus NOy. (c) O3 versus CO, which are anti-correlated, with a small range in slope with increasing CO. (d) The combined plot of O3 versus HCHO, which are anti-correlated, with a steeper slope at high CO, as described in the text. (e) The combined plot of O3 versus NOy.

Conclusions
We report wintertime aircraft observations of HCHO near urban areas in the eastern U.S. from the WINTER campaign, carried out on the NSF C-130 between February and March 2015. This is a unique data set for assessing the spatial distribution of HCHO in the winter season. Because chemical production of secondary HCHO is much slower during the winter, these measurements are more sensitive to the primary emission sources of HCHO and improve the accuracy of primary HCHO emission ratios to other pollutants, such as CO, from urban areas and point sources. They can be further used to assess the rates of wintertime photochemistry.
Urban regions in Ohio, New York, Atlanta, GA, Virginia, and North Carolina exhibit both area and point source emissions of HCHO. Daytime ratios of urban HCHO/CO range between 0.22% and 0.50% for the Ohio area, with plumes aged over 6 h, and between 0.57% and 0.60% for plumes from New York and Connecticut urban areas aged less than 2 h. The multivariate linear model used to compare the regions in Ohio with New York City suggests that the timing of vehicular activity may be correlated with the direct emission of HCHO. Other anthropogenic emissions within the region also contribute to secondary production of HCHO. The linear model applied to the urban plume closest to New York had a higher r², from 85% to 89%, indicating that the tracers considered account for this fraction of the variability in HCHO.
Model calculations of AVOC oxidation rates provide estimates of the relative contribution of secondary production and primary emission of HCHO. While the correlation of HCHO to other primary emissions in Ohio, shown in Figure 5, can account for up to 67% of the variance in HCHO, a box model shown in Figure 6 shows that 30% of the observed HCHO could be attributable to secondary production after 6 h of transport, but smaller fractions near the source regions. It is likely that both primary emissions and secondary chemistry contribute to the HCHO budget in urban-influenced regions, despite the relatively slow rate of winter photochemistry. Nevertheless, the analysis indicates primary emissions to be the major contribution to HCHO in the northeast U.S. during winter.
Night flights sampling the urban areas and surrounding regions in Virginia and Georgia characterized HCHO emissions and mixing with photochemically produced HCHO. In the region around Atlanta, GA, plumes sampled at night in the residual layer showed evidence for mixing between direct urban emissions and photochemical production of HCHO from the preceding day. The urban emissions over nighttime Atlanta are overall ozone-destroying, likely dominated by NOx titration, with HCHO and O3 anti-correlated with each other. Other large localized sources of HCHO, such as power stations and other area and point sources, were intercepted during multiple missed approaches in VA. Area sources sampled in VA during the night emitted an average of (1.1 ± 0.3) × 10¹⁴ molecules cm⁻² h⁻¹ or a maximum of (1.3 ± 0.2) × 10¹⁴ molecules cm⁻² h⁻¹ of HCHO, based on measurements taken in the nocturnal and residual layers and the time since sunset in the region. Two point sources sampled during RF02 in Ohio produced plumes with large HCHO mixing ratios, with average and maximum values of 1.04-1.18 and 2.25-3.39 ppbv, respectively. Two direct emissions of HCHO were identified from natural gas/diesel co-generating power stations on RF04 and RF07: L.V. Sutton, with a correlation of r² = 0.98 for HCHO and CO, an HCHO/CO ratio of 1.8%, and a maximum observed concentration of 6.126 ppbv HCHO; and Va. Renewable Portsmouth LLC, with an HCHO/CO ratio of 1.7% and a maximum observed concentration of 0.75 ppbv HCHO. For these emitters, little or no reported information is available for comparison with the measurements from the intercepted plumes at their estimated points of origin. The NEI aggregate data are resolved only on an annual basis, and a direct comparison with the NEI emission ratio of HCHO/CO for 2017 found that the emission ratio for the power station was underestimated by a factor of 2.3.
Diesel-fueled power generation is also not a category in the NEI, resulting in the exclusion of AVOC products generated by diesel combustion. Other factors leading to potentially inaccurate HCHO emission factors include the omission of operating load and operational time. With no daily or monthly values reported in either the CEMS or the NEI, models that rely on NEI data may inaccurately represent HCHO during the winter season. As a result, HCHO emissions may have been underestimated in previous work that used reported data, or that used in situ measurements of other species and emission ratios to extrapolate HCHO, as shown in Luecken et al. (2012) and Zhu et al. (2017).

Data Availability Statement
EPA Air Markets Program Data available at https://www.epa.gov/air-emissions-inventories/2017-national-emissions-inventory-nei-data. Data obtained from the WINTER C-130 research flights used in this study are available at http://data.eol.ucar.edu/project/WINTER.