In this study, a source-oriented version of the Community Multiscale Air Quality (CMAQ) model was developed and used to quantify the contributions of five major local emission source types in Southeast Texas (vehicles, industry, natural gas combustion, wildfires, biogenic sources), as well as upwind sources, to regional primary and secondary formaldehyde (HCHO) concentrations. Predicted HCHO concentrations agree well with observations at two urban sites (the Moody Tower [MT] site at the University of Houston and the Haden Road #3 [HRM-3] site operated by the Texas Commission on Environmental Quality). However, the model underestimates concentrations at an industrial site (Lynchburg Ferry). Throughout most of Southeast Texas, primary HCHO accounts for approximately 20–30% of total HCHO, while the remaining portion is due to secondary HCHO (30–50%) and upwind sources (20–50%). Biogenic sources, natural gas combustion, and vehicles are important sources of primary HCHO in the urban Houston area, accounting for 10–20%, 10–30%, and 20–60% of total primary HCHO, respectively. Biogenic sources, industry, and vehicles are the top three sources of secondary HCHO, accounting for 30–50%, 10–30%, and 5–15% of overall secondary HCHO, respectively. It was also found that over 70% of peroxyacetyl nitrate (PAN) in the Houston area is due to upwind sources, and less than 30% is formed locally. The model-predicted source contributions to HCHO at MT generally agree with source apportionment results obtained from the Positive Matrix Factorization (PMF) technique.
 Formaldehyde (HCHO), in addition to being a hazardous air pollutant that can cause cancer and leukemia under high ambient concentrations [Morello-Frosch et al., 2000; Woodruff et al., 1998], plays a significant role in tropospheric chemistry as an important HOx radical source during the day, which contributes to the formation of secondary air pollutants, including ozone (O3) and secondary organic aerosol [Atkinson, 2000; Cooke et al., 2010]. Both primary emissions and secondary formation from oxidation of other reactive volatile organic compounds (VOCs) contribute to ambient HCHO concentrations. Quantifying the primary and secondary HCHO contributions of various emission sources is essential to better understand the atmospheric chemistry of HCHO and to effectively design air pollution control strategies [Lei et al., 2009; Luecken et al., 2012].
 The Houston metropolitan area in Southeast Texas experiences elevated O3 concentrations due to significant anthropogenic emissions of NOx and VOCs from industrial and mobile sources, as well as VOCs emitted from biogenic sources [Ying and Krishnan, 2010; Zhang and Ying, 2011a]. Past studies suggest that high O3 events in the Houston metropolitan area are related to emissions of highly reactive VOCs from petrochemical sources in the Houston Ship Channel (HSC) [Murphy and Allen, 2005; Nam et al., 2006; Ying and Krishnan, 2010]. High concentrations of HCHO have been observed in O3 plumes in the Houston metropolitan area [Daum et al., 2004], and substantial HCHO fluxes have been associated with primary emissions at industrial locations [Mount et al., 2010]. Reactivity analysis shows that in both urban and industrial regions of the Houston area, ethene, propene, and HCHO are the largest contributors to O3 formation [Czader et al., 2008]. Olaguer et al. noted that unaccounted primary HCHO emissions from the industrial sector are important contributors to HCHO concentrations and can increase predicted O3 concentrations by as much as 30 ppb.
 Numerous ambient measurements and modeling studies have been conducted in order to fully understand the observed concentrations of HCHO, the relative importance of primary and secondary contributions, and the responsible emission sources in the Houston metropolitan area. However, the results from these studies are far from consistent.
 Linear regression analyses conducted by Friedfeld et al. and Rappenglück et al. showed that approximately 45–50% of HCHO at an industrially influenced urban site (Deer Park, south of HSC) and an urban site (Moody Tower [MT] at the University of Houston) can be attributed to primary emissions of HCHO. More recently, Buzcu Guven and Olaguer applied Positive Matrix Factorization (PMF) to the same dataset used by Rappenglück et al. and found that HCHO associated with motor vehicle, industry, and biogenic factors accounts for 23%, 17%, and 24% of the total HCHO, respectively.
 Contrary to the aforementioned studies, which concluded that primary HCHO is non-negligible, several other studies reported the opposite. For example, based on HCHO measured during the Texas Air Quality Study (TexAQS) 2000 and reactive plume modeling results, Wert et al. found that no substantial HCHO was directly emitted from petrochemical facilities. Based on data collected during 2000, 2006, and 2009 and a simplified treatment of emissions, reaction, and transport, Parrish et al. concluded that 92 ± 4% of total HCHO in Houston is from secondary production due to highly reactive VOCs emitted by petrochemical facilities.
 The differences in the estimated primary and secondary contributions to HCHO can be at least partially attributed to the fact that direct measurement of HCHO alone cannot determine whether it is primary or secondary. Other relationships, such as the ratios of HCHO to CO, SO2, or PAN, along with meteorological analysis, must be used to support the source apportionment [Buzcu Guven and Olaguer, 2011; Rappenglück et al., 2010]. Because the ratios of HCHO emissions to those of other species, as well as meteorological conditions, can vary significantly, receptor-oriented techniques can reach different conclusions for different locations and times.
 As a complement to the receptor-oriented models, regional 3D chemical transport models (CTMs) can be used to determine source contributions to HCHO concentrations on urban and regional scales based on the reported emissions in the emission inventory. In a recent study, Luecken et al.  applied the Community Multiscale Air Quality (CMAQ) model along with integrated reaction rate (IRR) and integrated process rate (IPR) analyses to determine the primary and secondary HCHO contributions and the precursor VOC species that contribute to secondary HCHO in the eastern US. Their results suggest that, in the summer, HCHO is mainly due to photochemical production rather than direct emissions. Isoprene (ISOP) is the major precursor of HCHO, but other anthropogenic alkenes can also be significant contributors. The relative contributions of the anthropogenic and biogenic precursors vary spatially and temporally. However, the spatial resolution of that study was relatively coarse (12 km), and IPR and IRR alone cannot directly determine contributions of different anthropogenic and biogenic sources to HCHO. Such information is needed in Southeast Texas as vehicles, petrochemical industry, and biogenic sources have all been reported to contribute significantly to HCHO concentrations as discussed above.
 To provide additional insight into HCHO source apportionment, a source-oriented air quality model was developed and applied to directly determine the contributions of sources of primary and secondary HCHO in Southeast Texas during the 2006 TexAQS II campaign. Contributions from upwind sources outside the air quality model domain were also determined. Finally, the predicted source contributions by the source-oriented model were compared with the results of a receptor-oriented model to evaluate the outcomes of both models.
2 Model Description
 The SAPRC99 (S99) photochemical mechanism was modified to include additional source-tagged species and reactions and incorporated into the CMAQ model version 4.7 developed by the United States Environmental Protection Agency (U.S. EPA) to determine source contributions to primary and secondary HCHO. This source-oriented technique has been previously applied in the CMAQ model to study regional source contributions to O3 [Ying and Krishnan, 2010; Zhang and Ying, 2011a], secondary inorganic aerosol [Zhang et al., 2012] and secondary organic aerosol [Zhang and Ying, 2011b; 2012]. The source apportionment technique for primary and secondary HCHO is described in greater detail below.
 To attribute the contribution of each source to primary HCHO (including upwind primary HCHO entering the model domain through boundary conditions), the original S99 mechanism was modified so that HCHO emitted from different sources is represented by different source-tagged HCHO model species. The reaction rate constants and products of these additional tagged HCHO species are the same as those of the nontagged HCHO species. For example, based on the original reaction of HCHO with OH in the S99 mechanism, additional reactions for tagged HCHO species are included in the revised mechanism, as shown in reaction set (R1):
where each HCHO^X represents the primary HCHO species emitted from source X, and N is the number of primary HCHO sources tracked in the model. Photolysis reactions are similarly included for the tagged model species. The original nontagged HCHO in the mechanism is used to represent total secondary HCHO from the oxidation of non-HCHO VOCs emitted within the model domain and from upwind sources. The initial condition of HCHO is also attributed to the nontagged HCHO species, but the contribution of initial HCHO becomes negligible after one spin-up day. A similar tagged mechanism with only one primary HCHO type (i.e., N = 1) has been used in a previous study to determine the relative contributions of primary and secondary aldehydes to hydroxyl radical formation in the atmosphere [Li et al., 2012].
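Reaction set (R1) itself is not reproduced in this text. As a hedged sketch, assuming only what is stated above (each tagged species, written here with a superscript X, duplicates the nontagged OH reaction with the same rate constant and the same products), it would take the following form. The symbol k_OH and the generic "products" are illustrative placeholders, not entries copied from the S99 mechanism listing.

```latex
\begin{align*}
\mathrm{HCHO} + \mathrm{OH} &\xrightarrow{\;k_{\mathrm{OH}}\;} \text{products} \\
\mathrm{HCHO}^{X} + \mathrm{OH} &\xrightarrow{\;k_{\mathrm{OH}}\;} \text{products},
  \qquad X = 1, 2, \ldots, N
\end{align*}
```

Because every tagged species reacts at the same rate and yields the same products, the sum of the nontagged and tagged HCHO species should behave as total HCHO does in the original mechanism.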
 Determining the contributions to secondary HCHO from individual sources requires another set of simulations with a second source-oriented mechanism that tracks the total (primary and secondary) source contributions to HCHO. This mechanism has been implemented in a previous study to determine contributions of VOCs from different sources to O3 formation in Southeast Texas [Ying and Krishnan, 2010]. For this study, the modified mechanism is capable of determining contributions to total (primary and secondary combined) HCHO from one explicit source by tagging the primary emitted VOCs and their oxidation products. Using the ISOP reaction with OH as an example, the modified mechanism includes the following two reactions (R2):
where the superscript E marks ISOP emitted from an explicit source (for example, biogenics) and the nontagged ISOP species represents ISOP emitted from all other sources. Reactions of MVK and MVK^E are similarly expanded, and the products (HCHO being one of them) carry the same source tag as their parent MVK species, so that the amount of HCHO from the explicit VOC source can be determined. Primary HCHO emitted from the explicit source is also represented by the tagged HCHO^E species, so that HCHO^E represents both primary and secondary HCHO from the explicit source. More details of the expanded VOC reactions can be found in a previous study [Ying and Krishnan, 2010].
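Reactions (R2) are likewise not reproduced in this text. A schematic form, assuming only the structure described above (MVK among the products of the ISOP reaction with OH, with the tag inherited from the parent species), is sketched below. The coefficient b is a placeholder rather than the S99 stoichiometric value, and the ellipses stand for the remaining products, whose tagging is not specified here.

```latex
\begin{align*}
\mathrm{ISOP} + \mathrm{OH} &\rightarrow b\,\mathrm{MVK} + \cdots \\
\mathrm{ISOP}^{E} + \mathrm{OH} &\rightarrow b\,\mathrm{MVK}^{E} + \cdots
\end{align*}
```

Subsequent reactions of the tagged MVK would then yield tagged HCHO in the same way, which is how the tag propagates from the emitted VOC to the HCHO it forms.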
 If there are N types of primary VOC sources, N + 1 sets of simulations will be conducted so that contributions of the explicit and upwind sources to total HCHO can be determined. By subtracting the primary contribution results from the total contribution results, the contribution of each source to secondary HCHO concentrations can be determined.
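The subtraction step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' post-processing code: the function name, the dictionary layout, and the numerical values are all hypothetical.

```python
def secondary_by_source(total_ppb, primary_ppb):
    """Secondary HCHO per source = total (primary + secondary) minus primary.

    Both arguments map a source name to an episode-average concentration
    in ppb, taken from the two separate sets of tagged simulations.
    """
    if set(total_ppb) != set(primary_ppb):
        raise ValueError("both runs must cover the same source categories")
    return {src: total_ppb[src] - primary_ppb[src] for src in total_ppb}


# Hypothetical values for a single grid cell (not results from this study).
total_run = {"vehicles": 0.30, "biogenic": 1.60}
primary_run = {"vehicles": 0.18, "biogenic": 0.45}

secondary = secondary_by_source(total_run, primary_run)
for source, value in secondary.items():
    print(f"{source}: {value:.2f} ppb secondary HCHO")
```

The sketch does not clamp small negative differences that could arise from numerical noise between separate simulations; whether to do so is a design choice left open here.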
 Computation time required for the two-type source-oriented model almost doubles that of the original non-source-oriented CMAQ model with identical model configurations. Although it is possible to track more than one explicit source simultaneously in a single simulation by including more reactions and tagged species, the modified mechanism with more than two explicit sources has many more species and reactions than the original mechanism, and thus requires even more computational resources to complete one simulation. In this study, only one explicit source is tracked in a single simulation.
3 Model Application
 In this work, CMAQ 4.7 with a source-oriented S99 mechanism was applied to study primary and secondary HCHO in Southeast Texas for a two-week-long episode, from August 28 to September 12, 2006. The episode is part of the 2006 TexAQS II study with an extensive collection of field measurements, emission, and meteorology data. This data-rich episode has also been used by the Texas Commission on Environmental Quality (TCEQ) for ozone attainment demonstration. The episode is appropriate for studying HCHO, whose contribution to ozone formation is high in August due to intensive solar radiation. Understanding HCHO source contributions during this episode will assist in designing more efficient emission control strategies. The first day of the model episode was used as a spin-up day, and the results from that day were not used in the analysis. The simulations were conducted using a three-level nested domain, with the innermost 4-km resolution Southeast Texas domain centered on the Houston metropolitan area. The nested domain setup is based on that used by TCEQ and has been documented in detail in a previous study [Zhang and Ying, 2011b]. The spatial coverage of the 4-km domain and the stations where HCHO was measured are shown in Figure 2(a). A more detailed map that shows the monitoring stations and major known emission sources can be found in Figure 4 of Buzcu Guven and Olaguer  and is not duplicated here.
 The MM5 meteorological inputs were provided by TCEQ. Emissions were generated from the 2005 National Emission Inventory (NEI) (version 4 of the 2005-based modeling platform, downloaded from the website of the U.S. EPA Emission Modeling Clearing House) using a modified Sparse Matrix Operator Kernel Emission (SMOKE) model that computes emissions for different source categories based on user-supplied Source Classification Code lists [Ying and Krishnan, 2010]. A special hourly emission inventory prepared by TCEQ (version 9) for the TexAQS 2006 study replaced the NEI for point sources in the Houston-Galveston-Brazoria and Beaumont-Port Arthur (BPA) areas. In this study, emissions of HCHO from anthropogenic sources were based on VOC speciation profiles from the SPECIATE 4.2 database (for NEI emissions) and from the TCEQ special inventory. Vegetated surfaces can be a significant source of primary HCHO, as discussed by Kesselmeier and Staudt . Biogenic emissions of HCHO and other VOC species were generated using the Biogenic Emissions Inventory System, Version 3 (BEIS3) [Vukovich and Pierce, 2002]. More details on the emission processing can be found in a previous paper [Zhang and Ying, 2012].
 The emissions were divided into five explicit categories based on literature-reported major sources of primary and secondary HCHO: vehicles, industry (including petrochemical and other industrial sources), natural gas combustion, wildfires, and biogenic sources. In addition, these categories were specifically chosen so that a comparison with the PMF study of Buzcu Guven and Olaguer is possible. The natural gas combustion source includes industrial processes that use natural gas as fuel but does not include natural gas production processes. Other HCHO emission sources that do not belong to the above source categories are lumped into the “other” source category in the simulations. Thus, a total of six local source categories are explicitly modeled. It should be noted that upwind primary HCHO in this study is the HCHO entering the Southeast Texas model domain as HCHO boundary conditions. While some of this “primary” HCHO is indeed directly emitted, some is formed from other VOCs before reaching the Southeast Texas domain boundary. Similarly, upwind secondary HCHO is HCHO formed from non-HCHO VOCs that enter the domain through boundary conditions.
 It is well known that some episodic emission events from industrial sources are not reported to TCEQ and thus are likely not included in the special emission inventory provided by TCEQ. Moreover, the TCEQ emission inventory does not adjust for any reporting inaccuracies in the regular emission inventory, including the neglect of primary HCHO emissions from routine flares, catalytic crackers, or other major industrial combustion sources, as pointed out by Olaguer et al. . Thus, this study focuses more on the episode average source contributions rather than contributions during specific high HCHO events.
4 Results and Discussion
4.1 Modeling Results of HCHO
 Model predictions of O3, NO2, and organic carbon have been compared favorably with observations in a previous study [Zhang and Ying, 2012]. The same emissions and meteorology inputs from that previous study are used to drive the model simulations in this study. The resulting predicted hourly HCHO concentrations are compared with available measurements at the MT, HRM-3 and Lynchburg Ferry (Lynchburg) sites (see Figure 2 for the location of the sites). MT is an urban site within the University of Houston campus approximately 5 km southeast of downtown Houston. HCHO concentrations were measured at the rooftop of the MT (59 m above surface), so the predicted concentrations at the Tower used in the following analyses were extracted from the second model layer, with a midlayer elevation of 58 m above ground level. HRM-3 and Lynchburg are two surface sites near HSC operated by a private consortium (the Houston Regional Monitoring Network) and/or TCEQ. Lynchburg is approximately 12 km east of HRM-3 and is closer to HSC.
 Figure 1(a) shows that at MT the predicted HCHO concentrations agree with observations except for some underpredictions of peak HCHO concentrations in the first few days. The predicted HCHO concentrations are approximately 4 ppb during most days, while there are 3 days (September 9–11) with lower concentrations. Analysis of the model results shows that the concentration of HCHO in the surface layer is very similar to that in the second model layer due to strong turbulent mixing during the day and mechanical mixing at night. At HRM-3 (Figure 1b), observations are only available for the last 5 days. The high observations on September 10 and 11 (up to 10 ppb) are missed by the model, while for other days, the model captures the diurnal variation and peak concentrations. At the Lynchburg site (Figure 1c), the model underpredicts the HCHO concentrations significantly for all the days. Based on a PMF analysis, Buzcu Guven and Olaguer found that as much as 87% of the HCHO at the Lynchburg site could be apportioned to a single factor with HCHO as the dominant species, which likely represents industrial emissions of primary HCHO. Using back-trajectory analysis, they also found that high HCHO concentrations at the Lynchburg site were related to industrial facilities close to the station. Some peak HCHO concentrations may have been linked to transient emission events of HCHO that were not reported to TCEQ. More recently, Olaguer used an adjoint neighborhood-scale 3D chemical transport model to show that the highest concentrations of HCHO observed at Lynchburg Ferry are best explained by primary HCHO emissions in the area rather than by long-range transport of secondary HCHO or exclusively by secondary HCHO formed from local olefin emissions. Thus, the underprediction by the CMAQ model suggests that some primary emissions of HCHO are likely missing from, or underestimated in, the current inventory near this location.
Model performance statistics (NMB, normalized mean bias; NME, normalized mean error) for HCHO can be found in Table 1.
Table 1. Model Performance Statistics for HCHO and PAN
* NMB = normalized mean bias, NME = normalized mean error.
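For readers who want to reproduce statistics of this kind, the sketch below computes NMB and NME from paired hourly values. It assumes the commonly used definitions, NMB = Σ(P − O)/ΣO and NME = Σ|P − O|/ΣO, which are not spelled out in this text; the function name and the sample values are hypothetical and are not data from Table 1.

```python
def nmb_nme(predicted, observed):
    """Return (NMB, NME) as fractions for paired predictions and observations.

    Hours with a missing observation (None) are skipped, so only valid
    prediction/observation pairs enter the sums.
    """
    pairs = [(p, o) for p, o in zip(predicted, observed) if o is not None]
    obs_sum = sum(o for _, o in pairs)
    if not pairs or obs_sum == 0:
        raise ValueError("need at least one pair with a nonzero observation sum")
    nmb = sum(p - o for p, o in pairs) / obs_sum
    nme = sum(abs(p - o) for p, o in pairs) / obs_sum
    return nmb, nme


# Hypothetical hourly HCHO values in ppb; the last observation is missing.
model_ppb = [4.0, 3.0, 5.0, 2.0]
obs_ppb = [5.0, 3.0, 4.0, None]

nmb, nme = nmb_nme(model_ppb, obs_ppb)
print(f"NMB = {nmb:.3f}, NME = {nme:.3f}")
```

In this example the over- and underpredictions cancel in NMB but not in NME, which is why the two statistics are usually reported together.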
 In addition, predicted HCHO concentrations are compared with measurements taken on board the NOAA research vessel Ronald H. Brown (RHB) and the P-3 aircraft. Details of the measurements and the trajectories of the ship and aircraft can be found at http://esrl.noaa.gov/csd/projects/2006/. Figure 1(d) and (e) show that the model is capable of reproducing most of the measured HCHO concentrations in Galveston Bay and in the air aloft, as indicated by the model performance statistics in Table 1.
4.2 Source Apportionment of HCHO
 Figure 2 shows the regional distribution of total HCHO and the contributions to HCHO due to primary and secondary local sources, and from upwind sources, averaged over the entire model episode. The highest HCHO concentrations are approximately 6 ppb in industrial areas. Concentrations range from 4 to 6 ppb in areas influenced by wildfire emissions. In the urban Houston area, the concentrations are 3–4 ppb, and the concentrations over the ocean are approximately 2 ppb. Primary HCHO concentrations are on the order of 0.5 ppb on land and high concentrations up to 3.5 ppb are limited to industrial and wildfire emission areas. Secondary HCHO concentrations are approximately 2 ppb throughout Southeast Texas on land. Upwind HCHO can be as high as 3.5 ppb near the northern boundary of the 4-km domain. In the urban Houston area, the concentrations are approximately 1.5 to 2 ppb. Lowest HCHO due to upwind sources occurs in the southwest part of the domain. Average wind speed in this area is significantly lower than other parts of the domain (see wind vectors in Figure 2d).
 Figure 3 shows the fractional contributions to total HCHO due to local and upwind sources. Source contributions shown in Figure 3 include both primary HCHO and secondary HCHO formed from degradation of VOCs. In the urban areas of Houston, vehicle contributions are approximately 10–15%. The industrial sources have high contributions in the HSC area, as well as in Texas City, BPA, and near a petroleum refinery in west Brazoria County. Maximum contributions due to industry can be as high as 46%. It should be noted that actual contributions of industry are likely higher near Lynchburg because of undercounting of HCHO in the current emission inventory. Natural gas combustion, which is mainly for natural gas-fired power plants, can contribute as much as 31% to HCHO concentrations. In this episode, most of the wildfires are located along the Texas-Louisiana border. Maximum contributions due to wildfires can be as high as 43%. In the immediate downwind areas, their contributions are approximately 10–20%. Emissions from wildfires can be transported over relatively long distance, and their influence can be clearly seen in Figure 3(d). The contribution of biogenic emissions to HCHO is approximately 35–50% over most land areas. The contribution can be as high as 54% in areas northeast of the urban Houston area. A significant fraction of the HCHO in Southeast Texas is due to upwind sources. As much as 94% of the HCHO near the boundary and over the ocean is due to upwind sources. Over most of the land area, upwind sources contribute approximately 40–50% to the total HCHO. The results shown in Figure 3 clearly illustrate that while local sources can be major contributors to HCHO in areas near emission sources, biogenic and upwind sources dominate the HCHO budget over most of Southeast Texas.
 Contributions of primary and secondary HCHO to the total HCHO for each source category are illustrated in Figure 4. Contributions to total HCHO due to vehicles are approximately 0.2–0.4 ppb. In the urban area, 60% is due to primary emissions, while approximately 40% is due to secondary formation. Contributions of industrial sources to HCHO are generally between 0.2 ppb and 0.4 ppb but can be as high as 2.5 ppb near source regions. Primary emissions account for the majority of high HCHO concentrations, but secondary HCHO is more important in the downwind regions. The maximum contribution due to natural gas combustion is 1.36 ppb, and almost all HCHO from natural gas combustion is primary. Secondary HCHO from that source is almost negligible. Wildfires can be a significant source of HCHO, with contributions as high as 2.46 ppb. Primary emissions account for approximately 80% of the HCHO near source regions. The majority of wildfire HCHO away from sources is secondary. Biogenic emissions have a broad spatial coverage, with 0.4–0.5 ppb of primary HCHO and 1–1.5 ppb of secondary HCHO throughout the land portion of the domain. Secondary HCHO is approximately 0.25–0.5 ppb over the ocean. Upwind HCHO dominates the HCHO budget in many areas, with a maximum concentration of 3.09 ppb along the northern border. While the majority of upwind HCHO along the borders is primary HCHO, secondary HCHO is more important in areas away from the border, with a near-uniform concentration of 0.75 ppb. Most of the non-HCHO VOCs entering the domain through the boundaries are aged biogenic VOCs, as major anthropogenic sources are not near the boundaries.
 Figure 5 shows the time series of relative contributions of each local source, as well as upwind sources, to primary, secondary and overall HCHO at MT and HRM-3. At both sites, primary and secondary HCHO concentrations show clear diurnal variations. Primary HCHO peaks at night, while secondary HCHO peaks during the day. At MT, vehicle and biogenic sources are the major local sources of primary HCHO. At HRM-3, industry, natural gas combustion, and biogenic sources are major local sources. Contributions of industry and natural gas combustion increase at both sites during the second half of the model episode due to a change in the dominant wind direction. Contributions from upwind sources to primary HCHO at MT and HRM-3 are similar and can be as high as 80% on some days during the first half of the episode.
 As shown in Figure 5(c) and (d), concentrations of secondary HCHO are generally higher than those of primary HCHO. Biogenic VOCs are the dominant local source of secondary HCHO at all times at both locations, accounting for 50–60% of secondary HCHO. Contributions of vehicles to secondary HCHO peak during morning rush hour and can occasionally reach 20%. Contributions of upwind sources to secondary HCHO exceed 40% most of the time and can be as high as 60–80% on some days.
 Since primary and secondary HCHO peak at different hours, overall HCHO does not show strong diurnal variation. As shown in Figure 5(e) and (f), biogenic and upwind sources are the dominant sources of total HCHO. A decrease in the contributions from upwind sources is usually accompanied by an increase in the contributions from local biogenic sources. Contributions due to vehicles are higher at MT than at HRM-3, while contributions of industry and natural gas combustion are usually higher at HRM-3. The CMAQ model significantly underpredicts total HCHO at the Lynchburg site, hence detailed source apportionment results for this site are not shown in Figure 5. The source apportionment results at this site are also excluded from Section 4.3 below.
4.3 Comparison With PMF
 The model-predicted HCHO source apportionment results at MT and HRM-3 are compared with the results from a PMF source apportionment study. While the source apportionment results described in previous sections depend on the accuracy of the emission inventories used to drive the CTM, the PMF analysis depends only on the observed ambient concentrations to determine source contributions at receptor locations. Thus, a comparison of the source apportionment results from these two independent techniques provides an additional evaluation of the source apportionment results described in the previous sections.
 More details about the PMF study can be found in Buzcu Guven and Olaguer. In summary, hourly average HCHO concentrations along with hourly speciated Photochemical Assessment Monitoring Stations VOC concentrations at MT and HRM-3 during August/September TexAQS 2006 were used as inputs to the EPA PMF model. At MT, colocated CO, SO2, PAN, and HONO measurements were also used to enhance the ability of the PMF model to resolve HCHO sources. Overall, six HCHO factors were identified at MT (PAN-dominated, petrochemical, SO2-related, ISOP-related, vehicle exhaust, and natural gas), which together account for 98% of the measured HCHO. At HRM-3, six HCHO factors were also identified, but only four of them (petrochemical, ISOP-related, vehicle exhaust, and natural gas) overlap with the MT sources. Only 61% of the measured HCHO is explained by the resolved PMF factors. The four overlapping sources explain 56% of the total HCHO at HRM-3, and the remaining 5% is due to heavy alkanes and fuel evaporation. The analysis in the PMF study included more days than the modeling episode. Average PMF source apportionment results for the same modeling episode were recalculated at MT and resulted in only small differences (less than 5%). Observational data were not available at the HRM-3 site between August 29 and September 8. Averaged source contributions based on data between September 8 and 12 show only small differences from the results using all observations.
 At MT, the PAN-dominated factor accounts for 36.3% of the total HCHO. In order to compare the PMF results directly with the CMAQ source apportionment results from this study, sources of PAN need to be studied to determine the sources of HCHO associated with PAN in the PMF analysis. Although it has been demonstrated during several high HCHO events that PAN is associated with SO2 and thus is likely to have originated from industrial or petrochemical sources [Rappenglück et al., 2010], the PAN-dominated HCHO resolved by PMF in the study of Buzcu Guven and Olaguer  needs further scrutiny because the PMF analysis identified a separate SO2 related factor, suggesting that PAN is not always related to SO2 emitted from industrial sources.
 Figure 6 shows a comparison of the predicted and observed hourly PAN concentrations at MT and along the tracks of the RHB ship and P-3 aircraft during the model episode. Observed PAN concentrations at MT show a clear diurnal variation and peak in the early afternoon of each day, reaching 1–2 ppb on most days. Model predictions agree well with observations and clearly reproduce the diurnal and day-to-day variations. The observed PAN concentration increases sharply on September 7 and reaches a maximum concentration of 6 ppb, which is likely due to a transient VOC emission event. This sharp increase of PAN is not reproduced by the model simulation. Predicted PAN concentrations along the tracks of the RHB and P-3 also agree with observations in general. Two significant peaks of PAN measured by the RHB are not reproduced by the simulation. However, the generally good agreement between the predicted and observed PAN concentrations on most days, as indicated in Table 1, provides confidence in using the episode average PAN source apportionment results to apportion the “PAN-dominated” factor into responsible emission sources. The NMB and NME for PAN in this study are significantly better than the NMB (0.92) reported in another study that uses WRF/ARW-CMAQ [Yu et al., 2012].
 Figure 7 shows the episode average surface level PAN concentrations predicted by CMAQ and the fractional contributions from different emission source types within the 4-km model domain and from upwind sources. In most parts of the domain, upwind sources and biogenic emissions are the dominant sources of PAN, accounting for approximately 65–75% and 25–35% of the predicted PAN concentrations, respectively. Contributions of other sources are generally much smaller. Maximum contributions from both vehicles and industrial sources are approximately 7%, while wildfires contribute between 6% and 18% downwind of the fire source. Results for the second model layer (where the MT site is located) are similar, with slightly higher contributions from upwind sources and lower contributions from biogenic sources. The high contributions of PAN from upwind sources are expected because the relatively long lifetime of PAN at night allows it to accumulate and be transported to downwind areas.
 Figure 8 shows a comparison of relative source contributions to HCHO predicted by the PMF technique and by the source-oriented CMAQ model at MT and HRM-3. Predicted total contributions (primary + secondary) of each source using the source-oriented CMAQ model are compared with the respective PMF factors. SO2-related and petrochemical factors from the PMF results are combined and compared with the predicted industrial contributions from the source-oriented CMAQ model. At MT, in addition to showing the original PMF results, we have used the CMAQ-predicted episode average fractional contributions to attribute PAN-dominated HCHO from PMF as follows: upwind sources (69.4%), biogenic sources (21.4%), vehicles (4.6%), industry (3.6%), wildfires (0.4%), and other sources (0.6%). The “other” source type of the source-oriented CMAQ results at MT is compared with the unresolved HCHO by the PMF model.
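 The redistribution of the PAN-dominated PMF factor described above can be sketched as a short calculation. The fractional contributions are those quoted in the text; the PMF factor shares in `example_pmf`, the dictionary keys, and the function name are hypothetical placeholders chosen only for illustration and are not the values from Figure 8.

```python
# CMAQ-predicted episode average fractional contributions to PAN at MT
# (values quoted in the text).
CMAQ_PAN_FRACTIONS = {
    "upwind": 0.694,
    "biogenic": 0.214,
    "vehicles": 0.046,
    "industry": 0.036,
    "wildfires": 0.004,
    "other": 0.006,
}


def reapportion_pan_factor(pmf_shares, pan_key="pan_dominated",
                           fractions=CMAQ_PAN_FRACTIONS):
    """Split the PAN-dominated PMF share among sources using CMAQ fractions."""
    if abs(sum(fractions.values()) - 1.0) > 1e-6:
        raise ValueError("fractions must sum to 1")
    # Start from every PMF factor except the PAN-dominated one.
    result = {k: v for k, v in pmf_shares.items() if k != pan_key}
    pan_share = pmf_shares[pan_key]
    for source, frac in fractions.items():
        result[source] = result.get(source, 0.0) + pan_share * frac
    return result


# Hypothetical PMF factor shares of HCHO in percent (illustration only).
example_pmf = {
    "vehicles": 20.0,
    "industry": 13.0,
    "natural_gas": 5.0,
    "biogenic": 30.0,
    "pan_dominated": 25.0,
    "other": 7.0,
}

adjusted = reapportion_pan_factor(example_pmf)
for source, share in sorted(adjusted.items()):
    print(f"{source}: {share:.2f}%")
```

 The total HCHO share is conserved by construction; only the PAN-dominated portion is moved into the source categories that the CMAQ model can resolve.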
 At the MT site, the PMF model, after attributing the PAN-dominated factor contributions to local and upwind sources (see PMF* results in Figure 8), predicts that more than 20% of the HCHO is from vehicles, while the source-oriented CMAQ model predicts a little more than 10%. PMF also predicts a higher contribution from industry (14%) than this study (5%). The contributions from natural gas combustion predicted by both models are small (PMF 5% vs. CMAQ 4%). Both methods predict that biogenic sources are the major local contributor to HCHO at MT (PMF 35% vs. CMAQ 34%), with small contributions from other sources (PMF 2% vs. CMAQ 2%). The relative contribution of upwind sources predicted by the CMAQ model is higher than that predicted by the PMF model (PMF 18% vs. CMAQ 44%). The results suggest that HCHO emitted from vehicles and industry might be somewhat underestimated in the current CMAQ model emission inventory.
 At HRM-3, the heavy alkanes and fuel evaporation factors from the PMF results are combined with the petrochemical factor and compared with the CMAQ industrial contributions. A significant fraction (44%) of the HCHO at HRM-3 is unresolved, which is likely due to nonlocal sources. The two models show excellent agreement in the relative source contribution assessments for local sources. Both models predict a lower contribution from vehicles (PMF 2% vs. CMAQ 5%) than at MT. The source-oriented CMAQ model predicts slightly higher contributions from industrial sources than the PMF results (PMF 3% vs. CMAQ 6%). The same is true for natural gas combustion (PMF 8% vs. CMAQ 11%). Both models predict that approximately 40% of HCHO is from biogenic sources. The relative contribution of upwind sources predicted by the CMAQ model is 42%, while the PMF model does not predict any upwind contribution at HRM-3. However, 44% of HCHO is not resolved by PMF, which is comparable to the upwind contribution predicted by CMAQ. The good agreement between the CMAQ and PMF results suggests that the current model inventory represents the average HCHO emissions from major local sources near HRM-3 well.
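 The HRM-3 comparison above can be summarized as per-source differences. The sketch below uses the percentages quoted in the text, with two assumptions made only for illustration: the "approximately 40%" biogenic share is entered as 40 for both models, and the HCHO unresolved by PMF is paired with the CMAQ upwind contribution. The variable names are hypothetical.

```python
# Relative contributions to HCHO at HRM-3 in percent, as quoted in the text.
# Assumptions: biogenic is entered as 40 for both models, and the PMF
# "unresolved" fraction is paired with the CMAQ upwind contribution.
pmf_hrm3 = {
    "vehicles": 2.0,
    "industry": 3.0,
    "natural_gas": 8.0,
    "biogenic": 40.0,
    "upwind_or_unresolved": 44.0,
}
cmaq_hrm3 = {
    "vehicles": 5.0,
    "industry": 6.0,
    "natural_gas": 11.0,
    "biogenic": 40.0,
    "upwind_or_unresolved": 42.0,
}

# Difference in percentage points (CMAQ minus PMF) for each source.
differences = {src: cmaq_hrm3[src] - pmf_hrm3[src] for src in pmf_hrm3}
max_abs_difference = max(abs(d) for d in differences.values())

for src, diff in differences.items():
    print(f"{src}: {diff:+.1f} percentage points")
print(f"largest absolute difference: {max_abs_difference:.1f}")
```

 Under these assumptions, no source category differs by more than a few percentage points between the two methods, which is consistent with the agreement described in the text.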
 Source contributions to primary and secondary HCHO from six local source categories and upwind sources in Southeast Texas during TexAQS 2006 were studied using a source-oriented version of the CMAQ model. Primary HCHO accounts for 20–30% of the total HCHO concentrations in urban Houston. Major local sources of primary HCHO include biogenic sources, natural gas combustion, and vehicles. Secondary HCHO from local sources contributes about 30–50% of total HCHO. Biogenic sources, industry, and vehicles are the most important sources of secondary HCHO predicted by the source-oriented CMAQ model. Upwind sources account for the remaining HCHO concentrations, and a majority of the HCHO from upwind sources is formed from oxidation of precursor VOCs after they are transported into the Southeast Texas domain. The predicted source contributions to HCHO at the MT based on the source-oriented CMAQ model generally agree with the PMF source apportionment results. These results suggest that to correctly predict HCHO concentrations in Southeast Texas, it is necessary to account accurately for both primary and secondary sources as well as long-range transport, as it is likely that they all contribute significantly to observed HCHO concentrations. Furthermore, it should be pointed out that the results from the current study are limited by significant uncertainties in emission inventories, especially incomplete knowledge of HCHO and precursor emissions from industrial sources in the HSC area. The simulation episode is also relatively short and does not cover other meteorological conditions that could potentially affect source attribution results. In addition, differences in inventories and in the complexity of model treatment of emissions, transport, and chemical transformations of HCHO and its precursors could lead to significantly different estimates of the importance of primary and secondary HCHO and their responsible sources, for example, between this study and Parrish et al. Further studies should be directed toward quantifying and reducing these uncertainties in HCHO source apportionment estimates.
 Original development of the source-oriented CMAQ photochemical mechanisms was supported by funding from the Texas Air Research Center (TARC) under project 078ATM2080A. H. Zhang is financially supported by the U.S. EPA research grant RD-83386501. J. Li is financially supported by the TARC research project 079ATM0099A. The PMF source apportionment study was funded in part by the U.S. EPA, the University of Houston, and the Texas Environmental Research Consortium (TERC). The authors would like to thank Barry Lefer, Bernhard Rappenglück, and Perry Samson for providing ambient concentration data.