#### 2.1. Observation site description

The observation data used in this paper are from Shangdianzi, a regional atmospheric background station located northeast of Beijing (40.65°N, 117.12°E, 293.91 m a.s.l.), about 120 km from the urban area of the city (Figure 1). Because the station is far from the densely populated urban area, its air pollution levels can represent the regional atmospheric background concentrations of North China, and its long time series of observations can reflect the effect of human activities on those background concentrations (Lin *et al.*, 2008; Zhang *et al.*, 2010). The CO observations in this paper come from the GC-ECD *in situ* system at Shangdianzi, which is installed in a laboratory at the station and has a relative measurement precision typically < 5%.

#### 2.2. FLEXPART model introduction and model establishment

The core of the FLEXPART model is the source-receptor relationship (SRR) of air pollutants: the pollution emission is the ‘source’ and the observation station is the ‘receptor’. By simulating transport, dispersion, convection, dry and wet deposition and radioactive decay, the model yields either the time-varying pollutant concentration on a grid (forward simulation) or the residence times in each grid cell (backward simulation; also called the sensitivity coefficient or the footprint).

The kernel of the FLEXPART model uses the zero acceleration scheme to calculate particle trajectories:

$$\mathbf{X}(t + \Delta t) = \mathbf{X}(t) + \mathbf{v}(\mathbf{X}, t)\,\Delta t \tag{1}$$

in which $\Delta t$ is the time increment, $\mathbf{X}$ is the position vector, and $\mathbf{v} = \bar{\mathbf{v}} + \mathbf{v}_t + \mathbf{v}_m$ is the wind vector, which is composed of the grid-scale wind ($\bar{\mathbf{v}}$), the turbulent wind fluctuations ($\mathbf{v}_t$) and the mesoscale wind fluctuations ($\mathbf{v}_m$).
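A zero acceleration step can be illustrated with a minimal sketch in which the particle position is advanced by the total wind (grid-scale plus turbulent plus mesoscale components) multiplied by the time increment. The function name `advance` and all numerical values are hypothetical and chosen only for illustration; this is not FLEXPART code.

```python
def advance(position, mean_wind, turbulent, mesoscale, dt):
    """One zero-acceleration step: new position = position + total wind * dt."""
    # Total wind is the sum of the three components named in the text.
    wind = [m + t + s for m, t, s in zip(mean_wind, turbulent, mesoscale)]
    return [x + v * dt for x, v in zip(position, wind)]


# Hypothetical values: position in metres, wind in m/s, time step of 60 s.
new_position = advance(
    position=[0.0, 0.0, 50.0],
    mean_wind=[5.0, -2.0, 0.0],
    turbulent=[0.5, 0.0, 0.25],
    mesoscale=[-0.5, 0.5, 0.0],
    dt=60.0,
)
print(new_position)
```

In a backward simulation the same update would be applied with a negative time increment.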

The turbulent wind fluctuations are calculated from the Langevin equation (Thomson, 1987):

$$dv_{t_i} = a_i(\mathbf{X}, \mathbf{v}_t, t)\,dt + b_{ij}(\mathbf{X}, \mathbf{v}_t, t)\,dW_j \tag{2}$$

The drift term $a$ and the diffusion term $b$ in the above equation are functions of position, turbulent velocity and time. The $dW_j$ are incremental components of a Wiener process with mean zero and variance $dt$, which are uncorrelated in time (Legg and Raupach, 1982).
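The role of the drift and diffusion terms can be illustrated for a single velocity component. The sketch below assumes stationary, homogeneous Gaussian turbulence, for which the drift reduces to $-v/\tau$ and the diffusion to $\sigma\sqrt{2/\tau}$; this simplification, the Euler-Maruyama discretization, and the names `langevin_step`, `sigma` and `tau` are assumptions made for illustration and do not reproduce the FLEXPART parameterization.

```python
import math
import random


def langevin_step(v, sigma, tau, dt, rng):
    """Euler-Maruyama update of one turbulent velocity component."""
    drift = -v / tau                          # a: relaxation towards zero
    diffusion = sigma * math.sqrt(2.0 / tau)  # b: strength of random forcing
    dW = rng.gauss(0.0, math.sqrt(dt))        # Wiener increment, variance dt
    return v + drift * dt + diffusion * dW


# Hypothetical parameters: sigma in m/s, Lagrangian time scale tau in s.
rng = random.Random(0)
v, samples = 0.0, []
for _ in range(200000):
    v = langevin_step(v, sigma=1.0, tau=100.0, dt=1.0, rng=rng)
    samples.append(v)

# The sample variance should be close to sigma squared.
variance = sum(s * s for s in samples) / len(samples)
print(variance)
```

With these assumptions the simulated velocity fluctuates around zero with a variance near $\sigma^2$ and loses memory of its initial value over a few multiples of $\tau$.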

Through the source-receptor relationship, the residence time at a grid point can be expressed as:

$$T_i = \frac{\Delta T}{N}\,\frac{1}{J}\sum_{n=1}^{N}\sum_{j=1}^{J} f_{ijn} \tag{3}$$

in which $T_i$ is the residence time at grid point $i$; $\Delta T$ is the time resolution; $N$ is the number of samples taken within $\Delta T$; $J$ is the total number of particles emitted; and $f_{ijn}$ is a function that determines which particles ‘contribute’ at the designated grid point.
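The counting behind a residence time can be sketched as follows. The example assumes that the contribution function is simply 1 when a particle lies inside a grid cell at a sampling time and 0 otherwise, and that the residence time is the time resolution divided by the number of samples and particles, multiplied by the number of such hits. Both assumptions, the function name `residence_times` and the particle positions are hypothetical; FLEXPART itself distributes each particle over neighbouring cells with a kernel.

```python
import math
from collections import defaultdict


def residence_times(tracks, delta_t, cell_size=1.0):
    """tracks[j][n] is the (lon, lat) of particle j at sampling time n."""
    n_particles = len(tracks)
    n_samples = len(tracks[0])
    # Time represented by one particle at one sampling instant.
    weight = delta_t / (n_samples * n_particles)
    times = defaultdict(float)
    for track in tracks:
        for lon, lat in track:
            # Index of the grid cell that contains the particle.
            cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
            times[cell] += weight
    return dict(times)


# Two hypothetical particles, three samples each, within a 3 h interval.
tracks = [
    [(117.1, 40.6), (116.5, 40.2), (115.9, 40.1)],
    [(117.1, 40.6), (117.4, 40.9), (116.2, 39.8)],
]
footprint = residence_times(tracks, delta_t=10800.0)
print(footprint)
```

By construction, the residence times of all cells sum to the length of the interval.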

The parameters of the FLEXPART model in this paper are set as follows. The model is run in backward mode because, relative to forward simulation, backward simulation more effectively reflects the distribution of potential source areas that affect a designated station (Stohl *et al.*, 2005). The source in the model is set as a point source, i.e. the Shangdianzi background station (40.65°N, 117.12°E). Every 3 h, 50 000 particles were released in a layer from 0 to 100 m above the model ground at the measurement location and tracked backwards in time for 7 and 20 days. The model output is the residence time in each horizontal grid cell, expressed in ps kg^{−1}, i.e. per unit mass of the pollutant gas. The horizontal grid resolution of the model is 1° × 1° and the time resolution is 3 h.

CO is selected as the tracer because it is one of the atmospheric pollutants with the largest emissions and is an important tracer for studying the transport and redistribution of pollutants in the atmosphere, so it is well suited to this purpose. FLEXPART was driven by 1° × 1° GFS (Global Forecast System) meteorological data from NCAR/NCEP (National Center for Atmospheric Research/National Centers for Environmental Prediction). The GFS data were available at 3 h resolution (analyses at 0000, 0600, 1200 and 1800 UTC and *T* + 3 h forecasts at 0300, 0900, 1500 and 2100 UTC). ECMWF data, another meteorological data set with 3 h and 1° × 1° resolution, are also used for comparison with the GFS data and to evaluate the model performance. The simulation period is from 10 February 2009 to 31 December 2009, in accordance with the observation data.

#### 2.3. Simulation concentration method

The output of the FLEXPART backward simulation is an emission sensitivity, i.e. the residence time in each grid cell. The simulated mixing ratio at the receptor is obtained by multiplying this footprint emission sensitivity by the emission inventory. The gridded CO emissions are from the INTEX-B 2006 inventory for East Asia (Zhang *et al.*, 2009), which has a spatial resolution of 0.5° × 0.5°. The model output for each grid cell multiplied by the CO emission of that cell gives the contribution of the cell to the CO concentration at Shangdianzi station, and summing the contributions of all grid cells at the same time level gives the simulated CO concentration at the station.
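The multiply-and-sum step can be written as a short sketch. It assumes that the footprint and the emissions have already been brought onto the same grid and into compatible units; the function name `simulated_concentration`, the cell indices and the numerical values are hypothetical.

```python
def simulated_concentration(footprint, emissions):
    """Sum over grid cells of emission sensitivity times emission."""
    # Cells without an inventory entry are assumed to emit nothing.
    return sum(s * emissions.get(cell, 0.0) for cell, s in footprint.items())


# Hypothetical footprint and emissions for one time level.
footprint = {(117, 40): 2.0, (116, 40): 0.5, (115, 39): 0.25}
emissions = {(117, 40): 10.0, (116, 40): 40.0}

contributions = {c: s * emissions.get(c, 0.0) for c, s in footprint.items()}
total = simulated_concentration(footprint, emissions)
print(contributions, total)
```

Keeping the per-cell contributions alongside the total makes it possible to see which grid cells dominate the simulated value at the receptor.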

Before comparison with the simulation, the observed CO concentrations are separated into background and non-background. The background concentration represents the well-mixed atmospheric composition after the direct influence of local conditions and human activities has been filtered out. Therefore, when discussing the influence of anthropogenic sources in the surrounding region on the CO concentration at the station, it is more reasonable to compare the FLEXPART simulation with the non-background concentration. The time range is from 10 February 2009 to 31 December 2009. Following the scheme of Stohl *et al.* (2009), the robust extraction of baseline signal (REBS) algorithm (Ruckstuhl *et al.*, 2010) is used to separate the CO time series at Shangdianzi station into background (also called baseline) and non-background concentrations. REBS is a statistical method based on robust local regression that is well suited to selecting background measurements and estimating the associated baseline curve. It does not simply take the lowest values of the time series, but identifies pollution events in an iterated series of local regression fits using robust weights for values above the baseline. Data with abnormally high CO concentrations caused by local sources are removed according to the observation records and the surface meteorological data; the removed data account for 1.76% of the total. The comparison of the non-background concentration time series with the simulated values also provides a basis for assessing the applicability of the FLEXPART model.
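The idea of iterated fits with robust weights for values above the baseline can be illustrated with a deliberately simplified sketch. It is not the REBS algorithm: a weighted running mean stands in for the local regression, the noise scale is estimated from the residuals below the baseline, and the function name `split_background`, the window length, the cutoff and the data are all assumptions made for illustration.

```python
import math


def split_background(values, half_window=3, iterations=5, cutoff=3.0):
    """Return (baseline, polluted flags) from an asymmetric robust smoother."""
    n = len(values)
    weights = [1.0] * n
    baseline = list(values)
    scale = 0.0
    for _ in range(iterations):
        # Weighted running mean as a stand-in for a local regression fit.
        for i in range(n):
            lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
            total = sum(weights[lo:hi])
            if total > 0.0:
                fit = sum(w * v for w, v in zip(weights[lo:hi], values[lo:hi]))
                baseline[i] = fit / total
        residuals = [v - b for v, b in zip(values, baseline)]
        # Residuals below the baseline are taken to reflect noise only.
        below = [r for r in residuals if r < 0.0]
        if not below:
            break
        scale = math.sqrt(sum(r * r for r in below) / len(below))
        # Full weight at or below the baseline, decaying weight above it.
        weights = [
            1.0 if r <= 0.0
            else max(0.0, 1.0 - (r / (cutoff * scale)) ** 2) ** 2
            for r in residuals
        ]
    polluted = [v - b > cutoff * scale for v, b in zip(values, baseline)]
    return baseline, polluted


# Hypothetical CO series with a two-point pollution event.
series = [100, 101, 99, 100, 300, 320, 100, 99, 101, 100, 100, 99]
baseline, polluted = split_background(series)
print([i for i, flag in enumerate(polluted) if flag])
```

In this toy series the first fit is pulled upwards by the event, but the down-weighting of values above the baseline lets later iterations settle near the undisturbed level, so the event stands out as non-background.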