Laboratoire des Sciences du Climat et de l'Environnement, Gif-sur-Yvette, France
Corresponding author: B. Koffi, Laboratoire des Sciences du Climat et de l'Environnement, L'Orme des Merisiers Bat. 712, Point courrier 132, F-91191 Gif-sur-Yvette, France. (email@example.com)
 The CALIOP (Cloud-Aerosol Lidar with Orthogonal Polarization) layer product is used for a multimodel evaluation of the vertical distribution of aerosols. Annual and seasonal aerosol extinction profiles are analyzed over 13 sub-continental regions representative of industrial, dust, and biomass burning pollution, from CALIOP 2007–2009 observations and from AeroCom (Aerosol Comparisons between Observations and Models) 2000 simulations. An extinction mean height diagnostic (Zα) is defined to quantitatively assess the models' performance. It is calculated over the 0–6 km and 0–10 km altitude ranges by weighting the altitude of each 100 m altitude layer by its aerosol extinction coefficient. The mean extinction profiles derived from CALIOP layer products provide consistent regional and seasonal specificities and a low inter-annual variability. While the outputs from most models are significantly correlated with the observed Zα climatologies, some do better than others, and 2 of the 12 models perform particularly well in all seasons. Over industrial and maritime regions, most models show higher Zα than observed by CALIOP, whereas over the African and Chinese dust source regions, Zα is underestimated during Northern Hemisphere Spring and Summer. The positive model bias in Zα is mainly due to an overestimate of the extinction above 6 km. Potential CALIOP and model limitations, and methodological factors that might contribute to the differences are discussed.
 The variability of aerosol particle loading, both in space and time, makes it difficult to quantify the current impact of aerosols on the Earth radiative forcing and climate. Several studies [e.g., Kinne et al., 2006; Schulz et al., 2006; Textor et al., 2006, 2007; Koch et al., 2009; Prospero et al., 2010; Huneeus et al., 2011] conducted within the AeroCom (Aerosol Comparisons between Observations and Models; Schulz et al. ) project show a large diversity in burdens and spatial distribution of the simulated aerosol species. These differences reveal large uncertainties in simulated aerosol processes (transport, removal, chemistry and microphysics) and, to a lesser extent, in the spatial and temporal distributions of (precursor) emissions [Textor et al., 2007]. The vertical distribution of tropospheric aerosol is of particular importance because it is a combined signature of atmospheric transport patterns, residence times in the atmosphere (i.e., removal), and the efficiency of vertical exchange.
 The vertical distribution of aerosol particles plays a crucial role in aerosol-cloud interaction and in the radiation balance of the Earth atmosphere [e.g., Stier et al., 2007; Langmann et al., 2009]. While passive space-borne observations such as the MODerate resolution Imaging Spectroradiometer (MODIS) [Remer et al., 2005; Levy et al., 2007] and the Multiangle Imaging Spectroradiometer (MISR) [Kahn et al., 2005] have been used to evaluate the two-dimensional (2D) distribution of atmospheric particles (MODIS, MISR), or plume heights (MISR) [e.g., Tosca et al., 2011], the vertical distribution could not be evaluated systematically over the globe until the launch of active space-borne sensors. Since 2006, the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) lidar instrument on board the CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations) satellite has profiled the vertical distribution of aerosols and optically thin clouds, as well as their optical and physical properties, with unprecedented coverage and spatial resolution [Winker et al., 2009]. It has continuously acquired, with near global coverage, attenuated backscatter data at 532 nm and 1064 nm, including linear depolarization information at the shorter wavelength channel. This unique continuous and global aerosol profiling allows assessments of model simulations of the aerosol vertical distribution on global and multiannual scales.
 A first comparison of CALIOP aerosol measurements with global model simulations was performed by Yu et al. . The authors compared extinction coefficient profiles (km−1) simulated by the GOCART model [Ginoux et al., 2004; Chin et al., 2009] to mean extinction profiles derived from CALIOP measurements under clear sky conditions over 13 sub-continental regions from June 2006 to November 2007. Using Version 2.01 of the CALIOP Level 2 Layer 5 km product, their results show that GOCART reproduces higher aerosol extinction heights in major dust and smoke regions than in industrial pollution source regions. They also highlighted differences with CALIOP observations, including a larger simulated extinction coefficient in dust regions, and a lower altitude and extinction coefficient of the simulated aerosol at midlatitudes downwind of pollution sources. The authors point out the need to conduct similar comparisons between CALIOP observations and other models, using an improved version of the CALIOP retrieval algorithms. In fact, the version (v2.01) used by Yu et al.  and by many other studies over the past four years has been shown to carry large uncertainties and errors [Kacenelenbogen et al., 2011, and references therein].
 This study represents an extension of Yu et al. . The CALIOP Level 2 Layer v3.01 product, over the 2007–2009 period, is used to evaluate the performance of 12 global atmospheric models in simulating the mean aerosol particle vertical distribution at sub-continental scales. This current version of CALIOP data was produced from improved retrieval algorithms for cloud and aerosol optical depths [e.g., Hu et al., 2009; Liu et al., 2010; Vaughan et al., 2010; Kacenelenbogen et al., 2011]. While validation is ongoing, initial studies [e.g., Wu et al., 2011] indicate an enhanced reliability in measuring aerosol particle vertical distributions. We also use MODIS to evaluate CALIOP and model tropospheric AOD (Aerosol Optical Depth). The 2007 to 2009 CALIOP-derived vertical aerosol profiles and MODIS AOD are compared to the twelve model simulations performed for the year 2000 in the AeroCom Phase I exercise, intended to support the 4th IPCC assessment report. While the AeroCom Phase II simulations [Schulz et al., 2009] cover the CALIOP observation period, their output was not available at the time of this analysis.
 The AeroCom I simulations were conducted for a different period of time and are, to some extent, outdated, as the models have improved and evolved since AeroCom I. It is nevertheless worthwhile to evaluate their vertical aerosol particle distribution, because many of their other properties have been widely documented, evaluated, and analyzed in previous studies [e.g., Kinne et al., 2006; Schulz et al., 2006; Textor et al., 2006, 2007; Koch et al., 2009; Prospero et al., 2010; Huneeus et al., 2011]. This study therefore provides complementary information for the discussion of model simulated aerosol properties. It further serves the aerosol modeling community by helping to identify to what extent and in what manner the models improved between the two AeroCom phases with respect to their reproduction of the aerosol particle vertical distribution.
 Model simulations evaluated for 2000 using the 2007–2009 CALIOP measurements likely suffer from emission and meteorological differences between the modeled and observed years. Our presumption is that the three-year CALIOP observations do provide, in a climatological sense, a consistent and robust depiction of the typical aerosol particle vertical distribution at sub-continental and seasonal scales. This is tested in this study through the assessment of the representativeness of CALIOP coverage over the selected regions and studied years, based on a comparison of CALIOP and MODIS-derived AOD, analysis of the inter-annual variability of the CALIOP-derived extinction profiles over the 3-yr period, and through sensitivity analysis where the model data is sampled on CALIOP overpasses. CALIOP and MODIS satellite observations as well as the AeroCom models and simulations are presented in Section 2. Methodology and results are discussed in Sections 3 and 4, respectively. In Section 5, we provide some insights for possible model and observations related factors contributing to the reported model versus observation differences. Major findings are summarized in Section 6.
2. Brief Description of Satellite Observations and AeroCom Models
 The CALIOP Level 2 Layer 5 km product was used to evaluate the performance of the AeroCom models in simulating mean particle vertical distributions over selected sub-continental regions. As part of the “A-train” multisatellite constellation, CALIPSO follows a 705 km sun-synchronous polar orbit, with an equator-crossing time of about 1:30 P.M., local solar time [Stephens et al., 2002]. The orbit repeats the same ground track every 16 days. Using two 532 nm receiver channels and a channel measuring the total 1064 nm return signal, the CALIOP lidar has been continuously acquiring high resolution (up to 30 m vertically and 333 m horizontally), attenuated backscatter data at both wavelengths since June 2006. Attenuated backscatter represents the range-corrected calibrated lidar signal, obtained after the subtraction of calibration terms [Winker et al., 2007; Hunt et al., 2009].
 The CALIOP Level 2 Layer 5-km product is derived from the CALIOP attenuated backscatter Level 1 product [Winker et al., 2009; Vaughan et al., 2010; and references therein]. The algorithm system for the Level 2 products (i) detects cloud and aerosol layers in the backscattered signal, (ii) determines which of these layers are cloud or aerosol features, (iii) estimates aerosol lidar ratio, and (iv) retrieves profiles of aerosol extinction coefficients, which are then integrated to derive estimates of AOD at 532 nm and 1064 nm. Cloud versus aerosol discrimination (ii) is performed using an adaptive threshold based on the magnitude and spectral variation of the lidar backscatter at both wavelengths. Its uncertainty is assessed through a Cloud-Aerosol-Discrimination (CAD) score ranging from −100 (most likely to be aerosols) to +100 (most likely to be clouds) [Liu et al., 2004, 2009; Vaughan et al., 2004, 2005]. The aerosol lidar ratios (iii) are estimated via an empirical determination of aerosol type using a model-based scheme that includes six aerosol types: polluted continental, biomass burning, desert dust, polluted dust, clean continental and marine [Omar et al., 2009]. The detection scheme uses an adaptive multiprofile averaging scheme that identifies coherent features in consecutive profiles, within windows from 5, up to 80 km, along the CALIOP tracks.
 This study is based on the CALIOP Layer Product version 3.01. The data set used here to evaluate the AeroCom simulations for the year 2000 covers the time period from January 2007 to December 2009. The CALIOP version 2.01 data were also processed, but for the year 2007 only, to evaluate the impact of the data version. The AOD, the bottom and top above sea level altitudes, and the CAD information of all individual aerosol and cloud layers are used to calculate mean aerosol vertical extinction profiles over the selected regions, as described in Section 3. Only the 532 nm channel and only the nighttime observations are used because these data show a better signal-to-noise than the 1064 nm and/or the daytime observations [Chazette et al., 2010; Wu et al., 2011].
2.2. Space-Borne Radiometer MODIS Onboard Aqua Satellite
 While the goal of this study is not to evaluate total column AOD of the models, it is important to check whether the nighttime measurements of the CALIOP lidar, with their small footprint, provide a representative image of the mean aerosol distribution over the studied sub-continental regions. To this end, and in addition to the sensitivity study described in Section 3, the AOD derived from the CALIOP Layer product was compared to MODIS AOD measurements. The MODIS instrument, on board the NASA Earth Observing System (EOS) Aqua satellite, is a passive radiometer that is part of the A-Train, and data are available over the 2005–2009 period. MODIS AOD are retrieved over the oceans in 7 different spectral bands (from the visible to the near infrared) and in 3 bands over land. To increase signal to noise, the standard retrieval algorithm is applied to a spatial average of pixels, leading to a resolution of the AOD product of 10 × 10 km. Despite uncertainties in the MODIS AOD retrieval algorithm [e.g., Kaufman and Tanré, 1998; Chu et al., 2002; Remer et al., 2005], and although Quality Assurance screening has produced more accurate data sets [Zhang et al., 2008], the operational MODIS AOD is commonly acknowledged as a reliable global product of aerosol optical depth [Bréon et al., 2011]. Exceptions occur over bright land surfaces, like deserts, where the operational algorithm fails to retrieve valid AODs. The Collection 5 Level-3 atmosphere global product (MYD08_M3), which is used in this study, contains monthly 1 × 1 degree grid average values of aerosol optical properties (among other atmospheric parameters). Over desert areas, the deep blue product (“Deep Blue AOD at 550 nm”) is used instead. Note that MODIS aerosol observations are performed during the daytime, whereas we use nighttime CALIOP data in our study. Nonetheless, the 2007 mean annual AOD calculated over the thirteen regions from CALIOP 24h data does not differ much from the one computed from CALIOP nighttime observations (by less than 10% for 11 regions, and by ∼20% for the 2 other regions).
2.3. AeroCom Simulations
 The AeroCom project documents differences of aerosol component modules of global models and assembles data sets for aerosol model evaluations. In order to facilitate model inter-comparisons and comparisons to measurement data, AeroCom requests modelers to follow a common data-protocol and provide detailed model outputs. This paper presents the first part of our study that details the Experiment A simulations [Kinne et al., 2006, Textor et al., 2006, 2007] of AeroCom phase I, in which the models used their own standard emissions for the year 2000 (“AeroCom A” hereafter). It focused on five aerosol components: dust (DU), sea salt (SS), sulfate (SO4), black carbon (BC), and particulate organic matter (POM), and on the sum of these components. Models have shown to best agree on the emission mass fluxes of “anthropogenic” compared with “natural” matter [Textor et al., 2006]. Results from Textor et al.  show that differences in natural SS and DU aerosols are due to differences in the simulation of the size spectra, in the parameterizations of source strength as a function of wind speed (and soil properties for dust aerosol), and in the atmospheric dynamical fields (wind, temperature, precipitation, clouds, etc.). Emissions of anthropogenic SO4, BC and POM species show better agreement, due to the common use of few and usually similar emission inventories. Nonetheless, these inventories have been improved for some species or emission types by individual modelers, and their mix in each model varies [Kinne et al., 2006].
 Twelve AeroCom models (see Table 1 for model description and references) providing 3D AOD output information at 550 nm, close to the CALIOP spectral wavelength (532 nm), are considered. The difference in AOD between the two wavelengths is expected to be small, i.e., between 2% and 4% for typical Angstrom exponents of 0.5 to 1 [Kittaka et al., 2011]. The models are either Chemical Transport Models (CTMs) or General Circulation Models (GCMs). In CTMs, the meteorology is prescribed either from climate model simulations or from analyzed weather observation systems. In GCMs, the aerosol transport is simulated online. Results from five CTMs (GOCART, MATCH, MOZGN, TM5, and UIO-CTM) and seven GCMs (ARQM, GISS, SPRINTARS, LSCE, ECHAM-HAM, PNNL and UIO-GCM) are reported. Note that the SPRINTARS and ECHAM-HAM models appear under the names ‘KYU’ and ‘MPI-HAM’, respectively, in earlier AeroCom papers [e.g., Kinne et al., 2006; Schulz et al., 2006; Textor et al., 2006, 2007; Koch et al., 2009]. Ten models simulated the year 2000 specifically using (different) meteorological analysis data sets (‘2000’ experiments), whereas two models without nudging capability (UIO-GCM and ARQM) provided climatological averages from 5 years of simulation, using identical emissions each year (noted ‘9999’ or ‘climatic’ hereafter). For the TM5 model, which did not participate in AeroCom A, outputs from Experiment B (“AeroCom B” hereafter), using the Dentener et al.  unified 2000 emission data set, are reported. Note also that in this Phase I of AeroCom, only the fully prognostic SO4, BC, and POM aerosol particles are subject to transport and removal in the UIO-GCM model, whereas the major part of the dry sea-salt and mineral dust concentrations is prescribed [Kirkevåg et al., 2005]. For additional information on the models, see Textor et al. .
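The size of the 532 nm versus 550 nm difference can be checked with the Ångström power law, in which AOD scales as wavelength to the power of minus the Ångström exponent. The following is a minimal sketch of that arithmetic; the function name `aod_ratio` is illustrative and not part of any AeroCom or CALIOP tool.

```python
def aod_ratio(angstrom_exponent, lambda_from_nm=532.0, lambda_to_nm=550.0):
    """Ratio AOD(lambda_to) / AOD(lambda_from) under the Angstrom power law."""
    return (lambda_to_nm / lambda_from_nm) ** (-angstrom_exponent)


# Typical Angstrom exponents quoted in the text (0.5 to 1).
for alpha in (0.5, 1.0):
    diff_percent = 100.0 * (1.0 - aod_ratio(alpha))
    print(f"Angstrom exponent {alpha}: AOD(550 nm) is {diff_percent:.1f}% below AOD(532 nm)")
```

For these exponents the sketch yields differences of a few percent, of the same order as the range quoted above.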
Table 1. AeroCom Models and A (ID) and B (ID_B) Experiments Considered in This Study
 Annual and seasonal aerosol mean extinction profiles from CALIOP observations (at 532 nm) and the model simulations (at 550 nm) are calculated from January 2007 to December 2009, over 13 sub-continental regions representative of industrial, dust and biomass burning pollution (Figure 1). These regions are the same as the ones defined by Yu et al. , except that we split (and extended) their so-called ‘SAF’ African biomass burning region [25°S-15°N; 0°E–45°E], into a ‘CAF’ Central [0°–15°N; 18°W-60°E] and a ‘SAF’ Southern [25°S-0°S; 0°E–45°E] African regions, in order to account for the different biomass burning periods, i.e., November to February and June to October, respectively [e.g., Liousse et al., 2010].
3.1. Satellite Data Processing
3.1.1. MODIS-Derived Aerosol Optical Depths
 MODIS-derived mean AOD were calculated over the studied regions and the 2007–2009 period, by combining the standard AOD product from MODIS (“Optical Depth over Land and Ocean”) with the deep blue product (“Deep Blue AOD at 550 nm”) that provides information over desert areas. We averaged the 1 × 1° monthly AOD estimates that fall into the regions. As we are interested in evaluating the overall CALIOP potential with respect to regional averages in AOD, no dating or spatial coincidence with CALIOP overpasses is considered here.
3.1.2. CALIOP Absolute and Normalized Aerosol Extinction Profiles
 CALIOP Layer Product Version 3.01 data are used to derive a climatology of the aerosol vertical distribution in the troposphere (0–10 km), over the 2007–2009 period. CALIOP Version 2.01 data have been processed in an identical manner but for the year 2007 only. Among the parameters, we use the top and base of the aerosol and cloud layers, and the AOD and Cloud Aerosol Discrimination score of the aerosol layers. In order to include only well-defined aerosol layers (see Section 2.1 for definitions) and avoid cloud contamination, two aerosol data screenings are applied to the CALIOP 5-km aerosol layer product, following Yu et al. . The first screening excludes the aerosol layers that have low CAD scores (i.e., between −50 and 0). The second screening concerns the lidar ratio: in the CALIOP algorithms, the initial lidar ratio is selected based on a set of predefined aerosol types, and we exclude cases where this initial lidar ratio has been adjusted, which usually occurs for complex features and induces instabilities in the algorithm and larger uncertainties in the retrieved extinction. Although this happens rarely [Kittaka et al., 2011], these retrievals generally lead to anomalously high AOD [Omar et al., 2009; Winker et al., 2009; Young and Vaughan, 2009].
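The two screenings can be illustrated with a short sketch. The `AerosolLayer` record and its field names are hypothetical and do not reproduce the variable names of the CALIOP product files. The treatment of a CAD score of exactly −50 is also an assumption, since the text only states that scores between −50 and 0 are excluded.

```python
from dataclasses import dataclass


@dataclass
class AerosolLayer:
    """Hypothetical record for one layer of the 5-km aerosol layer product."""
    base_km: float               # layer base altitude above sea level
    top_km: float                # layer top altitude above sea level
    aod_532: float               # layer optical depth at 532 nm
    cad_score: int               # -100 (most likely aerosol) to 0 for aerosol layers
    lidar_ratio_adjusted: bool   # True if the initial lidar ratio was modified


def is_well_defined(layer, cad_threshold=-50):
    """Keep a layer only if it passes both screenings described in the text."""
    # Assumption: a score of exactly -50 counts as "low" and is rejected.
    confident_aerosol = layer.cad_score < cad_threshold
    return confident_aerosol and not layer.lidar_ratio_adjusted


# Made-up layers, for illustration only.
layers = [
    AerosolLayer(0.5, 2.0, 0.10, -90, False),  # kept
    AerosolLayer(1.0, 3.0, 0.20, -30, False),  # rejected: low CAD score
    AerosolLayer(0.2, 1.0, 0.50, -95, True),   # rejected: lidar ratio adjusted
]
kept = [layer for layer in layers if is_well_defined(layer)]
print(len(kept), "of", len(layers), "layers retained")
```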
Yu et al.  considered only CALIOP nighttime observations in cloud-free conditions. All nighttime data are considered here in order to ensure a higher representativeness of the regional climatology. Cloud-free (or clear-sky) conditions are identified by the total absence of cloud layers in a given column. Since CALIOP may see through optically thin clouds, the computation of what we hereafter call all sky profiles requires a vertical cloud mask that excludes below-cloud areas, i.e., the parts of the profile where aerosols cannot be detected by the lidar. This information is provided by the CALIOP 5-km Cloud Layer product. The all sky profiles are then calculated by averaging the aerosol extinction in the atmospheric layers. For each shot, altitudes below cloud and aerosol features that completely attenuate the backscatter signal (identified from the ‘Layer Opacity Flag’) are not factored into the averaging. Over land, levels that are below the surface elevation at the lidar footprint are also ignored. Elsewhere, at levels where the CALIPSO product reports neither cloud nor aerosol layers, representing clear air, an extinction value of zero is assigned.
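The averaging rule described above (levels hidden below opaque features or below the surface are left out, clear-air levels count as zero) can be sketched as a masked mean. The array layout and the function name `all_sky_mean_profile` are assumptions made for this illustration.

```python
import numpy as np


def all_sky_mean_profile(extinction, observable):
    """Average extinction per level over the columns where that level is observable.

    extinction: (n_columns, n_levels) array in km^-1, with 0 where clear air is reported.
    observable: boolean array of the same shape, False below opaque features
                or below the surface elevation.
    Returns the mean profile (NaN where no column is observable) and the
    number of contributing columns per level.
    """
    extinction = np.asarray(extinction, dtype=float)
    observable = np.asarray(observable, dtype=bool)
    n_obs = observable.sum(axis=0)
    total = np.where(observable, extinction, 0.0).sum(axis=0)
    mean = np.full(n_obs.shape, np.nan)
    np.divide(total, n_obs, out=mean, where=n_obs > 0)
    return mean, n_obs


# Two columns, two levels; the first level of the second column is masked.
mean, n_obs = all_sky_mean_profile(
    [[0.1, 0.2], [0.3, 0.0]],
    [[True, True], [False, True]],
)
```

Returning the per-level count alongside the mean makes it possible to inspect how the number of observations varies with altitude.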
 In the CALIOP Layer product v2.01, the base altitudes of optically thick aerosol layers are sometimes biased high due to lidar signal attenuation or signal perturbations. This causes an underestimate in the aerosol extinction at low levels. Version 3.01 includes a base-extension algorithm for such cases. Despite the improvements in the CALIOP Layer product, and in particular the above mentioned algorithm, an unrealistically low mean extinction was found at the surface level over the selected regions (see auxiliary material Figure S1). This can be at least partly explained by the CALIPSO algorithm that sets the aerosol layer base 90 m above the surface, in order to limit the contamination of the measured signal by the surface return. To reduce this anomaly, we apply our own further correction: the lowest aerosol layer is assumed to extend to the surface whenever its height above the surface is less than 10% of the layer depth. This correction removes unrealistic low aerosol extinction at the surface over all but one region (auxiliary material Figure S1). Each well-defined aerosol layer and aerosol-free layer is then split in 100 m height segments to allow for averaging over complex layer structures along the CALIOP path. Mean annual and seasonal vertical profiles are next calculated over the 0–10 km altitude range, by averaging the screened 5 km layer product over each region, using 100 m vertical resolution. The resulting CALIOP all sky extinction profiles, as well as the impact of our CALIOP data processing (screening, cloud masking) are discussed in Section 4.1.
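The surface-extension rule and the splitting of layers into 100 m segments can be sketched as follows. Two choices here are assumptions that the text does not specify: the extinction is taken as uniform within a layer (layer AOD divided by layer depth), and the layer AOD is conserved when the base is extended to the surface. Function names are illustrative.

```python
import numpy as np

BIN_KM = 0.1   # 100 m vertical resolution
N_BINS = 100   # 0-10 km altitude range


def extend_to_surface(base_km, top_km, surface_km=0.0, max_gap_fraction=0.10):
    """Extend the lowest layer to the surface when the gap is under 10% of its depth."""
    depth = top_km - base_km
    gap = base_km - surface_km
    if 0.0 < gap < max_gap_fraction * depth:
        return surface_km, top_km
    return base_km, top_km


def layer_to_profile(base_km, top_km, aod):
    """Spread a layer on the 100 m grid, weighting each bin by its overlap with the layer."""
    edges = np.arange(N_BINS + 1) * BIN_KM
    lower = np.clip(base_km, edges[:-1], edges[1:])
    upper = np.clip(top_km, edges[:-1], edges[1:])
    overlap_fraction = (upper - lower) / BIN_KM
    extinction = aod / (top_km - base_km)   # km^-1, assumed uniform in the layer
    return extinction * overlap_fraction


# A layer whose reported base sits 90 m above the surface.
base, top = extend_to_surface(0.09, 2.09)
profile = layer_to_profile(base, top, aod=0.4)
```

Weighting by the overlap fraction keeps the column-integrated AOD unchanged when layer boundaries do not coincide with the 100 m grid.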
 Finally, “normalized” profiles were also calculated from the original extinction profiles (also called “absolute” profiles) by scaling each profile so that its total AOD over the first 10 km takes the same common value (AOD 0–10 km = 1). This normalization provides profiles that superimpose better on each other, and therefore makes it easier to compare the typical shape of the simulated and observed vertical distributions.
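A minimal sketch of this normalization, assuming the profile is stored on the regular 100 m grid between 0 and 10 km (the function name is illustrative):

```python
import numpy as np


def normalize_profile(extinction, bin_km=0.1):
    """Scale an extinction profile (km^-1) so that its integrated AOD equals 1."""
    extinction = np.asarray(extinction, dtype=float)
    aod = extinction.sum() * bin_km   # column AOD over the profile
    if aod <= 0.0:
        raise ValueError("profile has no aerosol extinction to normalize")
    return extinction / aod


# Made-up exponential profile on the 100 m grid, for illustration only.
z_mid = (np.arange(100) + 0.5) * 0.1
normalized = normalize_profile(0.2 * np.exp(-z_mid / 2.0))
```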
3.1.3. CALIOP Mean Extinction Height Diagnostic (Zα)
 We quantitatively assess the ability of the different models to reproduce the aerosol mean vertical distribution over each region and season through the calculation of a mean extinction height diagnostic (Zα), as follows:
$$Z_\alpha = \frac{\sum_{i=1}^{n} b_{ext,i}\, Z_i}{\sum_{i=1}^{n} b_{ext,i}} \qquad (1)$$
with b_ext,i the aerosol extinction coefficient (km−1) at level i, Z_i the altitude (km) of level i, and n the number of 100 m altitude layers. The sums run from the first to the last 100 m altitude layer.
 The Zα diagnostic provides a useful and simple measure to gauge the performance of the models in the different regions. It is applied in a first step over the first 10 km and, in a second step, over the first 6 km, i.e., where most of the aerosol load is concentrated. It is important to note that Zα cannot reveal combinations of positive and negative biases in different superimposed air masses, since these may compensate each other. This limits its usefulness in the case of complex and non-continuous multimodal vertical profiles, such as the ones that are obtained at seasonal time scales over some regions, as further discussed in the results and discussion sections.
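As an illustration, the extinction-weighted mean altitude can be computed on the 100 m grid for both altitude ranges. This is a sketch only: taking the mid-point of each 100 m layer as its altitude is an assumption, and the exponential test profile is made up.

```python
import numpy as np


def mean_extinction_height(extinction, z_km):
    """Extinction-weighted mean altitude (km); NaN if the profile holds no extinction."""
    extinction = np.asarray(extinction, dtype=float)
    z_km = np.asarray(z_km, dtype=float)
    total = extinction.sum()
    if total <= 0.0:
        return float("nan")
    return float((extinction * z_km).sum() / total)


# Mid-points of the 100 m layers between 0 and 10 km (assumed layer altitude).
z_mid = (np.arange(100) + 0.5) * 0.1
profile = 0.2 * np.exp(-z_mid / 2.0)   # made-up profile in km^-1

z_alpha_10 = mean_extinction_height(profile, z_mid)
below_6km = z_mid < 6.0
z_alpha_6 = mean_extinction_height(profile[below_6km], z_mid[below_6km])
```

Restricting the sums to the lowest 6 km removes the contribution of any extinction above that level, so the 0–6 km value can never exceed the 0–10 km value for the same profile.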
3.2. Model Simulations Output Processing
 The GCM and CTM mean aerosol profiles were calculated from the monthly 3D AOD model results, averaged for each model vertical level, from all the model grid cells within the studied sub-continental region. The altitudes of the 3D model levels are calculated from the 3D pressure, assuming a hydrostatic flow. While this monthly regional averaging does not allow the spatial and temporal match between the model cells and CALIOP overpasses, it provides a complete image of the simulated climatology of the tropospheric aerosol distribution over a given region.
 In order to check if the extinction profiles averaged from the spatio/temporal limited coverage CALIOP data can be compared to regional model averages, we made a preliminary sensitivity study by calculating LMDz-INCA (3.8° × 2.5° horizontal resolution) and SPRINTARS (1.1° × 1.1° horizontal resolution) extinction profiles, using a temporal (daily) and spatial (grid cell) model-data collocation from the original model grid and a CALIOP 1x1 degree grid. Since no daily model output was available after 2006 at the time of our study, the 2006 simulation outputs of AeroCom II exercise [Schulz et al., 2009] were sampled versus 2007 CALIPSO overpasses. Very little change is found in the simulated vertical aerosol profiles when sub-sampling the model grid cells versus a regional average (auxiliary material Figure S2), indicating that the CALIOP coverage is adequate to evaluate the mean aerosol climatology as simulated by the models on seasonal time scale, over sub-continental areas.
 The highest impact of subsampling on the annual profile is obtained in Western China and Eastern Europe regions, for LMDz-INCA and SPRINTARS models, respectively. On a seasonal scale, the highest impact of sub-sampling is calculated over the Western China region for both models, and during DJF (+26% on Zα) and JJA (−9% on Zα) seasons, respectively. In fact, more than 81% of the grid cells inside each studied region are covered by at least one CALIOP overpass. These statistics together with the large samples and small model inter-daily variability of the aerosol vertical distribution explain why the model-data collocation sampling has only a slight impact on the mean simulated profile.
 Similar to CALIOP data processing, mean “normalized” profiles, AOD and extinction height (Zα) were calculated over the first 6 km and 10 km altitude ranges. The model profiles presented here are plotted at the original model vertical resolution. The Zα diagnostic is then calculated as for CALIOP-derived profiles (equation (1)), by first applying a linear interpolation of the extinction every 100 m.
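The interpolation step for the model profiles can be sketched with `numpy.interp`. The use of layer mid-points as target altitudes and the function name `to_100m_grid` are assumptions for this illustration. Note that `numpy.interp` holds the end values constant outside the range of the model levels, which may or may not match the treatment used in the study.

```python
import numpy as np


def to_100m_grid(model_z_km, model_ext, top_km=10.0, bin_km=0.1):
    """Linearly interpolate a model extinction profile onto a regular 100 m grid."""
    model_z_km = np.asarray(model_z_km, dtype=float)
    model_ext = np.asarray(model_ext, dtype=float)
    order = np.argsort(model_z_km)           # np.interp needs increasing altitudes
    n_bins = int(round(top_km / bin_km))
    grid_km = (np.arange(n_bins) + 0.5) * bin_km
    return grid_km, np.interp(grid_km, model_z_km[order], model_ext[order])


# Made-up model levels (deliberately unsorted) with a linearly decreasing extinction.
model_z = np.array([9.95, 0.05, 5.0])
model_ext = 1.0 - model_z / 10.0
grid_km, ext_100m = to_100m_grid(model_z, model_ext)

# Extinction-weighted mean height on the interpolated profile.
z_alpha_model = float((ext_100m * grid_km).sum() / ext_100m.sum())
```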
 In a first section (Section 4.1), we analyze the mean AOD and aerosol extinction profiles calculated for the different regions from CALIOP Layer Product Version 3. The effects of data processing and screenings and differences with CALIOP Version 2 derived profiles are first discussed. The consistency of the aerosol vertical distributions with current knowledge on aerosol emissions, transports and distributions is then analyzed. The simulated vertical distributions are presented and compared to CALIOP-derived ones in Section 4.2. Both annual and seasonal distributions are analyzed. Note that the South East Asian (SEA) region includes a large fraction of marine area and is therefore less representative of biomass burning emissions than the CAF, SAM and SAF regions. Moreover, it might be more significantly affected by an ocean versus land bias in the application of the CALIOP extinction to backscatter ratios (see the discussion section). For these reasons, as well as for plotting convenience, both the annual and seasonal SEA profiles are provided in the auxiliary material (Figure S3), but are discussed all the same hereafter.
4.1. CALIOP Extinction Profiles
 As an example, Figure 2 depicts the number of CALIOP 2007 observations and the resulting mean annual aerosol extinction profiles calculated over the Western Europe region [36°N-60°N; 10°W-50°E]. The impact of the aerosol data screenings and the cloud masking are shown. The number of observations in cloud-free columns decreases close to sea level, because of orography. Depending on the region, clear-sky data account for 40% to 60% of the all sky screened data, except in North Africa, NAF, (76%) and over the North Atlantic, NAT, (36%). Unexpectedly, in the other regions as well, the cloud-free and all sky extinction profiles resemble each other in both magnitude and shape, and the mean annual Zα obtained in clear sky conditions is only slightly lower (by up to 13%) (see auxiliary material Figure S1). This result indicates that the climatology of the mean aerosol vertical extinction distribution is not significantly affected by the presence of clouds. While the aerosol-cloud interaction processes are likely important, this finding might result from efficient mixing between cloudy and cloud-free areas, from compensating effects (such as larger humidification effects in cloudy regions compensating for lower aerosol concentrations), or from a biased cloudy sky average (i.e., below-cloud scavenging of the aerosol being probably only important for clouds that are optically too thick for CALIOP retrieval).
 Significant differences are obtained between CALIOP Versions 2.01 and 3.01 profiles, especially below 2 km altitude (auxiliary material Figure S1). Annually, the CALIOP version 3 data lead to more monotonic distributions in the Planetary Boundary Layer (PBL) and low troposphere, with aerosol extinction values increasing almost down to the surface. This result reflects the enhanced performance of the new version of the CALIOP processing algorithm to retrieve low-lying aerosol layers, especially in the regions of South America and Central Africa.
 CALIOP Version 2.01 AOD, retrieved by columnar integration of extinction profiles, shows a negative bias compared to MODIS for 10 out of the 13 regions, reaching up to 45% and 50% for the NWP and NAT regions, respectively. With Version 3.01, the AOD is increased over all regions, leading to a reduction of the absolute bias for all but three regions (Tables 2a and 2b). The bias between CALIOP and MODIS AODs is often less than 25%. A significant correlation is obtained between CALIOP and MODIS-derived seasonal AODs, using the 13 regions as independent data points (Figure 3). These results, together with our sensitivity analysis in Section 3.2, indicate that the spatial and temporal coverage of CALIOP measurements and their version 3 products do provide a consistent and representative signal of the mean regional and seasonal aerosol load and distribution at these scales. Nonetheless, significant CALIOP versus MODIS AOD discrepancies are still obtained, e.g., for the WCN West China dust region (DJF bias = +128% and SON bias = +74%), and for the NAT (MAM bias = −44%) and NWP (JJA bias = −54%) maritime regions. Such discrepancies, related both to MODIS (e.g., limited accuracy over deserts) and CALIOP (e.g., estimated lidar ratios, sparse sample) limitations, should be kept in mind for the analysis (Section 4.2) and discussion (Section 5) of the model results.
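The percentage biases quoted above are read here as relative differences with MODIS taken as the reference. That definition is an assumption, since the text does not state the formula, and the AOD values in the sketch are invented for illustration rather than taken from Tables 2a and 2b.

```python
def relative_bias_percent(aod_caliop, aod_modis):
    """Relative CALIOP bias (%) with respect to MODIS (assumed definition)."""
    return 100.0 * (aod_caliop - aod_modis) / aod_modis


# Invented example values, not taken from the study.
example_bias = relative_bias_percent(aod_caliop=0.15, aod_modis=0.20)
print(f"bias = {example_bias:+.0f}%")
```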
Table 2a. CALIOP (AODc) and MODIS (AODM) Mean Aerosol Optical Depth for the 0–10 km Altitude Range and the Year 2007: Individual Regions
Both CALIOP version 2.01 (v2.01) and 3.01 (v3.01) are reported and compared to MODIS, with the version with the best agreement highlighted in bold.
Table 2b. CALIOP Versus MODIS Mean Aerosol Optical Depth for the 0–10 km Altitude Range and the Year 2007: All Regions
Both CALIOP version 2.01 (v2.01) and 3.01 (v3.01) are reported and compared to MODIS, with the version with the best agreement highlighted in bold.
 The CALIOP annual “absolute” and “normalized” extinction profiles (mean and standard deviation from profiles of 2007, 2008 and 2009) are shown in Figures 4 and 5, respectively (auxiliary material Figure S3 contains SEA region results). The mean annual profiles are characterized by a decrease of the extinction coefficient magnitude (α) from the surface to about 5 km. Different typical shapes are observed according to the type of the region (source/downwind) and main aerosol type (maritime, industrial, dust, biomass burning). Quasi exponential and more linear annual mean vertical shapes are retrieved over industrial/maritime and over dust regions, respectively, whereas a convex mean profile is obtained over the African and South American biomass burning regions. A relatively low inter-annual variability, both in magnitude (Figure 4) and in shape (Figure 5) of the vertical extinction distribution is generally observed over the 3-yr period, except over the Eastern U.S. industrial (EUS) and over the South American biomass burning (SAM) source regions. In these regions, the standard deviation reaches up to 30% (at 2.8 km altitude) and 43% (at 4.0 km altitude) of the annual mean extinction, respectively.
 Higher CALIOP-derived Zα diagnostics are generally obtained in dust and biomass burning regions than in industrial source regions (Figure 5). This can be partly explained by the fact that the dust and, to a lesser extent, smoke aerosols are more prone to vertical transport up to mid and high tropospheric levels than industrial aerosols, which are found near or in the boundary layer [e.g., Kim et al., 2006]. This is particularly true in the African and South American tropical regions, where higher reaching convection leads to an enhanced upward transport of aerosol emissions compared to midlatitude regions [e.g., Andreae et al., 2001; Vernier et al., 2011].
 Seasonal “normalized” aerosol extinction profiles and Zα diagnostics (auxiliary material Figures S3 and S4) generally show low 2007–2009 inter-annual variability, except in the EUS, SAM and WCN regions. In the EUS and SAM regions, standard deviations of up to 51% (at 4.9 km altitude) and 64% (at 3.7 km altitude) are obtained during the season with the highest aerosol pollution (JJA and SON, respectively), possibly indicating higher variability due to wildfires [e.g., Langmann et al., 2009]. The particularly high inter-annual variability observed for the WCN region during the DJF season (Figure 6a) could be due both to its reduced size and to the high variability of the processes (wind gusts and cold pockets) responsible for the uplift of the dust particles. Our understanding of the climatology of the CALIOP-derived seasonal aerosol vertical distributions and mean extinction heights is further discussed hereafter.
4.1.1. EUS, WEU and NAT Source and Downwind Regions of Industrial Pollution
 Consistent with reported in situ observations [e.g., Cozic et al., 2008; Chazette et al., 2010; Veefkind et al., 2011], the atmospheric aerosol extinction is low or medium (α < 0.1 km−1 on annual average at the surface) in the midlatitude regions investigated, with a maximum in magnitude and altitude during the Northern Hemisphere (NH) spring/summer and summer seasons, respectively. The anti-cyclonic weather conditions that prevail in summer often lead to enhanced photo-oxidation of sulphate and accumulation of pollution [e.g., Goldstein et al., 2009], as well as higher PBL height and higher thermally driven vertical transport from the boundary layer to the free troposphere [Matthias et al., 2004]. Other factors are invoked to explain the summer maxima in magnitude and altitude in Southern European countries, such as local/regional meteorology (coastal re-circulations), and Saharan dust transport above the PBL [Sicard et al., 2011]. In the North Atlantic (NAT) region, downwind of the Eastern United States, a major fraction of the aerosol load is confined in the PBL (annual Zα = 1.2 km). However, the fraction of the aerosol extinction above the PBL is significantly increased during NH spring (Figure 6b) and summer (Figure 6c), which can be attributed to pollution outflows from the Eastern United States to the Atlantic [e.g., Fischer-Bruns et al., 2010], possibly merging with effluents from western North America [e.g., Li et al., 2005].
4.1.2. IND, ECN, WCN and NWP Asian Industrial and Dust Regions
 In the Eastern China (ECN), Western China (WCN) and Indian (IND) regions, the highest Zα is observed during the spring season, MAM (Figure 6b). This presumably reflects the impact of dust storms, which occur throughout the year in Asia but are much more frequent in the NH spring, when high surface winds occur and when the soil is relatively dry [e.g., Qian et al., 2002; Huang et al., 2010; Wang et al., 2010]. Previous studies on the Eastern China region [see Li et al., 2011, and references therein] clearly find a spring wind-blown dust transport into the free troposphere from the Taklamakan (Western China) and Gobi (Mongolia) deserts. In the northern part of this region, the dust layers during this season mix with pollution, which originates from local emissions [Wang et al., 2010]. In India, anthropogenic pollution (notably biomass combustion) also significantly contributes to the aerosol load during NH spring [e.g., Dey and Di Girolamo, 2010]. Downwind of China, the NWP region exhibits a low surface aerosol extinction (maximum of 0.08 km−1 in NH winter) and a pronounced seasonal variability of the mean extinction height. Consistent with previous lidar observations [e.g., Shimizu et al., 2004; Kim et al., 2007; Hayasaka et al., 2007; Nakajima et al., 2007; Yu et al., 2008; Chen et al., 2009], higher extinction heights are obtained during spring (Zα = 2.2 km) and summer (Zα = 1.8 km) than during the autumn and winter (Zα < 1.4 km) seasons. This reflects the downwind transport of dust [Tsay, 2009; Liu et al., 2010; Logan et al., 2010] and industrial (summer) pollution in the free troposphere from China down to the Pacific, with lower altitude for the industrial than for the dust pollution transport [Shimizu et al., 2004].
4.1.3. NAF, CAF and CAT North African and Downwind Regions
 The Saharan and Sahel regions contribute at least 50% to the global mass of dust emissions [e.g., Washington et al., 2003]. Both sandstorm activity and vertical mixing are at their maximum over the early NH spring-to-fall period, during which easterly winds prevail throughout levels reaching up to 6 km in summer [e.g., Schütz, 1977; Hsu et al., 1999; Washington et al., 2003]. This seasonality in observing dust at higher levels is depicted in the CALIOP measurements from the NAF region, with JJA Zα being twice that of DJF (Figures 6c and 6a, respectively). The advection of the Saharan desert mineral aerosol to the Atlantic in the layer between 2 and 5 km during NH summer is also clearly observed in Figure 6c (CAT region). Wong et al.  have suggested that this distinct dry and warm Saharan Air Layer is found over the cooler, more-humid surface air, stabilized by the strong temperature inversion which suppresses vertical exchange. The maximum of extinction during NH winter found within the marine PBL of the CAT region (auxiliary material Table S1) might reflect sea salt and an additional and lower layer transport of dust from the Sahel [Chiapello et al., 1995]. Similarly to the CAT region, the seasonality of aerosol particle extinction in the CAF region shows a JJA Zα maximum, as well as bimodal vertical distribution, with a second (more pronounced) peak around 3.5 km of altitude (Figure 6c). In this region and season, both the long-range transport of mineral dust from the Sahara and Sahel regions [e.g., Crumeyrolle, 2008; Crumeyrolle et al., 2011; Reeves et al., 2010] and the cross-hemispheric transport of biomass burning products from South Africa [e.g., Real et al., 2010] contribute to the aerosol load in the free troposphere. 
During the last months of the fire season, from January to March [Liousse et al., 2010], the large local and/or regional pollution by fire products is still found at elevated altitudes (Zα = 2.1 km), although lower than in NH summer (Zα = 2.4 km).
4.1.4. SAM, SAF, and SEA Southern Hemisphere Biomass Burning Regions
 A convex character of the extinction profiles is obtained from March to November in the tropical savannah and forest biomass burning regions of South Africa (SAF) and America (SAM) that is more pronounced during the peak biomass burning season (Figure 6d). It reflects the contamination of the troposphere by the injection at high altitude of the fires' products. In fact, deep convection locally induced by the large vegetation fires, i.e., the so-called pyro-convection, can transport the emissions up to the upper troposphere and lower stratosphere [e.g., Chen et al., 2009; Yu et al., 2008; Gonzi and Palmer, 2010]. Another potential factor contributing to aerosol at high altitudes is the formation of secondary inorganic and organic aerosol from the biomass burning gaseous products, during plume aging [e.g., Reid et al., 2005]. An autumn maximum, but a lower seasonal variability are calculated for the Zα diagnostic in the SEA region (auxiliary material Figure S3) compared to the other biomass burning regions. This region includes a larger fraction of marine area and experiences higher industrial aerosol emissions, which decrease the mean annual height of aerosol vertical distribution [e.g., Kaufman et al., 2002; Dentener et al., 2006].
 In summary, the main annual and seasonal regional patterns of CALIOP profiles are consistent with previous observations and knowledge about the seasonal patterns of aerosol emissions and transport. Most of the discrepancies highlighted by previous studies [Yu et al., 2010; Kacenelenbogen et al., 2011] compared to MODIS and ground lidar measurements are no longer apparent in this updated version of the CALIOP Layer product. Therefore, it offers a robust benchmark data set to be used as a reference for the evaluation of the global models in reproducing the vertical distribution of the tropospheric aerosol.
4.2. Comparison of Simulated and Observed Extinction Profiles
 The purpose of this section is to evaluate the ability of twelve global aerosol models to simulate the vertical distribution of the aerosol. In this first part that focuses on the AeroCom I simulations (see section 2.3 for description of the experiments), the mean absolute extinction profiles and the resulting AOD are therefore only briefly discussed (Section 4.2.1). The performance of the models (see Table 1 for description of the models and references) is then qualitatively and quantitatively evaluated from the comparison of the CALIOP and model derived “normalized” extinction profiles and Zα diagnostics (defined in section 4.2.2).
4.2.1. Mean Vertical Extinction Profiles and AOD Diagnostic
Figure 4 shows qualitatively that most of the models reproduce the observed variation in mean annual extinction regionally, with an increase from marine areas (e.g., NWP, NAT), to regions dominated by industrial, biomass burning, and dust (e.g., NAF, WCN) aerosols. As a result, significant regression coefficients (p > 95%) are obtained for 9 out of the 12 models when comparing simulated and MODIS AOD (Figure 7). However, for a given region a large inter-model range in mean annual extinction profiles is generally obtained that exceeds the standard deviation from the CALIOP profiles from 2007, 2008 and 2009. The extinction is underestimated over segments (IND, ECN, WCN, SAF), and over the whole (SAM) 0–4 km altitude range, by the twelve models, which causes an AOD underestimate by all but two (PNNL and UIO_CTM) models.
 At the seasonal scale (auxiliary material Table S1) negative biases in these same regions are also obtained for all, 11, 10 and 9 models, for the MAM, SON, DJF and JJA seasons, respectively. The ranking of the models is shown to strongly vary from one season to the other. But, six of them (PNNL, LSCE, MATCH, UIO_CTM, ECHAM-HAM, GOCART) perform better than the others (i.e., being among the three with lowest RMSD and/or highest correlation coefficients, for at least two out of the four seasons). A model AOD lower than the MODIS derived AOD was also reported for the GOCART model and the year 2000 over these selected regions by Yu et al. . Note, however, that different time periods, being potentially responsible for different emissions and transport patterns, are investigated in our study for simulations (2000 or climatic mean) and observations (2006–2009), which might also be responsible for some of the observed biases. Further MODIS AOD data analysis and model simulations, beyond the scope of this study, would be required to assess the impact of natural AOD inter-annual variability on the AeroCom phase I model performance in simulating the aerosol distributions over the CALIOP observation period.
4.2.2. Mean Normalized Vertical Extinction Profiles and Zα Diagnostic
 “Normalized” extinction profiles allow for a qualitative assessment of agreements between model and CALIOP retrieved shape (including the vertical amplitude) of the aerosol profiles. Respective Zα diagnostics allow for quantitative evaluation.
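As a minimal sketch of the Zα diagnostic, the extinction-weighted mean height can be written as Zα = Σ αᵢ zᵢ / Σ αᵢ over 100 m layers, evaluated up to 6 km or 10 km. The use of mid-layer altitudes, the function name, and the exponential test profile below are assumptions made for illustration only.

```python
import math


def mean_extinction_height(extinction_per_km, top_km, layer_thickness_km=0.1):
    """Extinction-weighted mean altitude (km) between the surface and `top_km`."""
    n_layers = min(len(extinction_per_km), round(top_km / layer_thickness_km))
    weighted = 0.0
    total = 0.0
    for i in range(n_layers):
        z_mid = (i + 0.5) * layer_thickness_km  # assumption: mid-layer altitude
        weighted += extinction_per_km[i] * z_mid
        total += extinction_per_km[i]
    if total == 0.0:
        raise ValueError("no aerosol extinction in the selected altitude range")
    return weighted / total


# Hypothetical exponential profile (scale height 2 km) on 100 layers of 100 m.
profile = [0.1 * math.exp(-((i + 0.5) * 0.1) / 2.0) for i in range(100)]
z_alpha_0_10 = mean_extinction_height(profile, top_km=10.0)
z_alpha_0_6 = mean_extinction_height(profile, top_km=6.0)
print(z_alpha_0_10, z_alpha_0_6)
```

Restricting the calculation to 0–6 km removes the contribution of the upper layers, so for a profile with non-zero extinction above 6 km the 0–6 km value is lower than the 0–10 km value.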
 Annually (Figure 5), a different picture is obtained compared to the absolute profiles (Figure 4). Whereas a particularly large inter-model range of extinction magnitude is simulated in the NAF and WCN dust regions, and in the IND industrial region, most of the models reproduce the shape of the vertical distribution over these regions. The models also replicate the exponential shape of the profile in WEU and ECN industrial regions. But, three of them (MATCH, MOZGN and LSCE) fail to reproduce the slope of the extinction decrease with altitude. The LSCE model also fails in the other industrial (EUS) and downwind regions (NAT, NWP) of the northern hemisphere, where it simulates an increase in extinction with altitude above 6 km not observed by CALIOP. Discrepancies between the observed and simulated shapes of the annual aerosol extinction profile are found for the NAT (e.g., MATCH, LSCE, ECHAM_HAM models), NWP (e.g., MATCH, LSCE, ECHAM_HAM, MOZGN, ARQM), and CAT (e.g., ECHAM_HAM) downwind of maritime regions. In the SAF and SAM biomass burning regions, less than half of the models reproduce the pronounced convex character of the mean annual profile. Large discrepancies are also obtained in the PBL of the CAF biomass burning region, where the shape of the profile in the free troposphere is however better simulated than in the SAF and SAM regions.
 The ability to reproduce the annual mean extinction height Zα across the thirteen regions strongly depends on the model (Figure 8a). While many models show a similar range in observed Zα, only three of them (GISS, ECHAM_HAM, UIO_CTM) capture the inter-regional variation and provide statistically significant correlations. Moreover, a general Zα over-estimation (up to +1 km) is obtained for all but one model (Table 3a). The largest over-estimations are obtained by models having the coarsest vertical resolution (namely the LMDZ-INCA and UIO_GCM models), which suggests that numerical diffusion may contribute to the unrealistically large aerosol load at high altitude in these models [e.g., Jacob et al., 1997]. This positive bias could also be related to the parameterization of convective transport and scavenging, e.g., to the fact that some AeroCom models simulate too little scavenging and thus too much transport into the upper troposphere as suggested also by Schwarz et al. . For the UIO_GCM model, some of this bias might be due to inherent biases in the prescribed background aerosol. The positive bias in Zα can also be due to a low bias in the CALIOP measurement at high altitude with respect to the real-world aerosol distribution, due both to the detection limit of the lidar and to its data processing. Additional data would be required to test the above assumptions on the model-CALIOP bias, as further discussed in Section 5. To free ourselves from these discrepancies at high altitude, and thus better evaluate the models in the altitude range where most of the aerosol load is concentrated, we re-calculated the simulated and observed Zα diagnostic over the 0–6 km altitude range only (Figure 8b and Table 3b). The agreement between Zα from all the studied models and CALIOP is significantly improved, i.e., the RMSE in model mean is reduced by about 60%. 
Significant positive biases (>100 m) are now only obtained for two models (LSCE and MATCH), whereas half of them show significant correlations with CALIOP-derived Zα across regions.
Table 3a. Observed (CALIOP v3.01) and Simulated (AeroCom) Mean Annual Zα (km) Over the 13 Selected Regions, for the 0–10 km Altitude Range
Corresponding biases, RMSE and r correlation coefficients are reported. Significant correlations (p > 95%) are highlighted in bold.
Table 3b. Observed (CALIOP v3.01) and Simulated (AeroCom) Mean Annual Zα (km) Over the 13 Selected Regions, for the 0–6 km Altitude Range
Corresponding biases, RMSE and r correlation coefficients are reported. Significant correlations (p > 95%) are highlighted in bold.
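The statistics reported in Tables 3a and 3b can be sketched as follows, treating the regions as independent data points. The definitions (mean difference for the bias, root-mean-square error, Pearson correlation), the function names, and the sample values are assumptions for illustration; they do not reproduce any value from the tables.

```python
import math


def mean_bias(simulated, observed):
    """Mean of (simulated - observed) over all regions."""
    return sum(s - o for s, o in zip(simulated, observed)) / len(observed)


def rmse(simulated, observed):
    """Root-mean-square error between simulated and observed values."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed))


def pearson_r(simulated, observed):
    """Pearson linear correlation coefficient."""
    n = len(observed)
    mean_s = sum(simulated) / n
    mean_o = sum(observed) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(simulated, observed))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_s * var_o)


# Hypothetical regional Z_alpha values (km) for one model and for the observations.
z_obs = [1.2, 1.6, 2.1, 2.4]
z_sim = [1.5, 1.8, 2.5, 2.6]
print(mean_bias(z_sim, z_obs), rmse(z_sim, z_obs), pearson_r(z_sim, z_obs))
```

A model can show a high correlation together with a large positive bias, which is why the three statistics are reported side by side.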
 The seasonal Zα model mean bias and inter-model range in this bias over the 0–6 km altitude range are compared to CALIOP observations in Figure 9. It depicts an over-estimation of Zα for most regions in NH winter (11 out of 13 regions) and autumn (9 regions), whereas a less systematic mean bias is seen in spring and summer (regional bias −21% to +21% and −22% to +30%, respectively). During these seasons, models tend to underestimate the mean aerosol height over the ECN, NAF, WCN, and CAF regions, while they do so only in spring over the IND and NWP regions and only in summer over the SAF region.
 The quantitative Zα diagnostic allows revisiting the comparison of the dust above Northern Africa and in Central Atlantic in NH summer (section 4.1.3). Despite a general Zα negative bias over the NAF region, simulated Zα is too high over the CAT region for most of the models. This may be due to missing aerosol extinction in the maritime boundary layer in CAT, which induces an overall over-estimation of Zα, despite an under-estimation of the altitude of Saharan air layer in the mid troposphere (Figure 6c). This example illustrates the limitations of the use of mean diagnostics and the associated risk of compensating effects, as further discussed in Section 5.2.
 The lowest inter-model ranges in Zα bias are found for the NAF (MAM: 23% and JJA: 23%) and WCN (MAM: 31% and JJA: 23%) dust regions, followed by the Indian IND (MAM: 28%), and the Central African CAF (MAM: 37%) regions. High inter-model ranges and high biases are obtained over the Eastern U.S. (EUS) and the downwind North Atlantic (NAT) regions in all seasons. A high inter-model range (121%) is also simulated for the WCN region during the DJF months (i.e., for the region and season that show the highest 2007–2009 CALIOP-derived Zα inter-annual range, Figure 6a). The difficulty that models and observations encounter in resolving and sampling sub-grid scale processes (e.g., wind gusts) and their variability over this relatively small region might partly explain this.
 Individual model performance simulating the seasonality of the aerosol particle vertical distribution over the thirteen regions is briefly discussed hereafter by looking at the season with the highest mean extinction height Zα (auxiliary material Figure S4): All twelve models reproduce CALIOP Zα maximum in NH summer for the North African dust region, and ten of them show high dust layers over the Atlantic Ocean. Due to the prescribed sea-salt and dust background aerosol in UIO_GCM, this model displays a quite weak seasonal variability in both the NAF and CAT regions. The high Zα obtained, both in JJA and MAM seasons, over the Western China is also well captured by all but two (ARQM and LSCE) models. Ten models also reproduce a summer maximum over Western Europe, whereas two of them (GOCART and SPRINTARS) simulate a maximum in Zα during NH spring. Fewer models agree with the observed seasonality in Zα over the other industrial regions, i.e., in the Eastern U.S. (9 models), Eastern China (6 models) and India (5 models). Consistently, only nine and five models agree with the observed seasonality in the North Atlantic and the North West Pacific downwind regions, respectively. Over the SAF and SAM biomass burning regions, most of the models (i.e., nine and eight, respectively) simulate the observed Zα peak in the SON season, whereas only five of them do so over the SEA region (auxiliary material Figure S3). While the ranking of the models is shown to strongly depend on the region and season, two of them (GISS and UIO_CTM) show a better ability to reproduce the typical aerosol height (i.e., by reproducing the observed Zα seasonal peak over each studied region).
 Our study reveals that several models simulate a higher mean extinction height of the aerosol than observed over many regions, notably over the Atlantic and Pacific maritime regions downwind of the continents, whereas over the African and Chinese dust source regions, Zα is underestimated during the spring and summer seasons. This latter negative bias could be due to the fact that phenomena such as cold pools (so-called wakes) and wind gusts that play an important role in the upward propagation of dust particles in these regions are not captured in the models [e.g., Grandpeix and Lafore, 2010]. The positive bias was shown to be due to a large extent to an over-estimation of aerosol extinction above 6 km altitude. Among possible causes, we suggest too effective vertical mixing (e.g., due to the convection scheme and/or due to a low vertical resolution of the model), or too little removal of aerosol and precursor gases especially in the lower troposphere, which would induce a too long aerosol life time and subsequent too large upward and long-range transports.
 Such model limitations have been reported in previous studies. Textor et al.  show that large differences exist among the AeroCom models for aerosol dispersal, both in the vertical and in the horizontal direction, with higher diversities horizontally than vertically, and for SS and DU aerosols than for the other species. While differences in aerosol composition, and thus in extinction, among models [e.g., Kinne et al., 2006], or simply inherent biases in the prescribed background aerosols for given models (e.g., for UIO-GCM) might impact the results, Textor et al.  identify the differences in the simulated transport and removal processes as the major cause for the inter-model variability in the global aerosol 2D distribution. These conclusions are further corroborated with the CALIOP profile product we have developed and applied here. Following Textor et al. , we compare our results from the AeroCom A experiment (in which models used their own emission data sets), with AeroCom B simulations (in which models used Dentener et al.  emissions for the year 2000). Although we do not explore the full range of emission uncertainty with this comparison, we find that the simulated Zα is generally also only slightly modified by the use of harmonized emissions and injection heights (auxiliary material Figure S4). Thus, there are important inter-model differences in the simulated processes (transport, removal, chemistry and microphysics).
 More recently, Schwarz et al.  show with new aircraft measurements that refractory particles (interpreted as black carbon) are generally over-estimated at altitude in remote regions by most of the AeroCom models, and suggest insufficient removal by convective precipitation to be responsible. Over the Atlantic and Pacific oceans, downwind of the continents, surface dust concentrations were shown to be under-estimated by most of the AeroCom I models [Huneeus et al., 2011], which is consistent with our results showing a significant positive Zα bias in these regions. Such is the case for the Zhang et al.  study, which revealed that the NAAPS global aerosol mass transport model has a negative AOD bias and a positive bias in mean extinction height over both land and maritime AERONET sites in the June–July 2007 period. The authors show that assimilating CALIOP-derived aerosol extinction profiles, together with two-dimensional MODIS and MISR AOD, in the model improves the simulation of both aerosol features by redistributing further aerosol mass into the model boundary layer.
 In addition to the model uncertainties, some of the discrepancies highlighted in the present study could be due to inherent limitations and uncertainties in the CALIOP measurements and aerosol retrieval. The CALIOP lidar has a detection limit to diffuse aerosol, particularly in the free troposphere or at high latitudes [e.g., Winker et al., 2009]. The detection limit at night for the 5 km CALIOP Layer product is estimated to be between 0.010 and 0.015 km−1 [D. Winker, personal communication, August 2011]. In our study, atmospheric layers with no detected aerosol are assumed to have zero aerosol extinction. Previous studies based on the LITE (Lidar In-space Technology Experiment) and on SAGE (Stratospheric Aerosol and Gas Experiment) global satellite data sets [Kent et al., 1993, 1998] show aerosol background extinctions mainly ranging between 0.001 and 0.005 km−1 on average in the upper troposphere (from 6 km to the tropopause) of the Southern Hemisphere. In order to test the impact of a possible presence of background aerosol below the detection limit, we conducted a sensitivity test by assuming 0.001 km−1 and 0.005 km−1 extinction over atmospheric layers where no aerosol layer was detected by the CALIOP retrieval algorithm. The results indicate that, as expected, prescribing a background extinction of 0.001 km−1 reduces the positive Zα biases over the 0–10 km altitude range, notably for the NAT and NWP maritime regions, and for the SEA region, leading to a 47% decrease in the mean model bias over the 13 regions (i.e., from +0.38 km to +0.20 km). However, it also reduces the performance of several models, and notably the three models that showed the best agreement when no correction was introduced (namely, GISS, ECHAM_HAM and UIO_CTM), which then provide larger Zα (negative) biases, and less correlation with observations. 
In the case of the 0.005 km−1 extinction threshold, no more significant correlation with CALIOP but a general negative bias is obtained for all but one (LSCE) model.
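The principle of this sensitivity test can be sketched as below: layers without detected aerosol (represented here by zero extinction) are filled with a prescribed background value before Zα is recomputed over 0–10 km. The idealized profile, the mid-layer altitudes, and the function names are assumptions for illustration; only the background values (0.001 and 0.005 km−1) and the bias figures (+0.38 km and +0.20 km) come from the text.

```python
def mean_extinction_height(extinction_per_km, layer_thickness_km=0.1):
    """Extinction-weighted mean altitude (km), using mid-layer altitudes (assumption)."""
    weighted = sum(a * (i + 0.5) * layer_thickness_km
                   for i, a in enumerate(extinction_per_km))
    return weighted / sum(extinction_per_km)


def fill_background(extinction_per_km, background_per_km):
    """Replace zero-extinction (undetected) layers with a background extinction."""
    return [a if a > 0.0 else background_per_km for a in extinction_per_km]


# Idealized observed profile: aerosol detected only in the lowest 1 km (0-10 km grid).
observed = [0.1] * 10 + [0.0] * 90

z_no_fill = mean_extinction_height(observed)
z_fill_001 = mean_extinction_height(fill_background(observed, 0.001))
z_fill_005 = mean_extinction_height(fill_background(observed, 0.005))
print(z_no_fill, z_fill_001, z_fill_005)

# Relative decrease of the mean model bias quoted in the text (+0.38 km to +0.20 km).
percent_decrease = 100.0 * (0.38 - 0.20) / 0.38
print(round(percent_decrease))
```

Because the filled layers lie mostly at high altitude, the observed Zα increases with the prescribed background, which lowers a positive model bias and can turn it negative for larger background values.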
 Further analyses (e.g., based on the SAGE global 2D distributions of the aerosol background extinctions), as well as more detailed information on the CALIOP detection limit, resulting from both the measurement and the aerosol layer retrieval algorithm, are needed to better characterize and constrain model error in different low polluted environments/altitudes. Excluding the regions with no detected aerosols (e.g., with a CALIOP simulator) would create a different regional average vertical profile and would not allow for the type of comparison presented here. Models would need to reproduce the exact three-dimensional location of the aerosol plumes, which is even more difficult and would introduce more error in the lower troposphere; that part of the column that we attempted to characterize with the current methodology. Important discrepancies have been also highlighted at seasonal scale between CALIOP and MODIS derived optical depths, in the three WCN, NAT and NWP regions, with the highest differences obtained for the smallest region, WCN. This suggests the need for further analyses to better assess the representativeness of CALIOP measurements when going to relatively small temporal and spatial scales.
 Methodological factors also limit the assessment of the uncertainties on the simulated aerosol vertical distribution. The fact that we use different time periods (i.e., different emissions and transport patterns) for simulations (2000 or climatic mean) and CALIOP measurements (2006–2009) could be responsible for some of the differences that cannot be assessed in the present study. The effect of removing daytime data has been evaluated for the year 2007. Similar aerosol vertical distributions and relatively small Zα differences (<10% for 9 regions, and <17% for the 4 other ones) are obtained between CALIOP 24h and CALIOP nighttime mean annual profiles. While additional analysis would be necessary to further assess the impact of the nighttime screening, it could therefore only partly explain the model biases, which range from 27% to 116% of CALIOP annual Zα, according to the model. CALIOP AOD retrievals have been seen to exhibit regional biases with respect to MODIS, and biases of opposite sign in adjacent land and ocean regions sometimes occur [e.g., Kittaka et al., 2011]. While there is no fundamental problem with combining land and ocean regions into an average, this might introduce a bias in any comparison. While the characteristic height of extinction, Zα, established for each region provides a useful and simple measure to evaluate the performance of the models, it can neither capture complex multilayer patterns nor identify related combinations of positive and negative biases and their compensating effects. This is notably the case for multimodal vertical distributions such as the ones observed from our mean extinction profiles over the Atlantic Ocean and Northern Africa in NH summer. A further quantitative evaluation step could be to assess not only the mean altitude, but also the altitudes and magnitudes of the different modes in the profile, if several exist. 
Our approach based on seasonal and sub-continental averaging does not permit assessing either the models or the ability of CALIOP to reproduce the intraregional diversity of the vertical aerosol distribution (e.g., the significant differences that exist within Europe [Guibert et al., 2005]). Depending on the CALIOP sampling, this might introduce significant uncertainties that have to be further estimated.
 The methodology described in the present study to build a CALIOP product, which is compatible with model outputs, is currently being applied to the evaluation of the AeroCom phase II simulations in preparation of the AR5. The latter employ updated model versions (with respect to atmospheric dynamics, physics, and aerosols), and adequately cover the CALIOP 2007–2009 measurement period. A global monthly gridded (1° × 1°) 3D CALIOP database was built that will allow a better assessment of the models' uncertainties and discussion of potential limitations of the CALIOP data. The data set also discriminates between total and dust aerosol, which will provide additional constraints in the assessment of the size-resolved aerosol simulation.
6. Summary and Conclusions
 The CALIOP Layer Product 3.01 nighttime data at 532 nm were used to evaluate the aerosol vertical distribution simulated by twelve global aerosol models over thirteen sub-continental regions representative of industrial, dust and biomass burning pollution, or located downwind from continental source regions. In a first step, the reliability of the CALIOP-derived mean extinction profiles has been evaluated through various analyses. Based on current knowledge, the CALIOP spatial and temporal coverage was shown to provide a representative signal of the aerosol vertical extinction at seasonal and sub-continental scales and our processing to produce a robust benchmark data set for the evaluation of the global models in reproducing its climatology. Observed and simulated annual and seasonal average aerosol extinction profiles were then compared over the selected regions. To quantitatively assess the performance of the models, the simulated AOD and mean extinction height Zα were calculated and compared to MODIS and CALIOP respective diagnostics, based on linear regressions, biases, and RMSD statistical analyses.
 While a large inter-model range of the simulated AOD is obtained over most regions, all models simulate a lower AOD than that retrieved from satellite, especially over regions with a large aerosol load. The vertical shape of the normalized model profiles compares better with CALIOP, indicating that the vertical aerosol dispersion, at least in the lower troposphere up to 6 km, is captured well by most of the models. On the other hand, the accuracy of the shape of the profile depends strongly on the model, the season, and the region. Based on the characteristic extinction height Zα, established for each region and season over the 0–10 km altitude range, several models reproduce the inter-regional diversity, with Zα increasing from marine and industrial to dust and biomass burning regions. In many regions, however, notably over the Atlantic and Pacific downwind maritime regions, most models simulate a higher mean aerosol height than observed, due to a higher fraction of the aerosol extinction above 6 km altitude. Possible causal factors include overly efficient vertical mixing (due, for instance, to a low model vertical resolution and/or to strong vertical transport by the convection scheme), too little removal in the lower troposphere, which would lead to an overly long aerosol lifetime, and/or simply biases in the prescribed background aerosol (for UIO-GCM). An underestimation of the aerosol extinction in the upper part of the troposphere by the CALIOP retrieval algorithm could also contribute to such discrepancies.
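 The two quantities discussed in this paragraph, the normalized profile shape and the fraction of extinction above 6 km, can be sketched as follows. This is a hedged illustration only: it assumes equally thick layers (so that a sum over layers is proportional to the column integral), and the function names and the 6 km and 10 km defaults are assumptions made for this example rather than the code used in the study.

```python
import numpy as np


def normalize_profile(ext):
    """Scale a profile so that its layer values sum to 1 (shape only)."""
    a = np.asarray(ext, dtype=float)
    total = a.sum()
    if total <= 0.0:
        return np.full_like(a, np.nan)  # undefined shape without signal
    return a / total


def fraction_above(z_km, ext, z_split_km=6.0, z_max_km=10.0):
    """Fraction of the 0..z_max_km extinction located above z_split_km.

    Assumes equally thick layers identified by their mid-point altitudes.
    """
    z = np.asarray(z_km, dtype=float)
    a = np.asarray(ext, dtype=float)
    in_range = z <= z_max_km
    total = a[in_range].sum()
    if total <= 0.0:
        return float("nan")
    return float(a[in_range & (z > z_split_km)].sum() / total)


# Hypothetical 100 m layers from 0 to 10 km (mid-points 0.05, ..., 9.95 km)
z_mid = 0.05 + 0.1 * np.arange(100)

# Two synthetic profiles for illustration: the second decays more slowly
# with height, mimicking a model that places more extinction aloft.
ext_shallow = np.exp(-z_mid / 1.5)
ext_deep = np.exp(-z_mid / 3.0)

frac_shallow = fraction_above(z_mid, ext_shallow)
frac_deep = fraction_above(z_mid, ext_deep)
```

 Because the normalization removes the dependence on total AOD, two profiles with very different column loads can still be compared in terms of shape and of the share of extinction found above 6 km.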
 Our study documents the performance of the individual models as a function of season and region, and compares CALIOP version 2.01 and version 3.01 aerosol profiles. Both will help the different modeling teams to revisit the results of previous studies (e.g., AeroCom studies and studies based on the CALIOP Layer Product version 2) and to better understand the factors responsible for the models' biases. In a forthcoming paper, it will also serve to assess the impact of the models' improvements on the quality of the simulated aerosol vertical distribution, based on the evaluation of the AeroCom II simulations currently being analyzed in preparation for the AR5.
 The authors would like to thank the three reviewers for their valuable comments and suggestions, which significantly improved the quality of the manuscript. We thank the ICARE Data and Services Center for providing access to the CALIOP CNES/NASA data used in this study and for providing computing access and support. We would also like to thank Stefan Kinne (MPIM, Germany) and Christiane Textor (previously at LSCE, France) for their important contribution to the development and maintenance of the AeroCom tool and website (http://aerocom.met.no/cgi-bin/aerocom/lidar_annualrs.pl). We are grateful to Øyvind Seland, who was a central developer of the UIO-GCM model. We also acknowledge Cecilia Garrec for the English revision and her general comments on the text. This work was supported by the French space agency CNES (Centre National des Etudes Spatiales) and by the Infrastructure for the European Network for Earth System Modeling (IS-ENES) European Union project (agreement 228203). S. Ghan and R. Easter were funded by the U.S. Department of Energy, Office of Science, Scientific Discovery through Advanced Computing (SciDAC) program. The Pacific Northwest National Laboratory is operated for DOE by Battelle Memorial Institute under contract DE-AC06-76RLO 1830.