The planetary boundary layer (PBL) mediates exchanges of energy, moisture, momentum, carbon, and pollutants between the surface and the atmosphere. This paper is a first step in producing a space-based estimate of PBL depth that can be used to compare with and evaluate model-based PBL depth retrievals, inform boundary layer studies, and improve understanding of the above processes. In clear sky conditions, space-borne lidar backscatter is frequently affected by atmospheric properties near the PBL top. Spatial patterns of 5-year mean mid-day summertime PBL depths over North America were estimated from the CALIPSO lidar backscatter and are generally consistent with model reanalyses and AMDAR (Aircraft Meteorological DAta Reporting) estimates. The rate of retrieval is greatest over the subtropical oceans (near 100%) where overlying subsidence limits optically thick clouds from growing and attenuating the lidar signal. The general retrieval rate over land is around 50% with decreased rates over the Southwestern United States and regions with high rates of convection. The lidar-based estimates of PBL depth tend to be shallower than aircraft estimates in coastal areas. Compared to reanalysis products, lidar PBL depths are greater over the oceans and areas of the boreal forest and shallower over the arid and semiarid regions of North America.
 The planetary boundary layer (PBL) is the turbulent layer closest to the Earth's surface with a depth of about 1–2 km at midday and is crucial to many aspects of weather and climate. The PBL mediates exchanges of energy, moisture, momentum, carbon, and pollutants between the surface and the overlying atmosphere. PBL processes also influence the production of clouds, which modify the radiation budget through their effects on short and longwave radiation [Stull, 1988]. The inversion at the top of the PBL acts as a barrier to surface-emitted pollutants, leading to high concentrations within the PBL [Stull, 1988], so diagnosing the depth of the PBL is critical to air quality studies as well as weather prediction and climate.
 Extremely high vertical resolution (on the order of a few meters, depending on application) is needed for proper representation of surface layer ventilation and turbulent entrainment at the top of the PBL, but is only required over very thin spatially and temporally changing portions of the atmospheric column in the surface and entrainment layers. Adding hundreds of vertical levels to a model in anticipation of the need to resolve strong gradients in temperature and turbulence at the (unknown) inversion height would be excessive and computationally expensive over the rest of the boundary layer and free troposphere. Therefore, small-scale processes that control PBL development are unresolved by many models [Ayotte et al., 1996; Gerbig et al., 2003; McGrath-Spangler and Denning, 2010].
 Multiple methods are used to determine PBL depth, and they often give different results [Seidel et al., 2010]. Model products, such as those from the Modern Era Retrospective-analysis for Research and Applications (MERRA) and North American Regional Reanalysis (NARR), are sensitive to empirical parameters in addition to the diagnostic method chosen, and direct observations of PBL depth with which to verify them are sparse [Seibert et al., 2000; Jordan et al., 2010]. The NARR reanalysis product determines the PBL depth using the TKE (turbulent kinetic energy) method, which identifies the PBL height as the height at which the TKE drops below a threshold value. In the MERRA reanalysis product, the PBL height is defined as the lowest level at which the heat diffusivity drops below a threshold value. The European Centre for Medium-Range Weather Forecasts (ECMWF) instead identifies the PBL height as the level at which the bulk Richardson number reaches its critical value [Palm et al., 2005].
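The threshold-based diagnostics described above share a common structure, which can be sketched as follows. This is a minimal illustration, not the reanalysis code: the function name, the sample TKE profile, and the threshold value are all hypothetical, and the profile is assumed to be ordered from the surface upward.

```python
def pbl_height_from_threshold(heights_m, values, threshold):
    """Return the lowest height at which `values` drops below `threshold`.

    Both lists are ordered from the surface upward. Returns None if the
    profile never drops below the threshold.
    """
    for height, value in zip(heights_m, values):
        if value < threshold:
            return height
    return None


# Hypothetical TKE profile (m^2 s^-2); the threshold is illustrative only
# and is not the value used by NARR or MERRA.
heights_m = [50, 250, 500, 1000, 1500, 2000, 2500]
tke = [1.2, 1.0, 0.9, 0.6, 0.3, 0.05, 0.01]
print(pbl_height_from_threshold(heights_m, tke, threshold=0.1))
```

The same function applies to a heat diffusivity profile; only the input variable and threshold change, which is one reason the diagnosed depth is sensitive to the empirical parameters chosen.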
 Comparisons between the PBL depth (above ground) products provided by the MERRA and NARR data sets (Figure 1) show qualitatively similar results. Figure 1 shows PBL depths provided by MERRA (2/3° × 1/2° resolution) and NARR (32 km resolution) under mostly sunny conditions, selected by restricting modeled total cloud cover to less than 10%. This analysis examines summertime conditions from 2006–2010 at 3 P.M. local time in the MERRA analysis and from 2 P.M. to 4 P.M. in the NARR. The two scenes are not temporally identical, but each provides a climatology of PBL depth for its product. In summer at midday, the deepest PBL depths in both products occur over the drier western United States, a relatively deep PBL is found along the coast of the Gulf of Mexico, and shallow PBL depths are present over the U.S. Midwest. Quantitatively, however, the analyzed PBL depths differ considerably. For instance, over the western United States, the NARR data set commonly finds a PBL over 3 km while the value for MERRA is closer to 2.5 km.
 Although PBL depth is important, no observation-based global PBL climatology exists [Seidel et al., 2010] and there are many problems with simulating the processes involved [Martins et al., 2010]. It is difficult to observe PBL depth at large scales [Randall et al., 1998] and to observe fluxes and processes at the top of the PBL because of its height and variable location. As a result, turbulent entrainment at the PBL top is among the weakest aspects of PBL models [Ayotte et al., 1996; Davis et al., 1997].
 Radiosondes that could be used to observe PBL processes and depth are launched in the morning and evening over North America (0 and 12 UTC), times that are poorly suited to evaluating the daytime maximum PBL depth. In addition, estimates made from radiosondes may differ from the space/time average by up to 40% [Stull, 1988; Angevine et al., 1994; White et al., 1999], producing uncertainty when compared to model simulations. Stull specifically recommends not determining PBL depth from a single rawinsonde for this reason. Seidel et al. examined radiosonde observations using seven different methods to determine PBL depth. These methods are described in more detail in their paper, but they include (1) the height at which the virtual potential temperature matches the surface value, (2) the level of the maximum vertical potential temperature gradient, and (3) the base of an elevated temperature inversion. They found that all but one of the methods produced similar results over Lerwick, UK in February 2007, although the one that did not agree was substantially different. However, at that same station in December 2006, five different values of PBL depth were retrieved, differing by more than a factor of 10. These concerns complicate the use of radiosondes as a reference against which to compare other PBL depth estimates.
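To illustrate how different definitions can yield different depths from a single sounding, the following sketch applies methods (1) and (3) to one profile. It is a simplified illustration under stated assumptions, not the implementation of Seidel et al.: the function names and the sounding values are hypothetical, and linear interpolation between levels is an assumption.

```python
def parcel_method_height(heights_m, theta_v_k):
    """Method (1): height where theta_v first returns to its surface value."""
    surface = theta_v_k[0]
    for i in range(1, len(heights_m)):
        below, above = theta_v_k[i - 1], theta_v_k[i]
        if below < surface <= above:
            # Linear interpolation between the two bracketing levels.
            frac = (surface - below) / (above - below)
            return heights_m[i - 1] + frac * (heights_m[i] - heights_m[i - 1])
    return None


def elevated_inversion_base(heights_m, temperature_c):
    """Method (3): lowest level above the surface where temperature rises."""
    for i in range(1, len(heights_m) - 1):
        if temperature_c[i + 1] > temperature_c[i]:
            return heights_m[i]
    return None


# Hypothetical midday sounding; values are illustrative only.
heights_m = [0, 100, 500, 1000, 1200, 1500]
theta_v_k = [303.0, 301.0, 301.0, 301.2, 304.0, 306.0]
temperature_c = [30.0, 29.0, 25.1, 20.2, 21.5, 19.0]

print(parcel_method_height(heights_m, theta_v_k))
print(elevated_inversion_base(heights_m, temperature_c))
```

For this sounding the two definitions differ by more than 100 m even though both respond to the same capping inversion.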
 There have been few, limited scale studies that have examined PBL processes using space-based remote sensing in the past [Martins et al., 2010]. The Lidar-In-space Technology Experiment (LITE) flew for 9 days in September 1994, identifying the PBL top by locating a sharp aerosol gradient [Randall et al., 1998]. The Geoscience Laser Altimeter System (GLAS) had limited success making observations of PBL depth as well [Palm et al., 2005]. Palm et al. examined PBL depth over the oceans for October 2003 and found the derived depth from GLAS to be 200–500 m deeper than that from the European Centre for Medium-Range Weather Forecasts.
 In addition, multiple studies have used ground-based and airborne lidar to determine PBL depth with good results. Davis et al. [1997, 2000] and Brooks used airborne lidars to develop automated methods using wavelets to derive PBL depth for specific field campaigns and surface/atmosphere conditions. Wiegner et al. used ground-based lidar as the reference for the depth of the mixing layer using the mixing layer definition of an “abundance” of aerosols. Mattis et al. used a network of ground-based lidar to identify the PBL top in order to identify free-tropospheric aerosols. Cohn and Angevine found good comparisons between two ground-based lidars and a wind profiler deployed during the Flatland96 Lidars in Flat Terrain experiment. White et al. found good agreement between wind profilers and airborne lidar in Tennessee during the 1995 Southern Oxidants Study (SOS95). Wind profilers determine the PBL depth by examining the refractive index structure function parameter, which should peak at the boundary layer capping inversion due to fluctuations in water vapor and temperature [Wyngaard and LeMone, 1980; Yi et al., 2001].
 Comparisons of these measurements to radiosondes and other remote sensing methods depend upon the definition of the PBL depth used and the instrument used to retrieve it [Seibert et al., 2000; Wiegner et al., 2006; Mattis et al., 2008; Seidel et al., 2010]. Furthermore, PBL depth measurements from lidar are generally slightly deeper than those derived from temperature profiles since convective plumes transport aerosols above the base of the inversion [Beyrich, 1997]. This means that the retrieved depth will not provide an estimate of the temperature inversion, but rather the height to which aerosols and pollutants are lofted. This can differ from the height of the temperature inversion by as much as the depth of the entrainment zone (40% of the depth of the mixing layer [Stull, 1988]) and must be considered in the context of the desired application (e.g., examining vertical lofting of pollutants versus estimating the height of neutral buoyancy).
 The Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) satellite has the potential to expand the available data tremendously. Such a data set will give modelers observations against which to compare PBL depth estimates and thereby improve simulations of PBL processes. This paper is a first step in producing such a data set using space-based lidar. The advantage of orbital lidar is its ability to provide near-global coverage, irrespective of political and land/water boundaries, which is important for model validation and data assimilation [Palm et al., 2005]. The following section provides a description of a method to determine PBL depth from the CALIPSO satellite. Section 3 describes the results using this method during the summer over North America. Section 4 compares this data set to other observations and the final section offers a brief conclusion.
 The Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) aboard the CALIPSO satellite is the first space-based lidar optimized for aerosol and cloud measurements and the first polarization lidar in space [Winker et al., 2007]. The lidar backscatter data are recorded at 532 nm (parallel and perpendicular polarization) and at 1064 nm. CALIPSO is part of NASA's Afternoon constellation (A-train) of satellites and is in a 705 km sun-synchronous polar orbit with an equator crossing time of about 1:30 P.M. local solar time and a 16-day repeat cycle [Winker et al., 2007, 2009]. The products used in this analysis are the Version 3.01 Level 1B data available online (http://eosweb.larc.nasa.gov/PRO-DOCS/calipso/table_calipso.html). The attenuated backscatter retrieved by the lidar is available at 30 m vertical and 0.33 km horizontal grid intervals from 0–8 km altitude and at reduced resolution above 8 km.
 The maximum variance technique developed by Jordan et al., and used to evaluate PBL depth during two months in 2006, is used here to derive estimates of PBL depth from the CALIOP lidar 532 nm attenuated backscatter data. This technique is based on an idea by Melfi et al. that at the top of the PBL there exists a maximum in the vertical standard deviation of lidar backscatter. This maximum in the standard deviation exists because within the entrainment zone, in clear conditions, turbulent boundary layer eddies mix aerosol-laden air with cleaner free tropospheric air. This mixture of clear and dirty air produces a large standard deviation in the backscatter [Jordan et al., 2010]. In conditions with boundary layer clouds, a maximum in the standard deviation occurs either within or just above the cloud, depending on the specific conditions. The difference in estimates depends on the thickness of the cloud.
 The Jordan et al. technique examines the vertical profile of retrieved backscatter beginning at the surface, and searches for the first (lowest in altitude) occurrence of a maximum in the vertical standard deviation (calculated over four adjacent altitude bins) collocated with a maximum in the magnitude of the backscatter itself, often identifying the mid-level or top of boundary layer clouds. The level of the maximum in the standard deviation and backscatter is well correlated with the top of the PBL because the conditions that affect lidar backscatter (e.g., jumps in temperature, relative humidity, aerosol concentration, etc.) are frequently associated with the PBL top. Since this technique would identify the residual layer at night and not the nocturnal boundary layer, it is applied only to daytime satellite passes. Jordan et al. found that this technique compared favorably to ground-based lidar and radiosonde data at the University of Maryland Baltimore County.
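The search described above can be sketched in a few lines. This is a simplified illustration and not the code of Jordan et al.: the function names, the toy profile, and in particular the collocation rule (a backscatter peak must fall inside a four-bin window whose standard deviation is a local maximum) are assumptions made for the example. Real profiles are noisy and would need the additional screening discussed in the text.

```python
import statistics

BIN_M = 30.0  # vertical bin spacing of the backscatter profile, in meters


def local_maxima(values):
    """Indices of strict local maxima."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


def feature_top_index(backscatter, window=4):
    """Lowest backscatter peak lying inside a window of peak variability."""
    # std[j] is the standard deviation of bins j .. j + window - 1.
    std = [statistics.pstdev(backscatter[j:j + window])
           for j in range(len(backscatter) - window + 1)]
    std_peaks = local_maxima(std)
    for i in local_maxima(backscatter):  # ordered from the surface upward
        if any(j <= i < j + window for j in std_peaks):
            return i
    return None


# Hypothetical profile (arbitrary units); index 0 is the lowest bin.
profile = [4, 4, 4, 4, 4, 4, 5, 7, 2, 1, 1, 1, 1, 1]
top = feature_top_index(profile)
print(top, top * BIN_M)  # bin index and its height above the lowest bin
```

Because the backscatter peaks are visited from the lowest bin upward, the first peak that passes the collocation test is returned, mirroring the "first (lowest in altitude) occurrence" rule.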
Jordan et al. used visual inspection to evaluate their CALIPSO-based PBL retrieval. We have automated the algorithm in order to process a larger subset of the available data. In order to do this, we made several modifications. We restricted the retrieved daytime depths to between 0.25 and 5 km above the ground surface and added a check for surface backscatter to eliminate surface noise and profiles without a clear aerosol signature within a reasonable height range for the midday PBL. Profiles containing large signal attenuation due to clouds were not analyzed and were instead assigned a missing value. These missing values were defined by the occurrence of three vertically consecutive layers with a 1064 nm backscatter value exceeding $10^{-2.25}$ km$^{-1}$ sr$^{-1}$. The 1064 nm data were used to identify cloud layers following Okamoto et al. Cloud-topped boundary layers were of interest, so the feature top was first determined, and then the search for attenuating clouds began 750 m above the feature and continued to the top of the profile. This allows us to estimate the depths of both aerosol-topped and cloud-topped PBLs as long as deep clouds do not interfere with the retrieval.
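The screening steps described above can be sketched as follows. The thresholds (0.25 to 5 km, three consecutive layers, the 1064 nm backscatter limit, and the 750 m offset) come from the text; the function names, the use of `None` as the missing value, and the coarse 0.5 km bins in the example are assumptions made for brevity. Heights are assumed to be above ground level and ordered from the surface upward.

```python
CLOUD_THRESHOLD = 10 ** -2.25        # km^-1 sr^-1, 1064 nm backscatter
MIN_DEPTH_KM, MAX_DEPTH_KM = 0.25, 5.0
CLOUD_SEARCH_OFFSET_KM = 0.75        # start searching this far above the feature


def has_attenuating_cloud(heights_km, backscatter_1064, feature_top_km):
    """True if three consecutive bins above the search base exceed the threshold."""
    run = 0
    for height, value in zip(heights_km, backscatter_1064):
        if height < feature_top_km + CLOUD_SEARCH_OFFSET_KM:
            continue
        run = run + 1 if value > CLOUD_THRESHOLD else 0
        if run >= 3:
            return True
    return False


def screen_retrieval(heights_km, backscatter_1064, feature_top_km):
    """Return the feature height, or None (missing) if a check fails."""
    if not MIN_DEPTH_KM <= feature_top_km <= MAX_DEPTH_KM:
        return None
    if has_attenuating_cloud(heights_km, backscatter_1064, feature_top_km):
        return None
    return feature_top_km


# Hypothetical profiles on coarse 0.5 km bins (real data use 30 m bins).
heights_km = [0.5 * i for i in range(17)]
clear = [0.001] * 17
deep_cloud = clear[:10] + [0.02, 0.02, 0.02] + clear[13:]
low_cloud = clear[:3] + [0.02, 0.02, 0.02] + clear[6:]

print(screen_retrieval(heights_km, clear, 1.5))
print(screen_retrieval(heights_km, deep_cloud, 1.5))
print(screen_retrieval(heights_km, low_cloud, 1.5))
```

Starting the cloud search above the feature is what lets a cloud-topped boundary layer (the `low_cloud` case) pass the screen while a deep overlying cloud does not.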
Figure 2 shows an example profile that can be analyzed using the above method and shows the step function change between the PBL and overlying free troposphere. This profile has been horizontally averaged (over 17 km) using a running mean to increase the signal-to-noise ratio. At 0 km altitude, there is a large backscatter signal from the surface return. Starting at this point, two peaks are identifiable in the backscatter at 0.8 and 1 km. The lowest peak (at 0.8 km) does not have a corresponding local maximum in the standard deviation and so is rejected. The second peak (1 km), however, is coincident with a local maximum in the standard deviation and so is identified as the top of the backscatter feature. After detection, individual PBL depth retrievals are horizontally averaged over 20 km using a running mean in order to minimize outliers and increase spatial continuity. The final PBL depths have a vertical resolution of 30 m.
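The horizontal smoothing of the retrieved depths can be sketched with a centered running mean. The treatment of missing retrievals and of the ends of the track is not specified in the text, so the choices below (missing values are skipped, windows are truncated at the edges) are assumptions, as is the function name. With profiles spaced 0.33 km apart, a 20 km window corresponds to roughly 61 profiles.

```python
def running_mean(values, window):
    """Centered running mean that ignores missing (None) entries."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        neighbors = [v for v in values[max(0, i - half):i + half + 1]
                     if v is not None]
        smoothed.append(sum(neighbors) / len(neighbors) if neighbors else None)
    return smoothed


# About 20 km of 0.33 km profiles; an odd window keeps the mean centered.
WINDOW_PROFILES = 61

# Short hypothetical track with one outlier and one missing retrieval,
# smoothed with a small window so the effect is easy to follow.
depths_km = [1.0, 1.1, None, 3.0, 1.2, 1.0]
print(running_mean(depths_km, window=3))
```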
 There are several weaknesses of this method that limit its ability to estimate PBL depth and can be a source of error. First, profiles containing deep, optically thick cloud (such as within the Intertropical Convergence Zone) or aerosol layers attenuate the signal, making it impossible to detect backscatter features near the surface in convective or otherwise cloudy conditions. This is similar to trying to observe the sun on a cloudy day and introduces a bias. Furthermore, in regions of shallow convection that does not attenuate the signal, the algorithm may detect an apparent gradient either within the cloud or at the cloud top. In general, convection complicates the interpretation of PBL depth using not only the CALIOP lidar data, but also theory and other observational systems [e.g., White et al., 1999; Seidel et al., 2010] and should be considered in the context of the desired application.
 Second, the potential exists for the algorithm to detect the aerosol gradient from a previous day's residual layer and miss a current, shallower feature, thus overestimating the PBL depth. Multiple cloud layers can also produce an overestimate of PBL depth if the algorithm detects a cloud above the actual PBL. Third, a very shallow backscatter feature cannot be resolved due to noise in the backscatter very near to the surface. However, during the afternoon overpass, especially in the summer over land, the PBL should be well developed and deep enough to exceed the minimum depth assigned here except under very unusual circumstances. Nevertheless, these weaknesses should be kept in mind when considering the results presented in the next section.
 The first major weakness is discussed in the next section, but the other two are more difficult to quantify. Determining how often the PBL is shallower than on the previous day, or how often cloud layers lie above the top of the PBL, requires observations of PBL depth that do not currently exist, a deficiency that this data set would help to address. The satellite repeats the same track only once every 16 days, so CALIPSO measurements, by themselves, cannot answer this question. Additionally, widespread information about the frequency of PBL depths less than 250 m is unavailable. Such conditions are rare and occur mostly over the oceans. Several authors [e.g., Wulfmeyer and Janjić, 2005; Hannay et al., 2009; Rahn and Garreaud, 2010] have looked at the depth of the marine boundary layer and found such shallow depths to be relatively rare.
Figure 3 shows an example of the estimated PBL depths using the above method for a July 2007 overpass of the satellite across the Midwestern United States. The gray line near the bottom of the figure indicates the surface elevation available in the Level 1B product. There is a strong backscatter signal at this level where the lidar beam reaches the solid surface. The apparent signal return below the surface is due to imperfect electronics and is discussed by McGill et al. The vertical regions of dark blue are results of attenuation (lidar shadows) from overlying clouds for which we do not retrieve a value (notice that this figure is zoomed in and exhibits a maximum altitude of 5 km). The black line indicates the PBL depth estimated by the algorithm, which in general does a good job of locating the backscatter signal indicating the PBL top. There are a few regions in which the backscatter feature is not obvious to visual inspection, but is found by the automated algorithm. The profile from Figure 2 was taken from this plot at about (29°N, 90°W). The deep PBL depths at about (34°N, 91.4°W) are associated with high clouds. This produces an apparent discontinuity with the regions on either side of the areas attenuated by high cloud. However, these regions are about 400 km away (about the distance between Washington, DC and New York City) and affected by different land surfaces and air masses.
 Instantaneous values of PBL depth from the CALIPSO lidar attenuated backscatter data were averaged onto a 1.25° × 1.25° grid covering much of North America from 20°N to 70°N latitude and from 32°W to 160°W longitude. These data include summertime values from 2006 (when CALIPSO was launched) through 2010. The total number of profiles used within each grid box ranged from as few as 1000 to over 2100 depending on the exact path of the satellite and lidar outages. Considering a 16-day repeat cycle and the number of days in this analysis, about 29 days of data were averaged into each grid box. The local solar time of satellite observation ranged from approximately 13:00 to 14:00 throughout the majority of the domain. Earlier observations in boreal Canada are present due to the satellite path and longer day length. The following results are averages of the instantaneous values in order to show general behavior, but the individual values themselves (such as from Figure 2) should be used during evaluations.
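The gridding step and the sampling estimate above can be sketched as follows. The grid spacing and domain corner come from the text; the function names, the sample retrievals, the convention of negative longitudes for degrees west, and the assumption of five complete June to August seasons (92 days each) are illustrative choices.

```python
import math
from collections import defaultdict

GRID_DEG = 1.25
LAT_MIN, LON_MIN = 20.0, -160.0  # southwest corner; west longitudes negative


def grid_index(lat, lon):
    """Row and column of the 1.25 degree cell containing a point."""
    return (math.floor((lat - LAT_MIN) / GRID_DEG),
            math.floor((lon - LON_MIN) / GRID_DEG))


def grid_mean(samples):
    """samples: iterable of (lat, lon, depth_km); returns {cell: mean depth}."""
    sums = defaultdict(lambda: [0.0, 0])
    for lat, lon, depth in samples:
        if depth is None:  # skip missing retrievals
            continue
        cell = sums[grid_index(lat, lon)]
        cell[0] += depth
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical instantaneous retrievals (lat, lon, depth in km).
samples = [(40.1, -100.2, 1.5), (40.3, -100.4, 2.5), (45.0, -90.0, 1.0),
           (45.1, -90.1, None)]
means = grid_mean(samples)
print(means)

# Five summers of 92 days sampled on a 16-day repeat cycle.
overpass_days = 5 * 92 / 16
print(overpass_days)
```

The last calculation gives 28.75, consistent with the "about 29 days" of data per grid box quoted above.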
 The percentage of retrieved backscatter heights to the total available CALIPSO profiles in each 1.25° × 1.25° grid cell during June, July, and August from 2006 to 2010 is shown in the retrieval rate map in Figure 4. Here, retrieval rate refers to whether a height feature was derived and not to whether it is accurate as compared to other observations. The algorithm can fail due to the lidar not functioning properly, thick cloud cover above the boundary layer, and/or aerosol profiles that do not exhibit a clear PBL top: this figure only considers the last two conditions. From this figure it is evident that the algorithm has its greatest retrieval rate over the subtropical oceans due to the sparsity of overlying clouds. Reduced rates occur in regions with high frequency of mid-day convection such as Florida, the Gulf Coast region, the U.S. Rocky Mountains, and the Mexican Plateau. In general, the retrieval rate ranges from a low of 15% near Lake Okeechobee in Florida and southern New Mexico to near 100% in the Pacific Ocean off the coast of California and Mexico.
Figure 5 shows the standard error of the mean of the estimation of PBL depths from the CALIPSO satellite. The standard error of the mean provides a simple measure of sampling error of the estimated depth of the backscatter features and gives an estimate of the uncertainty of the values. In other words, it provides an estimate as to the amount that an obtained mean may be expected to differ from the true mean. It is calculated by dividing the standard deviation by the square root of the sample size. It therefore increases with increasing standard deviation and decreasing sample size. This figure shows that there is an increased sampling error in the gaps between the satellite orbits due to decreased sampling in those locations. In general, the sampling error is greater over land than over water, as would be expected due to the more heterogeneous nature of PBL processes over land. In addition, the greatest sampling error occurs where the retrieval rate is smallest and the region experiences convection during the summer months.
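The retrieval rate and the standard error of the mean defined above can be computed per grid cell as in this sketch. The function name and sample values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption since the text does not say which form was used. At least two retrievals are needed for the standard deviation to be defined.

```python
import math
import statistics


def cell_statistics(depths_km):
    """depths_km: one entry per available profile; None marks no retrieval.

    Returns (retrieval rate in percent, mean depth, standard error of the mean).
    """
    retrieved = [d for d in depths_km if d is not None]
    rate = 100.0 * len(retrieved) / len(depths_km)
    mean = statistics.mean(retrieved)
    # Standard deviation divided by the square root of the sample size.
    sem = statistics.stdev(retrieved) / math.sqrt(len(retrieved))
    return rate, mean, sem


# Hypothetical grid cell with six profiles, two of them without a retrieval.
rate, mean, sem = cell_statistics([1.0, None, 2.0, 1.5, None, 1.5])
print(rate, mean, sem)
```

A cell with fewer retrievals or more scatter among them yields a larger standard error, which is the pattern the figure shows between orbit tracks and in convective regions.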
 The advantage of automating the Jordan et al. algorithm is the ability to process large quantities of data that can then be combined to form maps of PBL depth estimates. Figure 6 shows the mean mid-day clear sky PBL depth over North America for June, July, and August (JJA) on a 1.25° × 1.25° grid. These estimates are above ground level. One can identify deeper PBL depths over land than over water and such features as the Florida and Yucatán peninsulas and the island of Cuba. Shallower boundary layers off the coast of California can be seen associated with cold, upwelling water, stratocumulus clouds, and overlying subsidence. Another prominent feature is the relatively shallow boundary layers over the farmland of the U.S. Midwest. High moisture availability means that a large portion of net radiation in this region is used to evaporate water and less energy is available to grow the PBL. This regional minimum in PBL depth roughly follows the valley of the Mississippi River.
 The deepest backscatter features occur along the semiarid Rocky Mountains in the southwestern United States and the Mexican Plateau. Relatively deep boundary layers are present over Canada. A possible explanation for this is that in the boreal ecosystem, low soil temperatures and nutrient availability reduce stomatal conductance and lead to a high ratio of sensible to latent heat flux [Margolis and Ryan, 1997]. Additionally, during summertime at high latitudes, the longer day length results in higher amounts of incoming solar radiation, leading to more available energy. The higher Bowen ratio and amount of solar radiation create a situation in which deeper than expected (over 2 km) PBL depths have occurred in June during the Boreal Ecosystem-Atmosphere Study (BOREAS) field experiment [e.g., Betts et al., 1996; Margolis and Ryan, 1997].
Figure 7 shows histograms of the backscatter feature depths over land (top) and over water (bottom). Over land, the probability distribution function (PDF) has a maximum at 1.75–2 km. Over water, the features are biased more toward shallow depths, with a maximum at 0.75–1 km and a near exponential decay toward greater depths.
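A histogram of this kind can be built as in the following sketch. The 0.25 km bin width is inferred from the ranges quoted above (1.75–2 km and 0.75–1 km) and, like the function name and the sample depths, should be read as an assumption rather than the exact procedure used for the figure.

```python
def depth_pdf(depths_km, bin_width=0.25, max_depth=5.0):
    """Fraction of retrievals falling in each depth bin from 0 to max_depth."""
    n_bins = int(round(max_depth / bin_width))
    counts = [0] * n_bins
    for depth in depths_km:
        # Depths at or above max_depth are placed in the last bin.
        index = min(int(depth / bin_width), n_bins - 1)
        counts[index] += 1
    return [count / len(depths_km) for count in counts]


# Hypothetical land retrievals (km).
land = [1.8, 1.9, 1.6, 2.2, 1.1]
pdf = depth_pdf(land)
mode_bin = pdf.index(max(pdf))
print(mode_bin * 0.25, (mode_bin + 1) * 0.25)  # edges of the most populated bin
```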
 It is important to consider spatial and temporal separation between the satellite and other PBL depth data. The PBL depth can change by a kilometer or more in as little as 1 h [White et al., 1999] and a point measurement may not be representative of the spatial average [Angevine et al., 1994; White et al., 1999]. These differences could be a result of surface heterogeneity, variations in advection and subsidence, or local conditions being measured by the observational system (e.g., a radiosonde traveling through a penetrating updraft rather than the predominant subsidence). It is therefore imperative to sample the model or other observations as coincidentally as possible. This sensitivity of the PBL depth to specific spatial and temporal conditions must be taken into account when doing comparisons and complicates evaluations of the observing systems. Since it is nearly impossible to obtain perfectly coincident observations in both space and time, this complexity should be kept in mind for the following comparisons.
Figure 8 shows the ratio of MERRA reanalysis PBL depths to the PBL-associated backscatter heights derived from CALIPSO. Over much of the United States and portions of the subtropical oceans, the MERRA PBL depths are within 25% of the estimates derived from CALIPSO. The lidar-based PBL depth estimates are significantly larger than those from the reanalysis product over the oceans and boreal forests and shallower over the dry, complex terrain of the American Southwest.
 Specially equipped commercial aircraft capable of reporting pressure, temperature, and height data produce the AMDAR (Aircraft Meteorological DAta Reporting) data set. These data provide atmospheric profiles during takeoff and landing of the aircraft that can be used to estimate PBL depth by examining the temperature profile. The CALIPSO satellite passed within 100 km and half an hour of one of these aircraft profiles over 1000 times during the time period discussed here, representing 62 different airports.
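The coincidence criterion above (within 100 km and half an hour) can be expressed as a simple test. This is a minimal sketch: the function names, the tuple layout, the use of the haversine formula with a mean Earth radius, and the sample coordinates are all assumptions, not the matching code used in the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_coincident(satellite, aircraft, max_km=100.0, max_minutes=30.0):
    """Each argument is a (lat, lon, time_in_minutes) tuple."""
    distance = great_circle_km(satellite[0], satellite[1],
                               aircraft[0], aircraft[1])
    return distance <= max_km and abs(satellite[2] - aircraft[2]) <= max_minutes


# Hypothetical satellite footprint and aircraft profiles.
footprint = (39.0, -77.0, 1000)
print(is_coincident(footprint, (39.3, -76.7, 1020)))  # near in space and time
print(is_coincident(footprint, (41.0, -77.0, 1020)))  # too far away
print(is_coincident(footprint, (39.3, -76.7, 1045)))  # too late
```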
Figure 9 shows the ratio of the AMDAR retrieved PBL depths to the depth of the backscatter feature retrieved by CALIPSO. The AMDAR PBL depths were retrieved by identifying the level of maximum vertical gradient of potential temperature as described by Seidel et al. This figure only shows airports for which there are at least 2 observations and the CALIPSO profile occurred over land. Most continental locations compare well to each other, within 25%, which is better than radiosonde estimates of space/time average PBL depth [Angevine et al., 1994]. Exceptions occur for stations with very few observations. The largest disagreements occur along the coasts, possibly due to coastal dynamics such as land/sea breezes affecting the retrieval by either the aircraft or the satellite, but not the other.
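The maximum-gradient method applied to an aircraft profile can be sketched as follows. Potential temperature is computed from temperature and pressure with the standard relation θ = T (1000 hPa / p)^κ, with κ ≈ 0.2854 for dry air. The function names, the sample profile, and the choice to assign each gradient to the midpoint of its layer are assumptions for illustration.

```python
KAPPA = 0.2854  # R_d / c_p for dry air


def potential_temperature(temp_k, pressure_hpa):
    """Potential temperature (K) referenced to 1000 hPa."""
    return temp_k * (1000.0 / pressure_hpa) ** KAPPA


def max_gradient_height(heights_m, temps_k, pressures_hpa):
    """Midpoint height of the layer with the largest d(theta)/dz."""
    theta = [potential_temperature(t, p)
             for t, p in zip(temps_k, pressures_hpa)]
    best_height, best_gradient = None, float("-inf")
    for i in range(len(theta) - 1):
        gradient = (theta[i + 1] - theta[i]) / (heights_m[i + 1] - heights_m[i])
        if gradient > best_gradient:
            best_gradient = gradient
            best_height = 0.5 * (heights_m[i] + heights_m[i + 1])
    return best_height


# Hypothetical ascent profile: a well-mixed layer capped near 1.5 to 2 km.
heights_m = [0, 500, 1000, 1500, 2000, 2500]
pressures_hpa = [1000, 943, 889, 837, 788, 741]
temps_k = [300.0, 295.0, 290.1, 285.2, 284.0, 280.5]
print(max_gradient_height(heights_m, temps_k, pressures_hpa))
```

Because this estimate follows the temperature structure while the lidar follows aerosol, some systematic offset between the two is expected, as discussed earlier for the entrainment zone.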
Figure 10 shows the relationship between the number of observations at an airport location and the amount of agreement between the CALIPSO and AMDAR estimates. A value of one shows perfect agreement. The general relationship shows that increasing the number of observations improves the agreement between the satellite and the aircraft, with a mean ratio of 1.11. This suggests that the errors are largely random and can be averaged out.
 The CALIPSO retrieval rate is over 75% over the subtropical oceans, less than 40% over the Southwestern deserts and around 50% over the majority of the North American continent. The results show that the PBL depth estimated by the CALIPSO backscatter climatology is deeper over the oceans than estimates of PBL depth from reanalysis. Areas of the boreal forest with deep daytime summer PBL depths are also underestimated in the MERRA reanalysis as compared to CALIPSO. Over the arid and semiarid complex terrain of the Southwestern United States and the Rocky Mountain region, the CALIPSO retrievals estimate a relatively shallow PBL depth compared to reanalysis and aircraft profiles. The average agreement between the satellite and aircraft data improves with increasing numbers of observations. These results should be considered keeping in mind the spatial and temporal distance between the PBL estimates.
 Several weaknesses of this algorithm should be considered when using these results. First, optically thick clouds attenuate the lidar signal and thus make retrieval impossible. This introduces a bias and should be considered when examining mean values or other statistics. Second, the potential exists for a residual layer from a previous day to be detected rather than a current shallower PBL. Third, surface noise inhibits the retrieval of very shallow PBL depths.
 Despite these weaknesses, initial estimates of PBL depth using the methodology of Jordan et al. seem qualitatively reasonable, although more evaluation is needed in future work. The retrieval rate for backscatter features from the CALIPSO satellite is high enough that there are millions of available observations within the limited spatial and temporal domain examined here. With enough computer resources, this analysis could be extended to provide a global PBL data set.
 The NARR data for this study are from the Research Data Archive (RDA), which is maintained by the Computational and Information Systems Laboratory (CISL) at the National Center for Atmospheric Research (NCAR). NCAR is sponsored by the National Science Foundation (NSF). The MERRA data for this study are from the Global Modeling and Assimilation Office (GMAO) and the Goddard Earth Sciences Data and Information Services Center (GES DISC). The original data are available from the RDA (http://dss.ucar.edu) in data set number ds608.0. This study was made possible in part due to the data made available to the National Oceanic and Atmospheric Administration by the following commercial airlines: American, Delta, Federal Express, Northwest, Southwest, United, and United Parcel Service. We would like to thank Nikisa Jordan and Mark Vaughan for their assistance with the CALIPSO data and the PBL depth algorithm. We would also like to thank David Randall for many helpful suggestions that improved this manuscript substantially. This research was supported by National Aeronautics and Space Administration grant NNX08AV04H.