A purely Lagrangian assessment of dispersion from modeled surface current trajectories in the coastal ocean is presented. Modeled trajectories come from ROMS simulations for the Southern California Bight during the 1996 through 1999 period. Data are from surface current trajectories collected primarily in the Santa Barbara Channel with CODE-style drifters. Distributions of particle positions from trajectories emanating from launch locations within 10 kilometers of the coast throughout the Santa Barbara Channel that advect for one through four days (Lagrangian PDFs) are evaluated descriptively and quantitatively. The two-dimensional Kolmogorov-Smirnov (K-S) statistical test for comparing discrete sampled data with a known probability distribution is the quantitative basis. In general, dispersion distributions from observations are similar to Lagrangian PDFs computed from modeled trajectories and the K-S statistic quantifies this accordingly. A few specific regions of poor model-data agreement are indicated and discussed. The purely Lagrangian assessment provides an improved understanding of model performance and ocean circulation beyond that offered in a Eulerian sense, and is necessary when modeled trajectories are utilized for applied oceanographic and marine ecology problems.
 Regional coastal ocean observing systems exist in part to provide stakeholders with the best possible information for addressing a wide variety of applied coastal problems. Many applications are inherently Lagrangian, requiring trajectories to determine the “fate and transport” or “connectivity” of tracers. For example, coastal water quality management, ecosystem management, spill remediation, and search-and-rescue operations all require knowledge of dispersion from water parcel trajectories.
 Lagrangian applications in the turbulent ocean require statistical approaches, and thus a large number of observations. Probability distributions of water parcel location as a function of initial position and advection time (hereafter “Lagrangian PDFs”) are the required quantities. Direct observations with water following drifters are too sparse for computation of meaningful Lagrangian PDFs for more than a few combinations of initial position and advection time. Trajectories computed from numerical circulation models must therefore be relied on. In a recent study, Mitarai et al.  determine Lagrangian PDFs with millions of trajectories computed from the Eulerian output of Dong and McWilliams'  Southern California Regional Ocean Modeling System (ROMS) simulations. Results of the work by Mitarai et al.  are a part of the planning process for designation of Marine Protected Areas in Southern California.
 Southern California ROMS simulations, the basis of Mitarai et al.'s  Lagrangian PDFs, compare favorably with observations in a Eulerian sense [Dong et al., 2009]. Comparisons show agreement in the first two statistical moments and lead Dong et al.  to conclude, “The model results resemble the observations in terms of the spatial structure and magnitude of the mean, interannual, seasonal, and intraseasonal variations”. Eulerian agreement is not necessarily indicative of accuracy in trajectories, dispersion, or Lagrangian PDFs. Various Eulerian flow patterns can yield similar low-order statistical moments; relatively small errors in Eulerian velocity statistics can become substantial when integrated; and eddy structures can manifest themselves differently in Eulerian and Lagrangian frames [Ohlmann and Niiler, 2005].
 This paper uses in situ drifting buoy data to assess dispersion distributions from modeled trajectories in a purely Lagrangian sense. Such an assessment is necessary prior to using modeled trajectories and their derived products in applied problems. The analysis gives a spatially continuous distribution of model skill that can elucidate an improved understanding of both model performance and ocean circulation beyond that offered in a Eulerian assessment. Limited densities of Lagrangian observations typically preclude such studies.
2. Data and Methods
2.1. ROMS Derived Trajectories
 Results from the Southern California Bight ROMS simulations (SCB-ROMS), described in detail by Dong and McWilliams  and Dong et al. , are the basis for modeled trajectories. SCB-ROMS is a primitive equation hydrodynamic model with a free sea-surface, a horizontal curvilinear coordinate system, and a vertical sigma coordinate system [Shchepetkin and McWilliams, 2005]. The innermost of three nested grids has 1.0 km horizontal grid resolution. Model solutions are available for the 1996 through 2003 period.
 Particle trajectories are computed from Eulerian velocity fields at 1 m depth using a fourth-order Adams-Bashforth-Moulton predictor-corrector scheme with no sub-grid scale energy (hereafter “modeled trajectories”). The two-dimensional (2-D) velocity fields at 1 m are used for consistency with drifter observations. The 2-D modeled trajectories differ from the fully three-dimensional (3-D) trajectories described by Mitarai et al. . Fixed-depth tracking can lead to more “beaching” than occurs in 3-D flows (particles that reach the shore in the surface layer continue their motion vertically in 3-D to satisfy continuity). Pathways of modeled trajectories are thus terminated when reaching a shoreline boundary, just as in situ drifter observations terminate when they run ashore, or “beach”.
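The integration step described above can be sketched as follows. This is a minimal illustration of a fourth-order Adams-Bashforth-Moulton predictor-corrector and not the code used for the SCB-ROMS trajectories: the names `abm4_trajectory` and `rk4_step` are hypothetical, the `velocity(t, x)` callable stands in for interpolation of the gridded 1 m velocity fields, the Runge-Kutta start-up is an assumption, and shoreline termination is omitted.

```python
import numpy as np

def rk4_step(velocity, t, x, h):
    """One classical Runge-Kutta step, used only to start the multistep scheme."""
    k1 = velocity(t, x)
    k2 = velocity(t + h / 2, x + h / 2 * k1)
    k3 = velocity(t + h / 2, x + h / 2 * k2)
    k4 = velocity(t + h, x + h * k3)
    return x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def abm4_trajectory(velocity, x0, t0, h, n_steps):
    """Integrate dx/dt = velocity(t, x) for n_steps of size h.

    velocity is a hypothetical callable returning a 2-element array (u, v);
    in practice it would interpolate the model velocity field in space and time.
    """
    ts = [t0]
    xs = [np.asarray(x0, dtype=float)]
    fs = [velocity(t0, xs[0])]
    # Start-up: the four-step method needs four velocity evaluations.
    for _ in range(min(3, n_steps)):
        x_new = rk4_step(velocity, ts[-1], xs[-1], h)
        ts.append(ts[-1] + h)
        xs.append(x_new)
        fs.append(velocity(ts[-1], x_new))
    for _ in range(3, n_steps):
        f0, f1, f2, f3 = fs[-1], fs[-2], fs[-3], fs[-4]
        t_new = ts[-1] + h
        # Predictor: four-step Adams-Bashforth.
        x_pred = xs[-1] + h / 24 * (55 * f0 - 59 * f1 + 37 * f2 - 9 * f3)
        # Corrector: three-step Adams-Moulton using the predicted velocity.
        f_pred = velocity(t_new, x_pred)
        x_new = xs[-1] + h / 24 * (9 * f_pred + 19 * f0 - 5 * f1 + f2)
        ts.append(t_new)
        xs.append(x_new)
        fs.append(velocity(t_new, x_new))
    return np.array(ts), np.array(xs)
```

A simple check of such a routine is solid-body rotation, for which a particle should return to its starting point after one full period.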
 Launch locations (located within 10 km of the coast), the scheme for launching modeled trajectories, and computation of Lagrangian PDFs from positions along surface current trajectories after some integration time, are all described in detail by Mitarai et al. . Lagrangian PDFs are computed for 1, 2, 3, and 4 day advection times allowing trajectories to terminate through beaching. Nearly 200,000 trajectories launched from 1996 through 1999 and weighted by month and year to agree with the temporal distribution of observed trajectories (described below) are the basis of the Lagrangian PDF calculations.
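As an illustration of the weighting and PDF calculation, the sketch below bins trajectory ending positions into a normalized, weighted 2-D histogram. It is an assumption-laden example, not the procedure of Mitarai et al.: the function names, the use of an integer month-year label per trajectory, and the weight definition (observed count divided by modeled count in each month-year bin) are all hypothetical choices made here for clarity.

```python
import numpy as np

def temporal_weights(model_bins, obs_bins):
    """Weight each modeled trajectory so that the weighted month-year
    distribution of modeled launches matches that of the observations.

    model_bins, obs_bins: integer labels (e.g. 12 * year + month), one per trajectory.
    """
    model_bins = np.asarray(model_bins)
    obs_bins = np.asarray(obs_bins)
    labels, model_counts = np.unique(model_bins, return_counts=True)
    per_label = {
        lab: np.sum(obs_bins == lab) / cnt
        for lab, cnt in zip(labels, model_counts)
    }
    return np.array([per_label[b] for b in model_bins])

def lagrangian_pdf(end_x, end_y, weights, x_edges, y_edges):
    """Weighted 2-D histogram of ending positions, normalized so that the
    PDF integrates to one over the grid (positions outside the grid are ignored)."""
    counts, _, _ = np.histogram2d(
        end_x, end_y, bins=[x_edges, y_edges], weights=weights
    )
    total = counts.sum()
    if total == 0:
        raise ValueError("no weighted ending positions fall inside the grid")
    cell_area = np.outer(np.diff(x_edges), np.diff(y_edges))
    return counts / total / cell_area
```

Calling `lagrangian_pdf` once per launch site and advection time, with weights from `temporal_weights`, would give one gridded PDF per combination.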
2.2. Drifter Data
 Two of the densest in situ drifter studies performed to date are attributable to the United States Minerals Management Service (MMS) [e.g., Winant et al., 1999, 2003; Ohlmann and Niiler, 2005]. Drifter data used here are a subset of those collected from May 1993 through November 1999 in the Santa Barbara Channel (SBC) and Santa Maria Basin (SMB) as part of the MMS-funded Santa Barbara Channel – Santa Maria Basin Coastal Circulation Study. Drifter design, deployment scheme, and resultant data are all described in detail by Dever et al.  and Winant et al. [1999, 2003]. Only observations from 1996 through 1999, the portion of the drifter record that overlaps the available ROMS solutions, are considered here.
 CODE design drifters are drogued at a nominal depth near 1 m and follow water to within ∼0.01% of the wind speed [Davis, 1985]. Up to six drifter positions per day are obtained via Doppler ranging with the satellite-based Argos system. Position accuracy ranges from ∼100 to 1000 m. Initially, single drifters were released roughly every two months from 12 stations located throughout the SBC [Winant et al., 1999]. A dozen deployment locations in the SMB were added in May 1996. Deployment frequency decreased to near quarterly during the 1997 through 1999 period. Drifters have a nominal sampling life near 40 days. However, nearly half the drifters reportedly beached, giving many short-lived tracks [Dever et al., 1998].
 Since only starting and ending position data are required for this study, raw Argos position data are utilized. All drifter position records contained within each of Mitarai et al.'s  launch sites during the 1996 through 1999 period are first identified. For each individual drifter that passes through a launch site, the position record closest to the center time of all position records within the launch site is identified as the starting record. Ending positions after 1, 2, 3, and 4 days are identified by linearly interpolating between records. Drifter tracks without position records within ±8 hours of the ending time may have significant error in interpolated position and are thus eliminated. Drifter trajectories within 5 days of a previous launch are eliminated to ensure independent observations.
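The ending-position step can be sketched as below. This is a hedged example only: the function name `end_position`, the use of hours as the time unit, and the choice to reject ending times that fall outside the position record are assumptions introduced here; the ±8 hour criterion is implemented as a requirement that the nearest position record lie within 8 hours of the ending time.

```python
import numpy as np

def end_position(times_h, lon, lat, t_start_h, advect_days, max_gap_h=8.0):
    """Linearly interpolate a drifter position at t_start + advection time.

    times_h: increasing record times in hours; lon, lat: positions at those times.
    Returns (lon, lat), or None when the track is rejected.
    """
    times_h = np.asarray(times_h, dtype=float)
    t_end = t_start_h + 24.0 * advect_days
    # Assumption: no extrapolation beyond the first or last record.
    if t_end < times_h[0] or t_end > times_h[-1]:
        return None
    # Reject tracks with no position record within max_gap_h of the ending time.
    if np.min(np.abs(times_h - t_end)) > max_gap_h:
        return None
    return float(np.interp(t_end, times_h, lon)), float(np.interp(t_end, times_h, lat))
```

For example, a track with records at 0, 6, 20, and 30 hours yields a 1-day ending position interpolated between the 20 and 30 hour records, while a track with records only at 0, 10, and 40 hours is rejected for the same advection time.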
2.3. Comparison Metric
 The Kolmogorov-Smirnov (K-S) test is a non-parametric statistical method for determining if two distributions differ [Press et al., 2002]. The test is based on the maximum difference in cumulative distribution functions (CDFs) of the distributions. It uses information from individual data points, can be used with relatively small sample sizes (as it does not require binning), can accept analytical PDFs, and can be applied to empirical distributions. These features make it attractive for comparing empirical PDFs with a limited number of drifter observations.
 The K-S test has long been used with oceanographic data primarily to examine the Gaussianity of velocity distributions [e.g., Swenson and Niiler, 1996; Bracco et al., 2000; LaCasce, 2005]. More recently, van Sebille et al.  use the K-S test for binary determination of model skill with some confidence level. Lagrangian PDFs of modeled dispersion are not expected to be isotropic or spatially homogeneous as with Gaussian velocity distributions [Mitarai et al., 2009]. The focus here is on the K-S test statistic as a quantitative metric for assessing agreement in the spatial distribution of the ending positions of modeled and observed trajectories. Modeled dispersion is represented with analytical Lagrangian PDFs, and dispersion observations are represented as discrete positions.
 The K-S test is based on the idea that different distribution functions, or data sets, give different CDFs, and the largest absolute difference in CDF values (D) is indicative of the probability of disagreement. The K-S test statistic (P) described by Press et al.  quantifies the confidence level associated with D. For application here, D is the maximum absolute difference between CDFs obtained from SCB-ROMS Lagrangian PDFs and drifter observations, N is the number of statistically independent drifter observations, and r is the correlation coefficient for zonal and meridional drifter positions determined in a least squares sense. D is determined as a function of launch location and advection time by considering probabilities in each of the four quadrants about all given drifter positions as described by Fasano and Franceschini  and Press et al. .
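For reference, the significance level for the one-sample 2-D K-S test as given in Press et al. is written out below in terms of D, N, and r. That this is the form of equation (1) is an assumption based on the quantities defined above.

```latex
P(D > \text{observed}) = Q_{KS}\!\left(
  \frac{\sqrt{N}\, D}{1 + \sqrt{1 - r^{2}}\,\left(0.25 - 0.75/\sqrt{N}\right)}
\right),
\qquad
Q_{KS}(\lambda) = 2 \sum_{j=1}^{\infty} (-1)^{j-1} e^{-2 j^{2} \lambda^{2}}
```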
 The K-S statistic is applied to cases with more than 10 independent drifter observations, a sufficiently large threshold for reliable use of the test statistic while allowing comparisons for the majority of the SBC. Histograms of 1000 random draws of various sample sizes between 10 and 20 agree with the analytical solution (equation (1)) when P < ∼0.2. Equation (1) degrades slightly with increasing sample size for P > ∼0.2; however, the implication that the compared distributions are not significantly different still holds [i.e., Press et al., 2002]. Dependence of the P statistic on values of N and r is discussed in detail by Fasano and Franceschini  for N = 5 to 5000.
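A minimal sketch of the P computation follows, assuming the analytical solution is the one-sample 2-D K-S significance formula given by Press et al., in which the argument of the K-S tail function depends on D, N, and r. The function names `q_ks` and `ks2d_p` and the series truncation settings are hypothetical.

```python
import math

def q_ks(lam, max_terms=100, rel_tol=1e-10):
    """K-S tail function Q_KS(lam) = 2 * sum_{j>=1} (-1)^(j-1) exp(-2 j^2 lam^2).

    Returns 1.0 when the alternating series has not converged within
    max_terms, which happens only for very small lam where Q_KS is close to 1.
    """
    if lam <= 0.0:
        return 1.0
    total = 0.0
    sign = 1.0
    for j in range(1, max_terms + 1):
        term = sign * 2.0 * math.exp(-2.0 * (j * lam) ** 2)
        total += term
        if abs(term) <= rel_tol * abs(total):
            return min(max(total, 0.0), 1.0)
        sign = -sign
    return 1.0

def ks2d_p(d, n, r):
    """Significance level P for a maximum CDF difference d, n independent
    observations, and correlation coefficient r of the observed positions."""
    sqrt_n = math.sqrt(n)
    spread = math.sqrt(max(0.0, 1.0 - r * r))
    lam = sqrt_n * d / (1.0 + spread * (0.25 - 0.75 / sqrt_n))
    return q_ks(lam)
```

Under this form, P decreases monotonically as D grows for fixed N and r, so a larger maximum CDF difference always maps to a smaller P.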
 A comparison of model Lagrangian PDFs and drifter observations is first shown for a single arbitrarily selected launch site and four advection times to demonstrate the P value computation. Thirty in situ drifters emanate from the launch site during the model integration time. Positions of 17 independent drifters after 1, 2, 3, and 4 days of sampling are shown with corresponding Lagrangian PDFs (computed from positions of modeled trajectories weighted for temporal agreement with observations) in Figure 1. Lagrangian PDFs suggest primarily northwest movement towards the center of the SBC, and equatorward movement around the east side of Santa Cruz Island. The PDFs show spread throughout almost the entire SBC after 3 days, with the largest densities nearest the deployment location (Figure 1c).
 The distribution of drifter positions after each of the four integration times is quantitatively compared with the corresponding model derived Lagrangian PDF through the P value (equation (1)). For each in situ drifter position (considering a specific integration time) the sample space is divided into four quadrants about that position as illustrated by dashed lines in Figure 1d. CDFs are then computed for each quadrant from the model Lagrangian PDF and drifter data as the integral of the modeled PDF and the relative number of drifter positions, respectively. D is computed as the maximum difference in model and observed CDFs considering the four sets (quadrants), and quantitative agreement between distributions is determined with P (equation (1)).
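The quadrant procedure described above can be sketched as follows. This is an illustrative example rather than the implementation used in the study: the function names are hypothetical, the gridded Lagrangian PDF is represented by the probability mass at each cell center (PDF value times cell area), and points lying exactly on a dividing line are assigned to the lower or left quadrant, which is one of several possible tie conventions.

```python
import numpy as np

def quadrant_fractions(x0, y0, x, y, w):
    """Fraction of total weight w in each of the four quadrants about (x0, y0)."""
    right = x > x0
    up = y > y0
    sums = np.array([
        w[right & up].sum(),     # upper right
        w[~right & up].sum(),    # upper left
        w[~right & ~up].sum(),   # lower left
        w[right & ~up].sum(),    # lower right
    ])
    return sums / w.sum()

def ks2d_d(obs_x, obs_y, cell_x, cell_y, cell_mass):
    """Maximum absolute difference D between observed and modeled quadrant
    probabilities, taken over all observed positions and all four quadrants."""
    obs_x = np.asarray(obs_x, dtype=float)
    obs_y = np.asarray(obs_y, dtype=float)
    cell_x = np.asarray(cell_x, dtype=float).ravel()
    cell_y = np.asarray(cell_y, dtype=float).ravel()
    cell_mass = np.asarray(cell_mass, dtype=float).ravel()
    obs_w = np.ones_like(obs_x)
    d = 0.0
    for x0, y0 in zip(obs_x, obs_y):
        f_obs = quadrant_fractions(x0, y0, obs_x, obs_y, obs_w)
        f_mod = quadrant_fractions(x0, y0, cell_x, cell_y, cell_mass)
        d = max(d, float(np.max(np.abs(f_obs - f_mod))))
    return d
```

The returned D, together with N and r, is what the significance formula of equation (1) takes as input.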
In situ drifter positions always exist where the model derived Lagrangian PDFs are non-zero for the advection times considered. Drifter positions after 1, 2, and 4 days show good qualitative agreement with Lagrangian PDFs, and P ≥ 0.08 for these times (Figures 1a, 1b, and 1d). After 3 days, the Lagrangian PDF has a pronounced southward extension that is matched by only a single drifter (Figure 1c). This discrepancy gives rise to a larger maximum difference in CDFs, and thus a much smaller P value (P = 0.02). These qualitative relationships for various values of P aid interpretation of the quantitative statistic.
 Comparisons of modeled and observed dispersion distributions for six release sites and a 2 day advection time show qualitative relationships for a larger range of P values. For the launch site near Point Conception, both modeled and in situ distributions indicate similar movement mostly to the south with relatively large energy, and this agreement is quantified with P = 0.14 (Figure 2a). Both modeled and observed trajectories that emanate from the north shore of San Miguel Island are most likely to be transported either along the coast of Santa Rosa Island, toward the center of the SBC, or equatorward around the east side of the island, and again the good agreement is quantified with a large P (P = 0.77; Figure 2d).
 Release sites with poor qualitative agreement have P values much less than the statistical threshold at the 95% confidence level. The majority of drifters released from the northwest tip of Santa Cruz Island end up to the northeast of the launch location after 2 days (Figure 2e). The PDF from modeled trajectories that emanate from this site has its greatest weighting directly south of the launch location, where fewer than 20% of the drifters go (P = 0.03). Southward movement in model trajectories also differs from northeastward in situ drifter movement for the launch site on the north coast of Santa Cruz Island (P = 0.02; Figure 2f).
 The cases shown in Figure 2 demonstrate major directional differences in observed and modeled trajectories when P ≤ 0.05. Qualitatively good agreement exists in Figures 1 and 2 when P > 0.05. The spatial distribution of low P values, where modeled trajectories terminate in very different locations than observed, is relevant to the quest for improved model skill.
P values are computed for all defined launch sites and 1–4 day advection where >10 independent drifter observations emanate, and all trajectories have ending positions within the ROMS domain. These constraints limit the advection times considered and explain why the set of launch sites resolved varies with advection time (Figure 3). The majority of launch site and advection time combinations considered give P > 0.05 indicating good agreement between ending distributions of modeled and observed trajectories. However, regions of poor agreement (P ≤ 0.05) exist for launch sites in the northwest SBC for the 1-day advection time, and along the north coast of Santa Cruz and Santa Rosa Islands for all advection times (Figure 3).
 Inconsistencies between simulated dispersal patterns and historical drifter data are not necessarily due to inaccuracies of the circulation simulations themselves. Some differences are expected given the 1 km resolution of ROMS, unresolved eddy energy known to exist [e.g., Ohlmann et al., 2007], and the small size of Channel Island gaps. The ROMS trajectories do not include forcing from tides. Although tidal flows in the SBC are mostly rectilinear with velocities <5 cm/s, tidal energy in flows between Channel Islands can be significantly larger [Munchow, 1998], and this may explain low P values for launch locations near island gaps. The region of disagreement in the northwest SBC (Figure 3a) is characterized by large wind stress gradients that may not be adequately resolved in the ROMS simulations [e.g., Dong et al., 2009]. The K-S test presented here allows the skill of future ROMS configurations (e.g., with tides and/or finer spatial resolution) to be quantitatively evaluated for most of the coastal domain in the SBC.
4. Conclusions and Summary
 A purely Lagrangian validation of ROMS coastal dispersal simulations in the SBC region is presented. The Lagrangian assessment, believed to be the first of its kind, is a necessary step in understanding model skill and offers insight into model performance beyond traditional Eulerian comparisons. The assessment provides both quantitative and qualitative information regarding interpretation of model trajectory PDFs. “Fate and transport” type models that utilize simulated trajectories for applied problems should undergo this sort of Lagrangian assessment on their way to operational use.
 The K-S test statistic P derives from the maximum difference in CDFs computed from model-based analytical PDFs and individual observations. The K-S test is appropriate for this application as it can be used with relatively small sample sizes, can accept analytical PDFs, and can be applied to empirical distributions. In addition to interpreting P in the usual binary statistical sense (accepting or rejecting a null hypothesis that the distributions are the same), a more quantitative interpretation is used. Significant directional differences between modeled Lagrangian PDFs and observed drifter distributions exist when P ≤ 0.05. Distributions show qualitatively good agreement when P > 0.05, as the binary statistic suggests.
 The study focuses on ROMS-derived trajectories that emanate from launch sites within 10 km of the coastline in the SBC from 1996 through 1999 and advect for up to 4 days. The spatial domain over which comparisons are performed extends far beyond the more traditional Eulerian approach, typically confined to locations of a few scattered moorings. For the cases considered, ROMS-derived PDFs and in situ drifter observations do not, in general, differ substantially. The assessment indicates that observed and modeled Lagrangian distributions show consistently (among advection times) poor agreement in regions along the north coasts of Santa Cruz and Santa Rosa Islands, possibly due to strong tidal flows through island gaps that are not included in the simulations. Poor agreement in the northwest SBC for 1-day advection suggests a local, short-time mechanism. The K-S metric presented here enables a quantifiable Lagrangian skill assessment of future model configurations that attempt improved simulations.
 Thanks to Charles Dong and Jim McWilliams for supplying ROMS solutions, and to Ed Dever for supplying drifter data. Brian Kinlan contributed insight to the K-S test. Jim McWilliams and Andrew Poje provided useful comments on the original draft. We acknowledge enlightening discussions with Dave Siegel. Support for this work comes from the National Science Foundation (OCE-0352187, OCE-0623011), the Minerals Management Service, U.S. Department of Interior (MMS agreement 1435-01-04-CA-36650 to M05AC12301) and the California Coastal Conservancy (04078.05LA), University of California Coastal Environmental Quality Initiative.