Ensemble modeling is a method of prediction based on the use of a representative sample of possible future states. Global models of the solar corona and inner heliosphere are now maturing to the point of becoming predictive tools; thus, it is both meaningful and necessary to quantitatively assess their uncertainty and limitations. In this study, we apply simple ensemble modeling techniques as a first step towards these goals. We focus on one relatively quiescent time period, Carrington rotation 2062, which occurred during the late declining phase of solar cycle 23. To illustrate and assess the sensitivity of the model results to variations in boundary conditions, we compute solutions using synoptic magnetograms from seven solar observatories. Model sensitivity is explored using (1) different combinations of models, (2) perturbations in the base coronal temperature (a free parameter in one of the model approximations), and (3) the spatial resolution of the numerical grid. We present variance maps, “whisker” plots, and “Taylor” diagrams to summarize the accuracy of the solutions and compute skill scores, which demonstrate that the ensemble mean solution outperforms any of the individual realizations. Our results provide a baseline against which future model improvements can be compared.
 Since their modest beginnings in the late 1960's [Hundhausen and Gentry, 1968], numerical approaches for studying the structure of the solar corona and inner heliosphere have blossomed into sophisticated 3-D, time-dependent, massively parallel, and necessarily complex models [e.g., Downs et al., 2011; Riley et al., 2012a]. As the models evolved and proliferated over the years, being applied to different scientific questions, they have diverged from one another, emphasizing different observations and relevant physics. The Hybrid Heliospheric Modeling System with Pickup Protons (HHMS-PI) model, for example, combines a magnetohydrodynamic (MHD) model of the solar wind [Detman et al., 2006] with a numerical description of pickup protons from neutral hydrogen [Intriligator et al., 2012]. Our global model, CORHEL (corona-heliosphere), couples a range of coronal models each driven by the observed line-of-sight photospheric magnetic field, with several solar wind models to compute the 3-D structure of the inner heliosphere [Riley et al., 2012a].
 With the notable exception of the operational version of the Wang-Sheeley-Arge (WSA)-Enlil model (www.bu.edu/cas/news/press-releases/cism/), global solar and heliospheric models share a basic common goal: to understand the physical processes responsible for the solar wind we observe at 1 AU and elsewhere. In other words, they are science-based models, not operational tools. However, as these models have matured over the last 45 years, they have transitioned from simple tools allowing us to understand fundamental processes to being able to accurately reproduce (at least under some conditions) solar wind conditions near Earth. As such, they are poised to transition into predictive models, and it appears timely and appropriate to assess the quality and robustness of these solutions.
 Initial assessments of solar wind model solutions came in the form of visual comparisons of observed and modeled time series [e.g., Riley et al., 2001]. More recently, studies have attempted to find usable metrics to estimate the quality of the solution. For example, Owens et al. [2008] computed mean square errors (MSEs) and correlation coefficients (CCs) over an 8 year period for a combination of solar and heliospheric models. They also considered an event-based approach to model validation by tracking the timing of high-speed streams. In addition to emphasizing a parameter that is of high interest to space weather forecasters, such a metric also avoids the problem that relatively small offsets in the arrival of high-speed streams can have a disproportionate effect on the statistical skill score produced by MSEs or CCs.
 Thus far, “ensemble” modeling techniques have only been applied to global heliospheric problems in a rudimentary way. By way of illustration, we provide three examples. First, our group regularly computes several ambient coronal and heliospheric solutions for each Carrington rotation (CR) [Riley et al., 2012b]. Each solution is driven by a synoptic map derived from one of up to seven solar observatories, and the predicted in situ measurements at 1 AU can vary substantially. Second, Jian et al. [2011] compared the output from a handful of global heliospheric runs in which the input synoptic magnetogram, the coronal model, and even the version of the model were varied, finding that while there was a notable improvement from use of the latest models, it was difficult to determine which combination of synoptic map/coronal model was superior. And third, Lee et al. [2012] studied the 15 February 2011 halo CME by constructing an “ensemble” of solutions by (1) varying the input parameters for the WSA-Enlil cone model and (2) using several input synoptic magnetograms. While these studies can be illustrative in pointing out the sensitivity of the solutions to the inputs and free parameters, they provide no rigorous or quantitative feedback for improving the solutions. Because of this, it is probably misleading to associate the term “ensemble” with these types of investigations, which are perhaps better described more generically as parametric studies. (Even a basic definition of “ensemble” goes beyond a mere collection or group of objects. It refers to a group of complementary parts that contribute to a single effect, such as an “ensemble” of musicians.)
 Ensemble forecasting is more rigorously defined as a method of prediction that relies on the use of a representative sample of possible future states to derive a prediction. One of the appealing aspects of such an approach is that it offers a rigorous method for computing confidence bounds of the solution by estimating the uncertainty in the ensemble [Wilks, 2006]. Moreover, the mean of the ensemble of forecasts is or should be more accurate than the forecast from any individual member, the reason being that the random or unpredictable regions of the forecast tend to cancel one another while the aspects of the forecast that the majority of the models agree on are not removed [Warner, 2010]. In its simplest interpretation, ensemble modeling is essentially a method of nonlinear filtering [Wilks, 2006]. A further feature of ensemble forecasting is that the probability distribution function of a variable can be used to infer information about more extreme events.
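 The error-cancellation argument can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the synthetic “truth” profile, the five members, and the independent Gaussian errors of 50 km/s are hypothetical and are not drawn from any CORHEL run. The mean square error of the ensemble mean can never exceed the average mean square error of the members; beating the best individual member, as happens here, is not guaranteed in general and relies on the member errors being largely independent.

```python
import math
import random
import statistics


def rmse(pred, obs):
    """Root mean square difference between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


random.seed(0)  # fixed seed so the illustration is reproducible
n_points, n_members = 200, 5

# Hypothetical "true" speed profile (km/s): slow wind with one smooth fast stream.
truth = [400.0 + 200.0 * math.exp(-((i - 100) / 20.0) ** 2) for i in range(n_points)]

# Assumption: each member equals the truth plus independent Gaussian error.
members = [[v + random.gauss(0.0, 50.0) for v in truth] for _ in range(n_members)]

# Point-by-point ensemble mean and spread (sample standard deviation).
ensemble_mean = [statistics.fmean(col) for col in zip(*members)]
ensemble_sd = [statistics.stdev(col) for col in zip(*members)]

member_rmse = [rmse(m, truth) for m in members]
ensemble_rmse = rmse(ensemble_mean, truth)

print("member RMSEs :", [round(r, 1) for r in member_rmse])
print("ensemble RMSE:", round(ensemble_rmse, 1))
```

 The spread `ensemble_sd` is the quantity from which confidence bounds on the ensemble solution would be estimated.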
 For terrestrial weather forecasting, the uncertainty in the solutions can be divided into two main causes. The first is due to imperfect initial conditions. Unlike the Earth's lower atmosphere, however, small errors in the initial conditions in the solar corona are not likely to cause a substantial divergence of the solutions. Thus, the proverbial butterfly flapping its wings in the corona will, at worst, be fried to a crisp, and the remains convected out into the solar wind. Our “initial condition” issue then is replaced by a “boundary condition” problem. Specifically, the use of line-of-sight magnetograms from different solar observatories can lead to substantially different solutions [Riley et al., 2012a].
 The second source of errors can be termed “model formalism.” This includes the physical processes that are included in the algorithms, as well as the parameterization of the model. For example, coronal models typically rely on either Potential Field Source Surface (PFSS) or MHD approximations, which have unique advantages and disadvantages [Riley et al., 2006]. Additionally, time-dependent effects can also alter coronal structure through field shear and twist. These effects cannot be captured by PFSS models or even equilibrium MHD models, instead requiring a time-dependent solution driven by evolving photospheric magnetic fields [Riley et al., 2006]. Finally, different sets of free parameters must be specified within each [Riley and Luhmann, 2012].
 In this report, we compute a representative set of model realizations for one relatively quiescent time period (CR 2062) to introduce and explore the utility of applying ensemble modeling techniques to global solar and heliospheric numerical models. These are tentative first steps, which we hope will lay the foundation for more comprehensive studies in the future.
2 Modeling Approach
 The process of generating a global solution describing the plasma and magnetic field properties of the corona and inner heliosphere is relatively complicated. Figure 1 illustrates the main decisions one must make if using the CORHEL global coronal and heliospheric modeling suite [Riley et al., 2012a]. One first chooses a synoptic map derived from observations by one of seven solar observatories. These raw maps must then be processed to provide sufficiently robust radial magnetic field boundary conditions for the model. The procedure involves smoothing the maps, balancing the flux, and extrapolating observed mid-latitude fields to the poorly or even completely obscured polar regions, a process that can significantly impact the solutions [Riley et al., 2012a]. Next, one chooses a particular coronal model. Currently, the choices are a PFSS model and one of two types of MHD models. In the simpler polytropic MHD algorithm, the energy equation is replaced by a polytropic relationship, and γ is set to 1.05. The resulting solutions are relatively accurate in terms of the large-scale structure of the magnetic field but inaccurate with respect to the properties of the plasma. In contrast, the thermodynamic MHD approach, while computationally more challenging, yields much more accurate plasma properties. These solutions can then be used to drive, either directly or indirectly, a heliospheric model. Here we show two possible models: Enlil and MAS (Magnetohydrodynamics Algorithm outside a Sphere). We do not believe that this portion of the modeling chain is particularly sensitive to the type of model used, and in fact, to these options, we could add an even simpler mapping technique to evolve the plasma and field from 30 R_S to 1 AU [Riley and Lionello, 2011]. Associated with each model is a distinct set of free parameters. In the polytropic MHD approach, for example, we are free to specify the plasma temperature and density at the base of the corona. The resolution of the solution is also effectively a free parameter of the model, but one to which the results should not be sensitive.
3 Modeling Results
 To illustrate the concepts of ensemble modeling, as applied to the ambient structure of the solar wind, we computed ∼40 realizations using the CORHEL model suite. For each one, we chose to vary (1) the input synoptic magnetogram (GONG, MDI, MWO, SOLIS, or WSO), (2) the coronal model (MAS or WSA/PFSS), (3) the base temperature (1.8 × 10⁶ K or 2.5 × 10⁶ K), and (4) the spatial resolution (101 × 100 × 128 or 201 × 150 × 256). In practice, of course, there are an almost limitless number of choices that could be varied, leading to a potentially intractable number of solutions. We chose this limited set because we believe the model results are relatively sensitive to them and they illustrate the essential features of ensemble modeling techniques. An additional pragmatic constraint is that ensemble modeling of the global solar wind can be computationally expensive. Our higher-resolution simulations typically require ∼25,000 h of Central Processing Unit (CPU) time on either NASA's Pleiades or NSF's Ranger supercomputers. Thus, ensemble modeling techniques can, in principle, quickly consume a large fraction of our yearly allocation.
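 The size of this parameter space can be sketched by enumerating the four choices listed above. The dictionary keys and variable names are hypothetical, and the full factorial combination is an assumption: the text states only that ∼40 realizations were computed, and the base temperature is a free parameter of the polytropic MHD model, so not every combination below need correspond to an actual run.

```python
from itertools import product

# The four varied inputs, with values taken from the text.
magnetograms = ["GONG", "MDI", "MWO", "SOLIS", "WSO"]
coronal_models = ["MAS", "WSA/PFSS"]
base_temperatures_K = [1.8e6, 2.5e6]
grids = [(101, 100, 128), (201, 150, 256)]

# Assumption: a full factorial design (every value paired with every other).
realizations = [
    {"magnetogram": m, "coronal_model": c, "base_temperature_K": t, "grid": g}
    for m, c, t, g in product(magnetograms, coronal_models, base_temperatures_K, grids)
]

print(len(realizations))  # 5 * 2 * 2 * 2 = 40
```

 Adding one more two-valued parameter would double the count, which is why the number of solutions quickly becomes intractable.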
 A selection of realizations for CR 2062 is summarized in Figure 2. These maps show solar wind speed at 30 R_S as a function of longitude (x axis) and latitude (y axis). The panel titles summarize the parameters that were modified. In particular, we (1) used magnetograms from GONG, MDI, MWO, SOLIS, and WSO; and (2) constructed solutions using both our MHD coronal model (MAS) and the WSA/PFSS model. These two models represent more than just different numerical approaches. In addition to different assumptions, such as the existence of a source surface, they incorporate two distinct ideas for the origin of the solar wind [Riley and Luhmann, 2012] and so produce different values of the bulk solar wind speed at the inner boundary of the heliospheric model. The solutions are, at least superficially, similar: fast solar wind at high latitudes and slower wind organized about the heliomagnetic equator. Beyond this, however, there are some notable differences. First, the structure of the band of slow wind differs dramatically from one model to another. The WSA solutions, for example, do not tend to produce slow-flow arcs that split from the main slow-flow band and then rejoin. In fact, they predict the opposite: wind that is faster than that predicted from deep within polar coronal holes. These regions are associated with pseudo (or unipolar) streamers, and one of the distinguishing features between the two ideas for the origin of the slow solar wind is whether plasma from these locations is fast [Wang et al., 2007] or slow [Riley and Luhmann, 2012].
 Given these solutions, and with no a priori information about which is better or worse, we can estimate the variance between them and thus infer where the model solutions agree and disagree [Epstein, 1969]. Figure 3 summarizes the standard deviation (s.d., i.e., the square root of the variance) for CR 2062, using the realizations shown in Figure 2. The ensemble mean is also overlaid, but its interpretation is more difficult here and we defer a discussion of it until later. This presentation of the s.d. emphasizes several interesting points. First, away from the “band of solar wind variability,” all realizations predict fast, quiescent solar wind, in agreement with Ulysses polar observations [e.g., Riley et al., 1997] and the corresponding s.d. is effectively zero. Second, near the equator, the models agree best in an “island” extending from 60° to 240°, which traces the locus of the heliospheric current sheet (not shown). Third, the realizations disagree the most between ∼130° and 180° longitude at mid-southern heliographic latitudes. This maps back to the location of a pseudo streamer (Figure 2). Fourth, the degree to which the models differ is substantial: In many regions where near-Earth spacecraft would sample, the s.d. is of the order of ∼100 km s⁻¹.
 The information shown in Figure 2 can also be summarized in more detail for specific trajectories through the region. In Figure 4, we show the ensemble solution, that is, the average speed through the heliographic equator, together with various statistical summaries, which are explained in the figure caption. Of particular note is that the variance is lowest when the speed is lowest and highest when the speed is highest. (This is also apparent but more difficult to discern in Figure 3.) This makes intuitive sense: It is easier for models to agree when there is no structure, such as long periods of slow or fast wind, but more difficult to predict the location (phase) of sharp stream fronts, that is, the boundaries between slow and fast wind. This is particularly true at stream interfaces, where fast wind is flowing into slower wind. At such locations, the gradient from slow to fast is sharp, and so small offsets in the location of this boundary will translate into large model variances.
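 The link between phase offsets and model variance can be made concrete with a short sketch. It assumes five hypothetical realizations that are identical except for the longitude of a single stream front; the tanh profile, its 350 to 650 km/s range, and the front positions are invented for illustration and are not model output. The five-number summary computed at each longitude is the kind of information a “whisker” plot displays.

```python
import math
import statistics


def speed_profile(lon_deg, front_deg):
    """Hypothetical stream front: slow (350 km/s) to fast (650 km/s) wind."""
    return 500.0 + 150.0 * math.tanh((lon_deg - front_deg) / 5.0)


longitudes = list(range(0, 360, 2))
front_positions = [170.0, 175.0, 180.0, 185.0, 190.0]  # members differ only in phase
members = [[speed_profile(lon, f) for lon in longitudes] for f in front_positions]


def five_number(values):
    """Minimum, quartiles, and maximum of one column of member values."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)


# Spread and whisker statistics across members at every longitude.
sd = [statistics.pstdev(col) for col in zip(*members)]
whiskers = [five_number(col) for col in zip(*members)]

peak_lon = longitudes[sd.index(max(sd))]
print("largest s.d. at longitude", peak_lon, "deg:", round(max(sd), 1), "km/s")
print("s.d. far from the front:", round(sd[0], 6), "km/s")
```

 Offsets of only a few degrees in the front position produce a spread above 100 km/s at the front itself, while the spread vanishes where all members agree on structureless slow or fast wind.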
4 Comparison With Observations
 Using the results from the heliospheric portion of the simulations, we can compare the ensemble solution with in situ measurements. Figure 5 compares in situ measurements (red) of the bulk solar wind speed as measured by the ACE spacecraft with the ensemble model solution (black) at 1 AU in the ecliptic plane. Again, the boxes with “whiskers” summarize the statistical properties of the realizations. We note that the ensemble model has captured the three streams (located between 0° and 240° longitude) but misses the region of prolonged low-speed wind from 240° to 360°. This is, we believe, an intrinsic limitation of using synoptic magnetograms to compute solutions for what are fundamentally time-dependent phenomena. We note also that the individual realizations (not shown) displayed considerably more variation both in magnitude and phasing of the streams. Thus, one of the basic effects of ensemble averaging is to reduce the amplitude of the variations.
 A simple technique for assessing the performance of each realization as well as the ensemble solution is to compute the root mean square error (RMSE) or difference between in situ measurements and the model results. Table 1 summarizes these values when compared with ACE measurements. The “persistence” model simply involves taking the observed average solar wind speed during the entire Carrington rotation and using this as the prediction of speed for the interval. Thus, strictly speaking, it is really just the standard deviation (s.d.) of the ACE measurements. It should also be noted that “persistence” in a meteorological sense typically refers to, say, a forecast that some future weather pattern will be the same as a pattern already observed. Our definition here, nevertheless, turns out to be a relatively stringent requirement; requiring that the model beat the s.d. of the observations is quite challenging. The results in Table 1 suggest that there are modest differences in the RMSEs from different observatories and that the RMSE for the ensemble solution is lower than any of the realizations, including persistence, which is of course encouraging. It is somewhat alarming that persistence appears to beat some of the individual realizations. However, it must be realized that phase offsets within what are otherwise good predictions (say, predicting all three streams but being offset by 12 h) can lead to large RMSEs. Additionally, and perhaps more importantly, the flat-line prediction of the “persistence” model has no useful predictive information: Predicting a large, fast stream, even only to within a day or so, has considerably more intrinsic scientific worth and, from an operational point of view, potentially more actionable value.
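 Both points, that the flat-line “persistence” RMSE equals the s.d. of the observations and that a phase offset alone inflates the RMSE of an otherwise good prediction, can be checked with a minimal sketch. The three-stream “observed” series and the 12 h shift are hypothetical stand-ins, not ACE data or model output.

```python
import math
import statistics


def rmse(pred, obs):
    """Root mean square difference between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def speed(t_hours, shift_hours=0.0):
    """Hypothetical speed (km/s): three Gaussian streams on a 400 km/s background."""
    centers = (100.0, 300.0, 500.0)
    return 400.0 + sum(
        250.0 * math.exp(-((t_hours - c - shift_hours) / 24.0) ** 2) for c in centers
    )


hours = range(27 * 24)  # roughly one Carrington rotation, sampled hourly
observed = [speed(t) for t in hours]

# "Persistence" as defined in the text: the rotation-averaged observed speed.
flat_line = [statistics.fmean(observed)] * len(observed)

# A model that captures all three streams but arrives 12 h late.
shifted = [speed(t, shift_hours=12.0) for t in hours]

rmse_flat = rmse(flat_line, observed)
rmse_shifted = rmse(shifted, observed)

print("flat-line RMSE:", round(rmse_flat, 1), " s.d. of obs:", round(statistics.pstdev(observed), 1))
print("12 h shifted RMSE:", round(rmse_shifted, 1))
```

 The shifted series reproduces every stream exactly, so its entire RMSE comes from timing, which is the motivation for event-based metrics alongside point-by-point ones.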
Table 1. Root Mean Square Errors for Model Realizations for CR 2062.
5 Assessing Model Performance and Sensitivity: Taylor Diagrams
 Even the relatively limited number of realizations we have discussed here produces a considerable amount of data that can be difficult to digest and assess. To concisely summarize the degree to which the model results match observations across many realizations, Taylor [2001] developed what has come to be known as a “Taylor diagram,” which combines correlation, centered RMSE, and variances.
 Figure 6 summarizes five high-resolution runs using the MAS model, driven by data from one of five solar observatories. The radial distance from the origin indicates standard deviation, while the angular direction shows correlation coefficient. The observations thus become a single point on the x axis (black diamond) since they are exactly correlated with themselves, and the model results (red diamonds) are distributed at various locations in this parameter space. Both the correlation of the model results with the data and the degree to which its variance matches the observations are key aspects for providing useful forecasts. Additionally, it turns out that when the correlation coefficient is displayed as its arc cosine, there is a simple geometric relationship between the centered RMSE, the s.d., and the correlation coefficient. (Note that the difference between the centered RMSE and the RMSE in Table 1 is that in the former, prior to computing the RMSE, the average values of the data and model are first subtracted.) Specifically, contours of centered RMSE lie on circles centered around the observation point (dashed circles in Figure 6). Thus, the optimum model is the one that lies closest to the observation point, or, equivalently, on the smallest radial dashed circle. Solutions with high correlation coefficient, then, are penalized in this view if the s.d. is substantially smaller or larger than the s.d. of the observations. Model 4, for example, although having a lower correlation coefficient than model 2, provides a closer match in terms of variance and is thus arguably better. Similarly, the ensemble mean of these five cases produces a marginally better solution than the best individual realization (model 5), which has a higher correlation coefficient.
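 The geometric relationship is the law of cosines: with model and observed standard deviations σ_m and σ_o and correlation coefficient R, the centered RMSE E′ satisfies E′² = σ_m² + σ_o² − 2 σ_m σ_o R [Taylor, 2001]. The sketch below computes the three plotted statistics and checks this identity numerically. The function name `taylor_stats` and the two short speed series are hypothetical; the values are invented for illustration and are not taken from Table 1 or Figure 6.

```python
import math
import statistics


def taylor_stats(model, obs):
    """Return (sd_model, sd_obs, correlation, centered RMSE) for two series."""
    n = len(obs)
    m_mean, o_mean = statistics.fmean(model), statistics.fmean(obs)
    m_anom = [m - m_mean for m in model]  # means removed, hence "centered"
    o_anom = [o - o_mean for o in obs]
    sd_m = math.sqrt(sum(a * a for a in m_anom) / n)
    sd_o = math.sqrt(sum(b * b for b in o_anom) / n)
    corr = sum(a * b for a, b in zip(m_anom, o_anom)) / (n * sd_m * sd_o)
    crmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(m_anom, o_anom)) / n)
    return sd_m, sd_o, corr, crmse


# Hypothetical speeds in km/s.
obs = [420.0, 380.0, 510.0, 640.0, 560.0, 430.0, 390.0, 450.0]
model = [400.0, 410.0, 470.0, 600.0, 610.0, 480.0, 400.0, 420.0]

sd_m, sd_o, corr, crmse = taylor_stats(model, obs)

# Distance on the diagram between the model point (radius sd_m, angle acos(corr))
# and the observation point (radius sd_o, angle 0).
law_of_cosines = math.sqrt(sd_m ** 2 + sd_o ** 2 - 2.0 * sd_m * sd_o * corr)

print("sd_model, sd_obs:", round(sd_m, 1), round(sd_o, 1))
print("correlation:", round(corr, 3))
print("centered RMSE:", round(crmse, 1), " from law of cosines:", round(law_of_cosines, 1))
```

 Because the distance to the observation point depends on both the angle and the radius, a model with a high correlation but a mismatched s.d. can sit farther from the observations than one with a lower correlation and a well-matched s.d.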
 Using the Taylor diagram, we can assess changes in the performance of a model resulting from modifications to it. For example, in Figure 7, we explore the effects of reducing the base coronal temperature in the polytropic MAS model from 1.8 × 10⁶ K to 1.1 × 10⁶ K, that is, a reduction by one third. We emphasize that this change is primarily to illustrate the application of ensemble modeling techniques: In reality, we do not believe such large changes would lead to solutions that are more consistent with either remote solar observations or in situ measurements. The red diamonds mark the reference solutions, and the blue squares mark the location of the new solutions, with the arrows connecting one to the other. Even without reviewing the individual time series, we can see that all variability in the solar wind solutions has been removed. Physically, we can understand this change as follows: By decreasing the coronal-base temperature, the thermal pressure of the plasma is reduced. Fewer magnetic field lines are opened up, which in turn causes more of the smaller equatorial coronal holes to close down. Thus, there are fewer fast streams at low latitudes penetrating the otherwise slower wind associated with the streamer belt, resulting in less variability in the speed of the wind there. It is worth noting, however, that at least on the basis of the centered RMSE, the solutions are no worse.
 Another possibility we can explore is an increase in temperature. Again, for the purposes of illustration, we increase the base temperature by one third, from 1.8 × 10⁶ K to 2.5 × 10⁶ K. This is summarized in Figure 8. Here there is a clear tendency for the solutions to become worse with a higher base temperature. More field lines are opened at lower latitudes, resulting in larger equatorial coronal holes. Slow wind still flows from the edges of open-closed field lines, however, so the variability of the wind is still maintained. The decrease in correlation results from the broadening and misalignment of high-speed streams.
 As a final example, in Figure 9, we show how the solutions change when the number of grid points in the simulation is increased from 101 × 100 × 128 to 201 × 150 × 256. Reassuringly, this change results in only a modest change in the performance of the solutions, suggesting that we can explore ensemble techniques at computationally feasible resolutions. In the future, however, as we constrain other, as yet poorly known free parameters, it may be necessary to evaluate the effects of less sensitive factors such as spatial resolution on the quality of the solutions.
 In this report, we have applied some relatively simple terrestrial weather ensemble forecasting concepts to models of the ambient solar corona and inner heliosphere. Although there are some fundamental differences between the Earth's lower atmosphere and the Sun's corona, our results suggest that ensemble modeling techniques can be adapted and successfully applied to assess and improve the performance of global solar and heliospheric models. These are clearly “first steps,” but ones that we believe lay a foundation from which more rigorous methods can be developed. We showed that at least for the time period analyzed and the models/parameters chosen, the ensemble prediction outperforms any of the individual realizations. Furthermore, using “Taylor” diagrams, we were able to assess the relative merits of each of the individual solutions as compared with the ensemble mean. Ultimately, it may be possible to implement these techniques at “third party” institutions, such as NASA's Community Coordinated Modeling Center (CCMC). This would allow users and developers to both track the intrinsic improvement within a single model and compare the relative strengths and weaknesses of different models.
 Our numerical “experiments” to estimate the sensitivity of the model results to both boundary conditions and model formalism suggest that substantial improvements in performance can be achieved by careful exploration of the relevant parameter space. Three substantial problems, however, pose significant and potentially long-term handicaps for improving the quality of the model solutions. First, our limited view of the Sun's poles forces us to either extrapolate mid-latitude, well-observed data poleward [Riley et al., 2012a] or use flux transport models that attempt to reconstruct the polar regions using physical transport processes coupled with data assimilation [Arge et al., 2010]. Second, our current reliance on synoptic maps (which are built up from daily observations taken over 25.38 days) as inner radial boundary conditions cannot be rigorously defended: The model implicitly assumes that this map is synchronic; that is, it is a snapshot of the surface of the photosphere at one point in time, which it clearly is not. Furthermore, because the first and last slices for each Carrington rotation are separated by more than 25 days, the model's requirement that the azimuthal boundary be periodic necessarily leads to artifacts. Third, our assumption that the Sun's structure does not change appreciably during the course of a solar rotation is an idealization that is not met in reality. Sequences of synchronic maps derived from flux transport models could be used to drive a time-dependent model of the ambient solar wind which may eliminate or at least mitigate all of these concerns. Ideally, of course, a “sentinels” mission consisting of a fleet of spacecraft, each carrying a magnetograph, which together could image the entire solar surface simultaneously, would provide the ultimate boundary conditions for global models. 
In reality, our first step towards realizing this vision may come from combining magnetograms obtained from Solar Orbiter (currently scheduled for launch in January 2017) with those from near-Earth space and/or ground-based observatories.
 The authors gratefully acknowledge the support of NASA (Causes and Consequences of the Minimum of Solar Cycle 24 program, LWS Strategic Capabilities program, Heliophysics Theory Program, and the STEREO IMPACT team) and NSF (Center for Integrated Space Weather Modeling (CISM) program).