## 1 Introduction

[2] Atmospheric convection drives weather and climate in the tropics as well as the global general circulation. It is relevant across a wide range of spatial and temporal scales, from large-scale phenomena such as the Inter-Tropical Convergence Zone, the El Niño-Southern Oscillation, and the Madden-Julian Oscillation to weather-scale features such as individual squall lines and mesoscale convective systems. Numerical models remain limited in their ability to capture convective phenomena. Particular examples include biases in the tropical mean precipitation distribution [*Sun et al.*, 2006; *Zhang et al.*, 2007] and significant timing errors in the diurnal cycle of convection over land [*Yang and Slingo*, 2001]. Such shortcomings in model simulations have been related to the model representation of convection [e.g., *Neale et al.*, 2008; *Bechtold et al.*, 2008; *Zhang et al.*, 2006; *Neale and Slingo*, 2003; *Wang and Schlesinger*, 1999], largely because of the limitations of the convective parameterizations used to represent the subgrid-scale behavior of convection in relation to the resolved large-scale processes. Accurate representation of convection is particularly important for the tropics, where precipitation is generally associated with convective cloud systems.

[3] Convective parameterizations (see *Arakawa* [2004] for a full review of convective parameterization approaches) generally exploit some relationship between the large scale, given by the atmospheric state at the model grid box scale, and the convective scale. Most schemes assume that the two scales are in quasi-equilibrium [*Arakawa and Schubert*, 1974; *Emanuel*, 1991; *Brown and Bretherton*, 1997] and use this assumption to close the model equations. A variable that characterizes the thermodynamic state of the atmosphere, such as Convective Available Potential Energy (CAPE), is often used to determine convective strength. CAPE is the vertical integral of the temperature perturbation of a buoyant air parcel ascending from near the surface to its level of neutral buoyancy. What is somewhat lacking is a comprehensive investigation of other possible relationships, across a wide range of large-scale and small-scale variables, that could be used in the closure of convective parameterizations.
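For reference, one common textbook form of the CAPE integral is sketched below. The symbols are assumptions introduced here, not notation from this paper: $g$ is the gravitational acceleration, $T_{v,p}$ and $T_{v,e}$ are the virtual temperatures of the ascending parcel and of its environment, $z_0$ is the near-surface starting height of the parcel, and $z_{\mathrm{LNB}}$ is its level of neutral buoyancy. Definitions in the literature differ in details such as the lower limit of integration and the use of virtual temperature, so this should be read as illustrative rather than as the exact definition used in this study.

$$
\mathrm{CAPE} = g \int_{z_0}^{z_{\mathrm{LNB}}} \frac{T_{v,p}(z) - T_{v,e}(z)}{T_{v,e}(z)} \, dz
$$

The integrand is the parcel buoyancy per unit mass, so CAPE has units of J kg$^{-1}$.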

[4] Another possible limitation of convective parameterizations (and of parameterizations in general) is that they determine the subgrid-scale convective behavior deterministically, meaning that for a given large-scale state, only one possible convective state can be attained. This is unlikely to be true in the real atmosphere, but traditional parameterizations cannot produce variability about their mean relationship between the two scales. Several cloud-resolving model (CRM) studies have identified variability in the relationships between the large and small scales [e.g., *Cohen and Craig*, 2006; *Shutts and Palmer*, 2007; *Plant and Craig*, 2008]; to our knowledge, however, no observational studies have investigated the stochastic nature of these relationships. There have been several attempts to include stochastic elements in the description of convection in models. *Buizza et al.* [1999] showed that applying multiplicative noise to the physics tendencies improved model skill. *Lin and Neelin* [2007] used empirical relationships to adjust the convective parameterization. *Khouider and Majda* [2006] used a Markov chain lattice to stochastically describe the evolution of convective cloud types in a model grid cell. *Plant and Craig* [2008] developed a fully stochastic convective parameterization. These studies relied either on assumed empirical relationships or on higher-resolution models, such as CRMs, to study the stochastic nature of the relationships. This study aims to supplement this earlier work by providing observations of the key relationships and by quantifying their stochastic components.
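The contrast between a deterministic tendency and one perturbed by multiplicative noise can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function name `perturb_tendency`, the uniform distribution and its range, and the tendency values are hypothetical and are not taken from any of the schemes cited above.

```python
import random


def perturb_tendency(tendency, rng, low=0.5, high=1.5):
    """Scale a deterministic tendency profile by one random factor.

    The factor is drawn uniformly from [low, high]; this range is an
    illustrative assumption, not a documented setting of any scheme.
    """
    factor = rng.uniform(low, high)
    return [factor * t for t in tendency]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical parameterized tendencies at three model levels.
deterministic = [1.0, 2.0, -0.5]

# The same large-scale state now maps to a spread of possible tendencies.
ensemble = [perturb_tendency(deterministic, rng) for _ in range(1000)]

# Because the factor averages to 1, the ensemble mean stays close to the
# deterministic value while individual members vary about it.
mean_first = sum(member[0] for member in ensemble) / len(ensemble)
```

A single factor is applied to the whole profile here, so the vertical structure of the tendency is preserved and only its amplitude varies from member to member.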

[5] In this study, we first develop two concurrent data sets, one representing the large-scale atmosphere and another the small-scale convective state, over a sufficiently long time period to sample a large range of different states. These data sets are then used to investigate important relationships between the two scales and furthermore to determine the stochastic nature of the relationships. Section 2 describes the methodology used to derive data sets for the large-scale atmospheric state and the concurrent small-scale convective state. Section 3 then discusses some key relationships between the two scales that are relevant for convective parameterizations. The stochastic nature of these relationships is probed in section 4. The following sections then discuss the results (section 5) and summarize the main conclusions (section 6).