Can Recurrence Quantification Analysis Be Useful in the Interpretation of Airborne Turbulence Measurements?

In airborne data or model outputs, clouds are often defined using information about Liquid Water Content (LWC). Unfortunately LWC is not enough to retrieve information about the dynamical boundary of the cloud, that is, volume of turbulent air around the cloud. In this work, we propose an algorithmic approach to this problem based on a method used in time series analysis of dynamical systems, namely Recurrence Plot (RP) and Recurrence Quantification Analysis (RQA). We construct RPs using time series of turbulence kinetic energy, vertical velocity and temperature fluctuations as variables important for cloud dynamics. Then, by studying time series of laminarity (LAM), a variable which is calculated using RPs, we distinguish between turbulent and non‐turbulent segments along a horizontal flight leg. By selecting a single threshold of this quantity, we are able to reduce the number of subjective variables and their thresholds used in the definition of the dynamical cloud boundary.


Introduction
Understanding atmospheric turbulence is crucial to properly describe transport processes in the Atmospheric Boundary Layer (ABL) and in the free atmosphere, mixing processes in clouds, and even rain production (Bodenschatz et al., 2010).Airborne measurements provide direct information on turbulent velocities and associated fluctuations of thermodynamic fields (Wendisch & Brenguier, 2013).Statistical interpretation of these measurements is, however, problematic.Time series collected in the course of airborne measurements represent conditions along a one-dimensional, complicated trajectory inside evolving and coupled fields of momentum, temperature, humidity and cloud particles.In contrast to laboratory experiments, data collected in the course of a flight do not contain information on the boundary and initial conditions of the flow, on external forcings and internal feedbacks.There is even no general rule to distinguish to what extent the observed fluctuations can be contributed to turbulence, waves, or inhomogeneities of non-turbulent nature.Thus an objective division of the collected data series into segments representing similar stages/properties of the flow is crucial to construct conditional statistics of measured variables to characterize phenomena along the flight trajectory.
Typically, additional information and assumptions are necessary to construct a reasonable conditional statistics.To explain this in more detail, consider for example, cloud/clear air turbulent mixing.While the cloud microphysical boundary can be defined in a data series by a selected threshold of liquid water content (LWC) or droplet number concentration (DNC) (Malinowski & Zawadzki, 1993), cloud dynamical boundary, that is, the region around a cloud where the flow is affected by dynamical (Ackerman, 1958) and thermodynamical (Radke & Hobbs, 1991) processes related to the cloud, is not unique.In airborne measurements proposing through stratocumulus cloud top (Malinowski et al., 2013) three parameters: turbulence kinetic energy, potential temperature gradient and wind shear, were used to distinguish between the four different layers in the cloud top vicinity.Neither the parameters, nor threshold values could be considered objective, yet the division was algorithmic and allowed to describe many properties of turbulence and turbulent mixing in stratiform clouds.Such divisions are even more difficult in case of convective clouds penetrated horizontally.Heus and Jonker (2008) have shown the existence of shells around clouds where dynamical interactions occur.While cloud modelers who obtain two-or three-dimensional thermodynamic fields may use such parameters as enstrophy (Mellado, 2017) or more advanced numerical approaches (Nair et al., 2021) to obtain a border between turbulent cloud and non-turbulent environment, a similar approach cannot be adopted to airborne data along a 1-D trajectory.
This work describes an effort to construct a robust, objective method to classify time series collected in the course of the flight by "turbulent" (dynamic) properties using a small number of a-priori assumptions.The idea is to apply a technique which is widely adopted in nonlinear time series analysis called Recurrence Quantification Analysis (RQA) (Marwan et al., 2007), which can be used to study transitions in the system.It revolves around the concept of recurrence of the dynamical system into the same regions of the phase space, which is one of the fundamental aspects in the analysis of such systems.It has found applications in many fields such as physiology, neuroscience, economy, and earth sciences.There are many examples of the usage of the Recurrence Plot method, but in this study, we mainly refer to a comprehensive review by Marwan et al. (2007), containing the necessary information about the method and an in-depth look at its variations and its applications.
In the following analysis, dynamic (turbulent) properties of clouds and clear air in between are characterized by three variables, representing important dynamical processes in clouds: turbulence, characterized by its kinetic energy TKE, mixing, characterized by temperature fluctuations T′ and vertical transport (updrafts and downdrafts) characterized by vertical velocity w′.These variables, normalized according to the RQA approach, form a 3-D phase subspace.Recurrences of a vector with these three variables into the same regions of this phase subspace, indicate segments of the time series representing similar dynamic properties of the flow.
To test applicability of this approach we use airborne data collected in the course of the EUREC 4 A (Elucidating the role of clouds-circulation coupling in climate) experiment (Stevens et al., 2021) with the British Antarctic Survey (BAS) Twin Otter (TO) research aircraft, equipped to measure turbulence, temperature fluctuations, and cloud properties.Cloud mask defined by a threshold of LWC, is added to the comparison, allowing turbulence inside and outside of cloud to be distinguished.
The structure of the article is the following.In Section 2 we describe the data, Section 3 discusses the details of the time-dependent RQA method adopted, and in Section 4 the results are presented and discussed.Final conclusions can be found in Section 5.

Data Description
EUREC 4 A (Stevens et al., 2021), was a measurement campaign conducted in late January and early February 2020, with a focus on the study of coupling between atmospheric boundary layer and ocean surface, and the study of marine cumulus clouds and their spatial organization near Barbados.One of the research aircraft present was the British Antarctic Survey's Twin-Otter (TO).The aircraft flew at different altitudes performing cloud penetrations, collecting data between the clouds as well as in the atmospheric boundary layer (ABL), even near the ocean surface.
The TO was equipped with Meteorological Airborne Science INstrumentation (MASIN) gathering data at two frequencies: 1 and 50 Hz.The data set contains GPS plane position, pressure altitude, three components of wind velocity from the Best Air Turbulence (BAT) probe (Crawford & Dobosy, 1992;Garman et al., 2006), temperature and H 2 O molar ratio.There were cloud spectrometers, measuring cloud microphysical properties, as well as an additional instrument to measure temperature fluctuations, the Ultra-Fast Thermometer 2b (UFT-2b) from the UFT family (Kumala et al., 2013) with an effective resolution reaching 2000 Hz.Not all collected time series were suitable for analysis due to instrumental problems.The UFT-2b provided useful data from 7 flights, in 5 of these cloud penetrations were recorded.In these flights three 50 Hz velocity components (turbulence data) from the BAT (Best Air Turbulence) probe were available, yet in some segments of flights with deteriorated resolution.Records from BAT and UFT-2b were used in this study.These instruments were co-located within 10 cm distance on the nose boom of the aircraft, thus after averaging temperature fluctuations to match the 50 Hz signal from the BAT probe (corresponding to ∼1.3 m spatial resolution at 65 m/s true air speed (TAS)) data can be considered as collected in the same volume.
Additionally, LWC from cloud spectrometers, located under the wings (∼5 m apart) was used to define cloud mask.In the first two flights it was the data from the Fast-Forward Scattering Spectrometer Probe (FFSSP) (Baumgardner et al., 1993;Brenguier et al., 1998) with 50 Hz resolution, and for the remaining three, from Cloud Droplet Probe (CDP) (Lance et al., 2010) with 1 Hz resolution, which is equal to around ∼65 m spatial resolution.A total of 246 clouds defined as segments with LWC > 10 3 g/kg at 1 Hz were found.Only a fraction of these data were selected for the preliminary analysis described here: a fragment of a time series collected at constant altitude during flight TO-336, starting at around 12:42 UTC.Since the typical TAS of the TO was about 69 m/s, and velocity fluctuations were of order of a few m/s, the frozen flow hypothesis is applicable to data interpretation.

Data Analysis
As mentioned in the introduction, we decided to perform time-dependent RQA using TKE (denoted in equations as k), T and w′: as indicators of airflow dynamics.w′ is calculated by subtracting the mean calculated over the whole horizontal flight segment.
RQA began from normalization of the components of the recurrence vector in order to equally capture the influence of each component.The normalization was performed according to the formula: where μ x n and σ x n is the mean and standard deviation of n-th vector component, and the average is taken along the whole horizontal flight segment of the flight.
For each data point on the leg a recurrence matrix, centered on this point was calculated (Marwan et al., 2007): here Θ is the Heaviside step function, ϵ is the so-called recurrence threshold and where N is the length of X → , and |⋅| is the norm (here, the L2-norm is used).Equation 3 describes a recurrence matrix R ij , where the values of the matrix can be either 0 or 1, depending on the fact if the norm of the vector X → i X → j is bigger or smaller than ϵ.Then, to visualize the matrix as a plot, points with the value 1 are colored black, and points with the value 0 are colored white.The matrix is symmetric by definition.As pointed out in Section 1.2.1 of Marwan et al. (2007), there is no single universal method for choosing the best ϵ, but there are several "rules of thumb."According to these rules, a recurrence threshold ϵ = 1 was chosen.The value corresponds to about 10% of the phase space diameter, that is, diameter of a sphere bounding all the measured states in the phase space (all measured data points).The example recurrence matrix and corresponding elements of Equation 1 are plotted in Figure 1.
According to Section 1.2.2 of Marwan et al. (2007), visual inspection of patterns appearing in the recurrence matrix makes it possible to distinguish different types of trajectories of the investigated vector in the phase space.In particular, white areas or bands correspond to rapid changes, solitary black points/squares to rare events/heavy oscillations, chess-board like structures to periodic events, and diagonal lines suggest parallel segments of trajectories in phase space.In Figure 1 a change of the recurrence behavior of the time series after 3 s from a slowly changing state to rapid fluctuations is visible indicated by a white band.Additionally, two vertical lines around 4 and 4.5 s correspond to the values of [T, k, w′] similar to that in the first 3 s of the plot, visible as a black area on the matrix.These vertical lines in general indicate a presence of an intermittent state, which occurred previously in the Geophysical Research Letters 10.1029/2023GL105753 segment.The phase space corresponding to the data segment presented in Figure 1 is presented in the Figure S1 in Supporting Information S1.Some of the above-mentioned properties of each matrix can be characterized by one or more measures proposed by Marwan et al. (2007) in the overview of recurrence quantification analysis (RQA).Time series of such measures can be interpreted in terms of varying properties of turbulence along the trajectory of the aircraft.
Since there is no general rule allowing for selection of the recurrence matrix size, we assumed that the matrix built on a segment corresponding to a few integral scales of turbulence should be the right choice.We performed a series of tests, starting from the size corresponding to 128 data points (which corresponds to ∼176 m at 69 m/s TAS) to 1,024 data points, and we decided on a 256 point matrix, corresponding to 352 m, that is, 1-7 typical integral length scales (Lothon et al., 2005;Nowak et al., 2021;Wacławczyk et al., 2022).The reported range of integral scales estimated in various conditions varies between ∼50 and ∼350 m.In parallel we had to decide which RQA parameter serves our purpose best.We have tested such parameters as recurrence rate, determinism and laminarity (Marwan et al., 2007).Since our time series does not characterize the particular dynamical system, but the actual state of turbulence along the flight track, and the goal is to distinguish between nonturbulent and turbulent regions, we decided to focus on a metric called laminarity LAM, which, according to Marwan et al. (2007) "represents the probability of occurrence of laminar (slowly changing) states of the system": where v is the length of a black line in the recurrence plot, P(v) is the number of lines of length v and v min is the minimal line length.In other words, LAM is defined as a number of black points forming vertical lines of length equal to at least v min , normalized by the number of all black points in a Recurrence Plot.For laminar states LAM is close to 1 (almost all points form vertical lines), whereas for states with high fluctuations it is closer to 0 (few vertical lines, mostly solitary black points).The choice of parameter v min is important, since it excludes shorter lines from calculations.In this study v min = 10, corresponding to a ∼14 m distance, was chosen.The argument is that we look for fluctuations of sizes corresponding to large scales of the turbulence cascade, small scale variability should be studied after selecting segments for more detailed analysis.The effect of choice of v min , as well as time series of other quantities in the used in the RQA is discussed in Supporting Information S1.
Finally, in order to understand the relative importance of turbulence kinetic energy, mixing and updrafts/downdrafts analogous analysis, construction of recurrence matrices and following RQA, was performed for each vector component separately.The example application of the procedure described in the previous section is presented in Figure 2.
The top panel of Figure 2 shows three time series of laminarity calculated for various sizes (512, 256 and 128 points corresponding to ∼704, ∼352 and ∼176 m distance in physical space) of the RP.It can be seen that the black curve representing an RP of size 512 × 512 hardly reflects dynamic disturbances related to small clouds, such as the one at t ≈ 7,350 s.Supposedly ∼700 m averaging, imposed by this choice, "smoothes out" turbulent events of the size comparable to 1-3 integral scales and should be avoided.The blue curve representing LAM calculated for a 128-point-long segment is most sensitive to changes in the environment, but shows strong oscillations, which suggest that so short RP may capture separate events in small scales, which could lead to too detailed division into segments.In general, increasing the size of the matrix smooths out the time-dependent RQA, and broadens the peaks of the quantities.This is an effect of capturing a bigger picture around a selected point and related smoothing, as seen for example, at t ≈ 7,250 s, where the black curve does not give insight into the variability of dynamics captured by the red and blue curves corresponding to smaller matrices.Additionally, 256 points/∼352 m corresponds roughly to 2-4 integral scales of turbulence, and distinguishing between turbulent/non turbulent regions on a distance corresponding to a few integral scales seems reasonable.
The bottom panel of Figure 2 shows a plot of LAM time series for the whole vector (light-blue curve) and for its components: T, k and w′.Gray shading indicates the separate use of a cloud mask indicating the presence of cloud water (at 1 Hz, i.e. ∼69 m spatial resolution).It can be seen that considering the whole vector gives the complete picture of turbulent regions within and around the clouds, while laminarity calculated for each component separately shows the relative importance of k, T, and w′, that is, mechanical turbulence, mixing and vertical transport which change along the time series.In example, decrease of laminarity around 7,250 s indicates turbulent region outside the cloud, in which w′ and k contribute more than T to the decreased laminarity estimated from the whole vector, while at ∼7,300 s, inside the cloud, contribution from T dominates, while contributions from the other parameters are small.

Results
Figure 3 shows the time series for all three components of the vector, LWC series as the gray cloud mask and LAM calculated for the whole vector, with yellow shaded regions corresponding to LAM below the selected threshold, that is, to the regions where a signature of turbulence is observed.Since the goal is to distinguish between the turbulent and laminar regions, a threshold just below 1 should be the cutoff.After visual inspection the specific value of LAM = 0.95 was chosen.Selection of LAM = 0.99 resulted in selection of the majority of the record as turbulent, while LAM = 0.90 resulted in omission of segments where some clear signatures of turbulence were present.The exact value of the threshold can be discussed, however the goal of the selection is not the detailed turbulence analysis, but selection of the segments on which such analysis should be performed so the detailed number is of secondary importance.
Figure 3 shows that the investigated region is composed of clouds with vast volumes around the clouds affected by turbulence and laminar regions with occasional signatures of turbulence, not related to visible amounts of LWC.All clouds in the analyzed data are turbulent, with clearly visible local minima of LAM values below ∼0.75.The local minima of laminarity reflect coincident input of all the components of the vector, but there is no clear dependence to the actual value of LWC.
Additionally, for each cloud, regions with LAM < 0.95 around the cloud can be considered "turbulent cloud shells", and borders of these shells can be determined by the LAM threshold: such a division will help in more detailed investigation of cloud dynamics.
To close this analysis we performed initial statistical analysis of turbulence in the investigated time series.Table 1 summarizes the average turbulence kinetic energy, vertical velocity variance and temperature variance in 3 different types of segments along the trajectory: clouds, where the LWC is above threshold, turbulent regions with no clouds, where LAM is above threshold and non-turbulent regions in between.Not surprisingly, these values are very different, reflecting different dynamics along the flight trajectory.
As we can see the values of the mean TKE are the highest for cloud regions, but they are also significantly higher for regions with LAM below threshold than for LAM above threshold.Standard deviation of vertical velocity follows the same trend as the mean TKE, while the standard deviation of temperature is not significantly different between the regions.However, more detailed inspection of the data indicates, that the standard deviation of the temperature in laminar region comes from large scale smooth temperature variations, while in turbulent regions it comes from small scale variability.Consequently, the mask based on the changes of LAM allows for better division of data in terms of turbulence characteristics.

Summary and Outlook
In this study, we proposed a method to algorithmically divide time series of airborne measurements according to dynamic (turbulent) properties of the sampled air, combining the simple signature properties characterizing turbulence, turbulent transport and turbulent mixing.The method is based on recurrence quantification analysis and the assumptions adopted are based on well-documented physical characteristics of atmospheric turbulence.Since elements of the vector are normalized based on their fluctuations, consecutive application of the method for flight legs at various heights does not require selections of arbitrary thresholds for each leg, allowing for using the same method for all analyzed horizontal segments of the flight.Such approach significantly reduces arbitrary decisions which have to be undertaken in the construction of the conditional statistics.Thus the proposed approach will facilitate future investigations of atmospheric turbulence in, within and around clouds, which requires a more systematic approach.Another advantage of the RQA method is that the constructed vector can be expanded with more measured quantities, such as for example, specific humidity, or other passive scalars.Selection of the variables included in the analysis can vary, depending of data availability and quality as well as research needs.
We used only one metric from the RQA, laminarity, but RQA offers a variety of metrics, therefore it is possible to expand the studies and/or use metrics that are more suitable for other types of analyses.The possibilities in the future with regards to this type of study include imposing an additional condition based on the LWC signal of a better resolution added to the vector, as well as applying the method to data gathered during other campaigns, where the requirements to perform the similar analysis (straight, horizontal flight leg, presence of clouds or abrupt changes) hold.
We have applied this method to an environment with clouds, which have a strong contrast between regions with stronger turbulence and the surrounding environment.The method is not universally applicable, since typical flight patterns in airborne measurements consist of slant profiles and horizontal legs, and the latter can include cloud penetrations or not.A cloud penetration is characterized by abrupt changes in temperature and vertical wind velocity, therefore such events affect normalization, which in turn affects the distribution of points in the phase space, which has a direct connection to how the RP looks, and the values obtained in the RQA.Therefore, the data studied with this method should in principle contain such abrupt events, although their magnitude can vary.
The selected threshold is subjective and its value should not be interpreted universally.On the other hand, all the distances and scales used in the analysis as well as the values and fluctuations present in the analyzed time series are typical and in agreement with published data.Thus, the method proposed here should be easily adaptable to other similar data sets as well as to the interpretation of numerical simulations, in which for example, independent estimates of cloud boundary based on LWC and on dynamic properties of the flow should help not only to characterize properties of turbulence inside and around the cloud, but to enhance our understanding of entrainment, mixing and vertical transport in clouds.

•
A vector characterizing turbulence along the aircraft track based on velocity and temperature fluctuations is constructed • A time series of this vector allows the construction of a Recurrence Plot, a tool used in studies of dynamical systems • A quantity named laminarity, derived from the Recurrence Plot, is used to detect turbulent and non-turbulent regions Supporting Information: Supporting Information may be found in the online version of this article.

Figure 1 .
Figure 1.Top panel: recurrence plot for 256 experimental points long segment centered on a selected point (a cloud edge) of the time series of the measurement data (notice symmetry of the matrix R ij = R ji ), bottom panels: plots of individual components of the vector.

Figure 2 .
Figure 2. Top figure: a plot of LAM for different window sizes (blue line-128, red line-256, black line-512).Bottom figure: a plot of LAM along the segment for different versions of the vector (blue line-whole vector, as in Equation 1, orange line-only T, yellow line-only k, violet line-only w′).The gray rectangles mark the cloud penetrations derived from LWC signal.

Figure 3 .
Figure 3.Time series of (from top to bottom): temperature, TKE, vertical velocity fluctuations (red-positive values, blue-negative values), LWC, and LAM along the analyzed flight segment.Gray rectangles mark the cloud segments derived from LWC signal.Yellow rectangles mark the segments with LAM below 0.95.

Table 1
Values of the Mean TKE, and Standard Deviations of the Temperature and the Vertical Wind Velocity Variable Component for Three Different Masks: LAM > -LAM Above 0.95 Threshold, LWC-LWC Above Threshold, LAM < -LAM Below 0.95 Threshold, but With LWC Below Threshold