Discerning the influence of climate variability modes, regional weather features and time series persistence on streamflow using Bayesian networks and multiple linear regression

A large literature on the sensitivity of streamflow response to climate variations has emerged over the past several decades, but the underlying mechanisms are not fully understood. The usual approaches to this problem have been simple and do not fully address its complexity, which involves the individual and joint effects of large‐scale climate modes and smaller‐scale weather features on streamflow volumes, the presence of persistence in these time series and their effects on antecedent conditions. Ongoing improvements in observational and reanalysis datasets and online access allow for better quantification of the connections between streamflow response, climate variability modes and regional weather features. The purpose of this paper is to determine whether Bayesian networks can be used to better identify key factors and their associated pathways leading to streamflow generation. A Markov blanket approach is described and illustrated using monthly streamflow series recorded at eight gauging stations within or immediately adjacent to the coastal region of eastern mainland Australia. The method is compared and contrasted with conventional multiple linear regression with discussion around this type of approach as part of a growing interest in machine learning applications. One example is used to illustrate the application of both approaches to a streamflow series exhibiting strong non‐stationarity. Results for the other streamflow series are included in a concise synthesis. Overall, the results suggest that Bayesian networks have several desirable features in terms of transparency, interpretability and explanatory insight. The findings from this study lend support to the use of Bayesian networks for modelling connections between streamflow volume and the variability of climate and regional weather. This improved understanding of the key controls of streamflow variability is intended to help address growing needs around informing social, cultural, economic and ecological aspects of water planning and management.

observational and reanalysis datasets and online access allow for better quantification of the connections between streamflow response, climate variability modes and regional weather features.The purpose of this paper is to determine whether Bayesian networks can be used to better identify key factors and their associated pathways leading to streamflow generation.A Markov blanket approach is described and illustrated using monthly streamflow series recorded at eight gauging stations within or immediately adjacent to the coastal region of eastern mainland Australia.The method is compared and contrasted with conventional multiple linear regression with discussion around this type of approach as part of a growing interest in machine learning applications.One example is used to illustrate the application of both approaches to a streamflow series exhibiting strong non-stationarity.Results for the other streamflow series are included in a concise synthesis.Overall, the results suggest that Bayesian networks have several desirable features in terms of transparency, interpretability and explanatory insight.The findings from this study lend support to the use of Bayesian networks for modelling connections between streamflow volume and the variability of climate and regional weather.This improved understanding of the key controls of streamflow variability is intended to help address growing needs around informing social, cultural, economic and ecological aspects of water planning and management.

| INTRODUCTION
Managing risks associated with seasonal to interannual variations in streamflow is important for informed water planning and management, and societal and environmental well-being.Thus, the influence of large-scale oceanographic-atmospheric processes on streamflow variability has been the subject of extensive study for many years (Chiew & McMahon, 2003;Cullen et al., 2002;Kahya & Dracup, 1993;Kiem & Verdon-Kidd, 2009;Lee & Julien, 2017;Mechoso & Iribarren, 1992;Pascolini-Campbell et al., 2015;Sahu et al., 2014;Steirou et al., 2017;Verdon et al., 2004;Wang et al., 2021;Worako et al., 2021;Zubair, 2003).The focus of these studies is principally confined to climate variability modes with time scales of the order interannual to interdecadal, such as the El Niño-Southern Oscillation, North Atlantic Oscillation, Interdecadal Pacific Oscillation, Indian Ocean Dipole and Northern and Southern Annular Modes.Information captured by persistence in streamflow series as well as well as indices of atmospheric variability have been used in streamflow forecasting schemes (Bhandari et al., 2018;Chiew & Siriwardena, 2005;Piechota et al., 1998;Soukup et al., 2009;Wei & Watkins Jr, 2011).Relatively few studies have examined streamflow variations in relation to weather phenomena such as storms that cause extreme rainfall.One example is the influence of low-pressure systems on eastern Australia (i.e., midlatitude cyclones known as East Coast Lows) on streamflow (Dowdy et al., 2013).However, such studies do not include the effects of other atmospheric and hydroclimatic phenomena.Apart from a few partial exceptions (e.g., Bhandari et al., 2018;Chiew & Siriwardena, 2005;Shams et al., 2018), to the authors' knowledge there are no studies that present an integrated view of the individual roles of and interactions between climate modes, regional atmospheric conditions, time series persistence and antecedent conditions in shaping streamflow response.
This paper describes a Bayesian network approach to determining a probabilistic graphical model that explicitly describes the individual and joint effects of climate variables on streamflow.Such networks are data-driven, highly interpretable and open to scrutiny.Bayesian networks make it possible to: combine incomplete prior causal knowledge and observational data in a formal setting; provide a compact graphical model to express and illustrate complex probabilistic interrelationships among a finite set of variables, facilitate learning about potential causal relationships between variables, predict with reasonable accuracy and incorporate the method in decision making frameworks (Aguilera et al., 2011, and references therein;Uusitalo, 2007, and references therein).Dynamic Bayesian networks (DBNs) are an extension of Bayesian network theory that explicitly incorporate the dimension of time by relating variables to each other at adjacent time steps.Thus, they provide the means to model system dynamics using the Markovian framework (Dabrowski et al., 2016;Flesch & Lucas, 2007).Thus, the main objective of this paper is to develop, analyse and interpret explanatory models that identify possible casual relationships in complex systems.The additional gain in knowledge from using such an approach is illustrated through a comparison with multiple linear regression models.Such benefits can be useful for researchers and practitioners in water planning and ecological management.

| Bayesian networks
Bayesian networks are probabilistic models based on directed acyclic graphs (DAGs) that consist of two components: nodes (or vertices) and directed arcs (or edges).The nodes in a network represent the random variables comprising a given system and directed arcs the directions of dependencies among the nodes.Directed arcs originate from a parent node and terminate at a child node.Terminal and leaf nodes are only connected to their parents.A target node is a variable of particular interest to an analyst.Thus, a DAG can be viewed as a hierarchical dependency chain between source and one or more terminal nodes.Unconnected nodes represent variables that are conditionally independent of each other.The acyclic requirement means that no node can be its own parent or child.This permits the factorization of the joint probability distribution based on a conditional distribution for each variable, given its parents.Thus, the connectivity of the resulting network provides potential insight into the underlying system dynamics and physical processes.
The class of Bayesian networks used in this study is explained in an earlier paper (Bates et al., 2021) and so is only briefly described here.The approach, called sparse dynamic Bayesian networks (SDBNs), incorporates the temporal dimension by including lagged and non-lagged variables.It rests on the following assumptions: (1) the network structure and its properties do not vary with time; (2) the joint probability distribution is Gaussian (multivariate normal); (3) the dependencies between variables are linear; (4) the DAG need not be fully connected (a graph is fully connected if there is a direct arc between every node pair) and ( 5) the temporal dynamics are adequately described by a low-order autoregressive process (a process in which an observation can be approximated by a linear function of a small number of past successive values plus random noise).
A key concept in this study is the notion of a Markov blanket.It is a minimal set of variables that renders a target node conditionally independent of all others.The blanket of a target node is comprised of its parents and children, and all other nodes that share a child with it.Thus, the behaviour of the target node can be fully predicted using only the set of variables (nodes) inside its Markov blanket.This often produces a compact network that improves interpretability without compromising model accuracy.In this study, the target node is always the terminal node and hence the Markov blanket network (MBN) is defined by that node and its parents.

| Multiple linear regression
For the purpose of comparison with conventional statistical methods, multiple linear regression (MLR) was used as a reference model for investigating the dependencies between streamflow volumes and climate and weather feature variables.A MLR model can be written as where Y is the response variable (streamflow), X are predictors (climate and weather factors), β 0 , β 1 , …, β p are parameters to be estimated and ε is the error term.A key assumption is that errors are independently and identically distributed with zero mean and constant variance.
If the distribution of the errors is Gaussian, the parameter estimates coincide with the maximum likelihood estimates which have desirable statistical properties.
A MLR model can include interaction terms among predictors.A simple example of an interaction is when the effect of one predictor X i ð Þ on the response Y varies as a function of another predictor X j À Á .This first-order interaction is included in the model as an additional term β p+ 1 X i X j .While an MLR model can incorporate more than one interaction term, the inclusion of higher-order interaction terms between three or more predictors is rarely considered in practice.

| Time series regression
Time series regression analysis was used to investigate the presence of trends in the monthly series of climate and weather feature variables.Visual inspection of individual time series plots indicated either no trend or essentially linear trends.Thus, the presence of a trend was judged using the simple linear regression model given by where Y is the variable of interest and date = year + month/12 − 1/24 in which year is the calendar year for each data point and month = 1, …, 12.The parameters were initially estimated using ordinary least squares.When the error terms are serially correlated, the variance estimators of the model parameters are biased and inconsistent leading to incorrect statistical inferences.In such cases, the model parameters were estimated using generalized least squares using the R packages forecast and nlme (Hyndman & Khandakar, 2008;Pinheiro et al., 2021).The statistical significance of trends in the monthly series was tested using analysis of variance.

| Modelling framework
Figure 1 displays a schematic diagram of the study methodology.The modelling step involves the coupled processes of lag-order selection and structure and parameter learning for the sDBN (see below for further details).For a given lag-order all leaf nodes, except for the terminal node, and their associated arcs are pruned from the network in a series of successive steps.This process enhances structural sparsity without detracting from model performance and leads to a final sDBN from which the structure of the MBN is derived.Once the Markov blanket for the terminal node is identified, the final step involves synthesis of the assorted results and discussion of their implications.

| Data preprocessing
A monthly time step is adopted throughout the analysis as the climate variability mode data are available as monthly series.Data distributions of monthly streamflow are typically right skewed and may have minimum values at or close to zero.The inverse hyperbolic sine (IHS) transformation was applied to streamflow series as it approximates the natural logarithm of that variable, permits retention of zero-valued observations and precludes negative predictions.For a variable z, this transformation is defined by sinh − 1 z ð Þ= log z + ffiffiffiffiffiffiffiffiffiffiffi z 2 + 1 p À Á .Exploratory data analysis revealed that all the monthly series considered (streamflow volume, climate mode and regional weather features) exhibit an annual cycle.Consequently, the data are deseasonalized by subtracting their monthly means over the whole period of record.The regional weather features used here are based on gridded monthly means of low-to mid-tropospheric variables.Dynamic principal component (DPC) analysis is used to account for autocorrelation and to reduce the dimensionality of the data in the frequency domain while preserving as much information as possible (Brillinger, 1981;Hörmann et al., 2015;Kwon et al., 2020).
A two-stage data transformation is used to convert the marginal distributions of these variables to standard normal (normal distribution with zero mean and a standard deviation of one).This involves use of the probability integral transform (PIT) and inverse transform sampling (Chave, 2017).The chi-square Q-Q plot (Korkmaz et al., 2014) is used to facilitate visual assessment of multivariate normality analogous to the normal quantile-quantile plot for univariate data.Similarly, pairwise scatter plots of the variables were used to check for violations of the linearity assumption but none were found in the case study described below.

| Structure and parameter learning/ estimation
The structure of the sDBN is learned using the tabu search algorithm (Glover, 1986) and the Bayesian Gaussian equivalent score as implemented in the R package bnlearn (Scutari, 2010).The algorithms in the bnlearn package permit the imposition of structural constraints based on prior causal knowledge via "blacklists" to force the exclusion of specific arcs.This knowledge can take the form of path (direction) and precedence (temporal ordering) constraints.An example of a path constraint is that the El Niño-Southern Oscillation can influence streamflow at a given site, but not vice versa.A simple precedent constraint is that at-site streamflow at some time t can be dependent on its value at t − 1, but not vice versa.Thus, large-scale variables can influence local-to regional-scale variables but not conversely; and directed paths backwards in time are forbidden.The imposition of such constraints reduces the size of the search space, improves the performance and convergence of the search algorithm, and improves network performance and robustness.The bnlearn package is also used to obtain maximum likelihood estimates of the unknown parameters of a network given its structure.R packages bnviewer (https://robsonfernandes.net/bnviewer/) and igraph (https://igraph.org/)are used for network visualization.
The preprocessed data used for learning SDBNs were also used for fitting MLR models.This ensured that goodness-of-fit statistics are placed on the same scale.Forward and backward stepwise regression analyses were used to select subsets of predictors.The Bayesian Information criterion (BIC) was used for predictor selection given the focus on explanatory modelling and goodnessof-fit rather than predictive accuracy (Shmueli, 2010).Lagged response terms were included in the MLR models to account for temporal structure in the error terms.

| Lag-order selection
The selection of lag order (in months) is important as model results may vary with the differentiated lags.Too few lags can result in autocorrelated errors while using too many can lead to overfitting which affects inference and assessment of model performance.The approach adopted here is to build models with lags k =0, 1, …,k max and to choose the value of k which optimizes one or more measures of goodness-of-fit (GOF).Three lag selection criteria based on GOF measures are used.The mean absolute error (MAE) and root mean square error (RMSE) are minimized, and a robust measure of the coefficient of determination (ρ 2 ; Shevlyakov & Smirnov, 2011) is maximized with respect to the number of lags included in the model.Although the trends in the GOF measures with increasing lag are not strictly monotonic, they show a tendency to plateau or flatten at low k values.Thus, it is possible to select a lag value that represents an appropriate compromise between accuracy and parsimony.

| Model fit assessment
For all analyses, residuals from the models are examined for systematic patterns that may suggest that some modelling assumptions are violated (Faraway, 2014).After learning the final structure of the MBN and fitting of a MLR model, diagnostic plots are used to check that the residuals are approximately normally distributed with zero mean and constant variance, and are unstructured (absence of trend and serial correlation).The predictive accuracy of the fitted models was compared using the percentage relative difference defined by where ν and ν ref are GOF statistics (MAE, RMSE and ρ 2 ) for the fitted MBN and MLR model, respectively.It is assumed that any temporal trends in the monthly streamflow data can be explained by trends in the climate and weather variables.Although regression diagnostics can be used to check the reasonableness of this assumption, they cannot provide any causal insight into the specific roles of individual nodes.

| Network stability analysis
Given the modest lengths of record available in this study (section 4.2), it is desirable to investigate the stability of the Markov blanket network over time and the effects of data variability on the learned network.An expandingwindow estimation scheme is used to learn the structure of the MBN over three overlapping periods in time.In each instance, principal component scores are calculated using only the data available in each window.This approach mimics the updating of a deployed model over time.The impact of data variability was investigated using a block bootstrap resampling scheme.Details for the expanding-window and bootstrap approaches are given in section 4.3.

| Background
The study area is the eastern coastal region of mainland Australia (Figure 2).The major defining feature of the region is the Great Dividing Range (GDR) which parallels the coast for $3700 km, comprising the upland boundary of the drainage basins of interest.From north to south, Köppen climate types are (Peel et al., 2007): tropical savanna (Aw) climate; monsoon climate (Am), monsooninfluenced humid subtropical climate (Cwa), subtropical humid climate (Cfa) and temperate oceanic climate (Cfb).About 85% of the region's population lives within 50 km of the coastline, comprising 65% of Australia's population.Surface water is the main source for consumptive water supply.
Australia is affected by midlatitude westerlies in southern latitudes (southward of 30 S), tropical convergence in northern latitudes (equatorward of 20 S) during the monsoon season (December-March) which is also characterized by westerly winds, and stable subsiding air under the subtropical anticyclone belt between latitudes 20 S and 40 S (Bridgman, 1998;Pepler et al., 2019).The subtropical anticyclone belt (the subtropical ridge) is associated with the descending branch of the Hadley circulation (Lucas et al., 2014).It separates the southeast trade winds of the tropics from the westerlies of the midlatitudes.
Rainfall climatology for the region increases quickly during December, peaks in February, and decreases rapidly during March (Berry & Reeder, 2016).The annual cycle of the trade winds is largely driven by the subtropical ridge and affect latitudes between 5 S and 35 S (Riehl, 1979).Their impact is greatest from April to September, and their frequency decreases steadily poleward along eastern Australia.
The mean meridional circulation and eddies move water vapour from the tropics to middle and high latitudes (Hartmann, 2016).This transport controls the mean climate of southern Australia, the position of the subtropical ridge and the passage of rain-bearing weather systems (Cai et al., 2011;Drosdowsky, 2005;Pepler et al., 2020;Timbal et al., 2016).The southern half of the continent is dominated by northwesterly winds but has a significant component of southwesterly and southerly winds associated with the passage of cold fronts and depressions.Although a relatively uniform precipitation regime exists in southeast Australia, long-term trends in climate are evident (McKay et al., 2021).In the subtropics, the rainfall pattern is summer dominant, but the difference between summer and winter rainfall is much less marked than under monsoon conditions (Bridgman, 1998).

| Data description and sources
A modelling effort should start with defendable assumptions on the possible roles of variables based on previous studies, expert knowledge, or common sense (Heinze et al., 2018).Thus, the present analysis is based on climate variability modes that are known, or have been hypothesized, to influence streamflow in Australia (Chiew & McMahon, 2003;Esha & Imteaz, 2020;Kiem & Verdon-Kidd, 2009;Verdon et al., 2004, and references therein).Here, the surface and upper-air weather features used is intended to be a plausible but not necessarily complete or definitive set.The primary drivers of surface water yield and floods in Australia are the rainfall prior to the flood event and the catchment soil moisture state (Potter et al., 2005;Wasko et al., 2020, and references therein).The use of a sDBN makes possible the description of current and past weather conditions using regional-scale features such as zonal and meridional winds, atmospheric moisture and vertical velocity.
The adoption of a monthly time step (section 3.2) suggests the use of an ingredients-based approach to characterize regional weather conditions.Thus, the analysis was based on the use of monthly means of meteorological variables.The period of analysis is chosen as May 1958 to December 2018.The start date corresponds to the approximate midpoint of the International Geophysical Year (July 1957-December 1958).Prior to that time, upper-air data for the Southern Hemisphere are considered less reliable due to a scarcity of observations (Zillman, 2018).The end date of the analysis corresponds to the last complete calendar year of the streamflow records.

| Streamflow
Daily discharge data for eight gauging stations were extracted from the Hydrologic Reference Stations (HRS) dataset compiled by the Bureau of Meteorology (Australia) (Zhang et al., 2016).As of August 2020, the HRS dataset is comprised of 467 records from unregulated rivers and streams with minimal anthropogenic disturbance (https://www.bom.gov.au/water/hrs/about. shtml).The earliest record in the HRS dataset is from 1950.
The selected gauging stations are spatially representative of all hydroclimatic regions in eastern mainland Australia (Figure 2 and Table 1).The streamflow records are of high quality in terms of accuracy and completeness of information.Seven stations are located east of the GDR, and one (Creswick Creek) to the immediate north (Figure 2).The associated drainage basins are not nested within one another.This means that although some of the streamflow series may be correlated, they are not causal in a mechanistic sense.Thus, the streamflow series can be treated as independent terminal nodes.This simplification reduces network complexity significantly, alleviates the problem of overfitting and improves convergence speed.

| Regional weather features
Monthly data for atmospheric variables were obtained from the NCEP reanalysis 1 dataset (Kalnay et al., 1996) with spatial resolution of 2.5 for the domain from 25 S to 45 S, and 105 E to 155 E. The reasons for this choice are: (1) it facilitates examination of the strengths of the relationships between observed monthly streamflow series and climate features that can be resolved at large spatial scales as well as climate variability modes; (2) it provides a baseline for future studies involving higherresolution reanalysis products and (3) it's resolution is similar to that of some models used for operation guidance in seasonal prediction applications (Hudson et al., 2017).The HadSLP2r gridded monthly mean sea level pressure (MSLP) dataset (Allan & Ansell, 2006) was used to characterize the subtropical ridge using the method described by Larsen and Nicholls (2009).For each month, the maximum MSLP (intensity: SI) and the latitude at which it occurred ( position: SP) were determined.Monthly means of the following upper air variables are also considered: (1) 850 hPa zonal wind (U), which has been used in previous investigations of the Australian monsoon (Chen & Guan, 2017;Kajikawa et al., 2010;Kelly & Mapes, 2016;Lisonbee et al., 2020;Wang et al., 2004); (2) 850 hPa meridional wind (V), which is used to characterize low-level circulation over the hinterland; (3) precipitable water (P), which is used as an indicator of atmospheric moisture content and was used in a study of the Australian monsoon (Zhang, 2010) and ( 4) vertical velocity at 500 hPa (W), which is used as an index for large-scale vertical motion.
The use of a transect approach rather than summary statistics of gridded fields was motivated by an earlier study of interannual to decadal scale variability in zonal geostrophic wind flow across eastern Australia from 25.6 S to 36.5 S and its strong relationship with rainfall (Rakich et al., 2008).In this study, four transects labelled A, B, C and D are used to quantify spatial and temporal variations of one or more of the above variables.Each transect is a trajectory through or along a set of nearby reanalysis grid points (Figure 2).Transects A, B and C are oriented along or parallel to the ridges of the GDR.The atmospheric variables considered along Transects A and B are U, P and W. Transect A is positioned to characterize the subseasonal to interdecadal variability of the impact of the Australian Monsoon and southeast trade winds over northeast Australia.Transect B has a similar orientation and location to that the transect used by Rakich et al. (2008, fig. 1).Transect C is positioned to characterize either onshore flow from the Southern Ocean and Bass Strait or northerly flow from the continental interior.Transect D is positioned to capture information about meridional wind at 850 hPa over the hinterland.Along each Transect, the DPCs of gridded atmospheric series were computed prior to deseasonalization and transformation.The naming convention used for the components is as follows: "TXa" where "T" is the Transect name (one of A, B, C or D), "X" is the atmospheric variable name (one of U, V, P or W) and "a" is the component number (1, 2, …).

| Approach
The southeast of the study area experienced severe and prolonged dry conditions from late-1990s to 2009 known as the Millennium Drought (AghaKouchak et al., 2014;Grant et al., 2013).This presents a particular challenge in terms of variable selection, and model specification and accuracy.For the sake of brevity, the Latrobe River data (Table 1 and Figure 2) are used as a running example to illustrate the main ideas (section 5.1).This dataset was selected for the following reasons: (1) its streamflow record is relatively long and of high quality; (2) it presents a modelling challenge as the streamflow series exhibits non-stationarity (downward trend) and (3) results for the seven remaining datasets exhibit striking similarities in terms of variable and lag selection, residual diagnostics and model goodness-of-fit in the presence or absence of a trend.Thus, key results for all eight datasets are included in a synthesis (section 5.2).
For the network stability analysis, each overlapping window started at May 1958.The finishing dates of each window were the month of December in the years 1998, 2008 and 2018.To investigate the effects of data variability on arcs within a learned MBN, the original data were preprocessed through: applying the IHS transformation to the monthly streamflow series; deseasonalization of the data matrix comprised of transformed streamflow, climate variability mode and DPC series; detrending the resultant data matrix columns by using loess smooths with span = 0.75 and degree = 1, and finally applying the PIT and inverse transformations to the data matrix columns (Figure 1 and section 3.2).The resultant record was broken into equal-length blocks consisting of consecutive monthly observations for each water year.The blocks were resampled at random with replacement to build 1000 replicate datasets of the same length as the original record.The loess trend curves were added to each replicate dataset to account for the presence of non-stationarity.The MBNs learned from these datasets are used to assess arc strength and arc direction where arc strength is the probability of an arc appearing between two nodes regardless of its direction and arc direction is the probability of an arc direction re-occurring in the MBNs.The method described by Scutari and Nagarajan (2013) was used to assess the statistical significance of the arc strengths, possibly leading to a more condensed network structure.

| Preliminary insights and data preprocessing
Figure 3 displays the monthly streamflow series for the Latrobe River station over the study period.Visual inspection suggests an underlying non-monotonic trend with a marked decrease in the series mean during the 1997/98 water year.This reduction coincides with the beginning of the Millennium Drought.
Table S1, Supporting Information summarizes the results of the DPCA analysis of the NCEP Reanalysis data for grid points along or adjacent to the four Transects shown in Figure 2.For the zonal and meridional winds (U and V), the first two components account for 94%-97% of the total sample variance.The first two components for vertical velocity (W) account for 89%-95%, and the first component of precipitable water (P) accounts for 90%-97% of the total sample variance.Therefore, DPC analysis has been effective at dimensionality reduction.
Figure S1 displays a chi-square Q-Q plot of the squared Mahalanobis distances for the transformed Latrobe River data versus the corresponding quantiles from a χ 2 p distribution, where p is the data dimension.A departure from multivariate normality is evident in the upper tail.The figure includes an inset that compares the summary statistics for sample and theoretical quantiles.Although the sample distribution has an elongated tail, the first and third quartiles and the medians of the sample and theoretical distributions are quite similar, with the 97th percentiles (upper whiskers of the paired boxplots) indicating that the departure is limited to extremes in the upper tail.Therefore, the assumption of multivariate normality is judged to be reasonable.Visual inspection suggests that a reasonable estimate of lag-order for the Latrobe River data is 1 month.Diagnostic checks (Figure S3) indicate that the residuals obtained from the resulting sDBN model are well behaved (zero mean, constant variance, and no systematic pattern or notable serial correlation).However, there is a tendency to underestimate low and particularly high streamflow extremes.This suggests that use of the multivariate Gaussian assumption cannot properly account for tail behaviour (Figure S1) and may also reflect the use of a monthly time step which smooths weather conditions.For additional insight on the streamflow data, Figure S4 presents time series of monthly streamflow values.

| Markov blanket network
Figure 4 displays the Markov blanket DAG for the Latrobe River data.There are 8 blanket (non-terminal) nodes and 18 directed arcs.The arc strengths depicted in the DAG represent the arc contributions to the Bayesian Gaussian equivalent score for the network, scaled to the interval [0, 1].The DAG exhibits a strong dependence on current values of regional weather features as captured by the NCEP Reanalysis data.Only one climate variability node (DMI) and one lagged node (lag-1 streamflow, Le_1) are present in the Markov blanket.The remaining nodes consist of the first and second DPCs of zonal wind at 850 hPa for Transect B (BU1 and BU2), the first and second DPCs of meridional wind at 850 hPa for Transect C (CV1 and CV2), and the first DPCs of vertical velocity and precipitable water for transect C (CW1 and CP1).This indicates that, in the current setting, DPCA is a useful method for discriminative feature selection.Figure 3 displays a loess smooth (with span = 0.35 and degree = 2) of the fitted values obtained from the MBN (red curve).There are fluctuations in the trend over the length of the monthly streamflow series.The marked impact of the Millennium Drought on streamflow is captured as well as the slight recovery phase after 2010.Nevertheless, the scaled strengths of 13 of the 18 arcs are weak ≤0:1 ð Þ.This suggests that an investigation of the effects of data variability on network stability would be a prudent step.

| Multiple linear regression
Application of stepwise regression with BIC led to the selection of eight predictors, the same variables as those used in the identified Markov blanket.The fitted regression model was Residual diagnostics indicated that the model was correctly specified.The t values for the coefficients of Le_1, DMI, BU1, CV1, CV2, CW1 and CP1 are significant at the 0.001 level, and the coefficient of BU2 at the 0.01 level.All the predictors in Equation ( 4) appear in the MBN (cf., Figure 4).The smoothed model fit for the Latrobe River data is similar to that for the MBN (Figure 3), but the GOF statistics for the MLR model are generally slightly better (Table 4).An exhaustive search of first-order interactions revealed only one instance that was significant at the 0.05 level (BU1 and CV2, p-value <0.01).However, the inclusion of this interaction term did not lead to an improvement in model fit as measured by MAE, RMSE and ρ 2 .Regardless of these findings, the complexity of the dependencies depicted in the DAG of the MBN cannot be revealed using conventional regression techniques.

| Network stability analysis
Table 3 summarizes the structural characteristics of the three Markov blanket DAGs obtained for each of the overlapping windows depicted in Figure 3.The table's contents are arranged as a coloured matrix.Black diagonal cells depict nodes and off-diagonal terms directed arcs.Numerals indicate the sum of instances of each node and directed arc across the three DAGs.The table entries exhibit four main features: (1) Le_1 is the only lagged term and DMI is the only climate variability mode present in any of the three DAGs; (2) four non-terminal nodes (Le_1, DMI, BU1 and CV2) and the directed arc DMI !BU1 are present in all three blankets; (3) three directed arcs (BU2 !Le, CV1 !Le and CP1 !Le) are present in two DAGs each, the first arc in the DAGs for Windows 1 and 2 and the remaining arcs in the DAGs for Windows 2 and 3; and (4) four directed arcs between transect C nodes are only present in the DAG for Window 1, four such nodes in the DAG for Window 2 and three in the DAG for Window 3.These results suggest that the presence of directed arcs associated with Transect C in the DAG has some sensitivity to the length of record.Nevertheless, with the exception of Le_1, the central finding is that only current values of climate mode and weather feature variables have a direct effect on streamflow.
Application of the procedure described by Scutari and Nagarajan (2013) to the 1000 bootstrapped networks indicated that arc strength probabilities greater than 0.45 could be regarded as statistically significant.The model structure resulting from the selection of significant arcs is illustrated in Figure 5.There are two nodes in the MBN (and two predictors in the MLR model) that do not appear in this DAG: CW1 and CP1.This suggests that the presence of these nodes and their associated arcs in a DAG is quite sensitive to data variability.Configuration of the nodes and the number and position of the directed arcs in Figure 5 highlight the importance of the impacts streamflow persistence and the dependencies between streamflow volume and the zonal and meridional components of lower tropospheric winds along Transects B and C (respectively).Also, DMI has a direct influence on streamflow (Le) and an indirect influence through the dependency chain DMI !BU1 !Le.The latter feature is important as there is a strong relationship between Le and BU1 (the first DPC of the zonal wind).Comparison of Figure 5 and Table 3 reveals that five of the seven arcs judged as statistically significant appear in the learned MBNs for Windows 1, 2 and 3.The two remaining arcs appear in the MBNs for Windows 1 and 2 (BU2 !Le) and Windows 2 and 3 (CV1 !Le).These results suggest that the network depicted in Figure 5 is reasonably robust over time.

| Time series regression analysis
The strength of evidence against the null hypothesis of no trend in Markov blanket node series is judged by comparing two competing models: a null model containing only an intercept against a linear model containing intercept and slope terms.Results of the analysis of variance revealed strong evidence against the null hypothesis for DMI and CV1 0:001<p-values≤0:01 ð Þ .The series for these variables exhibit increasing and decreasing trends.There is no evidence against the null hypothesis for BU1, BU2, CV2, CP1 and CW1 p-values>0:10 ð Þ .Comparison of the loess smooths in Figure 3 indicates that the fitted MBN and MLR model can reproduce the non-stationarity in the streamflow series.The bootstrapped network in Figure 5 and the MLR model indicate that the variables DMI and CV1 have a direct influence on streamflow.These results indicate that over the period of record, the observed decrease in streamflow volume can be largely related to the upward trend in the DMI and the decreasing trend in the 850 hPa meridional wind, noting there is considerable interannual variability in these data.Both of these trends are conducive to drier conditions in the southeast corner of Australia.

| Synthesis of results
Recall from section 4.3 that the plots shown in Figures S1-S3 for the Latrobe River data are similar in form and pattern to those for the seven remaining datasets.Consequently, these plots are not presented here.Table 4 summarizes the characteristics of MBNs and MLR models for the eight datasets.It was found that additional constraints were needed to enforce physical realism during the structure learning process for the MBNs.Thus, nodes from Transects B and C were excluded from the analysis for Wenlock River and Transect A nodes from the analyses for Brodribb and Latrobe Rivers and Creswick Creek.These constraints were imposed because the associated synoptic connections seem unlikely given the distances involved and they could represent statistical associations rather than causal relationships.During the learning process it was found that one or two lags is sufficient to ensure a reliable and parsimonious network structure.Perusal of the relative differences in the GOF statistics indicates that MLR models provide a slightly better fit to the streamflow data.Nevertheless, these differences are unlikely to affect the efficacy of the Markov blankets and their interpretation.Figure 6 displays a dot chart of the climate and weather variables selected using stepwise regression and the Markov blanket approach.There is a strong commonality in the variables selected by both approaches for the Wenlock, Wild, Calliope, Brodribb and Latrobe River datasets (black dots).There are, however, notable differences for the Clyde River dataset.Given the small relative differences GOF statistics for this station (Table 4), this discrepancy might reflect an approximate equifinality (obtaining similar model fits from different model structures, variables and parameters for a given set of performance measures).

| DISCUSSION
This study has demonstrated that Bayesian networks can represent hierarchical dependencies that cannot be captured by a MLR approach.The same limitation applies to Gaussian graphical models (partial correlation networks) which have been used in related fields (Maher & Sherwood, 2014;Ramadas et al., 2015;Taeb et al., 2017).Briganti et al. (2022) note that the interpretability of these models is limited because their arcs/edges are undirected.Therefore, it is impossible to infer whether one variable is more likely to cause or be caused by another.Furthermore, assuming a partial correlation network when the underlying model contains directed arcs can lead to spurious causal connections.
Bayesian network analysis is a relatively complex task to implement and apply.In the context of goodness-of-fit, it is interesting that the MBN approach was not able to outperform multiple linear regression, a relatively simple technique.This is likely due to the former's assumption that the joint probability distribution of the variables is Gaussian.No actual dataset truly follows a multivariate Gaussian distribution; even if the marginal distributions are normally distributed.This is because the dependence relationships may be nonlinear.Thus, in the context of predictive modelling, benefit could be gained by using multiple linear regression (a simple form of machine learning), wavelet analysis (Briciu & Mih ail a, 2014;Schulte et al., 2016;Wei et al., 2020) or modern machine learning techniques (Herath et al., 2021;Jiang et al., 2022).

| CONCLUDING REMARKS
The utility of dynamic Bayesian for capturing and quantifying the relationships between monthly streamflow, climate variability modes and regional weather features was investigated using an Australian case study.The focus was on locations along the eastern seaboard spanning a range of climate zones from the tropics to the extratropics.The role of temporal persistence in these series was also considered.The aims were to identify concise network structures that: (1) characterize the underlying joint distribution of the variables in the network, (2) capture and represent complex hierarchical dependencies among variables sets, (3) adequately deal with nonstationary and noisy time series and (4) are easy to interpret and communicate.Emphasis was placed on improving current understanding of the processes involved rather than predictive accuracy.The results showed that Markov blankets have considerable potential as an efficient expository tool for variable selection in the context of elucidating possible causal pathways in a complex system.They seek discovery of the most relevant factors and construct a Bayesian network and its graphical representation based on those findings.Also, it was shown that block bootstrapping can be a useful strategy for identifying irrelevant, weakly and strongly relevant features in an MBN.This leads to network structures that are reasonably robust to variations in record length and data variability.
Caution should be exercised with respect to the causal interpretation of learned networks due to incomplete knowledge and the presence of data gaps, errors and variability.The pathways identified herein should be subjected to verification by suitable theoretical, observational and modelling studies to provide insight on physical processes for understanding causal mechanisms.Nevertheless, the results indicate that the approach described herein can help provide richer insights into the workings of the system of interest and better serve to generate plausible hypotheses about the underlying processes than conventional statistical methods.In particular, the networks can be used to differentiate between direct and indirect causes of streamflow variability from observational data sources and reanalysis products.With few exceptions, the models identified streamflow persistence and concurrent broad-scale (e.g., monthly average) weather feature variables as having direct effects on streamflow volumes.This is in accordance with physical realism in that large-scale climate variability modes influence regional weather patterns.Nevertheless, the results reported herein suggest that in future research the sensitivity of Markov blanket structures to variations in record length and data variability should always be considered.This is particularly important for the study of mutual relationships between Markov blanket nodes.
No claim is made here as to the completeness of the above analysis.The adoption of a transect-based approach, the reanalysis product used, the selection of atmospheric variables, and investigation of the effects of data variability on network structures and multiple linear regression models, all provide scope for potential future research building on the findings presented here.Extra sources of uncertainty include the locations of the transects, the representativeness of the gauging stations used, limited record length and measurement error.All these sources can affect structure learning as they will result in a lack of power to detect true relationships.The sensitivity of the results to different and higherresolution reanalysis products such as ERA5 (Hersbach et al., 2020) could be investigated.Moreover, additional atmospheric variables worthy of future consideration include the following: zonal and meridional winds at 500 hPa; geopotential height at 500 hPa, convective available potential energy (CAPE), convective inhibition (CIN), surface air temperature and cloud cover.In addition to examining atmospheric variables such as those above, soil moisture measures could also be considered as well as data for the occurrence of weather systems including thunderstorms, cyclones (tropical and extratropical) and fronts.Different physical processes associated with runoff generation (e.g., soil moisture and catchment response characteristics) and environmental conditions associated with evaporation could also be involved.The pursuit of these possibilities is left for future research.

F
I G U R E 2 Location map of the study area.The inset depicts the location of the study area within the broader context of Australia.Filled circles mark locations of the eight streamflow gauging stations considered (see Table1for key to station codes).Crosses mark NCEP reanalysis grid points used in the BN analyses.Dotted lines indicate four Transects (labelled A, B, C and D) consisting of staggered or aligned grid points.The Transects are configured primarily to capture information about 850 hPa zonal and meridional winds, precipitable water and vertical velocity at 500 hPa.

Figure
Figure S2 displays plots of the three lag-order selection criteria (MAE, RMSE and ρ 2 ) versus trial values of k.Visual inspection suggests that a reasonable estimate of lag-order for the Latrobe River data is 1 month.Diagnostic checks (FigureS3) indicate that the residuals obtained from the resulting sDBN model are well behaved (zero mean, constant variance, and no systematic pattern or notable serial correlation).However, there is a tendency to underestimate low and particularly high streamflow extremes.This suggests that use of the multivariate Gaussian assumption cannot properly account for tail behaviour (FigureS1) and may also reflect the use of a monthly time step which smooths weather conditions.For additional insight on the streamflow data, FigureS4presents time series of monthly streamflow values.

F
I G U R E 3 Monthly streamflow series, Latrobe River at Willow Grove.Superposed are local linear regression curves fitted using the base loess function in R with span = 0.35 and degree = 2).Blue curve indicates loess smooth of fitted values obtained from multiple linear regression.Red curve indicates same obtained from the Markov blanket network.Vertical dashed lines indicate positions of three overlapping windows used for network stability analysis.F I G U R E 4 DAG of MBN for Latrobe River dataset.Grey labelled circles indicate Markov blanket nodes and white labelled circle the terminal node.Arrows indicate directed arcs, and arrow colours estimates of arc strength.Le, Latrobe River streamflow series; Le_1, lag-1 streamflow series; DMI, dipole mode index; BU1 and BU2, first and second dynamical principal components of zonal wind at 850 hPa for transect B; CP1, first dynamical principal components of precipitable water for transect C; CV1 and CV2, first and second dynamical principal components of meridional wind at 850 hPa for transect C, respectively; CW1, first dynamical principal component of vertical velocity at 500 hPa for transect C.
Coloured cells indicate nodes and directed arcs that are common to one or more networks.Cyancoloured cells indicate those found for all three networks, grey-coloured cells those found for the Window 1 network only, red-coloured cells those common to Windows 1 and 2 networks, orange-coloured cells those for the Window 2 network only, yellow-coloured cells those for Window 2 and 3 networks, and greencoloured cells those for the Window 3 network.White blank cells indicate arc directions that do not appear in any network.F I G U R E 5 DAG resulting from 1000 bootstrap replicates of Latrobe River dataset.Grey labelled circles indicate non-terminal nodes and white labelled circle the terminal node.Legends display estimated arc strength and arc direction probabilities.Node labels are explained in caption for Figure 4.
Details of original data for climate mode indices.Details of streamflow gauging stations and records.
T A B L E 2 T A B L E 1Note: Streamflow statistics are based data for the study period (May 1958 to December 2018, inclusive).Abbreviations: Ck, Creek; max, maximum; min, minimum; R, River.
T A B L E 4 Comparison of fitted multiple linear regression (MLR) models and learned Markov blanket networks (MBNs).
F I G U R E 6 Dot chart of sets of climate and weather variables selected using stepwise multiple linear regression (MLR, red circles) and Markov blanket (MB, blue circles) approaches.Variables common to both sets are indicated by black circles.Key to station codes is given Table1and variable names are given in Table2and section 4.2.3.