Structural dominance analysis of large and stochastic models


Abstract

The last decade and a half has seen significant efforts to develop and automate methods for identifying structural dominance in system dynamics models. To date, however, the interpretation and testing of these methods have been with small deterministic models (fewer than five stocks) that show smooth behavioral transitions. While the analysis of simple and stable models is an obvious first step in providing proof of concept, the methods have become stable enough to be tested on a wider range of models. In this paper I report the findings from expanding the application domain of these methods in two important dimensions: increasing model size and incorporating stochastic variance in some model variables. I find that the methods work as predicted with large stochastic models, that they generate insights that are consistent with the existing explanations for the behavior of the tested model, and that they do so in an efficient way. Copyright © 2016 System Dynamics Society

Introduction

The link between system structure and dynamic behavior is one of the defining elements of system dynamics (SD). In a sense, a simulation model can be viewed as an explicit and consistent theory of the behavior it exhibits (Kampmann and Oliva, 2009; Oliva, 2003). Although this point of view has certain merits, not least of which is the fact that it lifts the discussion from outcomes to causes and from events to underlying structure (Forrester, 1961; Sterman, 2000), system dynamicists often need more compact explanations of the system's behavior. In fact, most dynamic modeling projects report their results with simple explanations, typically in terms of dominant feedback loops and, occasionally, external driving forces to the system that produce the salient features of the behavior.

For simple systems with relatively few variables, it is usually easy to use intuition and trial-and-error simulation experiments to explain the dynamic behavior as resulting from particular feedback loops. In larger systems, this method becomes increasingly difficult and the risk of incorrect explanations rises accordingly. There is a need, therefore, for analytical methods that provide consistency and rigor to this process.

Eigenvalue elasticity analysis (EEA) is a set of methods to assess the effect of structure on behavior in dynamic models (Kampmann, 2012; Kampmann and Oliva, 2006; Oliva, 2015). It works by considering model behavior as a combination of characteristic behavior modes and assessing the relative importance of particular elements of system structure in influencing these behavior modes. Elements of the model structure that have a large influence on particular behaviors can provide important clues to identify areas for further testing and policy design. EEA uses linear systems theory to (i) decompose the observed behavior into its constituent behavior modes, such as oscillation, growth and exponential adjustment, and (ii) outline how a particular behavior mode and its appearance in a given system variable depend upon particular parameters and structural elements (links and loops) in the system. In this manner, it provides a precise account of the relationship between structure and behavior through mathematical rigor absent in the experimental methods normally used in the field (Duggan and Oliva, 2013).

The last decade and a half has seen significant efforts to develop and automate methods for identifying structural dominance in SD models (see Duggan and Oliva, 2013, for an overview of this literature). To date, however, the interpretation and testing of these methods have been with small deterministic models (fewer than five stocks) that show smooth behavioral transitions (e.g. Gonçalves, 2009; Güneralp, 2006; Kampmann and Oliva, 2006; Mojtahedzadeh, 2011; Mojtahedzadeh et al., 2004; Saleh et al., 2010). While the analysis of simple and stable models is an obvious first step in providing proof of concept, the methods have become stable enough to be tested in a wider range of models. In this paper I report the findings from expanding the application domain of these methods in two dimensions: increasing model size and incorporating stochastic variance in some model variables. These expansions are important tests of the applicability of EEA, as most realistic applications of SD require higher-order models than the demonstration models used until now. Furthermore, an intrinsic part of testing SD models and assessing the robustness of proposed policies is to subject these models to expected random noise (Sterman, 2000, section 4.3.2 and Appendix B). As noise can have a significant effect on model performance and hide otherwise easy-to-observe core behavior modes (thus making the use of exploratory methods for structural analysis more difficult), it is important to assess whether these methods can work under noisy conditions.

While I only show the results of the analysis of one large and stochastic model, the results are promising. I find that the methods work as predicted, that they generate insights that are consistent with the existing explanations for the behavior of the tested model, and that they do so in an efficient way.

The next section provides an overview of the eigenvalue elasticity methods as well as the most common strategies to interpret their results. The overview is followed by a brief presentation of the model used for the analysis, as well as a description of the modifications that had to be performed for the methods to work. The following section describes the strategy followed to select the operating points [1] at which to perform the model linearization and analysis. EEA results and a comparison with previous analyses, narratives and policy recommendations are presented in the next section. The paper concludes by summarizing findings and implications.

Background and formulation

Characterizing linear and nonlinear systems

A dynamic model can be represented mathematically as a set of ordinary differential equations:

$$\dot{x}(t) = f\big(x(t), u(t)\big) \qquad (1)$$

where x(t) is a column vector of n state variables (levels), u(t) is a column vector of p exogenous variables, f(·) is a corresponding vector function, and t is simulated time. A dynamic system (model) is said to be linear (nonlinear) if f is a linear (nonlinear) function of its arguments. Given the model structure (1), knowledge of the initial conditions x(0) and the path of the input variables u(t), the behavior of the model is completely determined. In this sense, the model structure (1) constitutes a “theory” of the behavior x(t).

The approaches considered in this paper are based on tools from linear systems theory (Chen, 1970) and they approximate the nonlinear model (1) with a linearized version, using the first-order Taylor expansion around some operating point x0, u0, i.e.

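$$\dot{x}(t) \approx f(x_0, u_0) + \left.\frac{\partial f}{\partial x}\right|_{x_0, u_0} (x - x_0) + \left.\frac{\partial f}{\partial u}\right|_{x_0, u_0} (u - u_0)$$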

or, by redefinition of the variables x → x − x0 − f(x0, u0) × (t − t0) and u → u − u0:

$$\dot{x}(t) = A\,x(t) + B\,u(t) \qquad (2)$$

where A is a constant n × n matrix of partial derivatives ∂f_i/∂x_j and B is a constant n × p matrix of partial derivatives ∂f_i/∂u_j, and all partial derivatives are evaluated at the operating point. These matrices of first-order partial derivatives are known as Jacobian matrices.
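As an illustration of the linearization step, the following minimal sketch approximates A and B by central finite differences at an operating point. The two-stock model `f`, its parameter values and the operating point are hypothetical and chosen only for illustration; they are not taken from the test model, and the EEA utilities may well obtain the Jacobian symbolically instead.

```python
# Hypothetical two-stock model (NOT the paper's model): a logistic stock x[0]
# that drives a first-order adjustment stock x[1]; u[0] is an exogenous input.
def f(x, u):
    growth = 0.5 * x[0] * (1.0 - x[0] / 100.0) + u[0]
    adjust = (x[0] - x[1]) / 4.0
    return [growth, adjust]


def jacobian(func, x0, u0, wrt="x", h=1e-6):
    """Central-difference Jacobian of func at the operating point (x0, u0)."""
    base = x0 if wrt == "x" else u0
    n_out = len(func(x0, u0))
    J = [[0.0] * len(base) for _ in range(n_out)]
    for j in range(len(base)):
        hi, lo = list(base), list(base)
        hi[j] += h
        lo[j] -= h
        f_hi = func(hi, u0) if wrt == "x" else func(x0, hi)
        f_lo = func(lo, u0) if wrt == "x" else func(x0, lo)
        for i in range(n_out):
            J[i][j] = (f_hi[i] - f_lo[i]) / (2.0 * h)
    return J


x_op, u_op = [20.0, 15.0], [0.0]       # assumed operating point
A = jacobian(f, x_op, u_op, wrt="x")   # n x n matrix of df_i/dx_j
B = jacobian(f, x_op, u_op, wrt="u")   # n x p matrix of df_i/du_j
```

Repeating the call at several operating points gives one (A, B) pair per point, which is the input every later step of the analysis works from.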

Initially, one may be concerned with the endogenous response of the system, in which case one can set the exogenous or control variables to zero or a constant (u = 0). [2] In the absence of changes in exogenous inputs, the resulting behavior of any given state variable x_i(t) can be written as a weighted sum of a set of behavior modes:

$$x_i(t) = \sum_{j=1}^{n} w_{ij}\, e^{\lambda_j t} \qquad (3)$$

where the λ's are the eigenvalues of the system Jacobian matrix A and the weights w are constants that depend upon the eigenvectors and the initial conditions of the system (see Saleh et al., 2010, for derivation).

Equation (3) yields three important insights. First, each of the system eigenvalues represents a behavior mode. For real eigenvalues, the behavior mode e^{λt} amounts to exponential growth (λ > 0) or exponential adjustment (λ < 0). Complex eigenvalues appear in conjugate pairs δ ± iω, which give rise to oscillations e^{δt} sin(ωt + θ) of frequency ω that are either expanding (if δ > 0) or damped (if δ < 0). The absolute value of λ is known as the natural frequency, f_n = |λ| = √(δ² + ω²), while the imaginary part of λ is known as the damped frequency, f_d = ω. Second, the behavior of every state variable in the system is a constant weighted sum of the system behavior modes. That is, the behavior of every state variable results from the projection of each behavior mode λ, with weight w, onto that state variable. Finally, the system's core behavior modes are structurally determined, as they are derived from the eigenvalues λ of the system matrix A.
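The first insight can be sketched in a few lines: compute the eigenvalues of a system matrix and label each as a behavior mode. The 3 × 3 matrix below is hypothetical (it is not the Jacobian of the test model), and the function name and the tolerance used to decide whether an eigenvalue is real are assumptions of this sketch.

```python
import numpy as np

# Hypothetical 3x3 system matrix (NOT from the paper's model).
A = np.array([[-0.5, -2.0,  0.0],
              [ 2.0, -0.5,  0.0],
              [ 0.0,  1.0, -0.1]])


def behavior_modes(A, tol=1e-9):
    """Label each eigenvalue of A as a behavior mode (one entry per conjugate pair)."""
    modes = []
    for lam in np.linalg.eigvals(A):
        delta, omega = lam.real, lam.imag
        if omega < -tol:
            continue  # skip the mirror image of a complex-conjugate pair
        if omega > tol:
            kind = "expanding oscillation" if delta > 0 else "damped oscillation"
        else:
            kind = "exponential growth" if delta > 0 else "exponential adjustment"
        modes.append({
            "kind": kind,
            "delta": float(delta),                  # real part
            "damped_frequency": float(abs(omega)),  # imaginary part
            "natural_frequency": float(abs(lam)),   # |lambda|
        })
    return modes


modes = behavior_modes(A)
```

For this matrix the sketch reports one damped oscillation (the pair −0.5 ± 2i) and one exponential adjustment (−0.1).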

Different “flavors” of EEA emphasize each of the three insights in different ways. For example, EEA can be used to develop a “structural explanation of behavior” as it can pinpoint which system elements are responsible for generating a particular behavior mode λ. Alternatively, the tools can be used to derive effective policy recommendations by isolating the system elements affecting the projection of a particular reference mode in a stock (w), or altogether changing the system behavior modes λ.

Eigenvalue elasticity and influence

The EEA is concerned with assessing how system structure affects the behavior modes (λ) as well as the projections of those behavior modes in a particular stock (w). A measure of the impact on an eigenvalue λ when one changes individual elements a of the system matrix A is the eigenvalue elasticity:

$$\varepsilon = \frac{\partial \lambda}{\partial a}\,\frac{a}{\lambda}$$

The most granular element of system structure is the gain of the link between two variables, i.e. the ratio of the output to the input. Clearly, all elements a of the system matrix A are combinations of these individual link gains, and thus it is possible to make assessments of eigenvalue elasticity to each link gain and model parameter.

For a complex-valued eigenvalue, the elasticity measure will also be a complex number. One may define the elasticities of the real and imaginary parts separately, i.e. as the real numbers

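$$\frac{\partial\, \mathrm{Re}\{\lambda\}}{\partial a}\,\frac{a}{\mathrm{Re}\{\lambda\}} \qquad \text{and} \qquad \frac{\partial\, \mathrm{Im}\{\lambda\}}{\partial a}\,\frac{a}{\mathrm{Im}\{\lambda\}}$$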

respectively. Kampmann and Oliva (2006), however, found that it is often easier to work with the so-called influence measure instead, defined as

$$\mu = \frac{\partial \lambda}{\partial a}\,a \qquad (4)$$

where Re{μ} = μ_δ and Im{μ} = μ_ω. In addition to simplifying interpretation, the influence measures also eliminate computational difficulties when eigenvalues are close to zero.
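To make the two measures concrete, the sketch below computes the elasticity and the influence of every element a_ij of a small system matrix on every eigenvalue. It relies on the standard first-order perturbation result ∂λ_k/∂a_ij = l_ki r_jk, where r_k and l_k are right and left eigenvectors scaled so that l_k · r_k = 1, and it assumes distinct eigenvalues. The matrix and the function name are hypothetical, and this is not necessarily how the EEA utilities perform the computation.

```python
import numpy as np

# Hypothetical damped oscillator (NOT from the paper's model).
A = np.array([[ 0.0,  1.0],
              [-4.0, -0.4]])


def eigen_sensitivities(A):
    """Return eigenvalues, influence mu[k, i, j] and elasticity eps[k, i, j]."""
    lam, R = np.linalg.eig(A)      # columns of R are right eigenvectors
    L = np.linalg.inv(R)           # rows of L are left eigenvectors, L @ R = I
    # dlam[k, i, j] = d(lambda_k) / d(a_ij) = L[k, i] * R[j, k]
    dlam = np.einsum("ki,jk->kij", L, R)
    mu = dlam * A                  # influence: (d lambda / d a) * a
    eps = mu / lam[:, None, None]  # elasticity: influence / lambda
    return lam, mu, eps


lam, mu, eps = eigen_sensitivities(A)
```

A useful sanity check on any such implementation is that, for each eigenvalue, the influences of all matrix elements add up to the eigenvalue itself, so the elasticities add up to one. Elements that are zero (no link) have zero influence by construction.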

Loop eigenvalue elasticity analysis (LEEA)

Kampmann (2012) showed that it was possible to express the characteristic polynomial [3] of the system matrix A, i.e. the polynomial whose zeros are the eigenvalues of A, in terms of the gains of the loops in what he termed an independent loop set (ILS). The gain of a loop is defined as the product of the gains of its constituent links. An ILS is a maximal set of loops whose gains can be determined or changed independently of each other. The gain of any loop outside this set is then dependent upon the loop gains in the ILS.

Once an ILS has been identified (for procedures, see Kampmann, 2012; Oliva, 2004), it is possible to calculate the gain (g) of each loop in the set and use those gains as the basis for exploration of the eigenvalues’ behavior. Specifically, the eigenvalue elasticity to a loop and the loop influence metric are defined as

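$$\varepsilon_g = \frac{\partial \lambda}{\partial g}\,\frac{g}{\lambda}, \qquad \mu_g = \frac{\partial \lambda}{\partial g}\,g \qquad (5)$$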

While a complex model might not have a unique ILS, this decomposition focuses the analysis on a relevant subset of loops. In particular, changes in relationships in the model that are not part of a feedback loop will have no effect upon the system eigenvalues. Thus one can interpret the elasticities or influence measures in terms of how they affect the gains of a set of (independent) feedback loops in the system. Alternatively, one can assess the relative importance of a particular feedback loop in generating a particular mode of behavior, where loops with large elasticities (or influence) are considered important for the behavior mode in question.
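A minimal worked case may help fix ideas. For a two-stock linear system, the characteristic polynomial can be written in terms of three loop gains: the two first-order loops (g1 = a11, g2 = a22) and the second-order loop (g3 = a12 · a21), giving P(λ) = λ² − (g1 + g2)λ + g1·g2 − g3. Implicit differentiation of P(λ) = 0 then yields ∂λ/∂g = −(∂P/∂g)/(∂P/∂λ). The link gains and loop names below are hypothetical, and the closed form holds only for this 2 × 2 illustration, not for a general ILS.

```python
import cmath

# Hypothetical link gains for a two-stock model (NOT from the paper).
a11, a22 = -0.2, -0.5      # each stock's own first-order (minor) loop
a12, a21 = -1.0, 0.8       # the two links that form the second-order (major) loop

# Loop gain = product of the gains of the loop's links.
gains = {"minor_1": a11, "minor_2": a22, "major": a12 * a21}


def eigenvalue(g, branch=+1):
    """Root of P(lam) = lam^2 - (g1 + g2) lam + g1 g2 - g3 (2x2 case only)."""
    s = g["minor_1"] + g["minor_2"]
    p = g["minor_1"] * g["minor_2"] - g["major"]
    return (s + branch * cmath.sqrt(s * s - 4.0 * p)) / 2.0


def loop_measures(g, lam):
    """Loop influence (dlam/dg) * g and loop elasticity (dlam/dg) * g / lam."""
    dP_dlam = 2.0 * lam - g["minor_1"] - g["minor_2"]
    dP_dg = {"minor_1": g["minor_2"] - lam,
             "minor_2": g["minor_1"] - lam,
             "major": -1.0}
    influence = {k: -dP_dg[k] / dP_dlam * g[k] for k in g}
    elasticity = {k: v / lam for k, v in influence.items()}
    return influence, elasticity


lam = eigenvalue(gains)
influence, elasticity = loop_measures(gains, lam)
```

In this particular example the major loop's influence is purely imaginary, so it shapes the frequency of the oscillation, while the real part of the eigenvalue (the damping) is set entirely by the two minor loops.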

Dynamic decomposition weight analysis (DDWA)

DDWA, introduced by Saleh et al. (2010), is based on the eigenvectors of the system matrix A and is concerned with what happens to the weights w in Eq. (3) when changes are made to the system elements (parameters and link gains). As with the LEEA, one may express the relationship either as influence measures or elasticities. Specifically:

$$\mu_w = \frac{\partial w}{\partial a}\,a, \qquad \varepsilon_w = \frac{\partial w}{\partial a}\,\frac{a}{w}$$

Unlike in LEEA, however, where only those links in the model that are part of feedback loops will have any significance, all the links in the model are potentially relevant in the determination of the DDWs.
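The weights themselves can be sketched directly from the eigen-decomposition. For the unforced linear system, x(t) = Σ_j r_j (l_j · x(0)) e^{λ_j t}, so the weight of mode j in state i is w_ij = r_ij (l_j · x(0)), with right eigenvectors r_j and left eigenvectors l_j scaled so that l_j · r_j = 1. The matrix, the initial condition and the function names below are hypothetical, and distinct eigenvalues are assumed; sensitivities of w to a link gain could then be approximated by recomputing the weights after a small change in A.

```python
import numpy as np

# Hypothetical system matrix and initial condition (NOT from the paper).
A = np.array([[ 0.0,  1.0],
              [-4.0, -0.4]])
x0 = np.array([1.0, 0.0])


def decomposition_weights(A, x0):
    """Return eigenvalues and weights W[i, j] of mode j in state variable i."""
    lam, R = np.linalg.eig(A)   # columns of R: right eigenvectors
    L = np.linalg.inv(R)        # rows of L: left eigenvectors, L @ R = I
    W = R * (L @ x0)            # W[i, j] = R[i, j] * (l_j . x0)
    return lam, W


def trajectory(lam, W, t):
    """Evaluate Eq. (3): x_i(t) = sum_j W[i, j] * exp(lam_j * t)."""
    return (W @ np.exp(lam * t)).real


lam, W = decomposition_weights(A, x0)
x_at_1 = trajectory(lam, W, 1.0)
```

Two quick checks follow from the definition: the weights of each state variable sum to its initial value, and the weighted sum of the eigenvalues reproduces the initial rate of change A·x(0).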

The next sections describe the model selected for this test as well as the model changes that had to be made in order to perform EEA.

Test model

The model used for this test is the one presented in Oliva and Sterman (2001). The model captures how a service setting responds to imbalances in supply and demand and, given the structural characteristics of service settings, it provides an endogenous explanation for the erosion of service quality often seen in industry (see Oliva, 2001, for examples and articulation of the dynamic hypothesis). Since services are simultaneously produced and consumed, only four possible responses are available for a service setting facing high demand (Oliva and Sterman, 2001): increase the employees’ work intensity; reduce the time per order; invest in additional capacity (either through personnel or technology); or control the customer inflow through pricing (see Figure 1 for core model structure). As management attempts to maximize profit, it underutilizes the last two options (most settings will not invest in additional capacity until there is ample evidence that it is needed), thus forcing employees to either increase their work intensity (loop 19 in Figure 1) or reduce the time per order (loop 18). As increasing work intensity has long-term implications (fatigue and turnover; loops 20 and 31), [4] employees prefer to reduce the time per order. This reduction of service quality often goes unnoticed as services are intangible and it is difficult to define a service quality standard. Thus the reduction in time per order is interpreted by management as a productivity gain, resulting in decisions to further reduce staffing levels and consequently sustaining the imbalance between service demand and supply (loop 17).

Figure 1.

Model structure (subset) and loops in SISL. Fatigue E and Fatigue A are the work intensity accumulations that affect productivity (short-term) and turnover (long-term) respectively. Only negative link polarities are labeled

This model was an attractive choice for several reasons. First, the model is openly available and fully documented. [5] Second, the model is relatively large and complex as it contains 117 symbols, including 14 stocks and 110 feedback loops; the independent loop set (ILS) contains 35 loops. Third, the model is realistic in scope and application as each of its 37 structural parameters was grounded in a specific situation and the model was fitted to six different time series. Furthermore, the model has been used to develop organizational policy at various sites and as a basis for further theoretical developments (e.g. Chuang and Oliva, 2015; Martinez-Moyano et al., 2014; Oliva, 2001; Oliva and Sterman, 2010). Fourth, the model's core formulations have been extensively tested (e.g. Dogan and Sterman, 2007; Oliva, 2003; Sterman, 2000; Struben et al., 2015). Finally, exogenous stochastic elements (customer demand and employee absenteeism) play an important role in the model's dynamic hypothesis, as a large fraction of the observed erosion of service quality is the result of how the system absorbs these random variations.

Random variations

In the original study (Oliva, 1996), I found that customer demand and absenteeism exhibited small random variations around their averages. Since no endogenous drivers of these variations were identified, I treated both of these variables as model inputs. While the model was calibrated and first analyzed using 1 year of data for these variables, testing over extended periods of time, and under a variety of replications (e.g. different initial conditions) required characterizing these random variations. Both variables were modeled as independent stationary random variables whose means, variances and autocorrelation spectra were estimated from the historical data. Specifically, I estimated the autocorrelation function of each data series and identified a correlation time constant of about 2 weeks for absenteeism and about 1 week for customer demand. Since variations among the two time series were independent of each other, each was modeled as a separate pink noise process (see Sterman, 2000, Appendix B for a description of autocorrelated noise).
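A first-order autocorrelated (pink) noise process of this kind can be sketched as white noise passed through a first-order exponential smooth. In the sketch below the white-noise standard deviation is scaled by √((2 − dt/τ)/(dt/τ)) so that the smoothed output has the requested standard deviation; this follows from the stationary variance of the discrete recursion used here. The 2-week correlation time is taken from the text, while the mean, standard deviation, time step and function name are hypothetical placeholders rather than the values estimated in the original study.

```python
import math
import random


def pink_noise(n_steps, dt, corr_time, mean, sd, seed=0):
    """First-order autocorrelated noise with the given mean, sd and correlation time."""
    rng = random.Random(seed)
    alpha = dt / corr_time                        # fraction of the gap closed per step
    white_sd = sd * math.sqrt((2.0 - alpha) / alpha)
    x = mean
    series = []
    for _ in range(n_steps):
        series.append(x)
        white = mean + white_sd * rng.gauss(0.0, 1.0)
        x += alpha * (white - x)                  # first-order smooth of the white noise
    return series


# Hypothetical absenteeism input: mean 5%, sd 1%, correlation time 2 weeks,
# time step of 1/8 week.
absenteeism = pink_noise(n_steps=200_000, dt=0.125, corr_time=2.0,
                         mean=0.05, sd=0.01, seed=1)
```

Because the two inputs are modeled as independent, a second series for customer demand would simply use its own seed and a 1-week correlation time.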

The next section summarizes the required modifications to prepare the model for the existing EEA software implementations (Oliva, 2015) and underlying assumptions.

Model preparation

Two types of change to the model had to be performed in preparation for EEA. This paper's electronic supplement (supporting information) contains a documented runtime version of the resulting model that can be easily compared to the pre-existing model (see footnote 4).

The first set of changes addressed the existing limitations of the tools used to perform EEA. These changes did not modify the model structure but rather accommodated the software. First, to address the inability of the software routines to process macros, I removed the model structure responsible for generating random customer demand and absenteeism rates. Instead, I fed a stream of random numbers generated by these macros directly into the model structure. Second, some variables had to be renamed to fit the naming conventions required by the Mathematica® utilities developed to perform EEA. Third, I made explicit the formulation of nine rates that were previously embedded in special functions (e.g. smooth, delay). Finally, I removed one stock that in the original formulation remained constant throughout the base simulation and provided no active feedback mechanism. The inactive stock, as it had no change rate, resulted in a system matrix with reduced rank that could not be solved. Removing the stock from the model restored a full-rank system matrix. These first four changes had no impact on model behavior.

The second type of model change performed ensured all model equations were continuously differentiable (C1) so that the algorithm was able to generate the Jacobian system matrix A. This involved identifying continuous analytical forms for the table functions and removing or replacing discontinuous formulations (e.g. MIN, MAX, IF_THEN_ELSE). These changes required extensive testing as they had potential impact on model behavior.

Finding an analytical approximation of a table function is relatively easy, and calibrating the function—i.e. identifying the parameters that best match the existing piecewise table function—can quickly be done with Excel's solver set to minimize the square differences between the table function values and the analytical form. Four table functions were replaced in the test model using analytical forms. After each change, I did extensive testing to assess the impact of these substitutions on model behavior, especially if the model was to run outside of the pre-existing table range. The net effect on model behavior of replacing these table functions was negligible.

Generally speaking, discontinuous formulations (e.g. equations containing MIN, MAX, IF_THEN_ELSE) can be replaced with a generalized logistic curve with a very steep slope at the inflexion point and the pre-fixed lower and upper asymptotes. [6] In this case, however, rather than complicating the model, I was able to eliminate three MAX and one MIN functions. These functions were originally introduced for formulation robustness (i.e. first-order controls on stocks). After extensive testing, I found that these functions were non-binding in simulations under normal operating conditions, and decided to eliminate them. The model lost robustness, but eliminating those functions had no impact on the model behavior for the base case explored here. [7]

The model also contained two IF_THEN_ELSE statements. One of them is central to the dynamic theory of erosion of service quality as it controls the asymmetric adjustment of the desired time per order (the time constant for the adjustment is longer if time per order is greater than desired time per order), and it had to be retained in the model formulation for the analysis to be meaningful. I replaced the IF statement with an analytical function of a logistic curve with the characteristics described above. The replacement had negligible impact on model behavior. The second IF statement was originally introduced to allow for faster reduction of labor and was there to prevent replacing the attrition rate if desired labor is less than actual labor. While the formulation is clearly discontinuous, it accurately reflected the policies in place and fitted the historical data well. For this exercise, however, I did not attempt to replicate this behavior and decided to eliminate the IF statement. The less aggressive downsizing (see left-hand panel in Figure 2) resulted in lower erosion of service quality (from an average erosion rate of the desired time per order of 3.17 percent per year in the original model to 3.09 percent per year in the C1 model) but very small differences in day-to-day operations (time per order; see right-hand panel of Figure 2).
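The kind of substitution described for the asymmetric adjustment can be sketched as follows: a discontinuous switch between two time constants is replaced by a steep logistic curve whose asymptotes are the two original values. All names, the two time constants and the steepness parameter are hypothetical; the actual functional form and parameter values used in the C1 model are documented in the paper's supplement, not here.

```python
import math


def logistic(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def time_constant_if(tpo, dto, tau_down=10.0, tau_up=50.0):
    """Discontinuous form: IF_THEN_ELSE(tpo > dto, tau_up, tau_down)."""
    return tau_up if tpo > dto else tau_down


def time_constant_smooth(tpo, dto, tau_down=10.0, tau_up=50.0, steepness=500.0):
    """Continuously differentiable form: logistic curve between the two asymptotes."""
    return tau_down + (tau_up - tau_down) * logistic(steepness * (tpo - dto))


# Away from the switching point the two forms agree closely; at tpo == dto the
# smooth form passes through the midpoint of the two time constants.
below = time_constant_smooth(0.9, 1.0)
above = time_constant_smooth(1.1, 1.0)
at_switch = time_constant_smooth(1.0, 1.0)
```

The steepness parameter is the design trade-off: larger values track the original IF statement more closely but produce very large derivatives near the switching point, which then appear as large entries in the Jacobian at operating points close to it.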

Figure 2.

Behavior difference between original and continuously differentiable (C1) model

The changes to the original model reduced the number of symbols in the model to 112, of which 13 are state variables. While the number of feedback loops in the model stayed the same (110), the number of loops in the ILS dropped to 33. Figure 3 shows the behavior of the main model variables for the first 147 weeks of a typical simulation (initial time for the simulation was week 53). The first panel shows the number of orders processed (order fulfillment) each week, which happens to be equal to the number of customer orders arriving in the service centers, as employees were required to clear the backlog every day. The second and third panels show the adjustments to work intensity and time per order that employees performed every day to meet with the random variations in customer orders, as well as the accumulations of these immediate responses. Work intensity accumulates in short-term fatigue that affects employee productivity, and long-term fatigue that eventually affects turnover. Adjustments in time per order affect the internal standard of desired time per order (the proxy for service quality). Note how the adjustment of desired time per order is asymmetrical on time per order in that it erodes when time per order is below the standard, but the standard does not improve when employees dedicate more time per order.

Figure 3.

Behavior over time of main model variables (C1 model)

The last panel in Figure 3 shows the difference between desired labor and total labor. As the service center is initially understaffed (weeks 53–79), employees increase their work intensity and reduce the time per order in order to match customer demand. When the hiring process eventually catches up with the labor requirements we see a period of low work intensity and high time per order (weeks 80–150). Note, however, that during this period management has reduced the desired labor despite the fact that customer orders are stationary as it adjusted its perceived labor productivity (not shown) to the new desired time per order. This reduction in desired labor eventually reduces total labor, creating a new imbalance and further perpetuating the erosion of service quality. Note that this reference mode is, with the exception of random realizations, the same as that reported in Oliva and Sterman (2001).

Once the model has been transformed into a continuously differentiable model, it is necessary to select the operating points where the model will be linearized to analyze its structural dominance. I tackle that issue in the next section.

Sampling of operating points

Remember that EEA is based on a linearization of the model at a particular operating point. As such, it is necessary to perform the EEA at different operating points to assess the model's dominant structure under different operating conditions. However, the selection of those operating points is not trivial in a model with random inputs that is constantly being “shocked” and taken out of equilibrium. In this section I discuss several approaches tested to select the operating points to study and report the benefits of a successful strategy.

Regular sampling

The first idea was to sample the model operating point over regular intervals. Figure 4 shows the behavior of the time per order for the model simulation and the values of the sampling points every 20 weeks. While the approach seems intuitive, there is no certainty that the selection will cover the full range of operating conditions—keep in mind that the operating point for the modified system is defined by the state of the system, i.e. the value of the 13 stocks in the system. As the system spends more time operating under “normal conditions” this regular interval sampling is likely to miss extreme operating points that might shift the model's structural dominance. Specifically, note how the regular interval sampling misses the extreme values of the plotted variable in Figure 4.

Figure 4.

Model behavior and sampling points at regular intervals

Convex hull

In an effort to capture the model's extreme operating points, I attempted to identify the convex hull of the system's 13-dimensional state space, i.e. the smallest subset of system states that contains all other realized system states. However, over 200 points in a 13-dimensional space proved to be too much for the Quickhull algorithm (Barber et al., 1996).

To reduce the load on the algorithm, I decided to contract the state space considered to six variables. This contraction was based on the idea that some stocks are a direct function of other stocks, and as such they show high covariance. Thus one individual stock can be used to represent a broader set of stocks. Specifically, the variable service capacity was used to reflect the four stocks in the labor sector (desired labor, vacancies, rookies and experienced personnel), the stock short-term fatigue also served as a proxy for long-term fatigue, and the four stocks dealing with the perception of service quality were captured by the stock employee quality expectation. The stocks service backlog, desired time per order and perceived labor productivity complete the list of variables considered for the reduced state space.

The algorithm was capable of finding the convex hull of the reduced state space in less than 2 seconds. However, over 50 percent of the points considered ended up being in the convex hull; thus the algorithm did not provide a significant reduction in the set of operating points. While this strategy to identify a reduced set of extreme operating conditions seems theoretically sound, [8] the number of state variables in a typical SD model requires a large set of vertices to cover all potential extreme combinations. Furthermore, the convex hull approach does not take advantage of the high covariance among state variables often seen in SD models.

Cluster analysis

An alternative to the exhaustive search of extreme points implied by the convex hull is to identify clusters of similar operating points and assess the model structure in each of these clusters. To test this strategy, I performed agglomerative hierarchical clustering on the reduced state space (the six variables described above) sampled every other week. I used Ward's method to minimize the within-cluster variance (Ward, 1963). Inspection of the parallel plots revealed that nine clusters were enough to generate sufficiently uniform clusters (see Figure 5) and inspection allowed the identification of operating points that were representative of the whole cluster (heavy lines in Figure 5). The times identified as representative for the nine clusters were {69, 79, 97, 143, 195, 243, 251, 275 and 293}.
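The clustering step can be sketched with SciPy's hierarchical clustering routines. The state trajectory below is synthetic (a random walk standing in for six slowly drifting stocks sampled every other week), and two choices are assumptions of this sketch rather than the procedure used in the paper: the stocks are standardized to z-scores before clustering, and the representative operating point of each cluster is taken to be the member closest to the cluster centroid, whereas the paper selected representatives by inspecting parallel plots.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Synthetic stand-in for the reduced state space (NOT model output):
# 121 samples, weeks 53, 55, ..., 293, of six drifting stocks.
rng = np.random.default_rng(0)
times = 53 + 2 * np.arange(121)
states = np.cumsum(rng.normal(size=(121, 6)), axis=0)


def representative_times(times, states, n_clusters=9):
    """Ward clustering of operating points; one representative time per cluster."""
    z = (states - states.mean(axis=0)) / states.std(axis=0)   # assumed scaling
    tree = linkage(z, method="ward")                          # Ward's minimum variance
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    reps = {}
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        centroid = z[members].mean(axis=0)
        dist = np.linalg.norm(z[members] - centroid, axis=1)
        reps[int(c)] = int(times[members[np.argmin(dist)]])   # nearest to centroid
    return labels, reps


labels, reps = representative_times(times, states)
```

Listing the sample times that fall in each cluster (for example `times[labels == c]`) shows directly whether the clusters are temporally contiguous, which is the property discussed below.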

Figure 5.

Nine hierarchical clusters (Ward's method) of the model's reduced state space. The x-axes contain the six stocks in the reduced state space, where sb stands for service backlog, dto desired time per order, fe short-term fatigue, plp perceived labor productivity, eqe employee quality expectation and sc service capacity. The y-axis shows the relative value of the stock at a point in time. Each line represents a model operating point in the reduced state space. The groupings minimize the within-cluster variance

Interestingly, the resulting clusters were time dependent. For instance, all points between time 53 and 73 were grouped together in cluster 1; cluster 2 included all the points between time 75 and 89, etc. Although the mapping into temporal stages was not perfect, four clusters {1, 2, 8, 9} were temporally contiguous and the other five had at most three points out of their main temporal range. In retrospect, this is not surprising as the model shows inertia and slowly drifts from one operating point to a new operating point.

Although the time dependence of the clusters gives credence to the regular sampling strategy first described, this result might not generalize to all SD models. Instead, I recommend the clustering strategy, as it will explicitly identify different operating points that cover the model's operating range.

Now, as discussed above, model dimensionality might be an issue while performing cluster analysis. In this case, I contracted the state space based on my understanding of the model structure. For more complex models, or where the analyst does not have a priori intuition of the covariance of the stocks’ behaviors, I recommend using principal components analysis (PCA) (Greene, 1997) across all state variables to identify which states load together and can thus be represented by a single stock. In fact, performing PCA on the 13 stocks over the base case simulation indicated that 77 percent of the observed variance could be explained by selecting the most representative stock of the three top principal components. [9] In what follows, however, I retain the representative operating point for each of the clusters identified in my reduced state space.
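A minimal sketch of the suggested PCA screening, using a singular value decomposition of the standardized stock trajectories. The data are synthetic (13 series generated from three latent factors plus noise) and are not output of the test model, so the explained-variance figures produced by this sketch are unrelated to the 77 percent reported above. Taking the stock with the largest absolute loading as the "most representative" stock of a component is an assumption of this sketch.

```python
import numpy as np

# Synthetic stand-in for 13 stock trajectories (NOT model output):
# three latent factors plus idiosyncratic noise, 241 weekly samples.
rng = np.random.default_rng(0)
factors = rng.normal(size=(241, 3))
loadings = rng.normal(size=(3, 13))
stocks = factors @ loadings + 0.3 * rng.normal(size=(241, 13))


def pca(stocks):
    """Return explained-variance ratios and principal axes (rows) of the z-scored data."""
    z = (stocks - stocks.mean(axis=0)) / stocks.std(axis=0)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return explained, vt


explained, components = pca(stocks)

# Assumed selection rule: for each of the top components, pick the stock
# (column index) with the largest absolute loading.
top = 3
representative = [int(np.argmax(np.abs(components[k]))) for k in range(top)]
```

Stocks that load heavily on the same component are candidates to be represented by a single stock when building the reduced state space for clustering.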

Eigenvalue elasticity analysis

Once the model was formatted and expressed in C1 form and the operating points selected, the EEA utilities processed it in less than 90 seconds. The EEA utilities run in Mathematica® and are available online (Oliva, 2015; Oliva and Kampmann, 2010). The utilities generate, for the operating point at the sample time, the system eigenvalues (λ), the dynamic decomposition weights (w), link and parameter elasticities to λ and w and, for the loops in the ILS, the loop gains and loop elasticities to λ.

Eigenvalues

Table 1 shows the values of the 13 system eigenvalues over the 13 sampled times following the regular interval sampling. Note that despite the random noise that affects the system on every simulation interval, and the fact that the system is drifting across different operating modes over time (see cluster analysis discussion above), the eigenvalues are remarkably stable over time. Indeed, performing the analysis of the eigenvalues in the representative times for the nine clusters yielded similar stability (see Table 2).

Table 1. System eigenvalues (real part) over regular sampling interval (20 weeks)

Time        1        2        3        4        5        6        7        8        9       10       11       12       13
  53   −9.133   −0.321   −0.250   −0.186   −0.084   −0.084   −0.083   −0.045   −0.045   −0.038   −0.019   −0.019   −0.001
  73  −10.563   −0.321   −0.250   −0.187   −0.085   −0.085   −0.083   −0.026   −0.026   −0.038   −0.019   −0.019    0.000
  93  −10.994   −0.325   −0.250   −0.187   −0.084   −0.084   −0.083   −0.026   −0.026   −0.038   −0.019   −0.019    0.000
 113  −10.478   −0.324   −0.250   −0.184   −0.083   −0.083   −0.083   −0.029   −0.029   −0.038   −0.019   −0.019    0.000
 133  −10.263   −0.323   −0.250   −0.184   −0.083   −0.083   −0.083   −0.029   −0.029   −0.038   −0.019   −0.019    0.000
 153  −10.865   −0.322   −0.250   −0.183   −0.083   −0.083   −0.083   −0.030   −0.030   −0.038   −0.019   −0.019    0.000
 173  −10.596   −0.323   −0.250   −0.185   −0.084   −0.084   −0.083   −0.028   −0.028   −0.038   −0.019   −0.019    0.000
 193  −10.633   −0.324   −0.250   −0.185   −0.084   −0.084   −0.083   −0.028   −0.028   −0.038   −0.019   −0.019    0.000
 213   −9.950   −0.317   −0.250   −0.184   −0.071   −0.071   −0.099   −0.099   −0.083   −0.038   −0.019   −0.019    0.000
 233  −10.611   −0.324   −0.250   −0.183   −0.083   −0.083   −0.083   −0.030   −0.030   −0.038   −0.019   −0.019    0.000
 253  −10.241   −0.323   −0.250   −0.184   −0.084   −0.084   −0.083   −0.028   −0.028   −0.038   −0.019   −0.019    0.000
 273   −9.867   −0.319   −0.250   −0.188   −0.083   −0.083   −0.083   −0.046   −0.046   −0.038   −0.019   −0.019    0.001
 293  −10.442   −0.323   −0.250   −0.186   −0.084   −0.084   −0.083   −0.027   −0.027   −0.038   −0.019   −0.019    0.000

Table 2. System eigenvalues (real part) at times representative of the nine hierarchical clusters
Time        1        2        3        4        5        6        7        8        9       10       11       12       13
69     −9.541   −0.320   −0.250   −0.186   −0.083   −0.083   −0.083   −0.046   −0.046   −0.038   −0.019   −0.019    0.000
79    −10.283   −0.323   −0.250   −0.186   −0.084   −0.084   −0.083   −0.027   −0.027   −0.038   −0.019   −0.019    0.000
97    −11.088   −0.325   −0.250   −0.185   −0.083   −0.083   −0.083   −0.028   −0.028   −0.038   −0.019   −0.019    0.000
143   −10.506   −0.324   −0.250   −0.185   −0.084   −0.084   −0.083   −0.028   −0.028   −0.038   −0.019   −0.019    0.000
195   −10.443   −0.323   −0.250   −0.186   −0.084   −0.084   −0.083   −0.027   −0.027   −0.038   −0.019   −0.019    0.000
243    −9.739   −0.320   −0.250   −0.185   −0.081   −0.081   −0.083   −0.048   −0.048   −0.038   −0.019   −0.019    0.000
251   −10.902   −0.323   −0.250   −0.186   −0.084   −0.084   −0.083   −0.028   −0.028   −0.038   −0.019   −0.019    0.000
275    −9.261   −0.318   −0.250   −0.186   −0.082   −0.082   −0.083   −0.047   −0.047   −0.038   −0.019   −0.019   −0.001
293   −10.442   −0.323   −0.250   −0.186   −0.084   −0.084   −0.083   −0.027   −0.027   −0.038   −0.019   −0.019    0.000

Since the eigenvalues are the result of the system structure, the current state of the system (the values of the system stocks) and the random inputs driving the model, this stability was initially surprising. Upon further reflection, I realized that this stability to random inputs is something that can be expected of SD models. Our models normally behave as low-pass filters of exogenous noise; i.e. they attenuate high-frequency signals, and are mainly driven by the endogenous explanations (multiple and redundant feedback loops) and not the effects of random variations. This claim, however, cannot be generalized to all SD models. Indeed, Appendix B in Sterman (2000) outlines a strategy on how to test our models under random variation, as models can be taken out of the normal operating points by these random variations, and models with transient behaviors do exhibit important changes in the eigenvalues (see Kampmann and Oliva, 2006). Of course, I could not have made that prediction before performing this analysis as it is not certain what is the “threshold” for transient behavior (the model clearly shows a drift to lower service quality). Nevertheless, this result suggests that these methods could be effectively used to assess the model's response to noise functions with higher variance or autocorrelation. In this particular case, however, it is clear that the system structure dominates the random effects driving absenteeism and customer orders and that the drift in operating performance does not affect the eigenvalues during the simulation horizon. For the remainder of the paper I will limit the presentation of results to the operating point at time 73, i.e. 20 weeks into the simulation. 10
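The low-pass filtering property invoked above can be illustrated with a single first-order stock driven by white noise. The Python sketch below is a generic illustration, not an excerpt of the service model: the adjustment time (4 weeks), the time step (0.125 weeks) and the noise process are hypothetical choices.

```python
import numpy as np

def first_order_smooth(signal, tau, dt):
    """Euler integration of dS/dt = (input - S) / tau, a first-order stock."""
    out = np.empty(len(signal))
    s = signal[0]
    for i, u in enumerate(signal):
        out[i] = s
        s += dt * (u - s) / tau
    return out

rng = np.random.default_rng(1)
dt, tau = 0.125, 4.0                   # hypothetical time step and delay (weeks)
noise = rng.normal(0.0, 1.0, 4000)     # white noise resampled every time step
smoothed = first_order_smooth(noise, tau, dt)

# Ratio of output to input standard deviation: well below 1 when tau >> dt,
# i.e. the stock attenuates the high-frequency content of the input.
attenuation = smoothed.std() / noise.std()
print(f"std(output) / std(input) = {attenuation:.3f}")
```

The longer the adjustment time relative to the interval at which the noise is resampled, the stronger the attenuation, which is in line with the stability of the eigenvalues reported in Tables 1 and 2.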

Inspection of the eigenvalues reveals that all real parts are negative, suggesting that the system is heavily damped. The time constants implied by the real part of the eigenvalues (1/|Re[λ]|) reveal that the model has five distinct operating speeds (see Figure 6) spanning four orders of magnitude: from 0.1 to 1000 weeks. Eigenvalue 1 reflects the daily dynamics of the service center (T = 0.1 weeks). Eigenvalues 2–4 have time constants of 4–6 weeks and are related to the monthly processes of the service center. Eigenvalues 5 and 7 represent quarterly processes (13 weeks) and eigenvalues 9–12 annual processes (~52 weeks). Finally, eigenvalue 13 has a time constant of close to 20 years and corresponds to the gradual erosion of service quality (desired time per order) observed in Figure 3. This segmentation of model behaviors is by itself a powerful tool to understand model dynamics and to begin to identify the system structure responsible for that behavior.
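The conversion from eigenvalues to time constants is simple arithmetic; the Python sketch below applies it to the real parts reported for time 73 in Table 1. One assumption is needed: the table rounds eigenvalue 13 to 0.000, so the sketch substitutes −0.001, the value implied by the roughly 1000-week time constant stated in the text.

```python
import numpy as np

# Real parts of the 13 eigenvalues at time 73, as printed in Table 1.
re_lambda = np.array([-10.563, -0.321, -0.250, -0.187, -0.085, -0.085, -0.083,
                      -0.026, -0.026, -0.038, -0.019, -0.019, 0.000])

# Assumption: eigenvalue 13 is rounded to 0.000 in the table; -0.001 is used
# here because the text reports a time constant of about 1000 weeks.
re_lambda[-1] = -0.001

# Time constant of each behavior mode, in weeks.
time_constants = 1.0 / np.abs(re_lambda)

for k, tau in enumerate(time_constants, start=1):
    print(f"eigenvalue {k:2d}: Re = {re_lambda[k - 1]:8.3f}  T = {tau:8.1f} weeks")
```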

Figure 6. Time constants of system eigenvalues at time 73. Eigenvalues 6 and 9 have the same real value as eigenvalues 5 and 8 respectively

Inspection of the imaginary part of the eigenvalues reveals that there are two complex conjugate pairs, {5,6} and {8,9}, representing oscillatory behavior modes with periods of 56 and 179 weeks respectively (2π/|Im[λ]|).
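To gauge how strongly these oscillatory modes are damped, the Python sketch below combines the real parts at time 73 (Table 1) with the periods quoted above. The imaginary parts are not reported in the text; they are back-computed here from the periods, so the resulting figures should be read as approximate. The function name and the derived quantities (amplitude decay per cycle, damping ratio) are illustrative additions.

```python
import math

def describe_mode(re_part, period):
    """Summarize an oscillatory mode from its real part and period (weeks)."""
    im_part = 2.0 * math.pi / period            # back-computed, not from source
    decay_per_cycle = math.exp(re_part * period)  # amplitude ratio after 1 cycle
    damping_ratio = -re_part / math.hypot(re_part, im_part)
    return im_part, decay_per_cycle, damping_ratio

# (real part at time 73 from Table 1, period in weeks from the text)
modes = {"{5,6}": (-0.085, 56.0), "{8,9}": (-0.026, 179.0)}
results = {name: describe_mode(*values) for name, values in modes.items()}

for name, (im_part, decay, zeta) in results.items():
    print(f"pair {name}: |Im| = {im_part:.4f}, "
          f"amplitude after one cycle = {decay:.4f}, damping ratio = {zeta:.2f}")
```

Under these assumptions both pairs lose nearly all of their amplitude within a single cycle, which agrees with the description of the system as heavily damped.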

Loop eigenvalue elasticity analysis

In this section, I illustrate how the methods reveal the model's dominant structure. I do so by analyzing the structure most influential to eigenvalues 1 and 13: the daily response to work pressure and the long-term dynamics of erosion of service quality. Figure 1 shows a subset of the system structure as well as the loop numbers assigned by the shortest independent loop set (SILS) algorithm.

Analyzing the most influential loops on eigenvalue 1 at time 73 produces the scatter plot in Figure 7, where each dot represents a loop in the SILS. The x-axis reports the absolute value of the influence of the loop on the eigenvalue (see Eq. (5)). The larger the value, the more influential the loop is. The y-axis reports the real value of the influence metric of the loop on the desired eigenvalue. Loops with a negative real influence metric are stabilizing whereas loops with a positive real influence are destabilizing. Note that only two loops (18 and 19) are visible in Figure 7, as all the other loops have a negligible influence and are clustered at the (0, 0) coordinate. Both loops are stabilizing, and loop 18 is significantly more influential than loop 19. These two loops represent the two employee responses to work pressure: reduce time per order and increase work intensity respectively (see Figure 1). Both are balancing and are the employees’ response to the organizational mandate to clear the service backlog within 24 hours (Oliva and Sterman, 2001). Thus the eigenvalue represents the daily dynamics of clearing the backlog, and the structure responsible for it consists of the mechanisms used by the employees. Finally, note that the LEEA also reveals that reducing time per order (loop 18) has a much stronger response than expanding work intensity (loop 19). These findings are consistent with the documented explanations of model behavior. However, it should be noted that they were achieved from a single simulation, as opposed to the extensive sensitivity testing described in those sources (Oliva, 2001; Oliva and Sterman, 2001).

Figure 7. Loop influence on eigenvalue 1 at time 73

Figure 8 reports the six most influential loops on eigenvalue 13 (the reference mode with a time constant of almost 20 years that represents the slow drift to lower service quality) at time 73. Loops 13 and 5 are the most influential, and are respectively the reinforcing loop for erosion of time per order, and the balancing loop to adjust time per order to past performance (see Figure 1). Next in terms of influence are the effects of work intensity and short-term fatigue (see loops 20 and 21 in Figure 1). Note, however, that these two pairs of loops have exactly the same influence magnitude, but in opposite direction, suggesting that the loops cancel each other out with respect to this eigenvalue. In the context of the long-term dynamics captured by eigenvalue 13, the reinforcing effects of the loss of productivity because of short-term fatigue (loop 20) are fully compensated by the increased output gained by increasing the work intensity. Finally, loops 26 and 28 are both destabilizing for eigenvalue 13 (see Appendix for full listing of loops in the SILS). Loop 26 (reinforcing) captures the adjustment of desired labor as a function of the perceived labor productivity, i.e. the misattribution of erosion of service quality as a productivity gain, and loop 28 (balancing) captures the hiring response to changes in backlog. The interaction of these two loops results in the long-term erosion of service quality. Again, these explanations are consistent with the model behavior explanations described elsewhere (Oliva, 2001; Oliva and Sterman, 2001).

Figure 8. Loop influence on eigenvalue 13 at time 73

Dynamic decomposition weight analysis

While LEEA focuses on developing an endogenous (feedback-loop-based) explanation of the observed behavior, the DDWA is much more appropriate for policy development as it allows one to assess the impact of parameter values (where policies normally reside) on reference modes and their projections on specific stocks. In this section, I show the benefits of DDWA by addressing the issue of how to reduce the erosion of service quality, i.e. the policy goal articulated in section 5 of Oliva and Sterman (2001).

One possible strategy to develop a policy to reduce the erosion of service quality would be to focus on the parameters of the loops that, according to LEEA, are most influential. For instance, one could try to reduce the influence of loop 13—according to Figure 8, the most destabilizing loop for eigenvalue 13. The fact that the gains of the loops that were analyzed in LEEA can be independently set—the defining characteristic of an independent loop set (Kampmann, 2012)—immediately focuses our attention on the parameter that uniquely affects the gain of the relevant loop: time to adjust time per order. This approach, however, has two important limitations. First, the approach is driven by the attempt to modify an eigenvalue, i.e. a pure behavior mode, as opposed to the behavior mode of a particular variable. While at times it might be straightforward to identify the dominant eigenvalue in the behavior of a variable (as in our case here with the erosion of service quality), it is important to keep in mind that the behavior of a variable is the weighted sum of different behavior modes. Thus focusing on a single reference mode might limit the space for viable leverage points over a particular variable. The second limitation of LEEA for policy analysis is the fact that the search space is limited to the loops identified in the ILS. While an ILS does cover all the links in the feedback structure, the ILS is not unique; thus the relative influence of a particular loop is the result of the choices of loops in the ILS. Furthermore, LEEA limits the analysis of influence to links within the ILS, and the highest leverage might be outside that set. DDWA addresses these two limitations by providing an explicit mapping of the eigenvalues in each stock and by assessing the elasticity of weights and eigenvalues to each model parameter.

Figure 9 shows the relative projection of the eigenvalues on the desired time per order stock, the model's proxy for service quality. To assess the relative role of each eigenvalue, the projections are normalized by dividing each term in Eq. (3) by the absolute value of the constant term (w_i,0); hence the constant line at −1. As expected, eigenvalue 13, the ~20-year decay process, is the dominant eigenvalue on the desired time per order stock, but the oscillatory mode represented by the conjugate pair of eigenvalues {8,9}, labeled as “8” in Figure 9, also has a significant impact in the short term (10 weeks). This suggests that additional structure might be leveraged to affect the erosion of the desired time per order. Eigenvalues {8, 9} represent an oscillatory behavior mode with a period of 179 weeks. LEEA revealed that these oscillations were the result of the short-term fatigue (loops 20 and 21) and their interaction with the hiring policies driven by perceived labor productivity (loop 26) and service backlog (loop 25).
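The normalization described above can be sketched in a few lines of Python. The sketch assumes that Eq. (3) has the usual dynamic decomposition form x_i(t) = w_i,0 + Σ_j w_i,j exp(λ_j t); all weights below are hypothetical and chosen only so that the constant term is negative, as in Figure 9. The eigenvalues are loosely based on the slow mode and on the {8,9} pair discussed in the text, with an imaginary part back-computed from the 179-week period.

```python
import numpy as np

def normalized_projections(w, lam, w0, t):
    """Per-mode terms w_j * exp(lam_j * t) and the constant, divided by |w0|."""
    terms = w[:, None] * np.exp(lam[:, None] * t[None, :])
    constant = np.full(t.shape, w0 / abs(w0))
    return terms / abs(w0), constant

# Hypothetical weights (assumption); eigenvalues in 1/weeks.
lam = np.array([-0.001, -0.026 + 0.0351j, -0.026 - 0.0351j])
w = np.array([1.5, 0.2 - 0.1j, 0.2 + 0.1j])   # conjugate weights for the pair
w0 = -2.0                                     # constant term of the stock

t = np.linspace(0.0, 100.0, 101)              # weeks
modes, constant = normalized_projections(w, lam, w0, t)

# A conjugate pair is plotted as a single real curve: the sum of its two terms.
pair = modes[1] + modes[2]
print("constant line:", constant[0])
print("slow mode at t = 0:", modes[0, 0].real)
print("oscillatory pair at t = 0 and t = 10:", pair[0].real, pair[10].real)
```

Summing the two conjugate terms yields a real-valued curve, which is presumably why the pair is shown under a single label in the figure.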

Figure 9. Eigenvalue projection on desired time per order stock

Table 3 reports the parameter elasticity of the weight of eigenvalue 13 on desired time per order for the 10 parameters with the highest elasticity. Excluding initialization parameters that dictate the model's operating point (e.g. hours-per-week-per-employee, initial perceived labor productivity, initial desired time per order and initial rookies), the most influential parameters are related to the speed of the hiring process (time to adjust labor, hiring delay, time to perceive labor productivity, time to adjust desired labor and time for experience; ranks 3, 4, 5, 8 and 9 respectively) and the effect of work pressure on time per order (alpha; rank 7).

Table 3. Parameter elasticity of weight (w) of eigenvalue 13 on desired time per order
Rank  Parameter                                Elasticity of w of λ13 on desired time per order
1     Hours per week per employee*             −23.261
2     Initial perceived labor productivity*    −22.595
3     Time to adjust labor                      −1.259
4     Hiring delay                               0.667
5     Time to perceive labor productivity       −0.523
6     Initial desired time per order*            0.378
7     Alpha                                      0.356
8     Time to adjust desired labor               0.099
9     Time for experience                        0.062
10    Initial rookies*                           0.047
Note: Parameters marked with * (bold in the original) are simulation initial conditions.

Comparing these results to the policy testing developed in Oliva and Sterman (2001), we see that this parameter ranking is consistent with the first three proposed policies (see Table 4). The positive sign of the elasticity of the weight of alpha (effect of work pressure) indicates that reducing it will reduce the projection of this behavior mode (eigenvalue 13) on desired time per order (consistent with the results reported by Oliva and Sterman under policy 3). Furthermore, the magnitude of the elasticities confirms the relative effectiveness of the tested policies (cf. column 3 of Table 4). For instance, the non-significant impact of policy 2 (which reduced the time for experience and increased rookie effectiveness) on the erosion rate is explained by the very small elasticity that the weight of these parameters (0.062 and −0.004, respectively) had in the undesired behavior mode. 11

Table 4. Policy analysis: desired time per order erosion rate
Policy                                Parameters changed                                                Quality erosion (%/year)
Base case                                                                                               −1.28
(1) Faster capacity acquisition       Time to adjust labor, hiring delay                                −1.08
(2) Faster learning                   Time for experience, rookie effectiveness                         −1.33
(3) Reduced effect of work pressure   Alpha, beta                                                       −0.93
(4) Quality pressure (QP)             Gamma                                                             −1.05
(5) QP + management pressure          (4), weight for employee Q expectation, management quality goal   −0.77
(6) QP + upward management pressure   (5), with higher mgmt Q goal                                       0.86
(7) Combined policy                   (1) and (3) and (6)                                                1.39
Note: Adapted from Oliva and Sterman (2001, Table 6).

Analysis of policy 1 (faster capacity acquisition), however, reveals an interesting insight. While Oliva and Sterman treated time to adjust labor and hiring delays as two levers to be moved in the same direction (they reduced both to achieve faster capacity acquisition), from the elasticity of influence in Table 3 it is evident that these two parameters have opposite effects on the erosion of time per order. This conflict in sign explains in part the limited effectiveness of their policy to hire faster: although the net effect was positive, most of the benefit of reducing the hiring delay (from 29.9 to 15 weeks) was negated by the effect of reducing the time to adjust labor (from 11.5 to 6 weeks).

Finally, it is interesting to note that none of the policies tested by Oliva and Sterman affected the time to perceive labor productivity and the time to adjust desired labor, both identified as highly influential parameters by this process.

It is important to highlight that all the above insights (direction and relative ranking of the effectiveness of policies, conflicting parameter changes and omission of significant parameters) were gained after analyzing one simulation run (the base case) at one representative operating point, as opposed to a comprehensive examination of alternative possible policies. Furthermore, the methods are exhaustive in their approach and revealed parameters that were not explored by Oliva and Sterman.

The above comparison is based on what the DDWA method did relative to policy analysis. However, it should be noted that parameters involved in policies 4, 5 and 6 originally proposed by Oliva and Sterman (all having to do with increasing the quality pressure) were identified as having no influence on this eigenvalue–stock pair. The reason for this omission is the fact that in the base case the effect of quality pressure is null, as it has no effect on time per order, regardless of its value. Because of this truncated response, desired time per order is not influenced by all the parameters that might affect it through quality pressure. While this is also consistent with the analysis performed by Oliva and Sterman—they first activated the effect of quality pressure (policy 4) and then explored the impact of other parameters through quality pressure (policies 5 and 6)—it also points to the main limitation of the method for policy analysis. Namely, it can only identify leverage points within the active structure of the system. That is, the algorithms cannot “see” beyond the active elements of the system. Clearly, policy design often involves the creation of additional structure or feedback mechanisms to address the undesired behavior. The methods described here do not inform that process and policy design might always require the creativity and insight from a modeler. Nevertheless, the ability to assess the model behavior comprehensively and in one simulation makes these methods a powerful set of tools to assess the proposed structure and feedback mechanisms.

Conclusions

This paper sets out to test the EEA methods in a realistic, complex model with stochastic elements. There are multiple ways to answer the question “Does EEA work with realistic models?” Here I answer the question at three levels using Checkland and Scholes’ (1990) three E's criteria to assess performance.

First, the methods are efficacious. They work! They generate results. There were no computational problems introduced by the model size (although, with 13 stocks, this is still a relatively small model), or the constant “shocking” of the system by the random perturbations. The methods are robust to random variations and are capable of cutting through large model complexity and homing in on the model structure and parameters responsible for observed behavior.

Second, the methods are efficient. While it took some effort to prepare the model for the EEA utilities, 12 this time would have been shorter if the analysis had been anticipated while modeling and all formulations had been made continuously differentiable to begin with. It should be noted, however, that modeling with these constraints would have reduced the modeling flexibility, as it is easier to use discrete and piece-wise linear functions to test and develop the model than to generate their continuous counterparts. Although the time to retrospectively prepare the model might seem long, it is short when compared to the days it took to do sensitivity analysis and develop intuition about the model behavior or the thousands of simulations it would take to do an exhaustive analysis using trial-and-error methods. 13 The true efficiency of the methods, however, becomes clear when a single simulation, evaluated at nine representative operating points, yields a precise identification of the model's behavior modes (eigenvalues), generates a formal assessment of loop dominance for each eigenvalue, identifies the projection of each eigenvalue onto every model stock (w) and, through assessment of link and parameter elasticities, identifies the main levers for policy design. Any one of these tasks would have required hundreds of simulations without the formality of the analysis presented here. Furthermore, the methods yield certain results: they are exhaustive in considering all elements of model structure, and the process is systematic and predictable. The formality of the analysis removes much of the uncertainty that lingers in the analyst even after hundreds of simulations and sensitivity tests.

Finally, the methods are effective. Not only were results generated (efficacy), but these results make sense within the context of why the methods were deployed. That is, the methods deliver on the promise to explicitly link structure to behavior. The explanations and insights generated by this approach are consistent with previous behavior analyses, narratives and policy designs developed from this model. The LEEA results generated the same endogenous explanation, articulated in terms of feedback loops, for the observed erosion of service quality. The DDWA yielded a list of high-leverage parameters to affect the undesirable reference mode, thus strictly linking behavior to parameter changes. Indeed, the only “surprise” generated by the analysis presented here was an explanation for the nullifying effect of two previously thought-to-be-related interventions that had eluded analysts before.

Performing this analysis on a large and stochastic model that is well understood was the natural next step for validating these tools. The positive answer to the question “Does EEA work with realistic models?” should encourage the community to expand the application of these methods. The true test of the value of these methods, however, will be when they are used to explore a model for which the analyst does not have a complete a priori structural explanation, and the methods prove capable of yielding a credible explanation for the observed behavior and practical policy recommendations.

Appendix: Listing of loops in the shortest independent loops set (SILS) utilized in the LEEA analysis of the Oliva and Sterman (2001) model

 #   L  Nodes in loop
 1   2  FatigueA > ChangeFatigueA
 2   2  PerceivedLaborProductivity > ChangePLP
 3   2  DesiredLabor > ChangeDesiredLabor
 4   2  FatigueE > ChangeFatigueE
 5   2  DesiredTo > dtochg
 6   2  ExperiencedPersonnel > turnoverrate
 7   2  Rookies > experiencerate
 8   2  Vacancies > hiringrate
 9   2  EmpPerceptionOfQuality > ChangeinEPQ
10   2  CustPerceptionOfQuality > ChangeinCPQ
11   2  MgmtPerceptionOfQuality > ChangeinMPQ
12   2  EmpQualityExpectation > ChangeinEQE
13   3  DesiredTo > timeperorder > dtochg
14   3  DesiredTo > ttoadjustdto > dtochg
15   4  DesiredTo > timeperorder > ttoadjustdto > dtochg
16   4  Vacancies > vacanciescorrection > indicatedlabororderrate > labororderrate
17   6  DesiredTo > desiredservicecapacity > workpressure > effectofwponto > timeperorder > dtochg
18   7  ServiceBacklog > desiredservicecapacity > workpressure > effectofwponto > timeperorder > potentialorderfulfillment > orderfulfillment
19   6  ServiceBacklog > desiredservicecapacity > workpressure > workintensity > potentialorderfulfillment > orderfulfillment
20   6  FatigueE > effectoffatigueonprod > servicecapacity > workpressure > workintensity > ChangeFatigueE
21  10  ServiceBacklog > desiredservicecapacity > workpressure > workintensity > ChangeFatigueE > FatigueE > effectoffatigueonprod > servicecapacity > potentialorderfulfillment > orderfulfillment
22   9  ExperiencedPersonnel > turnoverrate > desiredhiring > indicatedlabororderrate > labororderrate > Vacancies > hiringrate > Rookies > experiencerate
23  11  ExperiencedPersonnel > turnoverrate > desiredhiring > desiredvacancies > vacanciescorrection > indicatedlabororderrate > labororderrate > Vacancies > hiringrate > Rookies > experiencerate
24   8  Rookies > totallabor > laborcorrection > desiredhiring > indicatedlabororderrate > labororderrate > Vacancies > hiringrate
25  10  ExperiencedPersonnel > totallabor > laborcorrection > desiredhiring > indicatedlabororderrate > labororderrate > Vacancies > hiringrate > Rookies > experiencerate
26  12  PerceivedLaborProductivity > ChangeDesiredLabor > DesiredLabor > laborcorrection > desiredhiring > indicatedlabororderrate > labororderrate > Vacancies > hiringrate > Rookies > totallabor > ChangePLP
27  13  PerceivedLaborProductivity > ChangeDesiredLabor > DesiredLabor > laborcorrection > desiredhiring > indicatedlabororderrate > labororderrate > Vacancies > hiringrate > Rookies > effectivelaborfraction > servicecapacity > ChangePLP
28  15  ServiceBacklog > desiredservicecapacity > ChangeDesiredLabor > DesiredLabor > laborcorrection > desiredhiring > indicatedlabororderrate > labororderrate > Vacancies > hiringrate > Rookies > effectivelaborfraction > servicecapacity > potentialorderfulfillment > orderfulfillment
29  15  PerceivedLaborProductivity > ChangeDesiredLabor > DesiredLabor > laborcorrection > desiredhiring > indicatedlabororderrate > labororderrate > Vacancies > hiringrate > Rookies > experiencerate > ExperiencedPersonnel > effectivelaborfraction > servicecapacity > ChangePLP
30  14  PerceivedLaborProductivity > ChangeDesiredLabor > DesiredLabor > laborcorrection > desiredhiring > indicatedlabororderrate > labororderrate > Vacancies > hiringrate > Rookies > totallabor > onofficeservicecapacity > servicecapacity > ChangePLP
31   9  FatigueA > effectoffatigueonturnover > turnoverrate > ExperiencedPersonnel > effectivelaborfraction > servicecapacity > workpressure > workintensity > ChangeFatigueA
32   6  EmpPerceptionOfQuality > qualitypressure > effectofqponto > timeperorder > deliveredquality > ChangeinEPQ
33   9  EmpPerceptionOfQuality > indicatedqualitystandard > ChangeinEQE > EmpQualityExpectation > qualitypressure > effectofqponto > timeperorder > deliveredquality > ChangeinEPQ

Biography

  • Rogelio Oliva is Professor of Information and Operations Management at Mays Business School at Texas A&M University. He holds a BS in Industrial and Systems Engineering from ITESM (Mexico), an MA in Systems in Management from Lancaster University (UK), and a PhD in Operations Management and System Dynamics from MIT. His research explores how behavioral and social aspects of an organization interact with its technical components to determine the firm's operational performance.

Footnotes

  1. A model ‘operating point’ refers to a particular system state, i.e., the status of the system as described by the value of its state variables (stocks).
  2. See Kampmann and Oliva (2006) for a discussion of when such an approximation is appropriate and useful.
  3. The characteristic polynomial is defined as det(λI − A), where I is the identity matrix. The determinant is a polynomial in λ, and its roots are the eigenvalues of A.
  4. Not visible in Figure 1; see Appendix for a detailed listing of all loop variables.
  5. Fully documented model and time series required to replicate the results presented in Oliva and Sterman (2001) are available at http://iops.tamu.edu/faculty/roliva/research/service/esq.html.
  6. The four-parameter logistic curve f(x) = a + (b − a)/(1 + e^(−c(x − m))), where a and b are the left and right asymptotes and (b − a)c/4 is the slope of the curve at the inflexion point m, is flexible enough for these formulations.
  7. Note that this decision to eliminate those functions was an informed and personal choice that worked for this model. Had there been uncertainty about the role of those functions, they could have been easily replaced with the analytical expression as described above.
  8. The expected number of vertices of the convex hull of n points drawn independently from a uniform distribution in a d-dimensional space grows only as O(log^(d−1) n) (Dwyer, 1988).
  9. These representative stocks are management perception of service quality (which has a 0.96 correlation with employee quality expectation), perception of labor productivity, and service backlog.
  10. Note that there is nothing special about the operating point at time 73. As the content of Tables 1 and 2 suggests, the results would be very similar when performing the analysis at other operating points.
  11. The reported erosion rate in Table 6 of Oliva and Sterman (2001) is the average of 500 simulations. The difference between the erosion rate of the base case and the seemingly higher erosion rate resulting from policy 2 is not statistically significant.
  12. It took me 10 hours to modify the model to be able to run the algorithms (2 hours to adjust variable names and formulations to address software limitations and 8 hours to replace equations with continuously differentiable analytical expressions), 1 hour to perform and interpret the hierarchical clustering and 1 hour to interpret the EEA output.
  13. For instance, an exhaustive assessment of all possible loop combinations (active/non-active) following Ford's (1999) approach would require 110! simulations.
