Monitoring events with application to syndromic surveillance using social media data

Availability of time series data in different domains has resulted in approaches for outbreak detection. A popular alternative to detect outbreaks is to use daily counts of events. However, time between events (TBE) has proven to be a useful alternative, especially in the case of sudden, unexpected events. Past work that uses TBE for monitoring events assumes that the in‐control number of events is up to 10 per day. In this article, we derive robust monitoring plans that are scalable when the in‐control counts are higher than 10 per day but less than 100 per counting period (eg, day). TBE values are generally nonhomogeneous across days and within days. This makes the volume of data to train the technology a challenge, and this challenge increases the volume of data needed to design the charts. This article discusses these challenges and suggests solutions for data that are known to be Weibull‐distributed. We present our results in two parts. The first is a simulated dataset that controls parameters of the plan such as the daily counts of events. We then show how the monitoring plans can be applied to the detection of syndromes (ie, disease outbreaks) using social media data.


INTRODUCTION
Availability of time series data in different domains has resulted in approaches for outbreak detection. Outbreaks of interest in these domains pertain to an unexpected increase in diseases (indicating an epidemic), warranty claims (indicating faulty products or fraud), traffic crashes (indicating vulnerable infrastructural conditions), species (indicating a possible biodiversity hazard), and so on. Given the criticality of these applications, the objective is to detect the outbreaks as early as possible. The ruling paradigm to do so is to monitor counting processes or event counts (eg, see Kauhala and Helle, 1 Jiang et al, 2 Miller, 3,4 Skinner et al, 5 Sparks et al, [6][7][8][9] Weiß, 10, 11 Weiß and Testik, 12 Yontay et al, 13 and Zhou et al 14 ).
Monitoring event counts over time needs to select a time period over which these events are counted. This also means that any increase in the events can only be assessed at the end of the time period. This temporal aggregation cannot provide real-time decision support. For example, in 2016, Melbourne experienced a thunderstorm asthma outbreak in which nine people died in a one-day event. 15 Daily counts of the demand for ambulance services failed miserably in warning the authorities of this outbreak.
Because of their reliance on pre-determined time periods, counting processes miss the opportunity for an outbreak decision that each event offers. Therefore, an alternative in the form of time between events (TBE) has emerged (see Kittlitz, 16 Benneyan, 17 Jones and Champ, 18 Alemi and Neuhauser, 19 Liu et al, 20,21 Zimmermann et al, 22 Sharma et al, 23 Khoo and Xie, 24 Xie et al, 25 Dogu, 26 Chen, 27 Demirhan and Hamurkaroglu, 28 Qu et al, 29,30 Zhao and Gilbert, 31 Shafae et al, 32 Yang et al, 33,34 Kumar and Chakraborti, 35 and Cheng et al 36 ). A recent paper by Sparks et al 37 demonstrated that event outbreaks were detected earlier using TBE as compared to counting processes. The main advantage for a TBE-based approach is that it constructs a chart each time we have new information on outbreaks (ie, each time we have a new event). While this can be onerous in situations when the daily counts are high (1000 per day), it is real-time and, hence, potentially useful in the case of critical outbreaks. In such outbreaks, it is helpful to not have to wait for the end of the day to make a decision on a significant change in the frequency of events. The robust monitoring plan in Sparks et al 37 consisted of three exponentially weighted moving averages (EWMA) each with different levels of temporal memory but roughly the same in-control false discovery rates. A limitation of this plan is that it can be applied only to events that occur at a frequency of less than 10 per day. Therefore, in this article, we present a robust plan that uses TBE (and, hence, has the advantage of real-time decision support) but can be used for more frequent events: events with more than 10 counts per day.
This article explores statistical process control methods that monitor the TBE when the in-control mean rate varies naturally with time. As the number of events per unit time interval increases, the time between consecutive events shortens. We are interested in finding when this TBE shortens significantly from what is expected for processes with moderately high counts (eg, up to 20 counts per day). We examine an adaptive exponentially weighted moving average (AEWMA) that detects a significant decrease in TBE values to signal an alert for an event/outbreak. The aim of this article is to provide robust alternatives to existing technology that behaves a little like an EWMA chart when changes are small to moderate, while still remain reasonably robust to large changes. It is anticipated that this chart will typically be efficient for the early detection of most persistent changes in the frequency of events.
The rest of the article is structured as follows. Section 2 introduces the EWMA approach for the TBE. Section 2.1 presents one robust alternative that works reasonably well for low to moderate numbers of counts per time intervals. Section 2.2 introduces the adaptive EWMA plan that is useful when outbreak trend can be predicted. This plan adjusts the amount of smoothing to be efficient for monitoring the forecast change in the scale parameter. Section 3 compares these approaches in a simulation study. Section 4 considers an example of an application: detection of symptoms reported in tweets posted from the state of Queensland in Australia. Section 5 summarizes the findings in the article.

MONITORING PROCESSES
We first describe notations and conventions used in this paper. We consider a series of n events e 1 , e 2 … e n . Each event e i is represented as a tuple ( d i , t i ) where d i indicates the day number of the event, while t i indicates the time of the day as a proportion of number of hours elapsed. d i is a series of integers with 0 indicating the day of event e 1 , while t i ranges from 0 to 1 with the mid-day represented as 0.5. Counts: Wherever applicable, we use a day as the pre-determined time window for counting processes. Daily counts are given by counting the number d i values that are the same and then adding the days with zero counts for those days that are missing from the d i values.
Outbreaks are flagged when the daily counts are greater than a certain value. Time between events: A series of n events corresponds to n−1 TBE values w 1 , w 2 … w n-1 where w i indicates the TBE e i and e i-1 . TBE values are computed as follows: In general, w n − 1 = d n − d n − 1 + t n − t n − 1. Outbreaks are flagged when these TBE values are consistently low. The notion of being consistently low is formally captured as an EWMA statistic. The EWMA statistic for TBE is given as: where ew 0 (0.05) = w , the in-control mean TBEs. Here i represents the ith event in the sequences of events. Later we will use d i to be the day number of the ith event and t i is the time within a day when the event occurred. The value of = d i + t i . We assume that the distribution between events is known to be Weibull-distributed. The Weibull distribution is considered because of the nature of the application.
An outbreak is signaled when the EWMA statistic for TBE is significantly small. Formally, an outbreak is signaled when where h( , ) is a constant. Its value is determined as a function of: (a) , the weighting parameter between TBE and EWMA (as shown in the first term of Equation (1), and (b) , the vector of parameters for the Weibull distribution for the TBE. h( , ) delivers an acceptably high in-control ANE value so that when the EWMA value is lower than expected, an outbreak can be signaled.

The three-EWMA plan
The robust approach that works reasonably well in all cases is the three-EWMA plan in Sparks et al 37 with differing levels of temporal memory. The differing levels of temporal memory mean that we can flag increases in the event frequency efficiently for: • Small changes by using a small exponential weight in the EWMA. Keeping a longer temporal memory allows the plan to detect this small change early.
• Large changes by using a larger exponential weight for the EWMA statistic. This reduces the amount of temporal memory for in-control TBE values but restricts the memory of out-of-control TBE values allowing the plan to detect large changes early.
The three-EWMA plan extends Equations (1) and (2) as follows: The value of j is the number of EWMAs used in the joint plan that simultaneously applied three-EWMA statistics with differing levels of temporal memory. This was used as the robust option applied in Reference 37.
The condition to trigger an outbreak is either: In Equations (3) and (4), 1 < 2 < 3 such that the thresholds h( j ) for j = 1, 2, 3 are designed to signal roughly equal number of out-of-control false discoveries. An outbreak is signaled when Equation (4) is true for one of the js.
When the process is nonhomogeneous and Weibull distributed, the shape and scale parameters (denoted ) are time varying and take on the values appropriate at the exact time of the ith event. These values are estimated using a Weibull regression model discussed later in the application. We use the adaptive version of these three-EWMA statistics: where h( 1 , ) is the appropriate threshold designed to deliver an appropriate fixed in-control ANE value, and appropriate starting value w 0 = w ( 0 ) h( j , 0 ) > 1. This flags an outbreak when This adaptive version is difficult to train because it insists on the out-of-control false discover rate of aew i ( 1 ) < 1, aew i ( 2 ) < 1, aew i ( 3 ) < 1 being the same for all non-homogeneous states. 38 This is an almost impossible task. Therefore, an alternative approach is required, and this approach is the adaptive EWMA plan outlined in the next section.

2.2
The adaptive EWMA plan The adaptive EWMA for these charts with a constant exponential weight is given by where ew 0 ( ) = w (0) and w ( ) is the in-control mean of the TBEs at time , are the scale and shape parameters of the Weibull distribution, and h( , ) is the threshold designed to deliver a particular in-control average number of events (ANE). We only consider flagging significant increases in the event rate by signaling an out-of-control situation whenever i ( ) < 1.
The main advantage of this form of the adaptive EWMA is the ability to adjust the value of the exponential weights over time to best detect the forecast changes from what is expected. In other words, where t alters each time to be efficient for the forecast change in TBE. The threshold for this remains one for this plan. It is hoped that, even though the parameters of the Weibull distribution changes with time, the optimal weights are estimated using explanatory variables w (t) = E(w i ) and the scale and shape parameters in (see Appendix A). In addition, the h( , ) can be found in Appendix A for in-control ANE values equal to 400, 1000, and 2000. The parameters of the Weibull distribution are estimated from a Weibull regression model which will be introduced later in the article.
The forecast change in mean rate is forecasted by using the EWMA statistiĉ for a suitable k ≤ 1 (option 2).

SIMULATION STUDY
We now present a simulation process to compare TBE with count of events in order to detect outbreaks. Since our focus is on low counts scenarios, we consider the mean counts to be between 0.1 and 10 in this simulation study.

Performance evaluation
The monitoring plan performance evaluation is taken as the ANE. A false alarm occurs once the plan flags an out-of-control situation when it is in fact in-control. In our simulations, we have taken the in-control ANE to be 400 (1000 or 2000) events on average. When the rate of frequency of events increases significantly, we want ANE to be as low as possible. In other words, plans whose ANE is the smallest when the rate of frequency of events has increased to a specific value are the best plans to detect this change, conditional of all plans having the same in-control ANE.
In the homogeneous case, the plans sequentially test the hypothesis that the mean TBEs w is In the nonhomogeneous case, the plans sequentially test the hypothesis that the mean TBEs when in-control is given by 0 ( ) at time are

Results
The simulation process generated TBE data and then assigned them with date and time stamps. Daily counts were established by counting those with the same date (eg, generating daily counts), while the TBE values are computed by first sorting events in the order of their occurrence and then using the formula for the time between consecutive events. Different variations (in terms of its shape and scale parameters) of the Weibull distribution are now considered. We experiment with the shape parameter as 1.5 (as in Reference 37), and processes. Other shape parameter values lead to similar results. We conduct our experiments for homogenous Weibull-distributed processes in two parts. As the first part, we experiment with processes where mean counts are less than 10. This process validates that the adaptive EWMA is competitive for the small counts considered in Sparks et al. 37 Following that, we explore Weibull data with daily counts ranging from 10 to 40 to assess processes with at least 40 counts per day.

3.2.1
Processes with counts less than 10 per day Table 1 explores the situation when the in-control ANE scale parameter is 0.2. This results in a process with roughly 5.546 counts per day (ie, less than 10 per day). The adaptive EWMA plan with the forecast option 1 and with = 0.08 is the best for small decreases in the TBE value, specifically when the scale > 0.14. The three-EWMA plan is a good robust option that has reasonable performance across a broad range of changes. Notice also from Table 1 that the robust methods on average have better the out-of-control detection properties than the univariate EWMA counts for the changes (see Table 1). This difference is substantial for larger outbreaks.
Allowing the exponential weights to be estimated as between 0.02 and 0.25 rather than 0.03 and 0.2 leads to the results in Table 2. Note that Adaptive EWMA Forecast option 1 with = 0.08 and exponential weights restricted to 0.02 ≤ ≤ 0.25 nearly always performs better than the EWMA counts for all shifts in Table 2. It is substantially better for large changes in TBE values from the in-control mean. The Adaptive EWMA Forecast option 1 with = 0.1 and exponential weights restricted to 0.02 ≤ ≤ 0.25 are better than other plans in Table 2 for larger changes (scale ≤ 0.12). Table 3 explores the situation when the shape parameter is fixed at 1.5 and the in-control ARL scale parameter is 0.4. This results in roughly 2.792 counts per day. There is little difference between the adaptive EWMA plans in Table 4. The three-EWMA plan is a good robust option that has reasonable performance across a broad range of changes, but the adaptive EWMA plans perform better than this plan for small changes in TBE from the in-control mean. Note also from Table 3 that the robust methods on average perform better at early detection of the out-of-control situations than the univariate EWMA counts. The adaptive EWMA plan does well when there are enough observations to reliably estimate the change in mean TBE values. Therefore, when changes are large, the EWMA forecasting method does not adjust to the new mean value quickly enough to optimize the plan to signal early.
Based on these simulation results, we recommend: (a) the adaptive EWMA plan works well when frequency distribution of changes in the mean TBE are mostly small, and (b) the robust three-EWMA plan when the frequency distribution of larger changes in TBE are more likely or the criticality of an outbreak (as in disaster management) is a higher priority. Table 4 explores the situation when the shape parameter is fixed at 1.5 and the in-control ARL scale parameter is 0.12. This results in roughly 9.2 counts per day. There is little difference between the three adaptive EWMA plans applied in Table 4. The three-EWMA plan is a good robust option that has a reasonable performance across a broad range of changes and should be applied if it is difficult to forecast the change in mean TBE values. The EWMA counts applied in Table 4 perform badly.

Processes with counts between 10 and 40 per day
Now, we explore cases where counts are moderately high, that is, when the daily counts are between 10 and 40. For such changes, we attempt to derive a plan that is robust at detecting an outbreak when the average number of out-of-control false signals is one in 2000. This translates to a false discovery rate of: • One in 100 days when the in-control process has roughly 20 events per day (Table 5); • One in 66.7 days when the in-control process has roughly 30 events per day (Table 6); and • One in 50 days when the in-control process has roughly 40 events per day (Table 7).
Therefore, our robust plan is the simultaneous application of the adaptive EWMA plan outlined in the paper and the EWMA counts with the smoothing constant equal to 0.1. More specifically, this robust option uses the adaptive EWMA with = 0.1 in Tables 5 and 6 and the adaptive EWMA with = 0.12 in Table 7. This article suggested a similar robust plan that applied the EWMA TBE and EWMA counts statistics simultaneously, flagging an outbreak when any one of these signals an outbreak. This is denoted as the ET&C plan since it combines expected TBE and counts. The other robust plan applied is the three-EWMA plan with different levels of temporal memory used in Sparks et al. 37 The bold values in Tables 5-7 indicate the plans with the earliest outbreak detection properties for each outbreak considered.
The robust plans that apply the adaptive EWMA with = 0.1 and EWMA counts with smoothing constant 0.1 (ie, the ET&C plan as defined earlier) is able to signal outbreaks earlier than the three-EWMA approach applied in Sparks et al 37 for all shifts, as given in Table 5. This ET&C plan is the best option for shifts from 0.46 to values less than 0.2, that is, these have the earliest detection properties in Table 5. When shifts from 0.46 to values less than 0.22 occur, the adaptive EWMA plan with = 0.12 flags outbreaks the earliest in the table. Table 6 shows that the EWMA of counts is terrible at detecting large shifts in scale early. Note that the adaptive EWMA TBE plans have earlier out-of-control detection advantages than the EWMA TBE plan with the smoothing constant 0.1. When comparing the robust options, the ET&C plan that simultaneous applies the adaptive EWMA with smoothing constant = 0.1 and EWMA counts with smoothing constant 0.1 flags outbreaks earlier than the three-EWMA approach. The ET&C plan flags outbreaks earliest in the table when the scale shifted from 0.036 to less than 0.22. However, the adaptive EWMA plan with = 0.12 flags earliest for the remainder of the outbreaks in this table. Note that the ET&C plan behaves moderately like a Shewhart chart when shifts are large and like an EWMA chart when shifts are small and, therefore, is a moderately robust plan when the size of future shifts are unknown.
In Table 7, the EWMA of counts does not do well at detecting large shifts in scale early, however it is good at detecting small changes in scale. This time, the ET&C plan simultaneously applies the adaptive EWMA with = 0.12 and  EWMA counts with smoothing constant 0.1. On average, this robust plan detects changes earlier than other charts in Table 7 when the scale parameter changes from 0.025 (in-control) to 0.014. Thereafter, the adaptive EWMA plan with = 0.12 has earlier out-of-control detection advantages than the other plans in Table 7. Note that, in general, the adaptive EWMA TBE plans have earlier out-of-control detection advantages than the EWMA TBE plan with smoothing constant 0.1 in Table 7. This suggests that, for in-control processes with Weibull distributed TBE values for events with daily counts of 20 to 40, the ET&C plan has robust early detection advantages when the future shifts in the scale parameter are unknown. The ET&C plan simultaneously applies the adaptive EWMA with = 0.1 or 0.12 and EWMA counts with smoothing constant 0.1

APPLICATION EXAMPLE
This research is a part of a project for an early detection of symptom outbreaks expressed in social media posts, using Twitter in particular. Therefore, we now demonstrate how our plans can be used for disease surveillance using a retrospective study. We use a dataset of posts on Twitter (referred to as "tweets") posted from within the state of Queensland in Australia from 1st January 2015 to 30th November, 2017, and retrospectively determine how effective our plans were to detect outbreaks of certain symptoms. The symptoms considered are fever and head cold. Tweets containing keywords corresponding to these symptoms are downloaded along with their timestamps. These keywords are outlined in Sparks et al, 38 and the classifier used to decide whether the tweets mentioned themselves or someone else as unwell is covered in Robinson et al. 39 In this section, we apply the adaptive EWMA plan for early detection of these symptoms. This is because the threshold can adjust for in-control changes in parameters of the Weibull distribution. We analyze different symptoms separately using the timestamps of the tweets as the time of the event and apply our plan to detect early outbreaks. The series of TBE values for a symptom is the difference between timestamps of consecutive tweets corresponding to the given symptom. We use the gamlss model [40][41][42] to compute the expected values of the Weibull regression model based on the TBE values. The gamlss package in R is useful for predicting the changes in distributional parameters over time from data for a range of distributions. The appropriate Weibull regression models can be fitted using weihreg function in the eha package, vgam package, survreg package when there left or right censoring, and gamlss package in the packages for R. The glm package in R does not support the Weibull distribution and it models a single parameter distribution that makes it inappropriate for two parameter distributions. Our preference is to use the gamlss package because it provides for our needs by modeling both the shape and scale parameters as a time series, and we have no left or right censoring in our data. The vgam package is more general and so may be more useful in other circumstances, but the gamlss package suits our modeling needs. The gamlss package fits the scale parameter first by specifying it as trend, and then the shape parameters is fitted in the sigma. formula (sigma.fo) part of the model. One-step ahead forecasts are established using the predict.gamlss function. It was verified that the appropriate distribution for the TBE values was the Weibull distribution. The explanatory variables for predicting the TBE values are seasonal harmonics, time of the day harmonics, and the day of the week influences. The standard residuals of these models did indicate significant but low autocorrelation (less than 0.35). This model was used to estimate the model parameters for the nonoutbreak scenarios and hence estimate the expected values for the in-control TBE values. For example, using the cough symptom as an example, we fitted the model with explanatory variables dtime = Time of the day (0-24) the symptom was first mentioned; time = Day number; wd = weekdays and TBE = time between people mentioning they experienced cough in time order sequence as the response variable. The model was fitted using the formula: fm < −gamlss (w i ∼cos(2*pi*d i /365.25) + sin(2*pi*d i /365.25) + (wd=="Sunday")+d i +t i +cos(t i *2*pi/24) + sin(t i *pi/24) + cos(t i *2*pi/12) + sin(t i *2*pi/12), sigma.fo = ∼cos(2*pi* d i /365.25) + sin(2*pi*d i /365.25) + (wd=="Sunday") + (wd=="Thursday")+d i +t i +cos(t i *2*pi/24) + sin(t i *2*pi/24) + cos(t i *2*pi/12) + sin(t i *2*pi/12),data = TBE,family = WEI()) The first part of the model specification is for predicting scale parameter using explanatory variables of harmonics that adjusts for the seasonal influences and the more complicated harmonics for time within the day. Sunday has slightly shorter TBE values because people are generally not working and have more time on their hands to tweet about their symptoms. The second part of the model (after "sigma.fo=") defines the shape trend in the model. This shape changes more on Sundays and Thursdays than on other days of the week. Lagged TBE values are sometimes included when F I G U R E 1 The adaptive three-EWMA plan for monitoring fever outbreaks autocorrelation appears. We validated the adaptive plans achieving an in-control ANE close to 400 by simulating data using the estimated scale and shape parameters fitted from the Weibull fitted regression model. This confirmed that the in-control ANE is close to 400. Therefore, we report our plans for these values. When presenting the results, we plot over the timing of events the time i against the value min ) with a threshold of 1. Therefore, points lower than 1 indicate an outbreak. Specifically, in all figures below, the points below the red line signal an outbreak.

4.1
Plans for outbreaks of fever Figures 1 and 2 show the results for the three-EWMA plan and the adaptive EWMA plan respectively for outbreaks of "fever" in the tweets. Figure 1 shows three outbreaks: one in early 2015, another in late 2015, and the last one in mid-2017. The outbreak indicated in 2017 fails to persist for at least two signaled events in the year. Figure 2 shows two outbreaks in 2015, followed by no outbreaks until May 2017 (the end of autumn in the southern hemisphere) where there was an outbreak involving several days. Thus, in comparison with three-EWMA, the adaptive EWMA plan is more likely to flag persistent events with six of signals flagged fever events that persisted, and the last event flagged persisted for the whole of the Southern Hemisphere flu season. The advantage of the adaptive EWMA plan is its ability to adapt to forecast changes and this helps it to respond quickly to changes in the location of the TBE values. For example, the last persistent outbreak flagged in Figure 2 started at a historical high point but adapted very fast to flag the event Figure 3 monitors for outbreaks in head colds using the three-EWMA plan. In addition, we use an in-control behavior of an average over 10 events per day. This indicates that it is likely to have a false discovery rate of more than one every 40 days. This false discovery rate may be alright for short-duration events which lasts for less than 2 weeks, but is likely to involve too many false discoveries for most public health monitoring applications. Therefore, in Figure 3, we will only comment on sustained outbreaks rather than the many small outbreaks that are likely to be false alarms. Note that the sustained head cold signals in 2017 seem to start during the flu season in the Southern Hemisphere (ie, in August) and carry through into the next summer in late 2017. The year 2017 seemed to be a particularly bad year for head colds with an increased signal into November 2017 with even smaller TBE values than expected. The sustained head cold signals in 2015 start earlier than the Southern Hemisphere flu season and carry through into the flu season (end of September). Figure 4 presents the outbreaks for head colds using the adaptive EWMA plan. This plan is more likely to flag an outbreak that persists than the three-EWMA plan. This is particularly clear for the outbreak in 2015, which seemed to

SUMMARY
Counting processes need to wait for the end of a counting period before it makes a decision on outbreaks. In contrast, monitoring TBE is real-time because each event offers an outbreak decision point. In this article, we show how events can be monitored using TBE, with an application to the detection of infectious diseases using social media data. The key challenge here is to develop a plan that is efficient for both: (a) large changes, say, caused by a group of people with an infectious disease flying into a country, or (b) a slow emerging disease that start within the country. This article delivers a plan that is reasonable at covering both these challenges in the homogeneous in-control Weibull distribution case. We are working on creating a robust approach for the non-homogeneous in-control situation. We compare an adaptive EWMA plan with a three-EWMA robust plan of Sparks et al 37 for events with low counts, to demonstrate the feasibility. We F I G U R E 4 The adaptive EWMA plan for monitoring head cold outbreaks demonstrate that adaptive EWMA TBE values should be preferred to EWMA counts for large changes in mean rate but that the EWMA counts is preferred for smaller changes in mean rate. We derive a robust plan when the in-control mean rate is homogeneous that simultaneously applies the EWMA counts and the adaptive EWMA TBE, but more research is required to extend this to the non-homogeneous case. Following this, the application focus of this paper is on the design of a syndromic surveillance plan using social media data for the state of Queensland.
Research is required into using TBE values for detecting lower than expected number of events per unit time interval where the absence of events is of interest. If the daily counts are on average 500 per day, then selecting an in-control ANE of 50 000 may involve too heavy a simulation burden. New robust procedures are need for applying the TBE approach is these situations.
In terms of the specific application presented in the article, we observe that the EWMA counts plan may not be efficient at detecting massively large sudden outbreaks like thunderstorm asthma outbreak in Melbourne in November 2016. Monitoring TBE in social media applications is a rich new research area. This article has only skimmed the surface of the possible research topics more work is needed to extend its application to the nonhomogeneous in-control Weibull distributed situations. Call: lm(formula = beta1 ∼ (al + shape + log(al) + log(shape) + alsq + alcub + shapsq + shapcub + al * (shape + log(shape) + shapsq + shapcub) + log(al) * (shape + log(shape) + shapsq + shapcub) + alsq * shapsq + shape * alsq + shapcub * alcub), data = betas) Note that mua= max(ŵ i, , w ( )) .