Semiparametric detection of changepoints in location, scale, and copula

This paper proposes a new method to detect changepoints in the location and scale of univariate data sequences. The proposed method assumes that the data belong to the location-scale family of distributions and estimates the associated densities nonparametrically. Specifically, the approach does not require knowledge of the functional form of the distribution of the data sequence. As such, the approach can detect changepoints in many distributions. We also propose a new method to detect changes in the location of multivariate sequences, using the marginals and a copula to capture the dependence between variables without the influence of the marginal distributions. The performance of the proposed semiparametric approach is contrasted against competing nonparametric and Gaussian methods via simulation studies, as well as in applications arising from health and finance.


INTRODUCTION
Historically, many changepoint methods have been founded on parametric assumptions. These assumptions can be explicit, for example, detecting a change in mean using the likelihood-ratio test statistic under an assumption of Gaussian data [9,18,37]; or implicit, such as methods that detect a change in mean using the CUSUM statistic [29], or based on measuring fit to the data using square error loss [23], both of which are equivalent to using the likelihood-ratio test under the Gaussian assumption [11]. While these methods perform well if the parametric assumptions are appropriate for the data under focus, they can otherwise be inefficient or ineffective.
In parallel with the above developments, a complementary body of work has emerged focused on detecting a change in distribution using nonparametric methods. For example, Pettitt [31] introduced nonparametric techniques using Mann-Whitney statistics to test for a change in distribution. Alternatively, Zou et al. [44] (see also [17]) introduced an approach using a nonparametric maximum likelihood based on the empirical cumulative distribution function to detect multiple changepoints. While these methods are suitable for detecting changes in the distribution of non-Gaussian time series, they are not suitable when the objective is to detect changes in a particular property of the time series.
There has also been substantial work extending changepoint methods to the multivariate setting. Here, the aim of the analysis is to borrow strength across the individual variables to detect smaller changes that would otherwise go undetected if each variable were considered separately. Ignoring the dependence between series can hinder the detection of subtle changes and cause problems such as inefficient estimation. In this setting, nonparametric approaches include detecting changepoints in distribution using hierarchical clustering [25] or a kernel-based approach [4]. Most current multivariate methods, including parametric approaches, ignore the dependencies between the data sequences (see Wang and Samworth [41] and Tveten et al. [40] for exceptions).
In many applications, we may be interested in detecting changes only in certain key aspects of the data, such as its location (mean) or scale (variance), but we may not wish to make strong parametric assumptions about the distribution of the data. For these situations, we propose a semiparametric method to detect changes in the location and scale of the distribution without assuming a particular underlying distribution for the data. We only assume that the data belong to a location-scale family of distributions and estimate the associated densities of the distribution nonparametrically. No previous study has investigated the problem of detecting a change in location and scale parameters in this semiparametric setting.
While we first develop this idea for univariate data, we also show how it can be extended to multivariate data by combining the univariate location-scale methodology with copulas. Copulas are multivariate distributions that describe the dependence structure of variables independently of their marginal distributions [28]. We represent the multivariate time series using its marginal distributions and copula functions. Existing copula-based methods in changepoint analysis can detect a change in the dependence structure of a multivariate time series, but they assume common marginal distributions across time [6,10,42]: most assume no change in the marginal distributions, or fix the estimated marginal distributions under both the null and the alternative, and test only for a change in dependence. In this paper, we propose an approach that both models and detects a changepoint in the marginal distributions and the copula structure. The individual component series can belong to any distribution in the location-scale family, with no restriction that all the components have the same distribution; thus, the method allows us to analyze multiple series of different types.
The outline of the paper is as follows. Section 2.1 introduces the location-scale changepoint methodology for detecting a single changepoint in a univariate series. Section 2.2 describes the significance testing of changepoints. In Section 2.3, we extend the methodology of detecting a changepoint to multivariate time series using copulas, defining the frameworks for detecting changes in the marginal and dependence structures. We discuss some implementation details of the proposed changepoint models in Section 2.4. In Section 2.5, we combine our methodology with the binary segmentation algorithm to estimate multiple changepoints. We conduct comprehensive simulation studies for detecting single and multiple changepoint scenarios for both univariate and multivariate time series in Section 3. We also compare our methods with Gaussian methods and other competing nonparametric methods. In Section 4, we apply the proposed methods to health and financial datasets.

Location-scale changepoint model for univariate time series
We first consider the problem of detecting changepoints in an independent univariate sequence of observations. Our objective is to detect a change in location, scale, or both location and scale of the time series. To this end, we assume that the time series belongs to the family of location-scale distributions, parameterized by a location parameter, a, and a positive scale parameter, b. A random variable X with cumulative distribution function (CDF) F_X is said to belong to the location-scale family when its CDF is F_X(x) = F((x − a)/b), where a ∈ ℝ, b > 0, and F(⋅) is some distribution function. Let Y = (X − a)/b; then F(⋅) is the distribution function of Y, and henceforth we denote this CDF by F_Y.
If X is absolutely continuous, its density can be written as f_X(x) = f_Y((x − a)/b)/b for some density f_Y(⋅), which we call the base density. We do not assume a parametric form for the base CDF F_Y(⋅) or the base density f_Y(⋅), and we aim to estimate them from the observed data. Let x_1, …, x_n be an independent data sequence where each x_i follows a location-scale family with location a_i and scale b_i, for i = 1, …, n. We are interested in testing whether there is a single changepoint in the location or scale of the time series at τ ∈ {1, 2, …, n − 1}. We define the null hypothesis of no change in location and scale as

H_0^(1): a_1 = a_2 = ⋯ = a_n and b_1 = b_2 = ⋯ = b_n,

against the alternative hypothesis where both location and scale change:

H_1^(1): a_1 = ⋯ = a_τ ≠ a_{τ+1} = ⋯ = a_n and b_1 = ⋯ = b_τ ≠ b_{τ+1} = ⋯ = b_n.

One natural approach to detecting a changepoint is to perform a hypothesis test with the likelihood-ratio test statistic. Under the alternative hypothesis, the log-likelihood function for the model with a changepoint in location and scale is given by

ℓ(τ, a^(0), b^(0), a^(1), b^(1)) = Σ_{t=1}^{τ} log[ f_Y( (x_t − a^(0)) / b^(0) ) / b^(0) ] + Σ_{t=τ+1}^{n} log[ f_Y( (x_t − a^(1)) / b^(1) ) / b^(1) ].    (1)

Here τ is the unknown location of the single changepoint, a^(0) and b^(0) are the location and scale parameters before the change, and a^(1) and b^(1) are the location and scale parameters after the change. Note that the model becomes unidentifiable if we estimate f_Y(⋅) together with the location and scale parameters, as one can compensate for a change in the location and scale parameters with a corresponding change in f_Y(⋅). There are numerous ways of addressing this, for example by imposing constraints on the base density f_Y(⋅). We deal with it by imposing the constraints a^(0) = 0 and b^(0) = 1, so that a^(1) represents the change in location and b^(1) the proportional change in scale.
For simplicity, we now write the change in location and scale as a and b, respectively. The log-likelihood function can then be expressed as

ℓ(τ, a, b) = Σ_{t=1}^{τ} log f_Y(x_t) + Σ_{t=τ+1}^{n} log[ f_Y( (x_t − a) / b ) / b ].    (2)

We write Y_t as a function of the unknown parameters τ, a, and b:

Y_t = x_t for t = 1, …, τ, and Y_t = (x_t − a) / b for t = τ + 1, …, n.    (3)

To estimate the base density f_Y(⋅), we use a kernel density estimate of Y_t, t = 1, …, n. Note that this kernel density estimate is a function of the unknown parameters a and b, and it is estimated simultaneously with the maximization over a and b. We maximize the log-likelihood function (2) over a and b for each τ = 1, …, n − 1. The maximum likelihood estimates of a and b are denoted by â and b̂, respectively, and the estimated density is denoted by f̂_Y. Hence, the estimated density for the data in the first segment is f̂_Y(x_t), t = 1, …, τ, and for the data in the second segment is f̂_Y( (x_t − â) / b̂ ) / b̂, t = τ + 1, …, n. In practice, the likelihood maximization in Equation (2) is carried out using numerical optimization. This gives us an estimate of the log-likelihood under the alternative.
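To make the construction concrete, the sketch below evaluates the log-likelihood of Equation (2) at a candidate changepoint. It is a minimal illustration, not the paper's implementation: the normal-reference bandwidth is a simplified stand-in for the Silverman rule of Section 2.4, and `profile_loglik` is a hypothetical helper name.

```python
import numpy as np

def profile_loglik(x, tau, a, b):
    """Log-likelihood of Equation (2) at candidate changepoint tau.

    Pre-change observations enter as-is (a = 0, b = 1 by the
    identifiability constraint); post-change observations are
    standardised to (x - a) / b.  The base density f_Y is a Gaussian
    kernel density estimate of the standardised sequence."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # standardised sequence Y_t of Equation (3)
    y = np.concatenate([x[:tau], (x[tau:] - a) / b])
    # normal-reference rule-of-thumb bandwidth (simplified assumption)
    h = 1.06 * np.std(y, ddof=1) * n ** (-1 / 5)
    # Gaussian KDE of y, evaluated at each y_t
    diffs = (y[:, None] - y[None, :]) / h
    dens = np.exp(-0.5 * diffs ** 2).mean(axis=1) / (h * np.sqrt(2 * np.pi))
    # post-change terms carry the Jacobian factor 1/b
    return np.sum(np.log(dens)) - (n - tau) * np.log(b)
```

A correctly specified (a, b) concentrates the standardised sequence around the base density and so yields a higher log-likelihood than a misspecified one.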
Under the null hypothesis of no change in location and scale, the base density is estimated using a kernel density estimate of x_t, t = 1, …, n, and is denoted by f̂_X(⋅). Note that fixing a = 0 and b = 1 yields an estimate of the log-likelihood under the null. Hence, we have the likelihood-ratio test statistic

LR_τ = 2 [ ℓ̂_1(τ) − ℓ̂_0 ],    (4)

where ℓ̂_1(τ) and ℓ̂_0 denote the estimated log-likelihoods under the alternative and the null, respectively. Since we do not know the location of the changepoint, we evaluate the likelihood-ratio test statistic at τ = 1, …, n − 1 and use the maximum: LR = max_{τ ∈ {1, …, n−1}} LR_τ. A changepoint is detected if LR > c for a suitably chosen threshold c; the choice of c also determines the significance level of the test. If LR > c, then the estimated changepoint location is τ̂ = argmax_τ LR_τ. Consequently, the change in location is estimated by â and the proportional change in scale by b̂. In Section 2.2 we describe the procedure to estimate the distribution of the test statistic under the null hypothesis and how that helps to choose an appropriate threshold for our tests.
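The scan over τ can be sketched as follows. For illustration only, the base density is taken as Gaussian, for which the maximized segment log-likelihoods have a closed form; the proposed semiparametric method would substitute the kernel-based likelihoods instead.

```python
import numpy as np

def lr_scan(x):
    """Evaluate LR_tau at every candidate changepoint and return
    (max LR, argmax tau).  Segment log-likelihoods use a Gaussian
    base density for illustration, profiled over mean and variance,
    up to additive constants."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    def seg_ll(seg):
        # profiled Gaussian log-likelihood of one segment
        return -0.5 * len(seg) * np.log(max(np.var(seg), 1e-12))
    null_ll = seg_ll(x)
    lrs = [2 * (seg_ll(x[:t]) + seg_ll(x[t:]) - null_ll)
           for t in range(2, n - 1)]          # keep >= 2 points per segment
    return max(lrs), 2 + int(np.argmax(lrs))
```

The maximizer of the scan is the estimated changepoint location τ̂, and the maximum is compared with the threshold c.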
To test for a change in location only, we fix b = 1 in the above procedure. Conversely, to detect a change in scale only, we apply the procedure under the constraint a = 0. When detecting a change in location only (or scale only), the scale (location) parameter is treated as unknown, but it does not need to be estimated explicitly, as it is absorbed into the kernel density estimate. The LR test statistic (4) is adjusted accordingly, and the rest of the procedure stays the same.

Changepoint significance testing
The previous section proposed a method to detect the location of the changepoint. We now describe the procedure to determine the statistical significance of the changepoint. Generally, calculating a suitable threshold for changepoint detection requires knowledge of the underlying distribution; however, this is unknown in our semiparametric setting. Instead, we propose using permutation tests [12,32], which only assume exchangeability of the data within each segment.
We conduct a permutation test as follows. Let x_1, …, x_n be the data sequence in which we wish to test for a change in location or scale. The observations are permuted to obtain new sequences of length n, and we compute the likelihood-ratio test statistic of Equation (4) in Section 2.1 for each permuted sequence. Under the null hypothesis of no change, the rank of the test statistic for the real data relative to the values of the test statistics for the permuted data is uniform over all possible ranks. Thus, we can use the distribution of the test statistics for the permuted data to calculate an exact p-value, or a threshold for a chosen significance level. It is computationally infeasible to consider all possible permutations of the data, so typically R random permutations are considered, with R on the order of hundreds to thousands. Hence, for a significance level α, the threshold is computed as the (1 − α)th sample quantile of the permuted test statistic values. Permutation tests have been used effectively for significance testing in changepoint analysis [3,25].
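A minimal sketch of the permutation threshold, written for a generic scalar statistic; the illustrative CUSUM-type mean-shift statistic below is an assumption standing in for the LR statistic of Equation (4).

```python
import numpy as np

def max_mean_shift(x):
    """Illustrative statistic: maximal absolute difference between
    pre- and post-split sample means over all split points."""
    n = len(x)
    return max(abs(x[:t].mean() - x[t:].mean()) for t in range(1, n))

def permutation_threshold(x, stat, R=199, alpha=0.05, seed=None):
    """Level-alpha threshold estimated from R random permutations:
    the (1 - alpha) sample quantile of the statistic recomputed on
    permuted sequences, as described in the text."""
    rng = np.random.default_rng(seed)
    perms = [stat(rng.permutation(x)) for _ in range(R)]
    return np.quantile(perms, 1 - alpha)
```

A change is declared when the observed statistic exceeds the permutation threshold; with R permutations, the resulting p-value has resolution 1/(R + 1).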
Alternatively, in some applications we have training data that resemble the null distribution, which can be used to estimate the threshold, as in Maeng et al. [24] and in our applications. In such cases, we can resample randomly or deterministically chosen intervals and evaluate the test statistics; the threshold is then selected as the appropriate sample quantile of the test statistics.

Location-scale copula model for multivariate time series
We now turn to consider how the approach of Section 2.1 can be extended to the multivariate setting. Let X_1, …, X_n be an independent d-dimensional data sequence sampled from a random vector X = (X^1, …, X^d) with d-dimensional CDF F_X and marginal CDFs F_{X^1}, …, F_{X^d}. Here, θ_t ∈ Θ_1 ⊆ ℝ^p, p ≥ 1, and η_t ∈ Θ_2 ⊆ ℝ^{2d}, for t = 1, …, n, denote the parameters governing the dependence across the different components of the time series and their marginal structures, respectively. Sklar's theorem [39] states that any multivariate distribution can be written in terms of its univariate marginal distributions and a copula. Applying Sklar's theorem, we can write the d-dimensional CDF F_X as

F_X(x^1, …, x^d) = C_{θ_t}( F_{X^1}(x^1), …, F_{X^d}(x^d) ),    (5)

where C is a copula with parameters θ_t and the marginal CDFs depend on the parameters η_t. This framework allows us to model and detect a changepoint in both the marginal structure and the dependence structure of the multivariate data sequence.
Differentiating Equation (5), we obtain

f_X(x^1, …, x^d) = c_{θ_t}( F_{X^1}(x^1), …, F_{X^d}(x^d) ) ∏_{i=1}^{d} f_{X^i}(x^i),    (6)

where c_{θ_t} is the copula density and f_{X^i}, i = 1, …, d, are the marginal densities. We assume that c_{θ_t} belongs to an absolutely continuous parametric family of copulas. Throughout this section, the null hypothesis is

H_0: θ_1 = ⋯ = θ_n and η_1 = ⋯ = η_n;

in other words, there is no change in the marginal structure or the dependence structure of the multivariate time series.

2.3.1 Change in marginal structure in the presence of dependence

Let X_t^i follow a location-scale family of distributions with location parameter a_it and scale parameter b_it, for i = 1, …, d. The location and scale parameters constitute the marginal structure, so that η_t = (a_1t, b_1t, …, a_dt, b_dt). We assume that the copula parameters do not change and are unknown, denoted by θ. In other words, we are concerned with detecting any changes present in the marginal structure of the time series, and we test the null hypothesis against the following alternative:

H_1^(2): η_1 = ⋯ = η_τ ≠ η_{τ+1} = ⋯ = η_n,

where τ is unknown. We first estimate the copula parameters under the null hypothesis using maximum likelihood. We consider the following log-likelihood under the null hypothesis for the copula parameters θ:

ℓ(θ) = Σ_{t=1}^{n} log c_θ( F̂_{X^1}(x_t^1), …, F̂_{X^d}(x_t^d) ) + Σ_{t=1}^{n} Σ_{i=1}^{d} log f̂_{X^i}(x_t^i).    (7)

We estimate the marginals nonparametrically, using the rescaled empirical CDFs

F̂_{X^i}(x) = (1/(n + 1)) Σ_{t=1}^{n} 1{x_t^i ≤ x}, i = 1, …, d.    (8)

The likelihood estimator of the copula parameters is then obtained by maximizing the first term of Equation (7), the only term that depends on θ:

θ̂ = argmax_θ Σ_{t=1}^{n} log c_θ( F̂_{X^1}(x_t^1), …, F̂_{X^d}(x_t^d) ).
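The pseudo-observation transform and the null-hypothesis copula fit can be sketched as follows for the Gaussian-copula family. Estimating the correlation by the sample correlation of the normal scores is a standard surrogate (an assumption here) for maximizing the copula term of Equation (7); the function names are illustrative.

```python
import numpy as np
from scipy.stats import norm, rankdata

def pseudo_observations(X):
    """Rescaled empirical-CDF transform of each column to (0, 1):
    u_it = rank(x_it) / (n + 1), the usual pseudo-observations."""
    n = X.shape[0]
    return rankdata(X, axis=0) / (n + 1)

def fit_gaussian_copula_corr(X):
    """Estimate the copula correlation matrix under the null from the
    sample correlation of the normal scores Phi^{-1}(u), a standard
    surrogate for the Gaussian-copula MLE."""
    Z = norm.ppf(pseudo_observations(X))
    return np.corrcoef(Z, rowvar=False)
```

Because the transform depends only on ranks, a monotone change of a marginal leaves the estimated dependence untouched, which is exactly the separation the copula framework provides.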
We construct a likelihood ratio to test for a change in the marginal structure. The likelihood-ratio test statistic is

LR_τ = −2 log [ L̂_0 / L̂_1(τ) ],    (9)

where L̂_0 and L̂_1(τ) are the maximized likelihoods under the null and the alternative, respectively. The marginal distributions are modeled using location-scale distributions, in the same framework as Section 2.1. Consider the log-likelihood under the null in the numerator of Equation (9), where the location and scale parameters of each component are fixed at 0 and 1, respectively; the marginal densities and CDFs of each component are estimated using kernel density estimates and the empirical CDFs over the whole data. Under the alternative, the log-likelihood in the denominator of Equation (9) is maximized over the change in location and scale parameters for each component and the copula parameters, using numerical optimization. Hence, we obtain the likelihood-ratio statistic for a change in the marginal structure. To test for a change in the location vector only, we fix the scale parameter of each component at 1. We compute the likelihood-ratio test statistic for τ = m, …, n − m, where m is the minimum segment length, and use the maximum. The choice of m can be problem specific and should at least allow enough observations for the estimation of the parameters. The rest of the procedure for detecting a changepoint is the same as explained in Section 2.1.

Change in dependence structure
The proposed framework also allows us to detect changes in the dependence structure of the series through the copula parameters, when there is no change in the marginal structure. We estimate the marginal distributions nonparametrically with their empirical CDFs for each segment. This framework is similar to existing copula-based methods for detecting changes in the dependence structure [10], which arise as a particular case of our method. Here, we test the null hypothesis against the alternative

H_1^(3): θ_1 = ⋯ = θ_τ ≠ θ_{τ+1} = ⋯ = θ_n.

The likelihood-ratio test statistic for a change in the copula parameter is

LR_τ = 2 [ Σ_{t=1}^{τ} log c_{θ̂^(0)}(û_t) + Σ_{t=τ+1}^{n} log c_{θ̂^(1)}(û_t) − Σ_{t=1}^{n} log c_{θ̂}(û_t) ],

where û_t = ( F̂_{X^1}(x_t^1), …, F̂_{X^d}(x_t^d) ), estimation is carried out using maximum likelihood, and θ̂^(0), θ̂^(1), and θ̂ denote the copula-parameter estimates before the change, after the change, and under the null, respectively.
As stated, the location of the changepoint is unknown, and we reject the null hypothesis for large values of LR = max_τ LR_τ. The threshold is obtained using the permutation approach explained in Section 2.2. Users can choose from different parametric families of copulas, for example elliptical copulas [13,35], including the normal and Student t copulas; Archimedean copulas [15], including the Frank, Gumbel, and Clayton copulas; and pair-copula constructions [1].
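As one concrete instance, the sketch below scans for a change in the correlation parameter of a bivariate Gaussian copula. Estimating ρ by the sample correlation of the normal scores within each segment is a standard surrogate for the copula MLE (an assumption of this sketch), and the function names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def gauss_copula_ll(z1, z2, rho):
    """Bivariate Gaussian-copula log-likelihood of the normal scores
    z = Phi^{-1}(u) at correlation rho."""
    q = 1.0 - rho ** 2
    return np.sum(-0.5 * np.log(q)
                  + (2 * rho * z1 * z2 - rho ** 2 * (z1 ** 2 + z2 ** 2)) / (2 * q))

def dependence_lr_scan(u, v, min_seg=5):
    """Scan for a change in the Gaussian-copula correlation of the
    pseudo-observations (u, v): fit rho on each segment and compare
    against the null fit on the whole sequence."""
    z1, z2 = norm.ppf(u), norm.ppf(v)
    n = len(u)
    def seg_ll(a, b):
        r = np.clip(np.corrcoef(z1[a:b], z2[a:b])[0, 1], -0.99, 0.99)
        return gauss_copula_ll(z1[a:b], z2[a:b], r)
    null_ll = seg_ll(0, n)
    lrs = [2 * (seg_ll(0, t) + seg_ll(t, n) - null_ll)
           for t in range(min_seg, n - min_seg)]
    return max(lrs), min_seg + int(np.argmax(lrs))
```

The clipping of the estimated correlation keeps the log-likelihood finite when a segment happens to be (near-)comonotone.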

Change in both marginal and dependence structure
To deal with the case where both the marginal and dependence structures can change, we adapt our procedure in the following way. First, we estimate the marginal distributions with their empirical CDFs for the whole data sequence, as in Equation (8). We use these estimated CDFs to construct the likelihood under the null and alternative hypotheses. Since copulas describe the dependence structure between variables independently of their marginal distributions, this allows us to detect a change in the copula parameters without the influence of changes in the marginal distributions. After finding the changes in the copula parameters, we treat the copula parameters as known and follow the rest of the procedure for detecting changes in the marginal structure, as defined in Section 2.3.1.

Implementation details
In this section, we provide some details on implementing the proposed changepoint models. We compute the kernel density estimates using a Gaussian kernel. The bandwidth is calculated with the method of Silverman [38], which uses the standard deviation, interquartile range, and sample size of the data. At each candidate changepoint τ, we compute a bandwidth for the entire standardized sequence, based on the optimized location and scale parameters, using this data-driven rule. For τ = 1, …, n − 1, the optimized values of the location and scale parameters differ, so the bandwidth changes adaptively. In the location-scale copula models for multivariate data sequences, for computational reasons, we estimate the CDFs using empirical CDFs and the density functions using kernel density estimates.
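A sketch of the bandwidth rule, in one common form of Silverman's rule of thumb (the exact constants used in the paper may differ):

```python
import numpy as np

def silverman_bandwidth(y):
    """Silverman's rule of thumb, combining the standard deviation,
    the interquartile range, and the sample size:
    0.9 * min(sd, IQR / 1.349) * n^(-1/5)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    sd = np.std(y, ddof=1)
    iqr = np.subtract(*np.percentile(y, [75, 25]))
    return 0.9 * min(sd, iqr / 1.349) * n ** (-1 / 5)
```

Because the rule is applied to the standardized sequence, the bandwidth adapts automatically as the optimized location and scale parameters change with τ.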
For numerical optimization, we use the downhill simplex algorithm of Nelder and Mead [27]. When bounds on the parameters are known, we use the box-constrained method of Byrd et al. [8], which accepts lower and upper bounds on the parameters. Since the optimization is carried out at every candidate changepoint from 1 to n − 1, we use the estimated parameter values from the previous step as the initial values for the current step, for efficient optimization. For the location-scale copula models, estimating both the copula and marginal parameters simultaneously under the alternative can be cumbersome in practice. Hence, we first estimate the marginal parameters given the copula parameters estimated under the null, and then update the copula parameters under the alternative given the marginal parameters. In practice, this makes little difference to the estimated copula parameters under the alternative but avoids the cumbersome joint optimization of both sets of parameters. Note that if we assume the data belong to a particular distribution in the location-scale family, then the methodology simplifies, as the base density is known. The threshold can then be estimated with Monte Carlo methods: we repeatedly simulate data from the known location-scale distribution with fixed location and scale, and take the empirical quantile of the test statistics computed from the simulated data.
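The warm-start strategy over candidate changepoints can be sketched as follows; SciPy's Nelder-Mead routine stands in for the optimizers cited above, and the Gaussian negative log-likelihood is a placeholder (an assumption) for the kernel-based objective.

```python
import numpy as np
from scipy.optimize import minimize

def gaussian_negloglik(params, x, tau):
    """Placeholder objective: pre-change N(0, 1) (the identifiability
    constraint) and post-change N(a, b^2)."""
    a, b = params
    if b <= 0:
        return np.inf
    pre = 0.5 * np.sum(x[:tau] ** 2)
    post = np.sum(0.5 * ((x[tau:] - a) / b) ** 2 + np.log(b))
    return pre + post

def scan_with_warm_starts(x, negloglik, x0=(0.0, 1.0)):
    """Optimize (a, b) at each candidate changepoint with Nelder-Mead,
    warm-starting each run from the previous tau's solution as
    described above.  Returns (tau, estimates, log-likelihood) triples."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    params = np.asarray(x0, dtype=float)
    out = []
    for tau in range(2, n - 1):
        res = minimize(negloglik, params, args=(x, tau), method="Nelder-Mead")
        params = res.x                      # warm start for the next tau
        out.append((tau, res.x, -res.fun))
    return out
```

Since the optimal (a, b) typically varies smoothly in τ, warm starting usually cuts the number of simplex iterations substantially compared with restarting from a fixed initial point.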

Estimating multiple changepoints
Our proposed methods for estimating a single changepoint in univariate and multivariate time series can be combined with the classical binary segmentation algorithm to locate multiple changepoints sequentially. Binary segmentation is an established search algorithm that extends the likelihood-ratio statistic approach to multiple changepoints. The idea is to first apply the single-changepoint methodology to the whole time series. If no changepoint is detected, we stop. If a changepoint is detected, we split the time series into two (binary) segments, before and after the changepoint, and apply the method to each segment. If a changepoint is detected in either or both segments, we split into further segments, and the procedure is repeated until no further changepoints are detected.

Algorithm 1.
Pseudocode for the multiple-changepoint algorithm based on the location-scale methodology for univariate and multivariate time series, combined with binary segmentation

Input:
A set of data of the form (x_1, …, x_n), a test statistic LR(⋅), and a threshold c.

The pseudocode for detecting multiple changepoints with binary segmentation is given in Algorithm 1. The algorithm can be used with both the proposed univariate and multivariate location-scale methodologies. It computes the likelihood-ratio statistic on a data segment x_{s:e}; a changepoint is detected if LR(x_{s:e}) ≥ c, and the changepoint location τ̂(x_{s:e}) is recorded in 𝒯, the set of detected changepoints. The current segment is removed from the set of segments 𝒮, and two new segments, obtained by splitting the segment at the changepoint, are added. The iterations continue until 𝒮 is empty. The advantage of binary segmentation is that it is fast, with computational cost O(n log n), where n is the length of the data. Other variants of binary segmentation can also be combined with our single-changepoint methods, for example wild binary segmentation [14] or seeded binary segmentation [22], which are suited to cases where changepoints may be close to each other.
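A compact runnable rendering of Algorithm 1, with a Gaussian mean/variance LR scan standing in (as an illustrative assumption) for the semiparametric single-changepoint statistic:

```python
import numpy as np

def mean_lr_scan(seg):
    """Gaussian LR scan on one segment; a stand-in for the
    semiparametric single-changepoint statistics of Section 2."""
    n = len(seg)
    def ll(y):
        return -0.5 * len(y) * np.log(max(np.var(y), 1e-8))
    null = ll(seg)
    lrs = [2 * (ll(seg[:t]) + ll(seg[t:]) - null) for t in range(2, n - 1)]
    return max(lrs), 2 + int(np.argmax(lrs))

def binary_segmentation(x, scan, threshold, min_len=4):
    """Algorithm 1: repeatedly apply a single-changepoint scan to each
    segment in S, recording a changepoint in T and splitting whenever
    the statistic exceeds the threshold."""
    x = np.asarray(x, dtype=float)
    cps, segments = [], [(0, len(x))]       # T and S in the text
    while segments:
        s, e = segments.pop()
        if e - s < 2 * min_len:             # too short to split further
            continue
        lr, tau = scan(x[s:e])
        if lr >= threshold:
            cps.append(s + tau)
            segments += [(s, s + tau), (s + tau, e)]
    return sorted(cps)
```

Each detected changepoint triggers at most two further scans, which is what yields the O(n log n) cost noted above when the statistic itself is linear in the segment length.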

SIMULATION STUDY
This section presents simulation studies for the location-scale (LS) changepoint model for univariate distributions and the location-scale copula (LSC) model for multivariate distributions. We examine the empirical performance of the proposed methods in a range of settings and compare it with competing methods. For univariate changepoint detection, we compare performance with a parametric method assuming a Gaussian distribution [9] and a nonparametric method based on the pruned exact linear time algorithm [21] defined by Haynes et al. [17], referred to as NP-PELT, using the changepoint.np package in R [33]. For multivariate changepoint detection in location, we compare our method with a parametric method assuming a multivariate normal distribution and a multivariate nonparametric method based on divisive clustering [25], referred to as NP E-divisive, using the ecp package in R. Throughout the simulation studies, we use the permutation approach of Section 2.2 to compute the threshold and the binary segmentation algorithm to estimate multiple changepoints.

Single changepoint detection
In this section, we compare the performance of the LS model with the Gaussian method and NP-PELT for detecting a single changepoint in various univariate sequences. We examine the power of the test and the estimation accuracy over a range of change sizes. First, we consider detecting a changepoint in location, simulating the data from three distributions: the Laplace distribution, Student's t-distribution with four degrees of freedom, and a negatively skewed triangular distribution. The scale parameter of all the distributions is fixed at 1. We set the sample size to n = 200, with the true changepoint at location 125, and denote the size of the change in location by Δ. We compute the power of the test, that is, the proportion of times a change is detected in the presence of a change, and the mean squared error (MSE) of the estimated changepoint location relative to the true changepoint location, for each Δ over 100 repetitions for all three methods. The results for detecting the change in location are shown in Figure 1. For data simulated from the Laplace and t distributions, the LS method has the highest power and the lowest estimation error. For negatively skewed data, the power of both NP-PELT and the LS method is high, but the high power of NP-PELT comes at the cost of accuracy. We also examine the performance of the methods in detecting a single changepoint in scale, simulating data from the Gaussian, t, and Gumbel distributions. The location parameter is kept fixed at 0, and the simulation settings are otherwise similar to the change-in-location scenario. Here, we vary the size of the change in scale from 0 to 2 for a length of 50 points. We use the power of the test and the mean absolute deviation (MAD) of the estimated changepoint location from the true changepoint location to evaluate the performance. The results for detecting the changepoint in scale are shown in Figure 2.
For Gaussian data, the performance of the proposed LS method remains close to that of the Gaussian method; it has slightly lower power than the Gaussian method but higher power than NP-PELT. For t- and Gumbel-distributed data, the LS method performs better than both NP-PELT and the Gaussian method, with higher power and similar MAD values. Overall, the proposed LS method outperforms the Gaussian and NP-PELT methods in detecting a changepoint in location or scale for non-Gaussian distributions, including extreme value distributions such as the Gumbel distribution.
We provide a further example in which we simulate changes in both location and scale. We vary the l2 norm of the change in location and scale from 0 to 2.2; the rest of the simulation settings are similar to the previous examples. We test for a change in both location and scale, and the results are shown in Figure 3. Additional MSE and MAD plots for these examples are provided in the Data S1. The performance of the proposed LS method is close to the Gaussian method for Gaussian data, while for non-Gaussian data it performs better than the competing methods. We also apply the tests for a change in location only and in scale only to this simulation setting, to examine the robustness of the method under misspecification, and present the results in the Data S1. We use the permutation approach to estimate the corresponding threshold of the test, which constitutes a substantial part of the overall computational cost. However, once one goes beyond simple changepoint settings where accurate theory can be used to obtain thresholds, some form of Monte Carlo approach is needed to obtain sensible thresholds. The computational time for the single-changepoint simulation studies was close to 10 s for both the permutation procedure and the Monte Carlo simulation.
We also simulate from an autoregressive (AR) model of order 1 with parameter φ and no changepoints. We vary the strength of the autocorrelation, φ, from 0 to 0.8 to perform a sensitivity analysis of the permutation approach to threshold estimation in the presence of temporal dependence. The permutation procedure assumes independent observations; if there is autocorrelation in the data, the method may detect a large number of false changepoints. To overcome this, Lavielle and Moulines [23] suggested a simple adjustment that increases the threshold according to the level of autocorrelation: for the AR(1) model, the threshold is inflated by the factor (1 + φ)/(1 − φ).
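The adjustment amounts to a one-line inflation of the threshold:

```python
def ar1_threshold_adjustment(c, phi):
    """Inflate a threshold c computed under independence by the AR(1)
    factor (1 + phi) / (1 - phi) suggested by Lavielle and Moulines."""
    return c * (1 + phi) / (1 - phi)
```

For φ = 0 the threshold is unchanged, and it grows without bound as φ approaches 1, reflecting how strongly positive autocorrelation mimics level shifts.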
We plot the proportion of detected changepoints against the strength of the autocorrelation at the 0.05 significance level in Figure 4. The permutation procedure without adjustment shows an increase in false positives as the strength of the dependence increases, while the false-positive rate is stable after the adjustment.

Multiple changepoint detection
In this section, we evaluate the performance of the methods in detecting multiple changepoints in location. We simulate data from the Gaussian distribution, the t distribution, and a mixture setting in which the first half of the time series follows a Student's t-distribution with four degrees of freedom and the second half follows a Laplace distribution. The scale parameter is kept fixed at 1. We set the sample size to n = 1000 with six changepoints, randomly placed in the time series. The location parameters for the seven segments are set to (1.5, −2, 0, 2.5, −1.1, 0, 1.7). We consider a true changepoint τ_i detected if there is at least one estimated changepoint within a window of size h, that is, if min_{1≤j≤m̂} |τ̂_j − τ_i| ≤ h, where m̂ is the number of detected changepoints. We use h = 2 for all the simulation studies. Denoting the true number of changepoints by m, we report the true-positive rate as the proportion of the m true changepoints that are correctly identified, and the false-positive rate as the proportion of the m̂ estimated changepoints that are incorrect. To further evaluate the accuracy of the changepoint locations, we compute the one-sided distances between the estimated changepoint set τ̂ and the true set τ,

d(τ̂ | τ) = max_j min_i |τ̂_j − τ_i| and d(τ | τ̂) = max_i min_j |τ̂_j − τ_i|,

which quantify the over-segmentation and under-segmentation errors, respectively. We also report the adjusted Rand index (ARI), which measures the similarity between the estimated and true segmentations [34].
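These diagnostics can be computed as follows; `changepoint_metrics` is an illustrative helper matching the definitions above.

```python
import numpy as np

def changepoint_metrics(true_cps, est_cps, h=2):
    """True/false-positive rates with matching window h, and the
    one-sided distances d(est | true) and d(true | est) used as the
    over- and under-segmentation errors."""
    true_cps = np.asarray(true_cps, dtype=float)
    est_cps = np.asarray(est_cps, dtype=float)
    if est_cps.size == 0:
        return 0.0, 0.0, np.inf, 0.0
    # distance from each true changepoint to its nearest estimate
    dist_to_est = np.array([np.min(np.abs(est_cps - t)) for t in true_cps])
    # distance from each estimate to its nearest true changepoint
    dist_to_true = np.array([np.min(np.abs(true_cps - e)) for e in est_cps])
    tpr = np.mean(dist_to_est <= h)      # detected true changepoints
    fpr = np.mean(dist_to_true > h)      # spurious estimates
    under = dist_to_est.max()            # d(true | est): missed changes
    over = dist_to_true.max()            # d(est | true): extra changes
    return tpr, fpr, under, over
```

Note the asymmetry of the two distances: a spurious estimate inflates only the over-segmentation error, while a missed change inflates only the under-segmentation error.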
The results for detecting multiple changepoints in location are presented in Table 1. As expected, when the data are Gaussian, the Gaussian method performs best; the LS method is nevertheless quite close to the Gaussian method and slightly better than NP-PELT. For Student's t data and the mixture distribution, the LS method outperforms the other two methods. NP-PELT finds more true positives than the Gaussian method, but at the cost of more false positives. Considering all the diagnostic measures, we argue that the LS method is the most suitable for detecting changes in location for non-Gaussian distributions; the Gaussian method works well when the data are Gaussian, and NP-PELT is suitable when the objective is to detect any kind of change in distribution.

Change in location
In this section, we examine the performance of the LSC method for detecting changes in the location of multivariate time series. In the simulations, we also consider an LS method that neglects the copula, to understand the impact of copulas in modeling the dependence between components. First, we compare the performance of the methods on a single-changepoint task. We simulate bivariate data with mixed component distributions, where one component follows Student's t-distribution and the other follows an arcsine distribution [5]. The scale parameter of both components is fixed at 1. The location parameters are set such that the l2 norm of the change in the location vector varies from 0 to 3. We consider two scenarios of dependence between the components: (i) dependence through a Gaussian copula with correlation parameter 0.75, and (ii) independent components. We report the power, true-positive rate, false-positive rate, and mean squared error (MSE) of the estimated changepoint locations relative to the true changepoint locations.
We compare the performances of the location-scale copula (LSC) method, the location-scale method neglecting the copula (LSnC), the multivariate Gaussian method, and the NP E-divisive method. The results for detecting a single changepoint in the location vector of the mixed-component bivariate time series are shown in Figure 5 for the dependent case and Figure 6 for the independent case. In both dependence scenarios, the location-scale methods, with or without the copula, outperform the NP E-divisive and multivariate Gaussian methods. For independent data, the performance of LSC and LSnC is quite similar, as modeling the dependence gives no improvement. For dependent data, however, the LSC method performs better than the LSnC method; this improvement illustrates the role of copulas in modeling dependence, which helps to detect a changepoint. Overall, the LSC method performs best for detecting a change in location for multivariate time series, but the LSnC method can be used for computational advantages if the data are independent.
We next examine the performance of the methods in detecting multiple changepoints in location for dependent multivariate time series. We consider a simulation setting with n = 1000 and three changepoints at 250, 500, and 750. We simulate from the mixed-component distribution with scale parameters fixed at 1 and location parameters chosen such that the l2 norms of the changes are (0.83, 1.87, 3.35). This also allows us to assess the performance of the methods for different sizes of change. We report the true-positive rate, false-positive rate, under-segmentation error, over-segmentation error, and ARI for all the methods, computed over 100 runs. Table 2 summarizes the results. The proposed LSC method shows encouraging performance on all diagnostic measures. The LSnC method is competitive, and the comparison highlights the drop in performance from not modeling the dependence. A pictorial representation of the results is shown in Figure 7, which displays the performance of the methods at each changepoint location. Again, the LSC method tends to have the most true positives and fewest false positives at all locations.

TABLE 1 Results for detecting multiple changepoints in location for data simulated using Gaussian, t, and mixture distributions, using three methods: Gaussian, location-scale, and NP-PELT.

Change in dependence structure
We also look at the performance of the LSC model for detecting changes in the dependence structure, which is more challenging than detecting changes in the marginal structure [19]. We compare the method with a change-in-covariance test assuming a multivariate normal distribution (see Chen and Gupta [9] for details) and with the NP E-divisive method. We simulate data from a bivariate Student's t-distribution with five degrees of freedom, with the location and scale of the components fixed at 0 and 1, respectively. For the multivariate Gaussian method, we use the change-in-covariance test with known mean; for the proposed method, the marginal distributions are estimated nonparametrically, with location and scale treated as unknown. We vary the size of the change in the correlation parameter from 0 to 0.9 to examine the performance of the methods for different change sizes, and report the power and MAD for both methods. The simulation results are shown in Figure 8. The proposed LS copula method, even with a Gaussian copula, has higher power for detecting a change in dependence and estimates the changepoint location more accurately than the multivariate Gaussian method. In practice, the choice of copula can be made with goodness-of-fit tests. The proposed method is built to detect particular properties of the data sequence, such as a change in dependence, which are very difficult to detect with methods that target a change in the overall distribution, such as NP E-divisive, as is evident from Figure 8. We present an additional simulation study in the Data S1, which shows the robustness of the proposed method for detecting dependence changes in the presence of changes in the marginal structure.
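The idea of detecting a dependence change through the copula parameter can be sketched as follows. This is a simplified stand-in for the paper's test, not the authors' implementation: it uses rank-based pseudo-observations for the nonparametric margins and a Gaussian-copula likelihood-ratio statistic maximized over candidate split points (all function names are ours).

```python
import numpy as np
from scipy import stats

def gauss_copula_loglik(u, rho):
    """Log-likelihood of a bivariate Gaussian copula at pseudo-observations u."""
    z1, z2 = stats.norm.ppf(u[:, 0]), stats.norm.ppf(u[:, 1])
    det = 1.0 - rho ** 2
    return np.sum(-0.5 * np.log(det)
                  - (rho ** 2 * (z1 ** 2 + z2 ** 2) - 2 * rho * z1 * z2) / (2 * det))

def fit_rho(u):
    """Correlation of the normal scores: a simple copula-parameter estimate."""
    z = stats.norm.ppf(u)
    return float(np.clip(np.corrcoef(z.T)[0, 1], -0.99, 0.99))

def dependence_change_stat(x, min_seg=20):
    """Twice the maximal gain in copula log-likelihood from fitting separate
    correlation parameters before/after a split; large values suggest a
    change in the dependence structure."""
    n = len(x)
    u = stats.rankdata(x, axis=0) / (n + 1.0)   # rank-based (nonparametric) margins
    full = gauss_copula_loglik(u, fit_rho(u))
    best = -np.inf
    for k in range(min_seg, n - min_seg):
        split = (gauss_copula_loglik(u[:k], fit_rho(u[:k]))
                 + gauss_copula_loglik(u[k:], fit_rho(u[k:])))
        best = max(best, 2.0 * (split - full))
    return best
```

In this sketch the statistic is compared against a threshold obtained by simulation, mirroring the training-interval approach used in the applications below.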

APPLICATIONS
We now illustrate our proposed methods with applications in health and financial data.

Health data
We first study physiological data of adult patients from a public-access database developed by Saeed et al. [36]. The database records time series of clinical measures, such as heart rate, systolic blood pressure, diastolic blood pressure, and oxygen saturation, of adult patients in the intensive care unit, and is freely available to researchers at https://archive.physionet.org/mimic2/. We consider blood pressure data, which comprise two components: systolic blood pressure, the pressure exerted by blood against the artery walls when the heart beats, and diastolic blood pressure, the same pressure measured when the heart is resting between beats. It is important to study the two components together when monitoring blood pressure, so we treat them as bivariate time series data. The data are recorded every minute [20,30]. We aim to detect changepoints in location for the bivariate distribution of systolic and diastolic blood pressure.

FIGURE 6 Results for detecting a single changepoint as a function of the size (norm) of the change for independent bivariate simulated data, using three methods: multivariate Gaussian, location-scale without copula, and location-scale with copula.

TABLE 2
We apply the proposed LSC model to detect changes in the location of the bivariate blood pressure time series. We first investigate which copula fits the data best using the AIC criterion, considering the elliptical copulas (Gaussian and Student's t) and the Archimedean copulas (Clayton, Frank, and Gumbel) to construct the multivariate dependence structure. The results for the fitted copulas are shown in Table 3. The elliptical copulas fit the data much better, with the Gaussian copula having the smallest AIC. The estimated degrees of freedom for the t-copula are very high, suggesting that the Gaussian copula captures the dependence structure well. The estimated value of the Gaussian copula parameter is 0.8046; we assume a common copula parameter for the whole period. We obtain the threshold using the blood pressure time series of the same patient from 11:17 a.m. to 9:24 p.m. on March 24, 2012, giving 606 training observations. We compute the test statistic for 1000 randomly chosen intervals, and the 95% quantile of these test statistics is chosen as the threshold. We compare our method with the NP E-divisive method of Matteson and James [25] for detecting multiple changepoints in the blood pressure time series. The results for both methods are shown in Figure 9. The proposed LSC model detects four significant changes in the location of the bivariate blood pressure time series, at 00:29 a.m., 03:09 a.m., 07:33 a.m., and 12:05 p.m. In comparison, the nonparametric E-divisive method detects 24 changepoints at a 0.01 significance level, since it responds to any change in distribution rather than to changes in location alone.

TABLE 3 Results of fitting copulas to the blood pressure time series for the whole dataset.

Financial data
We apply the proposed method to detect changes in the dependence structure of the Standard & Poor's 500 (S&P 500) and Nasdaq indices for 2019 and 2020. The data for both indices are freely available at www.nasdaq.com and are recorded daily on stock market opening days. We consider the daily log returns of both indices, as these are commonly studied in finance. The two indices are popular US equity indices and have previously been analyzed for changepoints in their dependence structure; see Guegan and Zhang [16]. We model the dependence structure of the two indices using copulas and are interested in whether the parameters of the copula model change. To investigate the dependence structure between the two indices, we first find the copula that best fits the data over the whole period using the AIC criterion. The results are given in Table 4. The parameter estimation converged for all the considered copulas. The Student's t copula has the smallest AIC and therefore provides the best fit for the whole dataset. We then apply the proposed method with the Student's t copula to the bivariate financial time series of the two indices.
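Copula selection by AIC on pseudo-observations can be sketched as below for two one-parameter families, the Gaussian and Clayton copulas (a simplified illustration, not the authors' implementation; the Student's t and other families considered in the paper would be fitted analogously, with the t copula carrying a second parameter in its AIC penalty).

```python
import numpy as np
from scipy import stats, optimize

def gaussian_copula_aic(u):
    """AIC from maximizing the bivariate Gaussian copula likelihood over rho."""
    z1, z2 = stats.norm.ppf(u[:, 0]), stats.norm.ppf(u[:, 1])
    def nll(rho):
        det = 1.0 - rho ** 2
        return np.sum(0.5 * np.log(det)
                      + (rho ** 2 * (z1 ** 2 + z2 ** 2) - 2 * rho * z1 * z2) / (2 * det))
    res = optimize.minimize_scalar(nll, bounds=(-0.99, 0.99), method="bounded")
    return 2 * 1 + 2 * res.fun          # AIC = 2k - 2*loglik, k = 1 parameter

def clayton_copula_aic(u):
    """AIC for the one-parameter Clayton copula, fitted the same way."""
    lu, lv = np.log(u[:, 0]), np.log(u[:, 1])
    def nll(theta):
        s = u[:, 0] ** (-theta) + u[:, 1] ** (-theta) - 1.0
        ll = (np.log1p(theta) - (theta + 1.0) * (lu + lv)
              - (2.0 + 1.0 / theta) * np.log(s))
        return -np.sum(ll)
    res = optimize.minimize_scalar(nll, bounds=(0.01, 20.0), method="bounded")
    return 2 * 1 + 2 * res.fun
```

The family with the smallest AIC over the whole period is then carried into the changepoint analysis.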
In order to apply the test for a change in dependence proposed in Section 2.3.2, we first estimate the marginal distributions nonparametrically. Assuming that the type of copula remains the same, we test for a change in the dependence structure by testing for a change in the copula parameter, which measures the correlation between the two time series. We test for multiple changepoints using the binary segmentation procedure. To obtain a threshold, we consider the daily time series of the two indices for 2017 and 2018 and compute the 95% quantile of the test statistics from 1000 randomly chosen intervals.
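The binary segmentation procedure with a training-based threshold can be sketched generically; here `stat_fn` stands in for whichever changepoint statistic is used, and the helper names are ours rather than the paper's.

```python
import numpy as np

def binary_segmentation(x, stat_fn, threshold, min_seg=20):
    """Generic binary segmentation: recursively split the series wherever
    the changepoint statistic exceeds the threshold.
    stat_fn(segment) -> (max statistic, argmax index within segment)."""
    found = []
    def recurse(lo, hi):
        if hi - lo < 2 * min_seg:
            return
        s, k = stat_fn(x[lo:hi])
        if s > threshold:
            found.append(lo + k)
            recurse(lo, lo + k)       # search to the left of the split
            recurse(lo + k, hi)       # and to the right
    recurse(0, len(x))
    return sorted(found)

def threshold_from_training(x_train, stat_fn, n_intervals=1000, min_len=40,
                            q=0.95, seed=0):
    """q-quantile of the statistic over randomly chosen training intervals."""
    rng = np.random.default_rng(seed)
    n = len(x_train)
    vals = []
    for _ in range(n_intervals):
        a = int(rng.integers(0, n - min_len))
        b = int(rng.integers(a + min_len, n)) + 1
        vals.append(stat_fn(x_train[a:b])[0])
    return float(np.quantile(vals, q))
```

With the copula-parameter statistic plugged in as `stat_fn`, this mirrors the procedure applied to the index data: calibrate the threshold on 2017-2018, then segment 2019-2020.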
The algorithm detects only one significant changepoint, on March 17, 2020. The log returns of both indices and the estimated changepoint are shown in Figure 10. The correlation parameter was 0.97 before the changepoint and decreased to 0.83 after it; ignoring the changepoint, it was 0.9325. The changepoint date corresponds to the US stock market crash of 2020, which was triggered by COVID-19; this period is also investigated by Mazur et al. [26] for its atypical stock market performance. Analyzing the dependence structure of the two indices and its changepoints can be helpful for portfolio management and risk assessment. We also refit the copula models on the two segments of the time series, and the Student's t copula still fits best in terms of the AIC criterion.

DISCUSSION
We have defined a semiparametric approach for detecting changepoints in location, scale, and copula settings for univariate and multivariate data sequences. We do not assume an explicit form of the distribution beyond the LS family of distributions, which offers greater flexibility. We use the dependence between the components of multivariate data to detect subtle changes in the marginal structure, and we show that detecting changes in the dependence structure is a special case of our method. The proposed method outperforms Gaussian methods and other competing nonparametric methods in detecting changes in the location and scale of non-Gaussian data sequences.
We estimated the associated base density of the LS family nonparametrically using kernel density estimates, but alternatives such as k-nearest-neighbor estimates can be used. If the base density is known, the proposed methodology simplifies by using the known base density directly. To resolve the identifiability issue in simultaneously estimating the base density and the LS parameters, we imposed constraints on the parameters of the first segment; alternatively, constraints can be placed on the base density.
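A minimal sketch of the kernel-based estimate of the base density follows, assuming the LS parameters of a reference segment are given (names are ours; the paper's joint estimation of the base density and LS parameters is more involved).

```python
import numpy as np
from scipy import stats

def estimate_base_density(x, loc, scale):
    """Kernel estimate of the base density g of a location-scale family:
    if X = loc + scale * E with E ~ g, then standardizing the data and
    smoothing with a Gaussian KDE estimates g nonparametrically."""
    e = (np.asarray(x) - loc) / scale
    return stats.gaussian_kde(e)

def ls_density(kde, x, loc, scale):
    """Density of x implied by the estimated base density and LS parameters:
    f(x) = g((x - loc) / scale) / scale."""
    return kde((np.asarray(x) - loc) / scale) / scale
```

Each candidate segment's density is then evaluated through `ls_density` with segment-specific location and scale, while the single base density is shared across segments.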
The proposed framework allows for mixed continuous distributions in the components of the multivariate data sequence. The methodology can be extended to allow for mixed discrete and continuous distributions using copula models for multivariate mixed discrete-continuous problems [2,43]. Also, we introduced our methodology for independent sequences, but it can be extended to model autocorrelation as well.