### Keywords:

• asymptotic properties;
• bootstrap tests;
• density estimation;
• hypothesis testing;
• maximum likelihood estimators;
• spherical data;
• von Mises distribution

### ABSTRACT

In this paper, we study the problem of testing the hypothesis on whether the density f of a random variable on a sphere belongs to a given parametric class of densities. We propose two test statistics based on the L2 and L1 distances between a non-parametric density estimator adapted to circular data and a smoothed version of the specified density. The asymptotic distribution of the L2 test statistic is provided under the null hypothesis and contiguous alternatives. We also consider a bootstrap method to approximate the distribution of both test statistics. Through a simulation study, we explore the moderate sample performance of the proposed tests under the null hypothesis and under different alternatives. Finally, the procedure is illustrated by analysing a real data set based on wind direction measurements.

### Introduction

The problem of testing the hypothesis that a random sample $x_1, \dots, x_n$ is generated by a specific distribution or by some set of distributions has been widely studied in the literature. To our knowledge, the first authors who tackled the problem of goodness of fit when using non-parametric density estimators were Bickel & Rosenblatt (1973). See also Rosenblatt (1975, 1991), Ahmad & Cerrito (1993) and Fan (1994, 1998) for some results in the multidimensional case with Euclidean data, that is, when the observations belong to an open subset of $\mathbb{R}^d$. All these papers used as criterion the $L_2$ distance between parametric and non-parametric (kernel-based) estimators of the unknown density. On the other hand, the literature on the $L_1$ distance is scarcer, and we refer to Cao & Lugosi (2005) for some developments.

In many applications, as is the case when dealing with directional data, the variables under study have an additional structure, and this structure needs to be taken into account in both the estimation and inference procedures. Several authors such as Hall et al. (1987), Fisher et al. (1993) and Mardia & Jupp (2000) discussed estimation methods for spherical and circular data. Beran (1979) considered the situation of exponential models for directional data and a goodness-of-fit test for nested models. Related to the problem of predicting the average wind speed to harvest electricity from wind energy, Hering and Genton (2010) showed the advantage of treating the wind direction as a circular variable (also, Genton & Hering, 2007). In this setting, modelling the wind direction distribution using parametric density families is an important issue, and so the aim of our work is to consider the problem of testing the hypothesis on whether the density f of a random variable on a sphere belongs to a given class of densities.

For this purpose, the problem of testing whether f belongs to a parametric class of densities is considered. Our test statistic is based on $L_p$ distances between a non-parametric density estimator of f(x) adapted to circular data and a smoothed version of a parametric estimator of the data density. In particular, we study the simple null hypothesis, that is, the situation in which we want to know if the sample density equals a completely specified density function $f_\circ$. In Section 2, we define the test statistics for each of the hypotheses to be considered. In Section 3, for the composite hypothesis, that is, for the problem of testing whether f belongs to a parametric class of densities, the asymptotic distribution of the $L_2$ test, under the null hypothesis and under a set of contiguous alternatives, is obtained. A discussion of the particular case of the $L_1$ distance test, including the possible difficulties and some open problems, is provided in Section 4. Besides, for the simple null hypothesis, we give a heuristic argument regarding the asymptotic properties of the $L_p$ test statistic under the null hypothesis. The study of the asymptotic properties of the test based on the $L_p$ distances, when p ≠ 2, for the composite hypothesis is much more delicate and is unknown even in the Euclidean setting. Bootstrap procedures and their validation for the $L_2$ distance are studied in Section 5. In Section 6, through a simulation study, we explore the performance of the test procedures introduced in this paper, for moderate sample sizes, under the null hypothesis and under a set of alternatives. Some of the obtained results can be seen in the online Supporting Information on the journal website. Finally, Section 7 presents a real data example based on wind direction measurements. Some conclusions are discussed in Section 8. Proofs are relegated to the Appendix.

### The test statistics

Let $x_1, \dots, x_n$ be independent observations of a random variable x taking values in the d-dimensional unit sphere $\mathcal{S}^d$ in $\mathbb{R}^{d+1}$, with probability density function f(x) on $\mathcal{S}^d$ such that $\int_{\mathcal{S}^d} f(x)\,\omega_d(dx) = 1$, where $\omega_d$ is the rotation-invariant measure on the sphere. We begin by considering the situation in which the null hypothesis is completely specified by a fixed density. In Section 2.2, we extend the procedure to the case of a composite null hypothesis.

#### Testing a simple null hypothesis

In this section, we study the problem of testing the hypothesis

$$H_\circ : f = f_\circ \quad \text{versus} \quad H_1 : f \neq f_\circ \tag{1}$$

at a specified significance level α, where $f_\circ$ is a fixed density function. A natural approach is to build the test statistic from a measure of discrepancy such as the $L_2$ or, more generally, the $L_p$ distance between the target density $f_\circ$ and a non-parametric estimator $f_n$, for example, a kernel estimator. Because the kernel estimator is biased, this measure needs to be modified. Instead of comparing the kernel estimator $f_n$ with $f_\circ$, the idea is to compare $f_n$ with its expected value under the null hypothesis. In the context of spherical data, kernel density estimators need to be adapted to the structure underlying the data. We will consider the kernel density estimator $f_n$ suggested by Bai et al. (1988) and Hall et al. (1987) and defined as

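$$f_n(x) = \frac{c(h)}{n} \sum_{i=1}^{n} K\left(\frac{1 - x^{\mathrm{T}} x_i}{h^2}\right), \tag{2}$$

where c(h) is a normalizing constant given by $c(h)^{-1} = \int_{\mathcal{S}^d} K\{(1 - x^{\mathrm{T}} y)/h^2\}\,\omega_d(dy)$ and $h = h_n$ stands for the smoothing parameter. Hence, the test statistic based on the $L_p$ distance is defined as

$$\int_{\mathcal{S}^d} \left| f_n(x) - K_h f_\circ(x) \right|^p \omega_d(dx), \tag{3}$$

where $K_h f_\circ(x) = c(h) \int_{\mathcal{S}^d} K\{(1 - x^{\mathrm{T}} y)/h^2\}\, f_\circ(y)\,\omega_d(dy)$ is the expected value of $f_n(x)$ under the null hypothesis and $f_n$ is the kernel estimator defined through (2). In particular, we will denote as $T_{\circ,n}$ the test statistic based on the $L_2$ distance.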
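The construction above can be sketched numerically on the circle (d = 1), where $x^{\mathrm{T}} x_i = \cos(\theta - \theta_i)$ for angles θ. The following is a minimal sketch, not the authors' implementation: it assumes the kernel estimator has the form $c(h)\,n^{-1}\sum_i K\{(1 - x^{\mathrm{T}} x_i)/h^2\}$, takes the profile K(r) = 1 − r on [0, 1] as a stand-in kernel, and approximates all integrals by a rectangle rule on a periodic grid. The names `GRID`, `kernel`, `norm_const`, `kde`, `smooth` and `lp_statistic` are illustrative.

```python
import numpy as np

# Periodic grid on [0, 2*pi) used for every numerical integral (rectangle rule).
GRID = np.linspace(0.0, 2.0 * np.pi, 512, endpoint=False)
DX = 2.0 * np.pi / GRID.size


def kernel(r):
    """Assumed kernel profile: K(r) = 1 - r on [0, 1], zero elsewhere."""
    r = np.asarray(r, dtype=float)
    return np.where((r >= 0.0) & (r <= 1.0), 1.0 - r, 0.0)


def norm_const(h):
    """c(h), with 1/c(h) = integral over the circle of K((1 - cos u) / h^2) du."""
    return 1.0 / (kernel((1.0 - np.cos(GRID)) / h**2).sum() * DX)


def kde(theta, sample, h):
    """Kernel estimate f_n at the angles theta, from a sample of angles."""
    diff = np.subtract.outer(np.asarray(theta, dtype=float),
                             np.asarray(sample, dtype=float))
    return norm_const(h) * kernel((1.0 - np.cos(diff)) / h**2).mean(axis=1)


def smooth(density, h):
    """K_h f on GRID: c(h) * integral of K((1 - cos(x - y)) / h^2) f(y) dy."""
    weights = kernel((1.0 - np.cos(np.subtract.outer(GRID, GRID))) / h**2)
    return norm_const(h) * (weights @ density(GRID)) * DX


def lp_statistic(sample, density, h, p=2):
    """Integral over the circle of |f_n - K_h f|^p; p = 2 and p = 1 give the two tests."""
    gap = np.abs(kde(GRID, sample, h) - smooth(density, h))
    return float(np.sum(gap**p) * DX)
```

Because `smooth` applies the same kernel and bandwidth to the hypothesised density, the smoothing bias of `kde` cancels in the difference, which is the motivation given in the text for comparing $f_n$ with its expected value rather than with $f_\circ$ itself.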

Zhao & Wu (2001) studied the asymptotic behaviour of the integrated squared error of the kernel estimator, $\int_{\mathcal{S}^d} \{f_n(x) - f(x)\}^2\,\omega_d(dx)$. Thus, as mentioned before, one may consider the integrated squared error computed under the null hypothesis, $\int_{\mathcal{S}^d} \{f_n(x) - f_\circ(x)\}^2\,\omega_d(dx)$, instead of $T_{\circ,n}$. However, as pointed out for Euclidean data by Fan (1994), the bias introduced in kernel density estimation has a significant influence on the integrated squared error. In particular, theorem 1 in Zhao & Wu (2001) entails that, depending on the rate of the bandwidth, that is, on the relation between bias and variance, three different rates of convergence are attained for the integrated squared error. To be more precise, if $nh^{d+4} \to \infty$, that is, if the data are oversmoothed, the bias will be large relative to the variance, so the integrated squared error converges at the rate $n^{1/2}h^{-2}$, whereas if the data are undersmoothed ($nh^{d+4} \to 0$), the rate of convergence is $nh^{d/2}$. Finally, if $nh^{d+4} \to \delta$, that is, if the bias is balanced with the variance, the rate of convergence is still $n^{1/2}h^{-2}$. As for the Euclidean case, the tests derived from the integrated squared error may have trivial power against certain Pitman alternatives when oversmoothing. For this reason, the statistic $T_{\circ,n}$ should be preferred because it removes the bias inherent in kernel density estimation. This idea was used by Härdle & Mammen (1993) to construct tests for a parametric regression model and also by Fan (1994) for testing the goodness of fit for a parametric density family. In particular, as shown in theorem 1, the asymptotic distribution of $T_{\circ,n}$ is the same whether the data are oversmoothed, optimally smoothed, or undersmoothed, and it is the same as that of the integrated squared error for undersmoothed data. As stated in theorem 2, the test based on $T_{\circ,n}$, similar to that based on the integrated squared error, detects Pitman alternatives with order of convergence $n^{1/2}h^{d/4}$; however, it should be preferred to the test based on the integrated squared error because it allows a wider range of smoothing parameters. It is also worth noticing that when $T_{\circ,n}$ is considered, only the continuity of the density function f is needed.

#### Testing a null composite hypothesis

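Let $\mathcal{F} = \{f_\beta : \beta \in \mathcal{B}\}$ be a family of density functions parametrized with a vector of parameters $\beta \in \mathcal{B}$, where $\mathcal{B}$ is a subset of $\mathbb{R}^q$. The parameter β will denote the indexing parameter of the family $\mathcal{F}$. We are interested in testing the composite hypothesis

$$H_\circ : f \in \mathcal{F} \quad \text{versus} \quad H_1 : f \notin \mathcal{F}. \tag{4}$$

Under $H_\circ$, we have $f = f_{\beta_\circ}$ for some $\beta_\circ \in \mathcal{B}$. In this case, a parametric estimator $f_{\hat{\beta}}$ of the density f needs to be considered, where $\hat{\beta}$ is an estimator of $\beta_\circ$. One way to proceed is to measure the distance between $f_{\hat{\beta}}$ and the non-parametric estimator $f_n$ and to use this distance for testing the parametric model. However, as in Section 2.1 and because of the bias of the non-parametric estimator, a better approach is to consider the $L_2$ distance between $f_n$ and a smooth version of $f_{\hat{\beta}}$. Therefore, the test statistic based on the $L_2$ distance is defined as

$$T_n = \int_{\mathcal{S}^d} \left\{ f_n(x) - K_h f_{\hat{\beta}}(x) \right\}^2 \omega_d(dx), \tag{5}$$

whereas, more generally, that based on $L_p$ distances is defined as

$$\int_{\mathcal{S}^d} \left| f_n(x) - K_h f_{\hat{\beta}}(x) \right|^p \omega_d(dx),$$

where $f_n$ and $f_{\hat{\beta}}$ are the kernel estimator defined through (2) and the parametric estimator, respectively, and $K_h f(x) = c(h) \int_{\mathcal{S}^d} K\{(1 - x^{\mathrm{T}} y)/h^2\}\, f(y)\,\omega_d(dy)$, as before.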

Typically, a root-n consistent estimator $\hat{\beta}$ of $\beta_\circ$ needs to be considered to ensure the proper rate of convergence. A good root-n consistent choice for $\hat{\beta}$ is the maximum likelihood estimator of $\beta_\circ$. It is well known that, under regularity conditions, maximum likelihood estimators are asymptotically normally distributed. In particular, when dealing with circular data, Cox (1974) studied the asymptotic properties of maximum likelihood estimators in the case of the von Mises distribution.
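For the von Mises family on the circle, the maximum likelihood estimator has a well-known form: the mean direction is the direction of the sample resultant vector, and the concentration solves $I_1(\kappa)/I_0(\kappa) = \bar{R}$, where $\bar{R}$ is the mean resultant length. The sketch below is illustrative only; the function names, the numerical integration of the Bessel ratio and the bisection bounds are assumptions made here, not choices taken from the paper.

```python
import numpy as np

# Grid used to evaluate the Bessel-function ratio by numerical integration.
_U = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)


def bessel_ratio(kappa):
    """A(kappa) = I1(kappa) / I0(kappa), computed as a weighted mean of cos(u)."""
    # Subtracting kappa in the exponent keeps the weights bounded for large kappa.
    w = np.exp(kappa * (np.cos(_U) - 1.0))
    return float(np.sum(np.cos(_U) * w) / np.sum(w))


def von_mises_mle(theta, kappa_max=500.0, tol=1e-10):
    """Maximum likelihood estimates (mu, kappa) from a sample of angles."""
    theta = np.asarray(theta, dtype=float)
    c, s = np.mean(np.cos(theta)), np.mean(np.sin(theta))
    mu = float(np.arctan2(s, c) % (2.0 * np.pi))  # mean direction in [0, 2*pi)
    r_bar = float(np.hypot(c, s))                 # mean resultant length
    lo, hi = 0.0, kappa_max                       # A(kappa) is increasing in kappa
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_ratio(mid) < r_bar:
            lo = mid
        else:
            hi = mid
    return mu, 0.5 * (lo + hi)
```

Bisection is used because the ratio is monotone in κ, so no derivative or starting value is needed; the upper bound `kappa_max` caps the estimate for nearly degenerate samples.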

### Asymptotic behaviour of the statistic based on the L2 distance

In this section, we derive the asymptotic distribution of the test statistic $T_n$ defined in (5), under the null hypothesis and under a sequence of regular contiguous alternatives. The proofs of these results are relegated to the Appendix.

It is worth noticing that, as the simple null hypothesis is a particular case of a composite hypothesis, the asymptotic null distribution of the test statistic $T_{\circ,n}$, defined in (3) for p = 2, may be derived from theorem 1, whereas its behaviour under contiguous alternatives may be derived from theorem 2. However, it should be pointed out that when considering the simple null hypothesis defined in (1), assumptions A3 and A5 are not needed.

From now on, denote by $\tau_d$ the surface area of $\mathcal{S}^d$, that is, $\tau_d = 2\pi^{(d+1)/2} / \Gamma((d+1)/2)$, for d ≥ 1. Let $\gamma_d$ and $g_d(r)$ be defined as, respectively,

and

To obtain the asymptotic distribution of the test statistics, we need the following assumptions:

• A1
The kernel is a bounded and integrable function with a compact support. Moreover, if d = 1, .
• A2
The density function f is continuous on $\mathcal{S}^d$.
• A3
The function fβ(x) is twice continuously differentiable with respect to β, and its partial derivatives are bounded and uniformly continuous with respect to (β,x).
• A4
The sequence h satisfies $nh^d \to \infty$ and $h \to 0$ as $n \to \infty$.
• A5
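There exists $\beta_1$ such that
(a) $\sqrt{n}\,(\hat{\beta} - \beta_1) = O_P(1)$, when $x_i \sim f$ (besides, when $f = f_{\beta_\circ}$, $\beta_1 = \beta_\circ$).

(b) $\sqrt{n}\,(\hat{\beta} - \beta_\circ) = O_P(1)$, when $x_i \sim f_{\beta_\circ} + c_n \Delta$, where $c_n = n^{-1/2} h^{-d/4}$.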

Remark 1. It is worth noting that A4 together with A1 entails that $h^d c(h) \to \lambda^{-1}$ as $n \to \infty$ with $\lambda = 2^{d/2-1}\,\tau_{d-1} \int_0^\infty K(r)\, r^{d/2-1}\,dr$, where $\tau_d$ is the surface area of $\mathcal{S}^d$ (Bai et al., 1988). Assumptions A1, A2, and A4 and the fact that $h^d c(h) \to \lambda^{-1}$ were used by Zhao & Wu (2001) to derive the limiting distribution of the integrated squared error of the kernel density estimator of f(x). As in Fan (1998), assumption A5 is introduced to examine the effect of estimating β on the asymptotic distribution of $T_n$. If the parametric distribution is correctly specified, that is, if $H_\circ$ holds, then $\beta_1 = \beta_\circ$, the true value of β. If $H_\circ$ is not true, then $\beta_1$ can be regarded as a pseudotrue value of β. As mentioned in Section 2.2, the validity of assumption A5 was verified, under $H_\circ$, in Cox (1974) under certain regularity conditions for the maximum likelihood estimator when considering the von Mises distribution. Beran (1979) extended these results to exponential families, which include the von Mises and Bingham distributions, and also provided a regression-based estimator, which turned out to be root-n consistent.

Theorem 1. Assume that A1–A4 and A5(a) hold. Then, under $H_\circ$, that is, when $f = f_{\beta_\circ}$, we have $nh^{d/2}\{T_n - b/(nh^d)\} \to N(0, \sigma^2)$ in distribution, where , and

Assume that when , A5 holds, that is, for some , and furthermore that . Hence, the dominated convergence theorem entails that . For a given significance level α, denote by zα the upper α-quantile of the normal distribution, that is, with Z ∼ N(0,1). Then, the test rejecting H ∘  when where provides a consistent test, as . As is well-known, consistency is a desirable property for test statistics, as a consistent test will reject a false null hypothesis with probability 1 asymptotically. An important issue for consistent tests is to study their local power properties under sequences of local alternatives. To achieve this goal, theorem 2 studies the behaviour of Tn under the sequence of regular contiguous alternatives defined by

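$$H_{1c} : f(x) = f_{\beta_\circ}(x) + n^{-1/2} h^{-d/4}\, \Delta(x), \tag{6}$$

where $\int_{\mathcal{S}^d} \Delta(x)\,\omega_d(dx) = 0$. Local power properties for this family of Pitman alternatives were studied by Fan (1994) for Euclidean data.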

Theorem 2. Assume that A1–A4 and A5(b) hold. Then, under $H_{1c}$ defined in (6), we have , where λ, b, and $\sigma^2$ are defined in theorem 1.

### Some remarks on the Lp distance test statistic when p ≠ 2

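As mentioned in Section 1, the $L_1$ or, more generally, any $L_p$ distance for p ≠ 2 can also be considered to measure the discrepancy between the kernel estimator and a smooth version of the hypothesised density. To be more precise, if we want to test $f = f_\circ$, one may consider the statistic $W_{\circ,n} = \int_{\mathcal{S}^d} |f_n(x) - K_h f_\circ(x)|\,\omega_d(dx)$, whereas when the composite hypothesis (4) is tested, the $L_1$ test statistic equals

$$W_n = \int_{\mathcal{S}^d} \left| f_n(x) - K_h f_{\hat{\beta}}(x) \right| \omega_d(dx), \tag{7}$$

where $f_n$ and $f_{\hat{\beta}}$ are as in Section 2.2.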

It is worth noting that the asymptotic distribution of the test based on the $L_p$ distance, for p ≠ 2, is more delicate. In the case of real variables, that is, when $x \in \mathbb{R}$, a central limit theorem for the $L_p$ distance when p ≥ 1 is stated by Csörgö & Horváth (1987), whereas the proofs of the given results are provided by Csörgö & Horváth (1988). In particular, they proved that, when $h_n \to 0$, $nh_n \to \infty$, and K satisfies standard conditions, the $L_1$ distance between the kernel density estimator and its expected value is asymptotically normally distributed. More precisely, they obtained that $\sqrt{n}\,W_{\circ,n}$, suitably centred, converges in distribution to $N(0, \sigma^2(K))$, where $\sigma^2(K)$ is a constant depending on the kernel and $W_{\circ,n}$ is the $L_1$ test statistic adapted to one-dimensional data.

The important property in this result is the rate of convergence. Instead of the rate $nh^{1/2}$ attained by the $L_2$ distance statistic when d = 1 and $x \in \mathbb{R}$ (Fan, 1994, 1998), the rate $\sqrt{n}$ is achieved. An extension of the results used by Csörgö & Horváth (1987) to the situation of bivariate vectors was provided by Horváth (1991).

As mentioned earlier, under the simple null hypothesis, $H_\circ : f(x) = f_\circ(x)$, theorem 1 entails that $nh^{d/2}\{T_{\circ,n} - b/(nh^d)\}$ is asymptotically normally distributed. A systematic study of the asymptotic distribution of the $L_p$ test statistic defined in (3), under $H_\circ$, requires Poissonization techniques similar to those considered by Horváth (1991). This interesting topic may be the subject of future research. However, some standard computations allow us to heuristically derive that

is asymptotically normally distributed, where with N ∼ N(0,1).

In particular, for directional data, the test statistic W ∘ ,n based on the L1 distance will also have a root-n rate of convergence when testing the simple null hypothesis. We expect that the same rate is preserved when considering a composite parametric hypothesis if the estimator has a root-n rate of convergence.

In the L1 situation, the asymptotic behaviour of the test statistic for the composite null hypothesis is in fact much more important and challenging, as usually the parametric estimator is also root-n consistent. Hence, the asymptotic behaviour of both density estimators, the parametric and non-parametric ones, will be relevant for the asymptotic distribution of the test statistics Wn. As in the case of statistics based on the empirical process for real variables, such as the Kolmogorov–Smirnov or Anderson–Darling statistics, the asymptotic distribution will be much more complicated, and we argue that it will depend on the covariance between the expansion of the parametric component and the asymptotic component of the non-parametric one.

### The bootstrap test

The discussion given in the previous sections motivates the use of bootstrap methods. In particular, for the $L_2$ test statistic, the rate of convergence is $nh^{d/2}$, and so, as in other non-parametric situations, we may expect that the normal approximation will not work well for moderate sample sizes. We briefly discuss this fact in Section 6. To provide an alternative to the asymptotic distribution of the test statistics defined in Section 2, we study a bootstrap procedure for each of the hypotheses considered. For this purpose, we will use a parametric bootstrap by generating independent bootstrap samples according to the density $f_\circ$ or $f_{\hat{\beta}}$, depending on the hypothesis. The use of the parametric bootstrap in goodness of fit dates back to Stute et al. (1993), who applied it to a test statistic based on the cumulative distribution. Recently, Genest & Rémillard (2008) extended the result to goodness-of-fit statistics derived from processes having a $\sqrt{n}$ rate of convergence. Besides, for data belonging to an open subset of $\mathbb{R}^d$, Fan (1998) considered a parametric bootstrap goodness-of-fit test. His approach approximates the critical values of a weighted $L_2$ distance between the empirical characteristic function and its parametric estimate under the null model. Later on, Neumann & Paparoditis (2000) considered the independent and identically distributed parametric bootstrap to check parametric hypotheses about the stationary density of weakly dependent observations, based also on the $L_2$ distance between non-parametric and smoothed versions of the parametric density estimator. Several authors, such as Berg (2009), Genest et al. (2009), and Bücher & Dette (2010), have investigated the performance of the parametric bootstrap when considering goodness-of-fit tests for the parametric form of the copula, also based on $L_2$ distances, showing that it yields reliable approximations to the nominal level.

#### Description of the bootstrap tests

Consider first the case of testing a simple null hypothesis. In this case, the distribution of the variables x1, … ,xn is completely known under the null hypothesis. Then, we may consider the bootstrap statistic constructed by generating independent bootstrap samples according to the density f ∘ . The bootstrap procedure in this case is in fact a Monte Carlo approximation and can be described as follows:

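• Step 1.
Generate a random sample of size n, $x_1^*, \dots, x_n^*$, from the distribution $f_\circ$.
• Step 2.
Compute the $L_1$ or $L_2$ statistics with the bootstrap sample and denote them as $W_{\circ,n}^*$ and $T_{\circ,n}^*$, respectively, with $W_{\circ,n}^* = \int_{\mathcal{S}^d} |f_n^*(x) - K_h f_\circ(x)|\,\omega_d(dx)$ and $T_{\circ,n}^* = \int_{\mathcal{S}^d} \{f_n^*(x) - K_h f_\circ(x)\}^2\,\omega_d(dx)$, where $f_n^*$ is the kernel estimator based on $x_1^*, \dots, x_n^*$.
• Step 3.
Repeat steps 1 and 2, B times, and form the empirical distributions of the B values of $T_{\circ,n}^*$ and $W_{\circ,n}^*$. Compute $t_{\circ,n,\alpha}$ (or $w_{\circ,n,\alpha}$), the upper α-percentile of the empirical distribution of $T_{\circ,n}^*$ (or $W_{\circ,n}^*$).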

Then, the bootstrap procedure based on either the L1 or L2 distance rejects H ∘  if W ∘ ,n > w ∘ ,n,α or T ∘ ,n > t ∘ ,n,α, respectively.
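Steps 1 to 3 for the simple null hypothesis amount to a Monte Carlo approximation of the null distribution, which can be sketched generically as follows. This is a hedged illustration rather than the authors' code: `statistic` and `null_sampler` are hypothetical callables supplied by the user (for instance, an $L_1$ or $L_2$ distance statistic and a sampler from $f_\circ$), and the returned p-value is simply the proportion of bootstrap statistics at least as large as the observed one.

```python
import numpy as np


def simple_null_bootstrap(sample, statistic, null_sampler, B=500, alpha=0.05, seed=0):
    """Monte Carlo test of a fully specified null density.

    statistic(sample) -> float; null_sampler(n, rng) -> array of n draws from f_0.
    """
    rng = np.random.default_rng(seed)
    n = len(sample)
    observed = float(statistic(sample))
    # Steps 1 and 2, repeated B times: simulate under the null and recompute.
    boot = np.array([statistic(null_sampler(n, rng)) for _ in range(B)])
    # Step 3: upper alpha-percentile of the bootstrap statistics.
    critical = float(np.quantile(boot, 1.0 - alpha))
    p_value = float(np.mean(boot >= observed))
    return {
        "observed": observed,
        "critical": critical,
        "p_value": p_value,
        "reject": bool(observed > critical),
    }
```

For the null density used later in the simulations, a sampler could be written as `lambda n, rng: rng.vonmises(np.pi, 5.0, n)`.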

When the composite null hypothesis (4) is considered, the bootstrap procedure is defined by generating samples according to a parametric density estimator obtained using the original observations. Let $\hat{\beta}$ be a root-n consistent estimator of the indexing parameter $\beta_\circ$ and denote by $f_{\hat{\beta}}$ the parametric estimator of f, under $H_\circ$, obtained from it. The bootstrap procedure can be described as follows:

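• Step 1.
Generate a random sample of size n, $x_1^*, \dots, x_n^*$, from the distribution $f_{\hat{\beta}}$.
• Step 2.
Compute the $L_1$ or $L_2$ statistics defined through (5) and (7) with the bootstrap samples and denote them by $W_n^*$ and $T_n^*$, respectively, where $W_n^* = \int_{\mathcal{S}^d} |f_n^*(x) - K_h f_{\hat{\beta}^*}(x)|\,\omega_d(dx)$ and $T_n^* = \int_{\mathcal{S}^d} \{f_n^*(x) - K_h f_{\hat{\beta}^*}(x)\}^2\,\omega_d(dx)$, with $f_n^*$ and $\hat{\beta}^*$ as the kernel density and the indexing parameter estimator based on $x_1^*, \dots, x_n^*$, respectively.
• Step 3.
Repeat steps 1 and 2, B times, and form the empirical distributions of the B values of $T_n^*$ and $W_n^*$. Compute $t_{n,\alpha}$ (or $w_{n,\alpha}$), the upper α-percentile of the empirical distribution of $T_n^*$ (or $W_n^*$).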

Then, we will reject H ∘  if Tn > tn,α (or Wn > wn,α).

Note that the same family of estimators of the parameter β needs to be used both when generating the sample and in step 2. To be more precise, if, for instance, the maximum likelihood estimator is used to compute $\hat{\beta}$, then in step 2, $\hat{\beta}^*$ is the maximum likelihood estimator of the indexing parameter based on the bootstrap sample $x_1^*, \dots, x_n^*$.
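The point made above, that the parameter must be re-estimated on every bootstrap sample with the same estimator used on the data, is the main difference from the simple-null procedure. A minimal sketch follows, with hypothetical callables `fit`, `sampler` and `statistic` standing in for the estimator of β, a sampler from the fitted density, and the distance statistic; none of these names come from the paper.

```python
import numpy as np


def composite_null_bootstrap(sample, fit, sampler, statistic, B=500, alpha=0.05, seed=0):
    """Parametric bootstrap test of a parametric family.

    fit(sample) -> beta_hat
    sampler(beta, n, rng) -> array of n draws from the fitted density f_beta
    statistic(sample, beta) -> float (distance between the kernel estimate
                               and the smoothed fitted density)
    """
    rng = np.random.default_rng(seed)
    n = len(sample)
    beta_hat = fit(sample)
    observed = float(statistic(sample, beta_hat))
    boot = np.empty(B)
    for b in range(B):
        star = sampler(beta_hat, n, rng)      # Step 1: draw from the fitted density
        beta_star = fit(star)                 # re-estimate with the same estimator
        boot[b] = statistic(star, beta_star)  # Step 2: bootstrap statistic
    critical = float(np.quantile(boot, 1.0 - alpha))  # Step 3
    return observed, critical, bool(observed > critical)
```

Skipping the refit and reusing `beta_hat` inside the loop would ignore the variability due to estimating β, so the bootstrap distribution would no longer mimic that of the statistic computed on the data.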

#### Validity of the bootstrap procedure

When the simple null hypothesis (1) is considered, it follows immediately that the null distribution of the bootstrap statistic is the same as that of the original test statistic. This fact entails the validity of the bootstrap procedure for the $L_p$ distance tests and, in particular, for both the $L_1$ and $L_2$ distance test statistics, when we consider a simple null hypothesis.

On the other hand, for the composite hypothesis, we mentioned, in Section 4, the difficulties arising when dealing with the $L_p$ distance. Therefore, we will only study the null asymptotic properties of the test statistic based on the $L_2$ distance, $T_n$, when $H_\circ$ is defined through (4). To derive the validity of the bootstrap method for the composite null hypothesis, we will need the following additional assumption related to the asymptotic behaviour of the parametric bootstrap estimator.

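• A6 $\sqrt{n}\,(\hat{\beta}^* - \hat{\beta}) = O_{P^*}(1)$, where $P^*$ refers to the conditional distribution of the $x_i^*$'s on the $x_i$'s. Moreover, $\hat{\beta} \to \beta_1$ almost surely.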

Theorem 3. Assume that A1–A4, A5(a), and A6 hold. Then, conditional on $x_1, \dots, x_n$, we have $nh^{d/2}\{T_n^* - b/(nh^d)\} \to N(0, \sigma^2)$ in distribution, in probability, where λ, b, and $\sigma^2$ are defined in theorem 1.

Note that theorem 3 holds regardless of whether the null hypothesis is true or not because the bootstrap samples are generated according to $f_{\hat{\beta}}$. Therefore, the bootstrap procedure leads to a consistent test. Specifically, the following occur:

1. Under H ∘ , the bootstrap distribution of Tn converges to the asymptotic null distribution of Tn, so the asymptotic significance level of the test Tn based on the bootstrap critical value tn,α is indeed α.

2. When the null hypothesis is false, that is, when , we have , so that the test statistic nhd / 2(T n − b / (nhd)) converges to infinity, whereas asymptotically, the bootstrap critical value is still finite.

### Simulations

This section contains the results of a simulation study in the circle (d = 1), designed to evaluate the performance of the test procedures under H ∘  and under different alternatives, and is dedicated to exploring numerically different aspects regarding the finite sample performance of the proposed tests. The results of a Monte Carlo study conducted to study the performance of the L2 test statistics when using the asymptotic approximation derived in Section 3 can be seen in the online Supporting Information on the journal website. The results obtained therein show that approximations of the critical values, as those described in Section 5, are needed. For this reason, the main goal of this section is to validate numerically the good performance of the bootstrap procedure for both the L2 and L1 distances for the simple null hypothesis. A composite hypothesis was also considered, and the obtained results are given in the online Supporting Information.

The null hypothesis corresponds to a von Mises (circular normal) distribution. The von Mises distribution has a density function $f_{\mu,\kappa}(x) = \exp\{\kappa\, x^{\mathrm{T}} \boldsymbol{\mu}\} / (2\pi I_0(\kappa))$, with $\boldsymbol{\mu} = (\cos\mu, \sin\mu)^{\mathrm{T}}$, where 0 ≤ μ < 2π is the mean parameter, κ > 0 is the concentration parameter, and $I_0(\kappa)$ stands for the modified Bessel function of the first kind and order zero, that is, $I_0(\kappa) = (2\pi)^{-1} \int_0^{2\pi} \exp\{\kappa \cos\theta\}\,d\theta$. This model has many important applications, as described by Mardia & Jupp (2000) and Jammalamadaka & Sengupta (2001). In particular, when $\kappa \to 0$, the distribution converges to a uniform distribution on the sphere, whereas the larger the value of κ, the greater is the clustering around the mean parameter μ, which is also the mode. It provides an adequate model for phenomena that are rotationally symmetric around μ. In particular, Fisher (1953) described this distribution in detail when d = 2, in the context of statistical mechanics and palaeomagnetism problems. The von Mises distribution has served as a probability model for directions in the plane and is the natural analogue of the normal distribution on $\mathcal{S}^1$ (see, for instance, Fisher, 1995, for applications on real circular data sets).
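Written in terms of the angle θ of x, the exponent $\kappa\, x^{\mathrm{T}} \boldsymbol{\mu}$ becomes κ cos(θ − μ), so the density is straightforward to evaluate. The short sketch below uses NumPy's `np.i0` for the Bessel function; the function name `von_mises_density` is illustrative.

```python
import numpy as np


def von_mises_density(theta, mu, kappa):
    """Von Mises density at angle(s) theta with mean mu and concentration kappa."""
    theta = np.asarray(theta, dtype=float)
    return np.exp(kappa * np.cos(theta - mu)) / (2.0 * np.pi * np.i0(kappa))
```

Evaluating it on a fine grid of angles gives a quick check that the density integrates to one and that very small κ reproduces the uniform density 1/(2π).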

Under the null hypothesis, we generated x1, … ,xn random variables from a von Mises density with mean equal to π and concentration parameter 5, that is, f ∘  = fπ,5. More precisely, the simple null hypothesis corresponds to testing H ∘ : f = f ∘ . We have performed NR = 1000 replications, whereas for the bootstrap procedure, we considered B = 5000 replications.

In the smoothing procedure, we have used Epanechnikov's kernel with bandwidth parameter h. When performing the bootstrap approximation, we selected several bandwidths to investigate the sensitivity of the tests, in both level and power, with respect to bandwidth choice. In particular, the effects of undersmoothing and oversmoothing on the test power are investigated.

To analyse the performance of bootstrap tests under the null and alternative hypotheses, the sample size is taken as n = 100, whereas the bandwidths considered are h = 0.05, 0.1, 0.25, 0.5, 0.7, 0.9, 1.1, and 1.2. We select four particular alternatives denoted by $H_\delta$ for δ = 0.1, 0.2, 0.35, and 0.5. They correspond to generated observations $x_1, \dots, x_n$ with density $f_\delta = (1 - \delta) f_\circ + \delta f_1$, where $f_\circ = f_{\pi,5}$ and $f_1 = f_{\pi/2,5}$. Figure S1 in the online Supporting Information on the journal website shows the plot of the densities $f_\delta$ chosen as alternatives.
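Observations from the mixture alternatives can be generated by first choosing a component with probability δ and then drawing from the corresponding von Mises density. The following sketch is an assumption about how such data might be simulated; the function name and its defaults (which mirror the means π and π/2 and the concentration 5 stated above) are illustrative.

```python
import numpy as np


def sample_mixture(n, delta, rng, mu0=np.pi, mu1=np.pi / 2.0, kappa=5.0):
    """Draw n angles from (1 - delta) * vM(mu0, kappa) + delta * vM(mu1, kappa)."""
    from_alternative = rng.random(n) < delta    # component label for each draw
    means = np.where(from_alternative, mu1, mu0)
    # Generator.vonmises accepts an array of means; wrap the angles to [0, 2*pi).
    return rng.vonmises(means, kappa) % (2.0 * np.pi)
```

Setting `delta=0` reproduces sampling from the null density, so the same function can serve for both the level and the power parts of a simulation.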

This set of alternatives can be written as in (6), taking $\Delta(x) = b(\delta)\,(f_1(x) - f_\circ(x))$, where $b(\delta)$ depends on the alternative index δ and also on the selected bandwidth. The values of $b(\delta)$ are given in Table S3 in the online Supporting Information on the journal website to make fair comparisons of the obtained results. For instance, the results for δ = 0.2 and h = 0.05 need to be compared with those related to δ = 0.1 if the bandwidth lies between 0.5 and 0.7.

For the simple hypothesis H ∘ : f = f ∘  = fπ,5, the observed frequencies of rejection of the bootstrap tests based on the L2 and L1 distances, at the 5% level, are reported in Fig. 1 and Table 1. Table 1 shows the improvement attained in level when the finite sample approximation provided by the bootstrap test is used. For almost all the range of bandwidths, the observed level of the bootstrap test is quite close to the nominal one, when the sample size is n = 100.

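Table 1. Observed frequencies of rejection of the bootstrap test at the 5% level for the simple hypothesis $H_\circ : f = f_{\pi,5}$ and different values of the smoothing parameter h. $H_\delta$ stands for the alternative hypothesis when we consider $f = f_\delta = (1 - \delta) f_\circ + \delta f_1$ with $f_\circ = f_{\pi,5}$ and $f_1 = f_{\pi/2,5}$, for δ = 0.1, 0.2, 0.35, and 0.5

| Distance | Hypothesis | h = 0.05 | h = 0.1 | h = 0.25 | h = 0.5 | h = 0.7 | h = 0.9 | h = 1.1 | h = 1.2 |
|---|---|---|---|---|---|---|---|---|---|
| L2 | $H_\circ$ | 0.047 | 0.048 | 0.056 | 0.051 | 0.053 | 0.050 | 0.050 | 0.050 |
| L1 | $H_\circ$ | 0.055 | 0.040 | 0.051 | 0.049 | 0.051 | 0.050 | 0.050 | 0.049 |
| L2 | $H_{0.1}$ | 0.025 | 0.055 | 0.218 | 0.518 | 0.654 | 0.727 | 0.763 | 0.780 |
| L1 | $H_{0.1}$ | 0.293 | 0.386 | 0.51 | 0.673 | 0.763 | 0.802 | 0.811 | 0.803 |
| L2 | $H_{0.2}$ | 0.073 | 0.430 | 0.882 | 0.990 | 0.996 | 0.997 | 0.997 | 0.997 |
| L1 | $H_{0.2}$ | 0.801 | 0.917 | 0.979 | 0.995 | 0.998 | 0.998 | 0.997 | 0.997 |
| L2 | $H_{0.35}$ | 0.777 | 0.994 | 1 | 1 | 1 | 1 | 1 | 1 |
| L1 | $H_{0.35}$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| L2 | $H_{0.5}$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| L1 | $H_{0.5}$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |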

The results of the power study reported in Table 1 and Fig. 1 indicate that both the L2 and L1 bootstrap test statistics have a good performance under the null hypothesis. As mentioned before, both tests attain the significance level for all the bandwidths considered. Besides, the test based on distance L1 detects more easily the alternatives considered for different values of bandwidth. Note that when δ = 0.2, the observed frequency of rejection is larger than 0.8 for all the bandwidths. On the other hand, the L2 test shows a loss of power for small bandwidths. This fact may be related to the rate of convergence of the L2 test, which depends on the bandwidth parameter.

### Real data example

In this section, we applied the proposed bootstrap test procedure to a real data set. The original data are measurements of the wind directions recorded each minute daily in two meteorological stations at Galicia, in the northwest of Spain. The stations will be referred to as B1 and C9, and their locations are plotted in Fig. 2. The observations studied here correspond to August 2009 with a sample size of 26,426. In the figures and tables, we denote as zero the north direction, and we measure the data in radians clockwise. The rose diagrams given in Fig. 3 show that the wind directions in both stations have a unimodal distribution with mean around the north.

Let f be the density function of the wind direction. For both stations, the null hypothesis considered was H ∘ : f ∈ {fμ,κ; (μ,κ) ∈ [0,2π) × (0, + ∞ )}, where fμ,κ is the density function of a von Mises distribution with mean μ and concentration parameter κ as described in Section 6. Table 2 reports the values of the maximum likelihood estimators of the parameters (μ,κ). For the null hypothesis of interest, we compute the p-values for the two bootstrap tests taking B = 5000 bootstrap replications. The p-values were computed for different bandwidths to analyse the sensitivity of the results. For each station, the set of bandwidths chosen includes the selector obtained via a cross-validation procedure using the Kullback–Leibler distance (for example, Marron, 1987). To perform the cross-validation, a grid with a length of 46 equally spaced bandwidths between 0.001 and 2π was taken. For both stations, the data-driven cross-validation bandwidth was 1.727154. Figure S3 in the online Supporting Information on the journal website shows the plots of the parametric density estimator and the non-parametric estimators for different values of bandwidths.
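One common reading of cross-validation based on the Kullback–Leibler distance is likelihood cross-validation, in which the bandwidth maximizes the leave-one-out log-likelihood of the kernel estimator. The sketch below follows that reading, which is an assumption here rather than a statement of the authors' exact procedure; it also assumes the circular kernel estimator $c(h)\,n^{-1}\sum_i K\{(1 - \cos(\theta - \theta_i))/h^2\}$ with the stand-in profile K(r) = 1 − r on [0, 1]. All function names are illustrative.

```python
import numpy as np


def kernel(r):
    """Assumed kernel profile: K(r) = 1 - r on [0, 1], zero elsewhere."""
    r = np.asarray(r, dtype=float)
    return np.where((r >= 0.0) & (r <= 1.0), 1.0 - r, 0.0)


def loo_log_likelihood(sample, h, grid_size=2048):
    """Sum over i of log f_{n,-i}(theta_i), the leave-one-out kernel estimate."""
    sample = np.asarray(sample, dtype=float)
    u = np.linspace(0.0, 2.0 * np.pi, grid_size, endpoint=False)
    c_h = 1.0 / (kernel((1.0 - np.cos(u)) / h**2).sum() * 2.0 * np.pi / grid_size)
    k = kernel((1.0 - np.cos(np.subtract.outer(sample, sample))) / h**2)
    np.fill_diagonal(k, 0.0)                   # leave observation i out
    loo = c_h * k.sum(axis=1) / (sample.size - 1)
    with np.errstate(divide="ignore"):         # log(0) = -inf for isolated points
        return float(np.sum(np.log(loo)))


def select_bandwidth(sample, bandwidths):
    """Return the bandwidth with the largest leave-one-out log-likelihood."""
    scores = [loo_log_likelihood(sample, h) for h in bandwidths]
    return float(bandwidths[int(np.argmax(scores))]), scores
```

A grid matching the description above would be `np.linspace(0.001, 2 * np.pi, 46)`. The pairwise matrix has n² entries, so for a sample as large as the one analysed here the sum would need to be computed in blocks or on a subsample.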

Table 2. Maximum likelihood estimators of the parameters (μ, κ) under a von Mises parametric model for stations B1 and C9

| Station | $\hat{\mu}$ | $\hat{\kappa}$ |
|---|---|---|
| B1 | 0.6086 | 0.645 |
| C9 | 0.181 | 0.4697 |

Note that, for both B1 and C9, a bandwidth of 0.8 shows a good behaviour that is quite similar to that provided by the parametric estimation. Besides, the bandwidth obtained using the cross-validation procedure produces oversmoothed estimations. The obtained p-values for nine different bandwidths are reported in Table 3 and Fig. 4.

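Table 3. Empirical p-values for stations B1 and C9

| Station | Test | h = 0.5 | h = 0.6 | h = 0.7 | h = 0.8 | h = 1.2 | h = 1.4 | h = 1.6 | h = 1.8 | h = 2 |
|---|---|---|---|---|---|---|---|---|---|---|
| B1 | L2 | 0.6212 | 0.6138 | 0.6006 | 0.5954 | 0.5808 | 0.6008 | 0.6726 | 0.8674 | 1.0000 |
| B1 | L1 | 0.8548 | 0.9004 | 0.9298 | 0.9714 | 0.9994 | 1.0000 | 1.0000 | 1.0000 | 0.9998 |
| C9 | L2 | 0.5182 | 0.5134 | 0.5122 | 0.5340 | 0.5838 | 0.5448 | 0.5890 | 0.6192 | 0.8610 |
| C9 | L1 | 0.2856 | 0.3430 | 0.4054 | 0.4500 | 0.5214 | 0.8186 | 0.9314 | 0.75787 | 0.6588 |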

From the obtained results, for both meteorological stations, the wind directions can be assumed to have a von Mises distribution. In both cases, for all bandwidths considered, the empirical p-values imply that the null hypothesis cannot be rejected.

### Concluding remarks

Two test statistics were introduced to test if a sample from directional data has a density that belongs to a given parametric family. The asymptotic distribution of the test based on the L2 distance was given both under the null hypothesis and under a set of contiguous alternatives. The asymptotic performance of L1 deserves further study, but we conjecture from the results obtained in the Euclidean case that the order of convergence of the L2 test may be improved using instead the L1 distance statistic.

Our simulation study illustrates the well-known disadvantage of tests based on non-parametric estimators because, even if they have a normal distribution, their convergence towards the normal distribution is very slow. For this reason, approximations of critical values are needed. The bootstrap procedure proposed overcomes this problem, as the nominal level is attained for moderate sample sizes.

Finally, the results in Sections 6 and 7 suggest that a deeper study to select the smoothing parameter for testing problems is needed. This topic requires further careful investigation and is still an open problem, even in the setting of Euclidean data.

### Acknowledgements

This work was carried out while Daniela Rodriguez was visiting the Department of Statistics and Operations Research of the Universidad de Santiago de Compostela, supported by a postdoctoral fellowship from CONICET. She is very grateful to all the members of the Group of Statistics for their kind hospitality. This research was partially supported by grants from the Universidad de Buenos Aires, the CONICET, and the ANPCYT, Argentina, and also by a Spanish grant from the Ministerio Español de Ciencia e Innovación and a XUNTA grant from Galicia, Spain.

We wish to thank the Editor, the Associate Editor, and two anonymous referees for valuable comments that led to an improved version of the original paper.

### References

• Ahmad & Cerrito (1993). Goodness of fit tests based on the L2-norm of multivariate probability density functions. J. Nonparametr. Stat. 2, 169–181.
• Bai, Rao & Zhao (1988). Kernel estimators of density function of directional data. J. Multivariate Anal. 27, 24–39.
• Beran (1979). Exponential models for directional data. Ann. Stat. 7, 1162–1178.
• Berg (2009). Copula goodness-of-fit testing: an overview and power comparison. Eur. J. Financ. 15, 675–701.
• Bickel & Rosenblatt (1973). On some global measures of the deviations of density function estimates. Ann. Stat. 1, 1071–1095.
• Bücher & Dette (2010). Some comments on goodness-of-fit tests for the parametric form of the copula based on L2-distances. J. Multivariate Anal. 101, 749–763.
• Cao & Lugosi (2005). Goodness-of-fit tests based on the kernel density estimator. Scand. J. Stat. 32, 599–616.
• Cox & Hinkley (1974). Theoretical statistics, Chapman & Hall, London.
• Csörgő & Horváth (1987). Asymptotics for Lp-norms of kernel estimators of densities. Comput. Stat. Data Anal. 6, 241–250.
• Csörgő & Horváth (1988). Central limit theorems for Lp-norms of density estimators. Probab. Theory Related Fields 80, 269–291.
• Fan (1994). Testing the goodness of fit of a parametric density function by kernel method. Economet. Theor. 10, 316–356.
• Fan (1998). Goodness-of-fit tests based on kernel density estimators with fixed smoothing parameters. Economet. Theor. 14, 604–621.
• Fisher, N. I. (1995). Statistical analysis of circular data, Cambridge University Press, Cambridge.
• Fisher, Lewis & Embleton (1993). Statistical Analysis of Spherical Data, Cambridge University Press, Cambridge.
• Fisher, R. A. (1953). Dispersion on a sphere. Proc. R. Soc. London, Ser. A 217, 295–305.
• Genest & Rémillard (2008). Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Ann. Inst. H. Poincaré Probab. Statist. 44, 1096–1127.
• Genest, Rémillard & Beaudoin (2009). Goodness-of-fit tests for copulas: a review and a power study. Insur. Math. Econ. 44, 199–213.
• Genton & Hering (2007). Blowing in the wind. Significance 4, 11–14.
• Hall (1984). Central limit theorem for integrated square error of multivariate nonparametric density estimators. J. Multivariate Anal. 14, 1–16.
• Hall, Watson & Cabrera (1987). Kernel density estimation with spherical data. Biometrika 74, 751–762.
• Härdle & Mammen (1993). Comparing nonparametric versus parametric regression fits. Ann. Stat. 21, 1926–1947.
• Hering & Genton (2010). Powering up with space–time wind forecasting. J. Am. Stat. Assoc. 105, 92–104.
• Horváth (1991). On Lp-norms of multivariate density estimators. Ann. Stat. 19, 1933–1949.
• Jammalamadaka & SenGupta (2001). Topics in Circular Statistics, Multivariate Analysis, vol. 5, World Scientific, Singapore.
• Mardia & Jupp (2000). Directional data, Wiley, New York.
• Marron (1987). A comparison of cross validation techniques in density estimation. Ann. Stat. 13, 152–162.
• Neumann & Paparoditis (2000). On bootstrapping L2-type statistics in density testing. Stat. Probabil. Lett. 50, 137–147.
• Rosenblatt (1975). A quadratic measure of deviation of two-dimensional density estimates and a test of independence. Ann. Stat. 3, 1–14.
• Rosenblatt (1991). Stochastic curve estimation, NSF-CBMS Regional Conference Series in Probability and Statistics, vol. 3, Institute of Mathematical Statistics, Hayward, California.
• Stute, González-Manteiga & Presedo-Quindimil (1993). Bootstrap based goodness-of-fit tests. Metrika 40, 243–256.
• Zhao & Wu (2001). Central limit theorem for integrated square error of kernel estimators of spherical density. Sci. China, Ser. A 44, 474–483.

### Appendix: Proofs

Proof of theorem 1. Denoting , we have the expansion with , , and

Note that, under H ∘ , . Hence, assumptions A1, A2, and A4 together with lemma 4 in Zhao & Wu (2001) entail that , whereas lemma 6 in Zhao & Wu (2001) leads us to with Z ∼ N(0,1). As in the proof of lemma 4 in Zhao & Wu (2001), lemma 1(i) in Zhao & Wu (2001) together with assumptions A1, A2, and A4 imply that

Therefore, it only remains to be shown that

• (8)
• (9)

Using a Taylor expansion of order 1 and the fact that, under H ∘ , , we obtain

where ξn is an intermediate point between and β ∘ . Then, lemma 1 in Zhao & Wu (2001), the dominated convergence theorem, and the fact that from A5(a) has a root-n order of convergence entail that , concluding the proof of ‘(8)’.

Likewise, under H ∘ , a second-order Taylor expansion gives , where

Using lemma 1 from Zhao & Wu (2001) and assumptions A1 and A3, we obtain . Therefore, as , we have , which together with the fact that implies that the first term of $nh^{d/2}T_{n4}$ converges to 0 in probability. Straightforward calculations allow us to show that the second term of Tn4 is negligible compared with the first one, as from A5(a), , concluding the proof of ‘(9)’.□

Proof of theorem 2. We have the expansion , where Tnj are defined as in the proof of theorem 1, with . Note that Tn4 = Tn41 + 2Tn42, where

and

As in the proof of theorem 1, we have . Besides, , where

Using lemma 1 in Zhao & Wu (2001) and the dominated convergence theorem, we can easily derive that , as A1 and A2 hold. Therefore, . Arguing as in theorem 1, we obtain , which entails that .

On the other hand, A1 and A2 and lemma 1 in Zhao & Wu (2001) imply that . Similarly, using from A5(b) and the Cauchy–Schwarz inequality, it is easy to see that . Besides, as in theorem 1, , and so .

Then, as in the proof of theorem 1, we obtain

which concludes the proof.□

Proof of theorem 3. Let , and note that

Using arguments analogous to those considered in the proof of theorem 1, when deriving the asymptotic behaviour of Tn3 and Tn4 and using A6, we obtain . Then, it is enough to obtain the asymptotic behaviour of . Let

where and denotes expectation with respect to the bootstrap distribution. Then, . The term is a quadratic form with . Then, as in lemma 6 in Zhao & Wu (2001), from A1 and A2, the fact that is a consistent estimator of β1, and the dominated convergence theorem, we obtain . Arguing as in lemma 6 in Zhao & Wu (2001), but conditional on x1, … ,xn, we can see that the conditions of theorem 1 in Hall (1984) are satisfied. Finally, arguments analogous to those considered in lemma 4 in Zhao & Wu (2001) allow us to show that A1, A2, and A4 imply that .□

### Supporting Information

| Filename | Format | Size | Description |
| --- | --- | --- | --- |
| sjos12020-sup-0001_Supplementary.pdf | PDF document | 120K | Supporting info item |
| sjos12020-sup-0001_Supplementary.tex | application/unknown | 19K | Supporting info item |
