The analysis of sports data, in particular football match outcomes, has long attracted considerable interest among statisticians. In this paper, we adopt the generalized Poisson difference distribution (GPDD) to model the goal difference of football matches. We discuss the advantages of the proposed model over the Poisson difference (PD) model, which has also been used for the same purpose. The GPDD model, like the PD model, is based on the goal difference in each game, which allows us to account for the correlation without explicitly modelling it. The main advantage of the GPDD model is its flexibility in the tails: it allows for shorter as well as longer tails than the PD distribution. We carry out the analysis in a Bayesian framework in order to incorporate external information, such as historical knowledge or data, through the prior distributions. We model both the mean and the variance of the goal difference and show that such a model performs considerably better than a model with a fixed variance. Finally, the proposed model is fitted to the 2012–2013 Italian Serie A football data, and various model diagnostics are carried out to evaluate the performance of the model.
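
As a point of reference for the PD baseline mentioned in this abstract, the following is a minimal sketch of the probability mass function of the difference of two independent Poisson counts. It does not implement the GPDD itself; the function names, the independence assumption, and the truncation point `max_goals` are illustrative choices, not taken from the paper.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), lam > 0, computed on the log scale."""
    if k < 0:
        return 0.0
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_difference_pmf(z, lam_home, lam_away, max_goals=60):
    """P(X - Y = z) for independent X ~ Poisson(lam_home), Y ~ Poisson(lam_away).

    The infinite sum over the values of Y is truncated at max_goals,
    an assumed cut-off that is ample for football-sized scoring rates.
    """
    return sum(
        poisson_pmf(y + z, lam_home) * poisson_pmf(y, lam_away)
        for y in range(max_goals + 1)
    )


if __name__ == "__main__":
    # Hypothetical scoring rates, chosen only for illustration.
    for z in range(-3, 4):
        print(z, round(poisson_difference_pmf(z, 1.5, 1.1), 4))
```

Under these assumptions the mean of the goal difference is `lam_home - lam_away` and its variance is `lam_home + lam_away`, so the two are tied together; the abstract's point is that the GPDD relaxes this through more flexible tails.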

Penalized splines are used in various types of regression analyses, including non-parametric quantile, robust and the usual mean regression. In this paper, we focus on the penalized spline estimator with general convex loss functions. By specifying the loss function, we can obtain the mean estimator, quantile estimator and robust estimator. We first study the asymptotic properties of penalized splines. Specifically, we derive the asymptotic bias and variance as well as the asymptotic normality of the estimator. Next, we discuss smoothing parameter selection for the minimization of the mean integrated squared error. The new smoothing parameter can be expressed uniquely using the asymptotic bias and variance of the penalized spline estimator. To validate the new smoothing parameter selection method, we provide a simulation. The simulation results confirm the consistency of the estimator with the proposed smoothing parameter selection method and show that the proposed estimator behaves better than the estimator with generalized approximate cross-validation. A real data example is also addressed.
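
To make the phrase "by specifying the loss function" concrete, here is a small sketch of three standard convex losses applied to a residual `r`. The squared and check losses correspond to mean and quantile regression; the Huber loss, with the commonly used default `delta=1.345`, is assumed here as one example of a robust loss, since the abstract does not say which robust loss the paper uses.

```python
def squared_loss(r):
    """Loss for mean regression."""
    return r * r


def check_loss(r, tau):
    """Check (pinball) loss for the tau-th quantile, 0 < tau < 1."""
    indicator = 1.0 if r < 0 else 0.0
    return r * (tau - indicator)


def huber_loss(r, delta=1.345):
    """Huber loss: quadratic near zero, linear in the tails."""
    a = abs(r)
    if a <= delta:
        return 0.5 * r * r
    return delta * (a - 0.5 * delta)


if __name__ == "__main__":
    for r in (-2.0, -0.5, 0.5, 2.0):
        print(r, squared_loss(r), check_loss(r, 0.25), huber_loss(r, 1.0))
```

Swapping one of these functions into the penalized criterion is what changes the estimator from mean to quantile to robust; the spline basis and penalty stay the same.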

This article proposes three methods for merging homogeneous clusters of observations that are grouped according to a pre-existing (known) classification. This clusterwise regression problem is particularly relevant in analyzing international trade data, where transaction prices can be grouped according to the corresponding origin–destination combination. A proper merging of these prices could simplify the analysis of the market without affecting the representativeness of the data and highlight commercial anomalies that may hide fraud. The three algorithms proposed are based on an iterative application of the *F*-test and have the advantage of being extremely flexible: they do not require the number of final clusters to be predetermined, and their output depends only on a tuning parameter. Monte Carlo results show very good performance for all the procedures, and the application to two empirical data sets demonstrates the practical utility of the proposed methods for reducing the dimension of the market and isolating suspicious commercial behavior.

Cusum charts are widely used for detecting deviations of a process about a target value and also for finding evidence of change in the mean of a process. The testing theory approximates the process by a Wiener process or a Brownian bridge, respectively. For quality control, it is important that other aspects are monitored in addition to or instead of the mean. Here, we show that cusum theory is easily adapted when the target is not the mean but some other aspect of the distribution.
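
For readers unfamiliar with the basic construction, the sketch below computes the cumulative sum of deviations from a target value and reports the first time its absolute value exceeds a threshold. The function names and the simple threshold rule are illustrative assumptions; the abstract does not describe the decision rule or the adaptation to targets other than the mean.

```python
def cusum_path(observations, target):
    """Return the running sums S_t = sum over i <= t of (x_i - target)."""
    path = []
    total = 0.0
    for x in observations:
        total += x - target
        path.append(total)
    return path


def first_alarm(observations, target, threshold):
    """Index of the first t with |S_t| > threshold, or None if there is none.

    The fixed threshold is a hypothetical decision rule used for illustration.
    """
    for t, s in enumerate(cusum_path(observations, target)):
        if abs(s) > threshold:
            return t
    return None


if __name__ == "__main__":
    data = [10, 10, 12, 13, 14]
    print(cusum_path(data, 10))
    print(first_alarm(data, 10, 4))
```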

In this paper, we propose a new first-order non-negative integer-valued autoregressive [INAR(1)] process with Poisson–geometric marginals based on binomial thinning for modeling integer-valued time series with overdispersion. The new process includes the Poisson INAR(1) and geometric INAR(1) processes as particular cases. The main properties of the model are derived, such as the probability generating function, moments, conditional distribution, higher-order moments, and jumps. Estimators for the parameters of the process are proposed, and their asymptotic properties are established. Some numerical results for the estimators are presented and discussed. Applications to two real data sets are given to show the potential of the new process.
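
The binomial thinning recursion can be sketched for the Poisson INAR(1) special case named in this abstract. This is not the Poisson–geometric process the paper proposes; the function names, the seed handling, and the Poisson sampler are illustrative assumptions. Each step keeps every one of the previous `x` counts independently with probability `alpha` and then adds a Poisson innovation.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw from Poisson(lam) by Knuth's multiplication method (small lam)."""
    limit = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def binomial_thinning(x, alpha, rng):
    """alpha o x: successes among x independent Bernoulli(alpha) trials."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def simulate_inar1(n, alpha, lam, seed=0):
    """Simulate X_t = alpha o X_{t-1} + e_t with e_t ~ Poisson(lam), 0 <= alpha < 1."""
    rng = random.Random(seed)
    # Start from the stationary Poisson(lam / (1 - alpha)) marginal.
    x = poisson_draw(lam / (1.0 - alpha), rng)
    series = []
    for _ in range(n):
        x = binomial_thinning(x, alpha, rng) + poisson_draw(lam, rng)
        series.append(x)
    return series


if __name__ == "__main__":
    print(simulate_inar1(20, alpha=0.5, lam=1.0, seed=1))
```

In this special case the marginal mean and variance both equal `lam / (1 - alpha)`, which is why a different marginal is needed to capture the overdispersion the abstract targets.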

No abstract is available for this article.

In this paper, a new randomized response model is proposed, which is shown to have a Cramér–Rao lower bound of variance that is lower than the Cramér–Rao lower bound of variance suggested by Singh and Sedory at equal or greater protection of respondents. A new measure of protection of respondents in the setup of the efficient use of two decks of cards, due to Odumade and Singh, is also suggested. The developed Cramér–Rao lower bounds of variances are compared under different situations through exact numerical illustrations. Survey data to estimate the proportion of students who have sometimes driven a vehicle after drinking alcohol and feeling over the legal limit are collected by using the proposed randomization device and then analyzed. The proposed randomized response technique is also compared with a black box technique within the same survey. A method to determine the minimum sample size in randomized response sampling based on a small pilot survey is also given.

While right-censored data are very common in survival analysis, they may also occur in the case of count data. The literature contains models to treat such right-censored count data. In this paper, we address issues of heterogeneity and clustering in this context. We propose a finite mixture of censored Poisson regressions to accommodate heterogeneity and also identify clusters in right-censored count data. We also develop an expectation maximization algorithm to facilitate the estimation of such models and discuss the computational aspects of the proposed algorithm. We then present results based on simulated data to show the effect of censoring on estimation. We also present a marketing application of the proposed approach involving the number of renewals of magazine subscriptions.
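
The way right-censoring enters a count likelihood can be sketched for a single Poisson component with a common mean `mu`. This is a deliberate simplification: the paper's model is a finite mixture of regressions fitted by an EM algorithm, none of which is reproduced here. The function names and the convention that a censored record `y` means "the true count is at least `y`" are assumptions made for illustration.

```python
import math


def poisson_logpmf(y, mu):
    """log P(Y = y) for Y ~ Poisson(mu), mu > 0."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def poisson_prob_at_least(c, mu):
    """P(Y >= c) = 1 - sum over k < c of P(Y = k)."""
    return 1.0 - sum(math.exp(poisson_logpmf(k, mu)) for k in range(c))


def censored_loglik(counts, censored, mu):
    """Log-likelihood of right-censored Poisson counts.

    An uncensored record contributes log P(Y = y); a censored record
    contributes log P(Y >= y), since only a lower bound was observed.
    """
    total = 0.0
    for y, is_censored in zip(counts, censored):
        if is_censored:
            total += math.log(poisson_prob_at_least(y, mu))
        else:
            total += poisson_logpmf(y, mu)
    return total


if __name__ == "__main__":
    # Hypothetical data: the last two counts are censored at 3.
    print(censored_loglik([0, 2, 1, 3, 3], [False, False, False, True, True], 1.5))
```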

During the last three decades, integer-valued autoregressive processes of order *p* [or INAR(*p*)] based on different operators have been proposed as a natural, intuitive and possibly efficient model for integer-valued time-series data. However, this literature is surprisingly silent on the usefulness of the standard AR(*p*) process, which is otherwise meant for continuous-valued time-series data. In this paper, we explore the usefulness of the standard AR(*p*) model for obtaining coherent forecasts from integer-valued time series. First, some advantages of this standard Box–Jenkins-type AR(*p*) process are discussed. We then carry out some simulation experiments, which show the adequacy of the proposed method relative to the available alternatives. Our simulation results indicate that even when samples are generated from an INAR(*p*) process, the Box–Jenkins model performs as well as the INAR(*p*) processes, especially with respect to the mean forecast. Two real data sets are used to study the expediency of the standard AR(*p*) model for integer-valued time-series data.

In designing an experiment with a single continuous predictor, the questions are what the optimal number of predictor values is, what these values are, and how many subjects should be assigned to each of them. In this study, locally D-optimal designs for such experiments with discrete-time event occurrence data are studied by using a sequential construction algorithm. Using the Weibull survival function to model the underlying time-to-event function, it is shown that the optimal designs for a linear effect of the predictor have two points that coincide with the design region's boundaries, but the design weights depend strongly on the predictor effect size and its direction, the survival pattern, and the number of time points. For a quadratic effect of the predictor, three or four design points are needed.
