Modelling sample selection using Archimedean copulas



Summary. By a theorem due to Sklar, a multivariate distribution can be represented in terms of its underlying margins by binding them together using a copula function. By exploiting this representation, the ‘copula approach’ to modelling proceeds by specifying distributions for each margin and a copula function. In this paper, a number of families of copula functions are given, with attention focusing on those that fall within the Archimedean class. Members of this class of copulas are shown to be rich in various distributional attributes that are desired when modelling. The paper then proceeds by applying the copula approach to construct models for data that may suffer from selectivity bias. The models examined are the self-selection model, the switching regime model and the double-selection model. It is shown that when models are constructed using copulas from the Archimedean class, the resulting expressions for the log-likelihood and score facilitate maximum likelihood estimation. The literature on selectivity modelling is almost exclusively based on multivariate normal specifications. The copula approach permits selection modelling based on multivariate non-normality. Examples of self-selection models for labour supply and for duration of hospitalization illustrate the application of the copula approach to modelling.
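As an illustrative sketch of the representation described above (not code from the paper), the following builds a bivariate distribution function by passing two margins through an Archimedean copula defined by its generator. The Clayton family, the normal and exponential margins, the dependence parameter `theta` and all function names are assumptions chosen only for illustration.

```python
import math


def clayton_generator(t, theta):
    """Clayton generator phi(t) = (t^(-theta) - 1) / theta, for theta > 0."""
    return (t ** (-theta) - 1.0) / theta


def clayton_generator_inverse(s, theta):
    """Inverse generator phi^{-1}(s) = (1 + theta * s)^(-1 / theta)."""
    return (1.0 + theta * s) ** (-1.0 / theta)


def archimedean_copula(u, v, theta):
    """C(u, v) = phi^{-1}(phi(u) + phi(v)), the Archimedean construction."""
    if u <= 0.0 or v <= 0.0:
        # A copula is grounded: C(u, 0) = C(0, v) = 0.
        return 0.0
    total = clayton_generator(u, theta) + clayton_generator(v, theta)
    return clayton_generator_inverse(total, theta)


def normal_cdf(x):
    """Standard normal distribution function (assumed first margin)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exponential_cdf(y, rate=1.0):
    """Exponential distribution function (assumed second margin)."""
    return 1.0 - math.exp(-rate * y) if y > 0.0 else 0.0


def joint_cdf(x, y, theta):
    """Sklar's representation: F(x, y) = C(F1(x), F2(y))."""
    return archimedean_copula(normal_cdf(x), exponential_cdf(y), theta)


if __name__ == "__main__":
    # Joint probability P(X <= 0.5, Y <= 1.0) under the assumed model.
    print(joint_cdf(0.5, 1.0, theta=2.0))
```

Because the margins enter only through `normal_cdf` and `exponential_cdf`, either can be replaced by a non-normal alternative without touching the copula, which is the flexibility the copula approach offers for selection models.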