Factor copula models for mixed data

We develop factor copula models for analysing the dependence among mixed continuous and discrete responses. Factor copula models are canonical vine copulas that involve both observed and latent variables, hence they allow tail, asymmetric and non-linear dependence. They can be explained as conditional independence models with latent variables that don't necessarily have an additive latent structure. We focus on important issues that would interest the social data analyst, such as model selection and goodness-of-fit. Our general methodology is demonstrated with an extensive simulation study and illustrated by re-analysing three mixed response datasets. Our study suggests that there can be a substantial improvement over the standard factor model for mixed data and makes the argument for moving to factor copula models.


Introduction
It is very common in social science, e.g., in surveys, to deal with datasets that have mixed continuous and discrete responses. In the literature, two broad frameworks have been considered to model the dependence among such mixed continuous and discrete responses, namely the latent variable and copula framework.
There are two approaches for modelling multivariate mixed data with latent variables: the underlying variable approach that treats all variables as continuous by assuming the discrete responses as a manifestation of underlying continuous variables that follow the normal distribution (e.g., Muthén 1984;Lee et al. 1992;Quinn 2004), and, the response function approach that postulates distributions on the observed variables conditional on the latent variables usually from the exponential family (e.g., Moustaki 1996;Moustaki and Knott 2000;Wedel and Kamakura 2001;Huber et al. 2004;Moustaki and Victoria-Feser 2006). The former method assumes that the continuous and underlying variables follow a multivariate normal (MVN) distribution, while the latter assumes that the observed variables are conditionally independent given the latent variables which again follow a MVN distribution. The underlying variable approach calls the MVN distribution as a latent model for the discrete responses, hence requires multidimensional integrations (Nikoloulopoulos, 2013(Nikoloulopoulos, , 2016. The response function approach achieves dimension reduction with dependence coming from p latent (unobservable) variables/factors where p << d, where d is the number of observed variables and hence estimation is more feasible when many categorical or discrete variables are involved. Nevertheless, both approaches are restricted to MVN assumption that is not valid if tail asymmetry or tail dependence exists in the mixed data which is a realistic scenario. Ma and Genton (2010), Montanari and Viroli (2010), and Irincheeva et al. (2012a) stress that the MVN assumption might not be adequate, and acknowledge that the effect of misspecifying the distribution of the latent variables could lead to biased model estimates and poor fit. To this end, Irincheeva et al. (2012b) proposed a more flexible response function approach by strategically multiplying the MVN density of the latent variables by a polynomial function to achieve departures from normality.
As we have discussed, the underlying variable approach exploits the use of the MVN assumption to model the joint distribution of mixed data. The univariate margins are transformed to normality and then the MVN distribution is fitted to the transformed data. This construction is apparently the MVN copula applied to mixed data (Shen and Weissfeld, 2006;Hoff, 2007;Song et al., 2009;He et al., 2012;Jiryaie et al., 2016), but previous papers (e.g., Quinn 2004) do not refer to copulas as the approach can be explained without copulas. Smith and Khaled (2012), Stöber et al. (2015) and Zilko and Kurowicka (2016) called vine copulas to model mixed data. Vine copulas have two major advantages over the MVN copula as emphasized in Panagiotelis et al. (2017). The first is that the computational complexity of computing the joint probability distribution function grows quadratically with d, whereas for the MVN copula the computational complexity grows exponentially with d. The second is that vine copulas are highly flexible through their specification from bivariate parametric copulas with different tail dependence or asymmetry properties. Note in passing that they have as special case the MVN copula, if all the bivariate copulas are bivariate normal (BVN).
In this paper, we extend the factor copula models in Krupskii and Joe (2013) and Nikoloulopoulos and Joe (2015) to the case of mixed continuous and discrete responses. Factor copula models are vine copulas that involve both observed and latent variables, hence they allow flexible (tail) dependence. These types of models are more interpretable and fit better than vine copula models, when dependence can be explained through latent variables. That is theoretical concepts that cannot be measured directly such as intelligence in psychology, or welfare and poverty in economics. Furthermore, the factor copula models are closed under margins (same model with one more or less variable) whereas vine copulas without latent variables are not closed under margins. Finally, they can be explained as conditional independence models, i.e., they are a response function approach with dependence coming from latent (unobservable) variables/factors. Our model is the most general conditional independence model with univariate parameters separated from dependence parameters and latent variables that don't necessarily have an additive latent structure. Another mixed-variable model in the literature that is called factor copula model (Murray et al., 2013) is restricted to the MVN copula, hence has an additive latent structure and is not as general.
The remainder of the paper proceeds as follows. Section 2 introduces the factor copula models for mixed data and discusses its relationship with existing models. Estimation techniques and computational details are provided in Section 3. Sections 4 and 5 propose methods for model selection and goodness-of-fit, respectively.
Section 6 presents applications of our methodology to three mixed response data sets. Section 7 contains an extensive simulation study to gauge the small-sample efficiency of the proposed estimation, model selection and goodness-of-fit techniques. We conclude with some discussion in Section 8.

The factor copula model for mixed responses
The p-factor model assumes that the mixed continuous and discrete responses Y = (Y 1 , . . . , Y d ) are conditionally independent given p latent variables X 1 , . . . , X p . In line with Krupskii and Joe (2013) and Nikoloulopoulos and Joe (2015), we will use a general copula construction, based on a set of bivariate copulas that link observed to latent variables, to specify the factor copula models for mixed continuous and discrete variables. The idea in the derivation of this p-factor model will be shown below for the 1-factor and 2-factor case. It can be extended to p ≥ 3 factors or latent variables in a similar manner.
For the 1-factor model, let X 1 be a latent variable, which we assume to be standard uniform (without loss of generality). From Sklar (1959), there is a bivariate copula C X 1 j such that Pr( Letting C j|X 1 (F j (y)|x) = ∂C X 1 j (x, F j (y))/∂x for shorthand notation and y = (y 1 , . . . , y d ) be realizations of Y, the density ‡ for the 1-factor model is (2) ‡ We mean the density of Y w.r.t. the product measure on the respective supports of the marginal variables. For discrete margins with integer values this is the counting measure on the set of possible outcomes, for continuous margins we consider the Lebesgue measure in R. where is the density of Y j = y conditional on X 1 = x; c X 1 j is the bivariate copula density of X 1 and Y j and f j is the univariate density of Y j .
For the 2-factor model, consider two latent variables X 1 , X 2 that are, without loss of generality, independent uniform U (0, 1) random variables. Let C X 1 j be defined as in the 1-factor model, and let C X 2 j be a bivariate copula such that where F j|X 1 is given in (1). Then for 0 ≤ x 1 , x 2 ≤ 1, The density for the 2-factor model is where if Y j is continuous.
Note that the copula C X 1 j links the jth response to the first latent variable X 1 , and the copula C X 2 j links the jth response to the second latent variable X 2 conditional on X 1 . In our general statistical model there are no constraints in the choices of the parametric marginal F j or copula {C X 1 j , C X 2 j } distributions.

Choices of bivariate copulas with latent variables
In this subsection, we provide choices of parametric bivariate copulas that can be used to link the latent with observed variables. We will consider copula families that have different tail dependence (Joe, 1993) or tail order (Hua and Joe, 2011).
A bivariate copula C is reflection symmetric if its density satisfies c(u 1 , u 2 ) = c(1 − u 1 , 1 − u 2 ) for all 0 ≤ u 1 , u 2 ≤ 1. Otherwise, it is reflection asymmetric often with more probability in the joint upper tail or joint lower tail. Upper tail dependence means that c(1 − u, 1 − u) = O(u −1 ) as u → 0 and lower tail dependence is the survival or reflected copula of C; this "reflection" of each uniform U (0, 1) random variable about 1/2 changes the direction of tail asymmetry. Under some regularity conditions (e.g., existing finite density in the interior of the unit square, ultimately monotone in the tail), if there exists κ L (C) > 0 and some L(u) that is slowly varying at 0 + (i.e., L(ut) L(u) ∼ 1, as u → 0 + for all t > 0), then κ L (C) is the lower tail order of C. The upper tail order κ U (C) can be defined by the reflection of (U 1 , U 2 ), i.e., C(1 − u, 1 − u) ∼ u κ U (C) L * (u) as u → 0 + , where C is the survival function of the copula and L * (u) is a slowly varying function. With κ = κ L or κ U , a bivariate copula has intermediate tail dependence if κ ∈ (1, 2), tail dependence if κ = 1, and tail quadrant independence if κ = 2 with L(u) being asymptomatically a constant.
After briefly providing definitions of tail dependence and tail order we provide below a list of bivariate parametric copulas with varying tail behaviour: • Reflection symmetric copulas with intermediate tail dependence such as the BVN copula with κ L = κ U = 2/(1 + θ), where θ is the copula (correlation) parameter.
• Reflection asymmetric copulas with upper tail dependence only such as the Gumbel copula with κ L = 2 1/θ and κ U = 1, where θ is the copula paprameter; the Joe copula with κ L = 2 and κ U = 1.
• Reflection asymmetric copulas with upper and lower tail dependence that can range independently from 0 to 1, such as the BB1 and BB7 copulas with κ L = 1 and κ U = 1.
• Reflection asymmetric copulas with tail quadrant independence, such as the the BB8 and BB10 copulas.
The BVN, Frank, and t ν are comprehensive copulas, i.e., they interpolate between countermonotonicity (perfect negative dependence) to comonotonicity (perfect positive dependence). The other aforementioned parametric families of copulas, namely Gumbel, Joe, BB1, BB7, BB8 and BB10 interpolate between independence and perfect positive dependence. Nevertheless, negative dependence can be obtained from these copulas by considering reflection of one of the uniform random variables on (0, 1). If (U 1 , U 2 ) ∼ C for a bivariate copula C with positive dependence, then is the 1-reflected copula of C with negative lower-upper tail dependence; Negative upper-lower tail dependence means that c(1 − u, u) = O(u −1 ) as u → 0 + and negative lower-upper tail dependence means that c(u, 1 − u) = O(u −1 ) as u → 0 + (Joe, 2011). In Figure 1, to depict the concepts of refection symmetric or asymmetric tail dependence or quadrant tail independence, we plot contour plots of the corresponding copula densities with standard normal margins and dependence parameters corresponding to Kendall's τ value of 0.5.

Estimation
Suppose that data are y ij , j = 1, . . . , d, i = 1, . . . , n, where i is an index for individuals or clusters and j is an index for the within-cluster measurements. For continuous random variables, we estimate each marginal distribution non-parametrically by the empirical distribution function of Y j , viz.
where R ij denotes the rank of Y ij as in the semi-parametric estimation of Genest et al. (1995) and Shih and Louis (1995). Hence we allow the distribution of the continuous margins to be quite free and not restricted by parametric families.
Nevertheless, rank-based methods cannot be used for discrete variables with copulas (Genest and Nešlehová, 2007). Hence, • For an ordinal response variable Y j we use a univariate probit model (Agresti, 2010, Section 3.3.2). Let Z j be a standard normal latent variable, such that where K j is the number of categories of Y j (without loss of generality, we assume α 0j = −∞ and α K j j = ∞). From this definition, the ordinal response Y j is assumed to have density where γ j = (a 1j , . . . , a K j −1,j ) is the vector of the univariate cutpoints.
• For a count response variable Y j we use the negative binomial distribution (Lawless, 1987). It allows for over-dispersion and its probability mass function is where γ j = {µ j , ξ j } is the vector with the mean and dispersion parameters. In the limit ξ → 0 the negative binomial reduces to Poisson, which belongs to the exponential family of distributions and it is the only distribution for count data that existing latent variable models for mixed data can accommodate.
To this end, for a discrete random variable Y j , we approach estimation by maximizing the univariate log- over the vector of the univariate parameters γ j . That is equivalent with the first step of the IFM method in Joe (1997,2005).
After estimating the univariate marginal distributions we proceed to estimation of the dependence parameters. For the 1-factor and 2-factor models, we let C X 1 j and C X 2 j be parametric bivariate copulas, say with dependence parameters θ j and δ j , respectively. Let also θ = {γ j , θ j : j = 1, . . . , d} and θ = {γ j , θ j , δ j : j = 1, . . . , d} to denote the set of all parameters for the 1-and 2-factor model, respectively. Estimation can be achieved by maximizing the joint log-likelihood over the copula parameters θ j or δ j , j = 1, . . . , d with the univariate parameters/distributions fixed as estimated at the first step of the proposed two-step estimation approach. The estimated parameters can be obtained by using a quasi-Newton Nash (1990)  SEs that account for the estimation of univariate parameters can be obtained by maximizing the joint likelihood in (4) at one step over θ.
For factor copula models numerical evaluation of the joint density f Y (y; θ) can be easily done using Gauss-Legendre quadrature (Stroud and Secrest, 1966). To compute one-dimensional integrals for the 1-factor model, we use the following approximation: where {x q : q = 1, . . . , n q } are the quadrature points and {w q : q = 1, . . . , n q } are the quadrature weights. To compute two-dimensional integrals for the 2-factor model, the approximation uses Gauss-Legendre quadrature points in a double sum: With Gauss-Legendre quadrature, the same nodes and weights are used for different functions; this helps in yielding smooth numerical derivatives for numerical optimization via quasi-Newton (Nash, 1990). Our comparisons show that n q = 25 is adequate with good precision.

Model selection
In this section we propose an heuristic method that automatically selects the bivariate parametric copula families that link the observed to the latent variables. For multivariate mixed data, it is infeasible to estimate all possible combinations of bivariate parametric copula families, and compare them on the basis of information criteria. We develop an algorithm that can quickly select a factor copula model that accurately captures the (tail) dependence features in the data on hand. The linking copulas at each factor are selected with a sequential algorithm under the initial assumption that linking copulas are Frank, and in then sequentially copulas with non-tail quadrant independence are assigned to any of pairs where necessary to account for tail asymmetry (discrete data) or tail dependence (continuous data). Tail dependence of discrete variables is not clear, because maxima of discrete variables may not converge to a nondegenerate distribution (Feidt et al., 2010).
For the 1-factor model, the proposed model selection algorithm is summarized in the following steps: 1. For j = 1, . . . , d estimate the marginal distributions F j (y).
2. Fit the 1-factor copula model with Frank copulas to link each of the d observed variables with the latent variable, i.e., maximise the log-likelihood function of the factor copula model in (4) over the vector of copula parameters (θ 1 , . . . , θ d ).
3. If the jth linking copula hasθ j > 0, then select a set of copula candidates with ability to interpolate between independence and comonotonicity, otherwise select a set of copula candidates with ability to interpolate between countermonotonicity and independence.
4. For j = 1, . . . , d, (a) Fit all the possible 1-factor copula models iterating over all the copula candidates for the jth variable.
(b) Select the copula family that corresponds to the lowest information criterion, say the Akaike, that is AIC = −2 × ℓ + 2 × #copula parameters.
(c) Fix the selected linking copula family for the jth variable.
(d) Iterate through step (a) -(c) to select the copulas that link all the observed variables to the 1st factor.
For more than one factors we can select the appropriate linking copulas accordingly. We first select copula families in the first factor, and then we proceed to the next factor and apply exactly the same algorithm.

Techniques for parametric model comparison and goodness-of-fit
Factor copula models with different bivariate linking copulas can be compared via the log-likelihood or AIC at the maximum likelihood estimate. In addition, we will use the Vuong's test (Vuong, 1989) to show if a factor copula model provides better fit than the standard factor model with a latent additive structure, that is a factor copula model with BVN bivariate linking copulas (Krupskii and Joe, 2013;Nikoloulopoulos and Joe, 2015). The Vuong's test is the sample version of the difference in Kullback-Leibler divergence between two models and can be used to differentiate two parametric models which could be non-nested. This test has been used extensively in the copula literature to compare vine copula models (e.g., Brechmann et al. 2012;Joe 2014;Nikoloulopoulos 2017). We provide specific details in Section 5.1.
Furthermore, to assess the overall goodness-of-fit of the factor copula models for mixed data, we will use appropriately the limited information M 2 statistic (Maydeu-Olivares and Joe, 2006). The M 2 statistic has been developed for goodness-of-fit testing in multidimensional contingency tables. Nikoloulopoulos and Joe (2015) has used the M 2 statistic to assess the goodness-of-fit of factor copula models for ordinal data. We build on the aforementioned papers and propose a methodology to assess the overall goodness-of-fit of factor copula models for mixed continuous and discrete responses. Since the M 2 statistic has been developed for multivariate ordinal data, we propose to first transform the continuous and count variables to ordinal and then calculate the M 2 statistic at the maximum likelihood estimate before discretization. We provide the specifics for the M 2 statistic in Section 5.2.

Vuong's test for parametric model comparison
In this subsection, we summarize the Vuong's test for comparing parametric models (Vuong, 1989). Assume that we have Models 1 and 2 with parametric densities f (1) where θ 1 , θ 2 are the parameters in Models 1 and 2, respectively, that lead to the closest Kullback-Leibler divergence to the true f Y ; equivalently, they are the limits in probability of the MLEs based on Models 1 and 2, respectively.
Model 1 is closer to the true f Y , i.e., is the better . Vuong (1989) has shown that asymptotically that

M 2 goodness-of-fit statistic
In this subsection to self maintain the paper we provide the form of the limited information M 2 statistic in Maydeu-Olivares and Joe (2006). Assume d ordinal variables Y 1 , . . . , Y d where the jth (1 ≤ j ≤ d) variable consists of K j ≥ 2 categories labelled as 0, 1, . . . , K j − 1. Consider the set of univariate and bivariate residuals that do not include category 0. This is a residual vector of dimension For a factor copula model with parameter vector θ of dimension q, let π 2 (θ) = π 1 (θ) ⊤ ,π 2 (θ) ⊤ ⊤ be the column vector of the model-based marginal probabilities withπ 1 (θ) the vector of univariate marginal probabilities, andπ 2 (θ) the vector of bivariate marginal probabilities. Also, let p 2 = (ṗ ⊤ 1 ,ṗ ⊤ 2 ) ⊤ be the vector of the observed sample proportions, withṗ 1 the vector of univariate marginal proportions, andṗ 2 the vector of the bivariate marginal proportions.
With a sample size n, the limited information statistic M 2 is given by with where ∆ 2 = ∂π 2 (θ)/∂θ ⊤ is an s × q matrix with the derivatives of all the univariate and bivariate marginal probabilities with respect to the model parameters, ∆ 2 is an s × (s − q) orthogonal complement to ∆ 2 , such that [∆ 2 ] ⊤ ∆ 2 = 0, and Ξ 2 = diag(π 2 (θ))− π 2 (θ)π 2 (θ) ⊤ is the s × s covariance matrix of all the univariate and bivariate marginal sample proportions, excluding category 0. Due to equality in (6), C 2 is invariant to the choice of orthogonal complement. The limited information statistic M 2 has a null asymptotic distribution that is χ 2 with s − q degrees of freedom when the estimateθ is √ n-consistent. For details on the computation of Ξ 2 and ∆ 2 for factor copula models we refer the interested reader to Nikoloulopoulos and Joe (2015).

Applications
In this section we illustrate the proposed methodology by re-analysing three mixed response datasets. We construct a plausible factor copula model, to capture any type of reflection asymmetric dependence, by using the proposed algorithm in Section 4. For a baseline comparison, we also fit the factor copula models with the comprehensive bivariate parametric copula families that allow for reflection symmetric dependence; these are the BVN, Frank, and t ν copulas. For t ν copulas, we summarize the choice of integer ν with the largest log-likelihood. To make it easier to compare strengths of dependence, we convert the estimated parameters to Kendall's τ 's in (−1, 1) via the relations in Joe (2014, Chapter 4); SEs are also converted via the delta method.
If the number of parameters is not the same between the models, we use the AIC as a rough diagnostic measure for goodness-of-fit between the models, otherwise we use the likelihood at the maximum likelihood estimates.
We further compute the Vuong's tests with Model 1 being the factor copula model with BVN copulas, that is the standard factor model, to reveal if any other factor copula model provides better fit than the standard model. For the standard 2-factor model, to obtain a unique solution we must impose sufficient constraints. One parameter for the second factor can be set to zero and the likelihood can be maximized with respect to other 2d − 1 parameters. We report the varimax transform (Kaiser, 1958) of the loadings (a reparametrization of 2d parameters), converted to factor copula parameters via the relations where β j1 and β j2 are the loadings at the first and second factor, respectively. The overall fit of the factor copula models is evaluated using the M 2 statistic. Note that the M 2 statistic in the case with 2d − 1 copulas (one set to independence for the second factor) is computed with ∆ 2 having one less column.

Political-economic dataset
Quinn (2004) considered measuring the (latent) political-economic risk of 62 countries, for the year 1987, using 5 mixed variables, namely the continuous variable black-market premium in each country (used as a proxy for illegal economic activity), the continuous variable productivity as measured by real gross domestic product per worker in 1985 international prices, the binary variable independence of the national judiciary (1 if the judiciary is judged to be independent and 0 otherwise), and the ordinal variables measuring the lack of expropriation risk and lack of corruption. The dataset and its complete description can be found in Quinn (2004) or in the R package MCMCpack (Martin et al., 2011). Table 1 gives the estimated parameters, their standard errors (SE) in Kendalls τ scale, joint log-likelihoods, the 95% CIs of Vuong's tests, and the M 2 statistics for the 1-factor copula models. Table 1 indicates the parametric copula family chosen for each pair using the proposed heuristic algorithm. Copulas with asymmetric dependence are selected for all the copulas that link the latent variable to each of the observed variables. Hence, it is revealed that there are features in the data such as tail dependence and asymmetry which cannot cannot be captured by copulas with reflection symmetric dependence such as BVN, Frank and t ν copulas. In all the fitted models the estimated Kendall's τ 's are similar. Kendall's τ only accounts for the dependence dominated by the middle of the data, and it is expected to be similar amongst different families of copulas.
However, the tail dependence and tail order vary, as explained in Section 2.1, and they are properties to consider when choosing amongst different families of copulas (Nikoloulopoulos and Karlis, 2008).
The table shows that the selected model using the proposed algorithm provides the best fit and there is a substantial improvement over the standard factor copula model as indicated by the Vuong's and M 2 statistics (to calculate the M 2 statistics we discretized the continuous variables to 5 categories; similar inference was drawn, when we discretized the continuous variables to 3, 4, or 6 categories). The factor copula parameter of −0.51 on black market premium indicates a negative association between the illegal economic activity and the latent variable. All the other estimated factor copula parameters are positive indicating a positive association between each of the other observed variables (independent judiciary, productivity, lack of expropriation, and lack of corruption) with the latent variable. Hence, we can interpret the latent variable to be the political economical certainty.

General social survey
Hoff (2007) analysed six demographic variables of 464 male respondents to the 1994 General Social Survey.
Of these six, two were continuous (income and age), three were ordinal with 5 categories (highest degree of the survey respondent, income and highest degree of respondent's parents), and two were count variables (number of children of the survey respondent and respondent's parents). The data are available in Hoff (2007, Supplemental materials). Table 2 gives the estimated parameters, their standard errors (SE) in Kendalls τ scale, the joint loglikelihoods, the 95% CIs of Vuong's tests, and the M 2 statistics for the 1-factor and 2-factor copula models. The best fit for the 1-factor model is based on the bivariate copulas selected by the proposed algorithm, where there is improvement over the factor copula model with BVN copulas according to the Vuong's statistic. However, assessing the overall goodness-of-fit via the M 2 statistic, it is revealed that one latent variable is not adequate to explain the dependencies among the mixed responses. To apply the M 2 statistic, age was categorised into 4 age groups (18-24, 25-44, 45-64, and 65+), income was grouped into five categories that conforms to approximately equal percentiles, number of children of the survey respondent and respondent's parents were grouped into 4 and 8 categories, respectively, as there were very few respondents and respondent's parents with more than 3 and 7 children, respectively.
The 2-factor copula models with BVN, t ν , and Frank copulas provide some improvement over the 1-factor copula models but according to the M 2 statistic they still have a poor fit. Nevertheless, the selected 2-factor copula model using the algorithm in Section 4 shows improvement over the standard factor copula model according to the Vuong's statistic and better fit according to the M 2 statistic; it changes a p-value < 0.001 to one > 0.10.
For the 2-factor model based on the proposed algorithm for model selection, note that, without the need for a varimax rotation, the unique loading parameters (τ 's converted to normal copula parametersθ j 's andδ j 's and then to loadings using the relations in (7)) show that one factor is loaded only on the demographic variables of the respondent's parents. variable approach with one and two latent variables. In Figure 2 we depict the bivariate normal scores plots (Nikoloulopoulos et al., 2012) for the continuous data. The continuous data are converted to normal scores using the normal quantiles of their empirical distributions. Note that bivariate normal scores plots are better for assessing tail asymmetry and tail dependence. With a bivariate normal scores plot one can check for deviations from the elliptical shape that would be expected with the BVN copula, and hence assess if tail asymmetry and tail dependence exists on the data. Contrasting the bivariate normal scores plots in Figure 2 with the contour plots in Figure 1, it is apparent that for the continuous variables the linking copulas might be the BB10 copulas. Table 3 gives the estimated parameters, their standard errors (SE) in Kendalls tau scale, the joint loglikelihoods, the 95% CIs of Vuong's tests, and the M 2 statistics for the 1-factor and 2-factor copula models.

Swiss consumption survey
The best fitted 1-and 2-factor models result when we use BB10 copulas with asymmetric quantrant tail independence to link the latent variable to each of the continuous observed variables and copulas with lower tail dependence to link the latent variables to the discrete observed variables. Once again the one-factor copula model is not adequate to explain the dependence amongst the mixed responses based on the M 2 statistic (Table   3, 1-factor). To apply the M 2 statistic, we transformed the continuous to ordinal variables with 3 categories (equally weighted) and the count variable to ordinal with 6 categories (5 bicycles or more are grouped into a single category).
While it is revealed that the selected 2-factor copula model is the best model (lowest AIC) and there is sub-  Table 4 we list the maximum deviations of observed and expected counts for each bivariate margin, that is, D j 1 j 2 = n max y 1 ,y 2 |p j 1 ,j 2 ,y 1 ,y 2 − π j 1 ,j 2 ,y 1 ,y 2 (θ)|. From the table, it is revealed, that there is no misfit. The maximum discrepancy occurs between the continuous variables food and leisure. For this bivariate margin, the discrepancy of 509/9660 maximum occurs in the BVN factor copula model, while this drops to 133/9660 in the selected 2-factor copula model.
For the selected 2-factor model based on the proposed algorithm, note that, without the need for a varimax rotation, the unique loadings show that one factor is loaded only on the discrete variables (dishwasher, car, motorcycle, and bicycles), while both factors are loaded on the continuous variables (food, clothes, and leisure).
This reveals that the one latent variable which is only associated with the continuous variables measures the expenses, while the the other which is associated with all the mixed variables measures the possession.

Simulations
An extensive simulation study is conducted (a) to gauge the small-sample efficiency of the proposed estimation method, (b) to examine the reliability of using the heuristic algorithm to select the correct bivariate linking copulas, and (c) to study the small-sample performance of the M 2 statistic after transforming the continuous and count variables to categorical.
We set the type of the variables, the univariate margins and the bivariate linking copulas, along with their univariate and dependence parameters to mimic the data analysed in Section 6. The binary variables don't have tail asymmetries, hence parametric copulas are less distinguishable. Therefore instead of binary, we simulated from ordinal with 3 categories. We randomly generated samples of size n = {100, 300, 500} from the selected factor copula models. Table 5 contains the resultant biases, root mean square errors (RMSE), and standard deviations (SD), scaled by n, for the estimates obtained using the estimation approach in Section 3. The results show that the proposed estimation approach is highly efficient according to the simulated biases, SDs and RMSEs. Table 6 contains four common nominal levels of the M 2 statistic under the factor copula models for mixed  n = 100 n = 300 n = 500 Political-economic dataset -1-factor model  Table 7 presents the number of times that the true bivariate parametric copulas are chosen over 100 simulation runs. If the true copula has distinct dependence properties with medium to strong dependence, then the algorithm performs extremely well as the sample size increases. Low selection rates occur if the true copulas have low dependence or similar tail dependence properties, since for that case it is difficult to distinguish amongst parametric families of copulas (Nikoloulopoulos and Karlis, 2008). For example, • in the results from the 2-factor model for the general social survey, the true copula for the first continuous variable (1st factor) is the reflected Gumbel with τ = 0.34 and is only selected a considerable small number of times. The algorithm instead selected with a high probability the reflected Joe (results not shown here due to space constraints), because both reflected Joe and Gumbel copulas provide similar dependence properties, i.e., lower tail dependence.
• in the results from the 2-factor model for the Swiss consumption survey, the variables with Frank copulas have the lowest selection rates. This is due to the fact that their true Kendall's τ 's parameters are close to 0 (independence).

Discussion
We have extended the factor copula models proposed in Krupskii and Joe (2013) and Nikoloulopoulos and Joe (2015) to the case of mixed continuous and discrete responses. They include the standard factor model as a special case and they can be seen to provide a substantial improvement over the latter on the basis of the loglikelihood principle, Vuong's and M 2 statistics. Hence, superior statistical inference for the loading parameters of interest can be achieved.
This improvement relies on the fact that the latent variable distribution is expressed via factor copulas instead of the MVN distribution. The latter is restricted to linear and reflection symmetric dependence. Rizopoulos and Moustaki (2008) stressed that the inadequacy of normally distributed latent variables can be caused by the non-linear dependence on the latent variables. The factor copula can provide flexible reflection asymmetric tail and non-linear dependence as it is a truncated canonical vine copula (Brechmann et al., 2012) rooted at the latent variables. Joe et al. (2010) show that in order for a vine copula to have (tail) dependence for all bivariate margins, it is only necessary for the bivariate copulas in level 1 to have (tail) dependence and it is not necessary for the conditional bivariate copulas in levels 2, . . . , d − 1 to have tail dependence. The 1-factor copula has bivariate copulas with tail dependence in the 1st level and independence copulas in all the remaining levels of the vine (truncated after the 1st level). The 2-factor copula has bivariate copulas with tail dependence in the 1st and 2nd level and independence copulas in all the remaining levels (truncated after the 2nd level).
Even in the cases, where the effect of misspecifying the bivariate linking copula choice to build the factor copula models can be seen as minimal for the Kendall's τ (loading) parameters, the tail dependence varies, as explained in Section 2.1, and is a property to consider when choosing amongst different families of copulas and hence affects prediction. Rabe-Hesketh et al. (2003) highlighted the importance of the correct distributional assumptions for the prediction of latent scores. The estimated density of latent scores is simply the estimated density of latent variables. The latent scores will essentially show the effect of different model assumptions Figure 3: Comparison of the political-economic risk rankings obtained via our selected model, the standard factor model, and the mixed-data factor analysis of Quinn (2004). Factor copula model with BVN Selected factor copula model because it is an inference that depends on the joint distribution. Figure 3 demonstrates these effects by revisiting the political-economic dataset in Section 6.1 and comparing the political-economic risk ranking obtained via our selected model, the factor copula model with BVN copulas (standard factor model), and the mixed-data factor analysis of Quinn (2004). Note that even between the factor copula model with BVN copulas and the factor analysis model of Quinn (2004), there are small to moderate changes, because while these models share the same latent variables distribution, the former model does not assume the observed variables to be normally distributed, but rather uses the empirical distribution of the continuous observed variables, i.e. allows the margins to be quite free and not restricted by normal distribution. The differences at the lower panel graph are solely due the miss-specification the latent variable distribution.
As stated by many researchers (e.g., Skrondal 2001, 2004), the major difficulty of all the models with latent variables is identifiability. This issue as acknowledged by Hastie et al. (2001, page 494) has left many analysts skeptical of factor models, and may account for its lack of popularity in contemporary statistics. For example, for the standard factor model or the more flexible model in Irincheeva et al. (2012b) one of loadings in the second factor has to be set to zero, because the model with 2d loadings is not identifiable.
The standard factor model arises as special case of our model if we use as bivariate linking copulas the BVN copulas. Hence, for the 2-factor copula model with BVN copulas, one of the BVN copulas in the second factor has to be set as an independence copula. However, using other than BVN copulas, the 2-factor copula model is near-identifiable with 2d bivariate linking copulas as it has been shown by Krupskii and Joe (2013) and Nikoloulopoulos and Joe (2015).
We tackled issues that particularly interest the social data analyst as model selection. Model selection in previous papers on factor copula models was mainly based on simple diagnostics that are informative if the variables are of the same type, e.g. item response (ordinal) data in Nikoloulopoulos and Joe (2015) or financial data in Krupskii and Joe (2013). This is not the case for mixed continuous and discrete data as the continuous data might have reflection asymmetric tail dependence, while the extreme-value behavior of discrete distributions is often degenerate (Feidt et al., 2010). As regard as to the issue of goodness-of-fit testing, we proposed a technique that is based on the M 2 goodness-of-fit statistic (Maydeu-Olivares and Joe, 2006) in multidimensional contingency tables to overcome the shortage of goodness-of-fit statistics for mixed continuous and discrete response data (e.g., Moustaki and Knott 2000).