Abstract

  1. Top of page
  2. Abstract
  3. 1. Introduction
  4. 2. Dominance-based ordinal multiple regression
  5. 3. Proportional odds version of the cumulative logits model
  6. 4. Simulation study
  7. 5. Empirical example
  8. 6. Discussion
  9. Acknowledgements
  10. References
  11. Supporting Information

Dominance-based ordinal multiple regression (DOR) is designed to answer ordinal questions about relationships among ordinal variables. Only one parameter per predictor is estimated, and the number of parameters is constant for any number of outcome levels. The majority of existing simulation evaluations of DOR use predictors that are continuous or ordinal with many categories, so the performance of the method is not well understood for ordinal variables with few categories. This research evaluates DOR in simulations using three-category ordinal variables for the outcome and predictors, with a comparison to the cumulative logits proportional odds model (POC). Although ordinary least squares (OLS) regression is inapplicable for theoretical reasons, it was also included in the simulations because of its popularity in the social sciences. Most simulation outcomes indicated that DOR performs well for variables with few categories, and is preferable to the POC for smaller samples and when the proportional odds assumption is violated. Nevertheless, confidence interval coverage for DOR was not flawless and possibilities for improvement are suggested.


1. Introduction

The values of many variables measured in the social and behavioural sciences are ordered, but cannot be presumed to reflect equal spacing on the construct thought to underlie them – that is, they are ordinal variables. The present research is focused on contexts in which both response and predictor variables are ordinal. Consider, for example, Likert-type responses to questionnaire or interview items (e.g., strongly agree, disagree, strongly disagree), and sums of such responses. Methods appropriate for ordinal data also generally apply to dichotomous data as a special case.

Dominance-based ordinal multiple regression (DOR; Cliff, 1994; 1996, Chapter 4; Long, 1998, 1999, 2005) is designed to answer questions about ordinal relationships between one ordinal outcome and one or more ordinal predictors. A single parameter is estimated for each predictor, and the number of parameters does not increase with levels of the response. DOR is one of few regression methods designed to treat all variables as ordinal in a multiple regression context, but is understudied for this purpose. The majority of extant simulations about DOR use data with no (Long, 1999; Woodward, Hunter, & Kadlec, 2002) or few (Long, 2005) ties on the predictors. However, ties (i.e., $y_i = y_j$, $x_i = x_j$, or both) are virtually guaranteed in ordinal data. It is important to evaluate the performance of DOR when all variables are ordinal with ties, and the present simulations evaluate DOR for this case.

Probably the most popular multiple regression model for an ordinal response is the proportional odds version of the cumulative logit model (POC; Agresti, 2010, p. 53; McCullagh, 1980). The POC differs from DOR because it is a generalized linear model (GLM; Nelder & Wedderburn, 1972). Despite advantages of the well-developed GLM framework such as detailed parameter interpretation and model fit evaluation, there are disadvantages of using the POC. For example, the number of parameters increases with the number of outcome categories, rendering it less parsimonious than DOR. Additionally, DOR may be better when the proportional odds (PO) assumption is violated, and may show better properties with smaller sample sizes. Finally, if all variables are ordinal, they are treated that way by DOR, whereas with the POC an ordinal predictor is usually treated as either nominal or continuous. Both are misspecifications, and the nominal approach can fail completely if the predictor has too many categories.

Predictors may be treated as ordinal in GLMs. An easy approach is to assign unequally spaced scores, reflecting the unequal spacing of the ordinal variable on the underlying response continuum. However, there is often no good justification for selecting the spacing – most investigators will not know how to assign scores to quantify the distance between, for example, ‘strongly agree’ and ‘agree’. An alternative approach for treating predictors as ordinal in GLMs is isotonic regression (e.g., Gertheiss, Hogger, Oberhauser, & Tutz, 2011; Gertheiss & Tutz, 2009; Rufibach, 2010; Walter, Feinstein, & Wells, 1987). These methods are important and worthy of attention in the future. However, they are not well known to social scientists and will not be pursued here.

In social science applications of the POC, the overwhelming majority treat ordinal predictors as continuous or nominal. In the present research, the ordinal predictors are reference-cell coded, thus treated as nominal, because the primary purpose is to evaluate DOR for variables with few categories, and it is desirable for the comparison method to be familiar to social science readers. Similarly, ordinary least squares (OLS) regression, treating all of the ordinal variables as continuous, will be included in the simulation because it is frequently (mis)applied to ordinal variables.

DOR and the POC have not been previously compared in simulations and will be compared here. This research compares DOR and the POC with varying sample sizes, with and without a violation of the PO assumption, for ordinal variables with three categories. DOR may perform better than the POC because it treats ordinal predictors as ordinal, estimates fewer parameters, requires fewer assumptions, and uses simpler methods for parameter estimation. OLS is theoretically misspecified for ordinal data but is included in the empirical comparison because it is familiar to, and popular among, social scientists. A real data illustration of DOR will be presented.

2. Dominance-based ordinal multiple regression

Building on previous literature (e.g., Hawkes, 1971; Smith, 1972; Somers, 1968, 1974), Cliff (1994; 1996, Chapter 4) introduced DOR to answer questions about ordinal relationships between one ordinal outcome and one or more ordinal predictors. As originally proposed, Cliff's (1994, 1996) ordinal multiple regression could be carried out with raw data (Pearson correlations) or ranked data (Spearman correlations). The present research is focused on the purely ordinal model that was named ‘dominance-based’ by Long (2005).

DOR is a linear model fitted to a transformed version of the raw data. The transformation involves a comparison between every non-redundant pair of the n original observations. Thus, the number of observations in the data matrix for DOR is $n(n-1)/2$ instead of n. The transformed data are dominance scores on each variable, which indicate whether the score for the first member of the pair is smaller or larger than the score for the second member (or whether the scores are equal). For the outcome, $y_i$, the dominance score, $d_{ihy}$, takes the value −1 if $y_i < y_h$, 0 if $y_i = y_h$, or +1 if $y_i > y_h$, for i < h. Dominance scores are also computed for each predictor: $d_{ihx_1}, d_{ihx_2}, \ldots, d_{ihx_p}$.
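The pairwise transformation described above can be sketched in a few lines of Python. This is only an illustration, not the author's C++ or SAS implementation; the function names `sign` and `dominance_scores` and the four-observation outcome are hypothetical.

```python
from itertools import combinations

def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def dominance_scores(values):
    """Dominance score for every non-redundant pair (i, h) with i < h.

    The score is -1 if values[i] < values[h], 0 if they are tied,
    and +1 if values[i] > values[h].
    """
    return [sign(values[i] - values[h])
            for i, h in combinations(range(len(values)), 2)]

# Hypothetical three-category outcome for n = 4 observations.
y = [1, 3, 2, 2]
d_y = dominance_scores(y)

# n(n - 1)/2 = 6 pairs; the last pair (2, 2) is a tie and scores 0.
print(d_y)
```

Applying the same function to each predictor gives the columns of the dominance-score data matrix.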

By the usual OLS formulae, the DOR parameters (weights), $\mathbf{w}_1 = (w_1, w_2, \ldots, w_p)^T$, are computed by

  • $\mathbf{w}_1 = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$   (1)

Here, the $[n(n-1)/2] \times p$ matrix X (where p is the number of predictors) holds dominance scores for each predictor (without a column of 1s), and the $n(n-1)/2$ rows of y hold dominance scores for the outcome. Weights equivalent to w1 can be computed using an adapted version of the OLS equation for standardized coefficients:

  • $\mathbf{w}_2 = \mathbf{T}^{-1}\mathbf{t}_y$   (2)

where T is a matrix of τas (defined in the next paragraph) between each pair of predictors, and ty is a vector of τas between each predictor and the outcome. DOR weights are bounded between −1 and 1 when the predictors are orthogonal but can exceed these bounds with correlated predictors.
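A minimal sketch of the two equivalent ways of computing the weights follows. It assumes that the diagonal of T holds τa of each predictor with itself (the proportion of pairs not tied on that predictor), which is what makes the two forms agree when ties are present. The data, the function names, and the small linear solver are hypothetical and are not taken from the author's code.

```python
from itertools import combinations

def sign(v):
    return (v > 0) - (v < 0)

def dominance(values):
    """Dominance scores over all pairs i < h."""
    return [sign(values[i] - values[h])
            for i, h in combinations(range(len(values)), 2)]

def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * p for a, p in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dor_weights(predictors, outcome):
    """Return the weights computed from dominance scores and from tau_a."""
    D = [dominance(x) for x in predictors]    # columns of X
    dy = dominance(outcome)
    n_pairs = len(dy)
    XtX = [[dot(a, b) for b in D] for a in D]
    Xty = [dot(a, dy) for a in D]
    w1 = solve(XtX, Xty)                      # OLS on dominance scores
    # Dividing by the number of pairs turns cross-products into tau_a values.
    T = [[v / n_pairs for v in row] for row in XtX]
    ty = [v / n_pairs for v in Xty]
    w2 = solve(T, ty)                         # tau_a-based form
    return w1, w2

# Hypothetical three-category data for n = 8 observations.
x1 = [1, 2, 3, 1, 2, 3, 2, 1]
x2 = [1, 1, 2, 3, 3, 2, 1, 3]
y = [1, 2, 3, 2, 3, 3, 1, 2]
w1, w2 = dor_weights([x1, x2], y)
print(w1, w2)
```

The two results differ only by floating-point error, because dividing both sides of the normal equations by the number of pairs leaves the solution unchanged.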

Kendall's (1938) τa is a bivariate measure of monotonic association given by

  • $\tau_a = \dfrac{C - D}{C + D + T_x + T_y + T_{xy}} = \dfrac{C - D}{n(n-1)/2}$

where C is the number of concordant pairs, D the number of discordant pairs, n the sample size, $T_x$ the number of pairs tied on X but not on Y (i.e., $x_i = x_j$ and $y_i \neq y_j$), $T_y$ the number of pairs tied on Y but not on X ($y_i = y_j$ and $x_i \neq x_j$), and $T_{xy}$ the number of pairs tied on both X and Y ($x_i = x_j$ and $y_i = y_j$). An XY pair is concordant if an increase in X occurs with an increase in Y (i.e., $x_i > x_j$ and $y_i > y_j$), or if both X and Y decrease (i.e., $x_i < x_j$ and $y_i < y_j$). The pair is discordant if X and Y change in opposite directions (i.e., $x_i > x_j$ and $y_i < y_j$, or $x_i < x_j$ and $y_i > y_j$).
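The pair counts defined above can be tallied directly. The sketch below is illustrative only; the function name `kendall_tau_a` and the data are hypothetical. It uses the fact that the five counts together cover all n(n − 1)/2 pairs.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Return tau_a and the pair counts (C, D, Tx, Ty, Txy)."""
    C = D = Tx = Ty = Txy = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            Txy += 1            # tied on both variables
        elif dx == 0:
            Tx += 1             # tied on X only
        elif dy == 0:
            Ty += 1             # tied on Y only
        elif dx * dy > 0:
            C += 1              # concordant
        else:
            D += 1              # discordant
    n_pairs = C + D + Tx + Ty + Txy   # equals n(n - 1)/2
    return (C - D) / n_pairs, (C, D, Tx, Ty, Txy)

# Hypothetical ordinal data with ties, n = 4.
x = [1, 1, 2, 3]
y = [1, 2, 2, 3]
tau, counts = kendall_tau_a(x, y)
print(tau, counts)
```

For these data there are four concordant pairs, no discordant pairs, and one pair tied on each variable alone, so τa = 4/6.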

Interpretation is necessarily more limited for DOR coefficients than standard OLS coefficients because ordinal data contain only information about relative ordering. The size of a DOR parameter estimate conveys how strongly associated the predictor is with $y_i$, relative to the other predictors. Interpretation does not include expressing the outcome as an algebraic function of the weighted predictors, because this is not logical for ordinal variables. Notice that numbers like −0.308 and 1.251 would result, which are not possible values for $y_i$.

DOR parameters optimize the loss function

  • image

where inline image is a vector of inline image model-predicted dominance scores, inline image, y and X are matrices of dominance scores as defined above, and w=w1=w2 (Cliff, 1994). The sign(·) transformation filters the model-predicted values such that only information about the sign is used. Permissible values for the elements of inline image are −1, 0, or 1 (0s are rare). For example, if inline image, then inline image. The value of φ is interpreted as the proportion of pairs for which the predicted dominances are equal to the observed dominances.

Cliff (1994) presented an argument for why w1 optimizes the loss function. Here, we proceed with w2 so that variance formulae for linear combinations of τas may be used to compute standard errors (SEs; Long, 1998, 1999). Alternate methods could be used to compute SEs using w1 (standard OLS formulae do not apply with dominance scores).

2.1. Inference

Making use of developments by Long and Cliff (1997) and Cliff and Charlin (1991), Long (1998, 1999) described how to compute SEs and confidence intervals (CIs) for DOR weights (computed using equation (2)). The variance of the weight for predictor j is:

  • $\widehat{\mathrm{var}}(w_j) = \sum_{k=1}^{p} (t^{*}_{jk})^{2}\,\hat{\sigma}^{2}_{t_{ky}} + 2\sum_{k=1}^{p-1}\sum_{m=k+1}^{p} t^{*}_{jk}\,t^{*}_{jm}\,\hat{\sigma}_{t_{ky},t_{my}}$   (3)

where the t* terms are elements of $\mathbf{T}^{-1}$, $\hat{\sigma}^{2}_{t_{ky}}$ is the variance of the τa computed between the kth predictor and y, $\hat{\sigma}_{t_{ky},t_{my}}$ is the covariance between two τas, and p is the number of predictors. As usual, the SE is the square root of the variance. Long (1998, 1999) used a standard normal sampling distribution for CIs. However, in simulations comparing the performance of SEs in CIs for bivariate ordinal association measures, Woods (2007, 2009) found that CIs using the t distribution with df = n − 2 (used by Cliff & Charlin, 1991) performed better than CIs based on the normal distribution. The t distribution is used here.

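An equivalent formula (more convenient for programming) for the p×p covariance matrix of all the regression coefficients at once is

  • $\hat{\boldsymbol{\Sigma}}_{\mathbf{w}} = \mathbf{T}^{-1}\,\hat{\boldsymbol{\Sigma}}_{\mathbf{t}_y}\,\mathbf{T}^{-1}$   (4)

where $\hat{\boldsymbol{\Sigma}}_{\mathbf{t}_y}$ is the covariance matrix among elements in $\mathbf{t}_y$ and the SEs for the parameter estimates that match up with equation (3) are square roots of the diagonal elements of $\hat{\boldsymbol{\Sigma}}_{\mathbf{w}}$.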

The elements of $\hat{\boldsymbol{\Sigma}}_{\mathbf{t}_y}$ (variances and covariances among τas between y and predictor k) are somewhat complicated. It is useful to begin with a simple unit and work up to equation (4). Let $t_{ihky}$ be an indicator variable that is 1 if a pair of variables is concordant, −1 if the pair is discordant, and 0 if there is a tie on either k or y:

  • $t_{ihky} = \operatorname{sign}(k_i - k_h)\,\operatorname{sign}(y_i - y_h)$   (5)

where $k_i$ and $k_h$ are two different realizations of variable k, and $y_i$ and $y_h$ are two different realizations of variable y. For each of n persons in the data set, there are n values of $t_{ihky}$.

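The variance and covariance needed for the covariance matrix among the elements of $\mathbf{t}_y$ are estimated by

  • $\hat{\sigma}^{2}_{t_{ky}} = \dfrac{4(n-2)\,s^{2}_{\tau_{i.ky}} + 2\,s^{2}_{t_{ihky}}}{n(n-1)}$   (6)

and

  • $\hat{\sigma}_{t_{ky},t_{my}} = \dfrac{4(n-2)\,s_{\tau_{i.ky},\tau_{i.my}} + 2\,s_{t_{ihky},t_{ihmy}}}{n(n-1)}$   (7)

respectively, where the $s^2$ and $s$ terms are variances and covariances (respectively), and functions of the pairwise indicator $t_{ihky}$ of equation (5). The variance of $t_{ihky}$ is

  • $s^{2}_{t_{ihky}} = \dfrac{\sum_{i}\sum_{h \neq i} t^{2}_{ihky} - n(n-1)\,\tau^{2}_{ky}}{n(n-1) - 1}$

where $\tau_{ky}$ is τa computed for k with y, and the covariance between $t_{ihky}$ and $t_{ihmy}$ is

  • $s_{t_{ihky},t_{ihmy}} = \dfrac{\sum_{i}\sum_{h \neq i} t_{ihky}\,t_{ihmy} - n(n-1)\,\tau_{ky}\,\tau_{my}}{n(n-1) - 1}$

To define the remaining variance and covariance, let $\tau_{i.ky}$ refer to the result of summing $t_{ihky}$ over h for all persons, and also dividing by n − 1. A consistent estimate of the variance of $\tau_{i.ky}$ is

  • $s^{2}_{\tau_{i.ky}} = \dfrac{\sum_{i} (\tau_{i.ky} - \tau_{ky})^{2}}{n-1}$

and the covariance between $\tau_{i.ky}$ and $\tau_{i.my}$ is

  • $s_{\tau_{i.ky},\tau_{i.my}} = \dfrac{\sum_{i} (\tau_{i.ky} - \tau_{ky})(\tau_{i.my} - \tau_{my})}{n-1}$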

Long's (2005) SAS macro is a convenient way to carry out DOR, including the inference. The present study contributes an additional implementation in C++; this code is freely available as an online appendix or by request from the author.

3. Proportional odds version of the cumulative logits model

The POC was proposed in various forms by different people (e.g., Snell, 1964; Walker & Duncan, 1967; Williams & Grizzle, 1972; McCullagh, 1980). It is straightforward to fit the POC with maximum likelihood estimation using widely available software, and it is currently one of the most frequently applied regression models for an ordinal outcome. The POC is a multivariate extension of a logistic GLM that handles ordering of the response through the use of cumulative logits.

Consider an outcome with three ordered categories. There is a model-implied probability corresponding to each outcome (for person i): 1 ($p_1$), 2 ($p_2$), and 3 ($p_3$). The cumulative logits, $\log(p_{1i}/(p_{2i}+p_{3i}))$ and $\log((p_{1i}+p_{2i})/p_{3i})$, together refer to the log odds of a lower versus higher response. The right-hand side of the model is the usual linear predictor. In general, for j = 1, 2, …, c − 1 and c total categories of response, $\operatorname{logit}[P(Y_i \le j)] = b_{0j} + \mathbf{B}\mathbf{x}_i$, where $Y_i$ is the observed response, $b_{0j}$ is the logit-specific intercept, B is the matrix of regression slopes and $\mathbf{x}_i$ is the matrix of predictors for observation i. All parameters for the cumulative logits are estimated simultaneously and the regression slopes for all logits are constrained to be equivalent. This constraint is the proportional odds assumption.
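To make the cumulative-logit structure concrete, the sketch below converts intercepts and a common set of slopes into category probabilities. It assumes the parameterization in which the logit of P(Y ≤ j) equals the intercept plus the linear predictor; the function name `poc_probabilities` is hypothetical. The intercepts .5 and 1 are the values used later in the simulation study.

```python
import math

def poc_probabilities(intercepts, slopes, x):
    """Category probabilities under logit[P(Y <= j)] = b0j + sum(slopes * x).

    `intercepts` must be increasing so that the cumulative probabilities
    are ordered; the last category takes the remaining probability.
    """
    eta = sum(b * v for b, v in zip(slopes, x))
    cum = [1.0 / (1.0 + math.exp(-(b0 + eta))) for b0 in intercepts]
    cum.append(1.0)                       # P(Y <= c) = 1
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# All slopes 0, so the predictor values do not matter.
probs = poc_probabilities([0.5, 1.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
mean_y = sum((j + 1) * p for j, p in enumerate(probs))
print(probs, mean_y)
```

With all slopes equal to 0, the implied mean of a 1, 2, 3 outcome is about 1.65, which agrees with the OLS intercept reported for the zero-slope conditions in Section 4.1.2 and so supports the assumed sign convention.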

To test the PO assumption, the POC can be compared via a χ2 difference test to a cumulative logits model for which slopes are estimated separately for each logit, that is, $\operatorname{logit}[P(Y_i \le j)] = b_{0j} + \mathbf{B}_j\mathbf{x}_i$ (where a subscript has been added to B). The test can be useful if it is non-significant (indicating no violation of the assumption); however, type I error is known to be inflated, especially with smaller samples (Peterson & Harrell, 1990; Stokes, Davis, & Koch, 2000), so that it will too often, compared to the nominal α level, lead researchers to conclude that the assumption is violated. Thus, graphical methods can be used to evaluate the assumption instead (see http://support.sas.com/kb/22/954.html).

Agresti (2010, pp. 79–80) suggests alternatives for when the PO assumption is violated. Apart from adding predictors (e.g., an interaction) to the model, remedies are to use a different link function. All of the models Agresti mentions are fitted to data with large-sample methods (usually maximum likelihood), require more parameters than the POC, and have the usual GLM linear predictor. Thus, the real or hypothesized advantages that DOR has over the POC (interpretation, fewer assumptions, parsimony, greater power, better with smaller n) are maintained or increased if one of these alternative models is used.

With small samples, many model parameters, highly unbalanced data, or a combination of these, a POC regression parameter estimate may not be finite (typically displayed in software as unusually large with a gigantic SE). There also may be a warning about ‘quasi-separation’ as in SAS proc logistic. For the POC, quasi-separation means that there is no overlap in the sets of predictors with y= 1, y= 2, or y= 3 for a three-category response. (The problem can happen for response variables with any number of categories.) In extreme cases, the separation is complete and none of the estimates are finite. However, if some estimates are finite and some are not, the finite estimates are still valid, as are inference and prediction, as long as Wald tests and CIs are not used (Agresti, 2010, p. 65). Methods for correcting quasi-separation for binary (Firth, 1993), and nominal (Bull, Mak, & Greenwood, 2002; Bull, Lewinger, & Lee, 2006) responses are not available for cumulative logits, and exact logistic regression (see e.g., Mehta & Patel, 1995) does not work for cumulative logit models because the cumulative logit is not the canonical link for a multinomial distribution and non-canonical links do not have reduced sufficient statistics (Agresti, 2010, p. 357).

An alternative to the Wald CI is the profile likelihood CI, which usually produces the same results as the Wald for larger samples, but more accurate results for smaller samples (Agresti, 2010, pp. 29, 60). The 95% profile likelihood CI for a parameter, b, is the set of values, b1, b2,…., bq for which p > .05 from the χ2 difference test (df= 1) of the null hypothesis that b=bj. There seems to be no cost to using profile likelihood CIs for larger samples, only benefits in smaller samples. Profile CIs are used in the present simulations.

4. Simulation study

The simulations compared DOR, the POC, and OLS with one ordinal outcome and three ordinal predictors. For fitting, the ordinal predictors were treated as ordinal by DOR, continuous by OLS, and dummy-coded as nominal for the POC.

4.1. Methods

4.1.1. Data

Data cannot be generated from the DOR because the outcome does not equal the linear combination of weights and predictors. The data were generated using the POC with continuous versions of the predictors before they were polytomized. Data were generated using a C++ program that implemented the following steps: (i) three predictors were sampled from a multivariate normal distribution, (ii) a three-category (1, 2, or 3) ordinal outcome was created according to the POC (parameters given below), and (iii) predictors were polytomized (1, 2, 3) using fixed cut scores. The cut scores were chosen so that (in the population) the following percentages of data were held in each of the categories for X1, X2 and X3, respectively: 25%, 50% and 25% (X1; cut-scores −0.67, 0.67), 50%, 25% and 25% (X2; cut-scores 0, 0.67), and 25%, 25% and 50% (X3; cut-scores −0.67, 0). With n= 40 or 80, this algorithm sometimes created a binary outcome. When this occurred, the data were re-generated until an ordinal outcome was created.
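The three generation steps can be sketched as follows for the conditions in which the PO assumption holds (one slope per predictor). This is not the author's C++ program: the function names are hypothetical, the equal correlation among predictors is produced with a shared normal factor (an assumption about how the multivariate normal draw might be done), and the logit of P(Y ≤ j) is assumed to equal the intercept plus the linear predictor.

```python
import math
import random

# Cut scores from the text, in the order X1, X2, X3.
CUTS = [(-0.67, 0.67), (0.0, 0.67), (-0.67, 0.0)]

def polytomize(value, cuts):
    """Map a continuous value to category 1, 2, or 3 using two cut scores."""
    return 1 + sum(value > c for c in cuts)

def generate(n, rho, intercepts, slopes, rng):
    """Generate n rows (y, x1, x2, x3); redraw if y is not three-category."""
    while True:
        rows = []
        for _ in range(n):
            # (i) equicorrelated standard normal predictors
            shared = rng.gauss(0.0, 1.0)
            x = [math.sqrt(rho) * shared
                 + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
                 for _ in range(3)]
            # (ii) ordinal outcome from the cumulative probabilities
            eta = sum(b * v for b, v in zip(slopes, x))
            cum = [1.0 / (1.0 + math.exp(-(b0 + eta))) for b0 in intercepts]
            u = rng.random()
            y = 1 + sum(u > c for c in cum)
            # (iii) polytomize the continuous predictors
            rows.append((y, *[polytomize(v, c) for v, c in zip(x, CUTS)]))
        if len({r[0] for r in rows}) == 3:
            return rows

rows = generate(80, 0.5, (0.5, 1.0), (0.73, 1.18, 1.49), random.Random(1))
print(rows[:5])
```

The redraw loop mirrors the re-generation rule described above for samples in which the outcome would otherwise be binary.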

One thousand replications were simulated in each of 45 conditions that varied according to the degree of population correlation among predictors (0, .1, or .5), sample size (n= 40, 80, 160, 640, and 1,000), and pattern in the POC parameters (three patterns, described below). To obtain population parameters, estimates were used from one fitting with N= 100,000. Population values were obtained in this way for the nine conditions that varied according to the degree of correlation among predictors and the POC-parameter pattern.

POC intercepts were the same for all conditions (β01= .5, β02= 1) but the slopes were: (i) all 0 (to evaluate type I error), (ii) values from an application of the POC to item data (Kruizenga et al., 2010) (β1= 0.73, β2= 1.18, β3= 1.49, for both logits), or (iii) values from type (ii) conditions modified to introduce a violation of the PO assumption (β11= 0.73, β21= 1.18, β31= 1.49; β12= 30.73, β22= 31.18, β32= 31.49). This violation may appear extreme, but because the parameters apply to continuous predictors which are then polytomized, and the POC logits are fitted to binary indicators of ordinal variables, the violation occurs two levels removed from the POC model, and is not as extreme as one might first expect from these parameters.

4.1.2. Population parameters

For conditions with 0 true slopes, population slopes for all models were very near 0, POC intercepts were close to .5 and 1, and the OLS intercept was close to 1.65. For conditions with non-zero true slopes, population parameters are given in Table 1.

Table 1. Non-zero population parameters used for the simulation study

            PO assumption met          PO assumption violated
Corr.:       0       .1       .5        0       .1       .5

DOR
w1        −0.17    −0.19    −0.18    −0.21    −0.21    −0.21
w2        −0.27    −0.27    −0.27    −0.29    −0.29    −0.26
w3        −0.36    −0.36    −0.36    −0.37    −0.37    −0.36
POC
c01       −2.80    −2.98    −2.93    −3.39    −3.48    −3.61
c02       −2.36    −2.54    −2.49    −2.85    −2.96    −3.09
c11        1.58     1.79     1.74     2.08     2.19     2.36
c12        0.80     0.91     0.87     1.01     1.05     1.18
c21        2.14     2.16     2.22     2.44     2.48     2.41
c22        1.17     1.22     1.20     1.35     1.38     1.39
c31        2.59     2.66     2.62     2.89     2.93     2.93
c32        1.24     1.29     1.25     1.35     1.37     1.38
OLS
bo         3.99     4.03     4.02     4.22     4.21     4.10
b1        −0.26    −0.28    −0.28    −0.33    −0.34    −0.31
b2        −0.35    −0.35    −0.35    −0.39    −0.38    −0.33
b3        −0.48    −0.47    −0.47    −0.49    −0.49    −0.48

Note. PO = proportional odds. Corr. = correlation among continuous predictors before polytomization. DOR = dominance-based ordinal multiple regression; POC = cumulative logits version of the proportional odds model; OLS = ordinary least squares.

4.1.3. Outcomes

All outcomes were averaged over parameters (excluding intercepts). The proportion of replications for which 90% and 95% CIs included the population regression parameter was tabulated for the POC, DOR, and OLS. Because the pattern of results for 90% CIs was the same as for 95% CIs, these results are not reported here but are available upon request from the author. To decide whether the coverage proportions were close to the nominal rates, 95% binomial CI control limits, $p \pm 1.96\sqrt{p(1-p)/1{,}000}$, were used, where p is the target proportion and 1,000 is the number of replications. The control limits are .937 to .964 for 95% CIs. Power is the proportion of CIs that excluded 0 (i.e., rejected the null hypothesis that the parameter is 0) when the true parameter was non-zero. Type I error is the proportion of CIs that excluded 0 (or 1 minus the proportion covering the true parameter) when the true parameter was 0.
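As a check on the control limits, the sketch below computes normal-approximation binomial limits for a target proportion. The multiplier 1.96 and the function name `control_limits` are assumptions, since the exact expression is not given in the text; the result agrees with the reported limits of .937 and .964 to within rounding.

```python
import math

def control_limits(p, replications, z=1.96):
    """Normal-approximation binomial limits around a target proportion p."""
    half_width = z * math.sqrt(p * (1.0 - p) / replications)
    return p - half_width, p + half_width

lower, upper = control_limits(0.95, 1000)
print(lower, upper)
```

A coverage proportion falling outside these limits differs from the nominal rate by more than would be expected from sampling error across 1,000 replications.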

Three more outcomes were compared for regression parameters: (1) the difference between analytic SEs (averaged over replications) and empirical SEs, (2) absolute relative bias: $|\hat{\delta} - \delta| / |\delta|$ for population parameter δ and its estimate $\hat{\delta}$, and (3) relative CI width: $(U - L)/|\delta|$, where L and U are respectively the lower and upper CI limits. Bias and CI width were divided by δ so that DOR, the POC, and OLS could be compared. Relative outcomes were not computed for conditions with δ = 0.

For the POC, the number of times SAS printed the quasi-complete separation error was recorded. When the separation was complete, the replication had to be deleted for the POC. DOR and OLS results were analysed with and without replications that were dropped for the POC. Power and type I error (α= .05) for the score test of the PO assumption were also evaluated. This is a test for the equality of separate slope parameters simultaneously for all predictors.

4.2. Results

4.2.1. Outcomes for the POC only

The quasi-complete separation error was more likely with smaller n, non-zero slopes, violation of the PO assumption, and a stronger predictor correlation (when the slopes were non-zero). When n = 40, an average of 100 errors per 1,000 replications was observed if the slopes were 0, but if the slopes were non-zero, the number of errors ranged from 296 to 700. When n = 80, the mean number of errors was 2.3 per 1,000 replications with 0 slopes, and the mean count ranged from 22 to 253 for conditions with non-zero slopes. With n = 160, the total count was 0 or 1 unless the predictor correlation was .5 (in which case counts were 30 or 31). The error was never observed when n = 160 with 0 slopes and n = 640 or 1,000 with non-zero slopes. Two replications were deleted from each of three conditions with n = 40 and non-zero slopes because separation of data points was complete. If estimation was overall successful but L or U was not estimated, the rest of the results were used, and only the interval for particular parameters with a missing L or U was omitted from the analysis.

When the PO assumption was met, the percentage of times the test indicated that the assumption was violated ranged from 7.7% (n = 1,000 and 0 predictor correlation) to 86.1% (n = 40, .5 predictor correlation). The rejection rate decreased with n and was lower when the population parameters were all 0. Given this type I error inflation, power for conditions where the assumption was violated was of course very high, ranging from 81.3% (n = 40) to 100% (all conditions with n = 640 or 1,000).

4.2.2. CI coverage of the parameter and type I error

Table 2 lists the proportion of 95% CIs that contained the population parameter. The top panel shows conditions with population parameters equal to 0; thus, 1 minus the coverage proportion is the type I error. DOR coverage was near nominal (within the control limits) in every condition, as was OLS coverage. POC coverage fell below the lower control limit for n = 40 but was within the control limits for n ≥ 80.

Table 2. Ninety-five per cent CI coverage of the population parameter

       Corr. among preds: 0            Corr. among preds: .1           Corr. among preds: .5
n        40    80   160   640  1000      40    80   160   640  1000      40    80   160   640  1000

Slopes all 0
DOR    .951  .955  .949  .957  .952    .954  .949  .949  .950  .951    .953  .957  .947  .943  .941
POC    .932  .938  .941  .950  .952    .933  .941  .945  .948  .947    .932  .944  .943  .945  .944
OLS    .946  .958  .946  .946  .949    .950  .951  .948  .945  .950    .949  .960  .955  .947  .942

Non-zero slopes (PO assumption met)
DOR    .972  .968  .975  .971  .974    .974  .972  .976  .970  .971    .975  .974  .968  .953  .941
DOR*   .968  [one n = 40 condition; its correlation column is not recoverable from the source]
POC    .951  .932  .949  .948  .952    .947  .934  .947  .946  .939    .955  .936  .935  .923  .914
OLS    .950  .951  .953  .950  .956    .949  .954  .949  .945  .947    .923  .927  .927  .884  .858

Non-zero slopes (PO assumption violated)
DOR    .974  .978  .972  .978  .975    .974  .974  .981  .974  .980    .976  .973  .966  .975  .978
DOR*   .961  .967  [two n = 40 conditions; their correlation columns are not recoverable from the source]
POC    .948  .934  .943  .939  .948    .946  .929  .940  .939  .947    .961  .939  .939  .948  .942
OLS    .906  .855  .752  .632  .624    .907  .847  .758  .622  .603    .901  .874  .807  .600  .535

Note. PO = proportional odds. The 95% control limits for parameter coverage are .937 to .964. Table entries are the mean over slope coefficients in the (respective) model. DOR* = mean over DOR coefficients, without dropping the two replications that failed with the POC. OLS results were unchanged when the two POC-failed replications were dropped.

When the population parameters were non-zero, CI coverage was a little overestimated with DOR, mostly accurate but underestimated for a few conditions with the POC, and well estimated from OLS except when the predictor correlation was .5. When the PO assumption was violated, coverage tended to be a little less accurate for DOR, less accurate for the POC except if the predictor correlation was .5, and much less accurate for OLS.

4.2.3. Power

Statistical power is shown in Table 3. For power to reach a range of .70 to .80, a minimum of n = 80 was required for DOR and OLS, whereas (for most conditions) this minimum n was 160 for the POC. At each n < 640, the power was greatest for OLS followed by DOR followed by the POC, unless the PO assumption was violated, in which case power was sometimes larger for DOR than for OLS. For DOR and the POC, power tended to be a little greater when the PO assumption was violated, probably because the assumption was violated by increasing parameter magnitude.

Table 3. Ninety-five per cent CI exclusion of 0 (power)

       Corr. among preds: 0            Corr. among preds: .1           Corr. among preds: .5
n        40    80   160   640  1000      40    80   160   640  1000      40    80   160   640  1000

Non-zero slopes (PO assumption met)
DOR     .49   .75   .92  1.00  1.00     .51   .77   .93  1.00  1.00     .50   .73   .93  1.00  1.00
DOR*    .45  [one n = 40 condition; its correlation column is not recoverable from the source]
POC     .39   .62   .81   .99  1.00     .40   .63   .81  1.00  1.00     .38   .57   .79  1.00  1.00
OLS     .59   .83   .96  1.00  1.00     .59   .77   .95  1.00  1.00     .55   .78   .95  1.00  1.00

Non-zero slopes (PO assumption violated)
DOR     .54   .84   .98  1.00  1.00     .56   .83   .97  1.00  1.00     .49   .80   .97  1.00  1.00
DOR*    .52   .47  [two n = 40 conditions; their correlation columns are not recoverable from the source]
POC     .45   .70   .87  1.00  1.00     .45   .69   .87  1.00  1.00     .32   .64   .84  1.00  1.00
OLS     .83   .73   .80   .98  1.00     .60   .74   .82   .99  1.00     .56   .73   .84   .99  1.00

Note. PO = proportional odds. The 95% control limits for parameter coverage are .937 to .964. Table entries are the mean over slope coefficients in the (respective) model. DOR* = mean over DOR coefficients, without dropping the two replications that failed with the POC. OLS results were unchanged when the two POC-failed replications were dropped.

4.2.4. Standard errors

Tables 4–6 display differences between empirical and average analytic SEs. The metric is not comparable among methods and is influenced by the magnitude of the true parameters shown in Table 1. Nevertheless, all differences would be 0 if the SE were perfectly accurate. For DOR (Table 4) and OLS (Table 6), the SEs were very accurate and similar to one another. For the POC (Table 5), SEs tended to be overestimated with smaller n, but improved as n increased. For all methods, accuracy was best when all slopes were zero. For the POC and OLS but not DOR, violating the PO assumption reduced accuracy. For DOR and OLS, the results were unchanged when replications that failed for the POC were left in.

Table 4. Differences between empirical and average analytic SEs for DOR
 Corr. among preds: 0Corr. among preds: .1Corr. among preds: .5
N408016064010004080160640100040801606401000
  1. Note. PO = proportional odds. Difference = Empirical – Analytic.

  Slopes all 0
w1  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00−0.01 0.00 0.00 0.00
w2  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
w3  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  Non-zero slopes (PO assumption met)
w1 −0.02−0.01−0.01 0.00 0.00−0.03−0.01−0.01 0.00 0.00−0.03−0.02−0.01 0.00 0.00
w2 −0.02−0.01−0.01 0.00 0.00−0.02−0.01−0.01 0.00 0.00−0.02−0.01−0.01 0.00 0.00
w3 −0.01−0.01−0.01 0.00 0.00−0.01−0.01−0.01 0.00 0.00−0.01−0.01−0.01 0.00 0.00
  Non-zero slopes (PO assumption violated)
w1 −0.03−0.02−0.01 0.00−0.01−0.03−0.02−0.01 0.00−0.01−0.03−0.02−0.01 0.00−0.01
w2 −0.02−0.01−0.01 0.00 0.00−0.02−0.01−0.01 0.00 0.00−0.02−0.01−0.01 0.00 0.00
w3 −0.01−0.01−0.01 0.00 0.00−0.02−0.01−0.01 0.00 0.00−0.02−0.01 0.00 0.00 0.00
Table 5. Differences between empirical and average analytic SEs for the POC

    Corr. among preds: 0                      Corr. among preds: .1                     Corr. among preds: .5
N         40      80     160     640    1000        40      80     160     640    1000        40      80     160     640    1000

Slopes all 0
c11    −7.88   −0.07    0.01   −0.01    0.00     −5.95   −0.07    0.02    0.00    0.01     −7.49    0.02    0.00    0.01    0.00
c12    −3.04    0.03    0.02    0.01    0.00     −1.61    0.03    0.01    0.00    0.00     −3.12    0.04    0.03    0.00    0.00
c21    −3.07    0.02    0.01    0.00    0.00     −3.23   −0.16    0.01    0.00    0.00     −4.47   −0.13    0.01    0.00    0.00
c22    −5.16    0.03    0.01    0.00    0.00     −4.14    0.04    0.03    0.00    0.00     −4.32    0.04    0.01   −0.01    0.01
c31    −2.04    0.05    0.02   −0.01    0.00     −2.10    0.04    0.00    0.00    0.00     −4.00    0.02    0.03    0.02    0.00
c32    −6.47    0.07    0.02    0.00    0.00     −7.01   −0.07    0.01    0.00    0.01     −6.60    0.02    0.01    0.01   −0.01

Non-zero slopes (PO assumption met)
c11   −19.02   −0.29    0.02    0.00    0.00    −27.30   −1.41    0.02    0.00    0.00    −23.68  −18.62   −0.56    0.02    0.01
c12    −5.03    0.06    0.02    0.00   −0.01     −5.77    0.05    0.01    0.00    0.00     −6.28   −0.30    0.01    0.00    0.00
c21   −36.26   −3.35    0.02    0.00    0.00    −42.24   −3.92   −0.04    0.00    0.00    −43.57  −30.34   −3.66    0.01    0.00
c22   −14.53   −0.17    0.00    0.00    0.00    −12.07   −0.20    0.01    0.00    0.01    −13.58   −0.48    0.03    0.01    0.01
c31   −19.74   −0.57   −0.01    0.00   −0.01    −24.53   −0.64    0.02    0.01    0.00    −23.73   −3.83    0.03    0.00    0.01
c32   −17.93   −0.55    0.00    0.00    0.00    −21.95   −0.31    0.04    0.00    0.00    −22.68   −3.53    0.03    0.00    0.01

Non-zero slopes (PO assumption violated)
c11   −22.60   −0.51    0.02    0.01    0.01    −32.53   −0.76    0.03    0.02    0.00    −75.31  −18.71   −0.69   −0.01   −0.02
c12    −9.63    0.11    0.03    0.02    0.02     −9.76    0.13    0.06    0.04    0.02    −26.10   −1.77    0.07    0.02    0.01
c22   −14.61   −0.01   −0.01    0.00    0.00    −14.74   −0.13    0.02    0.00   −0.01    −20.96   −1.24    0.04   −0.01    0.00
c21   −37.43   −3.09    0.00    0.01    0.00    −40.98   −7.03    0.00   −0.01   −0.01    −81.85  −31.25   −3.30    0.00   −0.01
c31   −23.76   −0.53    0.04    0.01    0.01    −30.55   −0.54    0.01    0.01    0.01    −48.74   −8.28    0.00    0.01    0.01
c32   −20.18   −0.56    0.02    0.00    0.00    −29.83   −0.54    0.01    0.00    0.00    −48.09   −8.26    0.00    0.01    0.02

Note. PO = proportional odds. Difference = Empirical – Analytic.

Table 6. Differences between empirical and average analytic SEs for OLS

    Corr. among preds: 0                 Corr. among preds: .1                Corr. among preds: .5
N        40     80    160    640   1000       40     80    160    640   1000       40     80    160    640   1000

Slopes all 0
b1     0.00   0.00   0.00   0.00   0.00     0.01   0.00   0.00   0.00   0.00     0.00  −0.01   0.00   0.00   0.00
b2     0.00   0.00   0.00   0.00   0.00     0.00   0.00   0.00   0.00   0.00     0.00   0.00   0.00   0.00   0.00
b3     0.01   0.00   0.00   0.00   0.00     0.00   0.00   0.00   0.00   0.00     0.01   0.00   0.00   0.00   0.00

Non-zero slopes (PO assumption met)
b1     0.00   0.00   0.00   0.00   0.00     0.00   0.00   0.00   0.00   0.00     0.01   0.01   0.00   0.00   0.00
b2    −0.01   0.00   0.00   0.00   0.00     0.00   0.00   0.00   0.00   0.00     0.01   0.00   0.00   0.00   0.00
b3     0.00   0.00   0.00   0.00   0.00     0.00   0.00   0.00   0.00   0.00     0.02   0.01   0.01   0.00   0.01

Non-zero slopes (PO assumption violated)
b1     0.00   0.00   0.00   0.00   0.00     0.00   0.00   0.00   0.00   0.00     0.01   0.00   0.00   0.00   0.00
b2    −0.01   0.00  −0.01   0.00   0.00    −0.01   0.00  −0.01   0.00   0.00     0.01   0.01   0.01   0.00   0.00
b3     0.00   0.00   0.00   0.00   0.00     0.01   0.01   0.00   0.00   0.00     0.02   0.01   0.01   0.00   0.00

Note. PO = proportional odds. Difference = Empirical – Analytic.

Absolute relative bias is displayed in Figure 1. For n < 640, bias was larger for the POC than for DOR, sometimes very large for the POC, and always near 0 for DOR. At larger sample sizes with no PO assumption violation, all methods showed bias near 0. Neither DOR nor the POC was strongly influenced by violation of the PO assumption. By contrast, OLS bias was near 0 for all n without PO violation, but strongly influenced by violation of the PO assumption. OLS bias was above that of the POC when n ≥ 160. OLS was not influenced by deletion of the two POC-failed replications. DOR was noticeably improved by the deletion of the two replications that failed for the POC if the PO assumption was violated. For two conditions (n = 40), the bias jumped to .18 when all replications were left in.
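For orientation, absolute relative bias for a single parameter in a single condition can be sketched as below. The exact definition used in the study is given earlier in the article; this sketch assumes the common form |mean estimate − true value| / |true value|, which is undefined when the true parameter is 0. The function name and the numbers are hypothetical.

```python
import statistics

def absolute_relative_bias(estimates, true_value):
    """|mean(estimates) - true_value| / |true_value| (assumed definition)."""
    if true_value == 0:
        # Relative bias cannot be computed for a zero population parameter.
        raise ValueError("relative bias is undefined when the true parameter is 0")
    return abs(statistics.fmean(estimates) - true_value) / abs(true_value)

# Hypothetical estimates from five replications; hypothetical true value 0.20
estimates = [0.21, 0.19, 0.24, 0.18, 0.23]
arb = absolute_relative_bias(estimates, 0.20)
print(f"absolute relative bias = {arb:.3f}")
```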

Figure 1. Absolute relative bias for conditions with (right) and without (left) a violation of the proportional odds assumption, for a predictor correlation of 0 (upper), .1 (middle), and .5 (lower). Key: += POC, = DOR, ⋄= OLS.

Confidence interval width is shown in Figure 2. For all methods, CI width decreased with increasing n, and assumption violation and predictor correlation showed minimal influence. The POC widths were the widest followed by DOR, followed by OLS, with differences among methods diminishing as n increased. For DOR and OLS, the results were unchanged when replications that failed for the POC were left in.

Figure 2. Relative confidence interval width for conditions with (right) and without (left) a violation of the proportional odds assumption, for a predictor correlation of 0 (upper), .1 (middle), and .5 (lower). Key: += POC, = DOR, ⋄= OLS.

5. Empirical example

DOR, the POC (with reference-cell coded predictors), and OLS were applied to four item responses from a scale about sensitivity to disgust-eliciting stimuli (the Disgust Scale; Haidt, McCauley, & Rozin, 1994) given by a sample of 562 college students (described fully by Tolin, Woods, and Abramowitz, 2003). Response options for all items are: ‘not disgusting’, ‘slightly disgusting’, or ‘very disgusting’. Here, responses to item 26 (‘You see a man with his intestines exposed after an accident’) are predicted from responses to items 24 (‘You hear about a 30-year-old man who seeks sexual relationships with 80-year-old women’), 18 (‘You are about to drink a glass of milk when you smell that it is spoiled’) and 19 (‘You see maggots on a piece of meat in an outdoor garbage pail’).

DOR estimates (with 95% CIs in brackets) are: for item 24, .079 [.012, .146]; for item 18, .038 [–.035, .111]; and for item 19, .186 [.102, .270]. The estimates convey the relative degree to which each predictor is associated with disgust about exposed intestines. Disgust about maggots (item 19) is more predictive of intestine disgust than is sex disgust (item 24), though both are significant and both show a positive (not negative) monotonic association with the outcome. Disgust related to spoiled milk (item 18) was not significantly predictive.

For the POC, there was no indication that the PO assumption was violated (χ² = 6.22, df = 6, p = .399), and model fit was good (χ² = 32.71, df = 42, p = .847). The six exponentiated slopes (i.e., odds ratios) and their 95% CIs are given next. For each slope, comparisons are ‘slightly disgusting’ versus ‘not disgusting’ followed by ‘very disgusting’ versus ‘not disgusting’: item 24, 1.19 [0.64, 2.25], 0.67 [0.35, 1.27]; item 18, 0.95 [0.44, 2.05], 0.82 [0.37, 1.79]; item 19, 0.34 [0.16, 0.75], 0.17 [0.08, 0.37]. Conclusions differ from those provided by DOR because only the parameters for item 19 are statistically significant (odds ratio CIs exclude 1). The odds of a more disgusted response to the intestines for someone who is slightly disgusted by the maggots are 0.34 times the odds for someone who is not disgusted by the maggots. The odds of a more disgusted response to the intestines for someone who is very disgusted by the maggots are 0.17 times the odds for someone who is not disgusted by the maggots.
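The relation between a slope on the logit scale, its odds ratio, and the significance check "CI excludes 1" can be sketched as follows. This is an illustration only: it assumes a Wald-type interval (the article does not state how the POC intervals were computed), and the SE of 0.394 is a hypothetical value back-calculated from the reported interval for item 19, not a reported result.

```python
import math
from statistics import NormalDist

def odds_ratio_ci(slope, se, level=0.95):
    """Exponentiate a logit-scale slope and its Wald interval (assumed form)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    return math.exp(slope), math.exp(slope - z * se), math.exp(slope + z * se)

def ci_excludes(lower, upper, null_value):
    """True when the interval does not contain the null value."""
    return not (lower <= null_value <= upper)

# Item 19, 'slightly' versus 'not': reported odds ratio 0.34 [0.16, 0.75]
slope = math.log(0.34)  # logit-scale slope implied by the odds ratio
odds_ratio, lower, upper = odds_ratio_ci(slope, se=0.394)  # hypothetical SE
print(f"OR = {odds_ratio:.2f} [{lower:.2f}, {upper:.2f}]")

# Significance check applied to two of the reported intervals
print(ci_excludes(0.16, 0.75, 1.0))  # item 19: interval excludes 1
print(ci_excludes(0.64, 2.25, 1.0))  # item 24: interval contains 1
```

An odds ratio interval excludes 1 exactly when the corresponding logit-scale interval excludes 0, because exponentiation is monotonic.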

For OLS, the overall model R² = .07 and the linear slope estimates with 95% CIs are: item 24, .07 [–.00, .15]; item 18, .03 [–.06, .12]; item 19, .24 [.15, .33]. As with the POC, only item 19 is significantly predictive (slope CI excludes 0). One might suggest that with every one-unit increase in disgustedness about maggots, there is a .24-unit increase in disgust about seeing intestines. However, for ordinal data, it makes no sense to talk about linear increase, the magnitude of the regression coefficient is not interpretable, and the confidence interval was computed based on statistical theory that is inapplicable.

6. Discussion

This research evaluated DOR for variables with three ordered categories, and compared it to the POC and OLS. DOR and the POC are both preferable to OLS because OLS theory, assumptions and interpretations are inapplicable to ordinal data. OLS is not recommended for ordinal data on the basis of statistical reasoning, not empirical findings. Further, OLS probably performed better in the present study than it does in reality because the empirical results do not reflect the misapplication of OLS to ordinal data: all of the outcomes compared OLS parameters estimated from ordinal data with N = 100,000 to OLS parameters estimated from ordinal data with N = 40, 80, and so on, so the same misapplication was present on both sides of each comparison.

Qualifications aside, there were interesting empirical findings about OLS. Although SEs were accurate, CI widths were small, and bias and CI coverage were (mostly) good without assumption violation, violation of the PO assumption produced major inaccuracy in CI coverage, probably primarily due to bias in parameter estimation. It makes sense that a linear model would approximate an ordinal logistic model poorly when a different model applies for each logit. Not surprisingly, power decreased in accordance with decreases in the amount of information the method presumed available in the data: highest for continuous, followed by ordinal, followed by nominal.

Notably, implementation of the POC in this study did not exploit the best available methodological developments for this model. There is a relatively large literature about fitting GLMs with ordinal predictors. Some methods are Bayesian (e.g., Rufibach, 2010), and most require as many dummy variables as would be needed for nominal coding, with the addition of constraints to impose ordinality (e.g., Gertheiss et al., 2011; Gertheiss & Tutz, 2009; Walter et al., 1987); thus, they would be infeasible for many-valued predictors with smaller samples. Nevertheless, these methods deserve more attention in the social sciences literature, including a comparison to DOR.

Results were favourable for DOR, but it is not considered ideal. With non-zero population parameters, CI coverage of the true parameter was outside the control limits for both POC and DOR, and both methods were negatively impacted by violation of the PO assumption. The problem with the POC CIs seems due to both point estimation and SEs, but the problem with DOR CIs was neither inaccuracy in SEs nor point estimates. That leaves the distributional assumption (t with df = n − 2). In the future, analytic work may be possible to find a better distribution to use, or it may be necessary to use an empirical approach such as bootstrapping to improve the CI coverage for DOR.
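As a rough illustration of the empirical approach mentioned above, the following sketch computes a percentile bootstrap interval for a bivariate Kendall's τa built from dominance scores. It is not the DOR procedure and not a method evaluated in this study: it handles one predictor only, the percentile method is just one of several bootstrap intervals, and the function names and data are hypothetical.

```python
import random

def sign(v):
    """Dominance score for a difference: 1, 0, or -1."""
    return (v > 0) - (v < 0)

def tau_a(x, y):
    """Kendall's tau-a: mean product of dominance scores over all pairs."""
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += sign(x[i] - x[j]) * sign(y[i] - y[j])
    return total / (n * (n - 1) / 2)

def bootstrap_ci(x, y, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(tau_a([x[i] for i in idx], [y[i] for i in idx]))
    stats.sort()
    alpha = 1 - level
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical three-category ordinal responses for 20 cases
x = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 1, 2, 3, 1, 2, 3, 2, 2, 1, 3]
y = [1, 1, 2, 2, 2, 3, 1, 3, 3, 2, 1, 2, 3, 2, 3, 3, 2, 1, 1, 3]
estimate = tau_a(x, y)
lo, hi = bootstrap_ci(x, y, n_boot=500)
print(f"tau-a = {estimate:.3f}, 95% percentile CI [{lo:.3f}, {hi:.3f}]")
```

Whether an interval of this kind would actually improve coverage for DOR weights is an open question that would need its own simulation study.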

Interpretation of DOR parameters is limited, but this is primarily due to limitations inherent in ordinal data rather than a limitation of the method. OLS provides appealing interpretations about linear relationships, which are unjustified for variables that contain only information about relative ordering. The POC is an improvement because the outcome is treated as ordinal, and interpretations as odds ratios or predicted probabilities are more informative than DOR interpretations; however, in this case predictors are presumed binary (nominal) or continuous, not ordinal.

The present simulations focused on a particular set of conditions that could be expanded in the future. For example, one could examine different effect sizes, more sample sizes, more predictors, outcomes with more than three categories, and models that include binary predictors and combinations of predictors with different numbers of categories. Finally, additional implementations of DOR would help it to become more widely used; for example, an implementation in R would be a valuable future endeavour.

In conclusion, most simulation outcomes indicated that DOR is a useful approach for this type of data, and is preferable to the POC for smaller samples (less than about n = 640) even if the PO assumption is met. The preference is a little stronger if the small sample is combined with a violation of the PO assumption. With DOR compared to the POC, CIs were narrower, power was higher, point estimates were less biased, and SEs were more accurate. Additional advantages of DOR are that only one parameter per predictor is estimated, the number of parameters is constant for any number of outcome levels, and ordinal predictors are treated as ordinal, which relates to the power results, and is also an interpretational advantage.

Acknowledgements

The author is grateful to Paul Johnson for insightful comments on a draft of this manuscript, to Adam Hafdahl and Michael Edwards for scholarly input related to data generation, and to Graham Rifenbark, Patrick Miller, and Addie Timmons for data processing assistance.

Footnotes
  1. Cliff's (1994, 1996) ordinal multiple regression may be carried out with raw data (Pearson correlations) or ranked data (Spearman correlations), but the present focus is on the fully ordinal (dominance scores and τa) approach.

References

  • Agresti, A. (2010). Analysis of ordinal categorical data (2nd ed.). Hoboken, NJ: Wiley.
  • Bull, S., Lewinger, J., & Lee, S. (2006). Confidence intervals for multinomial logistic regression in sparse data. Statistics in Medicine, 26, 903–918.
  • Bull, S., Mak, C., & Greenwood, C. (2002). A modified score function estimator for multinomial logistic regression in small samples. Computational Statistics and Data Analysis, 39, 57–74.
  • Cliff, N. (1994). Predicting ordinal relations. British Journal of Mathematical and Statistical Psychology, 47, 127–150.
  • Cliff, N. (1996). Ordinal methods for behavioral data analysis. Mahwah, NJ: Lawrence Erlbaum.
  • Cliff, N., & Charlin, V. (1991). Variances and covariances of Kendall's tau and their estimation. Multivariate Behavioral Research, 26, 693–707.
  • Firth, D. (1993). Bias reduction of maximum likelihood estimates. Biometrika, 80, 27–38.
  • Gertheiss, J., Hogger, S., Oberhauser, C., & Tutz, G. (2011). Selection of ordinally scaled independent variables with applications to international classification of functioning core sets. Applied Statistics, 60, 377–395.
  • Gertheiss, J., & Tutz, G. (2009). Penalized regression with ordinal predictors. International Statistical Review, 77, 345–365.
  • Haidt, J., McCauley, C., & Rozin, P. (1994). Individual differences in sensitivity to disgust: A scale sampling seven domains of disgust elicitors. Personality and Individual Differences, 16, 701–713.
  • Hawkes, R. K. (1971). The multivariate analysis of ordinal measures. American Journal of Sociology, 76, 908–926.
  • Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30, 81–93.
  • Kruizenga, H., De Vet, H., van Marissing, C., Stassen, E., Strijk, J., van Bokhorst-de van der Schueren, M., Horman, J., Schols, J., van Binsbergen, J., Eliens, A., Knol, D., & Visser, M. (2010). The SNAQ, an easy traffic light system as a first step in the recognition of under nutrition in residential care. Journal of Nutrition, Health, and Aging, 14, 83–89.
  • Long, J. D. (1998). Descriptive and inferential aspects of ordinal multiple regression. Multiple Linear Regression Viewpoints, 25, 45–53.
  • Long, J. D. (1999). A confidence interval for ordinal multiple regression weights. Psychological Methods, 4, 315–330.
  • Long, J. D. (2005). Omnibus hypothesis testing in dominance based ordinal multiple regression. Psychological Methods, 10, 329–351.
  • Long, J. D., & Cliff, N. (1997). Confidence intervals for Kendall's tau. British Journal of Mathematical and Statistical Psychology, 50, 31–41.
  • McCullagh, P. (1980). Regression models for ordinal data (with discussion). Journal of the Royal Statistical Society, Series B, 42, 109–142.
  • Mehta, C. R., & Patel, N. R. (1995). Exact logistic regression: Theory and examples. Statistics in Medicine, 14, 2143–2160.
  • Nelder, J., & Wedderburn, R. W. M. (1972). Generalized linear models. Journal of the Royal Statistical Society, Series A, 135, 370–384.
  • Peterson, B., & Harrell, F. (1990). Partial proportional odds models for ordinal response variables. Applied Statistics, 39, 205–217.
  • Rufibach, K. (2010). An active set algorithm to estimate parameters in generalized linear models with ordered predictors. Computational Statistics and Data Analysis, 54, 1442–1456. doi:10.1016/j.csda.2010.01.014
  • Smith, R. B. (1972). Neighborhood context and college plans: An ordinal path analysis. Social Forces, 51, 199–217.
  • Snell, E. J. (1964). A scaling procedure for ordered categorical data. Biometrics, 20, 592–607.
  • Somers, R. H. (1968). On the measurement of association. American Sociological Review, 33, 291–292.
  • Somers, R. H. (1974). Analysis of partial rank correlation measures based on the product-moment model: Part one. Social Forces, 53, 229–246.
  • Stokes, M., Davis, C., & Koch, G. (2000). Categorical data analysis using the SAS System (2nd ed.). Cary, NC: SAS Institute Inc.
  • Tolin, D. F., Woods, C. M., & Abramowitz, J. S. (2003). Relationship between obsessional beliefs and obsessive-compulsive symptoms. Cognitive Therapy and Research, 27, 657–669.
  • Walker, S. H., & Duncan, D. B. (1967). Estimation of the probability of an event as a function of several independent variables. Biometrika, 54, 167–178.
  • Walter, S. D., Feinstein, A. R., & Wells, C. K. (1987). Coding ordinal independent variables in multiple regression analysis. American Journal of Epidemiology, 125, 319–323.
  • Williams, D., & Grizzle, J. E. (1972). Analysis of contingency tables having ordered response categories. Journal of the American Statistical Association, 67, 55–63.
  • Woods, C. M. (2007). Confidence intervals for gamma-family measures of ordinal association. Psychological Methods, 12, 185–204.
  • Woods, C. M. (2009). Consistent small-sample variances for six gamma-family measures of ordinal association. Multivariate Behavioral Research, 44, 525–551.
  • Woodward, T. S., Hunter, M. A., & Kadlec, H. (2002). The comparative sensitivity of ordinal multiple regression and least squares regression to departures from interval scaling. British Journal of Mathematical and Statistical Psychology, 55, 305–315.

Supporting Information

The following supporting information may be found in the online edition of this article:

C++ executable, with example, for carrying out ordinal multiple regression

Filename                  Format   Size   Description
bmsp2046_sm_SuppMat.txt            44K    Supporting info item

Please note: Wiley-Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.