#### 4.2 Model fitting

We used each of the three mover–stayer random effects as the patient-specific random effect in the four-state multi-state model for hand joint damage (Figure 1). The mover–stayer gamma distribution used is the same as that used in Equation (2). The mover–stayer inverse Gaussian distribution used was chosen with parameter *μ* = 1 and the *ψ* parameter free to vary. We note that where *μ* = 1, the expectation of the mover–stayer inverse Gaussian becomes *E*(*U*_{k}) = 1 − *π*, and the variance is given by . These first two moments are identical to those for the mover–stayer gamma distribution (Equation (3)), with the change in parameter , although the shape of the distribution is not the same.
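The moment statement above can be checked with a generic mixture calculation. The following is a sketch that assumes only that the mover component has unit mean and some variance $\sigma^2$; the symbol $\sigma^2$ is a placeholder introduced here, and its form in terms of *θ* or *ψ* depends on the parameterisations in Equations (3) and (4).

```latex
\begin{align*}
E(U_k)   &= \pi \cdot 0 + (1-\pi)\cdot 1 = 1-\pi, \\
E(U_k^2) &= (1-\pi)\,(\sigma^2 + 1), \\
\operatorname{Var}(U_k) &= (1-\pi)(\sigma^2+1) - (1-\pi)^2
                         = (1-\pi)\,(\sigma^2 + \pi).
\end{align*}
```

Under this assumption, any two unit-mean mover distributions with the same variance give the same first two moments for *U*_{k}, whatever their shapes.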

The CP-PVF model chosen has probability density function given by Equation (8), with a point mass of size exp(−*ρ*) at zero, such that P(Stayer) = exp(−*ρ*). For the CP-PVF model, the expectation of the random effect is *ρ*/*ν*. Setting the CP-PVF distribution to have unit expectation would therefore have forced the undesirable constraint *ρ* = *ν*, resulting in a less flexible CP-PVF random effects distribution. Instead, we set the baseline state 1 to state 2 transition intensity, *λ*_{012}, equal to one. Hence, we can consider the estimate of *ρ*/*ν*, obtained using maximum likelihood estimation, to be an estimate of the state 1 to state 2 baseline transition intensity.

We present the results from fitting the mover–stayer gamma, mover–stayer inverse Gaussian and CP-PVF distributions as patient-specific random effects in the four-state model (Figure 1). We chose explanatory variables concerning the activity level at a joint to act on the transitions of the model by using the relationship shown in Equation (1). The three binary variables used are tenderness only (1 = tenderness only in the joint and 0 = otherwise), effusion with or without tenderness (1 = effusion with or without tenderness in the joint and 0 = otherwise) and past activity (1 = joint has been active in the past and 0 = otherwise). Present activity (tenderness or effusion) in the opposite joint of a pair to the joint undergoing a transition to damage is described by a binary variable such that 1 = joint active presently and 0 = joint not active presently.
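The two CP-PVF relationships stated above, P(Stayer) = exp(−*ρ*) and expectation *ρ*/*ν*, can be evaluated directly. The sketch below plugs in the rounded point estimates of *ρ* and *ν* reported in Table 1; the function and variable names are illustrative only.

```python
import math

# Rounded point estimates of the CP-PVF parameters as reported in Table 1.
rho_hat = 0.46
nu_hat = 176.43


def cp_pvf_stayer_probability(rho: float) -> float:
    """Size of the point mass at zero, P(Stayer) = exp(-rho)."""
    return math.exp(-rho)


def cp_pvf_mean(rho: float, nu: float) -> float:
    """Expectation of the CP-PVF random effect, rho / nu."""
    return rho / nu


p_stayer = cp_pvf_stayer_probability(rho_hat)
implied_lambda_012 = cp_pvf_mean(rho_hat, nu_hat)

print(f"P(Stayer) = {p_stayer:.3f}")
print(f"rho / nu  = {implied_lambda_012 * 100:.2f} x 10^-2")
```

To the reported precision, both values agree with the CP-PVF column of Table 1 (P(Stayer) of 0.631 and a state 1 to state 2 baseline intensity of 0.26 × 10^{−2}).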

#### 4.3 Results

Table 1 shows the results of fitting the multi-state model (Figure 1) to the Toronto PsA data, using the three mover–stayer random effects distributions described in Section 3. The table also shows results where a (non-mover-stayer) gamma distribution was used as a random effects distribution [17], to allow comparison of the explanatory variable effects. We note that for each mover–stayer distribution, the estimate of P(Stayer) was constrained to be the same for all individuals, although this constraint will be relaxed later. A ‘transitive’ joint denotes a joint that may experience a transition to damage at a particular location (i.e. a joint for which the transition rate is being modelled), and an ‘opposite’ joint refers to the same joint on the opposite hand at that location.

Table 1. Intensity ratio estimates together with associated 95% confidence intervals fitted to the model incorporating activity and damage to each individual joint pair of the left and right hands by using mover–stayer (M-S) random effects.

Intensity ratio, no previous damage in either joint

| Effect on transition to damage | Gamma | M-S gamma | M-S inverse Gaussian | CP-PVF |
| --- | --- | --- | --- | --- |
| Tenderness in the transitive joint | 2.76 (2.06, 3.70) | 2.75 (2.05, 3.69) | 2.76 (2.19, 3.46) | 2.74 (2.05, 3.66) |
| Effusion in the transitive joint | 4.47 (3.38, 5.90) | 4.46 (3.38, 5.88) | 4.51 (3.92, 5.19) | 4.32 (3.28, 5.68) |
| Activity in the opposite joint | 1.18 (0.90, 1.55) | 1.18 (0.90, 1.56) | 1.17 (0.94, 1.46) | 1.20 (0.92, 1.57) |
| Transitive joint active in the past | 2.14 (1.68, 2.71) | 2.14 (1.68, 2.73) | 2.14 (1.88, 2.42) | 2.07 (1.64, 2.62) |
| Opposite joint active in the past | 1.10 (0.86, 1.41) | 1.10 (0.86, 1.41) | 1.10 (0.97, 1.25) | 1.07 (0.84, 1.37) |

Intensity ratio, opposite joint damaged

| Effect on transition to damage | Gamma | M-S gamma | M-S inverse Gaussian | CP-PVF |
| --- | --- | --- | --- | --- |
| Tenderness in the transitive joint | 2.24 (1.51, 3.32) | 2.26 (1.53, 3.35) | 2.23 (1.91, 2.59) | 2.29 (1.55, 3.38) |
| Effusion in the transitive joint | 2.19 (1.40, 3.41) | 2.21 (1.42, 3.44) | 2.18 (1.87, 2.54) | 2.27 (1.46, 3.53) |
| Transitive joint active in the past | 1.37 (1.00, 1.86) | 1.37 (1.01, 1.86) | 1.38 (1.20, 1.59) | 1.35 (1.00, 1.84) |

Baseline intensities

| Parameter (× 10^{−2}) | Gamma | M-S gamma | M-S inverse Gaussian \* | CP-PVF \* |
| --- | --- | --- | --- | --- |
| *λ*_{012} | 0.28 (0.21, 0.36) | 0.29 (0.22, 0.38) | 0.28 (0.16, 0.49) | 0.26 (0.20, 0.34) |
| *λ*_{013} | 0.27 (0.21, 0.34) | 0.28 (0.21, 0.37) | 0.27 (0.15, 0.48) | 0.25 (0.19, 0.32) |
| *λ*_{024} | 2.15 (1.49, 3.10) | 2.27 (1.57, 3.30) | 2.11 (1.14, 3.89) | 1.95 (1.67, 3.29) |
| *λ*_{034} | 2.34 (1.58, 3.47) | 2.43 (1.61, 3.66) | 2.44 (1.09, 5.46) | 1.94 (1.54, 3.32) |

Random effect parameters

| Parameter | Gamma | M-S gamma | M-S inverse Gaussian | CP-PVF |
| --- | --- | --- | --- | --- |
| *θ* | 3.81 (2.98, 4.88) | 3.57 (2.65, 4.81) | | |
| *ψ* | | | 0.33 (0.29, 0.37) | |
| *ν* | | | | 176.43 (138.06, 225.48) |
| *ρ* | | | | 0.46 (0.39, 0.55) |
| Estimate of P(Stayer) | | 0.042 (0.001, 0.797) | 0.334 (0.255, 0.437) | 0.631 (0.579, 0.679) |

The results in Table 1 indicate that the effects of tenderness, effusion and activity (past and current) show a similar effects pattern across the different choices of mover–stayer distribution. The effects pattern shows that where neither joint at a particular location is damaged, tenderness, effusion and past activity in the transitive joint all have large, significant effects on the transition to damage. Conversely, both current activity (tenderness or effusion) and past activity in an opposite joint do not have significant effects on the transition to damage. Where an opposite joint at a particular location is already damaged, tenderness and effusion have significant effects on the transition to damage although we no longer observe a stronger effect for effusion in the transitive joint compared with the effect of tenderness only as was the case where neither joint at a particular location exhibited damage. Past activity has a marginally significant effect on the transition to damage, and the effect is generally smaller than that seen where neither joint in a pair is previously damaged. The baseline transition intensity estimates are generally similar across the different random effects models.

However, the three mover–stayer distributions have resulted in widely varying estimates of P(Stayer) = *π*, with the mover–stayer gamma, mover–stayer inverse Gaussian and CP-PVF distributions producing estimates (with associated 95% confidence intervals in parentheses) of 0.042 (0.001, 0.797), 0.334 (0.255, 0.437) and 0.631 (0.579, 0.679), respectively. The estimate of *π* close to zero together with the associated wide confidence interval seen for the mover–stayer gamma distribution may suggest that there is little evidence, or great uncertainty, to determine whether the population contains separate sub-populations of movers and stayers, from this model. To investigate the differences in these estimates further, we produced plots of the profile log-likelihood for various values of *π*, along with empirical Bayes estimates of the random effects for each distributional choice. Figure 2 shows plots of the profile log-likelihood calculated for each mover–stayer distribution for various values of P(Stayer).

When a statistical model is specified for a data set, the likelihood function is generally used to provide information for the estimation of model parameters. For the mover–stayer models considered, the proportion of patients who remain damage free is, in each case, governed by one model parameter only (*π* in the mover–stayer gamma and mover–stayer inverse Gaussian distributions and *ρ* in the CP-PVF distribution). An examination of the shape of the profile log-likelihood, in each case, for these parameters allows an assessment of the identifiability of the proportion of patients who remain always damage free. A single peak in the profile log-likelihood plot for these parameters would suggest that an estimate for P(Stayer) is identifiable. Conversely, a relatively flat profile log-likelihood shape would suggest difficulty in identifying a maximum likelihood estimate for P(Stayer).

Figures 2 and 3 show plots of the profile log-likelihood for various values of P(Stayer) in each of the three mover–stayer models that were considered. For the mover–stayer gamma model, the profile log-likelihood for *π* increases as *π* approaches zero. The estimate obtained from the numerical optimisation procedure clearly does not occur at the point where the likelihood is maximised, implying that the optimisation process did not succeed in obtaining a maximum likelihood estimate for *π*. In essence, the shape of the profile log-likelihood suggests that the maximum likelihood estimate for *π* would be effectively equal to zero. The profile log-likelihood for *π* for this model is clearly not concave, so it is unsurprising that the optimisation process did not attain the maximum likelihood estimate for *π*. However, the other model parameter estimates (and associated asymptotic confidence intervals) appear reasonable and identifiable when compared with those obtained using a non-mover-stayer gamma random effects distribution. Because the maximum likelihood estimate was not attained, it is not possible to obtain an asymptotic 95% confidence interval for *π* using standard methods. As an alternative, we calculated a 95% likelihood ratio interval for *π* as (0.00, 0.30). This interval represents the values of *π*_{0} for which we are unable to reject the null hypothesis *π* = *π*_{0} in a likelihood ratio test. Given these results, the addition of a mover–stayer component to the gamma distribution does not appear to be necessary.
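The likelihood ratio interval described above collects the values *π*_{0} whose profile log-likelihood lies within half the 95% chi-squared (1 d.f.) quantile of the maximum. The following is a minimal grid-based sketch of that idea. The function `toy_profile` is a hypothetical stand-in that merely increases as *π* approaches zero; it is not the profile log-likelihood from the fitted model, and the function names are illustrative.

```python
from typing import Callable, List, Tuple

CHI2_1_95 = 3.841458820694124  # 95% quantile of chi-squared with 1 d.f.


def likelihood_ratio_interval(
    profile_loglik: Callable[[float], float],
    grid: List[float],
    cutoff: float = CHI2_1_95,
) -> Tuple[float, float]:
    """Smallest and largest grid values pi_0 that satisfy
    2 * (max profile log-lik on the grid - profile log-lik at pi_0) <= cutoff.

    Assumes the accepted values form a single connected interval.
    """
    values = [profile_loglik(p) for p in grid]
    best = max(values)
    accepted = [p for p, v in zip(grid, values) if 2.0 * (best - v) <= cutoff]
    return min(accepted), max(accepted)


def toy_profile(pi: float) -> float:
    """Hypothetical profile log-likelihood, maximised at pi = 0."""
    return -25.0 * pi ** 2


grid = [i / 1000 for i in range(1000)]
lower, upper = likelihood_ratio_interval(toy_profile, grid)
print(lower, upper)
```

The maximum is taken over the grid, so the result is only as accurate as the grid spacing. When the profile is maximised on the boundary, as here, the lower limit of the interval is the boundary itself.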

For the mover–stayer inverse Gaussian model, a clear peak is seen in the plot of the profile log-likelihood for *π* (Figure 2). This suggests that the maximum likelihood estimate for *π* is identifiable for this model. We calculated this estimate for *π* as 0.334 with 95% Wald confidence interval (0.255, 0.437). This confidence interval does not lie close to zero, thereby implying that there is some evidence to support a mover–stayer scenario with regard to hand joint damage. We performed a generalised likelihood ratio test between this model and that where a non-mover-stayer inverse Gaussian distribution was used for the random effects. Because this is a test of the null hypothesis that *π* = 0, which lies on the boundary of the parameter space, we compared the test statistic with a 50:50 mixture of a *χ*^{2}_{1} distribution and a point mass at zero. This yielded a test statistic of 6.31 with a corresponding *p*-value of 0.006, suggesting that the inclusion of a term to account for P(Stayer) is necessary for this model. Consequently, this provides evidence to suggest that a mover–stayer scenario may exist within these data, with regard to clinical damage in the hand joints.
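The *p*-value quoted above can be reproduced from the test statistic alone. The sketch below assumes the reference distribution is a 50:50 mixture of a point mass at zero and a chi-squared distribution with one degree of freedom, so the *p*-value for a positive statistic is half the usual chi-squared tail probability. The function names are illustrative.

```python
import math


def chi2_1_survival(x: float) -> float:
    """P(X > x) for X ~ chi-squared with 1 degree of freedom, x >= 0."""
    return math.erfc(math.sqrt(x / 2.0))


def boundary_lrt_p_value(statistic: float) -> float:
    """p-value under a 50:50 mixture of a point mass at zero and chi-squared(1)."""
    if statistic <= 0.0:
        return 1.0
    return 0.5 * chi2_1_survival(statistic)


p_value = boundary_lrt_p_value(6.31)
print(f"{p_value:.3f}")
```

For a statistic of 6.31, this gives 0.006, matching the reported value.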

The plot of the profile log-likelihood for the CP-PVF distribution shows an obvious peak, and the use of this distribution for the random effects yielded a maximum likelihood estimate of P(Stayer) = 0.631, with a relatively narrow 95% confidence interval of (0.579, 0.679). This suggests that the estimate for P(Stayer) is identifiable from these data under this distributional assumption. The 95% confidence interval does not lie close to zero, and hence, this model provides evidence to support the existence of a mover–stayer scenario with regard to hand joint damage. The CP-PVF distribution does not represent the same type of mixture distribution as the mover–stayer gamma and mover–stayer inverse Gaussian distributions. This is because the mover–stayer gamma and mover–stayer inverse Gaussian distributions had a separate parameter (*π*), which governed the proportion of stayers in the model only. The CP-PVF model contains the parameter *ρ* on which the proportion of stayers and the continuous part of the random effects distribution for the movers both depend.

We considered next empirical Bayes estimates of the random effect, *U*_{k}, for each patient under each of the chosen random effects distributions (including the original gamma distribution). Suppose that *ϕ* denotes the set of parameters of the random effects distribution. Then the empirical Bayes estimates of *u*_{k} are given by

Here, denotes the predictive distribution of *u*_{k}, given the parameter estimates and . That is,

where is the model likelihood contribution from the kth patient, having explanatory variables at times **t**_{k}. The random effects distributions considered have different expected values, and so estimates of *U*_{k} for each distribution are not directly comparable. Hence, to compare the random effect distributions, we chose to compare the estimated state 1 to state 2 transition intensities, under the assumption that all explanatory variables have a value of zero, for each patient. The state 1 to state 2 intensity estimate is given by

where *û*_{k} is the empirical Bayes estimate of the random effect. In Figure 4, we show histograms of the estimated state 1 to state 2 baseline intensities, conditional on the patient being classed as a mover. Figure 5 shows plots of the cumulative distribution functions (CDFs) of the random effects for each distribution type, with parameters set to the estimates in Table 1. For the gamma and mover–stayer gamma random effects distributions, the histograms of empirical Bayes estimates of the state 1 to state 2 transition intensities (Figure 4) show that most random effect values lie near zero. Conversely, the histogram for the CP-PVF distribution shows estimated values for the state 1 to state 2 transition intensities that are less skewed towards zero. The CDFs (Figure 5) are very similar in shape for the mover–stayer gamma and non-mover-stayer gamma models. The choice of a gamma distribution for the random effects allows the random effects distribution to have a substantial mass near zero, which may make the identification of a sub-population of stayers difficult. This idea is supported by the shape of the profile log-likelihood (Figure 2) and the confidence interval for the estimate of *π* seen when a mover–stayer gamma random effects distribution was assumed. The inverse Gaussian CDF also has a similar shape, although the probability density function for the continuous part of this distribution (Equation (4)) has a limit of zero as *u* → 0, which may be why the estimation of P(Stayer) is less problematic for this random effects distribution. In contrast, the CP-PVF CDF increases at a slower rate than that for the other distributions, despite the fact that the continuous part of this CDF begins at P(Stayer) = 0.63. This suggests that the random effect values for the population of movers are less skewed towards zero for this random effects distribution, which is consistent with the shape of the empirical Bayes estimates histogram for the state 1 to state 2 baseline intensities in Figure 4.
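For reference, a standard form of the empirical Bayes calculation described above is sketched here. It is not a transcription of the authors' displayed equations. It assumes the estimate is the mean of the predictive distribution (a predictive mode is another common choice) and that the random effect acts multiplicatively on the baseline intensity. The symbols $\hat{\boldsymbol\beta}$ (fitted intensity parameters), $\mathbf{z}_k$ (explanatory variables), $L_k$ (likelihood contribution of patient $k$) and $G$ (random effects distribution function) are notation assumed for illustration.

```latex
\begin{align*}
\hat{u}_k
  &= \frac{\displaystyle\int u \, L_k(\hat{\boldsymbol\beta} \mid u, \mathbf{z}_k, \mathbf{t}_k)\, \mathrm{d}G(u; \hat{\phi})}
          {\displaystyle\int L_k(\hat{\boldsymbol\beta} \mid u, \mathbf{z}_k, \mathbf{t}_k)\, \mathrm{d}G(u; \hat{\phi})}, \\[6pt]
\hat{\lambda}_{12,k}
  &= \hat{\lambda}_{012}\, \hat{u}_k
  \qquad \text{(all explanatory variables set to zero).}
\end{align*}
```

For the mover–stayer distributions, the integrals with respect to $G$ include the point mass at $u = 0$ as well as the continuous part for the movers.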

The probability density function for the mover–stayer inverse Gaussian random effects distribution clearly does not asymptote towards ∞ as *u* → 0. However, the histogram of the empirical Bayes estimates of the state 1 to state 2 transition intensities suggests that values of the random effects tend to lie near zero, although less so than where the mover–stayer gamma random effect was used. From the estimation process, we obtained an estimate of P(Stayer) with a reasonable confidence interval, suggesting that the problem of *π* not being estimable is not present when a mover–stayer inverse Gaussian distribution is used for the random effects. The continuous part of the estimated CP-PVF probability density function may appear to asymptote to infinity as *u* → 0. However, Equation (7) indicates that *f*(*u*) asymptotes to *ρν* exp(−*ρ*) as *u* → 0. This is verified by Figure 6, which shows the probability density function for the mover portions of both the mover–stayer gamma and CP-PVF distributions at values of *u* that lie very close to zero, and is proved formally in Appendix B. The plots show that the CP-PVF density asymptotes towards *f*(*u*) = 51.23 as *u* → 0, whereas that for the mover–stayer gamma distribution takes very large values and is increasing as *u* → 0. This, together with both the shape of the profile log-likelihood for P(Stayer) and the relatively narrow 95% confidence interval for the estimate of P(Stayer), implies that estimation of the proportion of damage-free patients under the CP-PVF random effects specification is less problematic than under the other mover–stayer random effects distributions considered. As mentioned previously, the dependence of both P(Stayer) and the continuous part of the CP-PVF distribution on a common parameter (*ρ*) may contribute to this less problematic estimation of model parameters, under the assumption of a mover–stayer scenario within the data set.
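The finite limit of 51.23 quoted above can be checked numerically. The sketch below assumes the CP-PVF random effect is a sum of a Poisson(*ρ*) number of independent exponential jumps with rate *ν*. This is an assumption chosen because it reproduces the stated point mass exp(−*ρ*) and expectation *ρ*/*ν*; Equation (7) itself is not reproduced here. Under that assumption, the continuous part of the density tends to *ρν* exp(−*ρ*) as *u* → 0. The function name is illustrative.

```python
import math


def cp_density_exponential_jumps(u: float, rho: float, nu: float, terms: int = 60) -> float:
    """Continuous part of the density of a compound Poisson sum at u > 0.

    Assumes N ~ Poisson(rho) jumps, each Exponential(rate=nu), so that the
    sum of n jumps is Gamma(shape=n, rate=nu). The series is truncated
    after `terms` terms.
    """
    total = 0.0
    for n in range(1, terms + 1):
        log_poisson_weight = -rho + n * math.log(rho) - math.lgamma(n + 1)
        log_gamma_density = (
            n * math.log(nu) + (n - 1) * math.log(u) - nu * u - math.lgamma(n)
        )
        total += math.exp(log_poisson_weight + log_gamma_density)
    return total


# Rounded point estimates from Table 1.
rho_hat, nu_hat = 0.46, 176.43
limit = rho_hat * nu_hat * math.exp(-rho_hat)

print(f"limit as u -> 0: {limit:.2f}")
for u in (1e-3, 1e-5, 1e-7):
    print(u, cp_density_exponential_jumps(u, rho_hat, nu_hat))
```

With the rounded estimates from Table 1, the limit evaluates to 51.23, and the series values approach it as *u* decreases.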

It is important to note that the time origin used for all patients was the time at clinic entry. Clearly, some patients may have had the disease longer than others at this point, which may be informative about the proportion of patients who are stayers. We considered only those patients who had no clinical damage in any joint at clinic entry, so that all patients were either stayers or movers who had not yet progressed, in an effort to make patients more comparable at this point. We recognise that additional controls, such as explanatory variables measuring the time since diagnosis or other suitable measures of disease duration, could be included if desired.

#### 4.4 Choosing the most appropriate model

The application of each of the three random effects distributions to these sets of four-state multi-state models resulted in similar estimates of the baseline transition intensities and explanatory variable effects. This is encouraging from the viewpoint of making robust conclusions with regard to the relationship between the behaviour of the activity and damage processes for these patients in continuous time. In short, we draw the same conclusions about this relationship regardless of the patient-level random effects distribution. The main aim in fitting the mixture models was to assess whether a mover–stayer scenario with regard to hand joint damage exists within these data. Two of the mover–stayer random effects models considered (the mover–stayer inverse Gaussian and CP-PVF) provided some evidence to suggest that a mover–stayer scenario exists with regard to hand joint damage. Conversely, the mover–stayer gamma random effects model provided insufficient evidence in support of the existence of a mover–stayer scenario. It is not possible to assess directly which of these three models represents the best description of the hand joint damage process in the Toronto PsA study, and we cannot know how close estimates of P(Stayer) lie to the (unknown) true value of P(Stayer). Nevertheless, it seems intuitive to examine the features of these fitted models in an effort to make a pragmatic assessment as to which models are the most appropriate for these data.

When a non-mover-stayer gamma model was fitted to these data [17], the shape of the density function was such that the density increased towards ∞ as *u* → 0. This could suggest that, under the assumption that there is no mover–stayer scenario, some members of the population would generally have small random effect values and thus a low rate of progression in the model. Alternatively, it may suggest that a mover–stayer scenario could exist but that the presence of a group of stayers has drawn the shape of the random effect distribution towards zero. When the gamma distribution was extended to a mover–stayer gamma distribution, the CDF shape for the movers was almost identical to that for the non-mover-stayer gamma distribution (Figure 5). This implies that the fitting process for the mover–stayer gamma model results in a large proportion of probability density being concentrated on those movers who progress very slowly. This suggests that it is difficult to distinguish between true stayers and ‘slow-rate movers’ for this data set, which may explain why it was difficult to obtain an estimate of *π* for this model. The other mover–stayer densities that we considered do not asymptote to ∞ as *u* → 0, which may help to limit the occurrence of problems with regard to identifying the proportion of stayers in this data set.
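The contrast in behaviour near zero can be illustrated numerically. The sketch below compares a unit-mean gamma density with shape below one against a unit-mean inverse Gaussian density. The parameter values (`shape=0.3`, `lam=1.0`) are hypothetical and chosen only for illustration; they are not the fitted values, and the inverse Gaussian shape parameter `lam` need not correspond to the *ψ* parameterisation used in Equation (4).

```python
import math


def gamma_pdf_unit_mean(u: float, shape: float) -> float:
    """Gamma density with mean one (rate set equal to shape), for u > 0."""
    rate = shape
    log_pdf = (
        shape * math.log(rate)
        + (shape - 1.0) * math.log(u)
        - rate * u
        - math.lgamma(shape)
    )
    return math.exp(log_pdf)


def inverse_gaussian_pdf_unit_mean(u: float, lam: float) -> float:
    """Inverse Gaussian density with mean one and shape parameter lam, for u > 0."""
    scale = math.sqrt(lam / (2.0 * math.pi * u ** 3))
    return scale * math.exp(-lam * (u - 1.0) ** 2 / (2.0 * u))


points = (1e-1, 1e-2, 1e-3)
gamma_values = [gamma_pdf_unit_mean(u, shape=0.3) for u in points]
ig_values = [inverse_gaussian_pdf_unit_mean(u, lam=1.0) for u in points]

for u, g, ig in zip(points, gamma_values, ig_values):
    print(f"u={u:g}  gamma={g:.4g}  inverse Gaussian={ig:.4g}")
```

As *u* decreases, the gamma density grows without bound while the inverse Gaussian density falls towards zero, which is the qualitative difference the paragraph above relies on.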

The mover–stayer inverse Gaussian model provided evidence to suggest that a mover–stayer scenario with regard to hand joint damage exists within these data. We obtained this evidence by using a *p*-value from a generalised likelihood ratio test between the non-mover-stayer and mover–stayer inverse Gaussian random effects models and by noting that the asymptotic 95% confidence interval for P(Stayer) indicated a substantial departure from zero. The CP-PVF model also provided evidence in support of the existence of a mover–stayer scenario. Although a generalised likelihood ratio test against a non-mover-stayer model was not possible in this case, the substantial departure of the 95% confidence interval for P(Stayer) from zero implies that a mover–stayer scenario exists. For both models, an examination of the profile log-likelihood for P(Stayer) suggested that the maximum likelihood estimate for P(Stayer) is identifiable, and the shapes of the estimated random effects distributions for movers indicate that both of these models are reasonable for these data. Hence, there appears to be a mover–stayer scenario with regard to hand joint damage. However, although these two random effects models suggest a mover–stayer scenario, they provide different estimates of P(Stayer). There is no definitive method of deciding which of these two models best represents these data. The mover–stayer inverse Gaussian random effects distribution produced a smaller estimate for P(Stayer) with a larger standard error than that where the CP-PVF random effects distribution was used. In addition, we note that the estimate of P(Stayer) of 0.631 obtained using the CP-PVF model lies closer to the empirical estimate for P(Stayer) of 0.71 than the corresponding estimate of 0.334 obtained where a mover–stayer inverse Gaussian random effects distribution was employed. 
The model where a CP-PVF random effects distribution was used had the largest maximised log-likelihood of the mover–stayer models considered, further suggesting that the CP-PVF distribution may be desirable when considering a mover–stayer random effects distribution for these data. The CP-PVF distribution also contains one fewer parameter than each of the other mover–stayer random effects distributions considered. Hence, if an information criterion, such as the AIC [25], were to be considered for model selection, then this would provide further evidence in support of the CP-PVF random effects model.
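The AIC argument above follows from the definition AIC = 2*k* − 2 log *L*, where *k* is the number of parameters and log *L* is the maximised log-likelihood. The sketch below uses hypothetical log-likelihood values and parameter counts, since the fitted values are not given in this section.

```python
def aic(max_loglik: float, n_params: int) -> float:
    """Akaike information criterion, 2k - 2 log L (smaller is preferred)."""
    return 2.0 * n_params - 2.0 * max_loglik


# Hypothetical values for illustration only; not the fitted values.
candidates = {
    "model_a": {"max_loglik": -1000.0, "n_params": 15},
    "model_b": {"max_loglik": -1000.0, "n_params": 14},
}

scores = {name: aic(**spec) for name, spec in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

With equal maximised log-likelihoods, the model with one fewer parameter has an AIC two units lower, and a larger maximised log-likelihood would lower its AIC further.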

Several authors (e.g. [26-28]) have considered zero-inflated Poisson models for count data where there may be a large number of zero counts and devised score tests to determine whether a zero-inflated Poisson model is preferable to a standard Poisson model. This problem is analogous to the problem of a large mass at zero in a mixture model involving a continuous distribution. In addition, Aguirre-Hernández and Farewell [29] considered negative binomial models for the increase in the number of damaged joints exhibited by a patient in the Toronto PsA data set between clinic visits, which allowed for a sub-population of patients who would not exhibit any joint damage. The authors compared a standard negative binomial model with a ‘zero-inflated’ negative binomial model with a larger mass at zero to represent the possibility of a significant sub-population of stayers. They concluded that there was no significant difference in the model fit for the negative binomial model and the zero-inflated negative binomial model. We note that the negative binomial regression models in their study were derived under the assumption of independent gamma random effects acting on the average rate of joint damage between each clinic visit. Solis-Trapala and Farewell [30] considered a similar zero-inflated negative binomial model for the change in joint count for the Toronto PsA data, only that in this model, patient-level gamma random effects were employed rather than observation-level random effects. Neither of these works found convincing evidence to suggest that a zero-inflated random effects model offered a significant improvement over a standard random effects model. The zero-inflated negative binomial models used corresponded, in each case, to gamma random effects being added to Poisson count models. In light of our work, this lack of evidence to establish a mover–stayer scenario may have been because a gamma random effects distribution was implicitly employed.

The density functions for the gamma and inverse Gaussian mover–stayer models both assume that the parameter representing P(Stayer), *π*, is distinct from the parameters in the distribution function for the movers. This is not the case in the CP-PVF model, where the parameter *ρ* governs both the value of P(Stayer) and, to some extent, the shape of the continuous part of the CP-PVF distribution. This feature of the CP-PVF distribution may help to alleviate problems in estimating the value of P(Stayer) efficiently. We note also that the CP-PVF distribution is the most general of the three random effects distributions considered for these data, having a flexible form that arises as a sum of independent gamma random variables. It is therefore possible that the likelihood function from this more general model is more informative in this data set.