Trivariate mover‐stayer counting process models for investigating joint damage in psoriatic arthritis

In psoriatic arthritis, many patients do not develop permanent joint damage even after a prolonged follow‐up. This has led several authors to consider the possibility of a subpopulation of stayers (those who do not have the propensity to experience the event of interest), as opposed to assuming the entire population consist of movers (those who have the propensity to experience the event of interest). In addition, it is recognised that the damaged joints process may act very differently across different joint areas, particularly the hands, feet and large joints. From a clinical perspective, interest lies in identifying possible relationships between the damaged joints processes in these joint areas for the movers and estimating the proportion of stayers in these joint areas, if they exist. For this purpose, this paper proposes a novel trivariate mover‐stayer model consisting of mover‐stayer truncated negative binomial margins, and patient‐level dynamic covariates and random effects in the models for the movers and stayers, respectively. The model is then extended to have a two‐level mover‐stayer structure for its margins so that the nature of the stayer property can be investigated. A particularly attractive feature of the proposed models is that only an optimisation routine is required in their model fitting procedures. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.


Introduction
Psoriatic arthritis (PsA) is an inflammatory arthritis associated with the skin condition psoriasis. A basic measure of disease progression in PsA is the attained number of permanently damaged joints. Since its inception in 1978, the University of Toronto PsA clinic has established the largest and most comprehensively studied cohort of PsA patients in the world. A notable feature of this cohort is the large number of patients whose disease process does not progress to permanent joint damage, even after a prolonged follow-up. Thus, when characterising the rate of accumulating damaged joints, there is generally a greater amount of zeros than is predicted from a standard count distribution. Several authors have considered mover-stayer damaged joints counting process models in order to understand the damage process at the total joint level (all joints, Aguirre-Hernández and Farewell [1]) or with regard to the joints in the hands (Cook et al. [2], Solis-Trapala and Farewell [3], and O'Keeffe et al. [4]). These models estimate the proportion of stayers (those who do not have the propensity to experience the event of interest), through a binary component, and describe the counting process of the movers (those who have the propensity to experience the event of interest) with regard to the occurrence of clinical joint damage over time. Sutradhar and Cook [5] provided an extension by developing a bivariate mover-stayer damaged joints counting process model for the situation where two damaged joints counting processes are of interest. A multinomial distribution was specified for the joint binary component, whilst a random effect was used to correlate the rates for the movers. Their model was clinically motivated by distinguishing between clinical and radiological damage at the total joint level. Note that these works have focused on two or less damaged joints counting processes with one stayer population being associated with each process.
In the present setting, the clinical aims are to identify possible relationships between accumulating damaged joints in the hands, feet and large joints for the movers, which may be asymmetric, and to estimate the stayer proportions in each joint area. Hence, three processes are now of interest. This would provide further understanding of the damage joints counting process in each joint area (hands, feet and large joints), which are expected to be different, and contribute new understanding of the possibly global nature of damage. In addition, it would also be of interest to investigate the nature of the stayer property, particularly to distinguish between patients who are inherent stayers (true stayers) or attain the stayer property through management/treatment strategies employed by the clinic (clinic-induced stayers). These clinical considerations motivate the need for the development of new methodology that allows the relationship between three damaged joints counting processes to be investigated whilst simultaneously allowing for the possibility that there could be two stayer populations associated with each process.
In the count data setting, mover-stayer models are referred to as zero-inflated models, and these are commonly used when count data sets display a large number of zeros. The zero-inflated Poisson (ZIP) model was first considered by Lambert [6]. Böhning [7] and Ridout et al. [8] provide a comprehensive review of this methodology, whilst Böhning et al. [9] demonstrate various applications to the public health and social science settings. When the study design also results in clustered observations, a popular modification is to incorporate cluster-specific random effects. These models can however be computationally intensive, primarily because of the required integration over the random effects. Hall [10] used the expectation-maximization (EM) algorithm to fit a ZIP model which included a random effect in the Poisson and not the binary component of the model. The ZIP model of Hur et al. [11] contained distinct random effects in both components of the model, although these authors chose to use numerical quadrature and a Newton-Ralphson solution to simultaneously estimate the parameters of their model. They state the advantages of this routine over the EM algorithm are easily obtainable standard errors and a quicker convergence rate. Lee et al. [12] and recently, Morgan et al. [13] extended these models to allow for a further level of clustering. This was performed by incorporating two separate random effects into each component. The former utilised the EM algorithm, whilst the latter took a Bayesian approach in the model fitting procedure. We note that marginal models have been proposed to handle clustering. These models have been constructed through generalised estimating equations that allow a working correlation matrix to be incorporated into the model fitting procedure, see Dobbie and Welsch [14] for an example.
Motivated by the aforementioned clinical considerations, we develop a novel trivariate model with mover-stayer truncated negative binomial margins, as a flexible alternative to ZIP models, and incorporate patient-level dynamic covariates and random effects in the models for the movers and stayers, respectively. The dynamic covariates allow asymmetric relationships to be identified, whilst the random effects provide information across processes to estimate the stayer proportions. We then extend this model to have a two-level mover-stayer structure for its margins so that inherent and clinic-induced stayers can be investigated for each marginal process. Patient-level random effects are again incorporated in each stayer/binary component. In contrast to the literature where the logit link function and the distributional assumption of normal random effects are usually specified for the binary component, we consider the complementary log-log link function, and this allows a closed form marginal likelihood to be obtained if the distributional assumption of the random effects is chosen to have a closed form Laplace transform. Thus, computationally intensive techniques such as numerical integration, EM algorithm and MCMC are not required in the model fitting procedure. This is particularly useful for the implementation of the extended model because it contains two patient-level random effects that must be integrated out for each patient's likelihood contribution.
The next section introduces the PsA data on which this analysis is undertaken.  (10 in each foot) and 16 large joints. The large joints consist of the left and right jaw, sterno-clavicular, shoulder, elbow, wrist, hip, knee and ankle. Figure 1 displays the location of these joints. Of the 1194 patients, 997 had more than one clinic visit. At clinic entry, the mean age was 37 years, with standard deviation of 13 years and 4 months, whilst the mean disease duration was 7 years with standard deviation of 8 years and 3 months. The mean follow-up time for those who had more than a single clinic visit was 9 years and 5 months, with interquartile range of 11 years and 1 month, and their mean and median inter-visit times were 10 and 6 months, with standard deviation of 1 year and 2 months.
As mentioned, a notable feature of this data set is the large proportion of patients who did not develop damaged joints in these joint areas. There were 698 (58%), 683 (57%) and 801 (67%) patients who remained damage free in the hands, feet and large joints, respectively. Figure 2 provides more details by displaying the frequencies of patients who remained damage free in the different combination of joint areas. For example, of the 801 patients who remained damage free in the large joint locations, 450 patients did not develop any damage in the hand or foot joints, 122 did not develop any damage in the hand joints but did so in the foot joints, 101 did not develop any damage in the foot joints but did so in the hand joints, and 128 developed damage in both the hand and foot joints.
In the next section, a trivariate mover-stayer damaged joints counting process model is developed for the hands, feet and the large joints.

Model
Let N k ij be random variables representing the total number of damaged joints that patient i has accumulated in joint area k up to the time of the jth clinic visit t ij . Then, D k ij = N k ij+1 − N k ij represents the number of damaged joints developed in joint area k between t ij and t ij+1 with j = 0, … , m i − 1 and k = h, f and l denoting the hands, feet and large joints, respectively. Here, m i represents the number of clinic visits for patient i. Suppose that D k ij is independent of conditional on the inclusion of the dynamic covariates N h ij , N f ij and N l ij . A regression model can be specified by allowing D k ij to have a negative binomial distribution with mean Λ k ij and dispersion parameter k ⩾ 0 that is truncated to account for the number of joints that have the propensity to become damaged; where T k = 28, 20 and 16 for k = h, f and l, respectively. Here, T k specifies the number of joints in the kth joint area, where k 0 is a constant baseline intensity, ) are covariates evaluated at t ij and ( are regression coefficients. To reflect the possibility that a subpopulation of stayers may exist, a partially observable binary variable C k i1 , where C k i1 = 0 if patient i is a stayer in joint area k and C k i1 = 1 otherwise, can be incorporated into the model. These variables are partially observable because they are known for a particular joint area of a patient if damage has occurred, that is C k i1 = 1, otherwise, they are unknown. The probability that patient i is a stayer in joint area k conditional on a patient-level random effect U i = u i can then be estimated as where z k i⋆ and k 1 are column vectors of time-invariant covariates and regression coefficients, respectively. The random effect U i , which is incorporated at the patient-level, reflects the characteristic that the stayer probabilities across locations are likely to be more similar within patients. Furthermore, if the stayer property is inherent (immune to damage before and after clinic entry), the information obtained between arthritis onset and clinic entry will contribute towards estimation of the stayer proportions, and thus, this information must be accounted for in the analysis. Let D k i0 * be the number of damaged joints patient i developed in area k in this period. A model for D k i0 * given that no damaged joints at arthritis onset is possible, that is N k i0 * = 0, can again be specified as a negative binomial distribution with mean Λ k i0 = ( t i0 − t i0 * ) k0 0 and dispersion parameter k0 and having again the relevant truncation. Here, k0 0 is a constant baseline intensity, and t i0 * is the time of arthritis onset for patient i. That is Note that inferences from these models are not of primary interest; their use is to provide information towards the estimation of the stayer proportions. Under this formulation, the conditional likelihood contribution from patient i in joint area k (given the patient-level random effect U i = u i and the dynamic covariates, the total number of damaged joints in each joint area , is firstly, if no damaged joints are observed; and secondly, if n k i0 , … , n k im i damaged joints are observed at t i0 , … , t im i and n im i ≠ 0; The likelihood contribution L i (Θ) (where all unknown parameters are contained in the vector Θ) from patient i can then be derived by assuming that conditional on the random effect U i and dynamic covariates N h i , N f i and N l i , the mover-stayer damaged joints counting processes within an individual are independent. Hence, The conditional likelihood is then computed by taking the product of all conditional likelihood contributions from each patient. The next subsection provides details on how the integration over the random effects can be computed analytically.

Computing the likelihood analytically
Firstly, the conditional likelihood contribution from patient i in joint area k can be rearranged as where the form of the third line can be seen by comparing terms when c k i * = 1 and 0, respectively. The conditional likelihood contribution from patient i is then .
From this parametrisation, it is evident that the marginal likelihood contribution from patient i can be computed analytically if the distributional choice for U i has a closed form Laplace transform. In this paper, as suggested by Conaway [15], the gamma distribution with unit mean and variance is considered for U i . Its Laplace transform is given by Under this distributional assumption, the marginal form for k i1 (integrating out u i ) is then This function, which is similar to the one used by Pregibon [16], contains the logit link when = 1 and the complementary log-log link as → 0. It can therefore be used to validate the suitability of these particular link functions as well as to offer greater flexibility.

Application
The model described in Section 3, which will be denoted the full model, was fitted to the 1194 PsA patients described in Section 2. In addition to the dynamic covariates, the attained number of damaged joints in each location; the current number of active joints in the hands, feet and large joints were also considered as covariates in the truncated negative binomial components. In a preliminary analysis, gender, age and arthritis duration were demonstrated to be not statistically and significantly related to the risk of damaged joints (results not shown). For the binary components, intercept only models were considered so that the marginal stayer proportions could be more simply investigated. Parameter estimation for this and all other subsequent models in this paper were achieved using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) [17] optimisation algorithm in the statistical package R [18]. Other methods might be used and be more appropriate in some situations. The reported confidence intervals for all parameters are 95% Wald intervals obtained from evaluating and then inverting the observed information matrix at the maximum likelihood estimates. Table I displays the results from the full model. Of primary interest is the relationship between damage progression in the hand and foot joints. The table indicates that the number of damaged hand joints is strongly and positively associated with damage progression in the foot joints, and that the reverse relationship (number of damaged foot joints on damage progression in the hand joints) is not statistically significant. This asymmetric relationship could be of clinical interest, especially as the strength of associations of the number of damaged hand joints (0.083) and active foot joints (0.1) when modelling damage progression in the foot joints are comparable in magnitude. Additionally, the attained number of damaged large joints demonstrates a significant positive association with damage foot joints progression, whilst little association with damage hand joints progression is seen. In contrast, the number of active joints only seems to have a local effect. The significant associations are only really seen within joint areas for these variables, except for the possibly counter intuitive negative association between the number of active foot joints and damage progression in the large joints. Because of the large joints category being composed of many different types of joints, it is not possible to draw any specific conclusions regarding this counter intuitive result. It is, however, interesting to note the positive association between the attained number of damaged hand joints and damage progression in the large joints. These results might prompt further investigation into the large joints category, and further highlights the importance of understanding the damaged hand joints process. Table I. Parameter estimates related to associations with damaged joint counts, and stayer probability estimates obtained from fitting the full model to 1194 psoriatic arthritis patients. The P(Stayer) estimates were calculated as k * 1 (̂1,̂) for k = h, f and l. Standard errors for these quantities were calculated using the delta method. In all three joint areas, there is strong evidence of a subpopulation of stayers as the confidence interval for each of the relevant estimated probabilities are far from zero. The marginal stayer proportion estimates are 0.37 (0.33, 0.41), 0.31 (0.26, 0.35) and 0.29 (0.22, 0.35) for the hands, feet and large joints, respectively. Table I suggests, on average, the large joints experience the slowest damage progression rate. This may explain why the large joints category has the smallest estimated stayer proportion even though most patients remained damage free in this area. There is also evidence from Table I that a gamma distributed patient-level random effect is required, and that the link function for the marginal binary components is neither the logit or complementary log-log functional form (̂=3.9 (2.76, 5.51)).
The inverse Gaussian distribution was also considered for U i , which is comparable to the normality assumption that is commonly employed in the literature. The resulting parameter estimates and corresponding confidence intervals (the regression coefficients in the truncated negative binomial components and the true stayer proportion estimates in each joint area together with their corresponding confidence intervals) were similar to those displayed in Table I. Comparison of likelihood values suggested a preference towards a gamma distributional assumption for U i , hence demonstrating its usefulness in this context.
For comparative purposes, trivariate models with truncated negative binomial ( k 1 ∶= 0 ∀k, TNB model) and mover-stayer truncated Poisson ( k = k 0 ∶= 0 ∀k, TM-SP model) margins were also fitted. Tables II and III displays the results from the TNB and TM-SP models, respectively. Rather reassuringly, the TNB model demonstrates similar results to the full model regarding the direction and significance of associations between covariates and outcomes. However, the effect sizes corresponding to the significant associations from the TNB model are generally inflated in comparison with the full model. The estimated values of k and k 0 for each k are also seen to be larger in the TNB model, which is unsurprising because these parameters now also reflect the variability from the existence of stayers. In contrast, the results from the TM-SP model are less similar to the full model. Specifically, the number of active large joints are now seen to be strongly and positively associated with damage progression in all three joint areas and the number of damaged large joints is strongly and negatively associated with damaged large joints progression. The strong association between the number of damaged hand joints and damage progression in all three joint areas has also been greatly attenuated, whilst the stayer proportion estimates are greatly inflated. It is also interesting to note that̂takes a smaller value in the TM-SP model (in comparison with the full model), which suggest that there is less correlation between the distribution of stayers in the different joint areas when heterogeneity is not taken into account in the damage progression models, after adjusting for covariates. The log-likelihood values strongly indicate, even after accounting for the 4 less parameters from the TNB model and the 6 less parameters from the TM-SP model, that the full model is to be preferred.  Table III. Parameter estimates related to associations with damaged joint counts and stayer probability estimates obtained from fitting the TM-SP model to 1194 psoriatic arthritis patients. The P(Stayer) estimates were calculated as k * 1 (̂1,̂) for k = h, f and l. Standard errors for these quantities were calculated using the delta method.

Simulation study
This section evaluates the empirical performance of the full model through simulation studies. The simulation studies consist of simulating data from the full model and then refitting the same model structure to the simulated data. This will determine if the parameters of interest can be reasonably estimated. Specifically if the regression coefficients corresponding to the dynamic covariates, that is the relationship between the damaged joints counting processes, and the stayer proportions can be well estimated.
In the simulation studies, 400 data sets were generated from the full model with each data set containing patients with 18 clinic visits and 6 months inter-visit time, thus reflecting the average patient in the PsA data. For simplicity, all patients are assumed to have entered the clinic with no damaged joints (in any joint area), and therefore, the models for D k i0 * are not applicable. Furthermore, the number of currently active joints are not considered. Simulations were performed with the number of patients in each data set being fixed at 200, 500 and 800.
Simulation from the full model was performed as follows. Firstly, a value u s i was simulated from a gamma distribution with unit mean and variance for each patient. Then for each joint area, a Bernoulli variable C ks i was simulated with success probability exp ) to determine if that simulated patient was a stayer (C ks i = 0) or a mover (C ks i = 1) in joint area k. Simulated stayers in joint area k were such that their simulated outcomes n ks ij = 0 for all time intervals, whilst for movers n ks ij+1 = d ks ij + n ks ij and n ks i0 = 0. The damage joints increment d ks ij were obtained by simulating from a negative binomial distribution with mean 0. and dispersion parameter k 0 , which had been truncated so that 0 ⩽ d ks ij ⩽ T k − n ks ij . Simulating from such a distribution was performed by simulating from a multinomial distribution with categories , where the category probabilities were calculated from the specified truncated negative binomial distribution. Table IV displays Table I (results from fitting the full model to our PsA data). When there are 500 or 800 patients in each data set, Table IV demonstrates that the model fitting procedure produces little empirical bias because mean estimated parameters are similar to their true values. More empirical bias is observed when only 200 patients are in each data set, although the bias is generally not substantial. The mean estimated standard error and the standard deviation for each estimated parameter is seen to be similar in all three scenarios, which indicate the reasonableness of asymptotic approximations even with relatively small sample sizes.

Model for trivariate mover-clinic-induced stayer-true stayer damaged joints counting processes
In the previous sections, the stayer property was thought of as inherent. It is plausible to think that this property could also have arisen from entry into the clinic; possibly through treatment/management strategies employed by the clinicians. Such patients would therefore be susceptible to damage before clinic entry and 'immune' thereafter (clinic-induced stayers). Hence, information before clinic entry would not contribute towards estimating these patients' clinic-induced stayer probabilities. To distinguish between these subgroups, true stayers (patients who are inherent stayers) and clinic-induced stayers are considered separately.

Model
Let C k i1 be again the partially observable binary variable such that C k i1 = 0 if patient i is a true stayer in joint area k and C k i1 = 1 otherwise. If this patient in this joint area is a mover before clinic entry, that is C k i1 = 1, let C k i2 be another partially observable binary variable such that C k i2 = 0 if this patient becomes a clinic-induced stayer in joint area k and C k i2 = 1 otherwise. These variables are again partially observable for reasons previously discussed. The probability of these events can then be estimated as where again z k i⋆ , z k i * and k 1 , k 2 are column vectors of covariates and regression coefficients, respectively, and u i , v i are realisations of patient-level random effects U i , V i , which are assumed independent. Note that V i is only relevant in joint area k if patient i is not a true stayer in this joint area. These random effects again represent the characteristic that the stayer probabilities in the different joint areas, both true and clinic-induced, are likely to be more similar within patients. The likelihood contribution Secondly, if n k im i ≠ 0 and n k and finally, if n k By then assuming independence between the damaged joints counting processes given the patient-level random effects and dynamic covariates, the likelihood contribution from patient i is where g(u i | 1 ) and g(v i | 2 ) are random effect densities with unit means and variances 1 and 2 , respectively. A closed form solution to the integrations in L i (Θ) can again be achieved by using the technique of Section 3.2; see Appendix A for more details. Note that the model of Section 3 is retained if k i2 = 0 for all i and k.

Exploring the existence of a clinic-induced and true stayer subpopulation
Intercept only models for k i1 and k i2 were fitted in order to investigate the existence of a true and clinicinduced stayer subpopulation in each joint area. Initially, the random effects U i and V i were assumed gamma distributed. The resulting marginal estimates k * 2 (̂k 2 ,̂2 ) (after integrating out v i but still conditional on C k i1 = 1) were small for each k. There was also a lot of uncertainty about the estimatê2, which therefore transfered to k * 2 (̂k 2 ,̂2 ) .
It is therefore unsurprising that identifiability issues arose. An alternative for V i is the inverse Gaussian distribution with unit mean and variance . In this case, g(v i | ) → 0 as v i → 0, hence, small values of k 2 are less likely to result from v i ≈ 0. The model was then fitted for this distributional assumption of V i , and this produced a similar likelihood value to the case when V i was assumed gamma distributed. Note that marginally )) .

Figures 3 and 4 display the profile log-likelihood for h 2 ,
f 2 , l 2 and log( ). These plots suggest that the optimisation procedure was able to identify the parameters h 2 , l 2 and log( ) as it converged at the maximum of their respective profile log-likelihoods. From Figure 3, it is also clear that the maximum of the profile log-likelihood for f 2 occurs at a large negative value. It is therefore likely that a clinic-induced subpopulation does not exist in the foot joints. The maximum likelihood estimates for h * 2 and l * 2 were 0.03 (SE = 0.0228) and 0.0498 (SE = 0.0508), respectively. The standard errors were approximated  using the delta method. Because there is large uncertainty associated with these parameter estimates, there is no convincing evidence to suggest clinic-induced stayer subpopulations exist in the hand and large joints. In contrast and similarly to the previous results, the estimates and corresponding confidence intervals for k * 1 were 0.36 (0.313, 0.407), 0.308 (0.264, 0.352) and 0.267 (0.193, 0.341) for k = h, f and l, respectively. As all of the confidence intervals are again far from zero, there is reasonable evidence to suggest that a subpopulation of true stayers exist in all three joint areas.

Discussion
This research was clinically motivated by the need to understand the relationship between damage progression in the hands, feet and large joints, under the assumption that the stayer property is inherent, and then relaxing this assumption by allowing for the possibility of a clinic-induced stayer population. For these purposes, this paper proposed novel trivariate (two-level) mover-stayer models consisting of mover-stayer/mover-clinic-induced stayer-true stayer truncated negative binomial margins, and patientlevel dynamic covariates and random effects to account for within-patient correlation. These models are appealing because the measures of associations between the counting processes are captured using regression coefficients, and the marginal likelihood can be computed analytically even when random effects are used to correlate the stayer components. The former allows asymmetric associations to be accommodated, whilst the latter facilitates the model fitting procedure. When applied to the PsA data, asymmetric associations between the three damaged joints counting processes resulted. A particular interesting observation was the significant positive associations between the attained number of damaged hand joints and damage progression in all three damaged joints areas. Confirmatory evidence of the local effect of active joints was also seen. Furthermore, little evidence was found for the existence of clinic-induced stayer populations in any of the three joint areas.
In the proposed methodology, a clinic-induced stayer was investigated by assuming 'immunity' to damage occurred at the point of clinic entry. Whilst this may be a reasonable approximation for patients who have a controllable amount of disease activity, it is likely to be inaccurate for patients who have severe disease activity at clinic entry. A fairer assessment of a clinic-induced stayer population might therefore allow such patients to gain 'immunity' to damage at other time points whilst in the clinic. This can most conveniently be achieved by modelling, for each patient, the rate of becoming a clinic-induced stayer, as opposed to modelling the explicit proportion of such a population. Such an analysis would however have to account for the uncertainty of not knowing if, in addition to when, a patient has become a clinic-induced stayer, and therefore, is likely to require the development of substantially new statistical methodology.

Appendix A
We demonstrate how the marginal likelihood of the trivariate mover-clinic-induced stayer-true stayer counting process model can be computed analytically. Let c k i1 * = 0 if n k im i = 0 and c k i1 * = 1 otherwise. If c k i1 * = 1, let c k i2 * = 0 if n k im i = n k i0 and c k i2 * = 1 otherwise. The likelihood contribution from patient i in joint area k can again be rearranged as When c k i1 * = 1 for each k, .