## 1 Introduction

The impairment caused by many neurodegenerative diseases, such as Parkinson's disease (PD), Alzheimer's disease, and Huntington's disease, is multidimensional (e.g., sensoria, functions, and cognition) and progressive. Because of this multidimensional nature, no single outcome can adequately measure disease progression. Many clinical trials of these diseases have searched for neuroprotective treatments capable of halting or slowing disease progression (e.g., the deprenyl and tocopherol antioxidative therapy of parkinsonism (DATATOP) study [1], ELLDOPA study [2], PRECEPT study [3], TEMPO study [4], and ADAGIO study [5]). Such trials often collect multiple longitudinal outcomes of mixed types (categorical and continuous) to assess the treatment effects on overall improvement.

The multivariate longitudinal data structure of these studies has three levels of nesting: multiple outcomes (level 1) are nested within visits (level 2), which are nested within individuals (level 3). To determine the overall treatment effects, the analysis model needs to account for three sources of correlation within the same individual: (1) inter-source correlation (different outcomes at the same visit); (2) longitudinal correlation (same outcome at different visits); and (3) cross correlation (different outcomes at different visits) [6]. A univariate analysis (e.g., generalized estimating equations or mixed effects models) that deals with each outcome separately ignores the inter-source and cross correlations, fails to provide an estimate of the overall treatment effects, and is subject to type I error inflation [7]. Another commonly used approach reduces the multivariate outcomes to a single summary outcome (e.g., a weighted average); this results in substantial loss of information and, more importantly, the results cannot be interpreted on the original outcome scale. Rank-based tests [8-10] for multiple outcomes have been used in several clinical studies [11-14]. However, they neither use the full longitudinal data information nor describe the disease progression process.
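
As a purely illustrative sketch (not the model proposed in this article), the three sources of correlation can be seen in simulated data in which every outcome at every visit shares one subject-level random intercept. The generating model, the sample sizes, and the helper `corr` are assumptions made only for this example; in this simplified setting all three correlations happen to be equal, whereas a realistic model would allow them to differ.

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_visits, n_outcomes = 5000, 2, 2

# Hypothetical generating model: a single subject-level random intercept
# shared by all outcomes (level 1) at all visits (level 2) of a subject (level 3).
u = rng.standard_normal(n_subjects)
noise = rng.standard_normal((n_subjects, n_visits, n_outcomes))
y = u[:, None, None] + noise  # y[i, j, k]: subject i, visit j, outcome k


def corr(a, b):
    """Sample Pearson correlation between two vectors."""
    return float(np.corrcoef(a, b)[0, 1])


inter_source = corr(y[:, 0, 0], y[:, 0, 1])  # different outcomes, same visit
longitudinal = corr(y[:, 0, 0], y[:, 1, 0])  # same outcome, different visits
cross = corr(y[:, 0, 0], y[:, 1, 1])         # different outcomes, different visits

print(inter_source, longitudinal, cross)
```

Analyzing each outcome separately would capture only the longitudinal correlation, leaving the other two unmodeled.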

Multilevel item response theory (MLIRT) models have recently been used to analyze such multivariate longitudinal data [15]. These models assume that the multivariate outcomes are clinical manifestations of a univariate latent variable measuring disease severity. The MLIRT model consists of two levels. The first-level measurement model quantifies the relationship between an individual's latent disease severity and the responses to the multivariate outcomes. In the second-level structural multilevel model, the latent disease severity is regressed on covariates (e.g., treatment and disease duration), time, and subject-specific random effects to study the treatment effects [16-20]. The three sources of correlation are accounted for via the random effects. Advantages of the MLIRT models include better reflection of the multilevel data structure, simultaneous estimation of measurement-specific parameters and covariate effects, and accurate inference about high-level measures [21-23]. To obtain valid inference from the MLIRT models, marginal maximum likelihood methods [20] and Bayesian methods [23-29] have been widely used. Skrondal and Rabe-Hesketh [30] and [31] provide good descriptions of the IRT models.
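
To make the two-level structure concrete, a generic MLIRT specification might be sketched as follows. The notation here is illustrative and assumed for this sketch only: $y_{ijk}$ denotes outcome $k$ for subject $i$ at visit $j$, $\theta_{ij}$ the latent disease severity, $a_k$ (or $a_{kl}$) and $b_k$ outcome-specific difficulty and discrimination parameters, $\mathbf{x}_{ij}$ a covariate vector that may include treatment and time $t_{ij}$, and $u_{i0}, u_{i1}$ subject-specific random effects. The formal model definition given later in the article takes precedence.

$$
\begin{aligned}
\text{Level 1 (continuous outcome } k\text{):}\quad & y_{ijk} = a_k + b_k\,\theta_{ij} + \varepsilon_{ijk}, \qquad \varepsilon_{ijk} \sim N(0, \sigma_k^2),\\
\text{Level 1 (ordinal outcome } k\text{):}\quad & \operatorname{logit} P\!\left(y_{ijk} \le l \mid \theta_{ij}\right) = a_{kl} - b_k\,\theta_{ij},\\
\text{Level 2 (structural model):}\quad & \theta_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_{i0} + u_{i1}\,t_{ij}.
\end{aligned}
$$

Under this sign convention, a larger $\theta_{ij}$ shifts ordinal responses toward higher categories, and the shared random effects $u_{i0}$ and $u_{i1}$ induce correlation among all outcomes and visits of the same subject.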

During the course of a clinical trial, the follow-up of some individuals may be stopped by a terminal event such as death, dropout due to an adverse event or severe adverse event, or some other event. Because the terminal event may be related to the individual's underlying disease severity, the terminal mechanism is nonignorable. The dependent terminal event time is often termed ‘dependent censoring’ or ‘informative censoring’. Ignoring the dependent censoring leads to biased estimates [32, 33]. To address this issue, joint analysis of survival and repeated measures has become increasingly common [32-36]. Tsiatis and Davidian [37] and Yu *et al*. [38] give excellent reviews of joint modeling research. In the IRT modeling framework, Wang *et al*. [39] proposed a joint model to analyze multiple-item ordinal quality of life data in the presence of death. He and Luo [40] developed a joint model for multiple longitudinal outcomes of mixed types, subject to outcome-dependent terminal events. However, all these references focus on the proportional hazards (PH) model or its extensions. When the PH assumption is violated, the accelerated failure time (AFT) model is an attractive alternative. Tseng *et al*. [41] proposed a joint modeling framework that replaces the PH model with a semiparametric AFT model. The advantage of the AFT model is that the effects of risk factors on the failure time are easy to interpret, because the AFT model simply regresses the logarithm of the survival time on covariates and random effects.
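
The interpretability of the AFT model can be illustrated with a small simulation. The sketch below assumes a log-normal AFT model, $\log T_i = \beta_0 + \beta_{\text{trt}}\,\text{trt}_i + \nu u_i + \sigma \epsilon_i$, with hypothetical parameter values, no censoring, and a random effect $u_i$ that is treated as observed purely for illustration (in a joint model it is latent). The function name `simulate_aft_times` and all numerical values are assumptions of this example, not part of the proposed methodology.

```python
import numpy as np


def simulate_aft_times(treatment, random_effect, rng,
                       beta0=2.0, beta_trt=0.5, nu=-0.8, sigma=0.3):
    """Draw event times from a log-normal AFT model (hypothetical parameters):
    log T = beta0 + beta_trt * treatment + nu * random_effect + sigma * eps.
    """
    eps = rng.standard_normal(len(treatment))
    log_t = beta0 + beta_trt * treatment + nu * random_effect + sigma * eps
    return np.exp(log_t)


rng = np.random.default_rng(0)
n = 20000
treatment = rng.integers(0, 2, size=n)  # 0 = control, 1 = treated
u = rng.standard_normal(n)              # subject-specific random effect
t = simulate_aft_times(treatment, u, rng)

# Without censoring, the AFT coefficients can be recovered by ordinary
# least squares of log(T) on the covariates and the random effect.
X = np.column_stack([np.ones(n), treatment, u])
coef, *_ = np.linalg.lstsq(X, np.log(t), rcond=None)

# exp(coefficient) is a time ratio: the multiplicative change in survival
# time associated with a one-unit change in the covariate.
time_ratio_trt = float(np.exp(coef[1]))
print(coef, time_ratio_trt)
```

In this toy setting, the exponentiated treatment coefficient estimates the factor by which treatment stretches the event time, and the negative coefficient on `u` shortens event times for subjects with larger random effects.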

In this article, we propose a joint modeling framework in which an MLIRT model is used for the multivariate longitudinal outcomes and a parametric AFT model is used for the dependent terminal event. The two models are linked via random effects. The rest of the article proceeds as follows. In Section 2, we describe a motivating clinical trial, the data structure, and the dependent terminal event. Section 3 discusses the joint random effects model, Bayesian inference, and Bayesian model selection criteria. Section 4 provides a simulation study to assess the performance of the proposed joint model. In Section 5, we apply the proposed model to the motivating clinical trial dataset. Section 6 gives some concluding remarks and discussion. To facilitate reading and implementation of the proposed methodology, the code has been posted in the supporting information.