Improved Dynamic Predictions from Joint Models of Longitudinal and Survival Data with Time-Varying Effects using P-splines

In the field of cardio-thoracic surgery, valve function is monitored over time after surgery. The motivation for our research comes from a study which includes patients who received a human tissue valve in the aortic position. These patients are followed prospectively over time by standardized echocardiographic assessment of valve function. Loss of follow-up could be caused by valve intervention or the death of the patient. One of the main characteristics of the human valve is that its durability is limited. Therefore, it is of interest to obtain a prognostic model in order for the physicians to scan trends in valve function over time and plan their next intervention, accounting for the characteristics of the data. Several authors have focused on deriving predictions under the standard joint modeling of longitudinal and survival data framework that assumes a constant effect for the coefficient that links the longitudinal and survival outcomes. However, in our case this may be a restrictive assumption. Since the valve degenerates, the association between the biomarker with survival may change over time. To improve dynamic predictions we propose a Bayesian joint model that allows a time-varying coefficient to link the longitudinal and the survival processes, using P-splines. We evaluate the performance of the model in terms of discrimination and calibration, while accounting for censoring.


Introduction
In the field of cardio-thoracic surgery, valve function is monitored periodically over time after heart valve surgery.Aortic gradient (AG) (mmHg) is one of the continuous echocardiographic markers that measures valve (dys)function, where high values indicate a worsening of the patient's condition.Specifically, it measures aortic stenosis which occurs when the opening of the aortic valve located between the left ventricle of the heart and the aorta is narrowed.During the follow-up period after surgery, patients may require an intervention or may die.The motivation of this research comes from a study, conducted in the Erasmus University Medical Center, which includes all patients who received a human tissue valve allograft in the aortic position in the Department of Cardio-Thoracic Surgery (Bekkers et al., 2011).In total 296 patients who survived aortic valve or root replacement with an allograft valve were followed over time.Specifically, echocardiographic examinations were scheduled at six months, one year postoperatively and biennially thereafter.During follow-up, 161 (54%) patients either died or required a reoperation on the same valve.A total of 1669 echocardiographic measurements of AG were performed.The median number of visits is six and the median years of follow up is 9.3.One of the characteristics of human valves is that their durability is limited.Hence, it is important for the physicians to have a prognostic tool in order to carefully monitor trends in valve function over time and plan a future re-intervention.
Joint modeling of longitudinal and survival data is a popular framework to analyze data including repeated measurements and time-to-event outcomes appropriately (Tsiatis and Davidian, 2004;Rizopoulos, 2012;Andrinopoulou et al., 2012;Rizopoulos et al., 2015).The idea behind these models is that the longitudinal and the survival processes share common random effects, inducing correlation between the two processes.Specifically, we construct a mixed-effects model to describe the evolution over time for the longitudinal outcome, and use these estimated evolutions as a time-dependent covariate in a survival model.Several authors have focused on deriving predictions under joint modeling of longitudinal and survival data framework (Taylor, Yu, and Sandler, 2005;Garre et al., 2008;Yu, Taylor, and Sandler, 2008;Proust-Lima and Taylor, 2009;Rizopoulos, 2011;Andrinopoulou et al., 2015a).To improve the fit and the predictive accuracy of joint models, previous work allowed the inclusion of multiple longitudinal outcomes and investigated the selection of the optional functional form (Rizopoulos, 2011;Rizopoulos et al., 2014;Andrinopoulou et al., 2015b;Andrinopoulou and Rizopoulos, 2016).A common feature of all aforementioned models and previous work published on the motivating data is that the parameters that measure the strength of the association between the longitudinal and survival outcome were assumed to be constant in time.However, the heart valve degenerates, therefore it is natural and biologically more desirable to assume that the effect of AG is changing over time.To investigate that, we performed a preliminary analysis assuming a time-dependent Cox model for the composite event death/reoperation including as predictors the square root of AG (SAG), a transformation that was explored in previous work (Andrinopoulou et al., 2014), and gender.Figure 1 shows the smoothed scatterplot of the Schoenfeld residuals from this Cox model versus time.These residuals are typically used to investigate the proportional hazards assumption, and for the SAG they show an increasing trend indicating violation of a constant-effect assumption.
Even though the consideration of time-varying coefficients has been extensively studied in the general context of survival analysis using polynomials and B-splines (Nan et al., 2005;Perperoglou, 2014), relatively little work has been done on joint models with time-varying coefficients (Song and Wang, 2008).To our knowledge, no work has been done to evaluate whether such time-varying coefficients may improve the accuracy of individualized predictions within the framework of joint models.The idea behind the time-varying coefficient models is to include interactions of the covariates with an appropriate pre-defined time function.To enhance the predictive performance of the joint model, we assume a time-varying effect of SAG using P-splines.Specifically, it is approximated by a polynomial spline written in terms of a linear combination of B-spline basis functions (Eliers and Marx, 1996;Lang and Brezger, 2004;Eliers, Marx, and Durbán, 2015).To overcome the problem of the large number of parameters and to stabilize the predictions, a penalty is applied to the coefficients.To facilitate flexible modeling of the survival outcome, we use P-splines also for the logarithm of the baseline hazard.To evaluate the derived predictions we present extensions of classic measures of predictive ability, such as discrimination and calibration, in the time-dependent setting while accounting for censoring.
The rest of the paper is organized as follows.Section 2 describes the formulation of the joint model.Section 3 presents the Bayesian estimation.Section 4 presents measures to assess the predictive performance of the model.Section 5 shows the results for the cardio data analysis, while Section 6 contains simulation studies.Finally, in Section 7 we close with a discussion.

Joint Model Definition
We let T * i denote the true failure time for the i-th individual (i = 1, . . ., n), and C i the censoring time.Moreover, T i = min(T * i , C i ) denotes the observed failure time and is the event indicator where zero indicates censoring.We let y i denote the longitudinal response obtained at different time points t ij > 0, (j = 1, . . ., n i ).To describe the subject-specific evolution over time of the continuous longitudinal outcome SAG, we use a mixed-effects model.In particular, we postulate where x i (t) denotes the design vector for the fixed effects regression coefficients β and z i (t) the design vector for the random effects b i .Moreover, ǫ i (t) ∼ N (0, σ).For the corresponding random effects, we assume a multivariate normal distribution, namely We postulate a varying-coefficient joint model (VCJM) for the relationship between the survival and the longitudinal outcome.Specifically, we have where θ s is the parameter vector for the survival outcomes, w i is a vector of baseline covariates with a corresponding vector of regression coefficients γ, γ h 0 is the vector of the baseline hazard coefficient and H i (t) = {η i (ζ), 0 ≤ ζ < t} denotes the history of the true unobserved longitudinal process up to time point t.The function f {λ(t), H i (t)} specifies which features of the longitudinal submodel are included in the relative risk model (Rizopoulos and Ghosh, 2011;Rizopoulos, 2012).Previous work suggested that not only the value of SAG but also other characteristics of the biomarker may have an influence on the event of interest (Andrinopoulou et al., 2015b).Several specifications of f (.) have been proposed in the literature (Rizopoulos and Ghosh, 2011).Some examples are the following: Specifically, f (.) postulates that the hazard of the event is associated with the underlying value of the longitudinal outcome at a specific time point t, the value and the slope of the longitudinal outcome at t, or the accumulated longitudinal process up to time t.
In the standard constant-coefficient joint model (CCJM), λ(t) is assumed to be constant over time.For the VCJM, we model λ(t) flexibly assuming a smooth function.
Several approaches have been proposed for modelling and estimating smooth functions such as B-splines.A problem that arises in such methods is the selection of the number and position of the knots which has been a subject of much research.In this manuscript, we adopt the P-splines approach for λ(t).The basic idea of P-splines is to use a (relatively) high number of equally spaced knots.To obtain sufficient smoothness of the fitted curves and to avoid overfitting, a roughness penalty based on differences of adjacent B-spline coefficients is applied (Eliers and Marx, 1996).In particular, we take where α ℓ is a set of parameters that capture the strength of association between the longitudinal and survival outcomes and B ℓ (t) denotes the ℓ-th basis function of a B-spline.To further facilitate improving the derived predictions from the joint model, we model the baseline hazard with the same P-splines approach.Specifically, where γ h 0 ,u are the coefficients of the baseline hazard and B u (t) denotes the u-th basis function of a B-spline.

Bayesian Estimation
We employ a Bayesian approach where inference is based on the posterior of the model.
In particular, we use Markov chain Monte Carlo (MCMC) methods to estimate the parameters of the VCJM.The likelihood of the model is derived under the assumption that the longitudinal and survival processes are independent given the random effects (Rizopoulos, 2012).Moreover, the longitudinal responses of each subject are assumed independent given the random effects.The posterior distribution is written as where θ = (θ ⊤ s , θ ⊤ y ) ⊤ is the parameter vector for the survival and the longitudinal outcomes respectively.The likelihood contribution of the longitudinal and survival model together with the formulation of the deviance information criterion (DIC) are given in the Appendix.

Bayesian P-splines
The Bayesian P-splines approach was first introduced by Lang and Brezger (2004).In our case, the smoothness of functions λ(t) and h 0 (t) is controlled by the following priors for the coefficient that links the longitudinal and the survival outcomes α and the coefficient of the baseline hazard γ h 0 : where M α , M γ h 0 are the penalty matrices.In particular, , where D r is a r-th order difference matrix.The scaled identity matrix I ensures a positive define variance-covariance matrix.As described in the literature (Reinsch, 1967;Eliers and Marx, 1996), a common choice for the penalty matrices is to assume a second order penalty.The amount of smoothness is controlled by the variance parameters τ γ h 0 and τ α , where hyperpriors are assigned.A usual recommendation is to set c 1 and f 1 equal to 1 and c 2 and f 2 equal to a small number.
Alternative specifications of these hyperpriors can be found in Jullion and Lambert (2007).

Measuring Predictive Performance
As motivated in Section 1, it is important for physicians to have a prognostic tool for planning next interventions.To assess the predictive performance of the VCJM and to compare it to the CCJM, we focus on discrimination and calibration.Specifically, discrimination is how well can the model discriminate between patients who will experience the event from patients who will not (Pencina et al., 2008), whereas calibration is how well the model predicts the observed event rates (Schemper and Henderson, 2000).

Discrimination
A key feature of our model is to distinguish between patients who are going to die or require a reoperation within a specific time frame from patients who will not.In particular, for a future patient l with SAG measurements ỹl up to time point t, we are interested in investigating whether he will die or require a reoperation in the medically-relevant time frame (t, t + ∆t] within which the physician could intervene to improve survival.The survival/intervention-free probability of patient l within this interval is, where D n = {T i , δ i , y i , i = 1, . . ., n} denotes the sample on which the joint model was fitted.For the calculation of sensitivity and specificity, we have π l (t, ∆t) ≤ c if subject l died or required a reoperation and π l (t, ∆t) ≥ c if he did not experience death/reoperation, for a specific c ∈ [0, 1].In particular, we can define sensitivity and specificity as respectively.Using the area under the receiver operating characteristic curve (AUC) we can assess the discriminative capability of the model.In particular, given a randomly chosen pair of patients (l 1 , l 2 ), we have If patient l 1 experiences death/reoperation within the relevant time frame whereas patient l 2 does not, then we would expect the VCJM to assign higher survival/intervention-free probability during the period (t, t + ∆t] for the patient that did not experience death/reoperation.In the cardio data set, the values of time-to-death/reoperation are not fully observed for all patients.To account for this, the estimation of AUC(t, ∆t) is based on the following decomposition which include the following pairs of patients Ω (1) AUC 1 (t, ∆t) includes the pairs of patients who are comparable Ω (1) and l 1 l 2 (t), w = 2, 3, 4}.Then, with I(.) the indicator function which is the proportion of concordant subjects out of the set of comparable subjects at time t.Specifically, if in a randomly selected pair of patients, the one with the higher event probability experiences the event and the one with the lower probability does not experience the event, then this pair is said to be a concordant pair.For w = 1, we have K2 = 1 because the pairs of patients are comparable.For w = 2, 3, 4, the AUC w (t, ∆t) are weighted with the probability that the concordant subjects are comparable.In particular, K2 = 1 − πl 1 (t, ∆t), K3 = πl 2 (t, ∆t) and K4 = {1 − πl 1 (t, ∆t)} × πl 2 (t, ∆t)

Calibration
To assess the accuracy of the model, we use the prediction error (PE).Using all available information for a particular patient l, we are interested in comparing the predicted probability of survival/intervention-free of this patient to the observed truth: where N l (t) = I(T * l > t) is the event status at time t.To account for censoring, the following estimate has been proposed by Henderson et al. (2002): where R(t) denotes the number of subjects at risk at t.The term I(T l > t + ∆t){1 − πl (t, ∆t)} 2 refers to patients who were alive after t + ∆t and δ l I(T l < t + ∆t){0 − πl (t, ∆t)} 2 to patients who died before t + ∆t.The remaining term refers to patients who were censored in the interval [t, t + ∆t].

Analysis of the Cardio Data Set
In this section we present the analysis of the cardio data introduced in Section 1.Our primary focus is to investigate the association between SAG with time-to-death/reoperation. In Figure A.1, the evolution of the SAG for 12 randomly selected patients is presented, where it is shown that most of these patients have non-linear profiles.Therefore, we assumed a linear mixed-effects submodel including natural cubic splines for time.The DIC criterion indicated that the model assuming one internal knot at 5.02 years (corresponding to 50% of the observed follow-up times) in both the fixed-and random-effects parts had a better fit.Furthermore, we corrected for gender.The mixed-effects submodel for the SAG is where y i (t) are the measurements of SAG, ns(.) denotes the natural cubic splines, To investigate the association between SAG and survival, we postulated the VCJM.
Motivated by previous work (Andrinopoulou et al., 2015b), where different features of the SAG were found to have an influence on survival, we assumed the value and the slope of the longitudinal outcome to be associated with death/reoperation.Furthermore, we corrected for gender.Specifically, the survival submodel takes the form where α 1ℓ and α 2ℓ are the coefficients that link the longitudinal and survival processes, γ is the coefficient for gender and γ h 0 ,ℓ are the baseline hazard coefficients.We assumed in both cases quadratic B-splines basis B ℓ (t) = B u (t) with 8 equally distance internal knots ranging from zero until 20.1 years.
As described in Section 3, for the P-splines approach we assumed normal priors for the time-varying coefficients α = {α 1 , α 2 } and the baseline hazard coefficients γ h0 and gamma hyperpriors for τ α and τ γ h 0 where we took c 1 = f 1 = 1 and c 2 = f 2 = 0.005.For the rest of the parameters standard noninformative priors were used.For the coefficients of the longitudinal outcome β and the survival coefficient γ normal priors were taken with mean zero and large variance.For the variance-covariance matrix of the random effects Σ b we assumed inverse Wishart prior with an identity scale matrix and degrees of freedom equal to the total number of the random effects.For the precision parameter of the longitudinal outcome a gamma prior was taken with parameters that were based on the separate analysis of the outcome.We ran the MCMC using three chains with 150,000 iterations, 50,000 burn-in and 2 thinning.
In Figure 2 we present the mean estimates and the credible intervals of the estimated λ(t) for the value and the slope association parameters, respectively.The grey solid lines represent the coefficient when CCJM was assumed.The CCJM takes the form, where we use P-splines for the baseline hazard for a fair comparison.We observe that the effect of the SAG on survival seems to slightly increase over time, however this effect is not strong.The effect of the slope of SAG on survival appears to increase linearly with time.Specifically, at the beginning of the study (t = 5), for patients having the same gender and level of SAG, the log hazard ratio for one mmHg increase in the current slope of the SAG is 3. 6 Simulation study

Design
We performed a series of simulations to evaluate the performance of the VCJM.We simulated 400 patients with maximum number of repeated measurements equal to 10.
We assumed one longitudinal and one survival outcome as in the analysis of the cardio data set.For the continuous longitudinal outcome, we investigated the following linear mixed-effects model where For simplicity, we adopted a linear effect of time for both the fixed and the random part, and corrected for gender.Time t was simulated from a uniform distribution between zero and 19.5.For the survival part, we investigated the following scenarios.
For Scenario I, we postulated the following model: where B ℓ (t) denotes the ℓ-th basis function of a B-spline where the knots were placed at fixed time points.We assumed time-varying effect of the underlying value for the association parameter and corrected for gender.The baseline risk was simulated from a Weibull distribution h 0 (t) = ξt ξ−1 .For the simulation of the censoring times, an exponential censoring distribution was chosen with mean µ c , so that the censoring rate was between 40% and 60%.Scenario Ia assumed a linear evolution for λ(t), while Scenario Ib a non-linear evolution.
For Scenario II, the survival submodel takes the form, where a constant effect for the α coefficient was assumed.
Finally, for Scenario IIIa and Scenario IIIb, the survival submodel takes the form, and respectively.In this case, we assumed polynomial effects for the association parameters.
We simulated 200 data sets per scenario.More details are presented in Table B.3.

Analysis and Results
To evaluate the performance of the proposed model, we fit the VCJM and the CCJM with the same specification as the one presented in Section 6.1.To mimic the analysis of the cardio data set, the joint models were fitted using P-splines for the baseline hazard, rather than the Weibull one.The reason for observing greater variability at t < 5 and t > 15 is that relatively few events are observed in these regions.
To further evaluate whether the VCJM produces predictions of better quality than the CCJM, we performed an external validation procedure.For each simulated data set from scenarios Ia, Ib and II, we randomly excluded 200 patients.Using the remaining patients, we fitted the VCJM and the CCJM and computed the AUC(t, ∆t) and PE(t, ∆t) for the 200 patients that were initially excluded.These measures were calculated at follow-up times {t = 5.5, 7.5, 9.5} using ∆t = 2. Figures 4 and 5 present boxplots with the results under Scenarios Ia and Ib, respectively.It can be seen that, overall, the VCJM performs better than the CCJM.Most of the time we observe a higher AUC value and a lower PE value for the VCJM.Smaller differences are obtained for Scenario Ib for the AUC at t = 5.5 and the PE at t = 9.5.This is explained by the fact that not enough events were observed at the specific time periods.Figure 6, which presents boxplots for Scenario II, suggests that the VCJM provides accurate predictions even if the data is generated with a constant effect for the coefficient that links the longitudinal and survival outcomes.In all cases, we obtain AUC and PE values that are similar in both the VCJM and the CCJM.

Discussion
Motivated by the fact that the human tissue valves degenerate and hence the effect of SAG on survival could change over time, we developed a VCJM using P-splines.We used P-splines for the logarithm of the baseline hazard.We showed that even when the data is generated with a constant effect for the coefficient that links the longitudinal and survival outcome, the VCJM performs equal or better than the CCJM.In this paper we followed the Bayesian framework and therefore, extensions to more complex situations are comparably easy.A further advantage of the Bayesian approach, is that it allows us to automatically estimate the smoothing parameter in the P-splines approach by assigning a prior to it.
Further extensions to improve the VCJM can be considered for future work.In this paper we used only one longitudinal and one survival outcome.Extension to multiple longitudinal outcomes might be useful since other longitudinal outcomes that are related to the heart value may have an influence on survival.Finally, death and reoperation are competing risks and therefore a competing risk survival submodel would be more appropriate. where

Discussion of the data results
In Table B.1 and B.2 we present the posterior means, the standard errors, the 95% credible intervals and the Gelman-Rubin's Rhat diagnostic of the VCJM and CCJM respectively.
With regards to the VCJM, it can be seen that gender does not seem to be a strong factor for the SAG.A nonlinear effect of time seems to capture the evolution of SAG.
Furthermore, gender does not seem to have an influence on the survival outcome.The results of the association parameters are difficult to interpret therefore plots were created in the manuscript (Figure 2 in the main paper) where it can be seen that the slope of the SAG on survival appears to increase linearly with time.The Gelman-Rubin's diagnostic (potential scale reduction factor) suggests that all the parameters of the VCJM converged.
With regards to the CCJM, we found the same results for the longitudinal submodel as in the VCJM.Moreover, gender does not seem to have an influence on the survival outcome.The underlying value and slope of the SAG seem to have an effect on the survival outcome.In particular, the log hazard is increased by 0.17

Figure 1 :
Figure 1: Time-varying coefficient of the SAG, using the Schoenfeld residuals.The solid line represents the mean estimate while the dotted lines the corresponding confidence interval.The dashed grey line represents the coefficient of the SAG.

Figure 2 :Figure 3 :
Figure2: With black lines the mean estimates (solid line) and the credible intervals (dashed lines) of λ 1 (t) and λ 2 (t) functions corresponding to the value and slope association parameters are presented.The grey solid lines represent the mean estimates of the CCJM.
Figure A.4 and A.5, illustrate the true and estimated λ(t) function when simulating from the linear Scenario Ia (left panel) and the non-linear Scenario Ib (right panel) and when fitting the VCJM.In particular, in Figure A.4 we assumed 8 internal knots for the simulation and the fitting part while in Figure A.5 we assumed 20. Figure A.6 illustrates the true and estimated λ(t) function for Scenario II when fitting the VCJM assuming 8 internal knots.Figures A.7 and A.8 present the results when simulating for Scenario IIIa and Scenario IIIb respectively and when fitting the VCJM with 8 internal knots.Overall our model successfully recovers the true λ(t).

Figure 4 :Figure 5 :Figure 6 :
Figure 4: Boxplots of the AUC and PE measurements when assuming the VCJM and the CCJM at different time points with ∆t = 2.For the simulation of the data scenario Ia was used -External validation.
g) ]}, where b = b 1 , . . ., b n are the random effects, g = 1, . . ., G is the iteration of the sampler, θ (g) and b (g) denote the parameter samples at the gth iteration and θ and b represent the means of the posterior samples.The model with the smaller DIC value represents the model with the best fit.

Figure
Figure A.3: The SAG evolution for patient 10.
Figure A.4: Simulation results assuming a time-varying effect (Scenario I) and 8 internal knots when simulating and assuming the VCJM with the same knots when fitting the data: the left panel corresponds to Scenario Ia (linear λ(t)) and the right panel to Scenario Ib (non-linear λ(t)).The solid black lines denote the true λ(t) function, the dashed black lines the average estimates of this function from the 200 data sets and the grey lines the estimates from each of the 200 data sets.