Keywords:

  • Bayesian Survival analysis;
  • Log-linear median regression;
  • Quantile regression;
  • Transform-both-sides

Abstract

  1. Top of page
  2. Abstract
  3. 1. Introduction
  4. 2. Semiparametric Models
  5. 3. Likelihood, Prior Process and Inference
  6. 4. Data Analysis
  7. 5. Simulation Study
  8. 6. Discussion
  9. Acknowledgements
  10. References
  11. Appendix

Summary We present a novel semiparametric survival model with a log-linear median regression function. As a useful alternative to existing semiparametric models, our large model class has many important practical advantages, including interpretation of the regression parameters via the median and the ability to address heteroscedasticity. We demonstrate that our modeling technique eases prior elicitation and computation for both parametric and semiparametric Bayesian analysis of survival data. We illustrate the advantages of our modeling, as well as model diagnostics, via a reanalysis of a small-cell lung cancer study. Results of our simulation study provide further support for our model in practice.


1. Introduction

Semiparametric models such as Cox’s (1972) proportional hazards model and linear transformation models (Cheng, Wei, and Ying, 1995; Fine, Ying, and Wei, 1998) and their special cases (e.g., accelerated failure time model) are very popular for modeling effects of covariates on a survival response. For example, the main aim of a semiparametric model for a two-arm randomized trial for small-cell lung-cancer (SCLC) patients (Ying, Jung, and Wei, 1995) is to express the effects of treatment arm and age at entry on time from randomization to death (survival time). Often, there is substantial information available in the data to make inferences about the median. However, previous semiparametric models for survival data do not focus on the effects of covariates on the median and other quantiles. Several authors including Ying et al. (1995) gave compelling arguments in favor of focusing on the quantiles of the survival time for modeling and reporting of data analysis results. The effect of treatment and age on the quantiles including median time to death is useful for describing covariate effects. Clinical trials based on survival outcomes are often designed to detect differences in median survival between treatment arms. Models based on the median are often useful in dealing with heteroscedasticity.

Semiparametric Bayesian models for survival data, possibly with the exception of Kottas and Gelfand (2001), and Hanson and Johnson (2002), are either based on covariate effects on the hazard ratio (see Ibrahim, Chen, and Sinha, 2001) or on the mean survival time (e.g., Walker and Mallick, 1999). However, particularly for Bayesian survival analysis, medians and other quantiles are natural choices for elicitation of experts’ opinions. Clinical experts on the disease under study are likely to have useful prior information/opinions about survival quantiles (say, the median). In two-arm cancer clinical trials, the determination of a clinically significant difference and subsequent evaluation of power of the trial, even for frequentist trial designs, are often based on the prior evaluation of the median for the control arm as well as the clinically significant effect of treatment on median survival time (Piantadosi, 2005). In Section 2, we propose a novel semiparametric model for the median survival time with interpretable covariate effects via a log-linear median regression function. This wide class of semiparametric models has many desirable properties including model identifiability, closed form expressions for all quantile functions, and nonmonotone hazards. Unlike previous methods for Bayesian survival analysis (e.g., Hanson and Johnson, 2002), our model accommodates the situation when the location/median as well as the scale and shape of the survival distribution are affected by the covariate. Unlike some of the previous frequentist methods for median regression, we do not require the restrictive assumption that all quantile functions below the median be linear.

In Section 3, we present the likelihood, suitable nonparametric prior processes and Markov Chain Monte Carlo (MCMC) tools to estimate the model parameters. In Section 4, we consider the SCLC trial to demonstrate how our models can facilitate the determination of prior distributions. For the SCLC study, we also compare the results of our approach to existing approaches. In Section 5, a simulation study investigates small sample performance and robustness properties compared to competing methods for median regression. Some final remarks are in Section 6.

2. Semiparametric Models

Let Ti be the survival time of subject i= 1, …, n and let Zi= (1, Zi1, …, Zip)′ be the corresponding vector of p time-constant covariates along with the intercept term. The transformation model (Cheng et al., 1995) assumes that

  • image(1)

where h is a monotone transformation, γ= (γ0, γ1, …, γp) is a regression parameter, and ei is an unspecified error variable with common density fe(·) free of covariate Zi. Usually the density fe(·) of ei is assumed to be a member of some parametric family with location 0 and with shape and scale free of Zi. Important special cases of (1) are the accelerated failure time (AFT) model when h = log, the proportional odds model when ei comes from a logistic distribution, and Cox’s model (1972) when fe is the extreme-value density.

The monotone power transformation gλ(y) (Bickel and Doksum, 1981),

  • $g_\lambda(y) = \mathrm{sgn}(y)\,|y|^{\lambda} / \lambda$   (2)

where sgn(y) = −1 for y < 0 and sgn(y) = 1 otherwise, is an extension of the Box–Cox power family (Box and Cox, 1964), a popular transformation to obtain a symmetric and unimodal density for the transformed random variable. We assume that for unknown λ, the transformed survival time gλ{log(Ti)} is symmetric and unimodal with median gλ(β′Zi) = gλ(Mi), that is,

  • $g_\lambda\{\log(T_i)\} = g_\lambda(\beta' Z_i) + \varepsilon_i$   (3)

where εi are i.i.d. from a unimodal and symmetric density fε(·) centered at 0, Mi = β′Zi, and β is the vector of regression parameters. Carroll and Ruppert (1984) and Fitzmaurice, Lipsitz, and Parzen (2007), among others, proposed parametric versions of the transform-both-sides (TBS) regression model for an uncensored continuous response with the original Box–Cox transformation and N(0, σ²) density for error fε(·).

The transformation gλ(y) in (2) is monotone in y, with derivative (with respect to y) equal to |y|^(λ−1). The median of log(Ti) is Mi = β′Zi because P[log(Ti) > Mi] = P[gλ{log(Ti)} > gλ(Mi)] = Fε(0) = 1/2, where Fε is the cdf of ε. As a consequence, the survival time Ti has a log-linear median regression function Q0.5(Zi) = exp(Mi) = exp(β′Zi) and survival function S(t|z) = 1 − Fε(gλ(log t) − gλ(M)). For the SCLC study with Mi = β0 + β1z1 + β2z2, where z1 is a treatment indicator and z2 denotes age, this implies that the ratio of medians from two patients of the same age but different treatment arms is Q0.5(z1= 1, z2)/Q0.5(z1= 0, z2) = exp(β1). We also get a similar straightforward interpretation of exp(β2) as the ratio of the medians for a unit increase in age. The following theorem shows that the parameter λ and the density fε of (3) are also identifiable, in the sense that for any survival time following (3), there is a unique (λ, fε) for which gλ{log(Ti)} has a symmetric unimodal distribution.
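The median interpretation above can be checked numerically. The following is a minimal sketch, assuming the sign-preserving power form gλ(y) = sgn(y)|y|^λ/λ for (2) and a Gaussian Fε; the parameter values and covariate vectors are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def g(y, lam):
    """Assumed form of the transform in (2): sgn(y) * |y|**lam / lam."""
    return math.copysign(abs(y) ** lam, y) / lam


def log_median(z, beta):
    """M = beta'z, the median of log(T)."""
    return sum(b * x for b, x in zip(beta, z))


def survival(t, z, beta, lam, sigma):
    """S(t|z) = 1 - F_eps(g(log t) - g(M)), here with F_eps = N(0, sigma^2)."""
    m = log_median(z, beta)
    return 1.0 - NormalDist(0.0, sigma).cdf(g(math.log(t), lam) - g(m, lam))


# Hypothetical parameter values, for illustration only.
beta = (2.5, 0.3, -0.01)               # (intercept, treatment, age)
lam, sigma = 0.8, 1.0
arm_a, arm_b = (1, 1, 5), (1, 0, 5)    # same age, different treatment arms

median_a = math.exp(log_median(arm_a, beta))
median_b = math.exp(log_median(arm_b, beta))
print(median_a / median_b, math.exp(beta[1]))        # ratio of medians vs exp(beta_1)
print(survival(median_b, arm_b, beta, lam, sigma))   # survival evaluated at the median
```

Whatever values of λ and σ are used, the ratio of medians depends only on β1, which is the property that makes the regression parameters directly interpretable.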

Theorem 1 For the model in (3), if there is another triplet (λ*, β*, f*ε) for which the survival time Ti also satisfies (3), then λ = λ*, β = β* and fε = f*ε.

The proof of Theorem 1 is in the Appendix. Similar to the transformation model of (1), we can rewrite the TBS model of (3) as

  • $\log(T_i) = \beta' Z_i + e_i$   (4)

where the error ei in (4) has asymmetric density function fe(u|Zi) = fε{gλ(Mi+u) − gλ(Mi)} g′λ(Mi+u), where g′λ(y) = |y|^(λ−1). The shape and scale of the cdf Fε{gλ(Mi+u) − gλ(Mi)} of ei depend on the covariates Zi. The approximate variance of log T is σ²|Mi|^(2−2λ), where fε has finite variance σ². It is clear that unlike the usual assumption of the transformation model of (1) and Bayesian models of, say, Hanson and Johnson (2002), the median as well as the shape and scale of the error density fe(·|Zi) in (4) depend on the covariate Zi. This allows our model to be useful for dealing with heteroscedasticity of log T. Thus, unlike the existing Bayes models, the covariate Z does affect the scale and shape of fe in our TBS models. A parametric log-normal model with location M(Z) = β′Z for log(T) is a special case of (3) with λ = 1 and Fε being N(0, σ²). The hazard function of (3) can be nonmonotone; for example, a log-normal model has a nonmonotone hazard.

Although the model in (3) apparently focuses on modeling the median, we can easily obtain other quantiles of log   (T). For the TBS model of (3), the α-quantile Qα(Z) of T is

  • $Q_\alpha(Z) = \exp\left[ g_\lambda^{-1}\{ g_\lambda(\beta' Z) + \varepsilon_\alpha \} \right]$   (5)

because P[gλ{log(T)} ≤ gλ(β′Z) + εα] = α for α ∈ (0, 1), where εα is the α-quantile of fε(·) with Fε(εα) = α. For α = 0.5, we have ε0.5 = 0 and get the log-linear median function exp(β′Z) for T in (3). The expression in (5) shows that this model is very convenient for simultaneously estimating all important quantiles of Ti using the estimates of (β, λ, εα). However, unlike the existing methods including those of Portnoy (2003) and Peng and Huang (2008), Qα(Z) of the TBS model in (5) is not log-linear in covariate Z unless α = 0.5 (median). The Bayesian models of Kottas and Gelfand (2001) and Hanson and Johnson (2002) also have linear quantile functions βα′Z of log T for all 1 > α > 0, and they are parallel to each other (with only the intercept of βα different for different α ∈ (0, 1)).
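The quantile relation can be sketched in a few lines. This is an illustrative sketch only: it assumes gλ(y) = sgn(y)|y|^λ/λ for (2), so that the inverse is sgn(u)(λ|u|)^(1/λ), and it takes Fε to be N(0, σ²); the numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def g(y, lam):
    """Assumed transform: sgn(y) * |y|**lam / lam."""
    return math.copysign(abs(y) ** lam, y) / lam


def g_inv(u, lam):
    """Inverse of g: sgn(u) * (lam * |u|)**(1 / lam)."""
    return math.copysign((lam * abs(u)) ** (1.0 / lam), u)


def quantile(alpha, m, lam, sigma):
    """Q_alpha = exp[g_inv{g(M) + eps_alpha}] with Gaussian eps_alpha (an assumption)."""
    eps_alpha = NormalDist(0.0, sigma).inv_cdf(alpha)
    return math.exp(g_inv(g(m, lam) + eps_alpha, lam))


# Hypothetical values of M = beta'Z, lambda and sigma.
m, lam, sigma = 2.45, 0.8, 1.0
quartiles = [quantile(a, m, lam, sigma) for a in (0.25, 0.50, 0.75)]
print(quartiles)          # the middle entry equals exp(m)

# Gap between the third quartile and the median on the log scale at two values of M:
# it is constant in M only when lam = 1.
for m_value in (2.0, 3.0):
    gap = math.log(quantile(0.75, m_value, lam, sigma)) - m_value
    print(m_value, gap)
```

The second loop illustrates why the non-median quantiles of log T are not parallel in the covariates unless λ = 1.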

The expression in (5) for the TBS model also implies that Qα(Z1) ≥ Qα(Z2) if and only if Qα′(Z1) ≥ Qα′(Z2), for all α, α′ ∈ (0, 1). This means that under the model in (3), ordering between two patients’ median survival times implies uniform ordering between their corresponding survival functions over the entire time-axis. This property is similar to Cox’s model, where the ordering between two hazards (as well as survival functions) remains the same over the entire time-axis.

3. Likelihood, Prior Process and Inference

Let Ti and Ci be the survival and censoring times, respectively, for i= 1, …, n. We observe (ti0, δi), where ti0 = Ti ∧ Ci = min(Ti, Ci) is the observed follow-up time and δi is the censoring indicator, with δi= 1 for Ti=ti0 and 0 otherwise. It is assumed that Ti and the random censoring time Ci are conditionally independent given covariate Zi. Given the observed data vector y0= (t0, δ*) with t0= (t10, …, tn0) and δ*= (δ1, …, δn), the likelihood function under our TBS model of (3) is as follows:

  • image(6)

where ωi = gλ(yi) − gλ(β′Zi) with yi = log(ti0), and Fε is the cdf of the unimodal symmetric density function fε, with dFε(u) = fε(u)du.

In general, for the parametric versions of the TBS model, any unimodal symmetric distribution, such as the Gaussian and logistic, can be used for Fε. For example, fε(w) and Fε(w) will be respectively replaced by the density φσ(w) and cdf Φσ(w) of N(0, σ²) for the Gaussian TBS model likelihood in (6). The corresponding posterior is p(τ, σ|y0) ∝ L(τ, σ|y0)π(τ, σ), where π(τ, σ) is the joint prior density based on the available prior information, with τ = (β, λ). MCMC samples from this joint posterior can be used to implement a parametric Bayesian analysis. Under this parametric model, the maximum likelihood estimator (MLE) of the regression parameters β can be obtained by maximizing the logarithm of the likelihood L(τ, σ|y0). For example, the log-likelihood function of the (Gaussian) parametric TBS model is

  • image(7)

where 1 − Φσ(·) is the survival function of N(0, σ²). The MLE of the parameters under the parametric TBS model is obtained by maximizing the corresponding log-likelihood function ℓ(τ, σ|y0) using Newton–Raphson (NR) iterations. Under mild regularity conditions, the MLE of β (as well as the parametric Bayes estimator) is consistent and asymptotically efficient based on regular large sample theory for the MLE when the modeling assumption is correct.
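As an illustration of how a censored Gaussian TBS log-likelihood can be evaluated, the sketch below makes several assumptions that are not taken from the displayed formulas: it uses gλ(y) = sgn(y)|y|^λ/λ, lets an observed event contribute log φσ(ωi) + (λ − 1) log|yi| (the density of yi = log ti0, dropping terms free of the parameters), and lets a censored observation contribute log{1 − Φσ(ωi)}. The data and parameter values are hypothetical.

```python
import math
from statistics import NormalDist


def g(y, lam):
    """Assumed transform: sgn(y) * |y|**lam / lam."""
    return math.copysign(abs(y) ** lam, y) / lam


def tbs_gaussian_loglik(beta, lam, sigma, times, events, covariates):
    """Sketch of a Gaussian TBS log-likelihood with right censoring."""
    normal = NormalDist(0.0, sigma)
    total = 0.0
    for t, delta, z in zip(times, events, covariates):
        y = math.log(t)
        m = sum(b * x for b, x in zip(beta, z))
        omega = g(y, lam) - g(m, lam)
        if delta == 1:
            # Event: log density of y, including the Jacobian |y|**(lam - 1).
            total += math.log(normal.pdf(omega)) + (lam - 1.0) * math.log(abs(y))
        else:
            # Censored: log survival probability beyond the follow-up time.
            total += math.log(1.0 - normal.cdf(omega))
    return total


# Hypothetical toy data: times in months, event indicators, (intercept, treatment).
times = [5.0, 12.0, 20.0, 30.0]
events = [1, 1, 0, 1]
covariates = [(1, 0), (1, 1), (1, 0), (1, 1)]

print(tbs_gaussian_loglik((2.3, 0.4), 0.8, 0.9, times, events, covariates))
```

A function of this kind could be passed to a numerical optimizer in place of hand-coded Newton–Raphson steps; with λ = 1 and no censoring it reduces to the ordinary log-normal regression log-likelihood.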

Any parametric assumption about Fε in (3) may be too restrictive for some data examples in practice. In the semiparametric version of (3), the unimodal symmetric density of ε is assumed unknown. For semiparametric maximum likelihood estimation (SPMLE) under this model, the likelihood of (6) is maximized subject to the restriction that Fε is the cdf of a unimodal distribution symmetric around 0. The regularity conditions and asymptotic issues for the SPMLE under (6) are nontrivial and beyond the scope of this article. For semiparametric Bayesian analysis, we need the posterior

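  • $p(\tau, F_\varepsilon \mid y_0) \propto L(\tau, F_\varepsilon \mid y_0)\, \pi_{12}(\tau)\, \pi_3(F_\varepsilon)$   (8)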

where π12 and π3 are independent priors of τ = (β, λ) and Fε. This uses the simplifying but reasonable assumption that the prior opinions about the parametric vector τ and the nonparametric function Fε can be specified independently. We will discuss the practical justification of this assumption later.

Using the following result of Feller (1971, p.158), we introduce a class of nonparametric priors π3 defined over the space of symmetric unimodal distribution functions Fε in (3). Any symmetric unimodal distribution Fε can be expressed as a scale-mixture of uniform random variables

  • $F_\varepsilon(u) = \int_0^\infty \zeta(u \mid \theta)\, dG(\theta)$   (9)

for some mixing distribution G(θ), where ζ(u|θ) for θ > 0 is the uniform distribution with support (−θ, +θ). We use the Dirichlet process (DP) of Ferguson (1973), G ∼ DP(G0, ν), as a nonparametric prior for the unknown scale-mixing distribution G(θ) of (9). The DP(G0, ν) is characterized by the known “prior guess” G0 (the prior expectation of G) and a positive scalar parameter ν, the precision parameter around the prior mean/guess G0. The prior mean G0 of the random mixing distribution G can be chosen appropriately to assure a desired prior mean/guess F* for unknown Fε. Using a result by Khintchine (1938), when the density f*(·) and its derivative f*′(·) exist, the density g(θ) of G(θ) is given as

  • $g(\theta) = -2\,\theta\, f^{*\prime}(\theta), \quad \theta > 0$   (10)

For example, to obtain an approximate double exponential (Dexpo(γ)) prior mean density f*(u|γ) = (γ/2) exp(−γ|u|) for the regression error density fε, using (10), we need to choose G0(θ|γ) as Gamma(2, γ) with density g0(θ|γ) = γ²θ exp(−γθ). The precision parameter ν also determines the degree of belief about how close Fε should be to its prior guess F*. When ν is large enough, the unknown nonparametric Fε is very close to its prespecified (often parametric) prior mean/guess F*(·|γ). A small ν implies very little confidence in unknown Fε being close to F*(·|γ), and the corresponding Bayes estimator of β is expected to be very close to the semiparametric likelihood estimator. The details of the specifications of the hyperparameters of the priors π12 and π3 in (8) are provided in the next section.
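The double exponential example can be verified numerically. The sketch below assumes the rate parameterization f*(u|γ) = (γ/2) exp(−γ|u|) and a Gamma(2, γ) mixing density γ²θ exp(−γθ); it integrates the uniform densities 1/(2θ) over θ > |u| with a simple midpoint rule and also draws errors by sampling θ first. The integration limits and the seed are arbitrary choices.

```python
import math
import random


def dexp_density(u, gamma):
    """Double exponential density with rate gamma."""
    return 0.5 * gamma * math.exp(-gamma * abs(u))


def g0_density(theta, gamma):
    """Gamma(2, rate gamma) mixing density for the uniform half-width theta."""
    return gamma ** 2 * theta * math.exp(-gamma * theta)


def mixture_density(u, gamma, upper=60.0, steps=20000):
    """Midpoint-rule value of the scale mixture of Uniform(-theta, theta) densities at u."""
    lower = abs(u)
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        theta = lower + (k + 0.5) * h
        total += g0_density(theta, gamma) / (2.0 * theta)
    return total * h


def sample_error(gamma, rng):
    """Draw theta from Gamma(2, rate gamma), then the error uniformly on (-theta, theta)."""
    theta = rng.gammavariate(2.0, 1.0 / gamma)   # second argument is the scale, 1 / rate
    return rng.uniform(-theta, theta)


gamma = 1.5
print(mixture_density(0.7, gamma), dexp_density(0.7, gamma))

rng = random.Random(1)
draws = [sample_error(gamma, rng) for _ in range(20000)]
sample_var = sum(d * d for d in draws) / len(draws)
print(sample_var, 2.0 / gamma ** 2)   # double exponential variance is 2 / gamma^2
```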

4. Data Analysis

Here we analyze the data set from the randomized cross-over trial of Etoposide (E) and Cisplatin (C) for SCLC patients (Ying et al., 1995); 62 cancer patients (z1= 1) were randomized to arm A (C followed by E) and 59 patients (z1= 0) to arm B (E followed by C). Apart from the treatment indicator z1, another covariate is the patient’s age at entry (z2) centered at age 50. Each survival time (given in months) was either observed (δi= 1) or administratively censored (δi= 0). To evaluate the age-adjusted treatment difference, we consider the linear regression function Mi = β0 + β1z1i + β2z2i. The maximum likelihood estimates of the regression parameters β under the parametric TBS model (3) with Gaussian Fε are given by inline image, inline image, inline image with inline image.

Now we present a parametric Bayesian analysis using the TBS model of (3) with parametric N(0, σ²) density for Fε. One major advantage of the TBS model for Bayesian analysis is that the priors for the parameters (β0, β1, β2, λ, σ) can be determined based on prior opinions about some key quantities related to the prior-predictive survival time T* of a patient with known covariate values, say, inline image. Without loss of generality, we assume that the priors are based on the following: (1) Prior guess and prior range of a quantile, say, the median, of the prior-predictive survival time T* of a patient at age 50 (z2 = 0) from treatment arm B (z1 = 0); (2) Change in the median of T* for a unit change in each of age (z2) and treatment (z1). We point out that for most Phase 2 and 3 trials, these quantities are routinely elicited and used to design the trial and determine the power for detecting differences (e.g., Piantadosi, 1997). We first demonstrate the specification of these priors for the parametric TBS models.

We use the simplifying assumption that the joint prior is π(β, λ, σ) = π1(β)π2(λ)π3(σ|β, λ). This assumption can be justified in practice because the prior π1(β) is based on the median (location) of T*, whereas the prior π2(λ) is based on the shape (skewness) of log(T*). The specification of the prior for β0 uses the fact that T* with z1 = z2 = 0 has a prior median exp(β0). For the lung cancer trial conducted before 1993, the current expert opinions about SCLC are not very appropriate. Based on the published literature about the treatment of SCLC before this article (e.g., Jett, Everson, and Therneau, 1990; Evans et al., 1987; Comis, 1986), the median survival time for treatment arm B was thought to be between 12–17 months for limited-stage and 9–10 months for extensive-stage SCLC patients. For our SCLC study with nearly equal proportions of these two types of patients, we use a mean prior guess of 13 months and a range of (8, 18) months for T*. These give us the prior inline image with A1 = log(13) and B1 = {log(18) − log(8)}/3 to ensure that the prior range of β0 has approximate length 3B1. Our prior opinion about β1 is based on the prior belief about the ratio of medians exp(β1) of two patients with identical age but from different treatment arms. So, the prior β1 ∼ N(0, 10) corresponds to a 95% prior probability that the ratio of medians exp(β1) lies in the range (exp(−1.96√10), exp(1.96√10)) and is centered at e^0 = 1 (indifferent opinion regarding superiority of either treatment arm). Similarly, the prior β2 ∼ N(0, 10) corresponds to the prior opinion that two patients from treatment B with a 1-year difference in age have a ratio of medians between exp(−√10) and exp(√10) with 68% probability. We have chosen such a noninformative prior opinion about β1 and β2 to allow for a meaningful comparison of our analysis results with results from frequentist and previous Bayes methods based on either no prior or a noninformative prior.
We would like to point out that our point-wise Bayes estimates do not change substantially (<4% change) when we reduce the prior variances of β1 and β2 to 1 (instead of 10). The interval estimate of β1 (as an example) is around 12% narrower when we use these more skeptical N(0, 1) priors instead of N(0, 10) priors for β1 and β2.
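The arithmetic behind these prior choices is easy to reproduce. The sketch below treats the second argument of N(0, 10) as a variance (consistent with the reference to prior variances above) and computes the implied ranges for the ratio of medians; the variable names are ours.

```python
import math
from statistics import NormalDist

# Location and spread constants for the prior of beta_0 (median of 13 months,
# range of 8 to 18 months on the original time scale).
A1 = math.log(13)
B1 = (math.log(18) - math.log(8)) / 3

# N(0, 10) prior for beta_1 and beta_2, with 10 read as the prior variance.
prior_sd = math.sqrt(10)
z975 = NormalDist().inv_cdf(0.975)

# Central 95% prior range of the ratio of medians exp(beta_1).
ratio_95 = (math.exp(-z975 * prior_sd), math.exp(z975 * prior_sd))
# Central 68% prior range (one prior standard deviation either side of 0).
ratio_68 = (math.exp(-prior_sd), math.exp(prior_sd))

print(round(A1, 3), round(B1, 3))
print(ratio_95)
print(ratio_68)
```

The very wide 95% range, with an upper limit in the hundreds, shows how little these priors constrain the covariate effects.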

We use the Unif(0, 3) prior for π2(λ) because it is difficult to interpret the after-transform linear model of (3) when λ > 3. In their original paper, Box and Cox (1964) recommended restricting λ ≤ 2. For a parametric Gaussian TBS model, log T*, when inline image, can be expressed approximately as log T* ≃ β0 + σ|β0|^(1−λ) e (Kettl, 1991), where β0 is the median of log T* and e ∼ N(0, 1). This allows us to obtain the prior π3(σ|β0, λ) based on prior opinion about another quantile qα* of log T* for α* ≠ 1/2, because σ ≃ (qα* − β0)|β0|^(λ−1)/zα*, where zα* is the α*-percentile of the standard normal. For example, when we take α* = 0.75, we have z0.75 = 0.6745 and σ ≃ (q0.75 − β0)|β0|^(λ−1)/0.6745. Based on the SCLC literature prior to this trial, we use the prior opinion that the third quartile of the survival time of a patient in the treatment arm with an entry age of 50 years is between 10 months and 5 years, with a center of 33 months. For given (β0, λ), we use a Gamma density as the prior π3(σ|β0, λ) with mean equal to |β0 − log(33)||β0|^(λ−1)/0.6745 and approximate range between 0 and (log(60) − log(10))|β0|^(λ−1)/0.6745. These prior densities give us approximately the same means and ranges of inline image and inline image that we expect from our prior opinion about these two quantiles of log(T*). However, to simplify this further, we use an unconditional Gamma prior π3(σ) whose mean equals inline image and variance equals inline image (based on prior mean log(13) for β0 and prior guess 1 for λ). We found no noticeable difference in posterior estimates using this unconditional prior for σ instead of a conditional prior π3(σ|β0, λ). We remind the reader that the priors used in our analysis are solely for demonstrating the method of development of one set of priors for the Bayesian analysis of the lung-cancer study. An expert’s prior opinions on the median survival time of SCLC can be very different from what we used, and that may lead to a different prior specification of the parameters.
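The link between a prior guess for the third quartile and the scale σ can be sketched as follows, using the approximation log T* ≃ β0 + σ|β0|^(1−λ) e quoted above. The function name is ours, and the evaluation at β0 = log(13) and λ = 1 simply plugs in the prior guesses mentioned in the text.

```python
import math
from statistics import NormalDist

Z75 = NormalDist().inv_cdf(0.75)   # third-quartile point of the standard normal


def implied_sigma(log_quantile_gap, beta0, lam, z_alpha=Z75):
    """sigma implied by a gap |q_alpha - beta0| on the log-time scale."""
    return abs(log_quantile_gap) * abs(beta0) ** (lam - 1.0) / z_alpha


beta0, lam = math.log(13), 1.0

# Center: third quartile of 33 months versus a median of 13 months.
sigma_center = implied_sigma(math.log(33) - beta0, beta0, lam)
# Upper end of the approximate range, from the 10-month to 5-year span.
sigma_upper = implied_sigma(math.log(60) - math.log(10), beta0, lam)

print(round(Z75, 4), round(sigma_center, 3), round(sigma_upper, 3))
```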

Our plot (left-hand panel of Figure 1) of residuals yi − ŷi versus the patient’s age at entry, where yi is the observed log(Ti) (subject to censoring) and ŷi is the posterior predictive expectation of log(Ti) under the model, does not show any trend of residuals under the parametric Bayes TBS model. Our plot (right-hand panel of Figure 1) of these residuals versus the estimated median survival times also does not reveal any serious inadequacy of the parametric TBS model. However, the Q–Q plot (Figure 2) of these residuals suggests that the assumption of a Gaussian distribution for Fε in (3) is questionable because the plot is nonlinear at the right tail. Later, we use a semiparametric Bayesian analysis to avoid the Gaussian assumption for εi. Our posterior means (Bayes estimates) of three quartiles Qα(z1, z2) for α = 0.25, 0.50, 0.75 of treatment A (z1= 1) are higher than the corresponding estimated quantiles of treatment B (z1= 0) at any age z2.

Figure 1. Plots of residuals versus the age at entry (in years) and versus the estimated median survival time (in months) using parametric TBS model for the lung cancer data.

Figure 2. Q–Q plots of the residuals under parametric TBS model for the lung cancer data.

For the semiparametric Bayesian analysis with a symmetric unimodal fε in (3), we need to specify the prior guess/mean F* of Fε and a prior precision parameter ν. We take the precision parameter ν = 1 to imply a very low confidence around our parametric prior guess F* of the nonparametric error distribution Fε. We take the prior mean f* of fε to be N{0, (σ0)²} where inline image. This makes f* equal to the prior mean of fε used for the parametric Bayes analysis of the TBS model. Using (10), this N{0, (σ0)²} density for f* corresponds to a G0 under which θ² follows a Gamma(3/2, 1/{2(σ0)²}) distribution. The constructive definition of the DP mixture prior process for Fε is Fε(u) = Σ_{k=1}^∞ pk ζ(u|θk) (Sethuraman, 1994), where the θk are i.i.d. from G0 and pk = Vk ∏_{j<k} (1 − Vj) with the Vk i.i.d. Beta(1, ν). The actual implementation of the MCMC tool to sample from (8) is based on a finite approximation Σ_{k=1}^K pk ζ(u|θk) of Sethuraman’s construction with, say, K = 1000 and VK = 1. The MCMC computational tool can be implemented even via a standard package such as WinBUGS. The rest of the conditional posteriors are the same as those used for the parametric Bayes analysis.
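A single draw from the truncated stick-breaking prior can be sketched as below. This is only an illustration of the prior, not of the full MCMC sampler: it draws Vk from Beta(1, ν), sets the last stick to 1, and draws θk by taking the square root of a Gamma(3/2) variate with rate 1/(2σ0²), which is the base measure implied by (10) for a N(0, σ0²) prior guess. The value of σ0 and the seed are hypothetical.

```python
import math
import random


def draw_truncated_dp_mixture(nu, sigma0, K, rng):
    """One prior draw of (weights, half-widths) from the K-term stick-breaking approximation."""
    weights, thetas = [], []
    remaining = 1.0
    for k in range(K):
        v = 1.0 if k == K - 1 else rng.betavariate(1.0, nu)   # V_K = 1 closes the stick
        weights.append(v * remaining)
        remaining *= 1.0 - v
        theta_sq = rng.gammavariate(1.5, 2.0 * sigma0 ** 2)   # shape 3/2, scale = 1 / rate
        thetas.append(math.sqrt(theta_sq))
    return weights, thetas


def mixture_cdf(u, weights, thetas):
    """cdf of the weighted mixture of Uniform(-theta_k, theta_k) distributions."""
    total = 0.0
    for w, theta in zip(weights, thetas):
        total += w * min(1.0, max(0.0, (u + theta) / (2.0 * theta)))
    return total


rng = random.Random(7)
weights, thetas = draw_truncated_dp_mixture(nu=1.0, sigma0=1.4, K=1000, rng=rng)
print(sum(weights))                          # weights add up to 1
print(mixture_cdf(0.0, weights, thetas))     # symmetry about 0 gives 1/2
```

Every draw is symmetric and unimodal about 0 by construction, which is why the median interpretation of β is preserved under this prior.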

We get the semiparametric Bayes point estimates inline image, inline image and inline image for (β0, β1, β2) along with 95% credible intervals (2.836, 3.315), (0.003, 0.577) and (−0.022, 0.011) respectively, with inline image. The results of the Bayes estimators of regression parameters (β1 and β2) under parametric and semiparametric TBS models, along with the ML estimator based on a parametric Gaussian error TBS model, are presented in Table 1. The last line of Table 1 is the result for the Bayesian median regression model of Kottas and Gelfand (2001) using the model of (4) with fe(u) = (1/2) ηsgn(u) f0(ηsgn(u)|u|) for a nonparametric density f0(u) defined on u > 0.

Table 1. Pointwise and 95% interval estimates (within parentheses) of regression parameters (β1 for treatment z1 and β2 for age z2) for the lung cancer study under different procedures.
Estimator                     Treatment               Age
MLE (TBS model)               0.433 (0.141, 0.727)    0.019 (0.037, 0.002)
Parametric Bayes (TBS)        0.318 (0.036, 0.604)    0.008 (0.023, 0.008)
Semiparametric Bayes (TBS)    0.304 (0.083, 0.577)    0.009 (0.021, 0.002)
Portnoy                       0.369 (0.149, 0.591)    0.009 (0.031, 0.012)
KG Bayes                      0.389 (0.037, 0.845)    0.018 (0.028, 0.007)

The point estimates of the regression parameters of the median functional under different methods are not strikingly different from the corresponding point estimates obtained via Portnoy’s method (2003). This is also evident from Figure 3, where three estimated quantiles for Portnoy’s method (dotted straight lines) and for the semiparametric Bayes TBS model (solid curved lines) are plotted (separately for the two treatment arms).

Figure 3. Plots of observed survival times versus Age (z2) with three estimated quartile functions for two treatment arms. (Solid lines: estimated via TBS model; Dotted straight lines: estimated via Portnoy’s method; : censored observation). This figure appears in color in the electronic version of this article.

We find that the proportion of observations in each quantile-interval is closer to the expected proportions for Bayes estimates of quantile functions compared to Portnoy’s. However, ML and Bayes methods yield smaller estimated standard errors and substantially narrower interval estimates than those obtained using Portnoy’s method. For this data example, the estimates based on TBS models have smaller estimated standard errors for the treatment effect compared to competing procedures. The widths of the interval estimates from parametric and semiparametric Bayes are substantially smaller than the widths of the corresponding estimates based on Portnoy’s method (at least for the age effect). This is not surprising because Portnoy’s median regression methods have a far larger number of regression parameters than the finite dimensional regression parameter β in (3). The posterior standard deviations of the TBS estimators are also smaller than those from Kottas and Gelfand (2001). Figure 4 plots the logarithm of the ratio of the Conditional Predictive Ordinate (CPO) of the semiparametric TBS model and the CPO of the Gaussian TBS model against age. A value greater than 0 supports the semiparametric model over the Gaussian model. In this example, approximately 67% of observations favor the semiparametric TBS model over the Gaussian TBS model, i.e., a substantially higher proportion of observations supports the semiparametric model over the parametric model. The final conclusion is that the semiparametric model fits the data better than the competing parametric models for the entire range of age and for both treatments.
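One common way to approximate a CPO from MCMC output is the harmonic mean of the likelihood contributions of an observation across posterior draws. The sketch below assumes that estimator, which is a standard choice rather than something stated above, and uses made-up likelihood values purely to show the mechanics of the log-ratio comparison.

```python
import math


def cpo(likelihood_draws):
    """Harmonic-mean estimate of CPO_i from f(y_i | theta_s) over posterior draws s."""
    n_draws = len(likelihood_draws)
    return n_draws / sum(1.0 / value for value in likelihood_draws)


def log_cpo_ratio(draws_semiparametric, draws_gaussian):
    """Positive values favor the semiparametric model for that observation."""
    return math.log(cpo(draws_semiparametric)) - math.log(cpo(draws_gaussian))


# Hypothetical likelihood values for three observations under each model.
semiparametric = [[0.031, 0.028, 0.035], [0.012, 0.015, 0.011], [0.020, 0.022, 0.019]]
gaussian = [[0.025, 0.024, 0.027], [0.014, 0.016, 0.013], [0.015, 0.017, 0.016]]

log_ratios = [log_cpo_ratio(s, g) for s, g in zip(semiparametric, gaussian)]
share_favoring = sum(r > 0 for r in log_ratios) / len(log_ratios)
print(log_ratios, share_favoring)
```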

Figure 4. Plot of the log-ratio of two CPOs obtained from semiparametric TBS and Gaussian TBS model (y-axis), versus Age (x-axis): , uncensored from treatment A; , censored from treatment A; , uncensored from treatment B; , censored from treatment B). This figure appears in color in the electronic version of this article.

5. Simulation Study

For our simulation models, we set the median of Y = log(T) given Z to be M(Z) = β0 + β1Z = 6.5 + Z, i.e., β0 = 6.5 and β1 = 1.0, where Z can take four possible values 0, 0.5, 1.0, and 1.5, in equal proportions for each simulated data set. For each simulation distribution of T considered in the article, we simulate at least 5000 datasets with sample sizes n = 80, 160, and 320. The number of simulated datasets for different sample sizes may vary to assure that the Monte Carlo variability of the approximate bias and MSE of the regression estimates is smaller than 0.01.

For the simulation study, the Bayes estimators considered by us are based only on the semiparametric model of (3). The priors used for Bayes estimation in the simulation study are: β0 ∼ N(6, 10) and β1 ∼ N(0, 1). The prior mean for the DP is N(0, 1) and the precision is ν = 0.01. This prior for β1 implies that there is almost 5% prior probability that the ratio of medians for a unit change in z lies outside the interval (1/7.4, 7.4). In order to avoid undue influence of the choice of the priors on the results of our simulation study, we use these somewhat vague priors here. However, the prior can also be viewed as a skeptical prior because the prior of the regression parameters is centered at the prior guess of no covariate effect (β = 0). If a Bayes estimator can demonstrate good performance for detecting covariate effects with this prior, this suggests that even a skeptical and unusually “flat” prior may not hinder the Bayes method’s ability to detect the covariate effect. The implications of chosen priors for multiple model parameters are best described via various summaries of the prior predictions of the observables/responses. We generate various summary statistics including the sample median, range and width of the range of 500 survival times using a single set of parameters simulated from the joint prior. We then replicate the whole process of simulating these summary statistics 1000 times. We found the range of these 1000 sample medians is between 1400 and 5200 for z = 1.5, compared to the true median of ≃2981 for the simulation model. The range of survival times from the prior predictive models may have width as large as 10^8. These summaries indicate that our prior predictive models are very noninformative and can cover a wide range of survival patterns. In practice, we expect to use a more informative prior predictive model using often available information about the range of responses (even after incorporating a skeptical prior view about the covariate effect).

First we evaluate the robustness of the MLEs and of the Bayes estimates based on (3). We compare performances (bias and MSE) of these estimators to the competing frequentist estimator of Portnoy (2003). For this aim, we simulate survival data from parametric exponential and Pareto densities. Both exponential and Pareto simulation densities, being heteroscedastic and skewed for all λ, do not satisfy the assumptions of (3). The independent censoring time C was generated from an exponential density Λ exp(−ΛC) with rate parameter Λ chosen to obtain desired proportions of censoring. For example, the choice of Λ = log(2)/30 results in approximately 20% censoring for the exponential simulation model.

Table 2 presents the summary of the approximate sampling mean and mean-square-error (MSE) of various competing estimators of β1 under different simulation models. Results in Table 2 under an exponential and Pareto simulation model show that the MLE based on (3), and the Bayes estimators based on (3) have comparable biases relative to competing estimators. Further, the MSE of Portnoy’s estimators are much larger than the corresponding MSE of the MLE and Bayes estimators. The Bayes estimators under (3) have much smaller MSE compared to the MLE.

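Table 2. Results of the simulation study under exponential and Pareto models: Monte Carlo approximation of the sampling mean and mean squared error (MSE) of different estimators of the known β1 = 1.

| Simulation model | Sample size | Gaussian TBS MLE: Mean | Gaussian TBS MLE: MSE | Portnoy: Mean | Portnoy: MSE | SP TBS: Mean | SP TBS: MSE |
| Exponential | 80 | 0.91 | 2.66 | 0.93 | 4.27 | 0.92 | 0.90 |
| Exponential | 160 | 0.97 | 1.35 | 1.11 | 2.28 | 1.08 | 0.65 |
| Exponential | 320 | 0.94 | 0.69 | 0.96 | 1.20 | 0.93 | 0.48 |
| Pareto | 80 | 1.03 | 12.01 | 1.10 | 19.89 | 1.03 | 0.95 |
| Pareto | 160 | 0.93 | 5.41 | 0.91 | 8.60 | 1.01 | 0.85 |
| Pareto | 320 | 0.92 | 2.68 | 0.98 | 4.25 | 1.02 | 0.68 |
| TBS (Double Exponential) | 80 | 0.99 | 1.94 | 1.01 | 1.52 | 1.04 | 0.72 |
| TBS (Double Exponential) | 160 | 0.96 | 0.97 | 0.98 | 1.69 | 0.97 | 0.48 |
| TBS (Double Exponential) | 320 | 0.97 | 0.51 | 0.98 | 1.35 | 1.03 | 0.30 |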
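
The Mean and MSE columns of Table 2 are Monte Carlo summaries over replicated data sets. A minimal sketch of how such summaries are typically computed from a list of replicate estimates is given below; the function name and the toy input are illustrative assumptions, not the authors' code or data.

```python
def monte_carlo_summary(estimates, true_value):
    """Monte Carlo mean, bias, variance and MSE of replicate estimates.

    Uses the decomposition MSE = variance + bias**2, with the variance
    computed around the Monte Carlo mean (divisor n, not n - 1).
    """
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    variance = mse - bias ** 2
    return {"mean": mean, "bias": bias, "variance": variance, "mse": mse}


# Toy replicate estimates of a parameter whose true value is 1.
summary = monte_carlo_summary([0.5, 1.5, 1.0, 2.0], true_value=1.0)
print(summary)
```

Because MSE combines squared bias and variance, two estimators with similar means in Table 2 can still differ widely in MSE when one of them is much more variable.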

For the Pareto simulation model, gλ(Y) has an extremely skewed and heavy-tailed density for all values of λ. In smaller samples (n ≤ 160), Portnoy’s estimator has the most bias. For the largest sample size (n = 320), the bias of the Gaussian MLE of β1 is highest. The semiparametric Bayes estimators have the smallest bias for all sample sizes and also have much smaller MSE than the competing estimators.

For the last part of Table 2, we investigate the performance of the semiparametric Bayes estimator using data simulated from a TBS model of (3) with λ= 0.5 and double-exponential density for ε. We see that the Bayes estimators have substantial improvement in MSE compared to competing estimators. The bias of the MLE under the Gaussian TBS model is similar for n= 160 and 320.

In summary, when the distribution of log(T) after an optimal transformation has a moderate degree of asymmetry, the MLE and Bayes estimators based on (3) have finite-sample biases very similar to that of Portnoy’s (2003) estimator. More importantly, the precision of the Bayes estimators based on TBS is better even when the underlying assumptions of (3) are not entirely valid. However, the MLE’s performance depends on the degree of symmetry of the distribution of gλ(Y) under the optimal λ. The semiparametric Bayes estimators have small biases and the smallest MSE among all competitors. When the modeling assumption of (3) is correct, the Bayes estimator based on (3) shows much smaller MSE than any competing estimator. This implies that the semiparametric Bayes estimator based on (3) is a safer and more robust estimator to use in practice than its competitors.

6. Discussion

In this article, we present a new class of semiparametric models amenable to Bayes estimation of the log-linear median regression function for censored survival data. Similar to previous semiparametric models (e.g., Cox’s model), our model has a finite-dimensional parameter vector and one nonparametric symmetric unimodal function fε. We argue that our assumption of unimodality of fε justifies the importance of the median as the location parameter of interest. Previous research, including Box and Cox (1964), has found that the transformation in (2) is often an effective tool to obtain symmetry and accommodate heteroscedasticity. Our method can be applied when the covariate Z affects the location as well as the scale and shape of log(T).

Median regression offers a useful alternative to the popular regression functions of Cox (1972) and the transformation model of (1). There is a substantial literature on median regression for censored survival data, including Ying et al. (1995), Yang (1999), McKeague, Subramanian, and Sun (2001) and Bang and Tsiatis (2003). These methods involve nonlinear discontinuous estimating equations that are difficult to solve, often with multiple solutions. The recursive nature of some of these methods (e.g., that of Portnoy, 2003) makes the asymptotic justifications and computations complicated. Peng and Huang’s (2008) martingale-based estimating equations involve minimization of an L1-type discontinuous convex function. Unlike estimation with Cox’s model (Cox, 1972), martingale-based methods may not be the most efficient for estimating regression parameters of the median survival time. For most of these methods, every quantile functional Qα(Z) is assumed to be linear in Z for all α ∈ (0, 1), where P{T > Qα(Z)} = α. Unlike our model of (3), these frequentist linear quantile models have an infinite number of regression parameters βα for all α ∈ (0, 1). As a consequence, unlike the model of (3), there is no simple expression available for survival functions for these models. For more in-depth discussion about the implementation, comparisons, asymptotic rate of convergence and consequences of the restrictive assumptions for existing quantile regression approaches, we suggest the excellent review by Koenker (2008). This restrictive assumption of linearity of all quantile functions may not hold true for any real study and very few known stochastic models can satisfy this, except when inline image with e ∼ N(0, σ²), inline image and MiZ (Kettl, 1991). As an alternative to the semiparametric model of (3), we can also consider gλ{log(Ti)} = gλ(Mi) + |Mi|^γ ηi with a symmetric unimodal density for ηi. However, both of these models are less parsimonious than (3) because they use separate parameters to address skewness and heteroscedasticity. Our preliminary simulation studies (omitted for brevity) also cast doubt on the practical advantages of these alternatives to (3).

Existing Bayesian median regression models of Kottas and Gelfand (2001) and Hanson and Johnson (2002) have the linear representation of (4) with fe(u) free of the covariate Z. As a consequence, all quantile functions of log(T) are linear with the same slope (regression coefficient) for each covariate. As mentioned before, these previous Bayes models cannot accommodate heteroscedasticity of log(Ti), a very common phenomenon in most popular survival models, including the Weibull model and Cox’s model (1972). We believe that our models achieve a sensible compromise between existing frequentist and Bayesian models by accommodating heteroscedasticity without restricting all quantiles to be linear functionals. Note that the parametric MLE of β based on an assumed Gaussian ε yields a consistent quasi-likelihood estimator of β as long as the true ε is symmetric around 0 (even if it is not Gaussian); the variance of this MLE can be estimated using the so-called “sandwich” variance estimator (White, 1982). The loss of efficiency of this estimator under a non-Gaussian model is beyond the scope of this article.
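
For reference, the sandwich form mentioned above can be written as follows. This is the standard textbook expression rather than a formula taken from this article: ℓ_i(β) is assumed notation for the Gaussian TBS log-likelihood contribution of subject i, and nuisance parameters (such as λ and the scale) are suppressed for simplicity.

```latex
\widehat{\mathrm{Var}}(\hat{\beta}) = \hat{A}^{-1}\,\hat{B}\,\hat{A}^{-1},
\qquad
\hat{A} = -\sum_{i=1}^{n}
  \left.\frac{\partial^{2}\ell_i(\beta)}{\partial\beta\,\partial\beta^{\top}}\right|_{\beta=\hat{\beta}},
\qquad
\hat{B} = \sum_{i=1}^{n}
  \left.\frac{\partial \ell_i(\beta)}{\partial\beta}\,
        \frac{\partial \ell_i(\beta)}{\partial\beta^{\top}}\right|_{\beta=\hat{\beta}}.
```

When the Gaussian working model is correct, the two matrices estimate the same quantity and the expression reduces to the usual inverse information; under a symmetric but non-Gaussian ε they can differ, which is why the sandwich form is used.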

Although we focus on modeling the median functional, our method can be used to compute the joint confidence band of any other quantile functional via (5). For brevity, we have omitted the results of our simulation study showing excellent accuracy of the joint confidence bands of the quantile functions (Q0.25(z), Q0.5(z), Q0.75(z)) under the Bayes TBS model of (3) (even when the simulation model is Pareto). For some diseases, such as cancers with very good prognosis, the main interest may center on modeling the quantile Qα(Z) as a log-linear function with P{T < Qα(Z)} = α for α > 1/2 (different from the median). In this case, we can use a modification of (3) with assumptions P(ε < 0) = Fε(0) = α and ζ(u|θ) of (9) being the uniform density with support {2θ(α − 1), 2θα}. For the sake of brevity, we again omit the details of the rest of the methodology and related MCMC steps. Our methods can also predict the outcome of a future patient with known covariate values. We do not present any separate simulation study of parametric Bayes estimators because these estimators under diffuse prior information are numerically close to parametric ML estimators. All of these advantages make our proposed method an extremely attractive alternative to other existing semiparametric methods for censored data.

Acknowledgements

Research work of this article was partially supported by grants from the National Cancer Institute (NCI) of USA and from Sao Paulo Research Foundation (FAPESP) of Brazil. The authors would like to thank the editorial boards and reviewers for their helpful and constructive comments.

References

  • Bang, H. and Tsiatis, A. A. (2003). Median regression with censored cost data. Biometrics 58, 643–649.
  • Bickel, P. J. and Doksum, K. A. (1981). An analysis of transformations revisited. Journal of the American Statistical Association 76, 296–311.
  • Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society, Series B 26, 211–243.
  • Carroll, R. J. and Ruppert, D. (1984). Power-transformations when fitting theoretical models to data. Journal of the American Statistical Association 79, 321–328.
  • Cheng, S. C., Wei, L. J., and Ying, Z. (1995). Analysis of transformation models with censored data. Biometrika 82, 835–845.
  • Comis, R. L. (1986). Clinical trials of cyclophosphamide, etoposide and vincristine in the treatment of small-cell lung cancer. Seminars in Oncology 13, 40–44.
  • Cox, D. R. (1972). Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B 34, 187–200.
  • Evans, W. K., Feld, R., Murray, N., et al. (1987). Superiority of alternating non-cross-resistant chemotherapy in extensive small cell lung cancer. Annals of Internal Medicine 107, 451–458.
  • Feller, W. (1971). An Introduction to Probability Theory and Its Applications. New York: Wiley.
  • Ferguson, T. S. (1973). Bayesian analysis of some nonparametric problems. The Annals of Statistics 1, 209–230.
  • Fine, J. P., Ying, Z., and Wei, L. J. (1998). On the linear transformation model with censored data. Biometrika 85, 980–986.
  • Fitzmaurice, G. M., Lipsitz, S. R., and Parzen, M. (2007). Approximate median regression via the Box–Cox transformation. The American Statistician 61, 233–238.
  • Hanson, T. and Johnson, W. O. (2002). Modeling regression error with a mixture of Polya trees. Journal of the American Statistical Association 97, 1020–1033.
  • Hanson, T. (2006). Inference for mixtures of finite Polya tree models. Journal of the American Statistical Association 101, 1548–1565.
  • Ibrahim, J. G., Chen, M.-H., and Sinha, D. (2001). Bayesian Survival Analysis. New York: Springer-Verlag.
  • Jett, J. R., Everson, L., and Therneau, T. M. (1990). Treatment of limited-stage small-cell lung cancer with cyclophosphamide, doxorubicin and vincristine with or without etoposide: A randomized trial of the North Central Cancer Treatment Group. Journal of Clinical Oncology 8, 33–38.
  • Koenker, R. (2008). Censored quantile regression redux. Journal of Statistical Software 27.
  • Khintchine, A. Y. (1938). On unimodal distributions. Inst. Mat. Mech. Tomsk. Gos. Univ. 2, 1–7.
  • Kettl, S. (1991). Accounting for heteroscedasticity in the transform both sides regression model. Applied Statistics 49, 261–268.
  • Kottas, A. and Gelfand, A. E. (2001). Bayesian semiparametric median regression modeling. Journal of the American Statistical Association 456, 1458–1468.
  • McKeague, I. W., Subramanian, S., and Sun, Y. (2001). Median regression and the missing information principle. Journal of Nonparametric Statistics 13, 709–727.
  • Peng, L. and Huang, Y. (2008). Survival analysis with quantile regression models. Journal of the American Statistical Association 103, 637–649.
  • Portnoy, S. (2003). Censored regression quantiles. Journal of the American Statistical Association 98, 1001–1012.
  • Piantadosi, S. (2005). Clinical Trials: A Methodologic Perspective, 2nd edition. Wiley Series in Probability and Statistics. New York: Wiley Interscience.
  • Sethuraman, J. (1994). Constructive definition of Dirichlet priors. Statistica Sinica 4, 639–650.
  • Walker, S. and Mallick, B. K. (1999). A Bayesian semiparametric accelerated failure time model. Biometrics 55, 477–483.
  • White, H. (1982). Maximum likelihood estimation under misspecified models. Econometrica 50, 1–26.
  • Yang, S. (1999). Censored median regression using weighted empirical survival and hazard functions. Journal of the American Statistical Association 94, 137–145.
  • Ying, Z., Jung, S. H., and Wei, L. J. (1995). Survival analysis with median regression models. Journal of the American Statistical Association 90, 178–184.

Appendix

Proof of Theorem 1. It is sufficient to prove the following: if Y = βx + ε* and gλ(Y) = gλ(β*x) + ε**, with ε* ∼ F* and ε** ∼ F** for some λ and two symmetric unimodal distributions F* and F** around 0, then F* = F**, β = β* and λ = 1.

Note that P(Y < y | x) = F*(y − θ) = F**{gλ(y) − gλ(θ*)} for all y, where θ = βx and θ* = β*x. Taking y = θ, we get F*(0) = F**(0) = F**{gλ(θ) − gλ(θ*)} = 1/2 ⇒ gλ(θ) = gλ(θ*). Because gλ is monotone, this implies θ = θ*, and the rest of the proof follows from there.