Abstract
 Top of page
 Abstract
 1. Introduction
 2. Semiparametric Models
 3. Likelihood, Prior Process and Inference
 4. Data Analysis
 5. Simulation Study
 6. Discussion
 Acknowledgements
 References
 Appendix
Summary. We present a novel semiparametric survival model with a log-linear median regression function. As a useful alternative to existing semiparametric models, our large model class has many important practical advantages, including interpretation of the regression parameters via the median and the ability to address heteroscedasticity. We demonstrate that our modeling technique eases prior elicitation and computation for both parametric and semiparametric Bayesian analysis of survival data. We illustrate the advantages of our modeling, as well as model diagnostics, via a reanalysis of a small-cell lung cancer study. Results of our simulation study provide further support for our model in practice.
1. Introduction
Semiparametric models such as Cox’s (1972) proportional hazards model and linear transformation models (Cheng, Wei, and Ying, 1995; Fine, Ying, and Wei, 1998) and their special cases (e.g., the accelerated failure time model) are very popular for modeling effects of covariates on a survival response. For example, the main aim of a semiparametric model for a two-arm randomized trial for small-cell lung cancer (SCLC) patients (Ying, Jung, and Wei, 1995) is to express the effects of treatment arm and age at entry on time from randomization to death (survival time). Often, there is substantial information available in the data to make inferences about the median. However, previous semiparametric models for survival data do not focus on the effects of covariates on the median and other quantiles. Several authors, including Ying et al. (1995), gave compelling arguments in favor of focusing on the quantiles of the survival time for modeling and reporting of data analysis results. The effects of treatment and age on the quantiles, including the median time to death, are useful for describing covariate effects. Clinical trials based on survival outcomes are often designed to detect differences in median survival between treatment arms. Models based on the median are also often useful in dealing with heteroscedasticity.
Semiparametric Bayesian models for survival data, possibly with the exception of Kottas and Gelfand (2001) and Hanson and Johnson (2002), are based either on covariate effects on the hazard ratio (see Ibrahim, Chen, and Sinha, 2001) or on the mean survival time (e.g., Walker and Mallick, 1999). However, particularly for Bayesian survival analysis, medians and other quantiles are natural choices for elicitation of experts’ opinions. Clinical experts on the disease under study are likely to have useful prior information or opinions about survival quantiles (say, the median). In two-arm cancer clinical trials, the determination of a clinically significant difference and the subsequent evaluation of the power of the trial, even for frequentist trial designs, are often based on the prior evaluation of the median for the control arm as well as the clinically significant effect of treatment on median survival time (Piantadosi, 2005). In Section 2, we propose a novel semiparametric model for the median survival time with interpretable covariate effects via a log-linear median regression function. This wide class of semiparametric models has many desirable properties, including model identifiability, closed-form expressions for all quantile functions, and nonmonotone hazards. Unlike previous methods for Bayesian survival analysis (e.g., Hanson and Johnson, 2002), our model accommodates the situation in which the location (median) as well as the scale and shape of the survival distribution are affected by the covariate. Unlike some of the previous frequentist methods for median regression, we do not require the restrictive assumption that all quantile functions below the median be linear.
In Section 3, we present the likelihood, suitable nonparametric prior processes, and Markov chain Monte Carlo (MCMC) tools to estimate the model parameters. In Section 4, we consider the SCLC trial to demonstrate how our models can facilitate the determination of prior distributions. For the SCLC study, we also compare the results of our approach to those of existing approaches. In Section 5, a simulation study investigates small-sample performance and robustness properties compared to competing methods for median regression. Some final remarks are in Section 6.
2. Semiparametric Models
Let T_{i} be the survival time of subject i= 1, …, n and let Z_{i}= (1, Z_{i1}, …, Z_{ip})′ be the corresponding vector of p time-constant covariates along with the intercept term. The transformation model (Cheng et al., 1995) assumes that
 h(T_{i}) = γ′Z_{i} + e_{i},  (1)
where h is a monotone transformation, γ= (γ_{0}, γ_{1}, …, γ_{p}) is a regression parameter, and e_{i} is an unspecified error variable with common density f_{e}(·) free of the covariate Z_{i}. Usually the density f_{e}(·) of e_{i} is assumed to be a member of some parametric family with location 0 and with shape and scale free of Z_{i}. Important special cases of (1) are the accelerated failure time (AFT) model when h= log, the proportional odds model when e_{i} comes from a logistic distribution, and Cox’s model (1972) when f_{e} is the extreme-value density.
The monotone power transformation g_{λ}(y) (Bickel and Doksum, 1981),
 g_{λ}(y) = sgn(y)|y|^{λ}/λ,  (2)
where sgn(y) = −1 for y < 0 and sgn(y) = 1 otherwise, is an extension of the Box–Cox power family (Box and Cox, 1964), a popular transformation to obtain a symmetric and unimodal density for the transformed random variable. We assume that for unknown λ > 0, the transformed survival time g_{λ}{log (T_{i})} is symmetric and unimodal with median g_{λ}(β′Z_{i}) = g_{λ}(M_{i}), that is,
 g_{λ}{log (T_{i})} = g_{λ}(β′Z_{i}) + ε_{i},  (3)
where the ε_{i} are i.i.d. from a unimodal and symmetric density f_{ε}(·) centered at 0, M_{i}=β′Z_{i}, and β is the vector of regression parameters. Carroll and Ruppert (1984) and Fitzmaurice, Lipsitz, and Parzen (2007), among others, proposed parametric versions of the transform-both-sides (TBS) regression model for an uncensored continuous response with the original Box–Cox transformation and an N(0, σ^{2}) density for the error density f_{ε}(·).
The transformation g_{λ}(y) in (2) is monotone in y, with derivative (with respect to y) equal to |y|^{λ−1}. The median of log (T_{i}) is M_{i}=β′Z_{i} because P[log (T_{i}) > M_{i}] = P[g_{λ}{log (T_{i})} > g_{λ}(M_{i})] = P(ε_{i} > 0) = 1 − F_{ε}(0) = 1/2, where F_{ε} is the cdf of ε. As a consequence, the survival time T_{i} has a log-linear median regression function Q_{0.5}(Z_{i}) = exp (M_{i}) = exp (β′Z_{i}) and survival function S(t | z) = 1 − F_{ε}{g_{λ}(log t) − g_{λ}(M)}. For the SCLC study with M_{i}=β_{0}+β_{1} z_{1}+β_{2} z_{2}, where z_{1} is a treatment indicator and z_{2} denotes age, this implies that the ratio of medians for two patients of the same age but different treatment arms is Q_{0.5}(z_{1}= 1, z_{2})/Q_{0.5}(z_{1}= 0, z_{2}) = exp (β_{1}). We get a similarly straightforward interpretation of exp (β_{2}) as the ratio of the medians for a unit increase in age. The following theorem shows that the parameter λ and the density f_{ε} of (3) are also identifiable, in the sense that for any survival time following (3), there is a unique (λ, f_{ε}) for which g_{λ}{log (T_{i})} has a symmetric unimodal distribution.
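The median and survival-function statements above can be sketched numerically. This is a minimal illustration, assuming the power transform takes the form sgn(y)|y|^λ/λ and a Gaussian F_{ε}; the function names and the coefficient values are hypothetical and are not estimates from the SCLC data.

```python
import math
from statistics import NormalDist


def g(y: float, lam: float) -> float:
    """Power transform sgn(y)|y|^lam / lam (assumed form of the transform in (2))."""
    return math.copysign(abs(y) ** lam, y) / lam


def g_inv(w: float, lam: float) -> float:
    """Inverse of g in its first argument, for lam > 0."""
    return math.copysign((lam * abs(w)) ** (1.0 / lam), w)


def survival(t: float, log_med: float, lam: float, sigma: float) -> float:
    """S(t | z) = 1 - F_eps(g(log t) - g(M)), with F_eps taken as N(0, sigma^2)."""
    eps = g(math.log(t), lam) - g(log_med, lam)
    return 1.0 - NormalDist(0.0, sigma).cdf(eps)


# Hypothetical coefficients: intercept, treatment effect, age effect.
beta = (math.log(13.0), 0.3, -0.01)


def log_median(z1: float, z2: float) -> float:
    """M = beta0 + beta1 * z1 + beta2 * z2, the median of log(T)."""
    return beta[0] + beta[1] * z1 + beta[2] * z2


if __name__ == "__main__":
    m0, m1 = log_median(0, 50), log_median(1, 50)
    print("S at the median:", survival(math.exp(m0), m0, lam=0.5, sigma=1.2))
    print("ratio of medians:", math.exp(m1) / math.exp(m0), "vs exp(beta1):", math.exp(beta[1]))
```

Whatever values of λ and σ are used, the survival function evaluated at exp(M) stays at one half, which is the sense in which β keeps its median interpretation.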
The proof of Theorem 1 is in the Appendix. Similar to the transformation model of (1), we can rewrite the TBS model of (3) as
 log (T_{i}) = M_{i} + e_{i} = β′Z_{i} + e_{i},  (4)
where the error e_{i} in (4) has the asymmetric density function f_{e}(u | Z_{i}) = f_{ε}{g_{λ}(M_{i}+u) − g_{λ}(M_{i})} |M_{i}+u|^{λ−1}, obtained by differentiating the cdf F_{ε}{g_{λ}(M_{i}+u) − g_{λ}(M_{i})} of e_{i} with respect to u. The shape and scale of this cdf depend on the covariates Z_{i}. The approximate variance of log T_{i} is σ^{2}|M_{i}|^{2(1−λ)}, where σ^{2} is the finite variance of f_{ε}. Thus, unlike the usual assumption of the transformation model of (1) and of Bayesian models such as that of Hanson and Johnson (2002), the median as well as the shape and scale of the error density f_{e}(· | Z_{i}) in (4) depend on the covariate Z_{i}. This makes our model useful for dealing with heteroscedasticity of log T. A parametric lognormal model with location M(Z) =β′Z for log (T) is a special case of (3) with λ= 1 and F_{ε} being N(0, σ^{2}). The hazard function of (3) can be nonmonotone; for example, a lognormal model has a nonmonotone hazard.
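The dependence of the error distribution on the covariates can be made explicit with a short derivation. The sketch below assumes that the transform has derivative g_λ′(y) = |y|^{λ−1} and that f_ε has finite variance σ²; the variance line is a first-order (delta-method) approximation, not an exact result.

```latex
\begin{align*}
P(e_i \le u \mid Z_i)
  &= P\{\log T_i \le M_i + u\}
   = F_{\varepsilon}\{g_{\lambda}(M_i+u) - g_{\lambda}(M_i)\},\\
f_e(u \mid Z_i)
  &= f_{\varepsilon}\{g_{\lambda}(M_i+u) - g_{\lambda}(M_i)\}\,\lvert M_i+u\rvert^{\lambda-1},\\
\log T_i
  &\approx M_i + \frac{\varepsilon_i}{g_{\lambda}'(M_i)}
   = M_i + \lvert M_i\rvert^{1-\lambda}\,\varepsilon_i,\\
\operatorname{Var}(\log T_i \mid Z_i)
  &\approx \sigma^{2}\,\lvert M_i\rvert^{2(1-\lambda)} .
\end{align*}
```

For λ = 1 the factor |M_i|^{2(1−λ)} equals 1 and the variance no longer depends on Z_i, which recovers the homoscedastic lognormal special case.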
The expression in (5) for the TBS model also implies that, for two patients with covariate vectors Z_{a} and Z_{b}, Q_{α}(Z_{a}) ≥ Q_{α}(Z_{b}) if and only if Q_{α′}(Z_{a}) ≥ Q_{α′}(Z_{b}), for all α, α′∈ (0, 1). This means that under the model in (3), an ordering between two patients’ median survival times implies a uniform ordering between their corresponding survival functions over the entire time axis. This property is similar to that of Cox’s model, where the ordering between two hazards (as well as survival functions) remains the same over the entire time axis.
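The ordering property follows from the closed form of the quantile function under (3). The derivation below is a sketch, assuming the convention P{T ≤ Q_α(Z)} = α and a strictly increasing F_ε; it may differ in notation from the article's own expression.

```latex
\begin{align*}
\alpha = P\{T \le Q_{\alpha}(Z)\}
  &= F_{\varepsilon}\bigl[g_{\lambda}\{\log Q_{\alpha}(Z)\} - g_{\lambda}(M)\bigr],
  \qquad M = \beta' Z,\\
Q_{\alpha}(Z)
  &= \exp\bigl[g_{\lambda}^{-1}\{g_{\lambda}(M) + F_{\varepsilon}^{-1}(\alpha)\}\bigr].
\end{align*}
```

Because g_λ and its inverse are strictly increasing, Q_α(Z) is strictly increasing in M for every α, so ordering two patients by their medians orders all of their quantiles in the same direction. Setting α = 1/2 gives F_ε^{-1}(1/2) = 0 and recovers Q_{0.5}(Z) = exp(M).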
3. Likelihood, Prior Process and Inference
Let T_{i} and C_{i} be the survival and censoring times, respectively, for i= 1, …, n. We observe (t_{i0}, δ_{i}), where t_{i0}=T_{i}∧C_{i} is the observed follow-up time and δ_{i} is the censoring indicator, with δ_{i}= 1 for T_{i}=t_{i0} and 0 otherwise. It is assumed that T_{i} and the random censoring time C_{i} are conditionally independent given the covariate Z_{i}. Given the observed data vector y_{0}= (t_{0}, δ*) with t_{0}= (t_{10}, …, t_{n0}) and δ*= (δ_{1}, …, δ_{n}), the likelihood function under our TBS model of (3) is as follows:
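 L(β, λ, F_{ε} | y_{0}) = ∏_{i=1}^{n} {f_{ε}(ω_{i}) |y_{i}|^{λ−1}}^{δ_{i}} {1 − F_{ε}(ω_{i})}^{1−δ_{i}},  (6)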
where ω_{i}=g_{λ}(y_{i}) −g_{λ}(β′Z_{i}) with y_{i}= log (t_{i0}), and F_{ε} is the cdf of the unimodal symmetric density f_{ε}, with dF_{ε}(u) =f_{ε}(u) du.
In general, for the parametric versions of the TBS model, any unimodal symmetric distribution, such as the Gaussian or logistic, can be used for F_{ε}. For example, f_{ε}(w) and F_{ε}(w) are replaced by the density φ_{σ}(w) and cdf Φ_{σ}(w) of N(0, σ^{2}), respectively, for the Gaussian TBS model likelihood in (6). The corresponding posterior is p(τ, σ | y_{0}) ∝L(τ, σ | y_{0})π (τ, σ), where π (τ, σ) is the joint prior density based on the available prior information, with τ= (β, λ). MCMC samples from this joint posterior can be used to implement a parametric Bayesian analysis. Under this parametric model, the maximum likelihood estimator (MLE) of the regression parameters β can be obtained by maximizing the log-likelihood ℓ (τ, σ | y_{0}) = log L(τ, σ | y_{0}). For example, the log-likelihood function of the (Gaussian) parametric TBS model is
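 ℓ (τ, σ | y_{0}) = Σ_{i=1}^{n} [δ_{i}{log φ_{σ}(ω_{i}) + (λ− 1) log |y_{i}|} + (1 −δ_{i}) log Φ̄_{σ}(ω_{i})],  (7)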
where Φ̄_{σ}(·) = 1 −Φ_{σ}(·) is the survival function of N(0, σ^{2}). The MLE of the parameters under the parametric TBS model is obtained by maximizing the corresponding log-likelihood function ℓ (τ, σ | y_{0}) using Newton–Raphson (NR) iterations. Under mild regularity conditions, the MLE of β (as well as the parametric Bayes estimator) is consistent and asymptotically efficient, based on regular large-sample theory for the MLE, when the modeling assumption is correct.
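A censored Gaussian TBS log-likelihood of this kind can be sketched in a few lines. The code below is an illustration only, assuming the transform sgn(y)|y|^λ/λ and working on the log-time scale (so any additive constant not involving the parameters is dropped); the function name `tbs_gaussian_loglik` and its argument layout are hypothetical.

```python
import math
from statistics import NormalDist


def g(y: float, lam: float) -> float:
    """Power transform sgn(y)|y|^lam / lam (assumed form of the transform in (2))."""
    return math.copysign(abs(y) ** lam, y) / lam


def tbs_gaussian_loglik(beta, lam, sigma, times, deltas, covariates):
    """Censored log-likelihood of the Gaussian TBS model on the log-time scale.

    times      : observed follow-up times t_i0 (must not equal 1, since log|y| is needed)
    deltas     : 1 if the event was observed, 0 if censored
    covariates : one covariate vector per subject, including the intercept
    """
    normal = NormalDist(0.0, sigma)
    total = 0.0
    for t, delta, z in zip(times, deltas, covariates):
        y = math.log(t)
        m = sum(b * zj for b, zj in zip(beta, z))  # M_i = beta' Z_i
        omega = g(y, lam) - g(m, lam)
        if delta == 1:
            # Density term plus the Jacobian |y|^(lam - 1) of the transform.
            total += math.log(normal.pdf(omega)) + (lam - 1.0) * math.log(abs(y))
        else:
            # Censored subjects contribute the survival probability.
            total += math.log(1.0 - normal.cdf(omega))
    return total
```

Such a function could be handed to any general-purpose optimizer in place of hand-coded Newton–Raphson steps, or evaluated inside an MCMC sampler after adding the log prior.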
Any parametric assumption about F_{ε} in (3) may be too restrictive for some data sets in practice. In the semiparametric version of (3), the unimodal symmetric density of ε is assumed unknown. For semiparametric maximum likelihood estimation (SPMLE) under this model, the likelihood of (6) is maximized subject to the restriction that F_{ε} is the cdf of a unimodal distribution symmetric around 0. The regularity conditions and asymptotic issues for the SPMLE under (6) are nontrivial and beyond the scope of this article. For semiparametric Bayesian analysis, we need the posterior
 p(τ, F_{ε} | y_{0}) ∝L(τ, F_{ε} | y_{0}) π_{12}(τ) π_{3}(F_{ε}),  (8)
where π_{12} and π_{3} are independent priors of τ= (β, λ) and F_{ε}. This uses the simplifying but reasonable assumption that the prior opinions about the parametric vector τ and the nonparametric function F_{ε} can be specified independently. We will discuss the practical justification of this assumption later.
Using the following result of Feller (1971, p. 158), we introduce a class of nonparametric priors π_{3} defined over the space of symmetric unimodal distribution functions F_{ε} in (3). Any symmetric unimodal distribution F_{ε} can be expressed as a scale mixture of uniform random variables,
 f_{ε}(u) = ∫_{0}^{∞} ζ (u | θ) dG(θ),  (9)
for some mixing distribution G(θ), where ζ (u | θ) for θ > 0 is the density of the uniform distribution with support (−θ, +θ). We use the Dirichlet process (DP) of Ferguson (1973), G∼DP(G_{0}, ν), as a nonparametric prior for the unknown scale-mixing distribution G(θ) of (9). The DP(G_{0}, ν) is characterized by the known “prior guess” G_{0} (the prior expectation of G) and a positive scalar parameter ν, the precision parameter around the prior mean/guess G_{0}. The prior mean G_{0} of the random mixing distribution G can be chosen appropriately to assure a desired prior mean/guess F_{*} for the unknown F_{ε}. Using a result by Khintchine (1938), when the density f_{*}(·) and its derivative exist, the density of G(θ) is given as
 dG(θ)/dθ = −2θ f_{*}′(θ), θ > 0.  (10)
For example, to obtain an approximate double exponential (Dexpo(γ)) prior mean density for the regression error density f_{ε}, using (10), we need to choose G_{0}(θ | γ) as Gamma (2, γ) with density γ^{2}θ exp (−γθ). The precision parameter ν also determines the degree of belief about how close F_{ε} should be to its prior guess F_{*}. When ν is large enough, the unknown nonparametric F_{ε} is very close to its prespecified (often parametric) prior mean/guess F_{*}(· | γ). A small ν implies very little confidence in the unknown F_{ε} being close to F_{*}(· | γ), and the corresponding Bayes estimator of β is expected to be very close to the semiparametric likelihood estimator. The details of the specifications of the hyperparameters of the priors π_{12} and π_{3} in (8) are provided in the next section.
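The double exponential example can be checked numerically. The sketch below assumes the Khintchine relation in the form "mixing density = −2θ f′(θ)" and a double exponential density (γ/2)exp(−γ|u|) with rate γ; the function names and the quadrature settings are illustrative choices, not part of the article.

```python
import math


def mixing_density(f_prime, theta: float) -> float:
    """Mixing density implied by a symmetric unimodal f: -2 * theta * f'(theta), theta > 0."""
    return -2.0 * theta * f_prime(theta)


def laplace_derivative(gamma: float):
    """Derivative, for u > 0, of the double exponential density (gamma/2) exp(-gamma |u|)."""
    return lambda u: -0.5 * gamma ** 2 * math.exp(-gamma * u)


def gamma2_density(theta: float, gamma: float) -> float:
    """Gamma(2, gamma) density with rate gamma: gamma^2 * theta * exp(-gamma * theta)."""
    return gamma ** 2 * theta * math.exp(-gamma * theta)


def mixture_density(u: float, g_density, upper: float = 60.0, steps: int = 20_000) -> float:
    """Scale mixture of uniforms: integral over theta > |u| of g(theta) / (2 theta).

    Uses a midpoint rule truncated at `upper`, adequate when g has a light tail.
    """
    a = abs(u)
    h = (upper - a) / steps
    total = 0.0
    for k in range(steps):
        theta = a + (k + 0.5) * h
        total += g_density(theta) / (2.0 * theta)
    return total * h


if __name__ == "__main__":
    gam = 1.5
    print(mixing_density(laplace_derivative(gam), 0.8), gamma2_density(0.8, gam))
    print(mixture_density(0.7, lambda th: gamma2_density(th, gam)),
          0.5 * gam * math.exp(-gam * 0.7))
```

The two printed pairs should agree closely: the first confirms that the double exponential density maps to a Gamma(2, γ) mixing density, and the second that mixing uniforms over that Gamma law recovers the double exponential density.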
4. Data Analysis
Now we present a parametric Bayesian analysis using the TBS model of (3) with a parametric N(0, σ^{2}) density for F_{ε}. One major advantage of the TBS model for Bayesian analysis is that the priors for the parameters (β_{0}, β_{1}, β_{2}, λ, σ) can be determined based on prior opinions about some key quantities related to the prior-predictive survival time T* of a patient with known covariate values. Without loss of generality, we assume that the priors are based on the following: (1) a prior guess and prior range of a quantile, say, the median, of the prior-predictive survival time T* of a patient at age 50 from treatment arm B (z_{1}= 0); (2) the change in the median of T* for a unit change in each of age (z_{2}) and treatment (z_{1}). We point out that for most Phase 2 and 3 trials, these quantities are routinely elicited and used to design the trial and determine the power for detecting differences (e.g., Piantadosi, 1997). We first demonstrate the specification of these priors for the parametric TBS models.
We use the simplifying assumption that the joint prior is π (β, λ, σ) =π_{1}(β)π_{2}(λ)π_{3}(σ | β, λ). This assumption can be justified in practice because the prior π_{1}(β) is based on the median (location) of T*, whereas the prior π_{2}(λ) is based on the shape (skewness) of log (T*). The specification of the prior for β_{0} uses the fact that T* for the reference patient described above has prior median exp (β_{0}). For the lung cancer trial conducted before 1993, the current expert opinions about SCLC are not very appropriate. Based on the published literature about the treatment of SCLC before this article (e.g., Jett, Everson, and Therneau, 1990; Evans et al., 1987; Comis, 1986), the median survival time for treatment arm B was thought to be between 12 and 17 months for limited-stage and between 9 and 10 months for extensive-stage SCLC patients. For our SCLC study with nearly equal proportions of these two types of patients, we use a mean prior guess of 13 months and a range of (8, 18) months for T*. These give us the prior for β_{0} with A_{1}= log (13) and B_{1}={log (18) − log (8)}/3, chosen so that the prior range of β_{0} has approximate length 3B_{1}. Our prior opinion about β_{1} is based on the prior belief about the ratio of medians of two patients with identical age but from different treatment arms. So, the prior β_{1}∼N(0, 10) corresponds to a 95% prior range for the ratio of medians that is centered at e^{0}= 1 (indifferent opinion regarding superiority of either treatment arm). Similarly, the prior β_{2}∼N(0, 10) corresponds to the prior opinion that two patients from treatment B with a 1-year difference in age have a ratio of medians within one prior standard deviation of 1 on the log scale with 68% probability. We have chosen such noninformative prior opinions about β_{1} and β_{2} to allow for a meaningful comparison of our analysis results with results from frequentist and previous Bayes methods based on either no prior or a noninformative prior.
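The elicitation arithmetic above can be reproduced directly. This is a minimal sketch, assuming that the second argument of N(0, 10) is a variance (consistent with the later reference to prior variances); the helper name `median_ratio_interval` is hypothetical.

```python
import math
from statistics import NormalDist

# Prior guess and range (in months) for the median survival of the reference patient.
guess, low, high = 13.0, 8.0, 18.0
A1 = math.log(guess)                           # prior location for beta0
B1 = (math.log(high) - math.log(low)) / 3.0    # prior scale for beta0, as defined in the text


def median_ratio_interval(prior_var: float, prob: float):
    """Central prior interval for the ratio of medians exp(beta) when beta ~ N(0, prior_var)."""
    z = NormalDist().inv_cdf(0.5 + prob / 2.0)
    half_width = z * math.sqrt(prior_var)
    return math.exp(-half_width), math.exp(half_width)


if __name__ == "__main__":
    print("A1 =", A1, "B1 =", B1)
    print("95% interval for exp(beta1):", median_ratio_interval(10.0, 0.95))
    print("68% interval for exp(beta2):", median_ratio_interval(10.0, 0.68))
```

The intervals are symmetric around 1 on the log scale, and their width makes clear how diffuse a variance of 10 is for a log ratio of medians.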
We would like to point out that our pointwise Bayes estimates do not change substantially (<4% change) when we reduce the prior variances of β_{1} and β_{2} to 1 (instead of 10). The interval estimate of β_{1} (as an example) is around 12% narrower when we use these more skeptical N(0, 1) priors instead of N(0, 10) priors for β_{1} and β_{2}.
We use the Unif(0, 3) prior for π_{2}(λ) because it is difficult to interpret the after-transform linear model of (3) when λ > 3. In their original paper, Box and Cox (1964) recommended restricting to λ≤ 2. For a parametric Gaussian TBS model, log T* can be expressed approximately as log T*≃β_{0}+σβ_{0}^{1−λ} e (Kettl, 1991), where β_{0} is the median of log T* and e∼N(0, 1). This allows us to obtain the prior π_{3}(σ | β_{0}, λ) based on prior opinion about a second quantile q_{α*} of log T*, because q_{α*}≃β_{0}+σβ_{0}^{1−λ} z_{α*}, where q_{α*} is the quantile of log T* for some α*≠ 1/2 and z_{α*} is the α*-percentile of the standard normal. For example, when we take α*= 0.75, we have z_{0.75}= 0.6745. Based on the SCLC literature prior to this trial, we use the prior opinion that the third quartile of survival time for a patient in the treatment arm with an entry age of 50 years is between 10 months and 5 years, with a center of 33 months. For given (β_{0}, λ), we use a Gamma density as the prior π_{3}(σ | β_{0}, λ), with mean equal to {log (33) −β_{0}}β_{0}^{λ−1}/0.6745 and approximate range between 0 and {log (60) − log (10)}β_{0}^{λ−1}/0.6745. These prior densities give us approximately the same means and ranges of the median and the third quartile that we expect from our prior opinion about these two quantiles of log (T*). However, to simplify this further, we use an unconditional Gamma prior π_{3}(σ) whose mean and variance are fixed (based on the prior mean log (13) for β_{0} and the prior guess 1 for λ). We found no noticeable difference in posterior estimates using this unconditional prior for σ instead of a conditional prior π_{3}(σ | β_{0}, λ). We remind the reader that the priors used in our analysis serve solely to demonstrate one way of developing a set of priors for the Bayesian analysis of the lung cancer study. An expert’s prior opinions on the median survival time of SCLC can be very different from what we used, and that may lead to a different prior specification of the parameters.
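The step from a prior opinion about a second quantile to a prior mean for σ can be sketched as follows. The code assumes the approximation log T* ≃ β_0 + σβ_0^{1−λ}e with e standard normal, solved for σ at the α-quantile; the function name `sigma_from_quantile` is hypothetical.

```python
import math
from statistics import NormalDist


def sigma_from_quantile(q_months: float, beta0: float, lam: float, alpha: float = 0.75) -> float:
    """Solve log(q) = beta0 + sigma * beta0^(1 - lam) * z_alpha for sigma.

    q_months : prior guess for the alpha-quantile of survival time
    beta0    : median of log T* for the reference patient (must be positive here)
    """
    z_alpha = NormalDist().inv_cdf(alpha)  # about 0.6745 when alpha = 0.75
    return (math.log(q_months) - beta0) * beta0 ** (lam - 1.0) / z_alpha


if __name__ == "__main__":
    beta0 = math.log(13.0)  # prior mean of beta0 used in the text
    for lam in (0.5, 1.0, 2.0):
        print(lam, sigma_from_quantile(33.0, beta0, lam))
```

Evaluating the function over a grid of (β_0, λ) values shows how strongly the implied σ depends on λ, which is one way to judge whether replacing the conditional prior with an unconditional one is a reasonable simplification.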
Our plot (left-hand panel of Figure 1) of residuals versus the patient’s age at entry, where each residual is the observed y_{i}= log (T_{i}) (subject to censoring) minus the posterior predictive expectation of log (T_{i}) under the model, does not show any trend of residuals under the parametric Bayes TBS model. Our plot (right-hand panel of Figure 1) of these residuals versus the estimated median survival times also does not reveal any serious inadequacy of the parametric TBS model. However, the Q–Q plot (Figure 2) of these residuals suggests that the assumption of a Gaussian distribution for F_{ε} in (3) is questionable because the plot is nonlinear in the right tail. Later, we use a semiparametric Bayesian analysis to avoid the Gaussian assumption for ε_{i}. Our posterior means (Bayes estimates) of the three quartiles Q_{α}(z_{1}, z_{2}) for α= 0.25, 0.50, 0.75 of treatment A (z_{1}= 1) are higher than the corresponding estimated quantiles of treatment B (z_{1}= 0) at any age z_{2}.
For the semiparametric Bayesian analysis with a symmetric unimodal f_{ε} in (3), we need to specify the prior guess/mean F_{*} of F_{ε} and a prior precision parameter ν. We take the precision parameter ν= 1 to imply very low confidence in our parametric prior guess F_{*} of the nonparametric error distribution F_{ε}. We take the prior mean f_{*} of f_{ε} to be N{0, (σ_{0})^{2}}, with σ_{0} chosen so that f_{*} equals the prior mean of f_{ε} used for the parametric Bayes analysis of the TBS model. Using (10), this N{0, (σ_{0})^{2}} density for f_{*} corresponds to a Gamma (3/2, 1/{2(σ_{0})^{2}}) for G_{0} in (10). The constructive definition of the DP mixture prior process for F_{ε} is f_{ε}(u) = Σ_{k=1}^{∞} p_{k} ζ (u | θ_{k}) (Sethuraman, 1994), where p_{k}=V_{k} ∏_{j<k} (1 −V_{j}), with V_{k}∼ Beta(1, ν) and θ_{k}∼G_{0} independently. The actual implementation of the MCMC tool to sample from (8) is based on a finite approximation of Sethuraman’s construction with, say, K= 1000 and V_{K}= 1. The MCMC computational tool can be implemented even via a standard package such as WinBUGS. The rest of the conditional posteriors are the same as those used for the parametric Bayes analysis.
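The finite approximation of Sethuraman's construction can be sketched as a prior draw of the error distribution. This illustration assumes the standard stick-breaking form with V_k ∼ Beta(1, ν) and V_K = 1, and it reads the Gamma(3/2, 1/{2σ_0²}) base measure as the law of θ² (which is what the Khintchine relation gives for a normal f_*); both the function names and that reading are assumptions, and the code draws from the prior only, not from the posterior.

```python
import math
import random


def truncated_stick_breaking(nu: float, K: int, rng: random.Random) -> list:
    """Weights p_k = V_k * prod_{j<k} (1 - V_j), with V_k ~ Beta(1, nu) and V_K = 1."""
    weights, remaining = [], 1.0
    for k in range(K):
        v = 1.0 if k == K - 1 else rng.betavariate(1.0, nu)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


def draw_atoms(sigma0: float, K: int, rng: random.Random) -> list:
    """Atoms theta_k with theta_k^2 ~ Gamma(shape 3/2, rate 1 / (2 sigma0^2)).

    random.gammavariate takes (shape, scale), so the scale is 2 * sigma0^2.
    """
    return [math.sqrt(rng.gammavariate(1.5, 2.0 * sigma0 ** 2)) for _ in range(K)]


def draw_error(weights: list, thetas: list, rng: random.Random) -> float:
    """One error draw: pick theta_k with probability p_k, then sample Unif(-theta_k, theta_k)."""
    theta = rng.choices(thetas, weights=weights, k=1)[0]
    return rng.uniform(-theta, theta)


if __name__ == "__main__":
    rng = random.Random(2024)
    K, nu, sigma0 = 1000, 1.0, 1.4  # sigma0 is a hypothetical value
    p = truncated_stick_breaking(nu, K, rng)
    th = draw_atoms(sigma0, K, rng)
    print("largest weight:", max(p), "one draw:", draw_error(p, th, rng))
```

With ν = 1 a handful of sticks typically carry most of the mass, so a single realization of F_ε can look quite different from the normal prior guess, which matches the low prior confidence described above.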
Table 1. Point estimates and 95% interval estimates (in parentheses) of regression parameters (β_{1} for treatment z_{1} and β_{2} for age z_{2}) for the lung cancer study under different procedures.
Estimator  Treatment  Age 

MLE (TBS model)  0.433 (0.141, 0.727)  −0.019 (−0.037, −0.002) 
Parametric Bayes (TBS)  0.318 (0.036, 0.604)  −0.008 (−0.023, 0.008) 
Semiparametric Bayes (TBS)  0.304 (0.083, 0.577)  −0.009 (−0.021, −0.002) 
Portnoy  0.369 (0.149, 0.591)  −0.009 (−0.031, 0.012) 
KG Bayes  0.389 (0.037, 0.845)  −0.018 (−0.028, −0.007) 
The point estimates of the regression parameters of the median functional under the different methods are not strikingly different from the corresponding point estimates obtained via Portnoy’s method (2003). This is also evident from Figure 3, where three estimated quantile functions for Portnoy’s method (dotted straight lines) and for the semiparametric Bayes TBS model (solid curved lines) are plotted (separately for the two treatment arms).
We find that the proportion of observations in each quantile interval is closer to the expected proportion for the Bayes estimates of the quantile functions than for Portnoy’s. However, the ML and Bayes methods yield smaller estimated standard errors and substantially narrower interval estimates than those obtained using Portnoy’s method. For this data example, the estimates based on TBS models have smaller estimated standard errors for the treatment effect compared to competing procedures. The widths of the interval estimates from parametric and semiparametric Bayes are substantially smaller than the widths of the corresponding estimates based on Portnoy’s method (at least for the age effect). This is not surprising because Portnoy’s median regression methods have a far larger number of regression parameters than the finite-dimensional regression parameter β in (3). The posterior standard deviations of the TBS estimators are also smaller than those from Kottas and Gelfand (2001). Figure 4 plots the logarithm of the ratio of the Conditional Predictive Ordinate (CPO) of the semiparametric TBS model to the CPO of the Gaussian TBS model against the observation numbers. A value greater than 0 for this log ratio supports the semiparametric model over the Gaussian model. In this example, approximately 67% of observations favor the semiparametric TBS model over the Gaussian TBS model, that is, a substantially higher proportion of observations support the semiparametric model than the parametric model. The final conclusion is that the semiparametric model fits the data better than the competing parametric models over the entire range of age and for both treatments.
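A CPO comparison of this kind can be computed from MCMC output. The sketch below uses one common Monte Carlo estimate of the CPO, the harmonic mean of the per-observation likelihood contributions over posterior draws; whether the article used exactly this estimator is an assumption, and the function names are hypothetical.

```python
import math


def log_cpo(loglik_draws: list) -> float:
    """log CPO_i from posterior draws of log f(y_i | theta^(s)).

    CPO_i is estimated as the harmonic mean of f(y_i | theta^(s)); the
    log-sum-exp form keeps the computation numerically stable.
    For a censored observation, f is the survival probability at t_i0.
    """
    n_draws = len(loglik_draws)
    neg = [-value for value in loglik_draws]
    shift = max(neg)
    log_mean_inverse = shift + math.log(sum(math.exp(x - shift) for x in neg) / n_draws)
    return -log_mean_inverse


def share_favoring_first(loglik_model_a: list, loglik_model_b: list):
    """Log CPO ratios (model A over model B) and the share of observations with ratio > 0."""
    diffs = [log_cpo(a) - log_cpo(b) for a, b in zip(loglik_model_a, loglik_model_b)]
    share = sum(1 for d in diffs if d > 0) / len(diffs)
    return share, diffs
```

Each inner list holds the log-likelihood contributions of one observation across posterior draws, so plotting `diffs` against the observation index gives a display of the same type as the one described for Figure 4.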
5. Simulation Study
For our simulation models, we set the median of Y= log (T) given Z to be M(Z) =β_{0}+β_{1} Z= 6.5 +Z, i.e., β_{0}= 6.5 and β_{1}= 1.0, where Z can take four possible values 0, 0.5, 1.0, and 1.5, in equal proportions for each simulated data set. For each simulation distribution of T considered in the article, we simulate at least 5000 datasets with sample sizes n= 80, 160, and 320. The number of simulated datasets for different sample sizes may vary to assure that the Monte Carlo variability of the approximate bias and MSE of the regression estimates are smaller than 0.01.
For the simulation study, the Bayes estimators we consider are based only on the semiparametric model of (3). The priors used for Bayes estimation in the simulation study are β_{0}∼N(6, 10) and β_{1}∼N(0, 1). The prior mean for the DP is N(0, 1) and the precision is ν= 0.01. This prior for β_{1} implies that there is almost 5% prior probability that the ratio of medians for a unit change in z lies outside the interval (1/7.4, 7.4). To avoid undue influence of the choice of the priors on the results of our simulation study, we use these somewhat vague priors here. However, the prior can also be viewed as a skeptical prior because the prior of the regression parameters is centered at the prior guess of no covariate effect (β= 0). If a Bayes estimator can demonstrate good performance for detecting covariate effects with this prior, this suggests that even a skeptical and unusually “flat” prior may not hinder the Bayes method’s ability to detect the covariate effect. The implications of the chosen priors for multiple model parameters are best described via various summaries of the prior predictions of the observables/responses. We generate various summary statistics, including the sample median, range, and width of the range of 500 survival times, using a single set of parameters simulated from the joint prior. We then replicate the whole process of simulating these summary statistics 1000 times. We found that these 1000 sample medians range between 1400 and 5200 for z= 1.5, compared to the true median of ≃2981 for the simulation model. The range of survival times from the prior predictive models may have a width as large as 10^{8}. These summaries indicate that our prior predictive models are very noninformative and can cover a wide range of survival patterns. In practice, we expect to use a more informative prior predictive model using often available information about the range of responses (even after incorporating a skeptical prior view about the covariate effect).
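The tail probability of the β_1 prior and the true median at z = 1.5 can be verified with a few lines. This is a small check, assuming β_1 ∼ N(0, 1) with unit standard deviation and the simulation median exp(6.5 + z); the function name `tail_probabilities` is hypothetical.

```python
import math
from statistics import NormalDist


def tail_probabilities(ratio_bound: float, prior_sd: float = 1.0):
    """Prior probability that exp(beta1) exceeds ratio_bound, and that it falls
    outside (1 / ratio_bound, ratio_bound), when beta1 ~ N(0, prior_sd^2)."""
    z = math.log(ratio_bound) / prior_sd
    upper = 1.0 - NormalDist().cdf(z)
    return upper, 2.0 * upper


# True median of T in the simulation model at z = 1.5: exp(beta0 + beta1 * z).
true_median = math.exp(6.5 + 1.0 * 1.5)

if __name__ == "__main__":
    print(tail_probabilities(7.4))
    print(true_median)
```

The bound 7.4 is close to e², so it sits about two prior standard deviations from the center of the prior on the log scale.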
First we evaluate the robustness of the MLEs and of the Bayes estimates based on (3). We compare the performance (bias and MSE) of these estimators to that of the competing frequentist estimator of Portnoy (2003). For this aim, we simulate survival data from parametric exponential and Pareto densities. Both the exponential and Pareto simulation densities, being heteroscedastic and skewed for all λ, do not satisfy the assumptions of (3). The independent censoring times were generated from an exponential density (Λe^{−ΛC}) with rate parameter Λ chosen to obtain the desired proportions of censoring. For example, the choice of Λ= log (2)/30 results in approximately 20% censoring for the exponential simulation model.
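One replicate of the exponential simulation design can be sketched as follows. The code assumes that the survival time is exponential with median exp(β_0 + β_1 Z), so its rate is log(2) divided by that median, and that censoring is independent exponential; the function name `simulate_dataset` and the censoring rate in the usage line are hypothetical and not tuned to any particular censoring proportion.

```python
import math
import random


def simulate_dataset(n: int, censor_rate: float, rng: random.Random,
                     beta0: float = 6.5, beta1: float = 1.0) -> list:
    """Return a list of (observed time, event indicator, z) triples.

    Z cycles through 0, 0.5, 1.0, 1.5 so the four values occur in equal
    proportions whenever n is a multiple of 4.
    """
    z_values = [0.0, 0.5, 1.0, 1.5]
    data = []
    for i in range(n):
        z = z_values[i % 4]
        median = math.exp(beta0 + beta1 * z)
        t = rng.expovariate(math.log(2.0) / median)  # exponential with the target median
        c = rng.expovariate(censor_rate)             # independent exponential censoring
        data.append((min(t, c), 1 if t <= c else 0, z))
    return data


if __name__ == "__main__":
    rng = random.Random(7)
    sample = simulate_dataset(80, censor_rate=1e-4, rng=rng)
    censored = sum(1 for _, delta, _ in sample if delta == 0)
    print("censored fraction:", censored / len(sample))
```

Repeating this over many seeds and applying each estimator to every replicate yields the Monte Carlo approximations of the sampling mean and MSE reported in the simulation tables.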
Table 2 presents a summary of the approximate sampling mean and mean squared error (MSE) of the competing estimators of β_{1} under the different simulation models. The results in Table 2 under the exponential and Pareto simulation models show that the MLE based on (3) and the Bayes estimators based on (3) have biases comparable to those of the competing estimators. Further, the MSEs of Portnoy’s estimators are much larger than the corresponding MSEs of the MLE and Bayes estimators. The Bayes estimators under (3) have much smaller MSE than the MLE.
Table 2. Results of simulation study under Exponential and Pareto models: Monte Carlo approximation of the sampling mean and mean squared error (MSE) of different estimators of known β_{1}= 1.
Simulation model  Sample size  Gaussian TBS MLE mean  Gaussian TBS MLE MSE  Portnoy mean  Portnoy MSE  SP TBS mean  SP TBS MSE 

Exponential  80  0.91  2.66  0.93  4.27  0.92  0.90 
Exponential  160  0.97  1.35  1.11  2.28  1.08  0.65 
Exponential  320  0.94  0.69  0.96  1.20  0.93  0.48 
Pareto  80  1.03  12.01  1.10  19.89  1.03  0.95 
Pareto  160  0.93  5.41  0.91  8.60  1.01  0.85 
Pareto  320  0.92  2.68  0.98  4.25  1.02  0.68 
TBS (Double Exponential)  80  0.99  1.94  1.01  1.52  1.04  0.72 
TBS (Double Exponential)  160  0.96  0.97  0.98  1.69  0.97  0.48 
TBS (Double Exponential)  320  0.97  0.51  0.98  1.35  1.03  0.30 
For the Pareto simulation model, g_{λ}(Y) has an extremely skewed and heavy-tailed density for all values of λ. In smaller samples (n≤ 160), Portnoy’s estimator has the most bias. For the largest sample size (n= 320), the bias of the Gaussian MLE is highest. The semiparametric Bayes estimators have the smallest bias for all sample sizes and also have much smaller MSE than the competing estimators.
For the last part of Table 2, we investigate the performance of the semiparametric Bayes estimator using data simulated from a TBS model of (3) with λ= 0.5 and a double-exponential density for ε. We see that the Bayes estimators show a substantial improvement in MSE compared to the competing estimators. The bias of the MLE under the Gaussian TBS model is similar for n= 160 and 320.
In summary, when the distribution of log (T) after an optimal transformation has a moderate degree of asymmetry, the MLE and Bayes estimators based on (3) have finite-sample biases very similar to that of Portnoy’s (2003) estimator. More importantly, the precision of the Bayes estimators based on TBS is better even when the underlying assumptions of (3) are not entirely valid. However, the MLE’s performance depends on the degree of symmetry of the distribution of g_{λ}(Y) under the optimal λ. The semiparametric Bayes estimators have small biases and the smallest MSE among all of their competitors. When the modeling assumption of (3) is correct, the Bayes estimator based on (3) shows much smaller MSE than any competing estimator. This implies that the semiparametric Bayes estimator based on (3) is a safer and more robust estimator to use in practice than its competitors.
6. Discussion
In this article, we present a new class of semiparametric models amenable to Bayes estimation of the log-linear median regression function for censored survival data. Similar to previous semiparametric models (e.g., Cox’s model), our model has a finite-dimensional parameter vector and one nonparametric symmetric unimodal function f_{ε}. We argue that our assumption of unimodality of f_{ε} justifies the importance of the median as the location parameter of interest. Previous research, including Box and Cox (1964), has found that the transformation in (2) is often an effective tool to obtain symmetry and accommodate heteroscedasticity. Our method can be applied when the covariate Z affects the location as well as the scale and shape of log (T).
Median regression offers a useful alternative to the popular regression functions of Cox (1972) and the transformation model of (1). There is a substantial literature on median regression for censored survival data, including Ying et al. (1995), Yang (1999), McKeague, Subramanian, and Sun, (2001) and Bang and Tsiatis (2003). These methods involve nonlinear discontinuous estimating equations that are difficult to solve, often with multiple solutions. The recursive nature of some of these methods (e.g., that of Portnoy, 2003) make the asymptotic justifications and computations complicated. Peng and Huang’s (2008) martingale based estimating equations involve minimization of an L_{1}type discontinuous convex functions. Unlike estimation with Cox’s model (Cox, 1972), martingale based methods may not be the most efficient for estimating regression parameters of the median survival time. For most of these methods, every quantile functional is assumed to be linear in Z, that is for all α∈ (0, 1), where P{T > Q_{α}(Z)}=α. Unlike our model of (3), these frequentist linear quantile models have an infinite number of regression parameters β_{α} for all α∈ (0, 1). As a consequence, unlike the model of (3), there is no simple expression available for survival functions for these models. For more indepth discussion about the implementation, comparisons, asymptotic rate of convergence and consequences of the restrictive assumptions for existing quantile regression approaches, we suggest the excellent review by Koenker (2008). This restrictive assumption of linearity of all quantile functions may not hold true for any real study and very few known stochastic models can satisfy this, except when with e∼N(0, σ^{2}), and M_{i}=βZ (Kettl, 1991). As an alternative to the semiparametric model of (3), we can also consider g_{λ}{log (T_{i})}=g_{λ}(M_{i}) +M_{i}^{γ}η_{i} with a symmetric unimodal density for η_{i}. 
However, both of these models are less parsimonious than (3) because they use separate parameters to address skewness and heteroscedasticity. Our preliminary simulation studies (omitted for brevity) also cast doubt on the practical advantages of these alternatives to (3).
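As a hedged illustration of the Gaussian special case noted above, suppose (our assumption, since the corresponding display is not reproduced here) that the model is written on the log scale as log T = β'Z + e, with e∼N(0, σ^{2}) independent of Z and Φ denoting the standard normal distribution function. Every quantile of log T is then linear in Z:

```latex
\begin{align*}
P\{\log T > q \mid Z\} = \alpha
  &\iff 1 - \Phi\!\left(\frac{q - \beta^{\top} Z}{\sigma}\right) = \alpha \\
  &\iff q = \beta^{\top} Z + \sigma\,\Phi^{-1}(1-\alpha).
\end{align*}
```

Under this assumption the slope is the same β for every α, and only the intercept shifts by σΦ^{-1}(1−α). If σ were replaced by a scale σ(Z) that is nonlinear in Z, the quantiles for α ≠ 1/2 would no longer be linear in Z.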
Existing Bayesian median regression models of Kottas and Gelfand (2001) and Hanson and Johnson (2002) have the linear representation of (4) with f_{e}(u) free of the covariate Z. As a consequence, all quantile functions of log T are linear with the same slope (regression coefficient) for each covariate. As we mentioned before, these previous Bayes models cannot accommodate heteroscedasticity of log (T_{i}), a very common phenomenon in most popular survival models, including the Weibull model and Cox’s model (1972). We believe that our models achieve a sensible compromise between existing frequentist and Bayesian models by accommodating heteroscedasticity without restricting every quantile to be a linear functional. Note that the parametric MLE based on an assumed Gaussian ε yields a consistent quasi-likelihood estimator of β as long as the true ε is symmetric around 0 (even if it is not Gaussian); the variance of this estimator can be estimated using the so-called “sandwich” variance estimator (White, 1982). The loss of efficiency of this estimator under a non-Gaussian model is beyond the scope of this article.
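The sandwich idea can be sketched in a deliberately simplified setting. The sketch below assumes no censoring and an identity transformation, so that the Gaussian quasi-MLE of β reduces to least squares; it is not the TBS likelihood of (3). The function name, the simulated design, and the Laplace error law are hypothetical choices made only for illustration.

```python
import numpy as np

def gaussian_qmle_with_sandwich(X, y):
    """Gaussian quasi-MLE of beta (least squares here) with two covariance estimates.

    Simplifying assumptions (not from the article): uncensored responses and
    an identity transformation, so the working model is y = X beta + error.
    """
    n, p = X.shape
    bread = np.linalg.inv(X.T @ X)
    beta_hat = bread @ X.T @ y
    resid = y - X @ beta_hat
    # "Meat": sum over i of r_i^2 * x_i x_i^T, built from the Gaussian score terms
    meat = (X * resid[:, None] ** 2).T @ X
    sandwich = bread @ meat @ bread
    # Model-based covariance, which presumes a constant error variance
    sigma2 = resid @ resid / (n - p)
    model_based = sigma2 * bread
    return beta_hat, sandwich, model_based

# Hypothetical simulation: symmetric, non-Gaussian, heteroscedastic errors
rng = np.random.default_rng(0)
n = 2000
z = rng.uniform(0.0, 2.0, n)
X = np.column_stack([np.ones(n), z])
beta_true = np.array([1.0, 0.5])
eps = rng.laplace(0.0, 0.3 * (1.0 + z), n)  # Laplace scale grows with z
y = X @ beta_true + eps

beta_hat, V_sand, V_model = gaussian_qmle_with_sandwich(X, y)
print("estimate:", beta_hat)
print("sandwich SE:", np.sqrt(np.diag(V_sand)))
print("model-based SE:", np.sqrt(np.diag(V_model)))
```

Because the simulated errors are symmetric around 0, the point estimate should remain close to the true β even though the Gaussian working model is wrong; the two sets of standard errors may differ because only the sandwich form allows for the non-constant error variance.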
Although we focus on modeling the median functional, our method can be used to compute the joint confidence band of any other quantile functional via (5). For brevity, we have omitted the results of our simulation study showing excellent accuracy of the joint confidence bands of the quantile functions (Q_{0.25}(z), Q_{0.5}(z), Q_{0.75}(z)) under the Bayes TBS model of (3) (even when the simulation model is Pareto). For some diseases, such as cancers with very good prognosis, the main interest may center on modeling the quantile Q_{α}(Z) as a loglinear function with P{T < Q_{α}(Z)}=α for α > 1/2 (different from the median). In this case, we can use a modification of (3) with assumptions P(ε < 0) =F_{ε}(0) =α and ζ (u|θ) in (9) being the uniform density with support {2θ (α− 1), 2θα}. For the sake of brevity, we again omit the details of the rest of the methodology and the related MCMC steps. Our methods can also predict the outcome of a future patient with known covariate values. We do not present any separate simulation study of parametric Bayes estimators because these estimators under diffuse prior information are numerically close to parametric ML estimators. All of these advantages make our proposed method an extremely attractive alternative to other existing semiparametric methods for censored data.