A semiparametric Cox–Aalen transformation model with censored data

We propose a broad class of so‐called Cox–Aalen transformation models that incorporate both multiplicative and additive covariate effects on the baseline hazard function within a transformation. The proposed models provide a highly flexible and versatile class of semiparametric models that includes the transformation models and the Cox–Aalen model as special cases. Specifically, the class extends the transformation models by allowing potentially time‐dependent covariates to work additively on the baseline hazard, and it extends the Cox–Aalen model through a predetermined transformation function. We propose an estimating equation approach and devise an expectation‐solving (ES) algorithm that involves fast and robust calculations. The resulting estimator is shown to be consistent and asymptotically normal via modern empirical process techniques. The ES algorithm yields a computationally simple method for estimating the variance of both parametric and nonparametric estimators. Finally, we demonstrate the performance of our procedures through extensive simulation studies and applications in two randomized, placebo‐controlled human immunodeficiency virus (HIV) prevention efficacy trials. The data example shows the utility of the proposed Cox–Aalen transformation models in enhancing statistical power for discovering covariate effects.

hazard function. In contrast, the additive hazards models furnish an additive effect between the covariates and the baseline hazard function, enabling a direct reflection of the risk increase or decrease (Aalen, 1980; Huffer & McKeague, 1991; Lin & Ying, 1994). Without prior domain knowledge, it is hard to determine which approach is preferable among multiplicative and additive hazards models. In fact, both models may often be used to complement each other and provide more complete insights. Therefore, various multiplicative-additive hazards models have been proposed to capture both multiplicative and additive effects (Lin & Ying, 1995; Martinussen & Scheike, 2002). In particular, Scheike & Zhang (2002) suggested a Cox-Aalen model by replacing the baseline hazard function in the Cox model with Aalen's additive model. The Cox-Aalen model has been studied for various types of censored data, for example, right-censored (Scheike & Zhang, 2002), interval-censored (Boruvka & Cook, 2015), left-truncated and right-censored (Shen & Weng, 2018), left-truncated and mixed interval-censored (Shen & Weng, 2019), and recurrent-event (Qu and Sun, 2019).
Transformation models have also received wide attention in survival analysis. Dabrowska and Doksum (1988) introduced the class of linear transformation models, which includes the proportional hazards and proportional odds models (Bennett, 1983; Pettitt, 1982). Estimators for this class of models were proposed by Dabrowska and Doksum (1988), Cheng et al. (1995), Fine et al. (1998), and Chen et al. (2002), among others. Zeng & Lin (2006) extended the linear transformation models to allow time-dependent covariates. Hereafter, we refer to this class of transformation models as Zeng and Lin's model to avoid confusion. There is a rich literature investigating Zeng and Lin's model. Zeng & Lin (2006) proposed a nonparametric maximum likelihood estimator (NPMLE) in the presence of right-censored data. Zeng & Lin (2007) derived a system of self-consistent equations for the jump sizes of the baseline cumulative hazard function at exact failure times through an expectation-maximization (EM) algorithm. Chen (2009) showed that the self-consistent estimator derived in Zeng & Lin (2007) is asymptotically equivalent to a weighted Breslow-type estimator, which can be solved by a computationally efficient iterative reweighting algorithm. Liu & Zeng (2013) investigated variable selection procedures by minimizing a weighted negative partial log-likelihood function plus an adaptive lasso penalty. More recently, Zeng et al. (2016) and Zhou et al. (2021) studied the nonparametric maximum likelihood estimation of Zeng and Lin's model with interval-censored and partly interval-censored data, respectively.
However, one limitation of Zeng and Lin's model is that all covariate effects are assumed to be multiplicative within the transformation function. This assumption is too restrictive in some applications. For example, in an analysis of risk factors on mortality among patients with myocardial infarction, Scheike and Zhang (2003) showed that some covariates (e.g., ventricular fibrillation and congestive heart failure) have additive effects, while others (e.g., age and sex) have multiplicative effects. In addition, they pointed out that naively treating all covariates as multiplicative led to incorrect results when predicting survival probabilities. Another example is human immunodeficiency virus (HIV) prevention efficacy trials, for which HIV incidence varies across geographic regions and by sex/gender; thus, the different regions/sex/gender subgroups have different baseline hazard functions (Corey et al., 2021). Moreover, a Kaplan-Meier plot shows that survival curves for different regions cross, potentially suggesting an additive region effect. To the best of our knowledge, no existing work considers a class of semiparametric transformation models in which the baseline hazard function is allowed to depend on some potentially time-varying covariates additively. Therefore, it is desirable to provide a larger class of semiparametric transformation models that can accommodate both multiplicative and additive covariate effects under one unified framework.
The EM algorithm is a powerful tool for performing maximum likelihood estimation in the presence of latent variables or missing data (Dempster et al., 1977). In particular, various EM-type algorithms have been proposed to find the NPMLE for semiparametric transformation models (Liu & Zeng, 2013; Zeng & Lin, 2007; Zeng et al., 2016; Zhou et al., 2021). In analogy to EM, Elashoff and Ryan (2004) proposed an expectation-solving (ES) algorithm that handles missing data for general estimating equations, greatly facilitating its application to a broader framework. When the complete-data estimating equations correspond to the score functions from the likelihood, the ES algorithm essentially reduces to the EM algorithm. The ES algorithm dramatically improves computational efficiency for solving estimating equations involving frailty or latent variables. For example, Johnson and Strawderman (2012) developed a smoothing expectation and substitution algorithm for the semiparametric accelerated failure time frailty model. Henderson and Rathouz (2018) considered an approximate EM procedure for a longitudinal latent class model for count data.
In this paper, we propose a broad class of so-called Cox-Aalen transformation models that incorporate both multiplicative and additive covariate effects upon the baseline hazard function within a transformation. The proposed class of models is very flexible and contains Zeng and Lin's model and the Cox-Aalen model as special cases. However, the multiplicative-additive structure within the transformation and the need to estimate several nonparametric functions simultaneously impose additional challenges for model estimation. To alleviate such difficulties, we devise an ES-type algorithm, which iterates between an E-step, wherein functions of complete data are replaced by their expectations, and an S-step, where these expected values are substituted into the complete-data estimating equations, which are then solved. More specifically, within the S-step, the high-dimensional parameters are calculated explicitly, while the low-dimensional parameters are updated via the Newton-Raphson method. Consequently, the proposed ES algorithm is fast and stable even under a high censoring rate, as evidenced by our simulation studies and real-data applications. Another attraction of our approach is that we provide simple variance estimators for both parametric and nonparametric estimates. Furthermore, the theoretical properties of the proposed estimators are rigorously studied via modern empirical process techniques.
The rest of the article is organized as follows. In Section 2, we present the proposed Cox-Aalen transformation models. In Section 3, we formally describe the estimation procedure and establish the asymptotic properties of the proposed estimators. In Section 4, simulation studies are conducted to evaluate the finite-sample performance of the proposed method. In Section 5, we apply our method to two randomized HIV prevention efficacy trials. Section 6 concludes with a discussion.
Remark 2. When $G(x) = x$, according to Equation (1), the cumulative hazard function on the left-hand side can be written as $\Lambda(t \mid X, Z) = \int_0^t \exp\{\beta^\top Z(s)\}\, X^\top(s)\alpha(s)\,ds$. Thus, the conditional hazard function of $T$ is $X^\top(t)\alpha(t)\exp\{\beta^\top Z(t)\}$. Therefore, the Cox-Aalen model is a special case of the proposed models. In particular, when $X_2, \ldots, X_p$ represent levels in a set of factors, model (4) further reduces to the stratified Cox model (Kalbfleisch & Prentice, 2002, Section 4.4).
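Remark 3. For $G(x) = \log(1 + x)$, the odds of surviving beyond time $t$ based on Equation (1) are $\left[\int_0^t \exp\{\beta^\top Z(s)\}\, X^\top(s)\,dA(s)\right]^{-1}$. In the special case where the covariates are time-independent, the odds reduce to $\exp(-\beta^\top Z)\{X^\top A(t)\}^{-1}$, which is a stratified proportional odds model when $X$ is a random variable indicating strata.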
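The odds expression in Remark 3 can be checked in two lines. The sketch below assumes that Equation (1) has the form $\Lambda(t \mid X, Z) = G\{H(t)\}$, where $H(t)$ is shorthand introduced here (not the paper's notation) for the integrated term inside the transformation.

```latex
% Assumed form of Equation (1): \Lambda(t | X, Z) = G\{H(t)\},
% with the hypothetical shorthand H(t) = \int_0^t \exp\{\beta^\top Z(s)\} X^\top(s)\, dA(s).
\begin{align*}
S(t \mid X, Z) &= \exp\{-\Lambda(t \mid X, Z)\}
                = \exp[-\log\{1 + H(t)\}]
                = \frac{1}{1 + H(t)}, \\
\frac{S(t \mid X, Z)}{1 - S(t \mid X, Z)} &= \frac{1}{H(t)} .
\end{align*}
% With time-independent covariates, H(t) = \exp(\beta^\top Z)\, X^\top A(t), so
% \log\text{-odds} = -\beta^\top Z - \log\{X^\top A(t)\}.
```

On the log-odds scale the multiplicative covariates shift the curve by $-\beta^\top Z$, while the additive covariates select the baseline term $\log\{X^\top A(t)\}$, which is what makes the model a stratified proportional odds model when $X$ indicates strata.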
As illustrated above, our proposed class of semiparametric models is very flexible and contains many popular models in survival analysis. To motivate our approach, we first set up the observed data likelihood and derive the NPMLE for a special case. Then, for more general situations, we propose estimating the parameters $\beta$ and $A(\cdot)$ using estimating equations along with an easily implemented ES algorithm. Finally, the asymptotic properties of the resulting estimators are derived via modern empirical process theory (van der Vaart and Wellner, 1996).

Nonparametric maximum likelihood estimation
Assume that $T_i$ and $C_i$ are conditionally independent given $X_i(\cdot)$ and $Z_i(\cdot)$. Under the proposed Cox-Aalen transformation model (1), the likelihood for the observed data is given in Equation (5), where $\Lambda'_X(\cdot)$ and $G'(\cdot)$ are the derivatives of $\Lambda_X(\cdot)$ and $G(\cdot)$, respectively. The likelihood (5) involves $\beta$ and $p$ infinite-dimensional parameters $A_j$ ($j = 1, \ldots, p$), and it may not be concave in these parameters. Thus, nonparametric maximum likelihood techniques are usually employed to restrict the parameter space.
To establish a simple and efficient estimation procedure, we adopt the idea in Zeng & Lin (2007) by treating $\xi$ as a latent variable in the class of frailty-induced transformations (2). Note that model (1) is equivalent to the survival time $T$ having, conditional on $\xi$, the cumulative hazard function given in model (6).
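As a numerical illustration of a frailty-induced transformation, the sketch below checks that a gamma frailty with mean 1 and variance r induces the logarithmic transformation G(x) = log(1 + rx)/r used later in the paper. It assumes that the frailty-induced class (2) is defined through the Laplace transform, G(x) = -log E[exp(-x ξ)]; the function names are illustrative and not part of the paper.

```python
import math
import numpy as np

def G_log(x, r):
    """Logarithmic transformation G(x) = log(1 + r*x) / r, with G(x) = x at r = 0."""
    x = np.asarray(x, dtype=float)
    return x if r == 0 else np.log1p(r * x) / r

def G_from_gamma_frailty(x, r, upper=60.0, n_grid=200_001):
    """Evaluate G(x) = -log E[exp(-x * xi)] numerically for a gamma frailty xi
    with mean 1 and variance r (shape 1/r, scale r); requires r > 0.
    Assumption: class (2) is defined by this Laplace-transform identity."""
    shape, scale = 1.0 / r, r
    xi = np.linspace(0.0, upper, n_grid)
    # Gamma density evaluated on the grid.
    dens = xi ** (shape - 1.0) * np.exp(-xi / scale) / (math.gamma(shape) * scale ** shape)
    integrand = np.exp(-x * xi) * dens
    # Trapezoidal rule written out explicitly to avoid version-specific NumPy helpers.
    integral = np.sum((integrand[1:] + integrand[:-1]) * np.diff(xi)) / 2.0
    return -math.log(integral)

if __name__ == "__main__":
    for r in (0.5, 1.0):
        print(r, G_from_gamma_frailty(1.3, r), float(G_log(1.3, r)))
```

The two columns printed for each r should agree to several decimal places, which is why treating ξ as missing data turns the transformed model into a conditionally multiplicative one.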
To obtain the NPMLE of $\beta$ and $A(\cdot)$, we propose an EM-type algorithm by treating $\xi_i$ as missing data. In the E-step, we evaluate the posterior mean of $\xi_i$ given the observed data, denoted by $\hat{E}(\xi_i)$. The detailed calculations are given in the next section. In the M-step, we maximize the expectation of Equation (7) conditional on the observed data. More specifically, we set the derivatives of the conditional expectation of Equation (7) with respect to $a_k$ ($k = 1, \ldots, m$) and $\beta$ to zero. One can then solve for the estimates through Equations (8) and (9). The dimension of the unknown parameters $(a_1, \ldots, a_m, \beta)$ depends on $m$, which could be a large number when $n$ is large or the censoring rate is low. Therefore, Equations (8) and (9) form a system of high-dimensional nonlinear equations that is notoriously difficult to solve due to the curse of dimensionality. For a special case, that is, when $X$ is a vector of design variables for categories, there exist explicit formulae for calculating the high-dimensional parameters $a_k$ ($k = 1, \ldots, m$). See Web Appendix A for details. However, such explicit formulae do not exist for more general scenarios; hence, we consider an alternative estimating equation approach to overcome the aforementioned computational difficulties.

Estimating equations
Following Elashoff and Ryan (2004), we develop an ES algorithm for model (1) in this section. We begin by constructing a system of complete-data estimating equations based on model (6), which is equivalent to the proposed model (1) under the frailty-induced transformations (2). The connection between the proposed ES algorithm and the EM algorithm is discussed at the end of this section. Note that, under model (6), the intensity for $N_i(t)$ is $I(Y_i \ge t)\,\xi_i \exp\{\beta_0^\top Z_i(t)\}\, X_i^\top(t)\alpha_0(t)$, where $(\beta_0, A_0)$ are the true values of $(\beta, A)$ and $\alpha_0$ is the derivative of $A_0$. With $M_i(t)$ denoting $N_i(t)$ minus its compensator, it is clear that $E\{X_i(t)\,dM_i(t)\} = 0$ for any $0 \le t \le \tau$ and $E\{\int_0^\tau Z_i(t)\,dM_i(t)\} = 0$. By treating $\xi_i$ as missing, we consider the complete-data estimating equations (10) and (11). By the previous arguments that the nonparametric estimator for $\Lambda_X$ is a step function with jump size $X^\top(t_k) a_k$ at $t_k$ ($k = 1, \ldots, m$), Equations (10) and (11) can be written in the discretized form (12). Write $\theta = (a_1^\top, \ldots, a_m^\top, \beta^\top)^\top$. We propose to estimate $\theta$ through an ES-type algorithm by treating $\xi_i$ as missing. The ES algorithm iterates between an E-step, wherein the functions of the complete data are replaced by their expectations, and an S-step, where these expected values are substituted into the complete-data estimating equations (12), which are then solved. After specifying initial values of the unknown parameters $\theta$, say $\theta^{(0)}$, the proposed ES algorithm iterates between the following two steps until convergence.
E-step. Evaluate the posterior means $\hat{E}(\xi_i)$. When $\Delta_i = 1$, the posterior density of $\xi_i$ given the observed data $(\Delta_i = 1, Y_i, X_i, Z_i)$ is proportional to $\xi_i \exp(-\xi_i S_{i1}) f(\xi_i)$, where $S_{i1} = \Delta_i \sum_{t_k \le Y_i} \{X_i^\top(t_k) a_k\} \exp\{\beta^\top Z_i(t_k)\}$. Hence, we obtain $\hat{E}(\xi_i) = G'(S_{i1}) - G''(S_{i1})/G'(S_{i1})$ by taking the derivative twice of the equation $\exp\{-G(x)\} = \int_0^\infty \exp(-x\xi) f(\xi)\,d\xi$, where $G'(\cdot)$ and $G''(\cdot)$ are the first and second derivatives of $G(\cdot)$, respectively. When $\Delta_i = 0$, the posterior density of $\xi_i$ given the observed data $(\Delta_i = 0, Y_i, X_i, Z_i)$ is proportional to $\exp(-\xi_i S_{i2}) f(\xi_i)$, where $S_{i2} = (1 - \Delta_i) \sum_{t_k \le Y_i} \{X_i^\top(t_k) a_k\} \exp\{\beta^\top Z_i(t_k)\}$. One can obtain $\hat{E}(\xi_i) = G'(S_{i2})$. Therefore, the E-step can be summarized as $\hat{E}(\xi_i) = \Delta_i\{G'(S_{i1}) - G''(S_{i1})/G'(S_{i1})\} + (1 - \Delta_i)\,G'(S_{i2})$.
Step 2. Fix $a_1, \ldots, a_m$ and update $\beta$ by solving the estimating equation for $\beta$ using the Newton-Raphson method. Note that within the S-step, we iterate between Steps 1 and 2 until convergence. The S-step is declared convergent when the sum of the absolute differences of the estimates at two successive iterations is less than a small positive number, say $10^{-3}$. We iterate between the E- and S-steps until convergence and denote the final estimates by $\hat{\theta} = (\hat{a}_1^\top, \ldots, \hat{a}_m^\top, \hat{\beta}^\top)^\top$. A natural estimator of $A(t)$ is $\hat{A}(t) = \sum_{t_k \le t} \hat{a}_k$ for $0 \le t \le \tau$. Moreover, recall that $A(t) = \int_0^t \alpha(s)\,ds$; hence, we can estimate $\alpha(t)$, $0 \le t \le \tau$, via a kernel estimator based on $\hat{A}$, where $K(\cdot)$ is the kernel function and $h$ is the bandwidth. Throughout this paper, we choose the Epanechnikov kernel function, that is, $K(x) = \frac{3}{4}\max\{1 - x^2, 0\}$.
The proposed ES algorithm has several desirable features. First, a closed-form formula for computing $\hat{E}(\xi_i)$ is obtained in the E-step. Second, it avoids solving a large system of nonlinear equations in the S-step because the high-dimensional parameters $a_k$ ($k = 1, \ldots, m$) are calculated explicitly, while the low-dimensional parameter $\beta$ is updated via the Newton-Raphson method. Accordingly, the proposed ES algorithm performs stably and satisfactorily without calculating the inverse of any high-dimensional matrices. Third, when $X$ is a vector of design variables for categories, the corresponding ES algorithm coincides with the EM algorithm proposed in Section 3.2 by observing that for fixed $\beta$, Equations (8) and (13) share the same solution in terms of $a_k$ ($k = 1, \ldots, m$). This implies that the proposed ES estimator is also efficient under this special case. See Web Appendix B for detailed justifications. Similarly, it can be shown that the ES algorithm coincides with the EM algorithm when $X \equiv 1$, that is, $p = 1$. Finally, we remark that Equation (13) can be considered as a weighted version of Equation (8), where each participant $i$ is assigned weight $X_i^\top(t_k) a_k$.
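To make the closed-form E-step concrete, the following sketch evaluates the posterior mean of the frailty for the logarithmic transformations G(x) = log(1 + rx)/r. The formula for uncensored observations, G'(S) - G''(S)/G'(S), is derived here by differentiating exp{-G(x)} twice, as the text describes; the function names and the argument `S` (the accumulated quantity entering the posterior density) are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def G1(x, r):
    """First derivative of G(x) = log(1 + r*x) / r (equals 1 when r = 0)."""
    return 1.0 / (1.0 + r * x)

def G2(x, r):
    """Second derivative of G(x) = log(1 + r*x) / r (equals 0 when r = 0)."""
    return -r / (1.0 + r * x) ** 2

def posterior_mean_frailty(S, delta, r):
    """E-step sketch: posterior mean of the frailty for each participant.

    S     : accumulated sum entering the posterior density (hypothetical input)
    delta : 1 if the failure time is observed, 0 if right-censored
    r     : index of the logarithmic transformation
    """
    S = np.asarray(S, dtype=float)
    delta = np.asarray(delta)
    g1, g2 = G1(S, r), G2(S, r)
    observed = g1 - g2 / g1   # posterior density proportional to xi * exp(-xi*S) * f(xi)
    censored = g1             # posterior density proportional to exp(-xi*S) * f(xi)
    return np.where(delta == 1, observed, censored)

if __name__ == "__main__":
    S = np.array([0.0, 0.4, 2.0])
    delta = np.array([1, 0, 1])
    print(posterior_mean_frailty(S, delta, r=0.5))
```

For this family the result simplifies to (1 + r·delta) / (1 + r·S), the mean of the conjugate gamma posterior, which gives an independent check on the derivative-based formula; at r = 0 the posterior mean is identically 1 and the E-step becomes trivial.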

Variance estimator
In this section, we provide easy-to-compute variance estimators for both the parametric estimates β and the nonparametric estimates Â(), α().Note that Ê(  ) is a function of the observed data   and the unknown parameter : Ê(  ) = (  , ).With  the collection of   ( = 1, … , ), the proposed ES estimator is intrinsically equivalent to solving the following observed-data estimating equation: (, ) = 0, where (, ) = (  1 , … ,    ,   ), Note that   1 , … ,    ,   also depend on the observed data  and the unknown parameter .Here, we compress the notation when there is no confusion.From Equation (15), one can easily note that (, ) can be expressed as the sum of independent terms: Let (, ) be the derivative of (, ) with respect to .
The covariance matrix of $\hat{\theta}$ is consistently estimated by the expression in Equation (16). Therefore, the variance-covariance matrix of $\hat{\beta}$ can be consistently estimated by the $q \times q$ lower right-hand corner of Equation (16). The variance-covariance matrix of $\hat{a}_k$ ($k = 1, \ldots, m$) can be consistently estimated by the $(mp) \times (mp)$ upper left-hand corner of Equation (16).
Condition 5. Let Ψ 0 be the Fréchet derivative of () with respect to  at  =  0 .See Web Appendix C for detailed expressions of Ψ 0 .We assume that Ψ 0 is an invertible map.
Remark 4. Conditions 1 and 2 state the boundedness of the covariates and the compactness of the Euclidean parameter space, which are conventional conditions used in most regression analyses. Condition 3 ensures the existence and uniqueness of the jump sizes in Equation (14). Condition 4 ensures that the transformation function $G$ is strictly increasing on $[0, \infty)$. Condition 5 is a classical condition for Z-estimators.
For the transformation functions, we consider the class of logarithmic transformations $G(x) = r^{-1}\log(1 + rx)$ with $r = 0$, 0.5, and 1, where $r = 0$ specifies the Cox-Aalen model. For all setups, we let $\tau = 1$ be the duration of the study. For each study participant, we generate one censoring time $C \sim \text{exponential}(0.5)$. We set $\Delta = 1$ if $T \le \min(C, \tau)$, and 0 otherwise. This process yields about 75% to 85% right-censored observations for $r = 0$, 0.5, and 1. For each dataset, we applied the proposed ES algorithm by setting the initial value of $\beta$ to 0 and the initial value of $a_k$ to $(1/n, 0, \ldots, 0)$ for each $k = 1, \ldots, m$. We also tried other initial values, yielding almost identical results. We set $n = 200$, 500, or 800, and all simulation results are based on 1000 replicates.
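The censoring mechanism described above can be sketched as follows. The covariate distributions, the linear form of A(t), and the parameter values are hypothetical placeholders, since the scenario definitions are not reproduced in this section; "exponential(0.5)" is read here as rate 0.5, which is also an assumption. Failure times are drawn by inverting the survival function exp[-G{exp(βZ) X'A(t)}] for time-independent covariates.

```python
import numpy as np

def simulate(n, beta, slope, r, tau=1.0, cens_rate=0.5, seed=0):
    """Draw right-censored data from a Cox-Aalen transformation model (sketch).

    Hypothetical setup: Z ~ Bernoulli(0.5) acts multiplicatively, X = (1, X2)
    with X2 ~ Uniform(0, 1) acts additively, and A(t) = (slope[0]*t, slope[1]*t).
    """
    rng = np.random.default_rng(seed)
    Z = rng.binomial(1, 0.5, n)
    X2 = rng.uniform(0.0, 1.0, n)
    # exp(beta*Z) * X'A(t) = rate * t under the linear A(t) assumed above.
    rate = (slope[0] + slope[1] * X2) * np.exp(beta * Z)
    E = rng.exponential(1.0, n)                  # E = -log(U), U ~ Uniform(0, 1)
    x = E if r == 0 else np.expm1(r * E) / r     # inverse of G(x) = log(1 + r*x)/r
    T = x / rate                                 # solves G(rate * T) = E
    C = rng.exponential(1.0 / cens_rate, n)      # NumPy parameterizes by the mean
    limit = np.minimum(C, tau)
    Y = np.minimum(T, limit)                     # observed time
    delta = (T <= limit).astype(int)             # 1 = failure observed
    return Y, delta, X2, Z

if __name__ == "__main__":
    Y, delta, X2, Z = simulate(n=500, beta=0.5, slope=(0.2, 0.3), r=0.5)
    print("censoring proportion:", 1.0 - delta.mean())
```

The printed censoring proportion depends entirely on the placeholder parameters and is not meant to reproduce the 75% to 85% range reported for the paper's scenarios.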
Table 1 summarizes the results for estimation of $\beta_1$ and $\beta_2$ for all scenarios. Despite the high censoring percentage, Table 1 shows that the proposed procedures perform well in several ways: (i) the estimators are virtually unbiased; (ii) the estimated standard errors are fairly close to the empirical standard errors; (iii) the empirical coverage probabilities of the 95% confidence intervals are all close to the nominal 95% level; (iv) as the sample size increases, the bias and the variability of the parameter estimators decrease. Thus, our proposed estimation procedures are reliable for various Cox-Aalen transformation models.
Figure 1 shows the estimation results for the cumulative regression functions $A(\cdot)$ in Scenario 1. The proposed estimators are again virtually unbiased, and the estimated curves capture the shapes of the true cumulative regression functions very well; the estimated standard errors are close to the empirical standard errors; and the confidence intervals have reasonably accurate coverage probabilities. To save space, estimation results for $\alpha(\cdot)$ via the kernel smoothing approach with bandwidth $h = 0.1$ are provided in Web Appendix D for Scenario 1. In addition, estimation results for $A(\cdot)$ and $\alpha(\cdot)$ under Scenarios 2 and 4 are also presented in Web Appendix D. These results further confirm the satisfactory performance of our proposed method in various numerical settings. We also conducted simulation studies to investigate the robustness of the proposed estimator under misspecification of the $G$ function. The setups were the same as Scenario 3, and simulation results are displayed in Web Appendix D. The results suggested that misspecification of the transformation function led to biased estimates and coverage probabilities lower than the nominal levels.
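The kernel smoothing step mentioned above can be sketched as follows. The estimator form, the sum over jump times of K((t - t_k)/h) times the estimated jump, divided by h, is the standard way to smooth a step-function estimate and is assumed here rather than taken from the paper's display; the function and argument names are illustrative.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * max(1 - u^2, 0)."""
    u = np.asarray(u, dtype=float)
    return 0.75 * np.maximum(1.0 - u ** 2, 0.0)

def smooth_alpha(t, jump_times, jumps, h=0.1):
    """Kernel-smoothed estimate of alpha(t) from the jumps of a step function.

    t          : evaluation time(s)
    jump_times : times t_k at which the cumulative estimate jumps
    jumps      : jump sizes, shape (m,) for one component or (m, p) for p components
    h          : bandwidth (0.1 in the simulations described above)
    No boundary correction is applied, so values within h of 0 or tau are biased.
    """
    t = np.atleast_1d(np.asarray(t, dtype=float))
    u = (t[:, None] - np.asarray(jump_times, dtype=float)[None, :]) / h
    return epanechnikov(u) @ np.asarray(jumps, dtype=float) / h

if __name__ == "__main__":
    # Toy check: equal jumps on a fine grid mimic A(t) = t, so alpha(t) should be near 1.
    times = (np.arange(1000) + 0.5) / 1000.0
    jumps = np.full(1000, 1.0 / 1000.0)
    print(smooth_alpha([0.3, 0.5, 0.7], times, jumps, h=0.1))
```

Passing a two-dimensional `jumps` array smooths all components of the cumulative regression function at once, because the matrix product sums over the jump times for each column.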
Moreover, we demonstrate the superiority of our proposed model over Zeng and Lin's model in one simulation example. Specifically, we generated the data from our proposed Cox-Aalen transformation model, where one covariate has a multiplicative effect and the other has an additive effect. If we falsely assume that both covariates have multiplicative effects and fit Zeng and Lin's model, we will obtain biased estimates of the survival function and cumulative hazard. Thus, our proposed model can better capture complex hazard functions. See Web Appendix D for details.

AN HIV PREVENTION STUDY EXAMPLE
In this section, we apply the proposed model and methods to two harmonized randomized trials, HIV Vaccine Trials Network (HVTN) 704/HIV Prevention Trials Network (HPTN) 085 and HVTN 703/HPTN 081 (Corey et al., 2021), designed to determine whether a broadly neutralizing monoclonal antibody (bnAb) can prevent the acquisition of human immunodeficiency virus type 1 (HIV). The HVTN 704/HPTN 085 trial enrolled 2687 men who have sex with men and transgender persons in the Americas and Europe, and HVTN 703/HPTN 081 enrolled 1924 females in sub-Saharan Africa. For each trial, HIV-uninfected participants were randomly assigned in a 1:1:1 ratio to receive infusions of a bnAb (VRC01) at a dose of 10 mg/kg of body weight (low-dose group), VRC01 at 30 mg/kg (high-dose group), or saline placebo, administered at 8-week intervals for 10 total infusions. The primary efficacy endpoint was diagnosis of HIV infection by the week 80 trial visit, and HIV testing was conducted at each 4-week trial visit starting at week 0. For participants acquiring HIV infection, the diagnosis date was determined by the adjudicated diagnosis date based on validated assays (Corey et al., 2021). Participant follow-up is right-censored by the minimum of their last negative HIV sample collection date and $\tau = 85.9$ weeks (Corey et al., 2021). Therefore, the observed data consist of exact and right-censored observations. Among the 4,559 HIV-negative participants from both trials, 1,401 are in the U.S. and Switzerland, 1,249 in Brazil and Peru, 1,009 in South Africa, and 900 in other sub-Saharan African countries (Switzerland was pooled with the U.S. given few participants in Switzerland). We analyze the two trials pooled together, which is valid given the harmonized protocols such that essentially the study ...
Figure 2 reveals that the risk of HIV infection diagnosis in different regions crosses over. Therefore, without imposing proportional hazards for different regions, we consider the following Cox-Aalen transformation model to assess the association between treatment assignment, age, and region with the time since the first infusion to HIV infection diagnosis: $\Lambda(t \mid X, Z) = G\{\exp(\beta^\top Z)\,\Lambda_X(t)\}$, where $\beta$ is the vector of unknown regression coefficients and $\Lambda_X(t) = \int_0^t \{X^\top dA(s)\} = X^\top A(t)$ with $A(t) = (A_1(t), \ldots, A_4(t))^\top$. Here, $Z = (Z_1, Z_2, Z_3, Z_4, Z_5)^\top$, where $Z_1$ and $Z_2$ are indicators of being assigned to the low-dose and high-dose groups, respectively, with the placebo group as the reference group; $Z_3$, $Z_4$, $Z_5$ are indicators of the age groups [21, 30], [31, 40], and [41, 52], respectively, with [17, 20] as the reference age group. In addition, let $X = (1, X_2, X_3, X_4)^\top$, where $X_2$, $X_3$, $X_4$ are indicators of participants from Brazil and Peru, South Africa, and other sub-Saharan African countries, respectively. The participants from the USA and Switzerland are considered as the reference group. Here, "USAS", "BP", "SA", and "Other SSA" represent USA and Switzerland, Brazil and Peru, South Africa, and other sub-Saharan African countries, respectively.
We conducted the analysis using the class of logarithmic transformations $G(x) = r^{-1}\log(1 + rx)$, with $r$ values ranging from 0 to 3 with an increment of 0.1. The $r$ value that maximizes the log-likelihood function evaluated at the final parameter estimates was selected. The log-likelihood is maximized at $r = 0$, though the values do not change greatly for different values of $r$ due to a high censoring rate (about 96.2%); this phenomenon is verified in our simulation studies (see Web Appendix E for details).
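The grid search over the transformation index can be sketched as below. The callable `loglik_at` is a hypothetical stand-in for "fit the model with the ES algorithm at a given r and return the log-likelihood at the final estimates"; it is not a function from the paper or from any package.

```python
import numpy as np

def select_r(loglik_at, r_grid=None):
    """Pick the transformation index r that maximizes a profile-type criterion.

    loglik_at : callable r -> log-likelihood evaluated at the fitted parameters
                (hypothetical; the caller supplies the model fitting)
    r_grid    : candidate values; defaults to 0, 0.1, ..., 3.0
    """
    if r_grid is None:
        # linspace plus rounding avoids accumulated floating-point drift in the grid.
        r_grid = np.round(np.linspace(0.0, 3.0, 31), 1)
    values = np.array([loglik_at(r) for r in r_grid])
    best = int(np.argmax(values))
    return float(r_grid[best]), values

if __name__ == "__main__":
    # Toy criterion with a known maximizer, used only to exercise the search.
    best_r, values = select_r(lambda r: -(r - 0.7) ** 2)
    print(best_r, values.max() - values.min())
```

Returning the full vector of criterion values makes it easy to see how flat the surface is; when the spread is small, as reported above under heavy censoring, the selected r should be read with corresponding caution.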
The lower panel of Table 2 shows the regression parameter estimates for the selected transformation function ($r = 0$). High-dose VRC01 significantly lowers the risk of HIV infection diagnosis, while low-dose VRC01 does not. The model fit also shows a significant association between older age and a lower risk of HIV infection diagnosis. Figure 3 displays the estimated baseline cumulative hazard function $\Lambda(t \mid X, Z = 0)$ for the four different regions. The risk of HIV infection diagnosis is the highest in Brazil and Peru and lowest in the U.S. and Switzerland. The estimates for South Africa and other sub-Saharan African countries cross; in particular, South Africa has a lower risk at early times after the first infusion but a higher risk at later times. In addition, Figure 3 shows that the HIV infection diagnosis hazards are not proportional across geographic regions. Figure 4 plots the estimates of conditional survival functions at sixteen different combinations of covariates: four age groups crossed with four regions. This figure further confirms our findings above. In Web Appendix E, we also report the analysis results under other values of $r$ and observe the same patterns.
The four other panels of Table 2 (upper panels) show results from Zeng and Lin's model fit to each of the four geographic regions separately; this method was not applied to the full cohort (pooled) data because it cannot flexi-

DISCUSSION
In this paper, we proposed a class of semiparametric Cox-Aalen transformation models that includes Zeng and Lin's model (Zeng & Lin, 2006;Zeng et al., 2016) and the Cox-Aalen model (Scheike & Zhang, 2002) as special cases.By considering the class of frailty-induced transformations, we successfully developed a fast and stable ES algorithm to estimate the parametric and nonparametric components of the proposed model along with easy-to-compute variance estimators.In addition, the asymptotic properties of our proposed estimators are rigorously studied.Elashoff and Ryan (2004) pointed out that an ES algorithm can be regarded as a block Newton-Gauss-Seidel algorithm (see Ortega 1972, p. 146).Following Ortega (1972, p. 147), an ES algorithm converges locally to the solution, θ, of () = 0 if the Jacobian matrix  = ∕ is nonsingular at  = θ and the largest eigenvalue of  −1 ( θ) is less than 1.For general estimating equations, the two conditions above are difficult to verify in advance, especially for the second condition.Nevertheless, the matrix  is needed to calculate the variance of θ in Equation ( 16), and hence one can check the required conditions numerically.
In real-data applications, we ascertain whether a covariate has a multiplicative or additive effect based on the following criteria. First, we may employ the underlying biological or physical meaning, or other domain knowledge, for decision-making. Second, initial data exploration can be performed for each covariate, such as drawing the Kaplan-Meier (KM) plot. If the KM curves cross, tions from the additive components and the multiplicative components of the model may not be in the multiplicative form. In addition to choosing the right covariates for the multiplicative and additive components of the model, misspecifying the transformation function can result in erroneous inferences.
Assessing the adequacy of the proposed model is crucial because model misspecification affects the validity of inference and prediction accuracy.

TABLE 2
Regression analysis results for the HIV trials from Zeng and Lin's model fit to each of the four geographic regions separately and the proposed Cox-Aalen transformation model based on the full cohort data with the logarithmic transformation $G(x) = r^{-1}\log(1 + rx)$ with $r = 0$.
For Zeng and Lin's model, Chen et al. (2012) considered appropriate time-dependent residuals and constructed various graphical and numerical procedures for model assessment. In our analysis of the HIV prevention trial data, we use the log-likelihood to select the transformation function, even though the log-likelihood surface is relatively flat. Similar to Chen et al. (2012), we suggest constructing the cumulative sums of residuals over the argument of the transformation function, that is, (, ) =  −1∕2 ∑  =1 ∫  0 (∫  0   ()e β⊤   ()  ⊤  () Â() ≤ )  (; β, Â) to check the transformation form, where   (; , ) =   () − {∫  0   ()e  ⊤   ()  ⊤  ()()}. A thorough theoretical and numerical investigation of model misspecification is still needed for the proposed model. We are currently pursuing this direction.

ACKNOWLEDGMENTS
This research is part of Xi Ning's PhD dissertation. This research was partially supported by the National Institutes of Health, National Institute of Allergy and Infectious Diseases [grant numbers UM1 AI68635 and R37 AI054165]. The research of Xi Ning was also supported, in part, by the 2022 Graduate School Summer Fellowship Program from the University of North Carolina at Charlotte (UNC Charlotte). Yinghao Pan's work was partially supported by funds provided by UNC Charlotte. The research of Yanqing Sun was also partially supported by the National Science Foundation [grant number DMS-1915829] and the Reassignment of Duties fund provided by UNC Charlotte. We thank the HIV Vaccine Trials Network (HVTN) for providing the data analyzed in this article, especially Dr. Yunda Huang. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

DATA AVAILABILITY STATEMENT
The data that support the findings of this paper are available in the Supporting information of this paper.

ORCID
Xi Ning https://orcid.org/0000-0003-3585-8771
Yinghao Pan https://orcid.org/0000-0002-4022-1815
Yanqing Sun https://orcid.org/0000-0002-3140-4572

TABLE 1
Simulation results for estimation of the regression parameters under Scenarios 1 to 4. Bias, bias of the parameter estimator; SE, empirical standard error of the parameter estimator; SEE, mean of the standard error estimator; CP, empirical coverage percentage of the 95% confidence interval.

Note (Table 2): Est and SE stand for the estimates of the regression parameters and the estimated standard errors, respectively. "Other SSA" is for other sub-Saharan African countries. "USA/Switzerland", "Brazil/Peru", "South Africa", and "Other SSA" correspond to the estimation results by fitting Zeng and Lin's model to each geographic region. "The Proposed Model" corresponds to the estimation results when fitting the proposed Cox-Aalen transformation models to the full cohort data.