Biased estimation of thrombosis rates in cancer studies using the method of Kaplan and Meier


Correspondence: Jeffrey Zwicker, Beth Israel Deaconess Medical Center, 330 Brookline Ave, Boston, MA 02215, USA.
Tel.: +1 617 667 9299; fax: +1 617 667 9922.


To cite this article: Campigotto F, Neuberg D, Zwicker JI. Biased estimation of thrombosis rates in cancer studies using the method of Kaplan and Meier. J Thromb Haemost 2012; 10: 1449–51.

Based on the well-described association between venous thromboembolic events (VTE) and cancer, there has been considerable interest in defining risk factors and a role for anticoagulants in the management of cancer patients. However, unlike in clinical studies of high-risk orthopedic patients or even acutely ill medical patients, a significant percentage of patients enrolled in these cancer-anticoagulant studies ultimately die from their underlying illness. Kaplan–Meier analysis of these cohorts is biased because study subjects who die before developing a VTE are censored. The Kaplan–Meier (KM) method assumes that censored individuals remain at the same risk of developing a VTE as those still under observation beyond that time point, an assumption that cannot hold for patients who have died. In order to account for these competing endpoints, a competing risk analysis is a more appropriate statistical approach [1]. The issue of competing risks is well recognized in the oncology literature, but competing risk methods have been applied inconsistently to the analysis of clinical studies assessing thrombotic rates in cohorts associated with a high incidence of mortality such as malignancy or heparin-induced thrombocytopenia.

A large number of clinical studies evaluating anticoagulants or risk factors in cancer populations fail to account for death as a risk competing with the primary thrombosis endpoint [2–7]. In its review of dalteparin for the treatment of cancer-associated VTE, the Food and Drug Administration (FDA) performed a hypothetical analysis using extreme assumptions of informative censoring and concluded that the significant benefit of dalteparin over oral anticoagulation was unlikely to be obscured by mortality as a competing risk [8]. The accuracy of this conclusion was recently affirmed [9], but the same may not be true of other studies in which differences in VTE incidence are less pronounced or in which biomarkers such as D-dimer are linked both with survival outcomes [10] and with the incidence of VTE [7]. In order to illustrate the potential bias of KM analysis in determining the probability of VTE in cancer populations, we simulated two scenarios using datasets of 300 patients.

Simulations were performed with the following assumptions: (i) observations were available on the failure time of n individuals taken to be independent; (ii) an individual could experience only a single event; (iii) the competing risks data were right censored; (iv) a random censorship model was assumed (in the type II censoring model with uninformative censoring); (v) there were only two types of events: the primary event of VTE occurrence and the competing event of death in the absence of VTE; and (vi) patients who experienced neither VTE nor death during the study observation period were treated as censored observations.

An exponential distribution was used to generate VTE failure times, reflecting the assumption that the cumulative probability of recurrent VTE at 6 months is 12.5%. In order to explore the potential impact of KM vs. competing risk analysis on interpretation of VTE rates, two scenarios were considered based on different survival times. In the first scenario (Scenario A), median survival of the cohort was approximately 5 months, thereby reflecting a patient population with advanced malignancy at high risk of VTE and in which death attributable to the underlying malignancy occurred within 2 months in 25% of the patients enrolled. In the second scenario (Scenario B), the median survival was reduced to 2 months. An exponential distribution with a failure rate of 0.0289 per month was used to generate censoring times to simulate a median follow-up of 24 months. The cumulative incidence function (CIF) for the occurrence of VTE was first estimated using the standard KM approach [11], whereby the CIF estimate is the complement of the KM estimate of the survival function, and then by competing risk (CR) methodology [12].
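As a rough illustration of this kind of data-generating setup, the following Python sketch draws independent latent times for VTE, death and censoring and records whichever occurs first. Several details are assumptions made for illustration only: the function name simulate_cohort, the use of an exponential distribution for death times, the conversion of the 12.5% six-month probability into a constant hazard, and the random seed. It is not the authors' simulation code and is not expected to reproduce the published estimates.

```python
import numpy as np

def simulate_cohort(n=300, median_survival=5.0, vte_prob_6mo=0.125,
                    censor_rate=0.0289, seed=0):
    """Simulate (time, status) with status 0 = censored, 1 = VTE, 2 = death.

    Hypothetical helper: all three latent times are assumed to be
    independent and exponentially distributed (times in months).
    """
    rng = np.random.default_rng(seed)
    # Assumed constant hazard giving a 12.5% cumulative probability at 6 months.
    vte_rate = -np.log(1.0 - vte_prob_6mo) / 6.0
    # Assumed exponential death times with the requested median survival.
    death_rate = np.log(2.0) / median_survival
    t_vte = rng.exponential(1.0 / vte_rate, n)
    t_death = rng.exponential(1.0 / death_rate, n)
    # Rate 0.0289 per month corresponds to a median follow-up of about 24 months.
    t_cens = rng.exponential(1.0 / censor_rate, n)
    # Only the earliest of the three latent times is observed.
    time = np.minimum(np.minimum(t_vte, t_death), t_cens)
    status = np.where(time == t_vte, 1, np.where(time == t_death, 2, 0))
    return time, status

# Scenario A (median survival 5 months) and Scenario B (2 months).
time_a, status_a = simulate_cohort(median_survival=5.0)
time_b, status_b = simulate_cohort(median_survival=2.0)
```

Shortening the median survival leaves the VTE hazard unchanged but makes death the first observed event for a larger share of patients, which is the feature the two scenarios are designed to contrast.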

In our simulations, the data were generated by assuming the existence of two possible types of competing events, VTE and death. In order to compare the two approaches, for each of the two simulated data sets, we calculated the KM and the CR estimates of the cumulative incidence function of the VTE event at 6 months. In Scenario A, the cumulative probability of developing VTE at 6 months after study entry was 56.3% (95% confidence interval [CI], 47.8–63.3%) by KM analysis compared with 39.5% (95% CI, 33.8–45.2%) by competing risk analysis (Fig. 1A). The difference was even more apparent when the median survival was shorter (Scenario B), with a cumulative incidence of VTE of 55.4% (95% CI, 43.2–64.9%) and 28.8% (95% CI, 23.6–34.2%) by KM and competing risk analysis, respectively (Fig. 1B).
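The two estimates compared here can be sketched in a few lines of Python. The function below is a minimal illustration, not the software used for the analysis: the name km_and_cr_incidence and the status coding (0 = censored, 1 = VTE, 2 = death without VTE) are assumptions, the competing risk estimate is the standard nonparametric cumulative incidence estimator, and confidence intervals are omitted.

```python
import numpy as np

def km_and_cr_incidence(time, status, horizon):
    """Return (1 - KM, competing-risk CIF) for event type 1 at `horizon`.

    Hypothetical coding: status 0 = censored, 1 = event of interest (VTE),
    2 = competing event (death without VTE).
    """
    time = np.asarray(time, dtype=float)
    status = np.asarray(status)
    km_surv = 1.0       # KM "VTE-free" survival, deaths treated as censored
    overall_surv = 1.0  # KM survival free of any event (VTE or death)
    cif = 0.0           # competing-risk cumulative incidence of VTE
    event_times = np.unique(time[(status > 0) & (time <= horizon)])
    for t in event_times:
        at_risk = np.sum(time >= t)
        d_vte = np.sum((time == t) & (status == 1))
        d_death = np.sum((time == t) & (status == 2))
        # CIF increment: probability of being event-free just before t,
        # multiplied by the VTE hazard at t.
        cif += overall_surv * d_vte / at_risk
        km_surv *= 1.0 - d_vte / at_risk
        overall_surv *= 1.0 - (d_vte + d_death) / at_risk
    return 1.0 - km_surv, cif

# Toy cohort of five patients: two VTEs, two deaths, one censored.
km_est, cr_est = km_and_cr_incidence([1, 2, 3, 4, 5], [2, 1, 2, 1, 0], horizon=6)
```

In this toy cohort two of five patients develop VTE and no one is censored before the last event, so the competing risk estimate equals the observed proportion of 0.40, whereas the complement of the KM estimate is 0.625.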

Figure 1. Overestimation of the cumulative probability of venous thromboembolic events (VTE) in cancer cohorts generated by Kaplan–Meier analysis. Simulated datasets are based on an exponential distribution VTE rate of 0.125 at 6 months and modeled to reflect two different survival times: (A) 5-month median overall survival, and (B) 2-month median overall survival. Kaplan–Meier estimation (solid lines, censored events in hash marks) and competing risk estimation (dashed lines).

Because the Kaplan–Meier method censors deaths, the cumulative risk of VTE at 6 months was overestimated. This estimate is biased and misleading as an estimate of the probability of VTE in the presence of competing risks. The magnitude of bias associated with the KM estimation of the cumulative incidence function in a competing risks setting depends on the incidence of the competing events: a shorter survival increases the magnitude of the bias in the statistical estimate. In general, the KM estimate of the cumulative incidence function is always greater than or equal to that from the CR approach, with equality holding only up to the time of the first failure from the cause of interest that follows the first competing-risk event. When competing risks are present, the most appropriate approach to test whether two cumulative incidence distributions are equal is the Gray test, which is a modification of the log-rank test that accounts for the multiplicity of competing risks [12].
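The ordering of the two estimates can be made explicit with the standard nonparametric estimators. The notation below is introduced here for illustration and does not appear in the original text: t_1 < t_2 < ... are the distinct times at which any event occurs (with t_0 = 0 and both survival estimates equal to 1 at t_0), n_j is the number of patients at risk just before t_j, and d_{1j} and d_{2j} are the numbers of VTE events and of deaths without VTE at t_j.

```latex
% KM survival with deaths censored, and overall (any-event) survival
\begin{align*}
\hat S_{\mathrm{KM}}(t) &= \prod_{t_j \le t}\left(1 - \frac{d_{1j}}{n_j}\right), &
\hat S(t) &= \prod_{t_j \le t}\left(1 - \frac{d_{1j} + d_{2j}}{n_j}\right), \\
% Both estimates of the cumulative incidence written as sums of increments
1 - \hat S_{\mathrm{KM}}(t) &= \sum_{t_j \le t} \hat S_{\mathrm{KM}}(t_{j-1})\,\frac{d_{1j}}{n_j}, &
\hat F_{1}(t) &= \sum_{t_j \le t} \hat S(t_{j-1})\,\frac{d_{1j}}{n_j}.
\end{align*}
% Since \hat S(t_{j-1}) \le \hat S_{\mathrm{KM}}(t_{j-1}) for every j:
\[
\hat F_{1}(t) \;\le\; 1 - \hat S_{\mathrm{KM}}(t).
\]
```

The two sums share the same increments d_{1j}/n_j and differ only in their weights, which coincide until a death has occurred. The estimates therefore separate at the first VTE that follows the first death, and the gap widens as deaths accumulate.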

The number of failures from the competing risk (death) influences the number of failures from the cause of primary interest (VTE) and consequently the estimate of the probability of failure from this cause. Specifically, failures from the competing risk reduce the number of patients at risk of failure from the cause of interest. Because the Kaplan–Meier approach treats these patients as censored, and therefore as still able to develop VTE, it overestimates the probability of failure and hence the cumulative incidence function. In contrast, the competing risk approach treats a competing event as an actual event rather than a censored observation. We conclude that Kaplan–Meier analysis is an inappropriate statistical methodology to evaluate the probability of VTE in cancer cohorts. A competing risk analysis was recently utilized in the analysis of a large randomized clinical trial [13] and should serve as the reference methodology to analyze time-to-event data in cancer thrombosis studies.


Grant support by NHLBI K23 HL084052 (J. I. Zwicker).

Disclosure of Conflict of Interest

The authors state that they have no conflict of interest.