Electronic healthcare databases are widely used to assess the comparative effectiveness and safety of therapeutics in real-world settings.[1, 2] However, as these databases are not created for research purposes, data on certain important confounders may not be recorded. For example, administrative claims databases rarely, if ever, include direct measures of cigarette smoking, left ventricular ejection fraction, or depression severity.
Confounder data are sometimes available for a subset of patients. For example, measures of cigarette smoking or left ventricular ejection fraction may be recorded for some patients with electronic health records. Administrative claims databases and electronic health records can be linked to each other, or to device and disease registries,[4-7] birth certificates,[8, 9] or survey data to provide additional confounder information that is otherwise not available. However, supplemental information is available only for patients whose records can be found in both data sources and linked successfully.
With the rapid increase in database linkage and in the use of electronic health records for comparative effectiveness and safety research, many researchers now face partial missingness of confounder information. Methods that can handle missing data have been described.[11-13] Here, we discuss and compare several analytic approaches to handling partially missing confounder data in studies that use electronic healthcare databases. We used the relation between non-steroidal anti-inflammatory drugs (NSAIDs) and upper gastrointestinal bleeding (UGIB) as our example for two reasons. First, the relation is well known: randomized trials reported a 40%–60% lower risk of UGIB for selective cyclo-oxygenase-2 inhibitors (coxibs) compared with traditional NSAIDs (tNSAIDs).[14, 15] Second, severe confounding is expected in observational studies because coxibs are likely to be preferentially given to patients who have a higher risk of UGIB.
Table 1 shows the distribution of baseline characteristics of initiators of coxibs and tNSAIDs ascertained during the 12-month period before the first NSAID prescription. The crude OR of UGIB for coxib initiators versus tNSAID initiators was 1.50 (95%CI: 0.98, 2.28). The OR was 1.04 (0.68, 1.59) after adjustment for age and sex, 0.98 (0.63, 1.52) upon further adjustment for calendar year of treatment initiation, and 0.84 (0.54, 1.31) after further adjustment for measures of healthcare utilization. When we further adjusted for all remaining confounders in X, the OR was 0.81 (0.52, 1.27) for the entire study cohort, 0.64 (0.38, 1.07) for the 78% of patients in the cohort with complete information on all three lifestyle variables in L, and 1.93 (0.78, 4.74) for patients with missing values for any of the three lifestyle variables.
Table 1. Baseline characteristics of initiators of selective cyclo-oxygenase-2 inhibitors (coxibs) or non-selective (traditional) non-steroidal anti-inflammatory drugs (tNSAIDs) ascertained during the 12-month period before the first NSAID prescription
|Characteristics||No missing supplemental confounder data*: coxib initiators (n = 33 693)||No missing supplemental confounder data*: tNSAID initiators (n = 320 733)||Missing supplemental confounder data*: coxib initiators (n = 9876)||Missing supplemental confounder data*: tNSAID initiators (n = 90 883)|
|Age (years)|| |
|Calendar year of treatment initiation|| |
|No. of distinct drugs in the prior year|| |
|No. of outpatient visits in the prior year|| |
|Hospitalized in the prior year||9.8||8.1||7.2||6.1|
|Charlson comorbidity score ≥1||41.4||27.9||33.5||20.4|
|Prior use of|| |
|Diagnosis of|| || || || |
|Peptic ulcer disease||0.4||0.1||0.3||0.1|
|Congestive heart failure||2.9||1.1||2.5||0.9|
|Coronary artery disease||15.9||8.3||10.1||4.3|
|Alcohol consumption (drinks/week)|| |
|Body mass index (kg/m2)|| |
Table 2 shows the results from different analytic approaches to deal with missing confounder data. The adjusted ORs were 0.65 and 0.67 for the unweighted and IP-weighted complete-case analyses, respectively. In the IP-weighted analysis, the weight had a mean of 1.28 (standard deviation 0.15) and ranged from 1.04 to 2.49. The adjusted ORs ranged between 0.80 and 0.83 for the imputation methods. The 95%CIs from the different methods were overlapping; the 95%CI for any estimate included all other point estimates.
Table 2. Odds ratios of upper gastrointestinal bleeding during the first 180 days following initiation of selective cyclo-oxygenase-2 inhibitors versus non-selective non-steroidal anti-inflammatory drugs, by different analytic approaches to incorporate supplemental confounder data available in a subset of the study cohort
|Analytic methods||Number of patients included in the analysis||Adjusted odds ratio* (95% confidence interval)||Standard error of log odds ratio|
|Complete-case analysis; unweighted||354 426||0.65 (0.39, 1.09)||0.27|
|Complete-case analysis; inverse probability weighted||354 426 (outcome/PS model) ||0.67 (0.38, 1.16)||0.28|
|Missing-category approach||455 185||0.81 (0.51, 1.26)||0.23|
|Missing-indicator approach||455 185||0.80 (0.51, 1.25)||0.23|
|Single imputation||455 185||0.83 (0.53, 1.30)||0.23|
|Multiple imputation||455 185||0.82 (0.52, 1.29)||0.23|
|PS calibration||455 185 (error-prone PS model); 300 000 (gold-standard PS model)||0.80 (0.50, 1.27)||0.24|
Results from all approaches did not materially change when the PS was included as a continuous variable instead of deciles in the outcome model (as was necessary for the PS calibration approach). The c-statistic for the PS model was around 0.80 for all analyses, and the covariates were overall well balanced within PS strata (data not shown).
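As a reading aid for Table 2, the short sketch below shows how a standard error of the log odds ratio relates to a 95% confidence interval. It assumes symmetric Wald-type intervals on the log scale, which the text does not state explicitly; the function names are illustrative. Under that assumption, the values recovered from the reported intervals should agree with the tabulated standard errors to within rounding.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Approximate SE of log(OR), assuming a Wald-type CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def ci_from_se(or_point, se, z=1.96):
    """Rebuild a Wald-type CI from a point estimate and the SE of log(OR)."""
    log_or = math.log(or_point)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Point estimates and 95% CIs copied from Table 2.
table2 = {
    "Complete-case; unweighted": (0.65, 0.39, 1.09),
    "Complete-case; IP weighted": (0.67, 0.38, 1.16),
    "Multiple imputation": (0.82, 0.52, 1.29),
}

for method, (or_point, lower, upper) in table2.items():
    print(f"{method}: SE of log OR is about {se_from_ci(lower, upper):.2f}")
```

The smaller standard errors for the methods that retain all 455 185 patients reflect the larger analytic sample rather than any difference in validity.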
We have reviewed and compared several approaches to deal with partially missing confounder information in electronic healthcare databases. We used the NSAID–UGIB example to illustrate their application to comparative effectiveness and safety research of therapeutics. All these methods require the assumptions of no unmeasured confounding for the effect of treatment on the outcome and no misspecification of the outcome and PS models.
The missing-category/indicator approach and single imputation by the most common category require additional assumptions that are generally implausible. In essence, all of these methods assume that patients with missing information on a given variable are unconditionally exchangeable and can be grouped together for analysis. Single imputation by the most common category goes a step further and assumes that patients with missing data are comparable not only with each other but also with patients who have a certain (often arbitrarily chosen) covariate value. Although these methods are easy to implement, they have been shown to produce biased estimates even when patients with and without missing data are unconditionally exchangeable (i.e., data missing completely at random).[11, 38-40]
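To make the grouping assumptions concrete, the following minimal sketch shows how the three simple approaches might recode a covariate before it enters the outcome or PS model. The data values, function names, and the choice of fill value are hypothetical and are not taken from the study.

```python
from collections import Counter

# Hypothetical records; None marks a missing value.
smoking = ["never", "current", None, "never", "former", None, "never"]
bmi = [27.5, None, 31.0, 22.4]

def missing_category(values, label="missing"):
    """Treat missingness as its own level of a categorical covariate."""
    return [label if v is None else v for v in values]

def most_common_imputation(values):
    """Replace each missing value with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]

def missing_indicator(values, fill=0.0):
    """Fill missing values with a constant and add a 0/1 missingness flag."""
    filled = [fill if v is None else v for v in values]
    flags = [1 if v is None else 0 for v in values]
    return filled, flags

print(missing_category(smoking))
print(most_common_imputation(smoking))
print(missing_indicator(bmi))
```

In each case, every patient with a missing value receives the same recoded value, which is exactly the exchangeability assumption described above.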
Multiple imputation requires that missingness be unassociated with the outcome conditional on the measured confounders or the corresponding PS (i.e., data missing at random) and that the imputation model for each covariate with missing data be correctly specified. The approach has been shown to provide more valid estimates than the missing-indicator approach and single imputation when these assumptions hold.[11, 34, 39-42] A recent study that used The Health Improvement Network database found that patients with missing information on smoking, alcohol consumption, weight, or height differed systematically from other patients in terms of comorbidities such as cardiovascular disease and chronic obstructive pulmonary disease. Our estimate from multiple imputation would be incorrect if missingness were associated with other prognostic factors that were not included in the analysis. We used a version of multiple imputation that does not require the often unrealistic assumption of joint multivariate normality.[30, 31]
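For readers unfamiliar with how results from several imputed datasets are combined, the sketch below applies Rubin's rules to log odds ratios. It is a generic illustration: the input values are hypothetical, and the text does not state how many imputations were used or how pooling was implemented in this study.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, standard_errors):
    """Pool per-imputation estimates (e.g., log ORs) with Rubin's rules.

    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in standard_errors])  # average within-imputation variance
    between = variance(estimates)                       # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)

# Hypothetical log ORs and standard errors from five imputed datasets.
log_ors = [-0.21, -0.19, -0.20, -0.18, -0.22]
ses = [0.23, 0.23, 0.24, 0.23, 0.23]

pooled, se = pool_rubin(log_ors, ses)
lower = math.exp(pooled - 1.96 * se)
upper = math.exp(pooled + 1.96 * se)
print(f"Pooled OR {math.exp(pooled):.2f} ({lower:.2f}, {upper:.2f})")
```

The between-imputation term is what distinguishes multiple from single imputation: it carries the uncertainty about the imputed values into the final standard error.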
The PS calibration approach is valid under the assumptions that there is an appropriate internal or external validation sample, the linear measurement error model is correctly specified, and the error-prone PS is an appropriate surrogate for the gold-standard PS.[35, 44] The last assumption may be violated if confounding by the unmeasured or partially measured confounders acts in the opposite direction to confounding by the measured covariates. This approach may be combined with single imputation of the gold-standard PS based on the parameters of the measurement error model, which removes the need to specify the outcome model because one can instead match or stratify on the imputed gold-standard PS.
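The role of the linear measurement error model can be seen in a regression-calibration sketch. The notation below is an illustrative assumption rather than the paper's own: A is treatment, PS_ep is the error-prone PS estimated in the full cohort, and PS_gs is the gold-standard PS available only in the validation sample.

```latex
% Illustrative notation (assumed, not taken from the paper):
% A = treatment, PS_{ep} = error-prone PS, PS_{gs} = gold-standard PS
\begin{align*}
\text{Naive outcome model (full cohort):}\quad
  & \operatorname{logit} P(Y=1 \mid A, PS_{ep})
    = \beta_0^{*} + \beta_A^{*} A + \beta_{PS}^{*}\, PS_{ep} \\
\text{Measurement error model (validation sample):}\quad
  & E[PS_{gs} \mid A, PS_{ep}]
    = \lambda_0 + \lambda_A A + \lambda_{PS}\, PS_{ep} \\
\text{Target outcome model:}\quad
  & \operatorname{logit} P(Y=1 \mid A, PS_{gs})
    = \beta_0 + \beta_A A + \beta_{PS}\, PS_{gs} \\
\text{Substituting the second line into the third:}\quad
  & \beta_{PS}^{*} \approx \beta_{PS}\,\lambda_{PS},
    \qquad \beta_A^{*} \approx \beta_A + \beta_{PS}\,\lambda_A \\
\text{Calibrated treatment effect:}\quad
  & \hat\beta_A = \hat\beta_A^{*}
    - \frac{\hat\lambda_A\, \hat\beta_{PS}^{*}}{\hat\lambda_{PS}}
\end{align*}
```

Under this formulation the correction divides by the estimated slope of the gold-standard PS on the error-prone PS, so it becomes unstable, or changes sign, when that slope is close to zero or negative. This is one way to see why the surrogacy assumption matters.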
Yet, despite all these differences in the conditions required for valid estimates, we found only small differences across different imputation methods. The reasons might be that the proportion of missingness was relatively low and that the three variables with missing values might not be strong confounders after conditioning on other measured variables. Indeed, the OR adjusted for all potential confounders available in the entire study cohort (0.81) was similar to the ORs that were further adjusted for the three lifestyle variables by using different imputation approaches (0.80–0.83).
Like the imputation methods, the IP-weighted complete-case analysis estimates the effect in the entire study population.[25, 46] It is valid under the additional assumption that the weight models are correctly specified. The unweighted complete-case analysis estimates the effect only among patients without missing values; its results cannot be applied to the entire study population unless the data are missing completely at random. The unweighted complete-case analysis has been shown to produce more biased estimates than other approaches, such as multiple imputation.[11, 39, 40]
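The idea behind IP weighting of complete cases can be sketched with a deliberately simple weight model. Here the probability of having complete data is estimated as the observed proportion within strata of a fully measured covariate; a regression model would normally be used instead, and the strata, data, and function name below are hypothetical.

```python
from collections import defaultdict

def complete_case_weights(strata, is_complete):
    """Return 1 / P(complete | stratum) for complete cases and None otherwise.

    P(complete | stratum) is estimated by the observed stratum proportion.
    """
    totals = defaultdict(int)
    completes = defaultdict(int)
    for s, c in zip(strata, is_complete):
        totals[s] += 1
        completes[s] += int(c)
    return [totals[s] / completes[s] if c else None
            for s, c in zip(strata, is_complete)]

# Hypothetical cohort: completeness is more common in one stratum.
strata = ["young"] * 4 + ["old"] * 4
is_complete = [True, True, True, False, True, True, False, False]

weights = complete_case_weights(strata, is_complete)
observed = [w for w in weights if w is not None]
print(observed)       # weights carried by the complete cases
print(sum(observed))  # equals the size of the full cohort
```

Each complete case stands in for itself and for similar patients with missing data, so the weights sum to the size of the full cohort. By the same logic, a mean weight of 1.28 is what one would expect when about 78% of the cohort has complete data (1/0.78 ≈ 1.28).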
The point estimates from the complete-case analyses and the imputation methods were somewhat different, which may be due to random variability (wide and overlapping 95%CIs) or to real differences between patients with complete and incomplete confounder information beyond the information recorded in the database. For example, general practitioners who record patient lifestyle factors, and patients who respond to these questions, might have certain unmeasured characteristics that are associated with the outcome risk. Also, the effect of NSAIDs on UGIB might be modified by certain patient characteristics for which missingness is a proxy.
In conclusion, a number of methods are available to deal with missing data in comparative effectiveness and safety studies of therapeutics that analyze electronic healthcare databases. Researchers need to be aware of the underlying assumptions of various methods when choosing among them.