Pragmatic estimates of the proportion of pediatric inpatients exposed to specific medications in the USA

Purpose To provide pragmatic national estimates of the proportion of hospitalized pediatric patients exposed to specific drugs in the USA. Methods We used Premier Perspective Database and the Pediatric Health Information System data including specific drug exposures of 1.15 million inpatients <18 years old in 411 general and 52 children’s hospitals throughout the USA in 2006, extrapolating this information into the probability-based Kids’ Inpatient Database, which has demographic and clinical characteristics but no drug exposure data. We used a multivariable stratified resampling (MSR) technique to estimate the proportion of drug exposure for the 700 most commonly used drugs and performed additional stability and sensitivity analyses for 19 drugs. Results The estimated proportion of pediatric inpatients exposed to specific drugs in 2006 ranged from high levels such as that of acetaminophen (17.36; 95%CI: 17.32, 17.41) to rare exposures such as bosentan (0.0018; 95%CI: 0.0013, 0.0023). Additional analyses for 19 drugs revealed that the MSR estimates were close to estimates generated by multivariable multiple imputation, with a maximum absolute difference of 0.03 for acetaminophen (17.36 vs. 17.33) and famotidine (1.90 vs. 1.93), and that even with 50% of the hospitals removed at random, the proportion estimates did not vary by more than 2.5-fold at the upper 97.5 percentile. Conclusions These pragmatic national estimates of the proportion of pediatric inpatient drug exposures, generated using an MSR technique, provide a context for interpretation of drug-related adverse event reports and prioritization of pediatric pharmacology research. © 2013 The Authors. Pharmacoepidemiology and Drug Safety published by John Wiley & Sons, Ltd.

clinical information, such as age, length of stay (LOS), diagnosis (recorded as an All Patient Refined Diagnostic Related Group [APR-DRG]), and hospital type (children's or general hospitals), characteristics that we have found are associated with the likelihood of inpatient drug exposures. 13 One can then conceptualize the extrapolation in an analytic framework where each of the three data sources is incomplete, but together they are complementary (Supplemental Figure A).
If the KID is conceptualized as a dataset with missing drug exposure data, the mechanism underlying the missingness is known: every record in KID lacks drug exposure data. Three things are not completely understood: how the patients and patterns of care recorded in KID differ from those in PHIS and Premier, both in terms of individual patient demographic and clinical characteristics and in terms of the hospitals (and consequently, physicians, healthcare teams, and practice patterns) sampled in these different sample frames; how these differences affect the likelihood that specific patients will be exposed to specific drugs; and whether these individual differences affect the population-level estimates of the average proportion exposed.
In this study, we used a multivariable stratified resampling (MSR) procedure, 14,15 based on the four stratification variables of age, LOS, hospital type, and APR-DRG, to generate national-level estimates of the proportion of pediatric inpatient exposure for 700 of the most commonly used drugs. Then, for 19 selected drugs, we assessed the stability of these estimates compared with a multivariable multiple imputation (MMI) procedure 16,17 and performed two sensitivity analyses to quantify the potential range of estimation error due to patient sampling error and hospital composition error.

METHODS
The Children's Hospital of Philadelphia's Institutional Review Board approved this study.

Data sources
We used three primary data sources. First, PHIS (Children's Hospital Association, Kansas City, KS) comprises administrative discharge data from children's hospitals for major metropolitan areas across the USA. Second, PPD (Premier, Inc, San Diego, CA) comprises data from a broad array of academic medical centers, community-based hospitals, and large multihospital systems. For this study, 40 hospitals in PHIS and 423 hospitals in the PPD in 2006 contained detailed pharmacy information for each day of the hospital stay. 11 Third, KID is a probability-based sample of inpatient pediatric admissions from all hospitals that provide data to AHRQ. KID employs a complex sampling scheme, enabling generation of national estimates. In the 2006 KID, 38 states participated, with 80% of pediatric hospitalizations randomly chosen from each participating hospital, except for "normal newborns," of whom 20% were randomly selected from each hospital. Of the KID hospitalizations, 7% are from children's and 93% from general hospitals. The KID only identifies discrete hospitalizations, not unique patients, and does not contain pharmacy information. 12 Together, PHIS and Premier constitute 19.9% of all pediatric hospitalizations for 2006.

Data management
We categorized PHIS and PPD records into children's hospitals and general hospitals. From PPD, records from two hospitals included in PHIS were omitted; 12 hospitals identified as children's hospitals and exhibiting demographics consistent with those observed in children's hospitals in the PHIS and KID databases were classified as children's hospitals; the remaining hospitals were classified as general hospitals. We implemented a standardized dictionary of generic drug entities, specified by 1227 distinct codes in PHIS and 1564 in PPD. After harmonizing terminology, PHIS had 1144 distinct codes and PPD 1337. 13

Comparison of PHIS and Premier samples to KID sample
We first described the demographic and clinical characteristics of the sample by calculating percentages (in KID, accounting for the survey design) and computed the standardized proportion differences between PHIS/Premier and KID. 18 We calculated the percentage of hospitalizations for each APR-DRG in all three databases and measured the difference between the children's hospitals in PHIS/PPD and in KID, doing likewise for general hospitals. We also assessed the proportion of hospitalizations in PHIS/PPD that had multivariable matches in KID, and vice versa.
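As an illustration of the comparison described above, the following minimal sketch computes a standardized difference between two proportions. It assumes one commonly used definition (the difference divided by the square root of the average of the two binomial variances); the exact formula from reference 18 is not given in this section, and the function name is hypothetical.

```python
import math


def standardized_proportion_difference(p_source: float, p_target: float) -> float:
    """Standardized difference between two proportions, each in [0, 1].

    Assumed definition (not confirmed by the source text):
    (p_source - p_target) / sqrt((p_source(1 - p_source) + p_target(1 - p_target)) / 2)
    """
    pooled_variance = (p_source * (1 - p_source) + p_target * (1 - p_target)) / 2
    if pooled_variance == 0:
        # Both proportions are exactly 0 or 1; the ratio is undefined unless they agree.
        if p_source == p_target:
            return 0.0
        return math.copysign(math.inf, p_source - p_target)
    return (p_source - p_target) / math.sqrt(pooled_variance)


# Illustrative values only: share of infants in a source sample versus a target sample.
print(standardized_proportion_difference(0.42, 0.40))
```

Under this definition, values near zero indicate that a characteristic is distributed similarly in the two samples.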

Multivariable stratified resampling
Stratified resampling was used to generate national estimates of drug exposure for 700 drugs (Supplementary Figure B), stratifying PHIS/Premier data and KID data into 7068 strata based on patient age (<1, 1-4, 5-12, and 13-17 years), LOS (1, 2-7, and >7 days), APR-DRG (315 observed), and hospital type (general and children's), with 74.83% of the strata present in both PHIS/Premier and KID. For each hospitalization in KID, we randomly sampled with replacement a corresponding stratified record from PHIS/Premier. 19 Within each matched pair, the KID record was updated with the drug exposure status from PHIS/Premier. We resampled 1000 times and examined the distribution of exposure for each of the 700 drugs. We generated national proportion exposed estimates by taking the average exposure (across the 1000 samples) weighted by the KID discharge weight.
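The resampling steps above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' SAS or Stata implementation. The record layout, the function name `msr_estimate`, and the choice to drop target records whose stratum has no donor in the source data are all assumptions made for the example; this section does not state how unmatched strata were handled.

```python
import random
from collections import defaultdict


def msr_estimate(source, target, n_resamples=1000, seed=0):
    """Stratified resampling sketch for one drug.

    source: iterable of (stratum, exposed) pairs, exposed being 0 or 1
            (stands in for PHIS/Premier records).
    target: iterable of (stratum, weight) pairs
            (stands in for KID records and their discharge weights).
    A stratum is any hashable key, e.g. (age_group, los_group, apr_drg, hospital_type).
    Returns (mean weighted proportion exposed, list of per-resample proportions).
    """
    rng = random.Random(seed)

    # Group donor exposure indicators by stratum.
    donors = defaultdict(list)
    for stratum, exposed in source:
        donors[stratum].append(exposed)

    # Assumption: target records without any donor in their stratum are dropped.
    matched = [(s, w) for s, w in target if s in donors]
    if not matched:
        raise ValueError("no target record has a matching source stratum")
    total_weight = sum(w for _, w in matched)

    estimates = []
    for _ in range(n_resamples):
        # For each target record, draw one donor with replacement from its stratum
        # and copy the donor's exposure status, weighted by the target weight.
        weighted_exposed = sum(w * rng.choice(donors[s]) for s, w in matched)
        estimates.append(weighted_exposed / total_weight)

    return sum(estimates) / len(estimates), estimates


# Toy data: two strata with different exposure rates in the source sample.
toy_source = [(("<1", "1", 640, "general"), 1), (("<1", "1", 640, "general"), 0),
              (("5-12", "2-7", 141, "children"), 0)]
toy_target = [(("<1", "1", 640, "general"), 5.0), (("5-12", "2-7", 141, "children"), 1.2)]
mean_estimate, draws = msr_estimate(toy_source, toy_target, n_resamples=200)
print(mean_estimate)
```

The spread of the per-resample proportions in `draws` corresponds to the distribution of exposure examined across resamples.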

Multivariable multiple imputation
For 19 drugs selected across the range of the proportions of patients exposed (Supplementary Figure C), we performed multiple imputation as a stability analysis 20 of the stratified resampling approach. We combined the three databases and imputed missing values for drug exposure in the KID database using PHIS and Premier records with no missing values regarding age, LOS, hospital type, APR-DRG, and drug exposure (<1% of the combined sample had missing values for any of the non-drug covariates). For each of the 19 drugs, we fit separate data augmentation models using Markov Chain Monte Carlo algorithms (which have been shown in simulations to perform adequately for imputing binary variables 17), fitting separate models for children's and general hospitals, and then combined the estimates. To obtain national estimates of drug exposure for the KID observations, we computed the mean probability of drug exposure using the KID sample weights and accounting for the KID survey structure to correctly estimate the variances, then compared the estimates generated by these techniques.

Sensitivity analysis regarding proportion of patients cared for at children's hospitals
Because children treated in children's and general hospitals differ in terms of demographic, clinical, and drug usage patterns, extrapolation methods must account for these differences to reflect the national case mix. 13 To assess the consequences of a source population that deviates from the national case mix, we performed sensitivity analyses for each of the 19 drugs (Supplemental Figure D). We drew a random subset of 100 000 records from PHIS/Premier, such that 7% of the patients were treated at children's hospitals and 93% at general hospitals (which is the percentage observed in KID), and considered this database a new "base case" sample. We then drew another random "modified" sample of 100 000 records from PHIS/Premier, systematically varying the percentage of patients treated at children's hospitals from 0% to 14%. The difference in the proportion of observed drug exposure between the "modified" sample and the "base case" sample was noted for each of the 19 drugs. We repeated this procedure 1000 times, resampling with replacement, for each percentage of children's hospitals. We evaluated the effect of the sample modifications by comparing the average differences between the modified and base case estimates.
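The case-mix sensitivity analysis described above can be sketched as follows. This is a simplified illustration under stated assumptions: each record is reduced to a 0/1 exposure indicator for a single drug, records are drawn with replacement, and the function names and parameters are hypothetical rather than taken from the authors' code.

```python
import random


def draw_mixed_sample(children_records, general_records, n, children_fraction, rng):
    """Draw n exposure indicators with a fixed share from children's hospitals."""
    n_children = round(n * children_fraction)
    sample = rng.choices(children_records, k=n_children)
    sample += rng.choices(general_records, k=n - n_children)
    return sample


def mix_sensitivity(children_records, general_records, fractions,
                    base_fraction=0.07, n=100_000, n_repeats=1000, seed=0):
    """Average difference in proportion exposed (modified minus base case).

    Returns a dict mapping each children's hospital fraction to the mean
    difference across n_repeats independent pairs of samples.
    """
    rng = random.Random(seed)
    results = {}
    for fraction in fractions:
        differences = []
        for _ in range(n_repeats):
            base = draw_mixed_sample(children_records, general_records, n, base_fraction, rng)
            modified = draw_mixed_sample(children_records, general_records, n, fraction, rng)
            differences.append(sum(modified) / n - sum(base) / n)
        results[fraction] = sum(differences) / len(differences)
    return results


# Toy data: a drug used more often in children's hospitals than in general hospitals.
toy_children = [1] * 30 + [0] * 70
toy_general = [1] * 10 + [0] * 90
print(mix_sensitivity(toy_children, toy_general, [0.0, 0.07, 0.14], n=500, n_repeats=20))
```

The defaults mirror the sample size and number of repetitions given in the text; the toy call uses much smaller values so that it runs quickly.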

Sensitivity analysis of specific hospitals' influence on proportion exposed estimates
Hospitals vary in their use of specific drugs, raising concern that over-sampling or under-sampling of high-utilization hospitals could distort national averages. 13 To assess the magnitude of this potential distortion, we developed and implemented a "hospital knockout" methodology (Supplemental Figure E). Specifically, we first randomly eliminated a fixed percentage of the hospitals (10%, 25%, or 50%) in PHIS/Premier. We then matched each KID record to a record from PHIS/Premier and replaced the missing drug exposure data in the KID record with the data from PHIS/Premier. We resampled with replacement 1000 times, as detailed earlier, to generate national estimates of the proportion of inpatient drug exposure for each of the 19 drugs. For each drug, we examined how the percentage of hospitals "knocked out" from PHIS/Premier affected drug exposure estimates.
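The knockout step itself can be illustrated with a short sketch. The record layout (hospital identifier, stratum, exposure indicator), the function name, and the rounding of the number of removed hospitals are assumptions for illustration; the reduced source data would then go through the stratified matching and resampling procedure described earlier.

```python
import random


def knock_out_hospitals(records, fraction_removed, rng):
    """Remove all records from a random subset of hospitals.

    records: list of (hospital_id, stratum, exposed) tuples (hypothetical layout).
    fraction_removed: share of distinct hospitals to eliminate, e.g. 0.10, 0.25, 0.50.
    Returns the records belonging to the hospitals that remain.
    """
    hospital_ids = sorted({hospital_id for hospital_id, _, _ in records})
    # Assumption: the number of hospitals removed is rounded to the nearest integer.
    n_remove = round(len(hospital_ids) * fraction_removed)
    removed = set(rng.sample(hospital_ids, n_remove))
    return [record for record in records if record[0] not in removed]


# Toy data: four hospitals with two records each.
toy_records = [(h, "stratum", e) for h in ("H1", "H2", "H3", "H4") for e in (0, 1)]
rng = random.Random(0)
for fraction in (0.10, 0.25, 0.50):
    remaining = knock_out_hospitals(toy_records, fraction, rng)
    print(fraction, len({r[0] for r in remaining}))
```

Removing whole hospitals, rather than individual records, is what lets the analysis probe the influence of hospital-level practice patterns on the estimates.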

Statistical software
All data management and analyses were conducted using SAS 9.3 (SAS Institute Inc., Cary, NC) and Stata 12.1 (StataCorp, College Station, TX).

RESULTS
Compared with the KID children's and general hospitals, characteristics of patients in our sample were qualitatively similar in terms of demographic characteristics such as age, gender, and race, hospital location and teaching status, and insurance payer, and clinical characteristics such as LOS and disposition (Table 1). In terms of medical conditions, the median difference in the proportion of patients in each of 315 APR-DRG groups in the combined databases and the KID was 0.01 (maximum, 0.48) in children's hospitals and 0.01 (maximum, 0.69) in general hospitals (Figure 1). In terms of the degree to which the samples had similar subjects (defined by the four stratification variables of age, LOS, APR-DRG, and

National estimates of drug exposure
We used the MSR procedure, following the steps described in the Methods section, to generate national estimates for the 700 drugs with the highest levels of exposure using the combined PHIS and Premier datasets, but corresponding to the KID database case mix. Estimates of exposure are reported for 25 leading drugs in Table 2 and for all 700 drugs in Supplemental Table A.

Stability analysis for alternative estimation procedures
To perform a stability analysis and compare the estimates generated by the MSR technique to an alternative procedure, we selected 19 drugs for further evaluation, ranging from high to low levels of exposure, and from either equivalent levels of exposure in children's and general hospitals or higher levels in either children's or general hospitals (Table 3, columns labeled "Study sample"). We then performed MMI to generate the estimated weighted national proportions of exposure to each of the 19 drugs, reflecting the patient and hospital case mix observed in the KID database (Table 3, column labeled "Multiple imputation"). We then compared the estimates of the MSR (Table 3, column labeled "Stratified resampling") and the MMI procedures both in absolute and relative terms (Table 3, columns at right labeled "Absolute difference" and "Ratio") for the 19 drugs. The absolute differences in the proportion of patients exposed (MSR − MMI) across the 19 drugs ranged from −0.03 to 0.03 and, in general, decreased in magnitude as the estimated proportion of drug exposure decreased. The relative difference of the results of the two estimation techniques (stratified resampling estimate divided by multiple imputation estimate) ranged from 0.87 to 1.14.
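The two comparison measures can be reproduced with simple arithmetic. The sketch below uses the acetaminophen and famotidine estimates quoted in the abstract, assuming that the first value of each pair is the stratified resampling estimate and the second the multiple imputation estimate; the function name is hypothetical.

```python
def compare_estimates(msr: float, mmi: float) -> tuple[float, float]:
    """Return (absolute difference, ratio) of two percentage estimates."""
    return msr - mmi, msr / mmi


# Estimates (percent exposed) quoted in the abstract; ordering within each pair is assumed.
examples = {"acetaminophen": (17.36, 17.33), "famotidine": (1.90, 1.93)}
for drug, (msr, mmi) in examples.items():
    difference, ratio = compare_estimates(msr, mmi)
    print(f"{drug}: difference {difference:+.2f}, ratio {ratio:.3f}")
```

The same absolute difference is a far smaller relative difference for a commonly used drug than for a rarely used one, which is why both measures are reported.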

Sensitivity analyses of impact of patient and hospital composition of sample
A major concern about extrapolating national estimates from a given large sample of patients and hospitals is that a subpopulation may be over-represented or under-represented, causing a distortion. We sought to assess its potential magnitude with two additional assessments using stratified resampling. First, we examined the impact of having a smaller or larger proportion of patients treated in children's hospitals in the sample, compared with the 7% proportion observed in the KID. The absolute median differences in the estimated proportion of drug exposure, across samples ranging from 0% to 14% children's hospital patients, were again largest for the drugs with the higher levels of exposure, with differences ranging from 1.7 to 0.0002 (Figure 2, left panel). The ratios of the estimates, by contrast, were in general larger for the lower exposure drugs, ranging from 1.0 to 2.3. Some estimates of drugs with lower exposure proportions were stable across the range of children's hospital patients (such as tetracycline, estradiol, and rosiglitazone; Figure 2, right panel).
Second, we examined the impact of the inclusion or exclusion of specific hospitals on the stability of the estimates by employing a "hospital knockout" procedure. We observed substantial stability of the estimates with removal of 10%, 25%, and 50% of the hospitals, with absolute differences less than 0.5 and ratios of estimates of less than 1.5 (Figure 3).

DISCUSSION
Our study, using data from the PHIS and Premier databases and extrapolating to the KID sample framework, provides national-level estimates of the proportion of pediatric inpatients exposed to the 700 most commonly used drugs. Similar to other estimates of the 10 most common drugs, 6 acetaminophen is the drug to which the largest proportion of inpatients are exposed (17.36%), while lidocaine, ampicillin, morphine, fentanyl, ceftriaxone, ibuprofen, gentamicin, and albuterol are all on both top 10 lists, although with modestly different point estimates of the proportion of patients exposed. Our estimation procedure was also applied to far less commonly used drugs, such as halothane, with an estimated proportion exposed of 0.0019% (95%CI: 0.0014%, 0.0024%).
Are our methods capable of providing reasonably accurate national-level estimates of these exposures in the USA? In this study, we approached this question from three directions. First, we demonstrated the similarity of the PHIS and Premier patients, regarding demographic and clinical characteristics, to those in the probability-based KID database. Second, we compared two different methods of extrapolating the drug exposure data about numerous stratified subgroups of patients from the PHIS and Premier records into the corresponding subgroups in the KID database and found that multivariable regression-based multiple imputation yielded estimates that were very close to the multivariable stratified resampling estimates. Third, we performed sensitivity analyses to determine the susceptibility of the estimates to manipulation of the source data (PHIS/Premier) in ways often cited as concerns for generating national pediatric estimates, namely changing the ratio of children's versus general hospitals and randomly eliminating up to 50% of the source hospitals. We observed that drug exposure estimates were more concordant when the children's hospital to general hospital ratio reflected the national ratio and were remarkably stable with random elimination of hospitals, lessening the concern that several outlier hospitals may dramatically affect national-level estimates.
These sensitivity analyses address the cardinal limitations of this study, namely that there may be differences between the hospitals that do and do not contribute to the PHIS and Premier dataset, and that such differences might bias the national estimates obtained by the procedures we have used. Although our additional analyses demonstrate the stability of the exposure estimates despite systematic alteration of the PHIS and Premier source data, without a comparison set of "gold standard" exposure measurements, the accuracy of our estimates will remain somewhat uncertain. We also examined only 19 different drugs in detail, which although selected to span the range of incidence and divergence of usage, may not behave the way that other drugs might.
In light of these limitations, how might these estimates of the proportion of pediatric inpatients exposed to specific drugs be appropriately used? For pediatric pharmacovigilance, these estimates help establish a range of the proportion of inpatients exposed to specific drugs, against which to examine the number of adverse drug events, and thus help indicate whether to further investigate drug safety. 4,6 For prioritizing pediatric drug research, the proportion of patients exposed can provide one parameter for prioritization. To be useful, the estimates must be sufficiently accurate and precise at the population level to aid decision makers, who will have to determine whether the estimates are "reasonably accurate" for the purposes of the decisions they confront. This standard is different from what would be required to draw inferences at the individual level based on drug exposure extrapolations using either the imputation or the resampling approaches; population-level estimates, benefiting from the statistical phenomenon underlying the central limit theorem, are more robust than individual-level estimates.
Our findings also underscore two core aspects of pediatric inpatient drug usage. First, because patients cared for at children's versus general hospitals differ substantially in terms of ages, conditions, and patterns of drug usage, 13 and because we showed here that national-level estimates of drug exposure vary depending upon the ratio of general-to-children's hospitals in the sample, these estimates should be generated using a nationally representative ratio. Second, as indicated by the "hospital knockout" sensitivity analysis, some drugs are used in a consistent manner across hospitals (such as ampicillin and ganciclovir), whereas others display greater heterogeneity of use (such as prazosin and drotrecogin alfa [which was in use in 2006 prior to being recently removed from the market]). This evidence of hospital-level variation in the use of specific drugs points to potential areas of research and quality improvement but has a minimal impact on the national-level estimates, which center around the average usage across all hospitals.
We believe that the pediatric inpatient drug exposure estimates provided by this analysis, as well as the methods used, are useful for pediatric pharmacovigilance and drug research prioritization, both for generating national-level exposure estimates and for testing the sensitivity of these estimates to potential limitations of the source data.

CONFLICT OF INTEREST
The authors declare that they have no conflicts of interest.

KEY POINTS
• National-level epidemiological studies of adverse drug events require accurate estimates of drug exposure rates.
• Our study demonstrates the use of multivariable stratified resampling techniques to generate exposure estimates for 700 drugs for the United States pediatric inpatient population, and shows that these estimates are resistant to a variety of potential problems with the underlying data.
• These national exposure estimates can be used to inform further research into adverse drug events and drug-drug interactions among hospitalized children.

SUPPORTING INFORMATION
Additional supporting information may be found in the online version of this article.
Table A. Estimated percentage of patients exposed to the 700 most commonly used medications, sorted by percentage
Table B. Estimated percentage of patients exposed to the 700 most commonly used medications, sorted alphabetically