Postpartum readmission for hypertension and pre‐eclampsia: development and validation of a predictive model

Objective: To develop a model for predicting postpartum readmission for hypertension and pre-eclampsia at delivery discharge and to assess external validation or model transportability across clinical sites.

Results: The final model showed moderate calibration (intercept −0.153, slope 0.960, Emax 0.042) and provided superior net benefit at clinical decision-making thresholds between 1% and 7% for interventions preventing readmission. An online calculator is provided.

Conclusions: Postpartum readmission for hypertension and pre-eclampsia may be accurately predicted, but further model validation is needed. Model updating using data from multiple sites will be needed before use across clinical settings.


| INTRODUCTION
Hypertension and pre-eclampsia are leading causes of severe maternal morbidity and mortality, and together with other cardiovascular diseases are the primary cause of maternal mortality in the USA. 1 Complications due to hypertension and pre-eclampsia occur antepartum, during labour and delivery, and in the postpartum period. 2 Postpartum hypertension is one of the most common reasons for readmission (25%), 1 but the rate of readmission for this indication among postpartum individuals is low (<1%). 3-6 Most readmitted individuals, however, do not necessarily have a prior history of chronic hypertension or pre-eclampsia. 7 Predicting which individuals will be readmitted for postpartum hypertension and pre-eclampsia at delivery discharge remains challenging. 8-11 After further study, a prediction model could be used to guide timing of hospital discharge after delivery, initiation of antihypertensive medication, intensity of outpatient blood pressure surveillance, and scheduling of postpartum care. 12 A useful model may include variables that are readily available in the electronic health record (EHR). 13 Recently, predictive models have been developed for low frequency postpartum complications, 14 including postpartum haemorrhage, 15 infectious morbidity, 16 venous thromboembolism 17 and cardiovascular diseases. 18,19 The objective of the current study was to develop and internally validate a predictive model for postpartum readmission for hypertension and pre-eclampsia at delivery discharge.
Additionally, although single-centre studies can be convenient, they can result in inadequate model performance during external validation. Therefore, our second objective was to assess external validation or model transportability across different sites.

| Setting and participants
This analysis used data from two tertiary care health systems: the University of North Carolina (Chapel Hill, NC), from 1 September 2014 to 30 September 2015, and Christiana Care (Newark, DE), from 16 August 2017 to 15 August 2019, as previously described at both sites. 8,20 The time periods were selected based on the availability of datasets assessing readmission for hypertension and pre-eclampsia at the two study sites. Standard of care at both sites included providing individuals with routine postpartum care as recommended by the American College of Obstetricians and Gynecologists and counselling at delivery discharge about symptoms of pre-eclampsia and elevated blood pressure, with parameters for when to seek care. 21 At the University of North Carolina (hereafter referred to as the Southern health system), data from an Epic® EHR platform were retrospectively accessed from the Carolina Data Warehouse for Health (CDW-H) using the National Centers for Biomedical Computing's Informatics for Integrating Biology and the Bedside (i2b2) software. 22,23 At Christiana Care (hereafter referred to as the Northeastern health system), data from a Cerner® EHR platform were accessed from a prospective obstetrical database with established internal and external validity checks. 8,24 Inclusion criteria at both sites were pregnant individuals >18 years of age who delivered an infant at >21 weeks' gestation; individuals were included regardless of chronic comorbid conditions, including hypertension. For those individuals with >1 delivery during the study period, only the first delivery was included. This study was approved by the Institutional Review Boards at both the University of North Carolina and Christiana Care. We followed reporting guidelines as proposed in the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Statement. 25 Patients were not involved in the development of the research.

| Outcome
The primary outcome was postpartum readmission for the primary indication of either hypertension or pre-eclampsia within 6 weeks of delivery. Diagnostic criteria for pre-eclampsia at both sites were based on current guidelines of the American College of Obstetricians and Gynecologists. 26 A readmission was defined as an unplanned admission within 6 weeks after the index delivery hospital discharge date, consistent with 2015 measurement guidelines from the Centers for Medicare & Medicaid Services. 27 At the Southern health system, readmissions were identified using International Classification of Diseases, Ninth Revision (ICD-9-CM) codes for hypertension (642.0, 642.1, 642.2, 401), gestational hypertension (642.3) and pre-eclampsia (642.4, 642.5), and were then confirmed by manual chart review using a structured data collection form. 20 At the Northeastern health system, readmissions were identified by either postpartum magnesium administration for seizure prophylaxis or postpartum readmission for hypertension management. 8 That study did not include a core outcome set.
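The code-based screening step described for the Southern health system can be illustrated with a short sketch. This is not the authors' extraction code: the function names and the prefix-matching rule (used so that five-digit codes such as 642.41 are also caught) are assumptions made for illustration, and any flagged readmission would still need manual chart review.

```python
# Illustrative only: code prefixes are taken from the text; the matching rule is assumed.
ICD9_GROUPS = {
    "hypertension": ("642.0", "642.1", "642.2", "401"),
    "gestational hypertension": ("642.3",),
    "pre-eclampsia": ("642.4", "642.5"),
}


def classify_icd9(code):
    """Return the diagnosis group for an ICD-9-CM code, or None if it matches no group."""
    code = code.strip()
    for group, prefixes in ICD9_GROUPS.items():
        if any(code.startswith(prefix) for prefix in prefixes):
            return group
    return None


def flag_readmission(codes):
    """Return the sorted list of diagnosis groups matched by any code on a readmission."""
    return sorted({g for g in map(classify_icd9, codes) if g is not None})
```

An empty list from `flag_readmission` would mean the readmission carried none of the screened codes.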

| Sample size
Sample size calculations for prediction models aim to limit overfitting and to determine how many parameters can be fitted while still yielding precise predictions. The minimum sample size required for the development of a multivariable prediction model was calculated a priori to ensure an adequate number of individuals and outcome events relative to the number of predictor parameters. 28 The available sample sizes for each site were adequate for modelling up to 20 parameters or degrees of freedom. For the Southern site, using a maximum Cox-Snell R² statistic of 0.04 and a global shrinkage factor of 0.90, a minimum sample size of 9783 individuals with at least 30 events was required to consider up to 20 parameters for model fitting. For the Northeastern site, using a maximum Cox-Snell R² statistic of 0.12 and a global shrinkage factor of 0.90, a minimum sample size of 3070 individuals with at least 37 events was required to consider up to 20 parameters for model fitting.
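As a rough check on these inputs, the maximum Cox-Snell R² for a binary outcome depends only on the outcome proportion. The minimum sample size for a target shrinkage factor then follows from the anticipated R², the number of parameters and the shrinkage. The sketch below uses the closed-form expressions common in the prediction-model sample size literature. The function names are hypothetical. Because the anticipated R² the authors entered is not stated, the second function is illustrative and is not expected to reproduce 9783 or 3070.

```python
import math


def max_cox_snell_r2(events, n):
    """Maximum Cox-Snell R^2 attainable for a binary outcome with this event count."""
    phi = events / n
    # Log-likelihood of the intercept-only (null) model
    ll_null = events * math.log(phi) + (n - events) * math.log(1 - phi)
    return 1 - math.exp(2 * ll_null / n)


def min_n_for_shrinkage(n_params, r2_cs, shrinkage=0.90):
    """Smallest n giving the target global shrinkage for an anticipated Cox-Snell R^2."""
    return math.ceil(n_params / ((shrinkage - 1) * math.log(1 - r2_cs / shrinkage)))


print(round(max_cox_snell_r2(30, 9783), 2))  # Southern site inputs
print(round(max_cox_snell_r2(37, 3070), 2))  # Northeastern site inputs
```

With 30 events in 9783 individuals and 37 events in 3070, the first function returns values that round to 0.04 and 0.12, matching the maxima quoted above.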

| Covariate selection
Using our sample size guidance, 10 candidate predictors were selected based on expert opinion, prior consensus statements 29 and previous studies assessing predictors of readmission for hypertension and pre-eclampsia at both sites. 8,20 Additional consideration was given to variables that were accessible as discrete data fields within the EHR to allow for automated data retrieval. 13 These variables included sociodemographic characteristics (i.e. age), clinical characteristics (i.e. chronic hypertension, gestational hypertension, pre-eclampsia at delivery admission or prior to discharge), obstetrical characteristics (i.e. parity, gestational age at delivery, mode of delivery, infant birthweight) and vital signs (i.e. highest postpartum systolic and diastolic blood pressures at any time between delivery and discharge from the hospital).
Candidate variables were further selected for model inclusion based on three variable reduction approaches: (1) redundancy analysis, in which each variable is predicted from the other variables using flexible parametric additive regression models without reference to the outcome, and variables that are well predicted are removed; (2) hierarchical clustering analysis based on a similarity matrix, again without consideration of the outcome; and (3) backwards elimination with stepwise logistic regression. During candidate predictor selection, we aimed to reduce the number of parameters to be estimated without distorting statistical inference for the parameters or introducing overfitting. This was first accomplished using unsupervised methods, which ignore the outcome during the data reduction process and do not require resampling. Specifically, we used redundancy analysis with the redun function and variable clustering with the varclus function in the Hmisc R package on each original dataset. During the supervised variable selection process, bootstrapping was performed to reduce the risk of overfitting. This latter process was conducted in each imputed dataset. Interaction effects were examined for inclusion across all covariates included in the final model.

| Missing data
Missing covariate values in each cohort (<1% for three variables) were imputed using multiple imputation by chained equations (MICE) with the mice R package, and the number of imputations was determined using methods described by Von Hippel. 30,31 Imputation was performed separately for each site, as we assumed missingness clustered by site. Results from the imputed datasets were pooled using Rubin's rules, as implemented in mice.
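For readers unfamiliar with Rubin's rules, the pooling step for a single coefficient can be sketched as follows. This is a minimal illustration, not the mice implementation: the function name is hypothetical, and the estimates and variances in the usage note are made-up values.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: the coefficient estimated in each imputed dataset
    variances: the squared standard error of that coefficient in each dataset
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    q_bar = mean(estimates)       # pooled point estimate
    within = mean(variances)      # within-imputation variance
    between = variance(estimates) # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)
```

For example, `pool_rubin([0.50, 0.54, 0.46], [0.01, 0.01, 0.01])` pools three hypothetical estimates. The pooled standard error is larger than any single-dataset standard error because the between-imputation variance is added.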

| Model fitting and validation
After running MICE, the fit.mult.impute function from the Hmisc package was used to perform two tasks separately for each completed dataset corresponding to each imputation: choose final penalty parameters and fit the chosen model. During each fit, a series of logistic regression models using penalised maximum likelihood estimation were generated using the pentrace function in the rms R package to solve for the optimum penalty factor or combination of factors penalising different kinds of terms in the model. Continuous variables were modelled flexibly using restricted cubic splines to get the maximum information from the data and to explore non-linear continuous associations. 32 An internal-external cross-validation (IECV) approach was used to assess external validation or the transportability of the model across the two sites. In the IECV process, a model was developed and internally validated using data from all sites except one and subsequently externally validated using data from the held-out site. This process was repeated so that each site was held out and validated in turn (in this case, only two sites). If the predictive performance of each site's model was deemed unacceptable at the alternate site, we planned to explore the effect of recalibration by updating the model's intercept or overall slope or, if needed, by re-estimating individual predictor effects using both sites as a single dataset. A closed testing procedure was used to choose an appropriate model-updating method that resulted in a single final model. 33 This procedure tested which model-updating method was most appropriate: whether the intercept should be freely estimated (Recalibration in the Large); whether the coefficient on the linear predictor should be freely estimated as well (Logistic Recalibration); and whether the Northeastern model coefficients should be freely estimated (Model Revision). We did not assess how the final (pooled) model performed by site because this final model requires further study at the two study sites.
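The simplest of the three updating options, Recalibration in the Large, can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the authors' procedure: the function names are hypothetical, the slope on the linear predictor is held at 1, and the closed testing step that chooses between updating methods is not shown.

```python
import math


def expit(x):
    """Inverse logit: convert log odds to a probability."""
    return 1 / (1 + math.exp(-x))


def recalibrate_in_the_large(linear_predictors, outcomes, tol=1e-10, max_iter=100):
    """Estimate an intercept shift `a` for logit(p) = a + lp, with the slope fixed at 1.

    Uses Newton-Raphson on the logistic log-likelihood. At convergence the mean
    predicted risk equals the observed event rate in the new data.
    """
    a = 0.0
    for _ in range(max_iter):
        preds = [expit(a + lp) for lp in linear_predictors]
        score = sum(y - p for y, p in zip(outcomes, preds))  # first derivative
        info = sum(p * (1 - p) for p in preds)               # negative second derivative
        step = score / info
        a += step
        if abs(step) < tol:
            break
    return a
```

A positive shift indicates that the original model underpredicted risk in the new data on average, and a negative shift indicates overprediction.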
Consistent with TRIPOD recommendations, 15,34 model performance was assessed using three recommended measures: (1) the apparent and bias-corrected c-statistic (i.e. concordance index), or area under the receiver operating characteristic (ROC) curve, (2) bias-corrected calibration curves and (3) decision curves. 25 During internal validation, the processMI function in the rms package was used so that bootstrap validations and calibrations were run separately for each imputation and then stacked to estimate final metrics. All internal validation performance metrics were assessed in each imputed dataset and bias-corrected using bootstrapping. The c-statistic is a measure of model discrimination, which quantifies the model's ability to assign higher predicted probabilities to individuals who experience the outcome than to those who do not. Calibration quantifies model performance in terms of 'what proportion, or %, of individuals with a risk prediction of x% actually have the outcome?' 11 across the range of risk thresholds, and is visualised by plotting the model's predicted outcome against the cohort's observed outcome, in which a perfectly calibrated model follows a 45° line. 35 Decision curve analysis (DCA) assesses the benefits (treating a true-positive case) and harms (treating a false-positive case) associated with the model. 36 DCA quantifies the 'net benefit' of using the model for postpartum readmission. Stated otherwise, the net benefit is the difference between the expected benefit (readmitted individuals who were correctly identified by the model) and the expected harm (individuals who were not readmitted but were incorrectly identified by the model). 37 All statistical analyses were performed using R statistical software version 4.3.0 (21 April 2023; R Foundation for Statistical Computing).
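Two of these measures are simple enough to compute directly, which may help make the definitions concrete. The sketch below is illustrative and is not the rms implementation: the function names are hypothetical, and bootstrap bias correction and calibration curves are not shown.

```python
def c_statistic(predictions, outcomes):
    """Share of event/non-event pairs in which the event has the higher prediction.

    Tied predictions count as half a concordant pair.
    """
    events = [p for p, y in zip(predictions, outcomes) if y == 1]
    non_events = [p for p, y in zip(predictions, outcomes) if y == 0]
    concordant = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                concordant += 1.0
            elif p_event == p_non:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def net_benefit(predictions, outcomes, threshold):
    """Net benefit of intervening on everyone with predicted risk >= threshold."""
    n = len(outcomes)
    tp = sum(1 for p, y in zip(predictions, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(predictions, outcomes) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability
    return tp / n - (fp / n) * threshold / (1 - threshold)
```

A decision curve plots `net_benefit` across a range of thresholds and compares it with intervening on no one (net benefit 0) and intervening on everyone (threshold at or below every prediction).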

| RESULTS
The Southern site included 10 100 deliveries with 32 postpartum readmissions for hypertension and pre-eclampsia for an absolute rate of 3 per 1000. The Northeastern site included 18 101 deliveries with 214 postpartum readmissions for hypertension and pre-eclampsia for an absolute rate of 12 per 1000. Characteristics for both sites overall are presented in Table 1 and by site in Tables S1 and S2.
During the variable reduction process, flexible parametric additive regression models suggested removal of the predictor gestational age at delivery; hierarchical clustering analysis suggested removal of gestational age at delivery and systolic blood pressure; and backwards elimination with stepwise logistic regression suggested removal of gestational hypertension and chronic hypertension. The final recommended predictors included: age, parity, maximum postpartum diastolic blood pressure (mm Hg), birthweight (in grams), pre-eclampsia before discharge, and delivery mode (and an interaction term of pre-eclampsia × delivery mode).
Table 2 presents, by site, the risk equation for predicting the log odds of readmission for hypertension and pre-eclampsia, formed from the intercept plus the estimated beta coefficients multiplied by the corresponding predictors. The regression coefficients represent the log odds ratio for a change of 1 unit of the corresponding predictor. Model performance was initially examined for separate models at each site. Discrimination ability was adequate during internal validation (c-statistic South: 0.88, 95% CI 0.87-0.89; and Northeast: 0.74, 95% CI 0.74-0.74). During external validation, the Northeastern model's discrimination ability improved when tested on the Southern site (c-statistic 0.86; 95% CI 0.76-0.92), but the Southern model's discriminatory ability substantially decreased when tested on the Northeastern site (c-statistic 0.61; 95% CI 0.57-0.65) (Table 2). Each site's model demonstrated poor calibration on the other site during external validation. The calibration intercepts and slopes differed between the Southern and Northeastern sites (intercepts: −1.56 and −0.93; slopes: 0.18 and 1.04, respectively) (Table 2, Figures S1 and S2). Given the calibration findings, the closed testing procedure was used to choose an appropriate model-updating method for the Northeastern model, given its better performance, and overall model revision was ultimately recommended. Therefore, the model was revised by re-estimating its parameters in a combined dataset from both sites.
The combined dataset of 28 201 postpartum individuals had 246 postpartum readmissions for hypertension and pre-eclampsia for an absolute rate of 9 per 1000 (Table 1). The model equation for predicting the absolute risk of readmission for hypertension and pre-eclampsia from the final combined model (Table 2) was:

$$\text{risk} = \frac{1}{1 + \exp(-\text{risk score})}$$

where risk score is the predicted log odds of readmission from the developed model. This final model had adequate and improved discrimination ability (c-statistic: 0.80, 95% CI 0.80-0.80) (Table 2). Figure 1 demonstrates the calibration curve of the model's performance. The final model calibration curve demonstrated accurate predictions of readmission risk from 0% to 8%. Decision curve analysis also demonstrated that the final model provided superior net benefit when clinical decision-making risk thresholds for interventions preventing readmission were between 1% and 7% (Figure 2). The final model is provided as an online calculator (https://duke-som.shinyapps.io/pre-eclampsia_readmit/).
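Because the risk score is a log odds, converting it to an absolute risk is the inverse-logit transformation. The sketch below shows that conversion and also recomputes the crude readmission rates from the counts reported in the text. The function names are hypothetical, and no model coefficients are assumed because Table 2 is not reproduced here.

```python
import math


def risk_from_log_odds(risk_score):
    """Convert a predicted log odds (the risk score) to an absolute probability."""
    return 1 / (1 + math.exp(-risk_score))


def rate_per_1000(events, deliveries):
    """Crude readmission rate per 1000 deliveries."""
    return 1000 * events / deliveries


print(round(rate_per_1000(32, 10100)))    # Southern site
print(round(rate_per_1000(214, 18101)))   # Northeastern site
print(round(rate_per_1000(246, 28201)))   # combined cohort
```

The three printed rates round to 3, 12 and 9 per 1000, in line with the reported figures. A risk score of 0 corresponds to a predicted risk of 50%, and a risk score of about −4.6 corresponds to a predicted risk of about 1%.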

| Main findings
Postpartum readmission for hypertension and pre-eclampsia may be accurately predicted using EHR data from two US sites, but further model validation is needed. This predictive model is beneficial when a postpartum individual's predicted risk is ≤7%. Given that the average absolute risk of readmission for hypertension and pre-eclampsia is <1%, this model is useful for prediction up to seven times the average risk. To achieve this level of accuracy, it was necessary to combine the two sites into a combined final model, as the individual site models were not well calibrated. This lack of external validation or transportability for each individual site model suggests that, prior to clinical use, additional updating will be necessary using data from multiple sites.

| Strengths and limitations
There are several limitations to note. First, the final model overpredicted the risk of readmission for hypertension and pre-eclampsia when this risk was >8%. Given that the absolute risk of postpartum readmission is closer to 1%, there are few patients whose risk is greater than this threshold. Future model updating will be necessary to determine whether the highest risk individuals can be reliably predicted. Secondly, we used EHR data that could be easily integrated and implemented into the current system. Limitations of using EHR data include the possibility of variables that are inaccurate and incomplete, transformed in ways that undermine their meaning, and of insufficient granularity. 38 The available EHR data did not include candidate variables that could have improved model performance, such as use of antihypertensive medication at delivery hospital discharge. While we are confident of the validity of the primary outcome variable, which was cross-validated by manual chart review, and of other discretely and objectively measured variables (e.g. maternal age, birthweight), there is the possibility of misclassification for other variables, such as chronic hypertension and pre-eclampsia status. Thirdly, it is possible that some patients were readmitted to hospitals outside of the assessed health system. National claims data suggest that up to 15% of postpartum readmissions may occur at a hospital different from where the delivery occurred. 39 Although this may have led us to underestimate the true frequency of readmission, we believe this rate is likely lower in the current study setting given that we used data from two large integrated tertiary care health systems. Fourthly, model performance may vary with systematic and consistent interventions for blood pressure monitoring and postpartum pre-eclampsia management for all patients. In the current study, postpartum follow-up was at the discretion of the provider. Differential misclassification by outcome (readmission for hypertension and pre-eclampsia) is possible if provider ascertainment and documentation of clinical data in the EHR varied by severity of hypertension and pre-eclampsia at delivery. In addition, lack of systematic identification of cases of postpartum readmission could also have affected model performance when external validation was initially attempted. The current model is likely most generalisable to sites with care settings similar to those in which it was developed.
Strengths of this study include using data that are readily available in the EHR and the use of IECV. Importantly, we ensured that data mappings were consistent for EHR abstraction of variables at both sites, and manually reviewed the study outcome to confirm that those who required hospital readmission were appropriately classified. Model performance may be negatively impacted when solely relying on EHR data without further outcome validation. 40 Secondly, we provide an easily implementable model that clinicians may be more likely to utilise for point-of-care postpartum planning and decision making, as it provides an automatically generated risk score using factors identified in the EHR. 34 Thirdly, this study included a large sample of >20 000 patients of varying risk for postpartum readmission for hypertension and pre-eclampsia at two sites. We did not assess how the final (pooled) model performed in each site. This predictive model needs to be updated and validated in multiple settings and by alternative researchers before recommending widespread use. 41

| Interpretation
Recently, predictive models for postpartum maternal morbidity have been developed using clinical variables available in the EHR at single sites. Malhamé et al. 19 developed a statistical model for postpartum cardiovascular severe maternal morbidity at a Northeastern US tertiary care site; the model included 11 variables and three interaction terms and showed adequate discrimination and calibration, but was not externally validated. Hoffman et al. 8 previously developed a machine learning-based predictive model for readmission for pre-eclampsia and hypertension at the Northeastern site of the current study. This model had good discriminative ability (AUC >0.8) but was not validated in other care settings and included 31 clinical features. While these clinical features were taken from discrete data fields, many of these variables may not be easily retrievable in many care settings, limiting model transportability. Prediction models developed at a single site often perform poorly or yield inaccurate predictions when tested on new individuals at additional sites. 11 This is because the true values of the model parameters vary across validation or target settings and populations. Such heterogeneity results from differences in observed and unobserved individual characteristics, site-specific patient management strategies, and differences in predictor and outcome definitions. 42 Single-centre prediction models are also typically developed using smaller sample sizes, are prone to overfitting, and are less reproducible. These issues can become more pronounced when the outcome is a low frequency event, such as postpartum readmission.
Pooling studies or cohorts from electronic health records from different sites may help reduce these problems by increasing the sample size and variability in the sample characteristics. 42 In the current study, we attempted to mitigate some of these limitations by performing sample size calculations and adopting penalised model fitting to reduce the variability of model predictions. We also chose to use IECV, which combines the strength of external validation with the strength of prediction model development using all available data, although in this study IECV was performed using only two sites. Although model discrimination was adequate, model calibration was not. Rather than stopping and discarding the data, our closed testing procedure recommended updating the model using the combination of the two datasets.
Postpartum readmission for hypertension and pre-eclampsia is a rare event: the probability in the current study varied between the two sites from 0.3% to 1.2%; the national average is 0.3%. 3 Bruce et al. 43 reported that in a large California-based managed care organisation in 2018, the frequency of readmission for hypertension or stroke within 42 days after delivery was 4.4%. The rate of readmission may vary across sites due to differences in observed and unobserved patient characteristics, readmission management strategies, measurement methods, and outcome definitions. Since the intent of the current analysis was potentially to use the model in new sites, we wanted to capture as much heterogeneity as possible by combining the cohorts to allow transportability of the final model to new sites. Nevertheless, hypertension and pre-eclampsia together represent one of the primary causes of both postpartum readmission and maternal morbidity and mortality in the USA. 44,45 The risk of hypertension after a pregnancy complicated by pre-eclampsia is highest in the early postpartum period. 46 Currently, providers frequently attempt to estimate a patient's risk based on known risk factors, 3 such as a prior or current diagnosis of hypertension and pre-eclampsia, even though many readmitted individuals do not have these risk factors. 7 An accurate predictive model for postpartum readmission for hypertension and pre-eclampsia may be of value, as estimating risk is challenging due to the overall low probability of this outcome (≤1%) 3 and individual risk factors have been shown not to identify patients at high risk adequately. 7

We did not assess race and ethnicity for inclusion in this model. Recent studies caution against developing predictive models in obstetrics using race and ethnicity as a putative risk factor, which is at best an imperfect measure of structural racism and health inequities. 47 The inclusion of race in such predictive models may contribute to the incorrect notion of race as a biological category and detracts from the underlying implications of racism. 48

Following further prospective assessment that includes model updating with data from additional sites, this predictive model could be used to target postpartum care more accurately and effectively to patients identified to be at high risk for readmission for hypertension and pre-eclampsia. Such interventions may include a longer postpartum hospital stay, 49 home blood pressure monitoring, 12 more aggressive antihypertensive management with medication and increased postpartum clinical surveillance via telemedicine or outpatient visit. The current model overpredicts the risk of readmission when the predicted risk is above 10%. However, as long as interventions to prevent readmission are low-risk, this overprediction is unlikely to cause significant harm. To date, trials have not demonstrated that interventions can reduce the risk of postpartum readmission for hypertension and pre-eclampsia. Whether application of a predictive model can affect readmission rates through improved identification of patients or through better targeting of the above interventions will need to be studied. 50,51 An additional application is in adaptive trial design or a 'smart trial' of the above interventions in which patients are randomised based on their predicted risk of postpartum readmission. 52 More broadly, such a predictive model could potentially decrease unnecessary healthcare utilisation for patients at low risk.

TABLE 1
Characteristics of combined cohort stratified by readmission status.

FIGURE 1
Calibration plot demonstrating the performance of predicting postpartum hospital readmission for hypertension or pre-eclampsia. The dashed line indicates perfect agreement between the predicted probability of the model and the actual probability.

FIGURE 2
Decision curve analysis of predicting postpartum hospital readmission for hypertension or pre-eclampsia. The x-axis indicates the range of threshold probabilities predicted by the model for risk of postpartum readmission for hypertension or pre-eclampsia. The y-axis indicates the standardised net benefit. The net benefit is calculated as true-positive rate − (false-positive rate × weighting factor). The weighting factor is calculated as threshold probability/(1 − threshold probability). The decision curves indicate the net benefit of the model as well as two clinical alternatives (classifying no individuals as having the outcome versus classifying all individuals as having the outcome) over a specified range of threshold probabilities of outcome.

TABLE 2
Model performance characteristics and coefficients by site and combined.