Volume 30, Issue 24
Featured Article

Subgroup identification from randomized clinical trial data

Jared C. Foster

Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109 USA

Jeremy M.G. Taylor

Corresponding Author

E-mail address: jmgt@umich.edu

Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109 USA

Jeremy M.G. Taylor, Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA.


Stephen J. Ruberg

Global Statistical Sciences, Advanced Analytics, Eli Lilly, Indianapolis, IN, USA

First published: 04 August 2011
Citations: 248

Abstract

We consider the problem of identifying a subgroup of patients who may have an enhanced treatment effect in a randomized clinical trial, and it is desirable that the subgroup be defined by a limited number of covariates. For this problem, the development of a standard, pre‐determined strategy may help to avoid the well‐known dangers of subgroup analysis. We present a method developed to find subgroups of enhanced treatment effect. This method, referred to as ‘Virtual Twins’, involves predicting response probabilities for treatment and control ‘twins’ for each subject. The difference in these probabilities is then used as the outcome in a classification or regression tree, which can potentially include any set of the covariates. We define a measure Q(Â) to be the difference between the treatment effect in estimated subgroup Â and the marginal treatment effect. We present several methods developed to obtain an estimate of Q(Â), including estimation of Q(Â) using estimated probabilities in the original data, using estimated probabilities in newly simulated data, two cross‐validation‐based approaches, and a bootstrap‐based bias‐corrected approach. Results of a simulation study indicate that the Virtual Twins method noticeably outperforms logistic regression with forward selection when a true subgroup of enhanced treatment effect exists. Generally, large sample sizes or strong enhanced treatment effects are needed for subgroup estimation. As an illustration, we apply the proposed methods to data from a randomized clinical trial. Copyright © 2011 John Wiley & Sons, Ltd.
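The two steps named in the abstract (twin differences fed to a tree, then a subgroup measure computed from estimated probabilities in the original data) can be illustrated with a small sketch. This is not the authors' implementation. It assumes the treatment effect is measured as a difference in response probabilities, uses made-up predicted probabilities in place of a fitted prediction model, replaces the classification or regression tree with a single-split stump on one covariate, and takes the child node with the larger mean difference as the estimated subgroup. All function and variable names are hypothetical.

```python
from statistics import mean


def twin_differences(p_treat, p_control):
    """Per-subject difference between predicted response probabilities
    under treatment and under control (the 'virtual twin' contrast)."""
    return [pt - pc for pt, pc in zip(p_treat, p_control)]


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_single_split(x, z, min_size=2):
    """Stand-in for a regression tree: one split on x that minimizes the
    within-node sum of squares of z. Returns the split threshold."""
    pairs = sorted(zip(x, z))
    best = None
    for i in range(min_size, len(pairs) - min_size + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between tied covariate values
        left = [zi for _, zi in pairs[:i]]
        right = [zi for _, zi in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, (pairs[i - 1][0] + pairs[i][0]) / 2)
    if best is None:
        raise ValueError("no valid split found")
    return best[1]


def q_resubstitution(z, in_subgroup):
    """Mean twin difference inside the subgroup minus the mean twin
    difference over all subjects (estimated on the original data)."""
    inside = [zi for zi, a in zip(z, in_subgroup) if a]
    return mean(inside) - mean(z)


# Hypothetical inputs: one covariate and made-up predicted probabilities.
x = [1, 2, 3, 4, 5, 6, 7, 8]
p_treat = [0.30, 0.32, 0.31, 0.29, 0.60, 0.62, 0.58, 0.60]
p_control = [0.30] * 8

z = twin_differences(p_treat, p_control)
threshold = best_single_split(x, z)

# Assumption: the estimated subgroup is the child with the larger mean z.
upper = [zi for xi, zi in zip(x, z) if xi > threshold]
lower = [zi for xi, zi in zip(x, z) if xi <= threshold]
if mean(upper) >= mean(lower):
    in_subgroup = [xi > threshold for xi in x]
else:
    in_subgroup = [xi <= threshold for xi in x]

q = q_resubstitution(z, in_subgroup)
print(threshold, round(q, 4))
```

Because the same data are used both to choose the subgroup and to evaluate it, an estimate of this kind tends to be optimistic, which is the motivation the abstract gives for the cross‐validation and bootstrap bias‐corrected alternatives.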

