Volume 38, Issue 4
SPECIAL ISSUE PAPER

Subgroups from regression trees with adjustment for prognostic effects and postselection inference

Wei‐Yin Loh

Corresponding Author

E-mail address: loh@stat.wisc.edu

University of Wisconsin‐Madison, Madison, WI 53706 U.S.A.

Correspondence

Wei‐Yin Loh, University of Wisconsin‐Madison, Madison, WI 53706, U.S.A.

Email: loh@stat.wisc.edu

Michael Man

Eli Lilly and Company, Indianapolis, IN 46285 U.S.A.

Shuaicheng Wang

BioStat Solutions, Inc, Frederick, MD 21703 U.S.A.

First published: 19 April 2018

Abstract

Identification of subgroups with differential treatment effects in randomized trials is attracting much attention, and many methods use regression tree algorithms. This article addresses 2 important questions arising from the subgroups: how to ensure that treatment effects in subgroups are not confounded with effects of prognostic variables, and how to determine the statistical significance of treatment effects in the subgroups. We address the first question by selectively including linear prognostic effects in the subgroups in a regression tree model. The second question is more difficult because it falls within the subject of postselection inference. We use a bootstrap technique to calibrate normal‐theory t intervals so that their expected coverage probability, averaged over all the subgroups in a fitted model, approximates the desired confidence level. The same technique can also provide simultaneous confidence intervals for all subgroups. The first solution is implemented in the GUIDE algorithm and is applicable to data with missing covariate values, 2 or more treatment arms, and outcomes subject to right censoring. Bootstrap calibration is applicable to any subgroup identification method; it is not restricted to regression tree models. Two real examples are used for illustration: a diabetes trial where the outcomes are completely observed but some covariate values are missing, and a breast cancer trial where the outcome is right censored.
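
The bootstrap calibration idea can be illustrated with a minimal sketch. It is not the GUIDE implementation and it omits the prognostic adjustment. The subgroup finder `find_subgroups` is a hypothetical stand-in (a single median split on whichever covariate shows the largest gap in treatment effect), the data are simulated, normal quantiles replace t quantiles for simplicity, and coverage is pooled over subgroups and bootstrap replicates. Within each bootstrap replicate, the original sample plays the role of the population, so the "true" effect of a subgroup is its effect estimated from the original data.

```python
import random
from statistics import NormalDist, fmean, variance

def effect_and_se(rows):
    """Treated-minus-control mean difference and its standard error."""
    t = [y for _, trt, y in rows if trt == 1]
    c = [y for _, trt, y in rows if trt == 0]
    if len(t) < 2 or len(c) < 2:
        return None
    se = (variance(t) / len(t) + variance(c) / len(c)) ** 0.5
    return fmean(t) - fmean(c), se

def find_subgroups(data):
    """Hypothetical subgroup method (an assumption, not GUIDE): split at the
    median of the covariate with the largest gap in estimated effects."""
    best = None
    for j in range(len(data[0][0])):
        cut = sorted(r[0][j] for r in data)[len(data) // 2]
        groups = [lambda r, j=j, c=cut: r[0][j] < c,
                  lambda r, j=j, c=cut: r[0][j] >= c]
        fits = [effect_and_se([r for r in data if g(r)]) for g in groups]
        if None in fits:
            continue
        gap = abs(fits[0][0] - fits[1][0])
        if best is None or gap > best[0]:
            best = (gap, groups)
    return best[1] if best else []

def calibrate(data, target=0.90, n_boot=200, seed=0,
              levels=(0.90, 0.95, 0.98, 0.99, 0.995, 0.999)):
    """Return the smallest nominal level whose bootstrap coverage reaches
    the target, together with the estimated coverage at every level."""
    rng = random.Random(seed)
    hits = {a: 0 for a in levels}
    total = 0
    for _ in range(n_boot):
        boot = [rng.choice(data) for _ in data]      # resample with replacement
        for g in find_subgroups(boot):               # subgroups chosen on boot data
            fit = effect_and_se([r for r in boot if g(r)])
            truth = effect_and_se([r for r in data if g(r)])  # original = population
            if fit is None or truth is None:
                continue
            total += 1
            for a in levels:
                z = NormalDist().inv_cdf(0.5 + a / 2)  # normal quantile, not t
                hits[a] += abs(fit[0] - truth[0]) <= z * fit[1]
    if total == 0:
        raise ValueError("no usable subgroups in any bootstrap replicate")
    coverage = {a: hits[a] / total for a in levels}
    ok = [a for a in levels if coverage[a] >= target]
    return (min(ok) if ok else max(levels)), coverage

def simulate(n=400, p=5, seed=1):
    """Simulated trial: x[0] is prognostic; there is no true treatment effect."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        xs = tuple(rng.gauss(0, 1) for _ in range(p))
        rows.append((xs, rng.randint(0, 1), xs[0] + rng.gauss(0, 1)))
    return rows

data = simulate()
level, coverage = calibrate(data)
print("calibrated nominal level:", level)
print("estimated coverage by nominal level:", coverage)
```

Because `calibrate` only calls `find_subgroups` and `effect_and_se`, any other subgroup identification method could be substituted for the stand-in, which mirrors the claim that the calibration is not restricted to regression trees.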
