The primary goal of a phase II study is to assess the efficacy of a new treatment and to decide whether it has sufficient activity to warrant further evaluation in a phase III comparative trial. However, many adequately conducted phase II trials are negative, leading to termination of drug development. Heterogeneity of the study population is often considered a cause of treatment effect dilution. One approach to identifying the sensitive subpopulation is to conduct several phase II trials, one in each specific subset of patients, but this option might unethically increase the number of non-sensitive patients under evaluation. Adaptive two-stage designs have recently been proposed. London and Chang proposed a global one-sample test for response rates in stratified phase II clinical trials, whereas Jones and Holmgren proposed an adaptive design that allows a preliminary determination of efficacy that may be restricted to a specific subpopulation defined by biomarker status. Because both methods are extensions of the Simon design, neither allows early termination for efficacy in one or several subgroups. The authors propose an alternative method to deal with stratification in phase II clinical trials and to identify the best target population. This method is based on the multiple-stage Fleming design, which allows early stopping for either efficacy or inefficacy. It also integrates a procedure that tests whether treatment effects are similar or heterogeneous between the two groups. The operating characteristics of this method were compared with those of a standard Fleming design using exact binomial probabilities. Copyright © 2011 John Wiley & Sons, Ltd.
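As an illustration of what computing operating characteristics from exact binomial probabilities can involve, the following is a minimal sketch for a generic single-arm, multiple-stage design with early stopping for either efficacy or inefficacy. It is not the authors' stratified procedure. The function names, stage sizes, stopping boundaries, and response rates (0.20 and 0.40) are hypothetical values chosen only for illustration, and the boundaries are not derived from Fleming's formulas.

```python
from math import comb


def binom_pmf(k, n, p):
    """Exact binomial probability of k responses among n patients."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def operating_characteristics(stage_sizes, inefficacy_bounds, efficacy_bounds, p):
    """Exact operating characteristics of a single-arm multiple-stage design.

    After each stage, with s cumulative responses:
      s <= inefficacy bound -> stop and conclude inefficacy
      s >= efficacy bound   -> stop and conclude efficacy
      otherwise             -> continue to the next stage
    Returns (P(conclude efficacy), P(conclude inefficacy), expected sample size)
    under a true response rate p.
    """
    dist = {0: 1.0}  # distribution of cumulative responses among ongoing trials
    n_cum = 0
    p_eff = p_ineff = expected_n = 0.0
    for n_g, a_g, r_g in zip(stage_sizes, inefficacy_bounds, efficacy_bounds):
        n_cum += n_g
        # Convolve the current distribution with the responses of this stage.
        updated = {}
        for s, prob in dist.items():
            for k in range(n_g + 1):
                updated[s + k] = updated.get(s + k, 0.0) + prob * binom_pmf(k, n_g, p)
        # Apply the stopping rules; keep only the paths that continue.
        dist = {}
        for s, prob in updated.items():
            if s <= a_g:
                p_ineff += prob
                expected_n += prob * n_cum
            elif s >= r_g:
                p_eff += prob
                expected_n += prob * n_cum
            else:
                dist[s] = prob
    if dist:
        raise ValueError("final-stage bounds must force a decision")
    return p_eff, p_ineff, expected_n


# Hypothetical two-stage design: 20 patients per stage (illustrative bounds).
sizes = [20, 20]
inefficacy = [2, 10]   # stop for inefficacy if cumulative responses <= bound
efficacy = [8, 11]     # stop for efficacy if cumulative responses >= bound

alpha, _, n_null = operating_characteristics(sizes, inefficacy, efficacy, 0.20)
power, _, n_alt = operating_characteristics(sizes, inefficacy, efficacy, 0.40)
print(f"type I error={alpha:.4f}, power={power:.4f}")
print(f"expected N under p=0.20: {n_null:.1f}, under p=0.40: {n_alt:.1f}")
```

Evaluating the function at the null and the alternative response rate gives the type I error, the power, and the expected sample sizes, which are the quantities typically compared between competing designs.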