The proliferation of electronic health records, driven by advances in technology and by legislative measures, is stimulating interest in the analysis of passively collected administrative and clinical data. Observational data present both challenges and opportunities to researchers interested in comparing the effectiveness of different treatment regimes and, as personalized medicine requires, in estimating how effectiveness varies among subgroups. In this study, we provide new motivation for the local control approach to the analysis of large observational datasets, in which patients are first clustered in pretreatment covariate space and treatment comparisons are then made within subgroups of similar patients. The motivation for such an analysis is that the resulting local treatment effect estimates make inherently fair comparisons even when the balance between treatment cohorts (the treatment choice fraction) varies across pretreatment covariate space. We use an example of Simpson's paradox to show that estimates of the overall average treatment effect, which marginalize over covariate space, can be misleading. We therefore provide an alternative definition that uses a single, shared marginal distribution to define overall treatment comparisons that are inherently fair given the observed covariates. However, we also argue that overall treatment comparisons should no longer be the focus of comparative effectiveness research; the possibility that treatment effectiveness varies across patient subpopulations must not be left unexplored. In the spirit of the now ubiquitous concept of personalized medicine, estimating heterogeneous treatment effects in clinically relevant subgroups will allow for fair treatment comparisons, within the limits of the available data, that are more relevant to individual patients.
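
The contrast between a pooled comparison and a comparison built from within-cluster estimates can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the counts are invented, the two clusters ("mild", "severe") stand in for subgroups found by clustering in pretreatment covariate space, the function names are hypothetical, and weighting clusters by their total size is only one possible choice of shared marginal distribution, not necessarily the definition used in this study.

```python
# Hypothetical counts, invented for illustration only:
# (cluster, arm) -> (number of patients, number with a good outcome)
COUNTS = {
    ("mild", "treated"): (100, 90),
    ("mild", "control"): (400, 340),
    ("severe", "treated"): (400, 240),
    ("severe", "control"): (100, 50),
}
CLUSTERS = ("mild", "severe")


def local_effects(counts, clusters):
    """Treated-minus-control difference in outcome rate within each cluster."""
    effects = {}
    for c in clusters:
        n_t, y_t = counts[(c, "treated")]
        n_c, y_c = counts[(c, "control")]
        effects[c] = y_t / n_t - y_c / n_c
    return effects


def pooled_effect(counts, clusters):
    """Naive comparison: pool each arm over all clusters, then take the difference."""
    n_t = sum(counts[(c, "treated")][0] for c in clusters)
    y_t = sum(counts[(c, "treated")][1] for c in clusters)
    n_c = sum(counts[(c, "control")][0] for c in clusters)
    y_c = sum(counts[(c, "control")][1] for c in clusters)
    return y_t / n_t - y_c / n_c


def shared_marginal_effect(counts, clusters):
    """Average the local effects with one set of cluster weights for both arms.

    Assumption: weights are each cluster's share of all patients.
    """
    sizes = {
        c: counts[(c, "treated")][0] + counts[(c, "control")][0]
        for c in clusters
    }
    total = sum(sizes.values())
    effects = local_effects(counts, clusters)
    return sum(sizes[c] / total * effects[c] for c in clusters)


local = local_effects(COUNTS, CLUSTERS)
pooled = pooled_effect(COUNTS, CLUSTERS)
fair = shared_marginal_effect(COUNTS, CLUSTERS)

print("local effects:", local)
print("pooled effect:", pooled)
print("shared-marginal effect:", fair)
```

With these invented counts, treatment looks better within each cluster (differences of about +0.05 and +0.10) yet worse in the pooled comparison (about -0.12), because most treated patients fall in the severe cluster. Averaging the local effects over a single shared cluster distribution gives about +0.075, which agrees in sign with the within-cluster comparisons.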