Researchers often turn to matching methods to draw causal inferences from observational data, but these methods vary in how well they succeed. We address their shortcomings by recasting the matching problem as a subset selection problem. Given a set of covariates, we seek two subsets, a control group and a treatment group, that achieve optimal balance, that is, the minimum discrepancy between the distributions of these covariates in the two groups. Our formulation captures the key elements of the Rubin causal model and translates naturally into a discrete optimization framework.
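
To make the subset selection view concrete, the following is a minimal sketch and not the formulation developed in this work. It assumes a single covariate, fixed group sizes, and the absolute difference in means as a stand-in for the distributional discrepancy; the function names and data are hypothetical, and the exhaustive search is used only because the example is tiny.

```python
from itertools import combinations
from statistics import mean


def imbalance(treated, control):
    """Absolute difference in covariate means.

    Assumption: a simplified balance measure chosen for illustration;
    the text speaks of discrepancy between full covariate distributions.
    """
    return abs(mean(treated) - mean(control))


def best_balanced_subsets(treatment_pool, control_pool, n_treated, n_control):
    """Enumerate every pair of subsets and keep the pair with the smallest imbalance."""
    best = None
    for t in combinations(treatment_pool, n_treated):
        for c in combinations(control_pool, n_control):
            score = imbalance(t, c)
            # Strict comparison keeps the first pair found when scores tie.
            if best is None or score < best[0]:
                best = (score, t, c)
    return best


# Hypothetical single-covariate values (for example, age), for illustration only.
treatment_pool = [23, 35, 41, 52]
control_pool = [19, 30, 38, 47, 60, 66]

score, treated, control = best_balanced_subsets(treatment_pool, control_pool, 2, 2)
print(score, treated, control)
```

The number of candidate pairs grows combinatorially with the pool sizes, so enumeration of this kind is feasible only for toy inputs; larger instances are where a discrete optimization treatment becomes necessary.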