Groundwater modeling has become a vital component of water supply and contaminant transport investigations. These models require representative hydraulic conductivity (K) and specific storage (Ss) estimates, or a set of estimates representing subsurface heterogeneity. A number of approaches currently exist for characterizing and modeling K and Ss heterogeneity in varying degrees of detail, but there is no consensus on which approach yields the most robust groundwater models with the best predictive ability. The main goal of this study is to compare different heterogeneity modeling approaches (e.g., effective parameters, geostatistics, geological models, and hydraulic tomography) when they are input into a forward groundwater model and used to predict 16 independent cross-hole pumping tests. We first characterize a sandbox aquifer through single- and cross-hole pumping tests, and then use these data to construct forward groundwater models of various complexities (with both homogeneous and heterogeneous parameter distributions). Two effective parameter models are constructed: (1) by taking the geometric mean of the single-hole test K and Ss estimates and (2) by calibrating effective K and Ss estimates through simultaneously matching the response at all ports during a cross-hole test. Heterogeneous models consist of spatially variable K and Ss fields obtained by (1) kriging single-hole data; (2) calibrating a geological model; and (3) conducting transient hydraulic tomography (Zhu and Yeh, 2005). The performance of these parameter fields is then tested through the simulation of 16 independent cross-hole pumping tests. Our results show that, among the approaches compared, transient hydraulic tomography produces the smallest discrepancy between observed and simulated drawdowns.
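As an illustration of two quantities mentioned above, the following minimal sketch computes a geometric-mean effective parameter from a set of single-hole estimates and two simple measures of the discrepancy between observed and simulated drawdowns. The function names, the choice of mean absolute and mean squared error as discrepancy measures, and all numerical values are assumptions made for illustration; they are not taken from the study.

```python
import math


def geometric_mean(values):
    """Geometric mean, computed as exp of the mean of the logs.

    Requires a non-empty sequence of strictly positive values, as is the
    case for hydraulic conductivity (K) or specific storage (Ss).
    """
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def mean_abs_error(observed, simulated):
    """Mean absolute difference between paired drawdown values."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)


def mean_sq_error(observed, simulated):
    """Mean squared difference between paired drawdown values."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)


# Hypothetical single-hole K estimates (arbitrary units), not from the study.
k_estimates = [0.01, 0.1, 1.0]
k_effective = geometric_mean(k_estimates)

# Hypothetical observed and simulated drawdowns at two ports.
observed = [1.0, 2.0]
simulated = [1.5, 1.0]
print(k_effective, mean_abs_error(observed, simulated), mean_sq_error(observed, simulated))
```

The geometric mean is taken in log space because K and Ss typically span orders of magnitude, so an arithmetic mean would be dominated by the largest estimates.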