We derive regional-scale (∼10⁴ km²) CO2 flux estimates for summer 2004 in the northeast United States and southern Quebec by assimilating extensive data into a receptor-oriented model-data fusion framework. Surface fluxes are specified using the Vegetation Photosynthesis and Respiration Model (VPRM), a simple, readily optimized biosphere model driven by satellite data, AmeriFlux eddy covariance measurements and meteorological fields. The surface flux model is coupled to a Lagrangian atmospheric adjoint model, the Stochastic Time-Inverted Lagrangian Transport Model (STILT), which links point observations to upwind sources with high spatiotemporal resolution. Analysis of CO2 concentration data from the NOAA-ESRL tall tower at Argyle, ME, and from extensive aircraft surveys shows that the STILT–VPRM framework successfully links model flux fields to regionally representative atmospheric CO2 data, providing a bridge between ‘bottom-up’ and ‘top-down’ methods for estimating regional CO2 budgets on timescales from hourly to monthly. The surface flux model, with initial calibration to eddy covariance data, produces an excellent a priori condition for inversion studies constrained by atmospheric concentration data. Exploratory optimization studies show that data from several sites in a region are needed to constrain model parameters for all major vegetation types, because the atmosphere commingles the influence of regional vegetation types, and even high-resolution meteorological analysis cannot disentangle the associated contributions. Airborne data are critical to help define uncertainty within the optimization framework, showing, for example, that the summertime CO2 concentration at Argyle (107 m) is ∼0.6 ppm lower than the mean in the planetary boundary layer.
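
As an illustration of the receptor-oriented coupling described above, the following minimal Python sketch shows how a surface flux field can be mapped to a CO2 concentration at a receptor: each upwind grid cell and back-time step contributes its flux multiplied by the sensitivity ("footprint") of the receptor to that cell, and the sum is added to a background concentration. The function names, array layout, units and all numerical values are assumptions made for illustration; they are not taken from the STILT or VPRM code and are not realistic magnitudes.

```python
def receptor_enhancement(footprint, flux):
    """Sum footprint * flux over all (time step, grid cell) pairs.

    footprint: list of rows, one per back-time step; each row holds the
        per-cell sensitivity of the receptor, assumed here to be in
        ppm per (umol m^-2 s^-1).
    flux: same shape as footprint; surface CO2 flux in umol m^-2 s^-1
        (negative values denote uptake by vegetation).
    Returns the CO2 change at the receptor in ppm.
    """
    if len(footprint) != len(flux):
        raise ValueError("footprint and flux need the same number of time steps")
    total = 0.0
    for f_row, q_row in zip(footprint, flux):
        if len(f_row) != len(q_row):
            raise ValueError("footprint and flux rows need the same number of cells")
        total += sum(f * q for f, q in zip(f_row, q_row))
    return total


def modeled_co2(background_ppm, footprint, flux):
    """Background concentration plus the flux-driven change at the receptor."""
    return background_ppm + receptor_enhancement(footprint, flux)


# Hypothetical toy case: two back-time steps, two grid cells.
footprint = [[0.25, 0.125], [0.0625, 0.0625]]  # illustrative sensitivities
flux = [[-8.0, -4.0], [2.0, 2.0]]              # daytime uptake, night-time release
background = 376.0                             # hypothetical background, ppm

print(modeled_co2(background, footprint, flux))  # 373.75
```

Because the receptor sees only this weighted sum, cells with different vegetation types contribute to a single number, which is one way to see why observations from several sites are needed to constrain the parameters of each vegetation type separately.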