The present study describes approaches to improve the performance of empirical models, developed from a large nationwide data set, that predict sediment toxicity from chemistry for regional applications. The authors developed 4 multiple-chemical (PMax) models, each selected from individual chemical models developed using 1) a previously published approach applied to the nationwide data set; 2) a broader array of response and explanatory variables (e.g., different normalization approaches and toxicity classifications) applied to the nationwide data set; 3) a data set from the New York/New Jersey, USA, region; and 4) both the nationwide and regional data sets. The models were calibrated using the regional data set, and performance was tested using an independent data set from the same region. The final PMax model developed using the calibration process performed substantially better than the uncalibrated PMax model developed using the nationwide data set. The improvements were achieved by selecting the best-performing individual chemical models and eliminating those that performed poorly when applied together. Although the best-performing PMax model included both nationwide and region-specific models, the PMax model derived using only nationwide models performed nearly as well. These results suggest that calibrating nationwide models to a regional data set may be both more efficient and more effective for improving model performance than developing region-specific models. Environ Toxicol Chem 2014;33:708–717. © 2013 SETAC. This article is a US Government work and is in the public domain in the USA.
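As an illustration of how a multiple-chemical PMax prediction can be assembled from individual chemical models, the following minimal sketch assumes that each individual model is a logistic regression on log10 concentration and that PMax is the maximum predicted probability of toxicity across the chemicals retained in the model. The chemical names, coefficients, and function names are hypothetical placeholders and are not values from the study.

```python
import math

# Hypothetical (intercept, slope) pairs for two individual chemical models.
# These numbers are illustrative only; they are not taken from the study.
MODELS = {"chem_a": (-2.0, 1.0), "chem_b": (-4.0, 2.0)}


def logistic_probability(concentration, b0, b1):
    """Probability of toxicity from one logistic model on log10 concentration."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * math.log10(concentration))))


def p_max(sample, models):
    """Maximum predicted probability over chemicals that are both modeled and measured."""
    probs = [
        logistic_probability(sample[chem], b0, b1)
        for chem, (b0, b1) in models.items()
        if chem in sample
    ]
    if not probs:
        raise ValueError("no chemical in the sample has a model")
    return max(probs)


# Hypothetical sample concentrations (arbitrary units).
sample = {"chem_a": 100.0, "chem_b": 10.0}
combined = p_max(sample, MODELS)

# Calibration as described above amounts to choosing which individual models
# are retained; here one model is dropped to show the effect on the prediction.
retained = {name: coef for name, coef in MODELS.items() if name != "chem_a"}
calibrated = p_max(sample, retained)
```

In this sketch, removing an individual model can only lower or leave unchanged the PMax value for a sample, which is one way that eliminating poorly performing individual models could reduce false positives when the models are applied together.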