This work presents cost-effective multi-GPU (graphics processing unit) parallel implementations of a finite-volume numerical scheme for solving pollutant transport problems in two-dimensional domains. The fluid is modeled by the 2D shallow-water equations, and the transport of the pollutant is modeled by a transport equation. The 2D domain is discretized using a first-order Roe finite-volume scheme. Specifically, this paper presents multi-GPU implementations of two solutions: one that exploits recomputation on the GPU, and an optimized one based on a ghost cell decoupling approach. Our multi-GPU implementations have been optimized by using nonblocking communications, overlapping communication with computation, and applying ghost cell expansion to minimize communication. The fastest implementation reached a speedup of 78× using four GPUs on an InfiniBand network, relative to a parallel execution on a six-core CPU with two-way hyperthreading per core. This performance, measured on a realistic problem, enabled solutions to be computed not only in real time but orders of magnitude faster than the simulated time.

Copyright © 2012 John Wiley & Sons, Ltd.