Running unstructured grid-based CFD solvers on modern graphics hardware

Authors

  • Andrew Corrigan

    1. CFD Center, Department of Computational and Data Sciences, M.S. 6A2, College of Science, George Mason University, Fairfax, VA 22030-4444, U.S.A.
  • Fernando F. Camelli

    1. CFD Center, Department of Computational and Data Sciences, M.S. 6A2, College of Science, George Mason University, Fairfax, VA 22030-4444, U.S.A.
  • Rainald Löhner

    Corresponding author
    1. CFD Center, Department of Computational and Data Sciences, M.S. 6A2, College of Science, George Mason University, Fairfax, VA 22030-4444, U.S.A.
  • John Wallin

    1. CFD Center, Department of Computational and Data Sciences, M.S. 6A2, College of Science, George Mason University, Fairfax, VA 22030-4444, U.S.A.

Abstract

Techniques used to implement an unstructured grid solver on modern graphics hardware are described. The three-dimensional Euler equations for inviscid, compressible flow are considered. Effective memory bandwidth is improved by reducing total global memory access and overlapping redundant computation, as well as using an appropriate numbering scheme and data layout. The applicability of per-block shared memory is also considered. The performance of the solver is demonstrated on two benchmark cases: a NACA0012 wing and a missile. For a variety of mesh sizes, an average speed-up factor of roughly 9.5× is observed over the equivalent parallelized OpenMP code running on a quad-core CPU, and roughly 33× over the equivalent code running in serial. Copyright © 2010 John Wiley & Sons, Ltd.
