This paper describes an efficient parallel spectral method for direct numerical simulations of transitional and turbulent flows. The parallelization relies on a classical two-dimensional domain decomposition, but it has been developed specifically for a solenoidal Fourier–Chebyshev spectral approximation in which the number of modes in one Fourier direction is very large compared with the other two directions. The approach therefore differs from classical libraries developed for cubic Fourier boxes. The strategy uses the Message Passing Interface (MPI) for communication among nodes and is fairly portable. One original feature of this work is an efficient hybrid programming model that combines MPI for internode communication with coarse-grain OpenMP parallelism for shared-memory computation on the cores of a node, instead of the classical hybrid model that combines MPI with fine-grain, loop-level parallelism through OpenMP directives. This hybrid parallelism has been tested on recent high-performance parallel supercomputers with a few tens of cores per node. Performance is evaluated on several massively parallel platforms, with both low-frequency and high-frequency processors. We demonstrate that spectral methods, which are known to be inherently ill-suited to the new generation of high-performance distributed-memory computers, can be implemented efficiently using this hybrid programming model, with good scalability and a very short wall-clock time per iteration. New numerical experiments are therefore now accessible on petascale computers, while retaining the attractive features of spectral methods, such as accuracy, exponential convergence, computational efficiency and conservation properties. This is illustrated by a direct numerical simulation of the transition of the boundary layers that develop from the entrance section of a plane channel and interact to merge into a fully turbulent flow. Copyright © 2012 John Wiley & Sons, Ltd.
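The two-dimensional domain decomposition mentioned above can be illustrated by a minimal sketch of how a process grid might partition the index space. The sketch is not taken from the paper: the function names `block_range` and `pencil_extent`, the grid sizes, and the choice of which directions are split are all assumptions made for illustration. It only computes index ranges and performs no communication; in an actual code each range would belong to one MPI process.

```python
def block_range(n, parts, index):
    """Half-open range [start, stop) of block `index` when n items
    are split into `parts` contiguous, nearly equal blocks."""
    base, rem = divmod(n, parts)
    # The first `rem` blocks each receive one extra item.
    start = index * base + min(index, rem)
    stop = start + base + (1 if index < rem else 0)
    return start, stop


def pencil_extent(rank, p_rows, p_cols, nx, ny, nz):
    """Local index ranges of `rank` in a p_rows x p_cols process grid.

    Assumption for illustration only: the x direction is kept whole on
    every process, while y and z are split across the process grid.
    """
    row, col = divmod(rank, p_cols)
    return (0, nx), block_range(ny, p_rows, row), block_range(nz, p_cols, col)


if __name__ == "__main__":
    # Hypothetical sizes: one direction much longer than the other two.
    nx, ny, nz = 1024, 64, 48
    p_rows, p_cols = 4, 3
    for rank in range(p_rows * p_cols):
        print(rank, pencil_extent(rank, p_rows, p_cols, nx, ny, nz))
```

With such a layout, a transform along a direction that is whole on each process needs no communication, whereas transforms along the split directions require a data transposition between processes.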