Experience with an efficient parallel kernel memory allocator



There has been great progress from the traditional allocation algorithms designed for small memories to more modern algorithms exemplified by McKusick's and Karels' allocator (McKusick MK, Karels MJ. Design of a general purpose memory allocator for the 4.3BSD UNIX kernel. In USENIX Conference Proceedings, Berkeley, CA, June 1988). Nonetheless, none of these algorithms have been designed to meet the needs of UNIX kernels supporting commercial data-processing applications in a shared-memory multiprocessor environment.

On a shared-memory multiprocessor, memory is a global resource. Therefore, allocator performance depends on synchronization primitives and manipulation of shared data as well as on raw CPU speed.

Synchronization primitives and access to shared data depend on system bus interactions. The speed of system buses has not kept pace with that of CPUs, as witnessed by the ever-larger caches found on recent systems. Thus, the performance of synchronization primitives and of memory allocators that use them has not received the full benefit of increased CPU performance.

An earlier paper (McKenney PE, Slingwine J. Efficient kernel memory allocation on shared-memory multiprocessors. In USENIX Conference Proceedings, Berkeley, CA, February 1993) describes an allocator designed to meet this situation. This article reviews the motivation for and design of the allocator and presents the experience gained during the seven years that the allocator has been in production use. Copyright © 2001 John Wiley & Sons, Ltd.