Programming many-core architectures - a case study: dense matrix computations on the Intel single-chip cloud computer processor
Article first published online: 15 AUG 2011
Copyright © 2011 John Wiley & Sons, Ltd.
Concurrency and Computation: Practice and Experience
Volume 24, Issue 12, pages 1317–1333, 25 August 2012
How to Cite
Marker, B., Chan, E., Poulson, J., van de Geijn, R., Van der Wijngaart, R. F., Mattson, T. G. and Kubaska, T. E. (2012), Programming many-core architectures - a case study: dense matrix computations on the Intel single-chip cloud computer processor. Concurrency Computat.: Pract. Exper., 24: 1317–1333. doi: 10.1002/cpe.1832
Publication History
- Issue published online: 4 AUG 2012
- Article first published online: 15 AUG 2011
- Manuscript Accepted: 26 JUN 2011
- Manuscript Revised: 24 JUN 2011
- Manuscript Received: 21 MAR 2011
Keywords
- collective communication
- dense linear algebra library
- many-core architecture
A message-passing, distributed-memory parallel computer on a chip is one possible design for future many-core architectures. We discuss initial experiences with the Intel Single-chip Cloud Computer research processor, a prototype architecture that places 48 cores on a single die; the cores can communicate via a small, shared, on-die buffer. The experiment is to port a state-of-the-art, distributed-memory, dense matrix library, Elemental, to this architecture and to gain insight from the experience. We show that the library's attention to programmability, especially its proper abstraction for collective communication, greatly aids the porting effort and enables us to support a wide range of functionality with limited changes to the library code.
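The role that a collective-communication abstraction plays in porting can be illustrated with a minimal sketch. It is not Elemental's interface or the Single-chip Cloud Computer's communication API: the names `Transport`, `send`, `recv`, and `broadcast` are hypothetical, the cores are simulated in a single Python process, and the binomial-tree schedule is an assumption chosen for illustration. Only the core count of 48 comes from the abstract. The point is that the collective is written purely in terms of two point-to-point primitives, so moving to a new machine would mean re-implementing only those primitives.

```python
from collections import deque


class Transport:
    """Hypothetical point-to-point layer; the only part a port would replace."""

    def __init__(self, num_cores):
        self.num_cores = num_cores
        self.sent = 0          # number of messages sent so far
        self._mailboxes = {}   # one FIFO queue per (source, destination) pair

    def send(self, src, dst, payload):
        self._mailboxes.setdefault((src, dst), deque()).append(payload)
        self.sent += 1

    def recv(self, src, dst):
        return self._mailboxes[(src, dst)].popleft()


def broadcast(transport, root, value):
    """Binomial-tree broadcast expressed only through send/recv.

    Returns a dict mapping each simulated core id to the value it received.
    """
    p = transport.num_cores
    data = {root: value}
    step = 1
    while step < p:
        # Each core that already holds the value forwards it to one new core.
        for rel in range(step):
            partner = rel + step
            if partner < p:
                src = (rel + root) % p
                dst = (partner + root) % p
                transport.send(src, dst, data[src])
                data[dst] = transport.recv(src, dst)
        step *= 2
    return data


# Simulate 48 cores, with core 5 as the (arbitrarily chosen) root.
transport = Transport(48)
result = broadcast(transport, root=5, value="panel")
```

In this sketch, library-level code would call only `broadcast`; whether `send` and `recv` are backed by a network or by a shared on-die buffer is hidden behind `Transport`.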