Standard Article

Attacking performance bottlenecks

Part 4. Bioinformatics

4.8. Modern Programming Paradigms in Biology

Tutorial

  1. Ruud van der Pas

Published Online: 15 DEC 2006

DOI: 10.1002/047001153X.g409423

Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics

How to Cite

van der Pas, R. 2006. Attacking performance bottlenecks. Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics. 4:4.8.

Author Information

  1. Sun Microsystems, Inc., Amersfoort, The Netherlands

Abstract

Multicore processors are becoming mainstream at a rapid pace. Thanks to this trend, even desktop and laptop computers are being turned into relatively small parallel computers.

Although the aggregate computing capacity is increasing, each individual core is less powerful than a single monolithic processor.

This affects the application developer in two ways. First and foremost, waiting for a faster processor is no longer a good strategy for dramatically improving performance, because future speed increases of the individual cores will be relatively modest. Second, to exploit all the computational power of the processor, the application has to be parallelized (or “multithreaded”) so that it can use the multiple cores.

Before parallelizing an application, however, it is strongly recommended to tune the performance of a single thread first. Good single-thread performance not only gives an immediate benefit, but is also a prerequisite for scalable parallel performance.

Sequential tuning can be divided into four phases. Three of these phases typically require only a modest effort; the fourth involves changes to the source code. The amount of time spent on that fourth phase depends on various constraints, such as the application's time to market, but the reward can be substantial.
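
The abstract does not say which source changes the fourth phase involves. As one hedged illustration of the kind of change that can pay off, the sketch below assumes a row-major matrix and contrasts two traversal orders. The function names are hypothetical and the example is not taken from the article.

```c
#include <stddef.h>

/* Hypothetical example: sum all elements of an n x n matrix that is
   stored row by row in one contiguous array. */

/* Column-first traversal: consecutive accesses are n elements apart,
   so most of each cache line that is fetched goes unused. */
double sum_column_order(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            sum += a[i * n + j];
    return sum;
}

/* Row-first traversal: the loops are interchanged, so memory is read
   with unit stride and every fetched cache line is fully used. */
double sum_row_order(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            sum += a[i * n + j];
    return sum;
}
```

Both functions visit the same elements; only the memory access pattern differs. Whether the interchange gives a measurable gain depends on the matrix size, the cache hierarchy, and what the compiler already does on its own, so it should be verified by measurement.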

Several programming models are available for implementing the parallelism. Each has its own advantages and drawbacks.
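
OpenMP, which appears among the keywords, is one such model. The following is a minimal sketch of how a loop might be parallelized with it, assuming a compiler with OpenMP support (with GCC, for example, this is enabled by the -fopenmp option). The function name dot is hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical example: dot product of two vectors of length n.
   The pragma asks the compiler to divide the loop iterations among
   threads. Each thread accumulates a private partial sum, and the
   reduction clause combines the partial sums when the loop ends. */
double dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++)
        sum += x[i] * y[i];
    return sum;
}
```

If OpenMP is not enabled, the pragma is ignored and the loop runs on a single thread, so the same source serves as both the sequential and the parallel version. Because the partial sums are combined in an unspecified order, the parallel result can differ from the sequential one in the last few bits.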

High-quality tools are indispensable when tuning and parallelizing an application. An advanced optimizing compiler is a must. A tool that identifies the most time-consuming parts of an application greatly shortens the tuning cycle. Preferably, such a tool also gives easy access to the low-level hardware event counters that most modern processors support. To increase parallel efficiency, it is very useful to be able to analyze application behavior at the thread level.

Keywords:

  • application performance;
  • multicore;
  • parallelization;
  • OpenMP;
  • compilers;
  • performance analysis tools;
  • correctness