Attacking performance bottlenecks
Part 4. Bioinformatics
4.8. Modern Programming Paradigms in Biology
Published Online: 15 DEC 2006
Copyright © 2005 John Wiley & Sons, Ltd
Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics
How to Cite
van der Pas, R. 2006. Attacking performance bottlenecks. Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics. 4:4.8.
Multicore processors are becoming mainstream at a rapid pace. Thanks to this trend, even desktop and laptop computers are being turned into relatively small parallel computers.
Although the aggregate computing capacity is increasing, the individual cores are less powerful than a single monolithic processor.
This affects the application developer in two ways. First and foremost, waiting for a faster processor is no longer a good strategy for dramatically improving performance, because the future speed increase of the individual cores is expected to be relatively modest. Second, to exploit all the computational power of the processor, the application has to be parallelized (or “multithreaded”) so that it can take advantage of the multiple cores.
Before parallelizing an application, however, it is strongly recommended to tune the performance of a single thread first. Good single-thread performance not only gives an immediate benefit, but is also a prerequisite for scalable parallel performance.
Four sequential tuning phases can be distinguished. Three of these phases typically require only a modest effort. The fourth involves changes to the source code. The amount of time spent on it depends on various constraints, such as the time to market for the application, but the reward can be substantial.
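As a minimal sketch of what a source-level change in the fourth phase might look like, the Python example below compares two single-threaded ways of computing the GC content of a DNA sequence. The function names, the input sequence, and the choice of GC content as the workload are illustrative assumptions and are not taken from the article.

```python
import timeit


def gc_content_naive(seq: str) -> float:
    """Count G and C bases one character at a time in interpreted code."""
    gc = 0
    for base in seq:
        if base == "G" or base == "C":
            gc += 1
    return gc / len(seq)


def gc_content_tuned(seq: str) -> float:
    """Same result, but the counting runs inside the built-in str.count."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical input: an 8-base pattern repeated to 800,000 bases.
sequence = "ACGTGGCA" * 100_000

# Both versions must agree before their timings are worth comparing.
assert gc_content_naive(sequence) == gc_content_tuned(sequence)

naive_time = timeit.timeit(lambda: gc_content_naive(sequence), number=3)
tuned_time = timeit.timeit(lambda: gc_content_tuned(sequence), number=3)
print(f"naive: {naive_time:.3f} s, tuned: {tuned_time:.3f} s")
```

The measured times depend on the machine and the interpreter, so no particular speedup is claimed here. The point of the sketch is the workflow: check that the rewritten code gives the same answer, then measure both versions on a single thread before considering parallelization.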
Several programming models are available for implementing the parallelism. Each has its own advantages and drawbacks.
High-quality tools are indispensable when tuning and parallelizing an application. An advanced optimizing compiler is a must. A tool that identifies the most time-consuming parts of an application greatly shortens the tuning cycle. Preferably, such a tool should also give easy access to the low-level hardware event counters that most modern processors support. To increase the parallel efficiency, it is very useful to be able to analyze the application behavior at the thread level.
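To illustrate how a profiling tool points at the most time-consuming parts of a program, the sketch below uses Python's standard cProfile and pstats modules. The three functions and the input sequence are hypothetical stand-ins for a real application. Note that cProfile is a function-level profiler only; it does not expose the hardware event counters mentioned above.

```python
import cProfile
import io
import pstats


def count_kmers(seq: str, k: int) -> dict:
    """Deliberately heavy routine: tally every k-mer in the sequence."""
    counts = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    return counts


def reverse_complement(seq: str) -> str:
    """Cheap routine, included for contrast in the profile."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def analyze(seq: str) -> int:
    """Hypothetical driver that calls both routines."""
    rc = reverse_complement(seq)
    return len(count_kmers(seq + rc, 6))


profiler = cProfile.Profile()
profiler.enable()
distinct_kmers = analyze("ACGTTGCAAC" * 20_000)
profiler.disable()

# Sort by cumulative time and keep the five most expensive entries.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

In the printed report, the entries with the largest cumulative time are the natural first candidates for tuning; in this sketch that is expected to be the k-mer counting loop rather than the reverse-complement step.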
- application performance;
- performance analysis tools;