
Parallel pairwise statistical significance estimation of local sequence alignment using Message Passing Interface library

Authors


  • A conference version of this paper with preliminary results appeared in the ECMLS Workshop Proceedings of HPDC 2010, pp. 470–476. [1]

Ankit Agrawal, Department of Electrical Engineering and Computer Science, Northwestern University, 2145 Sheridan Rd, Evanston, IL 60208, USA.

E-mail: ankitag@eecs.northwestern.edu

SUMMARY

Homology detection is a fundamental step in sequence analysis. In recent years, pairwise statistical significance has emerged as a promising alternative to database statistical significance for homology detection. Although more accurate, it is currently very time-consuming because it involves generating tens of hundreds of alignment scores to construct the empirical score distribution. This paper presents a parallel algorithm for pairwise statistical significance estimation, called MPIPairwiseStatSig, implemented in C using the MPI library. We further apply the parallelization technique to estimate non-conservative pairwise statistical significance using standard, sequence-specific, and position-specific substitution matrices, which has previously been shown to give better sequence comparison accuracy than the original pairwise statistical significance. Distributing the most compute-intensive portions of the pairwise statistical significance estimation procedure across multiple processors yields near-linear speed-ups for the application. The MPIPairwiseStatSig program for pairwise statistical significance estimation is available for free academic use at www.cs.iastate.edu/~ankitag/MPIPairwiseStatSig.html. Copyright © 2011 John Wiley & Sons, Ltd.
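The idea summarized above can be illustrated with a small sketch: score one sequence against many shuffled copies of the other, with the shuffles split into blocks as they might be across processes. This is not the MPIPairwiseStatSig code. It is written in Python rather than C, the "ranks" are simulated serially so that it runs without an MPI installation, and all function names are hypothetical. The scoring scheme (fixed match/mismatch values and a linear gap penalty) and the count-based p-value are simplifying assumptions, standing in for the substitution matrices and score-distribution estimation the summary refers to.

```python
import random


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score (assumed linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def chunk_bounds(n_items, n_ranks, rank):
    """Half-open range [start, stop) owned by `rank` in a block distribution."""
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def shuffled_scores(seq1, seq2, n_shuffles, n_ranks, seed=0):
    """Scores of seq1 against shuffled copies of seq2, computed block by block."""
    scores = []
    for rank in range(n_ranks):  # stands in for independent processes
        start, stop = chunk_bounds(n_shuffles, n_ranks, rank)
        for i in range(start, stop):
            # Seeding per shuffle index makes the result independent of n_ranks.
            rng = random.Random(seed * 1_000_003 + i)
            letters = list(seq2)
            rng.shuffle(letters)
            scores.append(local_alignment_score(seq1, "".join(letters)))
    return scores  # with real message passing, blocks would be gathered at a root


def empirical_p_value(observed, null_scores):
    """Fraction of null scores at least as large as observed (add-one corrected)."""
    hits = sum(s >= observed for s in null_scores)
    return (1 + hits) / (1 + len(null_scores))


if __name__ == "__main__":
    s1, s2 = "MKTAYIAKQRQISFVKSHFSRQ", "MKTAYIAKQRNISFVKAHFSRQ"  # arbitrary example strings
    observed = local_alignment_score(s1, s2)
    null = shuffled_scores(s1, s2, n_shuffles=200, n_ranks=4)
    print(observed, empirical_p_value(observed, null))
```

Because every shuffle is an independent alignment, the blocks need no communication until the scores are collected, which is consistent with the near-linear speed-ups reported in the summary.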
