Special Issue Paper
Biocompute 2.0: an improved collaborative workspace for data intensive bio-science
Article first published online: 27 JUN 2011
DOI: 10.1002/cpe.1782
Copyright © 2011 John Wiley & Sons, Ltd.
Issue

Concurrency and Computation: Practice and Experience
Volume 23, Issue 17, pages 2305–2314, 10 December 2011
Additional Information
How to Cite
Carmichael, R., Braga-Henebry, P., Thain, D. and Emrich, S. (2011), Biocompute 2.0: an improved collaborative workspace for data intensive bio-science. Concurrency Computat.: Pract. Exper., 23: 2305–2314. doi: 10.1002/cpe.1782
Publication History
- Issue published online: 20 OCT 2011
- Article first published online: 27 JUN 2011
- Manuscript Accepted: 17 APR 2011
- Manuscript Revised: 26 MAR 2011
- Manuscript Received: 1 OCT 2010
Funded by
- Notre Dame's strategic investment in Global Health, Genomics and Bioinformatics and by NSF. Grant Number: CNS0643229
Keywords:
- bioinformatics;
- web portal;
- makeflow;
- interface design
SUMMARY
The explosion of data in the biological community requires scalable and flexible portals for bioinformatics. To help address this need, we proposed characteristics needed for rigorous, reproducible, and collaborative resources for data-intensive science. Implementing a system with these characteristics exposed challenges in user interface, data distribution, and workflow description/execution. We describe ongoing responses to these and other challenges. Our Data-Action-Queue design pattern addresses user interface and system organization concepts. A dynamic data distribution mechanism lays the foundation for the management of persistent datasets. Makeflow facilitates the simple description and execution of complex multi-part jobs and forms the kernel of a module system powering diverse bioinformatics applications. Our improved web portal, Biocompute 2.0, has been in production use since the summer of 2010. Through it and its predecessor, we have provided over 56 years of CPU time to research groups at three universities via five modules: BLAST, SSAHA, SHRIMP, BWA, and SNPEXP. In this paper, we describe the system's goals and interface, its architecture and performance, and the insights gained in its development.
