An experience report: porting the MG-RAST rapid metagenomics analysis pipeline to the cloud


Folker Meyer, Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL, USA.



Existing applications in computational biology typically favor an integrated computational platform based on a local cluster. We present a lessons-learned report on scaling up an existing metagenomics application that outgrew the available local cluster hardware. In our example, removing a number of assumptions linked to tight integration allowed us to expand beyond one administrative domain, increase the number and types of machines available to the application, and improve the application's scaling properties. The assumptions made in designing the computational client make it well suited for deployment as a virtual machine inside a cloud. This paper discusses the decision process and describes the suitability of deploying various bioinformatics computations to distributed heterogeneous machines. Copyright © 2011 John Wiley & Sons, Ltd.