Research Article
Fault-tolerant execution of large parameter sweep applications across multiple VOs with storage constraints
Article first published online: 25 AUG 2008
DOI: 10.1002/cpe.1353
Copyright © 2008 John Wiley & Sons, Ltd.
Issue
Concurrency and Computation: Practice and Experience
Special Issue: The Best of CCGrid'2007: A Snapshot of an ‘Adolescent’ Area
Volume 21, Issue 3, pages 377–392, 10 March 2009
Additional Information
How to Cite
Ayyub, S., Abramson, D., Enticott, C., Garic, S. and Tan, J. (2009), Fault-tolerant execution of large parameter sweep applications across multiple VOs with storage constraints. Concurrency and Computation: Practice and Experience, 21: 377–392. doi: 10.1002/cpe.1353
Publication History
- Issue published online: 30 JAN 2009
- Article first published online: 25 AUG 2008
- Manuscript Accepted: 7 MAY 2008
- Manuscript Revised: 30 MAR 2008
- Manuscript Received: 11 DEC 2007
Funded by
- Australian Research Council
- CSIRO Division of Atmospheric Research
Keywords:
- e-science;
- parameter sweep applications;
- Grid
Abstract
Applications that span multiple virtual organizations (VOs) are of great interest to the e-science community. However, our recent attempts to execute large-scale parameter sweep applications (PSAs) for real-world climate studies with the Nimrod/G tool have exposed problems in the areas of fault tolerance, data storage and trust management. In response, we have implemented a task-splitting approach that facilitates breaking up large PSAs into a sequence of dependent subtasks, improving fault tolerance; provides a garbage collection technique that deletes unnecessary data; and employs a trust delegation technique that facilitates flexible third party data transfers across different VOs.
