• 1
    The chipping forecast II. Nature Genetics 2002; 32(4s):461–552.
  • 2
    Clarke R, Ressom HW, Wang A, Xuan J, Liu MC, Gehan EA, Wang Y. The properties of high-dimensional data spaces: implications for exploring gene and protein expression data. Nature Reviews Cancer 2008; 8(1):37–49.
  • 3
    Wang Y, Miller DJ, Clarke R. Approaches to working in high-dimensional data spaces: gene expression microarrays. British Journal of Cancer 2008; 98(6):1023–1028.
  • 4
    Breitling R, Armengaud P, Amtmann A, Herzyk P. Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments. FEBS Letters 2004; 573:83–92. DOI: 10.1016/j.febslet.2004.07.055.
  • 5
    Dudley J, Pouliot Y, Chen R, Morgan A, Butte A. Translational bioinformatics in the cloud: an affordable alternative. Genome Medicine 2010; 2(8):51.
  • 6
    Giancarlo R, Scaturro D, Utro F. Computational cluster validation for microarray data analysis: experimental assessment of clest, consensus clustering, figure of merit, gap statistics and model explorer. BMC Bioinformatics 2008; 9(1):462.
  • 7
    Hill J, Hambley M, Forster T, Mewissen M, Sloan T, Scharinger F, Trew A, Ghazal P. SPRINT: a new parallel framework for R. BMC Bioinformatics 2008; 9(1):558.
  • 8
    Mewissen M. SPRINT – user requirements survey results. Technical Report, Division of Pathway Medicine, University of Edinburgh, 2010.
  • 9
    Message Passing Interface Forum. MPI: a message-passing interface standard version 2.2, 2009.
  • 10
    Petrou S, Sloan TM, Mewissen M, Forster T, Piotrowski M, Dobrzelecki B, Ghazal P, Trew A, Hill J. Optimization of a parallel permutation testing function for the SPRINT R package. Concurrency and Computation: Practice and Experience 2011; 23(17):2258–2268. DOI: 10.1002/cpe.1787.
  • 11
    Breiman L. Random forests. Machine Learning 2001; 45:5–32.
  • 12
    Liaw A, Wiener M. Classification and regression by randomForest. R News 2002; 2(3):18–22.
  • 13
    Hong F, Breitling R, McEntee CW, Wittner BS, Nemhauser JL, Chory J. RankProd: a bioconductor package for detecting differentially expressed genes in meta-analysis. Bioinformatics 2006; 22(22):2825–2827. DOI: 10.1093/bioinformatics/btl476.
  • 14
    Ihaka R, Gentleman R. R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics 1996; 5(3):299–314.
  • 15
    R Development Core Team. R: a language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria, 2010. (Available from:, ISBN 3-900051-07-0 [accessed 1st September 2011].
  • 16
    Yu H. Interface (wrapper) to MPI (Message-Passing Interface), 2010. (Available from:, r package version 0.5-9 [accessed 1st September 2011].
  • 17
    Vera G, Jansen R, Suppi R. R/parallel – speeding up bioinformatics analysis with R. BMC Bioinformatics 2008; 9(1):390.
  • 18
    Samatova NF, Bauer D, Yoginath S. Task-parallel R, 2009. (Available from:, r package version 0.34 [accessed 1st September 2011].
  • 19
    REvolution Computing. foreach: Foreach looping construct for R, 2009. (Available from:, r package version 1.3.0 [accessed 1st September 2011].
  • 20
    Breiman L. Classification and Regression Trees. Wadsworth: Belmont, California, 1984.
  • 21
    Quinlan JR. Induction of decision trees. Machine Learning 1986; 1(1):81–106.
  • 22
    Kotsiantis SB. Supervised machine learning: a review of classification techniques. Informatica 2007; 31:249–268.
  • 23
    Joshi MV, Karypis G, Kumar V. ScalParC: a new scalable and efficient parallel classification algorithm for mining large datasets. Proceedings of the International Parallel Processing Symposium, Orlando, Florida, USA, 1998; 573–579.
  • 24
    Joshi MV, Karypis G, Kumar V. ScalParC: a new scalable and efficient parallel classification algorithm for mining large datasets. Technical Report, University of Minnesota, 1998. (Available from: accessed 1st September 2011.
  • 25
    Shafer JC, Agrawal R, Mehta M. SPRINT: a scalable parallel classifier for data mining. In VLDB'96, Proceedings of 22th International Conference on Very Large Data Bases, September 3-6, 1996, Vijayaraman TM, Buchmann AP, Mohan C, Sarda NL (eds). Morgan Kaufmann: Mumbai (Bombay), India, 1996; 544–555.
  • 26
    Topić G, Šmuc T, Šojat Z, Skala K. Reimplementation of the random forest algorithm. In Parallel Numerics '05, Vajteršic M, Trobec R, Zinterhof P (eds). University of Salzburg: Salzburg, Austria, 2005; 119–125.
  • 27
    Manilich EA, Ozsoyoglu ZM, Trubachev V, Radivoyevitch T. Classification of large microarray data sets using fast random forest generation. Proceedings of the 9th Computational Systems Bioinformatics Conference, Stanford, California, USA, 2010; 82–91.
  • 28
    Schwarz DF, König IR, Ziegler A. On safari to Random Jungle: a fast implementation of Random Forests for high-dimensional data. Bioinformatics 2010; 26(14):1752–1758.
  • 29
    DeRisi JL, Iyer VR, Brown PO. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 1997; 278(5338):680–686. DOI: 10.1126/science.278.5338.680.
  • 30
    Koziol JA. Comments on the rank product method for analyzing replicated experiments. FEBS Letters 2010; 584(5):941–944.
  • 31
    Akl SG. Parallel Sorting Algorithms. Academic Press: Orlando, Florida, 1985.
  • 32
    Shi H, Schaeffer J. Parallel sorting by regular sampling. Journal of Parallel and Distributed Computing 1992; 14:361–372.
  • 33
    Helman DR, JáJá J, Bader DA. A new deterministic parallel sorting algorithm with an experimental evaluation. Journal of Experimental Algorithmics 1998; 3:4.
  • 34
    Smith CL, Dickinson P, Craigon M, Ross A, Khondoker MR, Forster T, Ivens A, Lynn DJ, Orme J, Jackson A, Lacaze P, Stenson BJ, Ghazal P. Host immune-metabolic network response detecting human neonatal bacterial infection. Journal of Clinical Investigation 2011. Submitted.
  • 35
    Ziegler A, DeStefano AL, König IR. Data mining, neural nets, trees – problems 2 and 3 of genetic analysis workshop 15. Genetic Epidemiology 2007; 31(Supplement 1):S51–S60.
  • 36
    Schwarz D, Szymczak S, Ziegler A, Konig I. Picking single-nucleotide polymorphisms in forests. BMC Proceedings 2007; 1(Suppl 1):S59.
  • 37
    Meng Y, Yu Y, Cupples LA, Farrer L, Lunetta K. Performance of random forest when SNPs are in linkage disequilibrium. BMC Bioinformatics 2009; 10(1):78. DOI: 10.1186/1471-2105-10-78.