UNIT 10.5 Using Galaxy to Perform Large-Scale Interactive Data Analyses

  1. Jennifer Hillman-Jackson (1),
  2. Dave Clements (2),
  3. Daniel Blankenberg (1),
  4. James Taylor (2),
  5. Anton Nekrutenko (1),
  6. Galaxy Team (1, 2)

Published Online: 1 JUN 2012

DOI: 10.1002/0471250953.bi1005s38

Current Protocols in Bioinformatics

How to Cite

Hillman-Jackson, J., Clements, D., Blankenberg, D., Taylor, J., Nekrutenko, A., and Galaxy Team. 2012. Using Galaxy to Perform Large-Scale Interactive Data Analyses. Current Protocols in Bioinformatics. 38:10.5:10.5.1–10.5.47.

Author Information

  1. Penn State University, University Park, Pennsylvania
  2. Emory University, Atlanta, Georgia

Literature Cited

  • Birney, E., Andrews, D., Bevan, P., Caccamo, M., Cameron, G., Chen, Y., Clarke, L., Coates, G., Cox, T., Cuff, J., Curwen, V., Cutts, T., Down, T., Durbin, R., Eyras, E., Fernandez-Suarez, X.M., Gane, P., Gibbins, B., Gilbert, J., Hammond, M., Hotz, H., Iyer, V., Kahari, A., Jekosch, K., Kasprzyk, A., Keefe, D., Keenan, S., Lehvaslaiho, H., McVicker, G., Melsopp, C., Meidl, P., Mongin, E., Pettett, R., Potter, S., Proctor, G., Rae, M., Searle, S., Slater, G., Smedley, D., Smith, J., Spooner, W., Stabenau, A., Stalker, J., Storey, R., Ureta-Vidal, A., Woodwark, C., Clamp, M., and Hubbard, T. 2004. Ensembl 2004. Nucleic Acids Res. 32:D468-D470.
  • Blankenberg, D., Taylor, J., Schenck, I., He, J., Zhang, Y., Ghent, M., Veeraraghavan, N., Albert, I., Miller, W., Makova, K.D., Hardison, R.C., and Nekrutenko, A. 2007. A framework for collaborative analysis of ENCODE data: Making large-scale analyses biologist-friendly. Genome Res. 17:960-964.
  • Blankenberg, D., Gordon, A., Von Kuster, G., Coraor, N., Taylor, J., Nekrutenko, A.; Galaxy Team. 2010. Manipulation of FASTQ data with Galaxy. Bioinformatics 26:1783-1785.
  • Blankenberg, D., Taylor, J., Nekrutenko, A.; Galaxy Team. 2011. Making whole genome multiple alignments usable for biologists. Bioinformatics 27:2426-2428.
  • Fujita, P.A., Rhead, B., Zweig, A.S., Hinrichs, A.S., Karolchik, D., Cline, M.S., Goldman, M., Barber, G.P., Clawson, H., Coelho, A., Diekhans, M., Dreszer, T.R., Giardine, B.M., Harte, R.A., Hillman-Jackson, J., Hsu, F., Kirkup, V., Kuhn, R.M., Learned, K., Li, C.H., Meyer, L.R., Pohl, A., Raney, B.J., Rosenbloom, K.R., Smith, K.E., Haussler, D., and Kent, W.J. 2011. The UCSC Genome Browser database: Update 2011. Nucleic Acids Res. 39:D876-D882.
  • Giardine, B., Riemer, C., Hardison, R.C., Burhans, R., Elnitski, L., Shah, P., Zhang, Y., Blankenberg, D., Albert, I., Taylor, J., Miller, W., Kent, W.J., and Nekrutenko, A. 2005. Galaxy: A platform for interactive large-scale genome analysis. Genome Res. 15:1451-1455.
  • Goecks, J., Nekrutenko, A., Taylor, J.; Galaxy Team. 2010. Galaxy: A comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11:R86.
  • Gupta, R., Bhattacharyya, A., Agosto-Perez, F.J., Wickramasinghe, P., and Davuluri, R.V. 2011. MPromDb update 2010: An integrated resource for annotation and visualization of mammalian gene promoters and ChIP-seq experimental data. Nucleic Acids Res. 39:D92-D97.
  • Karolchik, D., Baertsch, R., Diekhans, M., Furey, T.S., Hinrichs, A., Lu, Y.T., Roskin, K.M., Schwartz, M., Sugnet, C.W., Thomas, D.J., Weber, R.J., Haussler, D., Kent, W.J.; University of California Santa Cruz. 2003. The UCSC Genome Browser Database. Nucleic Acids Res. 31:51-54.
  • Karolchik, D., Hinrichs, A.S., Furey, T.S., Roskin, K.M., Sugnet, C.W., Haussler, D., and Kent, W.J. 2004. The UCSC table browser data retrieval tool. Nucleic Acids Res. 32:D493-D496.
  • Langmead, B., Trapnell, C., Pop, M., and Salzberg, S.L. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10:R25.
  • Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., and 1000 Genome Project Data Processing Subgroup. 2009. The sequence alignment/map (SAM) format and SAMtools. Bioinformatics 25:2078-2079.
  • Maglott, D., Ostell, J., Pruitt, K.D., and Tatusova, T. 2005. Entrez gene: Gene-centered information at NCBI. Nucleic Acids Res. 33:D54-D58.
  • Park, P.J. 2009. ChIP-seq: Advantages and challenges of a maturing technology. Nat. Rev. Genet. 10:669-680.
  • Pepke, S., Wold, B., and Mortazavi, A. 2009. Computation for ChIP-seq and RNA-seq studies. Nat. Methods 6:S22-S32.
  • Phillips, J.E. and Corces, V.G. 2009. CTCF: Master weaver of the genome. Cell 137:1194-1211.
  • Pruitt, K.D., Tatusova, T., and Maglott, D.R. 2005. NCBI Reference Sequence (RefSeq): A curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 33:D501-D504.
  • Raney, B.J., Cline, M.S., Rosenbloom, K.R., Dreszer, T.R., Learned, K., Barber, G.P., Meyer, L.R., Sloan, C.A., Malladi, V.S., Roskin, K.M., Suh, B.B., Hinrichs, A.S., Clawson, H., Zweig, A.S., Kirkup, V., Fujita, P.A., Rhead, B., Smith, K.E., Pohl, A., Kuhn, R.M., Karolchik, D., Haussler, D., and Kent, W.J. 2011. ENCODE whole-genome data in the UCSC genome browser (2011 update). Nucleic Acids Res. 39:D871-D875.
  • Rosenbloom, K.R., Dreszer, T.R., Pheasant, M., Barber, G.P., Meyer, L.R., Pohl, A., Raney, B.J., Wang, T., Hinrichs, A.S., Zweig, A.S., Fujita, P.A., Learned, K., Rhead, B., Smith, K.E., Kuhn, R.M., Karolchik, D., Haussler, D., and Kent, W.J. 2009. ENCODE whole-genome data in the UCSC Genome Browser. Nucleic Acids Res. 38:D620-D625.
  • Schneider, K.L., Pollard, K.S., Baertsch, R., Pohl, A., and Lowe, T.M. 2006. The UCSC Archaeal Genome Browser. Nucleic Acids Res. 34:D407-D410.
  • Sherry, S.T., Ward, M.H., Kholodov, M., Baker, J., Phan, L., Smigielski, E.M., and Sirotkin, K. 2001. dbSNP: The NCBI database of genetic variation. Nucleic Acids Res. 29:308-311.
  • Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nusbaum, C., Myers, R.M., Brown, M., Li, W., and Liu, X.S. 2008. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9:R137.