24. Computational Infrastructure and Basic Data Analysis for Next-Generation Sequencing

Editors:

  1. Dr. Matthias Harbers (2,3) and
  2. Prof. Dr. Günter Kahl (4,5,6)

Author:

  1. David Sexton

Published Online: 23 JAN 2012

DOI: 10.1002/9783527644582.ch24

Tag-Based Next Generation Sequencing

How to Cite

Sexton, D. (2011) Computational Infrastructure and Basic Data Analysis for Next-Generation Sequencing, in Tag-Based Next Generation Sequencing (eds M. Harbers and G. Kahl), Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim, Germany. doi: 10.1002/9783527644582.ch24

Editor Information

  2. 4-2-6 Nishihara, Kashiwa-Shi, Chiba 277-0885, Japan

  3. DNAFORM Inc., Leading Venture Plaza 2, 75-1 Ono-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0046, Japan

  4. Mohrmühlgasse 3, 63500 Seligenstadt, Germany

  5. University of Frankfurt am Main Biocenter, Max-von-Lauestraße 9, 60439 Frankfurt am Main, Germany

  6. Frankfurt Biotechnology Innovation Center (FIZ), GenXPro Ltd, Altenhöferallee 3, 60438 Frankfurt am Main, Germany

Author Information

  1. Baylor Medical College Human Genome Sequencing Center, 2005 South Mason Rd #906, Katy, TX 77450, USA

Publication History

  1. Published Online: 23 JAN 2012
  2. Published Print: 14 DEC 2011

ISBN Information

Print ISBN: 9783527328192

Online ISBN: 9783527644582



Keywords

  • computational infrastructure;
  • basic data analysis;
  • next-generation sequencing;
  • background;
  • applications;
  • perspectives


Mapping short sequence tags in particular, and next-generation sequencing in general, requires enormous computational infrastructure. A single sequencing run can generate up to 100 Gb of sequence and requires almost 3 TB of data storage. Transferring and analyzing data on this scale demands a data infrastructure built for the task, and there are several ways to create it. The group operating the sequencing instrument can purchase and install local servers and data storage; communal resources at the home institution can be used; or newer external cloud-based solutions can store and analyze the data. Each sequencing platform has slightly different infrastructure requirements, so the IT solution must either be tailored to the systems in use or generalized to cover all possible instruments. The majority of installed instruments are based on Illumina technology, with a smaller portion being ABI or 454. Presented here is a solution generalized for any instrument, covering not just the IT requirements but the bioinformatics requirements as well.