26. Multidimensional Context of Sequence Tags: Biological Data Integration

Editors:

  1. Dr. Matthias Harbers (2,3) and
  2. Prof. Dr. Günter Kahl (4,5,6)

Authors:

  1. Korbinian Grote and
  2. Thomas Werner

Published Online: 23 JAN 2012

DOI: 10.1002/9783527644582.ch26

Tag-Based Next Generation Sequencing

How to Cite

Grote, K. and Werner, T. (2011) Multidimensional Context of Sequence Tags: Biological Data Integration, in Tag-Based Next Generation Sequencing (eds M. Harbers and G. Kahl), Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim, Germany. doi: 10.1002/9783527644582.ch26

Editor Information

  2. 4-2-6 Nishihara, Kashiwa-Shi, Chiba 277-0885, Japan
  3. DNAFORM Inc., Leading Venture Plaza 2, 75-1 Ono-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0046, Japan
  4. Mohrmühlgasse 3, 63500 Seligenstadt, Germany
  5. University of Frankfurt am Main Biocenter, Max-von-Lauestraße 9, 60439 Frankfurt am Main, Germany
  6. Frankfurt Biotechnology Innovation Center (FIZ), GenXPro Ltd, Altenhöferallee 3, 60438 Frankfurt am Main, Germany

Author Information

  1. Genomatix Software GmbH, Bayerstrasse 85a, 80335 Munich, Germany

Publication History

  1. Published Online: 23 JAN 2012
  2. Published Print: 14 DEC 2011

ISBN Information

Print ISBN: 9783527328192

Online ISBN: 9783527644582

Keywords:

  • multidimensional context;
  • sequence tags;
  • biological data integration;
  • methods

Summary

Next-generation sequencing (NGS) has revolutionized sequence data generation in many fields of biology. In genotyping approaches, NGS covers whole genomes randomly and multiple times, so a wealth of information becomes available at an ever-increasing pace. Assembly, mapping, and clustering are essential data-handling steps that must precede any NGS data analysis. Biological meaning, however, is derived from “annotation”: putting the reads and clusters into a known or deduced biological context. Annotation of NGS read data is currently done in four major dimensions: (i) genome annotation, (ii) transcriptome annotation, (iii) phylogenetic annotation, and (iv) functional annotation. Genome annotation can be taken directly from genome browsers or similar databases (passive annotation); all other dimensions require specific sequence analyses to be carried out (active annotation). Comparative sequence analysis is the most powerful method for deriving functional information from such data, especially about aspects of gene regulation. This chapter details how such active annotation can be carried out in clearly defined segments.