Mass spectrometric genomic data mining: Novel insights into bioenergetic pathways in Chlamydomonas reinhardtii

Authors

  • Jens Allmer,

    1. Plant Science Institute, Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
    2. Present address: Department of Biology, Institute of Plant Biochemistry and Biotechnology, University of Münster, Hindenburgplatz 55, 48143 Münster, Germany
  • Bianca Naumann,

    1. Plant Science Institute, Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
    2. Present address: Department of Biology, Institute of Plant Biochemistry and Biotechnology, University of Münster, Hindenburgplatz 55, 48143 Münster, Germany
  • Christine Markert,

    1. Plant Science Institute, Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
  • Monica Zhang,

    1. Plant Science Institute, Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
  • Michael Hippler Dr.

    1. Plant Science Institute, Department of Biology, University of Pennsylvania, Philadelphia, PA, USA

Abstract

A new high-throughput computational strategy was established that improves genomic data mining from MS experiments. The MS/MS data were analyzed with the SEQUEST search algorithm and with de novo amino acid sequencing combined with an error-tolerant database search tool, both running on a 256-processor computer cluster. The error-tolerant search tool, previously established as GenomicPeptideFinder (GPF), enables detection of intron-split and/or alternatively spliced peptides from MS/MS data when these are deduced from genomic DNA. Isolated thylakoid membranes from the eukaryotic green alga Chlamydomonas reinhardtii were separated by 1-D SDS gel electrophoresis; protein bands were excised from the gel, digested in-gel with trypsin, and analyzed by nano-flow LC coupled with MS/MS. The concerted action of SEQUEST and GPF allowed identification of 2622 distinct peptides. In total, 448 peptides were identified by GPF analysis alone, including 98 intron-split peptides, resulting in the identification of novel proteins, improved annotation of gene models, and evidence of alternative splicing.
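To illustrate what an intron-split peptide is, the following minimal sketch searches a toy genomic sequence for a peptide whose coding sequence is interrupted by an intron. It is not the GPF algorithm. The function names (`translate`, `find_all`, `find_split_peptide`) and the parameters `min_part` and `min_intron` are hypothetical and introduced only for this example. The sketch assumes the standard genetic code, considers only the three forward reading frames, and only handles introns that fall between two complete codons.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (assumption:
# the nuclear standard code is adequate for this toy example).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame):
    """Translate the forward strand of `dna` starting at offset `frame` (0, 1 or 2)."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def find_all(text, sub):
    """Yield every start index of `sub` in `text`, including overlapping ones."""
    start = text.find(sub)
    while start != -1:
        yield start
        start = text.find(sub, start + 1)


def find_split_peptide(genome, peptide, min_part=2, min_intron=4):
    """Return (prefix, suffix, exon1_end, exon2_start) tuples for every way the
    peptide can be encoded as two pieces separated by at least `min_intron`
    nucleotides. Coordinates are 0-based nucleotide positions; exon1_end is
    exclusive."""
    frames = [translate(genome, f) for f in range(3)]
    hits = []
    for cut in range(min_part, len(peptide) - min_part + 1):
        prefix, suffix = peptide[:cut], peptide[cut:]
        for f1, prot1 in enumerate(frames):
            for i in find_all(prot1, prefix):
                exon1_end = f1 + 3 * (i + cut)
                for f2, prot2 in enumerate(frames):
                    for j in find_all(prot2, suffix):
                        exon2_start = f2 + 3 * j
                        if exon2_start - exon1_end >= min_intron:
                            hits.append((prefix, suffix, exon1_end, exon2_start))
    return hits


# Toy genome: exon 1 encodes MKT, a 13-nt intron follows, exon 2 encodes LLR.
genome = "ATGAAAACT" + "GTAAGTTTTTCAG" + "CTGCTTCGT"
print(find_split_peptide(genome, "MKTLLR"))
```

In this toy genome the peptide MKTLLR does not occur in any single reading frame, because the 13-nucleotide intron shifts the frame between the two exons. A search restricted to contiguous translations would therefore miss it, which is the kind of case an intron-aware search is meant to recover.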
