These authors contributed equally to this work.
Standardisation & Guidelines
The PSI semantic validator: A framework to check MIAPE compliance of proteomics data
Article first published online: 15 OCT 2009
DOI: 10.1002/pmic.200900189
Copyright © 2009 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
Additional Information
How to Cite
Montecchi-Palazzi, L., Kerrien, S., Reisinger, F., Aranda, B., Jones, A. R., Martens, L. and Hermjakob, H. (2009), The PSI semantic validator: A framework to check MIAPE compliance of proteomics data. Proteomics, 9: 5112–5119. doi: 10.1002/pmic.200900189
Publication History
- Issue published online: 17 NOV 2009
- Article first published online: 15 OCT 2009
- Manuscript Accepted: 25 AUG 2009
- Manuscript Revised: 13 JUL 2009
- Manuscript Received: 19 MAR 2009
Funded by
- Wellcome Trust. Grant Number: WT085949MA
- European Commission under FELICS. Grant Number: 021902 (RII3)
- Research Infrastructure Action of the FP6 “Structuring the European Research Area” Programme
- Center for Bioinformatics, Eberhard Karls University Tübingen
Keywords:
- Bioinformatics;
- Data mining;
- Semantic validation
Abstract
The Human Proteome Organization's Proteomics Standards Initiative (PSI) promotes the development of exchange standards to improve data integration and interoperability. PSI specifies the suitable level of detail required when reporting a proteomics experiment (via the Minimum Information About a Proteomics Experiment, MIAPE), and provides extensible markup language (XML) exchange formats and dedicated controlled vocabularies (CVs) that must be combined to generate a standard-compliant document. The framework presented here tackles the issue of checking that experimental data reported using a specific format, CVs and public bio-ontologies (e.g. Gene Ontology, NCBI taxonomy) are compliant with the MIAPE recommendations. The semantic validator not only checks the XML syntax but also enforces rules regarding the use of ontology classes or CV terms, by checking that the terms exist in the resource and that they are used in the correct location of a document. Moreover, this framework is extremely fast, even on sizable data files, and flexible, as it can be adapted to any standard by customizing the parameters it requires: an XML Schema Definition (XSD), one or more CVs or ontologies, and a mapping file describing in a formal way how the semantic resources and the format are interrelated. As such, the validator provides a general solution to a common problem in data exchange: how to validate the correct usage of a data standard beyond simple XSD validation. The framework source code and its various applications can be found at http://psidev.info/validator.
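
The two semantic checks described in the abstract (a term must exist in the vocabulary, and it must be used in the correct location of the document) can be illustrated with a minimal sketch. This is not the PSI validator's own code or API: the toy vocabulary, the rule table, the element names and the `validate` function are all hypothetical, chosen only for illustration. The sketch checks XML well-formedness but performs no XSD validation, which the Python standard library does not provide.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy CV: term accession -> term name (not real PSI content).
TOY_CV = {
    "TOY:0001": "instrument model",
    "TOY:0002": "example instrument A",
    "TOY:0003": "software",
}

# Hypothetical stand-in for a mapping file:
# element path -> set of accessions allowed at that location.
TOY_RULES = {
    "./instrument/cvParam": {"TOY:0002"},
    "./software/cvParam": {"TOY:0003"},
}


def validate(xml_text, cv, rules):
    """Return a list of messages; an empty list means no problem was found."""
    # ET.fromstring raises ParseError if the text is not well-formed XML.
    root = ET.fromstring(xml_text)
    messages = []
    for path, allowed in rules.items():
        for elem in root.findall(path):
            accession = elem.get("accession")
            if accession not in cv:
                # Check 1: the term must exist in the vocabulary.
                messages.append(f"{path}: unknown term {accession}")
            elif accession not in allowed:
                # Check 2: the term must be permitted at this location.
                messages.append(f"{path}: term {accession} not allowed here")
    return messages


# Hypothetical document: one valid use, one misplaced term, one unknown term.
DOC = """
<experiment>
  <instrument><cvParam accession="TOY:0002"/></instrument>
  <software><cvParam accession="TOY:0002"/></software>
  <software><cvParam accession="TOY:9999"/></software>
</experiment>
"""

if __name__ == "__main__":
    for message in validate(DOC, TOY_CV, TOY_RULES):
        print(message)
```

Run against the sample document, the sketch reports the misplaced term under `software` and the unknown accession, while the `instrument` entry passes. Keeping the rules in a separate table mirrors the idea that the same checking code can be pointed at a different standard by swapping the schema, the vocabularies and the mapping.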

