Automated DNA-based plant identification for large-scale biodiversity assessment
Version of Record online: 12 APR 2014
© 2014 John Wiley & Sons Ltd
Molecular Ecology Resources
Volume 15, Issue 1, pages 136–152, January 2015
How to Cite
Papadopoulou, A., Chesters, D., Coronado, I., De la Cadena, G., Cardoso, A., Reyes, J. C., Maes, J.-M., Rueda, R. M. and Gómez-Zurita, J. (2015), Automated DNA-based plant identification for large-scale biodiversity assessment. Molecular Ecology Resources, 15: 136–152. doi: 10.1111/1755-0998.12256
- Issue online: 17 DEC 2014
- Accepted manuscript online: 25 MAR 2014 11:36AM EST
- Manuscript Accepted: 22 FEB 2014
- Manuscript Revised: 17 FEB 2014
- Manuscript Received: 18 DEC 2013
- Spanish private fund
- Fundación BBVA
- Spanish Ministry of Foreign Affairs and Cooperation. Grant Number: PCI-AECID 2010
- Spanish Ministry of Science and Innovation
- Chinese Academy of Sciences. Grant Number: KSXC2-EW-B-02
- Chinese National Science Foundation. Grant Numbers: 30870268, 31172048, J1210002
Table S1 Species of angiosperms from the Pacific side of Nicaragua and the northern province of Estelí used to construct a local reference psbA-trnH sequence database for automated identification of flora in the Mesoamerican seasonally deciduous tropical forest. Alongside the plant taxonomy, geographical sources, sample voucher numbers (UNAN Herbarium, IBE-JGZ DNA collection) and public sequence database accession numbers are given.
Table S2 Species of Cassidinae (Coleoptera: Chrysomelidae) of known diet used to test the performance of the automated DNA-based identification pipeline. Several individuals for each species were used to retrieve psbA-trnH sequences from whole-specimen DNA extractions, thus likely representing plant tissue remains in the insect gut. The number of specimens tested, of putative diet sequences retrieved and the results obtained with the pipeline are given.
Table S3 GenBank psbA-trnH sequences removed a posteriori from the curated database for automated identification due to problematic taxonomic annotations.
Table S4 Results of BAGpipe automated taxonomic assignment of 104 infertile plant samples based on psbA-trnH sequences and a reference database including GenBank data and a custom Nicaraguan sclerophyll deciduous tropical forest database.
Fig. S1 Distribution of minimum p-distances between 114 Nicaraguan SDTF sequences and their conspecifics from GenBank.
Fig. S2 Maximum-likelihood tree with clade support values above 70% produced automatically by the BAGpipe pipeline for a group of 16 putative diet sequences of the cassid Parorectis rugosa. The pipeline isolated a group of closely related sequences from GenBank and our purpose-built SDTF local reference (NPL codes) and placed the diet sequences within a Physalis clade with high confidence.
Fig. S3 Maximum-likelihood tree with clade support values above 70% produced automatically by the BAGpipe pipeline for a group of seven putative diet sequences of the cassid Physonota alutacea. The pipeline isolated a group of closely related sequences from GenBank and our purpose-built SDTF local reference (NPL codes) and placed the diet sequences with high confidence within a specific clade of Cordia (+Varronia) sequences.
Appendix S1 Examples of GenBank sequences, annotated as psbA-trnH but not retrieved by any of the similarity searches performed.