Volume 17, Issue 1
Special Issue: Population Genomics with R

vcfR: a package to manipulate and visualize variant call format data in R

Brian J. Knaus

Horticultural Crops Research Unit, USDA‐ARS, Corvallis, OR, 97330 USA

Niklaus J. Grünwald

Corresponding Author

E-mail address: Nik.Grunwald@ars.usda.gov

Horticultural Crops Research Unit, USDA‐ARS, Corvallis, OR, 97330 USA

Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, 97331 USA

Correspondence: Niklaus J. Grünwald, Fax: 541‐738‐4025; E‐mail: Nik.Grunwald@ars.usda.gov
First published: 12 July 2016
Citations: 110

Abstract

Software to call single‐nucleotide polymorphisms or related genetic variants has converged on the variant call format (VCF) as the output format of choice. This has created a need for tools to work with VCF files. While an increasing number of software tools exist to read VCF data, many extract only the genotypes and omit the data associated with each genotype that describe its quality. We created the R package vcfR to address this issue. We developed a VCF file exploration tool implemented in the R language because R provides an interactive experience and an environment that is commonly used for genetic data analysis. Functions are implemented to read VCF files into R and write them back out, to extract portions of the data, and to plot summary statistics of the data. vcfR further provides the ability to visualize how various parameterizations of the data affect the results. Additional tools are included to integrate sequence (FASTA) and annotation (GFF) data for visualization of genomic regions such as chromosomes. Conversion functions translate data from the vcfR data structure to formats used by other R genetics packages. Computationally intensive functions are implemented in C++ to improve performance. Use of these tools is intended to facilitate VCF data exploration, including intuitive methods for data quality control and easy export to other R packages for further analysis. vcfR thus provides essential, novel tools not otherwise available in R.
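
The workflow summarized in the abstract (read a VCF file, extract genotypes together with their per-genotype quality data, summarize, and write the result back out) might look roughly like the following. This is a minimal sketch rather than the authors' own example: it assumes the vcfR package is installed and that its bundled example data set `vcfR_example` provides a `vcf` object whose FORMAT column includes a `DP` (read depth) field. The output path is a temporary file chosen for illustration.

```r
# Illustrative sketch only; assumes the vcfR package is installed.
library(vcfR)

# Load the small example data set bundled with vcfR.
# Assumption: this creates the objects `vcf`, `dna` and `gff`.
data(vcfR_example)

# Extract the genotype calls and one per-genotype quality field
# (DP, read depth) as variant-by-sample matrices.
gt <- extract.gt(vcf, element = "GT")
dp <- extract.gt(vcf, element = "DP", as.numeric = TRUE)

# Plot read depth per sample for a first look at data quality.
boxplot(dp, las = 2, ylab = "Read depth (DP)")

# Write the data back out as a gzipped VCF file (temporary path).
out_file <- tempfile(fileext = ".vcf.gz")
write.vcf(vcf, file = out_file)

# Read the file back in to confirm the round trip.
vcf2 <- read.vcfR(out_file, verbose = FALSE)
```

Keeping the genotype matrix and the depth matrix side by side is what allows quality-based filtering before export; conversion functions such as `vcfR2genlight()` can then hand the data to other R genetics packages, provided those packages are installed.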

