Current Protocols in Bioinformatics

Online ISBN: 9780471250951

DOI: 10.1002/0471250953

Browse by Table of Contents

  1. Foreword
  2. Preface
  3. Chapter 1 Using Biological Databases
    1. UNIT 1.1 The Importance of Biological Databases in Biological Discovery
    2. UNIT 1.2 Searching Online Mendelian Inheritance in Man (OMIM) for Information on Genetic Loci Involved in Human Disease
    3. UNIT 1.3 Searching NCBI Databases Using Entrez
    4. UNIT 1.4 The UCSC Genome Browser
    5. UNIT 1.5 Using the NCBI Map Viewer to Browse Genomic Sequence Data
    6. UNIT 1.6 Using the DFCI Gene Index Databases for Biological Discovery
    7. UNIT 1.7 Searching the Mouse Genome Informatics (MGI) Resources for Information on Mouse Biology from Genotype to Phenotype
    8. UNIT 1.8 Searching WormBase for Information about Caenorhabditis elegans
    9. UNIT 1.9 Using the Tools and Resources of the RCSB Protein Data Bank
    10. UNIT 1.10 Human Mutation Databases
    11. UNIT 1.11 Using The Arabidopsis Information Resource (TAIR) to Find Information About Arabidopsis Genes
    12. UNIT 1.12 Using the KEGG Database Resource
    13. UNIT 1.13 The Human Gene Mutation Database (HGMD) and Its Exploitation in the Fields of Personalized Genomics and Molecular Evolution
    14. UNIT 1.14 Exploring Genetic, Genomic, and Phenotypic Data at the Rat Genome Database
    15. UNIT 1.15 Using the Ensembl Genome Server to Browse Genomic Sequence Data
    16. UNIT 1.16 Using the iHOP Information Resource to Mine the Biomedical Literature on Genes, Proteins, and Chemical Compounds
    17. UNIT 1.17 Using the MetaCyc Pathway Database and the BioCyc Database Collection
    18. UNIT 1.18 Exploring Zebrafish Genomic, Functional and Phenotypic Data Using ZFIN
    19. UNIT 1.19 Searching NCBI's dbSNP Database
    20. UNIT 1.20 Using the Saccharomyces Genome Database (SGD) for Analysis of Genomic Information
    21. UNIT 1.21 Access Guide to Human Proteinpedia
    22. UNIT 1.22 Using the iPlant Collaborative Discovery Environment
    23. UNIT 1.23 mtDNA Variation and Analysis Using Mitomap and Mitomaster
    24. UNIT 1.24 MalaCards: A Comprehensive Automatically-Mined Database of Human Diseases
    25. UNIT 1.25 Using the MEROPS Database for Proteolytic Enzymes and Their Inhibitors and Substrates
    26. UNIT 1.26 Investigating Protein Structure and Evolution with SCOP2
    27. UNIT 1.27 Searching and Navigating UniProt Databases
    28. UNIT 1.28 Using CATH-Gene3D to Analyze the Sequence, Structure, and Function of Proteins
    29. UNIT 1.29 UniProt Tools
  4. Chapter 2 Recognizing Functional Domains
    1. UNIT 2.1 An Introduction to Recognizing Functional Domains
    2. UNIT 2.2 Using the Blocks Database to Recognize Functional Domains
    3. UNIT 2.3 Multiple Sequence Alignment Using ClustalW and ClustalX
    4. UNIT 2.4 Discovering Novel Sequence Motifs with MEME
    5. UNIT 2.5 Identifying Protein Domains with the Pfam Database
    6. UNIT 2.6 Using TESS to Predict Transcription Factor Binding Sites in DNA Sequence
    7. UNIT 2.7 The InterPro Database and Tools for Protein Domain Analysis
    8. UNIT 2.8 Using the Gibbs Motif Sampler to Find Conserved Domains in DNA and Protein Sequences
    9. UNIT 2.9 Using CorePromoter to Find Human Core Promoters
    10. UNIT 2.10 Using the Structure-Function Linkage Database to Characterize Functional Domains in Enzymes
    11. UNIT 2.11 Using Weeder, Pscan, and PscanChIP for the Discovery of Enriched Transcription Factor Binding Site Motifs in Nucleotide Sequences
    12. UNIT 2.12 Using PhyloCon to Identify Conserved Regulatory Motifs
    13. UNIT 2.13 Using CisGenome to Analyze ChIP-chip and ChIP-seq Data
    14. UNIT 2.14 Using MACS to Identify Peaks from ChIP-Seq Data
    15. UNIT 2.15 DNA Motif Databases and Their Uses
    16. UNIT 2.16 iRegulon and i-cisTarget: Reconstructing Regulatory Networks Using Motif and Track Enrichment
  5. Chapter 3 Finding Similarities and Inferring Homologies
    1. UNIT 3.1 An Introduction to Sequence Similarity (“Homology”) Searching
    2. UNIT 3.2 Finding Homologs to Nucleic Acid or Protein Sequences Using the Framesearch Program
    3. UNIT 3.3 Finding Similar Nucleotide Sequences Using Network BLAST Searches
    4. UNIT 3.4 Finding Homologs in Amino Acid Sequences Using Network BLAST Searches
    5. UNIT 3.5 Selecting the Right Similarity-Scoring Matrix
    6. UNIT 3.6 Constructing and Refining Multiple Sequence Alignments with PileUp, SeqLab, and the GCG Suite
    7. UNIT 3.7 An Overview of Multiple Sequence Alignment
    8. UNIT 3.8 Computing Multiple Sequence/Structure Alignments with the T-Coffee Package
    9. UNIT 3.9 Finding Protein and Nucleotide Similarities with FASTA
    10. UNIT 3.10 Mathematically Complete Nucleotide and Protein Sequence Searching Using Ssearch
    11. UNIT 3.11 Installing, Maintaining, and Using a Local Copy of BLAST for Intranet and Workstation Use
    12. UNIT 3.12 Using EMBL-EBI Services via Web Interface and Programmatically via Web Services
    13. UNIT 3.13 Clustal Omega
  6. Chapter 4 Annotating Genes
    1. UNIT 4.1 An Introduction to Genome Annotation
    2. UNIT 4.2 Using MZEF to Find Internal Coding Exons
    3. UNIT 4.3 Using geneid to Identify Genes
    4. UNIT 4.4 Using GlimmerM to Find Genes in Eukaryotic Genomes
    5. UNIT 4.5 Gene Identification in Prokaryotic Genomes, Phages, Metagenomes, and EST Sequences with GeneMarkS Suite
    6. UNIT 4.6 Eukaryotic Gene Prediction Using GeneMark.hmm-E and GeneMark-ES
    7. UNIT 4.7 Application of FirstEF to Find Promoters and First Exons in the Human Genome
    8. UNIT 4.8 Using N-SCAN or TWINSCAN to Predict Gene Structures in Genomic DNA Sequences
    9. UNIT 4.9 GrailEXP and Genome Analysis Pipeline for Genome Annotation
    10. UNIT 4.10 Using RepeatMasker to Identify Repetitive Elements in Genomic Sequences
    11. UNIT 4.11 Genome Annotation and Curation Using MAKER and MAKER-P
    12. UNIT 4.12 Protein Function Prediction: Problems and Pitfalls
  7. Chapter 5 Modeling Structure from Sequence
    1. UNIT 5.1 An Introduction to Modeling Structure from Sequence
    2. UNIT 5.2 FAMS and FAMSBASE for Protein Structure
    3. UNIT 5.3 Modeling Membrane Proteins Utilizing Information from Silent Amino Acid Substitutions
    4. UNIT 5.4 Representing Structural Information with RasMol
    5. UNIT 5.5 Using Dali for Structural Comparison of Proteins
    6. UNIT 5.6 Comparative Protein Structure Modeling Using MODELLER
    7. UNIT 5.7 Using VMD: An Introductory Tutorial
    8. UNIT 5.8 Protein Structure and Function Prediction Using I-TASSER
  8. Chapter 6 Inferring Evolutionary Relationships
    1. UNIT 6.1 Introduction to Inferring Evolutionary Relationships
    2. UNIT 6.2 Visualizing Phylogenetic Trees Using TreeView
    3. UNIT 6.3 Getting a Tree Fast: Neighbor Joining, FastME, and Distance-Based Methods
    4. UNIT 6.4 Inferring Evolutionary Trees with PAUP*
    5. UNIT 6.5 Using MODELTEST and PAUP* to Select a Model of Nucleotide Substitution
    6. UNIT 6.6 Maximum-Likelihood Analysis Using TREE-PUZZLE
    7. UNIT 6.7 What If I Don't Have a Tree?: Split Decomposition and Related Models
    8. UNIT 6.8 Using PEBBLE for the Evolutionary Analysis of Serially Sampled Molecular Sequences
    9. UNIT 6.9 Phylogenomic Inference of Protein Molecular Function
    10. UNIT 6.10 Using OrthoCluster for the Detection of Synteny Blocks Among Multiple Genomes
    11. UNIT 6.11 Inferring Protein Function from Homology Using the Princeton Protein Orthology Database (P-POD)
    12. UNIT 6.12 Using OrthoMCL to Assign Proteins to OrthoMCL-DB Groups or to Cluster Proteomes Into New Ortholog Groups
    13. UNIT 6.13 Phylogenetic Analysis with the iPlant Discovery Environment
    14. UNIT 6.14 Using RAxML to Infer Phylogenies
  9. Chapter 7 Analyzing Expression Patterns
    1. UNIT 7.1 Analysis of Expression Data: An Overview
    2. UNIT 7.2 The Gene Ontology (GO) Project: Structured Vocabularies for Molecular Biology and Their Application to Genome and Expression Analysis
    3. UNIT 7.3 Analysis of Gene-Expression Data Using J-Express
    4. UNIT 7.4 DRAGON and DRAGON View: Information Annotation and Visualization Tools for Large-Scale Expression Data
    5. UNIT 7.5 Using GenMAPP and MAPPFinder to View Microarray Data on Biological Pathways and Identify Global Trends in the Data
    6. UNIT 7.6 Pathway-Based Analysis of Microarray and RNAseq Data Using Pathway Processor 2.0
    7. UNIT 7.7 An Overview of Spotfire for Gene-Expression Studies
    8. UNIT 7.8 Loading and Preparing Data for Analysis in Spotfire
    9. UNIT 7.9 Analyzing and Visualizing Expression Data with Spotfire
    10. UNIT 7.10 Microarray Data Visualization and Analysis with the Longhorn Array Database (LAD)
    11. UNIT 7.11 Gene Expression Analysis via Multidimensional Scaling
    12. UNIT 7.12 Using GenePattern for Gene Expression Analysis
    13. UNIT 7.13 Data Storage and Analysis in ArrayExpress and Expression Profiler
    14. UNIT 7.14 Analyzing Gene Expression Data from Microarray and Next-Generation DNA Sequencing Transcriptome Profiling Assays Using GeneSifter Analysis Edition
  10. Chapter 8 Analyzing Molecular Interactions
    1. UNIT 8.1 Analyzing Molecular Interactions
    2. UNIT 8.2 Prediction of Protein-Protein Interaction Networks
    3. UNIT 8.3 Evaluation of Electrostatic Interactions
    4. UNIT 8.4 Using DelPhi to Compute Electrostatic Potentials and Assess Their Contribution to Interactions
    5. UNIT 8.5 Searching the MINT Database for Protein Interaction Information
    6. UNIT 8.6 Identifying Functional Sites Based on Prediction of Charged Group Behavior
    7. UNIT 8.7 Using the Reactome Database
    8. UNIT 8.8 Using VisANT to Analyze Networks
    9. UNIT 8.9 Searching, Viewing, and Visualizing Data in the Biomolecular Interaction Network Database (BIND)
    10. UNIT 8.10 Active Site Profiling to Identify Protein Functional Sites in Sequences and Structures Using the Deacon Active Site Profiler (DASP)
    11. UNIT 8.11 Structure-Based pKa Calculations Using Continuum Electrostatics Methods
    12. UNIT 8.12 Flexible Ligand Docking with Glide
    13. UNIT 8.13 Biological Network Exploration with Cytoscape 3
    14. UNIT 8.14 Using AutoDock for Ligand-Receptor Docking
    15. UNIT 8.15 Analyzing Protein-Protein Interactions from Affinity Purification-Mass Spectrometry Data with SAINT
    16. UNIT 8.16 Using ProHits to Store, Annotate, and Analyze Affinity Purification–Mass Spectrometry (AP-MS) Data
    17. UNIT 8.17 Using MEMo to Discover Mutual Exclusivity Modules in Cancer
    18. UNIT 8.18 DXMSMS Match Program for Automated Analysis of LC-MS/MS Data Obtained Using Isotopically Coded CID-Cleavable Cross-Linking Reagents
    19. UNIT 8.19 Scoring Large-Scale Affinity Purification Mass Spectrometry Datasets with MiST
    20. UNIT 8.20 Expression Data Analysis with Reactome
    21. UNIT 8.21 Using pLink to Analyze Cross-Linked Peptides
  11. Chapter 9 Building Biological Databases
    1. UNIT 9.1 Creating Databases for Biological Information: An Introduction
    2. UNIT 9.2 Structured Query Language (SQL) Fundamentals
    3. UNIT 9.3 Modeling Biology Using Relational Databases
    4. UNIT 9.4 Using Relational Databases for Improved Sequence Similarity Searching and Large-Scale Genomic Analyses
    5. UNIT 9.5 Using Apollo to Browse and Edit Genome Annotations
    6. UNIT 9.6 Using Chado to Store Genome Annotation Data
    7. UNIT 9.7 PubSearch and PubFetch: A Simple Management System for Semiautomated Retrieval and Annotation of Biological Information from the Literature
    8. UNIT 9.8 Installing and Configuring CMap
    9. UNIT 9.9 Using the Generic Genome Browser (GBrowse)
    10. UNIT 9.10 Installing a Local Copy of the Reactome Web Site and Knowledgebase
    11. UNIT 9.11 Browsing Multidimensional Molecular Networks with the Generic Network Browser (N-Browse)
    12. UNIT 9.12 Using the Generic Synteny Browser (GBrowse_syn)
    13. UNIT 9.13 Setting Up the JBrowse Genome Browser
    14. UNIT 9.14 Administering GBrowse Sites with WebGBrowse
    15. UNIT 9.15 Cloud Computing with iPlant Atmosphere
  12. Chapter 10 Comparing Genomes
    1. UNIT 10.1 Introduction to Comparing Large Sequence Sets
    2. UNIT 10.2 PipMaker: A World Wide Web Server for Genomic Sequence Alignments
    3. UNIT 10.3 Using MUMmer to Identify Similar Regions in Large Sequence Sets
    4. UNIT 10.4 MultiPipMaker: A Comparative Alignment Server for Multiple DNA Sequences
    5. UNIT 10.5 Using Galaxy to Perform Large-Scale Interactive Data Analyses
    6. UNIT 10.6 Obtaining Comparative Genomic Data with the VISTA Family of Computational Tools
    7. UNIT 10.7 Using QIIME to Analyze 16S rRNA Gene Sequences from Microbial Communities
    8. UNIT 10.8 Using BLAT to Find Sequence Similarity in Closely Related Genomes
    9. UNIT 10.9 The Bluejay Genome Browser
    10. UNIT 10.10 Using the Wash U Epigenome Browser to Examine Genome-Wide Sequencing Data
  13. Chapter 11 Assembling and Mapping Large Sequence Sets
    1. UNIT 11.1 An Introduction to the Informatics of “Next-Generation” Sequencing
    2. UNIT 11.2 Viewing and Editing Assembled Sequences Using Consed
    3. UNIT 11.3 Generating a Genome Assembly with PCAP
    4. UNIT 11.4 Assembling Genomic DNA Sequences with PHRAP
    5. UNIT 11.5 Using the Velvet de novo Assembler for Short-Read Sequencing Technologies
    6. UNIT 11.6 RNA-Seq Read Alignments with PALMapper
    7. UNIT 11.7 Aligning Short Sequencing Reads with Bowtie
    8. UNIT 11.8 Next Generation Sequence Assembly with AMOS
    9. UNIT 11.9 Using Cloud Computing Infrastructure with CloudBioLinux, CloudMan, and Galaxy
    10. UNIT 11.10 From FastQ Data to High-Confidence Variant Calls: The Genome Analysis Toolkit Best Practices Pipeline
    11. UNIT 11.11 Using SOAPaligner for Short Reads Alignment
    12. UNIT 11.12 BEDTools: The Swiss-Army Tool for Genome Feature Analysis
    13. UNIT 11.13 Efficient Alignment of Illumina-Like High-Throughput Sequencing Reads with the GEnomic Multi-tool (GEM) Mapper
    14. UNIT 11.14 Mapping RNA-seq Reads with STAR
  14. Chapter 12 Analyzing RNA Sequence and Structure
    1. UNIT 12.1 An Overview of RNA Sequence Analyses: Structure Prediction, ncRNA Gene Identification, and RNAi Design
    2. UNIT 12.2 RNA Secondary Structure Analysis Using the Vienna RNA Package
    3. UNIT 12.3 RNAi: Design and Analysis
    4. UNIT 12.4 Using the RNAstructure Software Package to Predict Conserved RNA Structures
    5. UNIT 12.5 Annotating Non-Coding RNAs with Rfam
    6. UNIT 12.6 RNA Secondary Structure Analysis Using RNAstructure
    7. UNIT 12.7 Identifying Structural Noncoding RNAs Using RNAz
    8. UNIT 12.8 RNA Secondary Structure Analysis Using The RNAshapes Package
    9. UNIT 12.9 miRBase: microRNA Sequences and Annotation
    10. UNIT 12.10 Identification of Novel and Known miRNAs in Deep-Sequencing Data with miRDeep2
    11. UNIT 12.11 Comparative ncRNA Gene and Structure Prediction Using Foldalign and FoldalignM
    12. UNIT 12.12 Using REDItools to Detect RNA Editing Events in NGS Datasets
  15. Chapter 13 Using Proteomics Techniques
    1. UNIT 13.1 Proteomics and the Analysis of Proteomic Data: An Overview of Current Protein-Profiling Technologies
    2. UNIT 13.2 Finding Protein Sequences Using PROWL
    3. UNIT 13.3 Protein Identification Using Sorcerer 2 and SEQUEST
    4. UNIT 13.4 Validation of Tandem Mass Spectrometry Database Search Results Using DTASelect
    5. UNIT 13.5 Installation and Use of LabKey Server for Proteomics
    6. UNIT 13.6 Using ProSight PTM and Related Tools for Targeted Protein Identification and Characterization with High Mass Accuracy Tandem MS Data
    7. UNIT 13.7 Using BiblioSpec for Creating and Searching Tandem MS Peptide Libraries
    8. UNIT 13.8 Using the Proteomics Identifications Database (PRIDE)
    9. UNIT 13.9 Using GFS to Identify Encoding Genomic Loci from Protein Mass Spectral Data
    10. UNIT 13.10 De Novo Interpretation of Tandem Mass Spectra
    11. UNIT 13.11 Extracting Biological Meaning from Large Gene Lists with DAVID
    12. UNIT 13.12 Census for Proteome Quantification
    13. UNIT 13.13 Analyzing Shotgun Proteomic Data with PatternLab for Proteomics
    14. UNIT 13.14 Predicting Peptide Retention Times for Proteomics
    15. UNIT 13.15 Biological Sequence Motif Discovery Using motif-x
    16. UNIT 13.16 Using the scan-x Web Site to Predict Protein Post-Translational Modifications
    17. UNIT 13.17 Identifying Proteomic LC-MS/MS Data Sets with Bumbershoot and IDPicker
    18. UNIT 13.18 Identification of Peptide Features in Precursor Spectra Using Hardklör and Krönik
    19. UNIT 13.19 PatternLab: From Mass Spectra to Label-Free Differential Shotgun Proteomics
    20. UNIT 13.20 Byonic: Advanced Peptide and Protein Identification Software
    21. UNIT 13.21 Proteomics and the Analysis of Proteomic Data: 2013 Overview of Current Protein-Profiling Technologies
    22. UNIT 13.22 STRAP PTM: Software Tool for Rapid Annotation and Differential Comparison of Protein Post-Translational Modifications
    23. UNIT 13.23 PepArML: A Meta-Search Peptide Identification Platform for Tandem Mass Spectra
    24. UNIT 13.24 Employing ProteoWizard to Convert Raw Mass Spectrometry Data
    25. UNIT 13.25 Using PeptideAtlas, SRMAtlas, and PASSEL: Comprehensive Resources for Discovery and Targeted Proteomics
    26. UNIT 13.26 Metaproteomics: Extracting and Mining Proteome Information to Characterize Metabolic Activities in Microbial Communities
    27. UNIT 13.27 Using PepExplorer to Filter and Organize De Novo Peptide Sequencing Results
    28. UNIT 13.28 Using PSEA-Quant for Protein Set Enrichment Analysis of Quantitative Mass Spectrometry-Based Proteomics
  16. Chapter 14 Cheminformatics and Metabolomics
    1. UNIT 14.1 Introduction to Cheminformatics
    2. UNIT 14.2 Using Pharmabase to Perform Pharmacological Analyses of Cell Function
    3. UNIT 14.3 Using MSDchem to Search the PDB Ligand Dictionary
    4. UNIT 14.4 In Silico Drug Exploration and Discovery Using DrugBank
    5. UNIT 14.5 Using ChemBank to Probe Chemical Biology
    6. UNIT 14.6 Using ZINC to Acquire a Virtual Screening Library
    7. UNIT 14.7 PharmGKB: An Integrated Resource of Pharmacogenomic Data and Knowledge
    8. UNIT 14.8 Exploring Human Metabolites Using the Human Metabolome Database
    9. UNIT 14.9 ChEBI: An Open Bioinformatics and Cheminformatics Resource
    10. UNIT 14.10 Metabolomic Data Processing, Analysis, and Interpretation Using MetaboAnalyst
    11. UNIT 14.11 LC-MS Data Processing with MAVEN: A Metabolomic Analysis and Visualization Engine
    12. UNIT 14.12 LipidXplorer: Software for Quantitative Shotgun Lipidomics Compatible with Multiple Mass Spectrometry Platforms
    13. UNIT 14.13 MetaboLights: An Open-Access Database Repository for Metabolomics Data
  17. Chapter 15 Understanding Genome Variation
    1. UNIT 15.2 Some Phenotype Association Tools in Galaxy: Looking for Disease SNPs in a Full Genome
    2. UNIT 15.3 Genotyping in the Cloud with Crossbow
    3. UNIT 15.4 Using VarScan 2 for Germline Variant Calling and Somatic Mutation Detection
    4. UNIT 15.5 Using SomaticSniper to Detect Somatic Single Nucleotide Variants
    5. UNIT 15.6 BreakDancer: Identification of Genomic Structural Variation from Paired-End Read Mapping
    6. UNIT 15.7 cgpPindel: Identifying Somatically Acquired Insertion and Deletion Events from Paired End Sequencing
    7. UNIT 15.8 VAGrENT: Variation Annotation Generator
  18. Appendix 1 User Fundamentals
    1. APPENDIX 1A IUPAC/IUB Single-Letter Codes Within Nucleic Acid and Amino Acid Sequences
    2. APPENDIX 1B Common File Formats
    3. APPENDIX 1C Unix Survival Guide
    4. APPENDIX 1D X Window System Survival Guide
    5. APPENDIX 1E Sequence File Format Conversion with Command-Line Readseq
  19. Appendix 2 Glossary of Bioinformatics Terms
    1. APPENDIX 2 Glossary of Bioinformatics Terms
  20. Appendix 3 Fundamentals of Bioinformatics
    1. APPENDIX 3A An Introduction to Hidden Markov Models