UNIT 4.10 Using RepeatMasker to Identify Repetitive Elements in Genomic Sequences

  1. Maja Tarailo-Graovac,
  2. Nansheng Chen

Published Online: 1 MAR 2009

DOI: 10.1002/0471250953.bi0410s25

Current Protocols in Bioinformatics

How to Cite

Tarailo-Graovac, M. and Chen, N. 2009. Using RepeatMasker to Identify Repetitive Elements in Genomic Sequences. Current Protocols in Bioinformatics. 25:4.10:4.10.1–4.10.14.

Author Information

  1. Simon Fraser University, Burnaby, British Columbia, Canada

Publication History

  1. Published Online: 1 MAR 2009
  2. Published Print: MAR 2009

Abstract

RepeatMasker is a software tool widely used in computational genomics to identify, classify, and mask repetitive elements, including low-complexity sequences and interspersed repeats. RepeatMasker searches for repetitive sequences by aligning the input genome sequence against a library of known repeats, such as Repbase. Here, we describe two Basic Protocols that provide detailed guidelines on how to use RepeatMasker, either via the Web interface or from the command line on a Unix/Linux system, to analyze repetitive elements in genomic sequences. Sequence comparisons in RepeatMasker are usually performed by the alignment program cross_match, which requires significant processing time for larger sequences. An Alternate Protocol describes how to reduce the processing time by using an alternative alignment program, such as WU-BLAST. Further, the advantages, limitations, and known bugs of the software are discussed. Finally, guidelines for understanding the results are provided. Curr. Protoc. Bioinform. 25:4.10.1-4.10.14. © 2009 by John Wiley & Sons, Inc.

Keywords:

  • RepeatMasker;
  • genome annotation;
  • repetitive elements;
  • repeat library;
  • cross_match;
  • WU-BLAST;
  • RECON