Repetitive Elements: Bioinformatic Identification, Classification and Analysis
Published Online: 15 FEB 2011
Copyright © 2001 John Wiley & Sons, Ltd. All rights reserved.
How to Cite
Jurka, J., Bao, W., Kojima, K. and Kapitonov, V. V. 2011. Repetitive Elements: Bioinformatic Identification, Classification and Analysis. eLS.
Multicopy, or repetitive, deoxyribonucleic acid (DNA) is routinely detected and analysed by computer-assisted comparison of genomic DNA with reference databases of repeats. The most representative collection of repetitive elements is ‘Repbase Update’ (RU), which currently contains >15 000 unique entries from diverse eukaryotic species. Most transposable elements (TEs) in RU are represented by consensus sequences based on multiple alignments of individual repeats. Consensus sequences approximate the active TEs that generated the multiple mutated copies in the genome. The two major current repeat detection and annotation programs, RepeatMasker and CENSOR, both use RU to annotate repeats in eukaryotic genomes. RU is also increasingly used as a master reference library from which custom libraries are created for detecting repeats in newly sequenced genomes. Finally, a combination of different routines can be used to detect repeats that are not similar to those already present in the reference libraries (de novo approach).
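As a rough illustration of how a consensus sequence can be derived from a multiple alignment of individual repeat copies, the following minimal Python sketch applies a simple majority rule to each alignment column. It is an assumption-laden toy, not the procedure used to build RU: the function name `majority_consensus`, the gap character `-` and the example sequences are all hypothetical, and real consensus building typically also handles ambiguity codes, hypermutable sites and subfamily structure.

```python
from collections import Counter


def majority_consensus(aligned_copies, gap="-"):
    """Majority-rule consensus of equal-length, pre-aligned sequences.

    Hypothetical helper for illustration only. Columns in which the
    gap character is the most frequent symbol are dropped from the
    consensus. Ties are broken by ASCII order, so a tie between a gap
    and a base resolves to the gap.
    """
    if not aligned_copies:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_copies[0])
    if any(len(seq) != length for seq in aligned_copies):
        raise ValueError("aligned sequences must have equal length")

    consensus = []
    for column in zip(*aligned_copies):
        counts = Counter(symbol.upper() for symbol in column)
        # Most frequent symbol in this column (first in sorted order on ties).
        winner = max(sorted(counts), key=counts.get)
        if winner != gap:
            consensus.append(winner)
    return "".join(consensus)


# Made-up aligned copies of one repeat family, each carrying its own mutations.
copies = [
    "ACGTTACG-A",
    "ACGTAACGGA",
    "ACCTTACG-A",
    "ACGTTTCG-A",
]
print(majority_consensus(copies))
```

Each copy in the toy alignment differs from the others at one position, yet the column-wise majority recovers a single sequence, which is the sense in which a consensus approximates the element that produced the copies.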
Active transposable elements (TEs) produce families and subfamilies of multiple copies in the genome, called ‘interspersed repetitive elements’ or ‘repeats’.
Consensus sequences derived from aligned families and subfamilies of repeats are excellent approximations of the active TEs from which they were derived.
Consensus sequences are also the preferred reference sequences for screening and annotating repetitive elements, especially the most divergent ones.
RepeatMasker and CENSOR are basic repeat screening and annotation programs that rely on reference sequence libraries.
In the absence of reference sequences, repetitive DNA can be detected by screening for multiple copies and characteristic structural features (de novo approach).
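The idea of screening for multiple copies without a reference library can be sketched, under strong simplifying assumptions, by counting exact k-mers that recur in a sequence. The function name `repeated_kmers`, the parameters `k` and `min_copies`, and the test sequence below are hypothetical; this sketch ignores the reverse strand, mismatches and structural features, all of which practical de novo pipelines have to deal with.

```python
from collections import defaultdict


def repeated_kmers(sequence, k=8, min_copies=3):
    """Return {kmer: [start positions]} for k-mers seen >= min_copies times.

    Illustrative only: exact matches on the forward strand, with any
    window containing an ambiguous base (N) skipped.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    positions = defaultdict(list)
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        if "N" in kmer:
            continue
        positions[kmer].append(start)
    return {
        kmer: starts
        for kmer, starts in positions.items()
        if len(starts) >= min_copies
    }


# Made-up sequence: three copies of an 8-bp unit separated by short spacers.
toy_genome = "GATTACAG" + "CC" + "GATTACAG" + "TT" + "GATTACAG"
print(repeated_kmers(toy_genome, k=8, min_copies=3))
```

Recurrent k-mers found this way would only be seeds; extending and aligning the regions around them is what would eventually yield candidate repeat families.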
- transposable elements (TEs);
- simple sequence repeats (SSRs);
- repeat maps;
- computational biology;
- reference databases