Open Access

Conversion of nucleotides sequences into genomic signals

Authors


*Correspondence to: Paul Dan CRISTEA Spl. Independentei 313, 77206 Bucharest, Romania. Tel.: +40-1-411 44 37, Fax: +40-1-410 44 14. E-mail: pcristea@dsp.pub.ro. http://www.dsp.pub.ro

Abstract

An original tetrahedral representation of the Genetic Code (GC) is defined that better describes its structure, degeneracy and evolutionary trends. The possibility of reducing the dimension of the representation by projecting the GC tetrahedron onto an adequately oriented plane is also analyzed, leading to several equivalent complex representations of the GC. On this basis, optimal symbolic-to-digital mappings of linear nucleic acid strands into real or complex genomic signals are derived at the nucleotide, codon and amino acid levels. By converting sequences of nucleotides and polypeptides into digital genomic signals, this approach makes it possible to apply a large variety of signal processing methods to their handling and analysis. It is also shown that some essential features of nucleotide sequences can be extracted more effectively using this representation. Specifically, the paper reports for the first time the existence of a global helicoidal wrapping of the complex representations of the bases along DNA sequences, a large-scale trend of genomic signals. New tools for genomic signal analysis are introduced at the nucleotide, codon and amino acid levels and in a multiresolution approach, including the use of phase, aggregated phase, unwrapped phase, sequence path, stem representation of the components’ relative frequencies, and analysis of the transitions.
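As a rough illustration of the nucleotide-level idea, the following Python sketch maps a DNA string to a complex signal and computes its aggregated (cumulated) and unwrapped phase. The abstract does not state the numerical mapping, so the assignment of bases to the corners of a square in the complex plane used here is an assumption chosen for illustration only. The function names `to_signal`, `aggregated_phase` and `unwrapped_phase` are hypothetical.

```python
import cmath
import math

# ASSUMPTION: illustrative mapping only; the abstract does not specify it.
# Each base is placed at one corner of a square in the complex plane.
BASE_TO_COMPLEX = {
    "A": complex(1, 1),
    "C": complex(-1, -1),
    "G": complex(-1, 1),
    "T": complex(1, -1),
}


def to_signal(seq):
    """Convert a nucleotide string into a list of complex samples."""
    return [BASE_TO_COMPLEX[base] for base in seq.upper()]


def aggregated_phase(signal):
    """Running sum of the phases of the samples along the sequence."""
    total = 0.0
    result = []
    for z in signal:
        total += cmath.phase(z)
        result.append(total)
    return result


def unwrapped_phase(signal):
    """Phase with jumps larger than pi removed by adding multiples of 2*pi."""
    result = []
    offset = 0.0
    previous = None
    for z in signal:
        raw = cmath.phase(z)
        if previous is not None:
            delta = raw - previous
            if delta > math.pi:
                offset -= 2 * math.pi
            elif delta < -math.pi:
                offset += 2 * math.pi
        result.append(raw + offset)
        previous = raw
    return result


if __name__ == "__main__":
    signal = to_signal("GATTACA")
    print(aggregated_phase(signal))
    print(unwrapped_phase(signal))
```

Under these assumptions, a steady drift in the unwrapped phase along a long sequence would be one way to look for the kind of large-scale wrapping the abstract describes. The paper's own mappings and definitions should be preferred over this sketch.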
