UC-Curve: A highly compact 2D graphical representation of protein sequences



A highly compact two-dimensional graphical representation of protein sequences, the UC-Curve, is presented, in which amino acids are assigned to the circumference of a unit circle in a cyclic order. UC-Curves visually reveal the general compositional features of protein sequences and roughly exhibit the major differences among similar protein sequences. Geometric center vectors of UC-Curves and Euclidean distances are extracted to analyze pairwise similarities/dissimilarities between two different families of proteins. Comparative results demonstrate the robustness of the technique and show that UC-Curves can help to infer reasonable phylogenetic relationships at relatively low computational cost. © 2013 Wiley Periodicals, Inc.
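
The idea of placing amino acids on a unit circle and comparing sequences through geometric centers can be illustrated with a minimal sketch. The abstract does not give the cyclic order of the amino acids or the exact construction of the curve, so the following rests on assumptions: the 20 standard amino acids are spaced evenly in alphabetical one-letter order, the geometric center is taken as the mean of the circle points visited by the sequence, and the names `circle_points`, `geometric_center`, and `center_distance` are hypothetical. It should not be read as the published UC-Curve procedure.

```python
import math

# Assumption: alphabetical one-letter order; the abstract does not state the order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def circle_points(order=AMINO_ACIDS):
    """Map each amino acid to an evenly spaced point on the unit circle."""
    n = len(order)
    return {
        aa: (math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
        for k, aa in enumerate(order)
    }


def geometric_center(sequence, points=None):
    """Mean (x, y) of the circle points visited by the sequence.

    Assumption: the center is the plain average of residue points;
    residues outside the chosen alphabet are skipped.
    """
    if points is None:
        points = circle_points()
    coords = [points[aa] for aa in sequence if aa in points]
    if not coords:
        raise ValueError("sequence contains no recognised amino acids")
    n = len(coords)
    return (sum(x for x, _ in coords) / n, sum(y for _, y in coords) / n)


def center_distance(seq_a, seq_b):
    """Euclidean distance between the geometric centers of two sequences."""
    ax, ay = geometric_center(seq_a)
    bx, by = geometric_center(seq_b)
    return math.hypot(ax - bx, ay - by)
```

Under these assumptions, two sequences with similar amino acid composition yield nearby centers and a small distance, while sequences dominated by residues on opposite sides of the circle yield a distance approaching 2. A matrix of such pairwise distances could then be passed to a standard tree-building method.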