Graph construction and random graph generation for modeling protein structures



Researchers often model folded protein structures as graphs in which amino acids are the vertices and edges represent contacts between amino acids. The vertices of these graphs are naturally ordered by position in the amino acid sequence. Many graph construction methods exist, and the literature offers no consensus on which construction to use or on the major issues with each. We investigate different constructions and examine their effect on various graph measures. We also consider the small-world network model for proteins, discuss its validity under the different constructions, and discuss random protein graph generation. We propose a new graph property for graphs with ordered vertices, the contact distribution, and a method of reciprocal attachment to merge neighborhoods for protein graphs.

Statistical Analysis and Data Mining, 2013. DOI: 10.1002/sam.11203
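As a rough illustration of the kind of construction discussed above, the following minimal sketch builds a contact graph from one representative 3D coordinate per residue using a distance cutoff. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `contact_graph` and `separation_histogram`, the 8.0 cutoff, the `min_separation` parameter, and the toy coordinates. The histogram of sequence separations |i - j| over the edges is shown only as one simple statistic that depends on the vertex ordering; it is not necessarily the contact distribution the paper defines.

```python
from collections import Counter
from itertools import combinations
from math import dist


def contact_graph(coords, cutoff=8.0, min_separation=1):
    """Return the set of edges (i, j), i < j, between residues in contact.

    coords: one (x, y, z) point per residue, listed in sequence order.
    cutoff and min_separation are illustrative choices, not values from the paper.
    """
    edges = set()
    for (i, p), (j, q) in combinations(enumerate(coords), 2):
        # combinations yields pairs with i < j, so j - i is the sequence separation.
        if j - i >= min_separation and dist(p, q) <= cutoff:
            edges.add((i, j))
    return edges


def separation_histogram(edges):
    """Count edges by sequence separation j - i (hypothetical helper)."""
    return Counter(j - i for i, j in edges)


# Toy input: six residues spaced 3.8 units apart along a straight line.
coords = [(3.8 * k, 0.0, 0.0) for k in range(6)]
edges = contact_graph(coords, cutoff=8.0)
hist = separation_histogram(edges)
```

Changing `cutoff` or `min_separation` changes the edge set, and with it any graph measure computed afterwards, which is the sensitivity to construction choices that the abstract raises.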