Keywords:

  • comparative genomics
  • evolution
  • functional genomics
  • non-family genes
  • plant genome

Summary

Many genes are ‘non-family’ (NF) genes, which do not cluster into families with three or more members per genome. While gene families have been extensively studied, a systematic analysis of NF genes has not been reported. We performed comparative studies on NF genes in 14 plant species. Based on the clustering of protein sequences, we identified ~94 000 NF genes across these species and divided them into five evolutionary groups: Viridiplantae wide, angiosperm specific, monocot specific, dicot specific, and species specific. Our analysis revealed that the NF genes resulted largely from less frequent gene duplications and/or a higher rate of gene loss after segmental duplication relative to genes in both low-copy-number families (LF; 3–10 copies per genome) and high-copy-number families (HF; >10 copies). Furthermore, we identified functions enriched in the NF gene set as compared with the HF genes. We found that NF genes were involved in essential biological processes shared by all plant lineages (e.g. photosynthesis and translation), as well as in gene regulation and stress responses associated with phylogenetic diversification. In particular, our analysis of an Arabidopsis protein–protein interaction network revealed that hub proteins, i.e. those in the top 10% by number of connections, were over-represented in the NF set relative to the HF set. This study highlights the roles that NF genes may play in evolutionary and functional genomics research.
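
The copy-number thresholds stated above (NF: fewer than three members per genome; LF: 3–10; HF: more than 10) can be illustrated with a minimal Python sketch. It assumes that the protein-sequence clustering has already been run and that, for a single genome, each gene has been assigned a cluster identifier. The function name `classify_by_family_size`, the input mapping, and the toy gene and cluster names are hypothetical and are not taken from the study; the clustering method itself is not reproduced here.

```python
from collections import Counter


def classify_by_family_size(gene_to_cluster):
    """Label each gene of ONE genome as NF, LF, or HF.

    gene_to_cluster: dict mapping gene ID -> cluster ID (hypothetical input,
    assumed to come from a prior protein-sequence clustering step).
    """
    # Number of genes from this genome that fall into each cluster.
    cluster_sizes = Counter(gene_to_cluster.values())

    labels = {}
    for gene, cluster in gene_to_cluster.items():
        size = cluster_sizes[cluster]
        if size < 3:        # fewer than three members: non-family
            labels[gene] = "NF"
        elif size <= 10:    # 3-10 copies: low-copy-number family
            labels[gene] = "LF"
        else:               # more than 10 copies: high-copy-number family
            labels[gene] = "HF"
    return labels


# Toy genome: a singleton, a pair, a 3-member family, and an 11-member family.
toy = {"s1": "c0", "p1": "c1", "p2": "c1"}
toy.update({f"l{i}": "c2" for i in range(3)})
toy.update({f"h{i}": "c3" for i in range(11)})

labels = classify_by_family_size(toy)
print(Counter(labels.values()))
```

In this sketch the classification is applied per genome, so the same cluster could in principle be labelled differently in different species; how the study handled such cases is not stated in the summary.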