• PAXBP1
• core promoter
• CT-repeat
• Homo sapiens
• primate
• evolution

Adaptive evolution may be linked with the genomic distribution and function of short tandem repeats (STRs). The proximity of core promoter STRs to the +1 transcription start site (TSS) and their mutable nature highlight these STRs as a novel source of interspecies variation. The PAXBP1 gene (alternatively known as GCFC1) core promoter contains the longest STR identified in a Homo sapiens gene core promoter. Indeed, this core promoter is a stretch of four consecutive CT-STRs. In the current study, we used the Ensembl, NCBI, and UCSC databases to analyze the evolutionary trend and functional implication of this CT-STR complex in six major vertebrate lineages: primates, non-primate mammals, birds, reptiles, amphibians, and fish. We observed exceptional expansion (≥4 repeats) and conservation of this CT-STR complex across primates, except in the prosimians Microcebus murinus and Otolemur garnettii (Fisher exact P < 4.1 × 10⁻⁷). H. sapiens has the most complex STR formula and the longest repeats. The monkeys Macaca mulatta and Callithrix jacchus have the simplest STR formulas and the lowest repeat numbers. CT-STRs of ≥4 repeats were not detected in non-primate lineages. Alleles of different lengths across the PAXBP1 core promoter CT-STRs significantly altered gene expression in vitro (P < 0.001, t-test). PAXBP1 has a crucial role in craniofacial development, myogenesis, and spine morphogenesis, traits that have diverged between primates and non-primates. To our knowledge, this is the first instance of expansion and conservation of an STR complex co-occurring specifically with the primate lineage. Am. J. Primatol. 76:747–756, 2014. © 2014 Wiley Periodicals, Inc.
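As an illustration of the ≥4-repeat criterion used above, the following minimal sketch scans a DNA string for uninterrupted runs of the CT dinucleotide. It is not the procedure used in the study, which relied on the Ensembl, NCBI, and UCSC databases; the function name `find_ct_strs`, the `min_repeats` parameter, and the toy sequence are hypothetical and chosen only for demonstration.

```python
import re

def find_ct_strs(seq, min_repeats=4):
    """Return (start_index, repeat_count) for each run of >= min_repeats 'CT' units.

    Hypothetical helper for illustration only; start_index is 0-based.
    """
    # (?:CT){n,} matches n or more consecutive, uninterrupted CT units.
    pattern = re.compile(r"(?:CT){%d,}" % min_repeats)
    return [(m.start(), len(m.group()) // 2) for m in pattern.finditer(seq.upper())]

# Made-up toy sequence, NOT the PAXBP1 core promoter.
toy = "GGCTCTCTCTCTAACTCTGGCTCTCTCTA"
print(find_ct_strs(toy))  # [(2, 5), (20, 4)]
```

The two-unit run (CTCT) in the toy sequence is ignored because it falls below the four-repeat threshold, mirroring the cutoff stated in the abstract.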