Fig. S1 The maximum likelihood (ML), neighbor-joining (NJ) and Bayesian trees of SET genes from Arabidopsis and humans.

Fig. S2 Pairwise sequence identities of each subfamily.

Figs S3–S9 The maximum likelihood (ML) trees of the Suvh, Suvr, Ash, Trx, E(z), SMYD and SETD subfamilies, respectively, in sequenced plant genomes.

Fig. S10 The Suvh subtree of the Suv subfamily, together with gene structures of the family members.

Fig. S11 The Bayesian trees of the Suvh, Suvr, Ash, Trx and E(z) subfamilies in 19 animal genomes.

Fig. S12 An integrated tree of the SMYD, SETD and other subfamilies employing the SET domain in animal genomes and a PRDM phylogeny using maximum likelihood (ML).

Fig. S13 Neighbor-joining (NJ) trees showing the evolutionary relationships in the Suv subfamily from Arabidopsis thaliana, Arabidopsis lyrata, Brassica rapa and Carica papaya.

Fig. S14 A maximum likelihood (ML) tree showing the evolutionary relationships and domain architectures of all subfamilies in plants, animals, fungi and protists.

Fig. S15 The expression of SET genes in Arabidopsis, rice and humans during meiosis.

Fig. S16 The expression of duplicated SET genes in Arabidopsis and humans.

Fig. S17 The RNA-Seq expression of duplicated SET genes in rice.

Table S1 The number of SET genes in all sequenced genomes

Table S2 A list of SET genes in select genomes

Table S3 Rates of synonymous (dS) and nonsynonymous (dN) substitutions between orthologous SET genes in (a) human and chimpanzee and (b) Arabidopsis thaliana and A. lyrata

Table S4 The expression of SET genes in Arabidopsis

