Figure S1. The ubiquitin and small ubiquitin-like modifier (SUMO) pathway. Ubiquitin and SUMO are related proteins that undergo similar processing for their activation and conjugation to other proteins, such as histones. They are depicted generically as ubiquitin-like proteins (Ubl) in this figure. The precursor Ubl is cleaved into its mature form by a protease (hydrolase). The processed peptide is activated in an ATP-dependent manner through conjugation to the E1-activating enzyme. Following activation, the peptide is transferred to the E2-conjugating enzyme, which may require an E3 protein for interaction with the target and for transfer of the Ubl to an amino group of a lysine in the target histone. This process can be repeated to polyubiquitinate or sumoylate proteins. Removal of the modification occurs through the action of a specific protease (hydrolase) that differs from the maturation enzyme, and the Ubl peptide may be recycled or degraded. Enzymes that may carry out these functions during SUMO modification in the pea aphid are listed in supplemental Table S1.

Figure S2. Pea aphid scaffolds containing multiple histone loci. Accession numbers for the scaffolds are indicated above each image. Arrows indicate the direction of the coding region, and numbers at the bottom indicate the position within the scaffold. Each locus is labelled; darker grey arrows with white text indicate probable pseudogenes.

Figure S3. Relationships between HDAC6 and HDAC10. Enzymes encoded by the human (Hs) and pea aphid (Ap) genomes are depicted. Alignments of the 8 HDAC-like domains indicate that the HDAC10 domain b is related to the functional domains in the HDAC6 and HDAC10 proteins. Shaded boxes represent 50% or better protein sequence identity among all 8 domains.

Figure S4. Pea aphid histone methyltransferases. Schematic diagrams of SET domain proteins (left) and a phylogram of the PRMT proteins (right). Domain architectures for the SET domain proteins, based on those from Drosophila melanogaster, are provided for comparison between D. melanogaster and the pea aphid (ACYPI designations). Abbreviations include Ph, plant homeodomain (PHD); FYN/FYC, F/Y rich N/C terminus; MLL2, an N-terminal extension containing 5 PHD fingers and a high mobility group (HMG) box similar to that seen in mixed lineage leukemia 2; A, associated with SET (AWS); Bromo, bromodomain; BAH, bromo adjacent homology domain; Sri2, Set2-Rpb1 interaction motif; PWWP, proline-tryptophan-tryptophan-proline motif; SANT, named for switching-defective protein 3, adaptor 2, nuclear receptor co-repressor, transcription factor IIIB; PRE, pre-SET; RRM, RNA recognition motif; Chr, chromodomain; MBD, methyl CpG binding domain; TPR, tetratricopeptide repeat. Other domains not listed are not abbreviations. Motifs with white regions are atypical types and those with a dashed outline indicate domains present in one species but lacking in the other. An aphid Su(VAR)3-9 was previously cloned but three related genes were present in the draft sequence. Su(VAR)4-20 and pr-Set7 orthologues were not found. The PRMT phylogram is a neighbor-joining tree based on ClustalX alignments of the D. melanogaster DART sequences and the PRMT-related sequences from the pea aphid. Node labels indicate bootstrap support (per cent) for nodes with support greater than 50 per cent. Additional information on the aphid proteins can be found in supplemental Table S1.

Figure S5. An analysis of gene copy number. The relationship between gene length and the number of traces identified using MEGA BLAST against the raw A. pisum draft genome sequence deposited in the NCBI WGS trace archive was determined using 15 microsatellites and two single-copy loci. The microsatellites used were Ap-01 to Ap-08, ApH12M, ApH10M, ApH08M, ApH05M, ApH04M, ApG10M, and ApF08M. The single-copy loci used were a fragment of aspartate aminotransferase and the S-adenosylmethionine decarboxylase open reading frame. The slope of this regression line served as the basis for comparison with the slopes of regression lines derived from similar data for the indicated chromatin-related loci, with each regression line forced through the origin. The number of different loci identified in the draft genome sequence (indicated in parentheses within the key) is consistent with the ratio of the slope for each set of chromatin loci to the slope for the single-copy loci (ratios and equations are labeled for each regression line).
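The slope comparison described for Figure S5 can be illustrated with a minimal sketch. It assumes ordinary least squares with the intercept fixed at zero, and it takes the ratio of a locus slope to the single-copy slope as the copy-number estimate. The function names and all numbers below are hypothetical and are not taken from the study's data.

```python
def slope_through_origin(lengths, trace_counts):
    """Least-squares slope b for the model y = b * x (intercept fixed at zero)."""
    if len(lengths) != len(trace_counts):
        raise ValueError("lengths and trace_counts must be the same size")
    sum_xy = sum(x * y for x, y in zip(lengths, trace_counts))
    sum_xx = sum(x * x for x in lengths)
    if sum_xx == 0:
        raise ValueError("at least one non-zero length is required")
    return sum_xy / sum_xx


def slope_ratio(locus_slope, single_copy_slope):
    """Ratio of a locus slope to the single-copy reference slope."""
    return locus_slope / single_copy_slope


# Hypothetical reference data: sequence length (bp) vs. number of matching traces
# for loci assumed to be single copy.
reference_lengths = [200, 400, 600, 800]
reference_traces = [2, 4, 6, 8]

# Hypothetical data for a multi-copy chromatin locus family.
family_lengths = [300, 400, 500]
family_traces = [15, 20, 25]

reference_slope = slope_through_origin(reference_lengths, reference_traces)
family_slope = slope_through_origin(family_lengths, family_traces)
estimated_copies = slope_ratio(family_slope, reference_slope)

print(f"reference slope: {reference_slope:.4f} traces per bp")
print(f"family slope:    {family_slope:.4f} traces per bp")
print(f"slope ratio (estimated copy number): {estimated_copies:.2f}")
```

With these made-up values the family accumulates traces about five times faster per base pair than the reference, which under the stated assumption would correspond to roughly five copies.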

Table S1. Chromatin-remodelling protein coding genes identified in the pea aphid draft genome assembly

Table S2. Numbers of chromatin loci for selected orthologous groups in the eugenes database

Table S3. Jumonji Domain Identity Matrix

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

IMB_972_sm_Figure_1.eps (3524K): Supporting info item
IMB_972_sm_Figure_2.eps (4479K): Supporting info item
IMB_972_sm_Figure_3.eps (2356K): Supporting info item
IMB_972_sm_Figure_4.eps (1059K): Supporting info item
IMB_972_sm_Figure_5.eps (723K): Supporting info item
IMB_972_sm_Suppl_Tables.xls (71K): Supporting info item
IMB_972_sm_Suppl_Figures.doc (835K): Supporting info item
IMB_972_sm_Suppl_Data.pdf (653K): Supporting info item
