Epigenetic analysis of regulatory T cells using multiplex bisulfite sequencing

This work was supported by Wellcome Trust Grant 096388, JDRF Grant 9-2011-253, the National Institute for Health Research Cambridge Biomedical Research Centre (BRC) and Award P01AI039671 (to LSW. and JAT.) from the National Institute of Allergy and Infectious Diseases (NIAID). CW is supported by the Wellcome Trust (089989). The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of NIAID or the National Institutes of Health. The Cambridge Institute for Medical Research is in receipt of Wellcome Trust Strategic Award 100140. We gratefully acknowledge the participation of all NIHR Cambridge BioResource volunteers. We thank the Cambridge BioResource staff for their help with volunteer recruitment. We thank members of the Cambridge BioResource SAB and Management Committee for their support of our study and the National Institute for Health Research Cambridge Biomedical Research Centre for funding. We thank Fay Rodger and Ruth Littleboy for running the Illumina MiSeq in the Molecular Genetics Laboratories, Addenbrooke's Hospital, Cambridge. This research was supported by the Cambridge NIHR BRC Cell Phenotyping Hub. In particular, we wish to thank Anna Petrunkina Harrison, Simon McCallum, Christopher Bowman, Natalia Savinykh, Esther Perez and Jelena Markovic Djuric for their advice and support in cell sorting. We also thank Helen Stevens, Pamela Clarke, Gillian Coleman, Sarah Dawson, Jennifer Denesha, Simon Duley, Meeta Maisuria-Armer and Trupti Mistry for acquisition and preparation of samples.


Epigenetic analysis of regulatory T cells using multiplex bisulfite sequencing
A subset of regulatory CD4 + T cells (Tregs) is characterized by the stable constitutive expression of the transcription factor FOXP3. Demethylation at a conserved region within intron 1 of FOXP3, the Tregspecific demethylated region (TSDR), is exclusive to this subset of Tregs: other immune cells that do not express FOXP3 or express the transcription factor transiently, such as activated effector T cells or TGF-β-treated CD4 + T cells, are methylated at the TSDR [1][2][3][4]. Hence, the TSDR provides a specific target to enumerate Tregs within biological samples; developing methods that enable robust Treg-cell quantitation with small numbers of cells from clinical samples remains an important goal. Current methods to measure FOXP3 demethylation include qPCR [5,6] and epigenetic sequencing methylation analysis [7] of Sanger sequencing traces [8], neither of which capture the methylation information at each CpG site on a single piece of DNA, but rather produce an average methylation over the whole region or an average at each site. Although several studies have used a PCR cloning and sequencing method to assess the methylation of each CpG site within the TSDR, this is a laborious method and normally less than 50 clones per sample are analyzed [3,9,10].
We therefore developed a nextgeneration sequencing (NGS) method to assess at single-base resolution the methylation status of the FOXP3 TSDR. The method, which we originally developed to genotype rs1800521 single nucleotide polymorphism in the gene AIRE (Methods detailed as Supporting Information), reports at single base resolution the methylation of each DNA amplicon, provides hundreds to thousands of reads per replicate and can be multiplexed to analyze several DNA regions simultaneously.
Our method uses two-stage PCR: the first is a gene-specific PCR with primers having an adaptor sequence that is targeted by a second round PCR, which adds a unique index sequence enabling up to 960 different samples to be processed per sequencing run. We targeted nine CpG sites within the TSDR (Supporting Information Fig. 1), which have also been used in qPCR assays [6]. Our primers flank the TSDR alleviating the need for methylation-specific primers as well as the demethylated and methylated standards required in qPCR assays. We also developed a bioinformatics processing pipeline that reads the raw output from sequencing and reports the methylation status at each CpG (https:// github.com/XinYang6699/Methpup). To validate our method, we sequenced the TSDR in three CD4 + T-cell subsets, Tregs and naïve and memory effector cells, which were defined by surface molecules CD4, CD25, CD127 [11] and purified by flow sorting from two male and two female * These authors contributed equally to this work.  Table 1). We observed a linear relationship between the proportion of Tregs and the proportion of FOXP3 demethylation in defined mixtures of Tregs and naïve CD4 + T cells (r 2 = 0.99, Supporting Information Fig. 3). Cell inputs below 2000 cells/PCR replicate showed increasing assay variation so unless cells are extremely limiting, we typically utilize 2000-4000 cells per replicate and six replicates per cell sample (Supporting Information Fig. 4).
As expected based on previous publications where the TSDR was shown using multiple technologies to be demethylated in a majority of Tregs [1,3,8], 86% and 62% of the TSDR sequences were demethylated in Treg samples from male donors and 44% and 40% from female donors (Fig. 1, Supporting Information Fig.  2, Supporting Information Table 1). Owing to X inactivation, demethylation of the FOXP3 TSDR occurs on only one copy of the X chromosome in females thereby explaining the ß50% lower demethylation level as compared to males. Although in Tregs the majority of highly demethylated sequencing reads are demethylated at all nine CpG sites considered, a proportion (<10%) of reads have only eight CpG sites demethylated. This incomplete demethylation could be due to biological effects or to incomplete bisulfite conversion. We favor the former possibility since bisulfite conversion efficiency as determined by cytosines that cannot be methylated was greater than 99% (Supporting Information Fig. 5A). Underscoring this biological heterogeneity, in naïve and memory effector cells demethylation of a single CpG among the nine sites was observed at each  Table 1). See Supporting Information Fig. 7 for gating strategy defining CD4 + TCRαβ + T cells. The x axis shows the nine methylation sites analyzed and each row indicates the methylation status of one copy of the sequenced TSDR. Light green represents a C (methylated) and dark green represents a T (demethylated). The y axis is the percentage of sequencing reads for each methylation/demethylation pattern. Six replicates of 3000 cells per replicate (5 ng DNA) were analyzed from a single donor and the median (with range) reads with eight or nine sites demethylated at FOXP3 for the replicates is shown below each graph. (D) A schematic of the FOXP3 gene and the TSDR sequence are shown.
position interrogated and, when combined, comprised ß20% of the reads. Thus the use of NGS to quantify demethylation has the advantage of revealing heterogeneity inherent in the biological process of gene regulation. As expected, and in contrast to Tregs, less than 2% of the TSDR reads from naïve and memory effector cells had a Treg-like TSDR methylation pattern. Although this low level of Treg-specific sequences could be caused by suboptimal cell purification, Tregs with low levels of surface CD25 and a demethylated TSDR have been reported [12] and could account for the low percentage of Treg-specific reads observed in effector T-cell subsets sorted by cell surface markers. The NGS-based demethylation assay offers a great advantage in detecting small subpopulations of cells demethylated at the FOXP3 TSDR as compared to methods that generate an average demethylation status at each residue since each NGS read simultaneously assesses the methylation status of all CpG sites from a single copy of the gene giving confidence in the interpretation of a demethylated read.   Fig. 1 for CTLA4 methylation status. The median (with range) reads with all seven sites demethylated at CTLA4 for the replicates is shown below each graph. Data are representative of four donors (for more information see, Supporting Information Fig. 2 and Supporting Information Table 2).
FOXP3, a procedure that is compatible with NGS-based analysis of bisulfite converted DNA [13].
Because the current version of our NGS sequencing platform utilizes up to 960 unique index sequences per sequencing run of approximately 20 million reads, each index is associated with approximately 20 000 reads. In most cases, 20 000 reads per replicate is in excess of what is required for robust measurements of percent demethylation of any particular region thereby allowing for the multiplexing of gene regions within the same sample. We used multiplexing of up to six gene regions to search for an autosomal TSDR since female samples have 50% less demethylated FOXP3 TSDR reads than male samples. In mouse studies, specific regions in multiple genes in addition to Foxp3 were identified that showed Tregspecific demethylation when compared to conventional CD4 + T cells analyzed ex vivo: Ctla4, Il2ra, Gitr, Eos, and Helios [9]. In comparing the human and mouse sequences in the regions reported, only one strong homology was observed, that in exon 2 of mouse Ctla4 and human CTLA4 (Supporting Information Fig. 6). We developed a sequencing assay for this region in the human gene, multiplexed with the FOXP3 TSDR sequencing assay, and demonstrated that demethylation of exon 2 of human CTLA4 is not Tregspecific. Although >90% of Tregs analyzed ex vivo were demethylated at all seven CpG sites examined, 21-38% of memory CD4 + T cells were demethylated at these same sites (Fig. 2, Supporting Information  Fig. 2, Supporting Information Table 2). In addition, memory effector cells displayed a heterogeneous demethylation pattern with 37-48% of reads having an intermediate methylation pattern (3T, 4T, 5T, or 6T). Thus demethylation of the human CTLA4 exon 2 region reflects more than Treg lineage commitment.
To conclude, we developed a highthroughput method for analyzing methylation at single CpG resolution that uses low DNA input and can be multiplexed. Although we have used the method to measure Treg-specific methylation, it can be easily adapted to verify genome-wide methylation results. Finally, given recent data showing that many disease-associated SNPs are located in enhancer regions [14], the presence of allele-specific methylation detected using our high-throughput method could provide a mechanism for disease-associated gene regulation.