Funding sources: Sequenom, Inc. and Sequenom Center for Molecular Medicine (SCMM). The sponsors participated in the design of the study, data collection, data management, data analysis, decision to submit the report for publication, manuscript preparation, and review.
Conflicts of interest: Mazloom, Džakula, Wang, Oeth, Jensen, Tynan, McCullough, Saldivar, Bombard, and Deciu are employees of Sequenom Center for Molecular Medicine and Sequenom shareholders. Ehrich, van den Boom, Maeder, and McLennan are employees of Sequenom, Inc. and Sequenom shareholders. Palomaki and Canick were members of the Sequenom Clinical Advisory Board between November 2007 and October 2008 and resigned when they received study funding. Palomaki and Canick were Co-Principal Investigators for a Women & Infants Hospital of Rhode Island project fully funded through a grant from Sequenom, Inc., between October 2008 and February 2012.
Whole-genome sequencing of circulating cell free (ccf) DNA from maternal plasma has enabled noninvasive prenatal testing for common autosomal aneuploidies. The purpose of this study was to extend the detection to include common sex chromosome aneuploidies (SCAs): [47,XXX], [45,X], [47,XXY], and [47,XYY] syndromes.
Massively parallel sequencing was performed on ccf DNA isolated from the plasma of 1564 pregnant women with known fetal karyotype. A classification algorithm for SCA detection was constructed and trained on this cohort. Another study of 411 maternal samples from women with blinded-to-laboratory fetal karyotypes was then performed to determine the accuracy of the classification algorithm.
In the training cohort, the new algorithm had a detection rate (DR) of 100% (95% CI: 82.3%, 100%), a false positive rate (FPR) of 0.1% (95% CI: 0%, 0.3%), and a nonreportable rate of 6% (95% CI: 4.9%, 7.4%) for SCA determination. The blinded validation yielded similar results: a DR of 96.2% (95% CI: 78.4%, 99.8%), an FPR of 0.3% (95% CI: 0%, 1.8%), and a nonreportable rate of 5% (95% CI: 3.2%, 7.7%) for SCA determination.
Since it was first reported by Lo et al that fetal nucleic acids were present in maternal plasma in the form of circulating cell free (ccf) DNA fragments, ccf DNA has shown promise as a novel analyte for the development of a noninvasive approach to prenatal genetic testing. This promise, following the publication of a number of major research and clinical studies that have described high precision methods to detect fetal genetic disorders using massively parallel sequencing (MPS), has now been realized.[2-8] Initial studies reporting the accurate detection of trisomy 21 were soon extended to encompass additional autosomal variations including trisomy 18, trisomy 13, and subchromosomal copy number variations, and finally even the complete reconstruction of a fetal genome by sequencing maternal plasma derived ccf DNA.[9-13]
In addition to autosomal variants, a number of clinical disorders have been linked to the copy number of sex chromosomes. Among the most common sex chromosome aneuploidy (SCA) conditions are Turner syndrome [45,X], Trisomy X [47,XXX], Klinefelter syndrome [47,XXY], and [47,XYY] syndrome (sometimes referred to as Jacobs syndrome). Although each of these conditions is individually relatively rare, cumulatively SCAs occur in approximately 0.3% of all live births. Indeed, the population prevalence of all SCAs surpasses the birth prevalence of autosomal chromosomal abnormalities (e.g. trisomies 21, 18, or 13). This reflects the fact that SCAs are rarely lethal and their phenotypic features are less severe than those of autosomal chromosomal abnormalities. According to the World Health Organization (WHO), SCAs account for nearly one half of all chromosomal abnormalities in humans. WHO also reports that one out of every 400 phenotypically normal humans (0.25%) has some form of SCA. This study focuses only on SCAs that involve whole sex chromosomes (i.e. not deletions, isochromosomes, etc.).
Recent reports, including proof-of-concept studies, indicate that SCAs can be detected using ccf DNA and MPS, albeit at an accuracy lower than for autosomal trisomies.[15, 16] Similar to the analysis methods used to determine autosomal aneuploidies through whole genome sequencing,[2, 4-6, 8] prenatal detection of SCAs is based on quantification of chromosomal dosages. If the measured deviation originated from the fetus, it is proportional to the fraction of the fetal DNA in the maternal plasma. The noninvasive detection of SCAs presents a number of additional challenges when compared with the detection of autosomal aneuploidies. Among these are sequencing bias associated with genomic guanine and cytosine (GC) composition and the sequence similarity between chromosomes X and Y, which leads to mapping challenges. Moreover, two chromosomes need to be simultaneously assessed amid a background of presumably normal maternal sex chromosomes while the sex of the fetus remains unknown. In addition, the homology between chromosome Y and other chromosomes reduces the signal-to-noise ratio. Furthermore, the small size of the Y chromosome can result in large variations in its measured representation. Finally, the unknown presence of maternal and/or fetal mosaicism can hinder optimal quantification of chromosomal representations and can impede SCA classification.[17, 18]
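The proportionality between a fetal dosage deviation and the fetal fraction can be illustrated with a simple mixture model. The following Python sketch is not part of the study's pipeline: it assumes that plasma ccf DNA is an ideal mixture of maternal and fetal DNA, that the mother is non-mosaic, and that there is no measurement bias. The function name and parameters are hypothetical.

```python
def expected_relative_dosage(fetal_copies: int, maternal_copies: int,
                             fetal_fraction: float) -> float:
    """Expected chromosome dosage in plasma, relative to a purely maternal background.

    Illustrative mixture model only: a share `fetal_fraction` of the ccf DNA
    comes from the fetus and the remainder from the mother.
    """
    if not 0.0 <= fetal_fraction <= 1.0:
        raise ValueError("fetal_fraction must be between 0 and 1")
    if maternal_copies <= 0:
        raise ValueError("maternal_copies must be positive")
    mixed = (1.0 - fetal_fraction) * maternal_copies + fetal_fraction * fetal_copies
    return mixed / maternal_copies


# Chromosome X with a [46,XX] mother and an assumed fetal fraction of 10%.
for label, copies in [("[45,X]", 1), ("[46,XX]", 2), ("[47,XXX]", 3)]:
    print(label, round(expected_relative_dosage(copies, 2, 0.10), 3))
```

Under these assumptions the deviation from the euploid value is half the fetal fraction, so halving the fetal fraction halves the signal that must be separated from noise.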
Development of appropriate data sets and algorithms can overcome these challenges and can enable the accurate noninvasive detection of SCA. As little as 4% fetal DNA is sufficient for accurate detection of fetal autosomal trisomy.[2, 4-6] This study reports the establishment of a training set for SCA detection as well as the testing of the newly developed assay and algorithm on a blinded validation set. Overall, we demonstrate in this study that SCAs can be detected noninvasively with high sensitivity and a low false positive rate.
Study design and sample collection
The training cohort was composed of frozen plasma sample aliquots from 1564 pregnant women collected as part of an independently developed and coordinated previous study.[4-6] These samples were selected from a residual bank of aliquots collected for a prior nested case-control study of pregnant women at high risk for fetal aneuploidy.[4-6] Samples involved were collected between 10.5 and 20 weeks gestation, prior to invasive amniocentesis or chorionic villus sampling. Karyotype results, including sex chromosomal abnormalities, were obtained for all samples. This cohort was employed as part of the laboratory development process of an improved assay for detection of autosomal aneuploidy. The training set included 9 [45,X] samples, 5 [47,XXX] samples, 8 [47,XXY] samples, and 1 [47,XYY] sample. A separate blinded clinical validation cohort consisted of samples from 411 pregnancies, collected within a similar gestational period, from pregnant women at high risk for fetal aneuploidy, also prior to invasive sampling. The selection criteria excluded samples identified as coming from patients with multiple gestations, mosaic for sex chromosomes, or having no documented karyotype report available (n = 9). The demographic information for these samples is presented in Supporting Table 1.
These 411 samples included 21 [45,X] samples, 1 [47,XXX] sample, 5 [47,XXY] samples, and 3 [47,XYY] samples. All samples used for the validation cohort had at least two 4 mL plasma aliquots available per patient.
All samples were obtained from subjects 18 years of age or older who provided Institutional Review Board (IRB) (or equivalent) approved informed consent. Samples for the training cohort were collected, as previously described, as part of an international collaboration (ClinicalTrials.gov NCT00877292). Samples for the validation cohort were collected under the following IRB approved clinical studies: Western IRB no. 20091396, Western IRB no. 20080757, Compass IRB no. 00351.
Blood collection and plasma fractionation
Up to 50 mL of whole blood was collected from patients into EDTA-K2 spray-dried 10 mL Vacutainers (EDTA tubes; Becton Dickinson, Franklin Lakes, NJ, USA). Whole blood samples were refrigerated or stored on wet ice and were processed to plasma within 6 h of the blood draw. The maternal whole blood in EDTA tubes was centrifuged (Eppendorf 5810R plus swing out rotor) at 4 °C at 2500 g for 10 min, and the plasma was collected. The EDTA plasma was centrifuged a second time (Eppendorf 5810R plus fixed angle rotor) at 4 °C at 15 500g for 10 min. After the second spin, the EDTA plasma was removed from the pellet that formed at the bottom of the tube and distributed into 4 mL barcoded plasma aliquots and immediately stored frozen at ≤−70 °C until DNA extraction.
Fetal quantifier assay
The quantity of ccf DNA in each sample was assessed by the Fetal Quantifier Assay as previously described.[3-6, 19, 20]
Circulating cell free (ccf) plasma DNA isolation and purification
The ccf DNA was isolated from up to 4 mL plasma using the QIAamp Circulating Nucleic Acid Kit (QIAGEN Inc., Valencia, CA, USA) as previously described.[2, 20] A minimum of 3.5 mL initial plasma volume was required for final classification of fetal SCA. The ccf DNA was eluted in a final volume of 55 μL.
Sequencing library preparation
The ccf DNA libraries were prepared in 96-well plate format from 40 μL of ccf DNA per donor following the Illumina TruSeq library preparation protocol (Illumina, Inc., San Diego, CA, USA) with AMPure XP magnetic bead clean-up (Beckman Coulter, Inc., Brea, CA, USA) on a Caliper Zephyr liquid handler (PerkinElmer Inc., Santa Clara, CA, USA). TruSeq indexes 1 through 12 were incorporated into libraries. No size fractionation of ccf DNA libraries was required because of the characteristic fragmentation of ccf DNA. Libraries were quantified on the Caliper LabChip GX (PerkinElmer Inc., Santa Clara, CA, USA) and normalized to the same concentration.[2-4]
Multiplexing, clustering, and sequencing
The ccf DNA libraries were pooled row-wise at a 12-plex level, clustered to Illumina HiSeq 2000 v3 flow cells, and the ccf DNA insert sequenced for 36 cycles on a HiSeq 2000. Index sequences were identified with seven cycles of sequencing.
Prior to sequencing, each sample library was assessed for DNA content. The results were translated to a concentration measure. Only samples with DNA concentrations greater than 7.5 nM/L were accepted in the final analysis. Samples with fetal DNA fractions less than the detection limit of 4% were rejected. Furthermore, because the contribution of the fetal DNA in the maternal plasma is expected to be less than 50%, samples with a reported fetal fraction exceeding 50% were deemed invalid and were also excluded. To assure the quality of the sequencing step, a set of post-sequencing quality control (QC) metrics was imposed. The QC criteria included (1) a minimum number of 9 million autosomal aligned reads per sample, (2) a lower cut-off for the aligned reads partitioned into 50 kBp regions as filtered for the regions with repeated DNA sequences, subjected to GC content correction, and divided by the total raw counts, as well as (3) the observed curvature of the counts-versus-GC content estimated in the context of the 50 kBp regions.
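The acceptance criteria with explicit numeric thresholds can be summarized as a simple filter. The Python sketch below is illustrative only: the record type, field names, and example values are hypothetical, and QC criteria (2) and (3) are omitted because their cut-offs are not stated in the text.

```python
from dataclasses import dataclass
from typing import List

# Thresholds quoted in the text; the constant names are illustrative.
MIN_LIBRARY_CONCENTRATION = 7.5   # same units as reported for library concentration
MIN_FETAL_FRACTION = 0.04
MAX_FETAL_FRACTION = 0.50
MIN_AUTOSOMAL_READS = 9_000_000


@dataclass
class SampleMetrics:
    """Hypothetical per-sample record; field names are not from the study."""
    library_concentration: float
    fetal_fraction: float
    autosomal_aligned_reads: int


def qc_failures(m: SampleMetrics) -> List[str]:
    """Return the reasons a sample would be rejected (empty list means pass)."""
    failures = []
    if m.library_concentration <= MIN_LIBRARY_CONCENTRATION:
        failures.append("library concentration too low")
    if m.fetal_fraction < MIN_FETAL_FRACTION:
        failures.append("fetal fraction below 4%")
    if m.fetal_fraction > MAX_FETAL_FRACTION:
        failures.append("fetal fraction above 50%")
    if m.autosomal_aligned_reads < MIN_AUTOSOMAL_READS:
        failures.append("fewer than 9 million autosomal aligned reads")
    return failures


# Made-up example values, for illustration only.
print(qc_failures(SampleMetrics(12.0, 0.11, 14_500_000)))
print(qc_failures(SampleMetrics(6.0, 0.03, 8_000_000)))
```

Returning the list of failed criteria, rather than a single pass/fail flag, keeps the reason for a nonreportable result available for later review.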
Following sequencing, adapters were removed from the qualified reads. The reads were then demultiplexed according to their barcodes and aligned to human reference genome build 37 (hg19) using the Bowtie 2 short read aligner. Only perfect matches within the seed regions were allowed for the final analysis.
To remove systematic biases from raw measurements, we extended a previously developed normalization procedure to sex chromosomes. All chromosomes were partitioned into contiguous, nonoverlapping 50 kBp genomic regions and parameterized. The normalization parameters for chromosome X were derived from a subset of 480 euploid samples corresponding to known female fetuses. Filtering of genomic regions yielded the final subsection comprising 76.7% of chromosome X. This subsection was employed to quantify the amount of chromosome X present in the sample. A similar procedure was used for chromosome Y using a separate training set of 23 pooled adult male samples. A subset of regions representing 2.2% of chromosome Y was found to be specific to males. Those regions were then used to quantify the representation of chromosome Y. The normalization method used for region selection also enables detection of subchromosomal abnormalities and does not depend on any study-specific optimized normalized chromosomal ratio as done in other studies.
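The partitioning step can be sketched as follows. This Python example only illustrates assigning aligned reads to contiguous, nonoverlapping 50 kBp regions and summing counts over a chosen subset of regions; it does not reproduce the normalization or the region filtering described above. The function names, the 0-based coordinate convention, and the example reads are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

BIN_SIZE = 50_000  # 50 kBp regions, as described in the text


def count_reads_per_bin(read_starts: Iterable[Tuple[str, int]],
                        bin_size: int = BIN_SIZE) -> Dict[Tuple[str, int], int]:
    """Count reads per (chromosome, bin index), using 0-based start positions."""
    counts: Counter = Counter()
    for chrom, start in read_starts:
        counts[(chrom, start // bin_size)] += 1
    return dict(counts)


def sum_selected_bins(counts: Dict[Tuple[str, int], int], chrom: str,
                      selected_bins: Iterable[int]) -> int:
    """Total reads falling in a chosen subset of bins on one chromosome."""
    return sum(counts.get((chrom, index), 0) for index in selected_bins)


# Made-up read start positions, for illustration only.
reads = [("chrX", 10), ("chrX", 49_999), ("chrX", 50_000), ("chrY", 120_000)]
bins = count_reads_per_bin(reads)
print(bins)
print(sum_selected_bins(bins, "chrX", [0]))
```

Keeping the selected bin indices as an explicit input mirrors the idea that only a filtered subsection of each sex chromosome contributes to its quantification.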
The developed SCA algorithm is sex-specific. The fetal sex is predicted according to the algorithm described in Mazloom et al. Next, SCA is assessed separately for male and female pregnancies. X chromosome aneuploidies ([45,X] and [47,XXX]) are considered for putative female fetuses, whereas Y chromosome aneuploidies ([47,XXY] and [47,XYY]) are evaluated for putative male fetuses. For both sexes, chromosome representations are evaluated as the ratios of normalized chromosome X and Y read counts in the genomic regions described previously, versus the total autosomal read counts.
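A chromosome representation of this kind is a ratio of counts, which can then be compared with euploid reference samples. The Python sketch below is a simplified illustration: the standardization by the mean and sample standard deviation of a reference set is an assumption suggested by the z-score ranges quoted for the borderline regions, not a description of the study's actual estimator, and all names and values are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def chromosome_representation(selected_region_counts: float,
                              total_autosomal_counts: float) -> float:
    """Ratio of normalized counts in the selected regions to total autosomal counts."""
    if total_autosomal_counts <= 0:
        raise ValueError("total_autosomal_counts must be positive")
    return selected_region_counts / total_autosomal_counts


def z_score(value: float, euploid_reference: Sequence[float]) -> float:
    """Standardize a representation against euploid reference values (assumed method)."""
    sigma = stdev(euploid_reference)
    if sigma == 0:
        raise ValueError("reference values have zero spread")
    return (value - mean(euploid_reference)) / sigma


# Made-up numbers, for illustration only.
reference = [0.048, 0.050, 0.052]
sample = chromosome_representation(540_000, 10_000_000)
print(round(z_score(sample, reference), 2))
```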
In brief, samples identified as representing female fetuses were labeled as [XX] if they fell within a range compatible with the [46,XX] samples from the training set. Under-representation of chromosome X led to a labeling of [45,X], whereas over-representation of chromosome X led to a labeling of [47,XXX]. To avoid maternal interference with the chromosome X fetal sex aneuploidy calls, we enforced lower and upper boundaries on chromosome X representations for the prediction of [45,X] and [47,XXX]. Such lower and upper thresholds were determined by calculating the maximum theoretical chromosomal representation of a sample with 70% fetal fraction. Determination of SCA for the putative female samples for which the chromosome X representation fell within borderline ranges was not performed. In z-score space, these ranges correspond to [−3.5;−2.5] and [2.5;3.5].
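The labeling rule for putative female samples can be sketched in z-score space. The Python example below uses the borderline ranges quoted above and assumes that values between −2.5 and 2.5 are compatible with [46,XX] and that values beyond ±3.5 are called aneuploid. It omits the additional lower and upper boundaries used to avoid maternal interference, because their values are not given here. The function name and labels are illustrative.

```python
BORDERLINE_LOW = 2.5   # inner edge of the borderline ranges quoted in the text
BORDERLINE_HIGH = 3.5  # outer edge of the borderline ranges quoted in the text


def classify_putative_female(chr_x_z: float) -> str:
    """Label a putative female sample from its chromosome X z-score (simplified)."""
    magnitude = abs(chr_x_z)
    if magnitude < BORDERLINE_LOW:
        return "XX"
    if magnitude <= BORDERLINE_HIGH:
        # Falls in [-3.5, -2.5] or [2.5, 3.5]: no SCA determination is made.
        return "nonreportable"
    return "45,X" if chr_x_z < 0 else "47,XXX"


for z in (-5.0, -3.0, 0.4, 3.0, 6.2):
    print(z, classify_putative_female(z))
```

Treating the borderline ranges as a separate outcome trades a higher nonreportable rate for fewer false calls near the decision thresholds.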
Samples identified as representing male fetuses were labeled as [XY] provided they followed a pattern of chromosome X and Y distribution compatible with the [46,XY] samples in the training set. Over-representation of chromosome X in putative male samples, if comparable with the distribution of chromosome X for [46,XX] samples, led to a labeling of [47,XXY]. Over-representation of chromosome Y in putative male samples led to a labeling of [47,XYY]. Determination of SCA was not performed for putative male samples for which the fetal contribution was insufficient.
The nonreportable category included samples affected either by analytical failures (fetal fraction <4% or >50%, library concentration < 7.5 nM/L, or QC requirements not satisfied) or by the region in which a sex aneuploidy assessment cannot be performed.
Sex chromosome aneuploidy detection in the training set
The performance of the classification algorithm is summarized in Table 1. Data for the training set are shown in Figure 1A and B. In this set, there were 740 samples with a karyotype result that indicated female sex and data that allowed for assessment of fetal sex aneuploidy. Of these, 732 were correctly classified (720 XX, 8 X, 4 XXX) whereas 8 reportedly euploid samples were not classified as XX (3 were identified as X, 1 as XXX, and 4 as XY). Furthermore, there were 729 samples with a karyotype result that indicated male sex. Of these, 725 were correctly classified (718 XY, 6 XXY, and 1 XYY), whereas 4 euploid male samples were annotated as euploid female samples. This amounts to an overall sensitivity for the detection of SCA of 100% (95% confidence interval 82.3%–100%) and a false positive rate of 0.1% (95% confidence interval 0%–0.3%). The nonreportable rate relating to SCA determination was 6% (95% confidence interval 4.9%–7.4%).
Table 1. Sex chromosome results for the training set, the validation set, and the combined datasets. Gray filled boxes indicate those results where the karyotype and test result agree. Columns correspond to karyotype results, whereas rows correspond to the test outcome.
Sex chromosome aneuploidy detection in the validation set
Data for the validation set are shown in Figures 1C and D. One hundred ninety one samples with a karyotype result that indicated female sex had data that allowed for assessment of fetal sex aneuploidy. Of these, a total of 185 were correctly classified (167 XX, 17 X, 1 XXX). There was 1 false positive and 1 false negative for [45,X], and 4 XX samples were predicted to be XY. Of the 199 samples with a karyotype result indicating male sex, 198 were correctly classified (191 XY, 5 XXY, 2 XYY) and 1 was annotated as a female sample. Thus, the overall sensitivity for the detection of SCA was 96.2% (95% confidence interval 78.4%–99.8%) and the false positive rate was 0.3% (95% confidence interval 0%–1.8%). The nonreportable rate relating to SCA determination was 5% (95% confidence interval 3.2%–7.7%).
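The point estimates for the validation set can be rechecked from the counts given above. The Python sketch below makes three assumptions that are not stated explicitly in the text: euploid samples with a discrepant fetal sex call are counted as euploid rather than as SCA false positives, the single false negative has a [45,X] karyotype, and the nonreportable samples are the difference between the 411 validation samples and those with assessable data. Confidence intervals are not recomputed because the interval method is not described.

```python
def percent(numerator: int, denominator: int) -> float:
    """Simple percentage helper."""
    return 100.0 * numerator / denominator


# Validation-set counts as reported in the text.
detected_sca = 17 + 1 + 5 + 2                      # [45,X], [47,XXX], [47,XXY], [47,XYY]
missed_sca = 1                                     # one false negative for [45,X]
false_positive_sca = 1                             # one euploid sample called [45,X]
euploid_assessed = (167 + 1 + 4) + (191 + 1)       # female and male euploid samples
total_samples = 411
assessed_samples = 191 + 199

detection_rate = percent(detected_sca, detected_sca + missed_sca)
false_positive_rate = percent(false_positive_sca, euploid_assessed)
nonreportable_rate = percent(total_samples - assessed_samples, total_samples)

print(round(detection_rate, 1), round(false_positive_rate, 1), round(nonreportable_rate, 1))
# prints: 96.2 0.3 5.1
```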
The results from the training set are comparable with the ones from the validation set. This is not surprising, given the fact that the training set was mostly used in order to learn the regions on chromosomes X and Y best suited for SCA discrimination and for estimating data densities for euploid samples alone. The nonreportable rate is higher among samples pertaining to male fetuses. This is due to the fact that fewer genomic regions are included from chromosome Y, which leads to a weaker signal-to-noise ratio.
Discrepant fetal sex results were related to transcription errors that occurred during the unblinding process. These errors account for 4 of the 5 discrepant results for euploid samples. The data presented in Table 1 are not corrected for these discrepancies.
Our laboratory test reporting is currently confined to the most common autosomal aneuploidies (trisomies 21, 18, and 13). The test methodology has been designed from inception as a whole genome analysis. The rationale has been that, when validated, as in the present study, this would allow the future addition of other clinically relevant fetal chromosomal abnormalities without fundamental changes to the assay process. As sequencing technologies decrease in cost, this will likely allow for transition to a noninvasive fetal analysis of the information content currently achieved by karyotyping, or better yet comparative genome hybridization techniques, but without the need for an invasive procedure. A logical next step towards this goal is the inclusion of information regarding other fetal chromosomal complements such as sex chromosome abnormalities.
Sex chromosome aneuploidies, such as Turner and Klinefelter syndromes, are less common than the autosomal trisomies, with an incidence of roughly 1 in 3000 live-born infants for Turner syndrome and 1 in 500–1000 for Klinefelter syndrome. The prevalence of monosomy X is much higher in the first trimester of pregnancy but often results in subsequent in-utero lethality. The clinical features of the common SCAs range from abnormalities in stature to more severe cardiac defects, though the phenotypic abnormalities are rarely life-threatening. However, the incidence of mosaicism in these disorders may complicate noninvasive prenatal testing (NIPT), particularly with respect to gonadal dysgenesis in the adult female population (patients in whom the phenotypic abnormalities tend toward menstrual dysfunction, ovulatory defects, and premature ovarian failure).
Until recently, ultrasound was the only noninvasive method that could suggest these conditions prenatally, and invasive testing was required for further elucidation. Although Turner syndrome may exhibit a finding, such as a cystic hygroma, sonographic abnormalities in other SCAs may be quite subtle, even for the most skilled technologist, or nonexistent altogether.
The algorithm presented here for SCAs led to a combined sensitivity of 96.2% (95% confidence interval 78.4%–99.8%) and false positive rate of 0.3% (95% confidence interval 0%–1.8%) with a nonreportable rate of 5% (95% confidence interval 3.2%–7.7%). NIPT, with demonstrated robust performance in detecting the more common autosomal trisomies, can clearly also aid as an adjunctive test to ultrasound in detecting SCAs, as this study has demonstrated.
An important ongoing challenge is to address the counseling-related issues that accompany the diagnosis of a SCA by NIPT. As noted previously, the phenotypic presentation of SCAs may vary considerably. In particular, women of reproductive age who appear phenotypically normal, but manifest menstrual irregularities such as oligomenorrhea may, in fact, be mosaic for their sex chromosomes, for example, [45,X]/[46,XX]/[47,XXX]. Pregnancy can occur in such patients, and the diagnosis of such a chromosomal finding by NIPT could represent not only a fetal SCA but also low-level maternal mosaicism – particularly in the presence of normal targeted fetal ultrasound imaging. For a full discrimination of fetal versus maternal mosaicism, sequencing of the buffy coat (where maternal cells are found) can be considered. However, the number of positive cases suggesting fetal SCA is expected to be very small and should be put into the context of the current primary indication and demand in the laboratory for NIPT.
Some recent publications describing noninvasive prenatal testing state a detection rate of 100% of trisomy 21. In clinical practice, a test with perfect performance is impossible to achieve. Recent reports indicate that false negatives do occur even when such testing has claimed 100% detection. Multiple reasons can account for such errors: insufficient fetal DNA, confined placental mosaicism, maternal metastatic disease, or chain of custody events.
At present, American Congress of Obstetricians and Gynecologists guidelines advocate confirmatory diagnostic testing for those cases in which there is an over-representation of chromosome 21, 18, or 13 material suggesting fetal Down syndrome, Edwards syndrome, or Patau syndrome. The data from this cohort of SCAs demonstrate robust testing performance similar to what has been seen with the autosomal aneuploidies. The same approach of confirming positive cases with invasive testing could be considered when NIPT suggests the rare presence of Turner, Klinefelter, or XYY syndromes.
We thank the members of the SCMM Bioinformatics group for developing and supporting the infrastructure required for the SCA algorithm, in particular Sung Kim for his contributions to the region selection method; Tim Lu, Zhanyang Zhu, Xiaojun Guan, Eyad Almasri, and Jennifer Geis for their contributions in testing the algorithm and implementing an efficient computational pipeline; and Christine Chin for database programming.
We thank personnel at the 27 clinical sites for the Women & Infants Study for enrollment of women at high risk of aneuploidy as well as for sample and fetal karyotype collection. These sites and Principal Investigators are listed in an earlier study.
We thank Jaclyn Olsen and Sara Moellering of Sequenom for their oversight of the clinical studies, and Clare Gibbons at North York General Hospital for her diligence in collecting and entering the clinical data for more than half of the samples analyzed. We also thank Sonia Minassian for acting as the independent third party within the unblinding process of the validation set.
WHAT'S ALREADY KNOWN ABOUT THIS TOPIC?
Circulating cell-free fetal DNA present in maternal plasma enables noninvasive prenatal testing.
Using massively parallel sequencing (MPS), accurate detection of trisomies of chromosomes 21, 18, and 13 has been demonstrated and is now offered as a laboratory developed test.
Detection of monosomy X through MPS is also possible. Fewer data are available for other sex chromosome aneuploidies.
WHAT DOES THIS STUDY ADD?
By using a comprehensive bioinformatic model, we demonstrate that accurate detection of the most common sex chromosome aneuploidies (SCA) through whole-genome massively parallel sequencing can be achieved.
The SCA detection algorithm will complement the already existing methods for detection of autosomal trisomy.