Benefits and limitations of whole genome versus targeted approaches for noninvasive prenatal testing for fetal aneuploidies

  • Funding sources: None
  • Conflicts of interest: None declared

ABSTRACT

The goal of noninvasively detecting fetal aneuploidies using circulating cell-free fetal DNA in maternal plasma seems to have been achieved through the use of massively parallel sequencing (MPS). To date, different MPS approaches exist, all aiming to deliver reliable results in a cost-effective manner. The most widely used approach is the whole genome MPS method, in which sequencing is performed on maternal plasma DNA to determine the presence of a fetal trisomy. To reduce costs, targeted approaches, which analyze only loci from the chromosome(s) of interest, have been developed. This review summarizes the different MPS approaches and their benefits and limitations, and discusses the implications for future noninvasive prenatal testing. © 2013 John Wiley & Sons, Ltd.

INTRODUCTION

Assessing the genetic condition of the fetus using fetal DNA obtained via noninvasive procedures [noninvasive prenatal testing (NIPT)] has been a dream for many years. In 1997, the group of Dennis Lo for the first time reported the presence of circulating cell-free fetal DNA (ccffDNA) in the maternal plasma.[1] This ccffDNA was shown to be rapidly cleared from the maternal circulation after delivery,[2] and even though it only represents a minor fraction of the total circulating free DNA in maternal plasma, its proportion is still ~10–20% (varying between women and depending on gestational age), whereas the remainder is of maternal origin.[3, 4]

Since its discovery, many characteristics of ccffDNA have been identified and much progress has been made in the development of noninvasive prenatal tests using this material. However, for fetal aneuploidy detection, noninvasive testing is rather complex, as the chromosome or region of interest is also carried by the mother, and there is only a quantitative difference between the maternal and fetal contribution. The first promising report on noninvasive aneuploidy detection came from Lo et al. in 2007,[5] who used fetal mRNA in the maternal plasma. Subsequently, in 2008, two research groups independently opened up a new era with the successful use of massively parallel sequencing (MPS) for NIPT.[6, 7]

MASSIVELY PARALLEL SEQUENCING FOR NONINVASIVE ANEUPLOIDY DETECTION

Because the proportion of fetal DNA in the maternal plasma is only ~10% in the first trimester,[4] noninvasive aneuploidy detection using ccffDNA in maternal plasma requires a method that can accurately determine a very small increase in the total amount of a specific chromosome and/or a method to enrich for fetal DNA. With MPS, millions to billions of short DNA molecules can be analyzed simultaneously in a single sequencing run,[8, 9] revealing both the identity and quantity of DNA fragments. Given that small changes in copy number could be detected reliably with MPS, a pivotal step was to examine whether this technique could be used to detect fetal aneuploidies using (nonenriched) ccffDNA. Indeed, the initial proof-of-concept studies of Fan et al. and Chiu et al.[6, 7] described the successful application of MPS for this purpose. The DNA in maternal plasma was sequenced, and after mapping the obtained reads to the human reference genome, the (relative) numbers of fragments per chromosome were counted. If the fetus carries a trisomy, (statistically significantly) more fragments of the trisomic chromosome are expected to be present than with a diploid fetus. These initial studies showed that the MPS method allowed for correct identification of fetal trisomies 21, 18, and 13, without false negative or positive results. As a consequence of these promising results, this method has been widely validated as the basis for aneuploidy detection.
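The counting principle described above can be sketched in a few lines. This is a minimal illustration, not the analysis pipeline of any cited study: it assumes that a trisomy is called by comparing the chromosome 21 read fraction of a test sample with the mean and standard deviation of a set of euploid reference samples (a z-score), and both the cutoff of 3 and all read counts are hypothetical.

```python
from statistics import mean, stdev


def chromosome_fraction(read_counts, chrom):
    """Fraction of all mapped reads that fall on `chrom`."""
    total = sum(read_counts.values())
    return read_counts[chrom] / total


def z_score(test_counts, reference_samples, chrom="chr21"):
    """Z-score of the test sample's chromosome fraction against euploid references."""
    ref_fractions = [chromosome_fraction(c, chrom) for c in reference_samples]
    mu = mean(ref_fractions)
    sigma = stdev(ref_fractions)
    return (chromosome_fraction(test_counts, chrom) - mu) / sigma


# Hypothetical toy data: mapped reads on chr21 versus all other chromosomes.
reference = [
    {"chr21": 13000, "other": 987000},
    {"chr21": 13100, "other": 986900},
    {"chr21": 12900, "other": 987100},
    {"chr21": 13050, "other": 986950},
    {"chr21": 12950, "other": 987050},
]
euploid_test = {"chr21": 13020, "other": 986980}
trisomy_test = {"chr21": 13650, "other": 986350}  # about 5% more chr21 reads

Z_CUTOFF = 3.0  # assumed decision threshold, not taken from the cited studies

for name, sample in [("euploid", euploid_test), ("trisomy", trisomy_test)]:
    z = z_score(sample, reference)
    call = "trisomy 21 suspected" if z > Z_CUTOFF else "no excess detected"
    print(f"{name}: z = {z:.2f} -> {call}")
```

In practice, the reference set would be far larger and the counts would be corrected for effects such as GC bias before comparison.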

However, MPS is not selective in the chromosomal origin of the sequenced DNA fragments, and in a normal disomic individual, chromosome 21, for example, represents less than 1.5% of the sequenced fragments. It is therefore necessary to sequence many millions of DNA fragments, originating from the complete (nonenriched) genome, to ensure the analysis of sufficient chromosome 21 fragments to detect statistically significant differences between normal and trisomic fetuses. With only one chromosome of interest, this does not seem time-efficient or cost-efficient. Therefore, targeted alternatives to whole genome MPS have been developed. In the following sections, we first describe the different next-generation sequencing (NGS) platforms and their specifications in general terms and then discuss both whole genome and targeted MPS approaches for NIPT, including their benefits and limitations, in more detail.
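The relationship between fetal fraction and the number of reads needed can be made concrete with an idealized calculation. The sketch below assumes that read counts follow pure Poisson noise and that the chromosome of interest holds 1.5% of the reads (the upper bound quoted above); the function names and the threshold of 3 standard deviations are illustrative choices. A trisomic fetus adds one extra copy on top of two, so the chromosome's read count rises by half the fetal fraction.

```python
import math


def relative_excess(fetal_fraction):
    """Relative increase in reads from a trisomic chromosome.

    The fetus contributes three copies instead of two, so the chromosome's
    share of reads rises by fetal_fraction / 2.
    """
    return fetal_fraction / 2


def min_total_reads(fetal_fraction, chrom_share=0.015, z=3.0):
    """Idealized lower bound on mapped reads for the excess to reach z SDs.

    With n reads on the chromosome, Poisson noise is sqrt(n) and the expected
    excess is n * relative_excess, so n must be at least (z / excess) ** 2.
    Real data carry extra variance, so this is a floor, not a recommendation.
    """
    excess = relative_excess(fetal_fraction)
    chrom_reads_needed = (z / excess) ** 2
    return math.ceil(chrom_reads_needed / chrom_share)


for ff in (0.20, 0.10, 0.05):
    print(f"fetal fraction {ff:.0%}: at least {min_total_reads(ff):,} mapped reads")
```

Halving the fetal fraction quadruples the bound. The bound itself lies well below the millions of reads used in practice, because it ignores amplification and GC bias, run-to-run variation, unmappable reads, and the statistical power needed to avoid false negatives.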

NGS PLATFORMS

Today, different NGS platforms are commercially available, each with its own costs, benefits, and limitations, especially when used for NIPT. The costs of the sequencing platforms have recently been reviewed[10] and will not be discussed further here as these are often dependent on local negotiations. Among the available platforms, the use of four has been reported for NIPT, namely the Illumina Genome Analyzer/Illumina HiSeq 2000 Genome Analyzer,[6, 7, 11-14] the Applied Biosystems (ABI) SOLiD Analyzer,[15, 16] the Helicos Heliscope,[17, 18] and the IonTorrent (Life Technologies) platform.[19] These are all short-read platforms, which are very well suited to NIPT, as ccffDNA shows a fragmentation pattern of around 146 bp.[20] There is evidence that fetal DNA is shorter than maternal DNA, and this difference in length has been used to specifically enrich for fetal DNA.[13, 21, 22] Given that DNA in maternal plasma is highly fragmented, one can envision that this is the reason why no papers have appeared on the use of longer-read platforms, such as the Roche/454 (454 Life Sciences, Branford, USA) and PacBio RS (Pacific Biosciences, CA, USA), for NIPT, and these will therefore not be discussed here.

The most widely used platforms are the high-throughput HiSeq™ (Illumina, Inc.) and SOLiD™ [Life Technologies™/Applied Biosystems® (ABI)] platforms. Both rely on polymerase chain reaction (PCR)-based sample preparation but differ in sequencing chemistry: the HiSeq™ uses sequencing-by-synthesis, whereas the SOLiD™ uses sequencing-by-ligation.

HiSeq™ (Illumina, Inc.)

For the Illumina platforms, DNA fragments are ligated to adapters and attached to reaction chambers located on a flow cell. The attached DNA fragments are extended, amplified by bridge amplification, and sequenced through sequencing-by-synthesis. Using this technology, the Illumina HiSeq™ 2000 can produce up to three billion single-end reads per run, which, for most applications, makes this platform very suitable for multiplexing of samples and thus high throughput runs. The run time for the HiSeq™ 2000 varies from 5 to 14 days, depending on read length. With the arrival of the HiSeq™ 2500, one can even choose between a rapid run and a high output run, making it possible to produce even faster results. With the rapid run, 300 million single-end reads (~10 Gb) can be produced within 7 h.

SOLiD™ [Life Technologies™/Applied Biosystems® (ABI)]

For the library construction of the ABI SOLiD™ system, DNA fragments are ligated to adapters, attached to beads, and subsequently clonally amplified by emulsion PCR.[8, 23] With sequencing-by-ligation, the SOLiD4 can produce up to 0.7 billion single-end reads per run, and with the 5500xl SOLiD™, the amount of data that can be generated has increased to 1.4 billion single-end reads per run. Run time on the SOLiD™ instrument varies between 5 and 10 days, depending on the experimental set-up. The 5500xl SOLiD™ machine uses two fully configurable flowchips, each with six independent lanes, allowing the user to use between one and six lanes while paying reagent costs only for the lanes used rather than for the whole flowchip. Although this platform produces slightly less data per run than the Illumina HiSeq™ platform, it is also flexible and can be used for high throughput runs, with the recently marketed upgrade 5500xl W platform offering even more possibilities.

HeliScope® single molecule sequencer (Helicos™ Biosciences)

A weakness of the aforementioned platforms is that the amplification step in PCR-based sample preparation can introduce experimental bias, especially for GC-rich regions. Using single-molecule sequencing on, for instance, the HeliScope® Single Molecule Sequencer, this bias can be eliminated. For NIPT, a small study reported that this single-molecule sequencing platform indeed achieved a better distinction between fetal disomy and trisomy 21 than PCR-based platforms.[17] However, NIPT in routine clinical practice requires high throughput analysis, achieved via multiplexing, and this single-molecule sequencing platform is not very well suited to that purpose because multiplexing of samples is cumbersome.

Others

In addition to the high throughput sequencers mentioned previously, several benchtop sequencers that might be suitable for NIPT are available, including the MiSeq™ (Illumina, Inc.) and Ion Torrent™ PGM (Life Technologies). Illumina's MiSeq™ is based on the existing sequencing-by-synthesis chemistry, whereas the Ion Torrent™ PGM uses semiconductor sequencing technology. Both benchtop sequencers have relatively low set-up costs and shorter run times than most of the high throughput sequencers. However, an important limitation of the current benchtop machines is the amount of data that can be generated. The MiSeq™ Personal Sequencer can produce up to 3.4 Gb in 16.5 h for a 2 × 100 bp run, and the Ion Torrent™ PGM (Ion 318 chip) up to 1 Gb of 200 bp reads (~5 million reads) in 4.4 h. It is debatable whether these machines can deliver sufficient data for reliable noninvasive aneuploidy detection, particularly if samples are multiplexed, which limits the possibilities for high throughput. The forthcoming Ion Proton™ benchtop sequencer may resolve these issues as it can produce up to 10 Gb of data. However, its performance remains to be tested.
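To see why benchtop capacity is a concern, the quoted base yields can be converted into fragment counts. The sketch below is a rough back-of-the-envelope calculation: it assumes every base belongs to a full-length read, treats the two reads of a paired-end fragment as one counted fragment, and uses a hypothetical requirement of 5 million reads per sample without accounting for reads that fail to map.

```python
def fragments_from_yield(total_bases, read_length, reads_per_fragment=1):
    """Approximate number of sequenced fragments for a given base yield."""
    return total_bases // (read_length * reads_per_fragment)


def samples_supported(fragments, min_reads_per_sample):
    """Number of samples that fit if each needs `min_reads_per_sample` fragments."""
    return fragments // min_reads_per_sample


# Yields quoted in the text.
pgm_fragments = fragments_from_yield(1_000_000_000, 200)       # Ion 318 chip, 200 bp reads
miseq_fragments = fragments_from_yield(3_400_000_000, 100, 2)  # 2 x 100 bp paired-end run

MIN_READS = 5_000_000  # hypothetical per-sample requirement

for name, n in [("Ion Torrent PGM", pgm_fragments), ("MiSeq", miseq_fragments)]:
    print(f"{name}: {n:,} fragments, {samples_supported(n, MIN_READS)} sample(s) per run")
```

Under these assumptions, a benchtop run accommodates only a handful of samples at most, which is the multiplexing limitation raised above.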

WHOLE GENOME VERSUS TARGETED APPROACHES FOR ANEUPLOIDY NIPT

Whole genome approach

To date, a number of large-scale clinical validation studies have shown that whole genome MPS of ccffDNA in maternal plasma for fetal aneuploidy detection, particularly for Down syndrome, can be performed with high (nearly diagnostic) accuracy (Table 1). In terms of costs and efficiency, the number of samples that can be sequenced simultaneously, and thus the minimal number of reads, is an important issue. Clearly, insufficient numbers of sequenced reads will reduce the reliability, as the statistical significance of any difference in number of reads will drop. Too many reads, however, will increase the costs significantly, as fewer samples can be multiplexed per run. There is still an ongoing debate on how many DNA fragments should be sequenced and quantified for a reliable result (Table 1). In 2011, Chiu et al.[11] showed that when the mean number of mappable sequenced reads per sample was 2.3 million, 100% of the fetal Down syndrome cases could be correctly diagnosed, whereas when the mean number was 0.3 million, the detection rate of fetal trisomy 21 was only 79.1% (Table 1). Sparks et al.[24] stated that when using whole genome MPS, ~6.3 million uniquely mapped reads are required to ensure sufficient chromosome 21 count.

Table 1. Overview of large-scale validation studies for NIPT for Down syndrome

| Study | No. of samples | T21 samples | NGS platform | Whole genome (WG)/Targeted (T) approach | Number of mapped reads per sample | Sensitivity (%) | Specificity (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ehrich 2011[12] | 449 (4-plex) | 39 | Illumina GAIIx | WG | 3–5 million | 100 | 99.7 |
| Chiu 2011[11] | 2-plex n = 314; 8-plex n = 753 | 86 | Illumina GAIIx | WG | 2.3 million (2-plex); 0.3 million (8-plex) | 100 (2-plex); 79.1 (8-plex) | 97.9 (2-plex); 98.9 (8-plex) |
| Palomaki 2011[14] | 4664 (4-plex) | 212 | Illumina HiSeq 2000 | WG | n.s. | 98.6 | 99.8 |
| Sparks 2012[26] | 298 | 39 | Illumina HiSeq 2000 | T | 204 000/410 000/620 000 | 100 | 100 |
| Sparks 2012[24] | 163 | 35 | Illumina HiSeq 2000 | T | 1 million | | |
| Ashoor 2012[27] | 397 | 50 | n.s. | T | n.s. | 100 | 100 |
| Norton 2012[28] | 3228 | 81 | n.s. | T | n.s. | 100 | 99.7 |
| Bianchi 2012[31] | 2882 (6-plex) | 89 | Illumina HiSeq 2000 | WG | n.s. | 100 | 100 |

NIPT, noninvasive prenatal testing; NGS, next-generation sequencing; n.s., not specified.
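When reading the sensitivities in Table 1, it helps to keep the number of trisomy 21 samples in mind: a reported 100% based on a few dozen cases still leaves a sizeable confidence interval. The sketch below computes a Wilson score interval for a proportion. It is an illustration added here, not an analysis from the cited studies, and it assumes that every T21 sample listed for a study was classified.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Studies in Table 1 reporting 100% sensitivity, with their T21 sample counts.
for study, n_t21 in [("Ehrich 2011", 39), ("Norton 2012", 81), ("Bianchi 2012", 89)]:
    low, high = wilson_interval(n_t21, n_t21)
    print(f"{study}: {n_t21}/{n_t21} detected, ~95% interval {low:.1%} to {high:.1%}")
```

The lower bound rises as the number of affected cases grows, which is one reason large validation cohorts matter.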

Assuming a minimum of 5 to 10 million mappable reads per sample, theoretically 12 samples can be multiplexed per lane on the Illumina HiSeq™ 2000; thus, when all 16 lanes of the dual-flow cell system are used, 192 samples can be sequenced per run in 4 to 5 days (depending on read length). With the ABI 5500xl SOLiD™, theoretically, ~70 samples can be pooled on one flowchip, and thus, with one machine being equipped with two flowchips, a total of ~140 samples can be sequenced in one run, taking 8 to 10 days. As mentioned earlier, less expensive benchtop sequencing platforms, such as the MiSeq™, Ion Torrent™ PGM, or Ion Proton™ System, might be applicable too if sufficient reads can be obtained. The application of the Ion Torrent™ PGM for NIPT, as recently reported by Yuan et al.,[19] resulted in a mean of 3.5 million reads per sample, which is at the lower limit of the number of reads that others assume to be necessary for a reliable NIPT result. The applicability of the Ion Torrent™ PGM therefore needs to be validated in larger studies. Furthermore, the relatively low number of achievable reads reduces the possibilities for multiplexing and therefore high throughput sequencing, resulting in increased costs per sample. The advantage of lower set-up costs for such platforms therefore needs to be weighed against these disadvantages.
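The multiplexing arithmetic above can be checked directly. The sketch below uses only figures quoted in this review (samples per lane or flowchip, lanes or flowchips per run, and maximum single-end reads per run) and assumes, as an idealization, that reads are split evenly across samples.

```python
def samples_per_run(samples_per_unit, units):
    """Total samples when each lane (or flowchip) carries the same number of samples."""
    return samples_per_unit * units


def raw_reads_per_sample(reads_per_run, total_samples):
    """Even split of a run's reads across multiplexed samples (an idealization)."""
    return reads_per_run / total_samples


hiseq_samples = samples_per_run(12, 16)  # 12 samples per lane, 16 lanes
solid_samples = samples_per_run(70, 2)   # ~70 samples per flowchip, 2 flowchips

hiseq_reads = raw_reads_per_sample(3_000_000_000, hiseq_samples)  # up to 3 billion reads per run
solid_reads = raw_reads_per_sample(1_400_000_000, solid_samples)  # up to 1.4 billion reads per run

print(f"HiSeq 2000: {hiseq_samples} samples, ~{hiseq_reads:,.0f} raw reads each")
print(f"5500xl SOLiD: {solid_samples} samples, ~{solid_reads:,.0f} raw reads each")
```

Both results sit at or above the assumed 5 to 10 million mappable reads per sample, although the margin shrinks once unmappable reads and uneven pooling are taken into account.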

To address important issues such as the minimal number of reads required for reliable results and quality controls of the test, guidelines are desired.

Targeted approaches

When nontargeted approaches are used, the complete genome is sequenced, even though only one or a few chromosomes are of interest. This means that much of the information that is obtained remains unused. In an attempt to reduce sequencing costs, several groups have therefore studied the application of targeted approaches.

In 2011, Liao et al.[25] used the SureSelect Target Enrichment System (Agilent Technologies) to enrich for exons on chromosome X. Indeed, for the regions targeted by enrichment, the mean sequence coverage was 213-fold higher than that of the nonenriched samples. This increased the coverage of fetal-specific alleles within the targeted region from 3.5% to 95.9%.[25]

Sparks et al.,[24, 26] Ashoor et al.,[27] and Norton et al.[28] showed that by selective amplification of specific regions on chromosomes 21 and 18 and subsequent MPS analysis, a method referred to as digital analysis of selected regions (DANSR), the number of sequenced reads required to reliably detect fetal trisomy 18 or 21 is significantly lower than that required for whole genome MPS approaches. They designed DANSR assays against 576 nonpolymorphic loci on each of chromosomes 18 and 21. In addition, to simultaneously determine the fetal fraction in the sample, assays were designed against single nucleotide polymorphism (SNP)-containing loci on chromosomes 1 to 12. Data were analyzed in combination with a novel algorithm, fetal-fraction optimized risk of trisomy evaluation, to determine the trisomy risk for each pregnant woman.[24] Using the combination of DANSR and fetal-fraction optimized risk of trisomy evaluation, fetal trisomy 21 and 18 cases could be identified with high accuracy with ~1 million raw reads per sample, which means a fivefold to tenfold decrease in the number of reads as compared with the whole genome approach, accompanied by a decrease in sequencing costs as well as in data storage. Furthermore, as an internal control, the fetal fraction in the sample could be determined simultaneously.
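How SNP-containing loci can reveal the fetal fraction is easiest to see with a small example. The sketch below is not the published DANSR or risk-evaluation algorithm; it assumes the simplest informative case, in which the mother is homozygous at a locus and the fetus is heterozygous, so that the fetal-specific allele accounts for half of the fetal fraction of the reads. The function name and all read counts are hypothetical.

```python
def fetal_fraction_from_snps(allele_counts):
    """Estimate fetal fraction from informative SNP loci.

    Each item is (maternal_allele_reads, fetal_specific_allele_reads) at a
    locus where the mother is homozygous and the fetus is heterozygous.
    The fetal-specific allele makes up fetal_fraction / 2 of the reads,
    so the estimate is twice its pooled share.
    """
    total = sum(shared + specific for shared, specific in allele_counts)
    if total == 0:
        raise ValueError("no reads at informative loci")
    fetal_specific = sum(specific for _, specific in allele_counts)
    return 2 * fetal_specific / total


# Hypothetical read counts at three informative loci.
loci = [(950, 50), (1900, 100), (475, 25)]
print(f"estimated fetal fraction: {fetal_fraction_from_snps(loci):.1%}")
```

Pooling counts over many loci reduces the sampling noise of any single locus, which is why a panel of SNP assays is used rather than one.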

In 2012, Liao et al.[29] reported on targeted MPS for the detection of fetal trisomy 21 by allelic ratio analysis. During library preparation for MPS analysis, they included an enrichment step to enrich for 2906 SNP loci on chromosomes 7, 13, 18, and 21. They subsequently analyzed plasma DNA libraries with and without target enrichment by MPS and compared the data with the maternal and fetal SNPs known from SNP array analysis. They found that the number of mappable paired-end reads in nonenriched and enriched samples was comparable, but the mean sequencing depth of the enriched samples was 225-fold higher than that of the nonenriched samples (27× versus 0.12×). Unfortunately, using their analysis algorithm and the enrichment approach, they were only able to correctly identify trisomy 21 when the extra chromosome was paternal in origin. In maternally derived trisomy 21, it was less accurate.

They concluded that either adding more informative SNP loci or increasing the sequencing depth would solve this problem, the latter, however, not being very realistic as this would mean the number of reads would need to be increased to more than the number necessary in the whole genome approach.

In an attempt to further refine this method, Zimmermann et al.[30] described a modified version of this approach. They enriched for 11 000 SNP loci and used a method they called parental support for analyzing the MPS data. For the 145 samples that met the quality criteria, they obtained an average of 8.85 million mappable reads per sample, of which 6.47 million mapped to the targeted SNPs (~1.3 million per chromosome). The average depth of read was 344, and the median depth per SNP was 255. Using this method, they were able to correctly identify fetal disomy or trisomy for chromosomes 21, 18, and 13 and also correctly identified fetal 45,X and 47,XXY. An advantage of this method is that by comparing the relative amounts of alleles at a set of loci, problems with chromosome-to-chromosome amplification variation are eliminated. Reduced sensitivity and specificity rates for the detection of trisomy 13 and sex chromosomal abnormalities, as described in previous studies,[6, 15, 31] can thus be solved. The authors therefore state that the parental support method increases clinical coverage of viable chromosomal abnormalities by approximately twofold. In the future, these targeted approaches might also be used to target specific regions for the detection of submicroscopic imbalances (microdeletions/microduplications).

Although for targeted methods, such as DANSR, the number of reads obtainable using benchtop platforms will be sufficient, the possibilities for high throughput are still rather limited. This means that for medium to high throughput of samples, the capital costs of the machines, as well as the sample and library preparation costs, will probably be the same for the targeted and whole genome approach. Nevertheless, targeted sequencing can be a good alternative for the still rather expensive whole genome approach, as more samples can be sequenced simultaneously (theoretically greater than tenfold for the DANSR method). One should, however, realize that when using a targeted approach, only the region(s) of interest can be studied.

Methods that do not require MPS at all are expected to be cost efficient too. Two such approaches that have been described are digital PCR[32] and methylated DNA immunoprecipitation in combination with real-time quantitative PCR.[33] However, because these approaches have not been used and validated widely, their performance, high-throughput possibilities, and costs remain to be determined in more detail, and these techniques will therefore not be discussed here.

NIPT IN CLINICAL PRACTICE

On the basis of the evidence of the clinical performance of NIPT tests, as summarized in Table 1, NIPT is currently offered commercially as a service for so-called ‘high risk’ pregnant women in parts of the USA, Asia, the Middle East, and Europe. The laboratory test is mostly performed by companies in the USA (Verinata Health Inc., Sequenom Inc., Ariosa Diagnostics, Inc., Natera™) and China (BGI, Berry Genomics Co.), with the majority offering the test via healthcare providers and companies (e.g. LifeCodexx AG, GENNET) outside these countries. All companies offer MPS-based tests using Illumina equipment for the detection of fetal trisomy 13, 18, and/or 21 and/or sex chromosomal aberrations, some of them whole genome (Verinata Health, Sequenom, BGI, Berry Genomics) and others targeted-based (Ariosa Diagnostics, Natera), with prices ranging from ~$500 to $1700. Differences in prices are obvious, but at this stage, it is hard to draw any conclusions regarding the cost-effectiveness of the whole genome compared with the targeted approach. Ariosa Diagnostics, applying the targeted DANSR approach, charges $795 in the USA, whereas companies such as Verinata Health and Sequenom, both using whole genome approaches, charge $1200 and $1700 in the USA, respectively. This suggests that at this stage, higher costs are still associated with the whole genome approach. One should, however, realize that these are not cost prices but commercial prices. Furthermore, prices differ from country to country, and some prices include counseling, whereas others only include the test. It is likely that costs will fall in the near future, and only then can a valid comparison be made. Furthermore, even though not discussed here, one should bear in mind that patent issues may also influence prices.
Moreover, the recent acquisition of Verinata Health by Illumina might be a point of concern for NIPT providers, as this might affect pricing or availability of the Illumina machines for NIPT, but so far, this is only speculative. It is noteworthy that some insurance companies (in the USA) have already decided to cover (part of) the costs, resulting in differential pricing: Sequenom, for example, charges $1700 to uninsured women and $235 to insured women. Future analyses of clinical utility and economic costs, as already described by Chitty et al.[34] and Song et al.,[35] need to determine at what price NIPT would be cost efficient in the different healthcare systems throughout the world. It is expected that NIPT will be cost efficient in the long term because methods keep improving, costs of sequencing are dropping, and this test can improve health outcomes.

SUMMARY

At the moment, whole genome MPS approaches for the detection of fetal aneuploidies using ccffDNA in maternal plasma are more extensively validated than targeted approaches with regard to sensitivity and specificity. Cost-wise, whole genome approaches are, if used to study only one specific chromosome, more expensive per sample than targeted approaches, as fewer samples can be multiplexed. However, one should bear in mind that, with the exception of multiplexing, sample handling and library preparation are comparable for both approaches, limiting cost reduction in the case of the targeted approach. The ongoing and rapid reduction in the costs associated with sequencing and downstream data handling should be taken into account too. Moreover, as technological advances are being made rapidly, it is questionable whether, in the long term, MPS NIPT will only be applied for the detection of full-blown aneuploidies of specific chromosomes. Indeed, several groups have already reported the detection of submicroscopic chromosomal aberrations,[31, 36-38] and in 2010, the complete fetal genome was assembled from maternal plasma.[20] Recently, three studies followed up on these findings and performed whole genome, exome, and targeted MPS for the detection of paternally inherited or de novo fetal mutations using ccffDNA from maternal plasma.[39-41] Whatever approach is used in the future, its benefits and limitations will bring associated challenges for diagnostic services and healthcare providers.

WHAT'S ALREADY KNOWN ABOUT THIS TOPIC?

  • Since the discovery of cell-free fetal DNA in maternal plasma, large progress has been made in the development of noninvasive prenatal tests.
  • The first applications in noninvasive prenatal diagnosis were single polymerase chain reaction-based.
  • In 2008, a new era in the development of noninvasive aneuploidy testing was opened by the first successful application of massively parallel sequencing for this purpose.

WHAT DOES THIS STUDY ADD?

  • For fetal aneuploidy testing, whole genome massively parallel sequencing is still rather expensive and to reduce costs, targeted sequencing approaches are being developed.
  • This review highlights the benefits and limitations of both whole genome and targeted approaches for noninvasive prenatal testing for fetal aneuploidy detection, now and in the near future, when a shift in the most cost-effective approach is anticipated.
