Non‐invasive fetal genotyping for maternal alleles with droplet digital PCR: A comparative study of analytical approaches

To develop a flexible droplet digital PCR (ddPCR) workflow to perform non‐invasive prenatal diagnosis via relative mutation dosage (RMD) for maternal pathogenic variants with a range of inheritance patterns, and to compare the accuracy of multiple analytical approaches.


| INTRODUCTION
During pregnancy, cell free fetal DNA (cffDNA) is shed from the placenta and can be detected in maternal plasma alongside cell free DNA (cfDNA) from maternal tissues. 1 This has allowed safer access to fetal genetic material and the development of prenatal testing using only a maternal blood sample, including screening for common aneuploidies, 2 determination of fetal sex 3 and fetal RHD status, 4 and non-invasive prenatal diagnosis (NIPD) of single gene disorders. 5 Unlike detection of de novo or paternally inherited variants, which are not present in the maternal genome, determining the fetal inheritance of maternal variants is technically challenging, as the variant of interest is also present in the maternal cfDNA. This has complicated the development of NIPD for scenarios in which the mother is affected with a dominant condition or is a carrier for an X-linked or recessive condition. Detecting fetal inheritance of maternal alleles can be achieved using relative haplotype dosage analysis (RHDO), in which next generation sequencing (NGS) is used to determine the inheritance of parental haplotypes in the cfDNA. 6 However, the requirement of DNA from the biological father and a familial proband currently limits its availability.
Notably, RHDO is not available for sickle cell disease (SCD), despite this being a life-limiting autosomal recessive (AR) condition, with over 300,000 patients born globally every year 7 and one of the most frequent requests for prenatal diagnosis in the United Kingdom. Approximately 70% of SCD patients are homozygous for a single missense variant in the HBB gene (NM_000518.5:c.20A>T), referred to as the haemoglobin S (HbS) allele. 8 Each year, over 300 invasive prenatal tests are performed for SCD in the United Kingdom alone, and a sample from the biological father is not received in up to 40% of cases when requested; 9 this would prevent access to RHDO even if it were available.
Pregnant women who are heterozygous for rare disease variants are also not served by RHDO, as it is too expensive to permit validation for rare genetic conditions in a publicly funded healthcare system. Moreover, due to the requirement of a high number of informative single nucleotide polymorphisms (SNPs) to differentiate haplotypes at the gene locus, RHDO is currently not offered to consanguineous couples. 10 There is therefore an unmet clinical need for a cost-effective and flexible non-invasive method for detecting maternally inherited variants using only a maternal blood sample.
Patients welcome the prospect of such tests but stress that accuracy is of paramount importance for any new method to be an acceptable alternative to invasive testing. 11

Fetal inheritance of maternal alleles can also be detected using relative mutation dosage (RMD), which measures imbalances in the abundance of variant and normal sequences in the cfDNA. 12 Detecting changes in allelic dosage due to the fetal genotype is complicated by the low concentration of total cfDNA, which is present between 600 and 2000 genome equivalents per millilitre of plasma (GE/ml) in early pregnancy. 13,14 Of this total cfDNA, the fetal DNA may comprise less than 10%, and any allelic dosage change will be relative to this fetal fraction. For a cfDNA test to be a clinical alternative to invasive testing, it should be available at an equivalent time-point to chorionic villus sampling (CVS), which is 11-14 weeks gestation, when the fetal fraction and absolute concentration of cfDNA are both low. 14

Digital PCR (dPCR) has previously been applied to RMD approaches, including cohorts of haemophilia, 15,16 monogenic diabetes 17 and inherited deafness. 18 However, Barrett et al. 19 reported in a large cohort study for SCD that 7 out of 59 fetal genotype predictions using sequential probability ratio testing (SPRT) were incorrect, including one false positive and four false negatives.
Recently, Sawakwongpra et al. 20 reported a misclassification rate of roughly 20% (5/24) when SPRT was applied to non-invasive fetal genotyping of beta-thalassemia. The high rate of these incorrect predictions and uncertainty about their aetiology have so far hindered the clinical implementation of RMD in cfDNA testing. Many publications have applied SPRT to RMD, 12,15,16,19,[21][22][23] whilst z-score methods 24,25 and Bayesian approaches, 18,26 including Markov chain Monte-Carlo (MCMC) analysis, 17 have also been reported. However, cohort sizes are often limited, and no large-scale comparison of analytical approaches has been performed.
In this study, we developed a flexible droplet digital PCR (ddPCR) workflow to perform NIPD for maternal pathogenic variants with X-linked, autosomal dominant (AD) and autosomal recessive (AR) inheritance, including the common HBB c.20A>T variant, and a case from a consanguineous family. We then applied three analytical methods, SPRT, Bayesian and z-score analyses, and compared their accuracy in predicting fetal genotypes.

Samples were collected as part of the RAPID project (NIHR RP-PG-0707-10107; Research Ethics Committee reference: 14/SC/1020). 13 All samples were pseudo-anonymised and consent for research obtained. For this study, 127 samples were identified with the following criteria: samples from singleton pregnancies from women who were heterozygous for pathogenic variants in Mendelian disease genes and with a fetal genotype determined by invasive prenatal or postnatal testing. Three cases were subsequently excluded due to evidence of the haemoglobin C (HbC) allele, sample contamination and the presence of a twin pregnancy which was not noted on the referral. Thus 124 samples were subsequently analysed: 88 for SCD (HBB c.20A>T) and 36 for a range of variants in different disease genes (Figure 1).

| Sample collection
Maternal plasma samples were processed as previously described 13 and stored at −80°C. cfDNA was extracted on a QIAsymphony instrument using the QIAsymphony DSP Circulating Nucleic Acid Kit (Qiagen). Genomic DNA (gDNA) for maternal and paternal controls was extracted from stored blood pellets using a QuickGene-610L (Kurabo) kit. Further details are included in Supplementary Methods.

ddPCR was performed on a Bio-Rad QX200 system with an automated droplet generator (Bio-Rad). Each assay was optimised on gDNA from heterozygous parental controls using an annealing temperature gradient. Each cfDNA sample was then tested with the ddPCR assay for the relevant pathogenic variant, and results were analysed using QuantaSoft (v1.7.4). The fetal fraction was determined via ddPCR for the ZFY locus 29 or a paternally inherited SNP. 21,30 Identification of informative SNPs was performed using NGS and ddPCR and is described in more detail in the Supplementary Methods. Parental gDNA was tested alongside the cfDNA at equivalent concentrations to provide a comparison dataset of samples known to be truly heterozygous.
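The fetal fraction calculation can be illustrated with a minimal sketch. The formulas below are the standard dosage relationships and are an assumption here, since the exact calculation used in this study is described in the Supplementary Methods: a male fetus contributes one Y and one X copy per genome equivalent, and a paternally inherited SNP allele is present on one of the two fetal chromosomes. The function names are hypothetical.

```python
def fetal_fraction_from_zfy(zfy_copies, zfx_copies):
    """Estimate fetal fraction from ZFY and ZFX copy numbers (male fetus only).

    Assumption: with fetal fraction f, ZFY is proportional to f and
    ZFX is proportional to 2(1 - f) + f, so f = 2 * ZFY / (ZFX + ZFY).
    """
    return 2 * zfy_copies / (zfx_copies + zfy_copies)


def fetal_fraction_from_paternal_snp(paternal_allele_copies, total_copies):
    """Estimate fetal fraction from a paternal allele absent from the mother.

    Assumption: the fetus is heterozygous for the paternal allele, so that
    allele makes up f / 2 of all copies at the locus.
    """
    return 2 * paternal_allele_copies / total_copies


# Hypothetical copy numbers, for illustration only.
ff_male = fetal_fraction_from_zfy(zfy_copies=50, zfx_copies=950)
ff_snp = fetal_fraction_from_paternal_snp(paternal_allele_copies=30, total_copies=1000)
```

The copy numbers passed in would be the Poisson-corrected values reported by the ddPCR analysis software, not raw positive droplet counts.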

| Limit of detection experiment
A limit of detection experiment was designed to test the sensitivity of the HBB c.20A>T ddPCR assay. gDNA from patients confirmed by Sanger sequencing to be homozygous for either the HbS (HbSS) or HbA (HbAA) alleles was fragmented to an average size of 150 bp on a Covaris E220 Ultrasonicator, and the DNA fragment profile was assessed using an Agilent 2200 TapeStation. HbAA and HbSS gDNA was spiked into a HbAS gDNA sample at increments from 2% to 12%, at different total concentrations of DNA. These mixtures simulated the composition of cfDNA from a pregnant woman bearing affected and unaffected fetuses with varying fetal fractions and cfDNA concentrations. This experiment was performed only for the SCD assay due to sample availability.
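The expected variant fraction of each spike-in mixture follows from simple mixing arithmetic, sketched below. This assumes that the spike-in percentage is expressed as a proportion of total DNA molecules and that the HbAS background has an exact 50% variant fraction; the function name is hypothetical.

```python
def spiked_variant_fraction(spike_fraction, spike_genotype):
    """Expected HbS variant fraction after spiking homozygous gDNA into HbAS gDNA.

    spike_fraction: proportion of total DNA contributed by the spike-in (0 to 1).
    spike_genotype: "HbSS" (all variant) or "HbAA" (all reference).
    """
    spike_variant_fraction = {"HbSS": 1.0, "HbAA": 0.0}[spike_genotype]
    background_variant_fraction = 0.5  # heterozygous HbAS background
    return ((1 - spike_fraction) * background_variant_fraction
            + spike_fraction * spike_variant_fraction)


# A 4% spike shifts the variant fraction by 2 percentage points either way.
affected_like = spiked_variant_fraction(0.04, "HbSS")
unaffected_like = spiked_variant_fraction(0.04, "HbAA")
```

Under these assumptions, a spike-in of proportion x mimics a homozygous fetus at fetal fraction x, because both shift the variant fraction by x/2 from 50%.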

| Analysis
SPRT was performed as previously described for autosomal 31 and X-linked inheritance patterns, 15 using a likelihood ratio threshold of 8. 32 The Bayesian analysis was performed for AD variants as described by Caswell et al., 17 with additional models for X-linked and AR inheritance, and a threshold of 0.95 17 for fetal genotype classification. Finally, we performed a z-score analysis, which was modified from Chiu et al. 2 Heterozygous parental gDNA results were used as a control dataset, with known equal concentrations of variant and reference alleles. The z-score was then calculated as the number of standard deviations by which the cfDNA sample variant fraction differed from the mean of the heterozygous gDNA controls. Applying the same thresholds as Chiu et al., 2 z-scores greater than 3 or less than −3 were used to predict homozygous and hemizygous fetal genotypes, whilst samples with z-scores between −2 and 2 were predicted to have heterozygous fetal genotypes.
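The z-score and likelihood ratio classifications can be sketched as follows. This is a simplified illustration, not the implementation used in this study. The mapping of positive z-scores to variant over-representation, the "inconclusive" label for intermediate z-scores, and the binomial likelihood model (autosomal locus, heterozygous mother, no Poisson correction for droplet partitioning) are all assumptions, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def z_score(sample_vf, control_vfs):
    """Standard deviations between a cfDNA variant fraction and the control mean."""
    return (sample_vf - mean(control_vfs)) / stdev(control_vfs)


def classify_z(z, call_threshold=3.0, het_threshold=2.0):
    """Apply the thresholds from the text: beyond +/-3 and within +/-2."""
    if z > call_threshold:
        return "variant over-represented"
    if z < -call_threshold:
        return "reference over-represented"
    if -het_threshold < z < het_threshold:
        return "heterozygous"
    return "inconclusive"  # assumed label for 2 <= |z| <= 3


def sprt_classify(variant_count, total_count, fetal_fraction, threshold=8.0):
    """Simplified binomial likelihood ratio test for variant over-representation.

    Assumption: the expected variant fraction is 0.5 for a heterozygous fetus
    (q0) and 0.5 + f/2 for a fetus homozygous for the variant (q1).
    """
    q0 = 0.5
    q1 = 0.5 + fetal_fraction / 2
    log_lr = (variant_count * math.log(q1 / q0)
              + (total_count - variant_count) * math.log((1 - q1) / (1 - q0)))
    if log_lr >= math.log(threshold):
        return "variant over-represented"
    if log_lr <= -math.log(threshold):
        return "balanced"
    return "inconclusive"


# Hypothetical control variant fractions (%), for illustration only.
controls = [49.0, 50.0, 51.0]
call = classify_z(z_score(54.6, controls))
```

Testing for reference allele over-representation would repeat the likelihood ratio with q1 = 0.5 − f/2. A design point worth noting is that the z-score depends on the spread of the control dataset, whereas the likelihood ratio depends on the measured fetal fraction and the number of molecules counted.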

| Assay optimisation
The limit of detection study using sonicated gDNA showed that the SCD assay could distinguish 4% spike-ins of both HbSS and HbAA at DNA inputs ranging from 3000 to 12000 molecules (Supplementary Figure 1). We then looked at the variant fraction of the heterozygous gDNA controls with known equal concentrations of variant and reference alleles. These showed substantial variation (43.8%-55.4%), particularly when fewer than 2000 haploid genome equivalents (GE) were measured, due to the sampling error associated with low DNA inputs. The variation was such that when the SPRT and Bayesian analyses were applied to the control dataset, several replicates exceeded the classification thresholds (Supplementary Figure 2).
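The scale of this sampling error can be illustrated with a minimal sketch. It assumes a simple binomial model in which each counted molecule is independently variant or reference; this ignores Poisson partitioning and any assay-specific bias, so it is a lower bound on the real variation rather than a description of the observed data. The function names are hypothetical.

```python
import math


def variant_fraction_sd(total_molecules, true_fraction=0.5):
    """Binomial standard deviation of a measured variant fraction."""
    return math.sqrt(true_fraction * (1 - true_fraction) / total_molecules)


def expected_shift(fetal_fraction):
    """Shift from 50% expected for a homozygous fetus at an autosomal locus."""
    return fetal_fraction / 2


# Compare the expected shift at 4% fetal fraction with the sampling noise.
for molecules in (500, 2000, 8000):
    sd = variant_fraction_sd(molecules)
    ratio = expected_shift(0.04) / sd
    print(f"{molecules} molecules: SD = {sd:.4f}, shift/SD = {ratio:.2f}")
```

Under this model, the standard deviation at 2000 molecules is about 1.1 percentage points, against an expected shift of 2 percentage points at a 4% fetal fraction, which is consistent with low DNA inputs making heterozygous controls difficult to distinguish from true allelic imbalance.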
Based on the results of this gDNA testing and the limit of detection experiment for the SCD assay, we set additional quality filters across both cohorts: samples with fewer than 2000 GE at the variant site or with a fetal fraction below 4% were classified as inconclusive for all three analytical methods. The 4% fetal fraction threshold is consistent with that routinely used for non-invasive prenatal screening for aneuploidies 33,34 and in our laboratory for complex cfDNA assays for monogenic conditions. Because limited sample availability precluded a separate limit of detection experiment for each bespoke assay, these thresholds were applied across all the assays, not just the SCD assay. A total of 25 samples, 13 in the SCD cohort and 12 in the bespoke cohort, did not meet these quality criteria and were therefore classified as inconclusive across all three analytical methods.
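The quality filters amount to a simple gate applied before any of the three analyses, sketched below. The thresholds are those stated in the text; the function and parameter names are hypothetical.

```python
def passes_quality_filters(variant_site_ge, fetal_fraction,
                           min_ge=2000, min_fetal_fraction=0.04):
    """Return True if a sample is eligible for fetal genotype prediction.

    Samples with fewer than 2000 GE at the variant site, or with a fetal
    fraction below 4%, are reported as inconclusive by all three methods.
    """
    return variant_site_ge >= min_ge and fetal_fraction >= min_fetal_fraction


def report(variant_site_ge, fetal_fraction, prediction):
    """Override an analytical prediction when the sample fails the filters."""
    if not passes_quality_filters(variant_site_ge, fetal_fraction):
        return "inconclusive"
    return prediction
```

Applying the gate before classification means that a sample failing either criterion is inconclusive regardless of how extreme its variant fraction appears.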

| Sickle cell disease cohort
The HBB c.20A>T ddPCR assay was successfully optimised and additional testing assessed the impact of the common HbC allele (NM_000518.5:c.19G>A) on probe binding, which generated a distinct low fluorescence droplet cluster (Supplementary Figure 3).
When compared to the results of invasive testing, the SPRT, Bayesian and z-score analyses generated 97%, 98% and 99% correct fetal genotype predictions for reportable cases, with 22%, 25% and 23% inconclusive results, respectively (Table 1). These inconclusive rates include 13 cases, 15% of the SCD cohort, which did not meet the quality criteria due to low fetal fraction or low DNA input. There were two incorrect fetal genotype predictions generated by SPRT; a homozygous normal genotype (HbAA) predicted for a confirmed heterozygous carrier fetus (HbAS) and a heterozygous carrier genotype predicted for a homozygous affected fetus (HbSS). The Bayesian and z-score analyses generated only one of these incorrect fetal genotypes. This sample (cfDNA-122), collected at 12 + 5 weeks gestation, had a variant fraction of 50% and a fetal fraction of 6% and was incorrectly predicted to have a heterozygous fetus by all three analysis methods, whilst the CVS reported a fetus affected with SCD.

| Bespoke cohort
For the bespoke design cohort, all three analysis methods again performed similarly well, with 92% correct fetal genotype predictions for reportable cases for SPRT, 91% for the Bayesian analysis and 94% for the z-score analysis (Table 2). Among the correct predictions made by all three methods were those for the common CFTR c.1521_1523del variant and two common beta-thalassemia variants: HBB c.126_129del and HBB c.93-21G>A. Six samples were tested from couples in which both parents carried the same rare variant in a recessive gene, and correct fetal genotypes were predicted by all three methods for five of these (Table 3). The overall rates of inconclusive results, including those which did not meet the quality thresholds, were slightly higher for this cohort: 33% for SPRT, 39% for the Bayesian analysis and 50% for z-score analysis. Twelve cases, 33% of the total cohort, were classified as inconclusive across all three methods due to low fetal fraction or low DNA input. Of note, seven X-linked samples had fewer than 2000 molecules detected by the variant fraction assay (cfDNA-19, cfDNA-20, cfDNA-22, cfDNA-25, cfDNA-26, cfDNA-27 and cfDNA-30) and therefore did not pass the quality filter. These samples were extracted from small plasma volumes and tested early in the study, prior to the optimisation of the cfDNA extraction method (Supporting Information S1).

FIGURE 1 The testing workflow of the sample cohort. The numbers in each box indicate the number of samples tested. Three samples included for sickle cell disease testing were subsequently excluded due to the detection of contamination, the presence of the haemoglobin C allele and due to a twin pregnancy.
The SPRT and Bayesian analyses generated the same two incorrect fetal genotype predictions for two X-linked recessive variants in the ABCD1 and IDS genes: one false negative result (cfDNA-17) and one false positive result (cfDNA-29) (Table 4). The latter was also incorrectly classified by z-score analysis. This false positive result, called by all three analytical methods, was from a cfDNA sample taken at 13 + 6 weeks gestation from a woman who was heterozygous for the IDS c.182_189del variant, which causes mucopolysaccharidosis type 2. The ZFX/ZFY ddPCR assay measured a fetal fraction of 10.3%, and the variant fraction was 54.6%. The SPRT, Bayesian and z-score analyses all predicted that the fetus was hemizygous for the variant and was therefore affected. However, Sanger sequencing on CVS DNA at the time of sampling did not detect the pathogenic variant, and the fetus was born unaffected by mucopolysaccharidosis type 2.
Overall, across both the SCD and bespoke cohorts, SPRT, Bayesian and z-score analyses correctly classified 96%, 97% and 98% of reportable cases, with 25%, 29% and 30% inconclusive results, respectively (Table 5). Again, these inconclusive results include the 25 cases, 20% of the total cohort, which did not meet the quality thresholds. SPRT generated the highest number of incorrect results, with four in total across both cohorts, while z-score generated only two. A full detailed results table can be found in Supplementary Table 2.

| X-chromosome inactivation
We hypothesised that X-chromosome inactivation (XCI) could have

All three analytical methods had high rates of inconclusive results, the lowest being SPRT at 25% and the highest being z-score analysis at 30%. Twenty-five samples in total, 20% of the cases, did not meet the quality criteria due to low DNA input or low fetal fraction and these were classified as inconclusive across all three analytical methods. By removing these cases and looking only at those which passed the quality criteria, the inconclusive rates are reduced to 6%, 11% and 13% for SPRT, Bayesian and z-score analyses, respectively. Since this study used archived samples, we were unable to request repeat samples at later gestations for those cases with inconclusive ddPCR results due to low fetal fraction or low numbers of molecules. Although taking a second sample later in gestation does not always resolve issues of low fetal fraction, it is expected that in a clinical scenario with access to repeat sampling in cases which fail the quality criteria, all three analysis methods may have a lower inconclusive rate. Of note, some clinical NIPD services currently offered in the NHS reported high inconclusive rates during validation, which were subsequently resolved following repeat testing with later samples. 10,36

The SPRT was developed by Wald 37 and has been applied to dosage-based cfDNA testing of sequence variants with ddPCR 12,15,16,[19][20][21] and NGS, 22,23,38 as well as for chromosomal aneuploidy. 31 The SPRT has also been successfully applied in RHDO, where its usage is justified by the sequential nature of SNPs along the locus of interest. 6 However, as ddPCR data for a single locus is not acquired sequentially, the application of SPRT in ddPCR has been criticised 39 and incorrect fetal genotype predictions have been reported.
12,19,20,23,38 In this study, applying the SPRT to ddPCR results from heterozygous parental gDNA still generated fetal genotype

TABLE 4 Incorrect results. Columns: sample number, variant, fetal fraction (%), variant fraction (%), molecules.

Figure 2), indicating that assay validation was not a contributing factor to the incorrect results. However, cfDNA has many unique features, including a distinct fragment size profile, jaggedness, non-random end-motifs and topology, 41 which may impact ddPCR assays and cannot be replicated using gDNA as a surrogate reference material.
Unfortunately, we did not have access to blood samples taken prior to pregnancy for the women in our cohort, which would have been the most appropriate control material. Applying a cfDNA reference dataset instead of gDNA controls may have prevented the incorrect results with z-score analysis, which suggests an avenue for future development. In an ideal scenario, each assay would have been optimised on maternal cfDNA from a pre-pregnancy control plasma sample to assess for any allelic imbalances. A further limitation is that we used archived samples, some of which had low fetal fraction, and therefore we could not access repeat samples later in pregnancy when fetal fraction may be higher.
Recent reports have applied NGS to RMD, with a focus on SCD, with modifications to allow for single-molecule counting. 22,23,42 These approaches allow sequence-level inspection of the cfDNA and could theoretically be applied to the detection of any pathogenic variant. However, ddPCR is a cheaper and more flexible technique, which allows the rapid development of assays for both rare and common variants, giving it a significant advantage over NGS techniques for clinical application.

| CONCLUSION
In summary, we report a large cohort of cfDNA analysis for maternally inherited variants, including the common HBB c.20A>T variant. In our cohort of 124 cfDNA samples, z-score analysis was the most accurate method, generating 98% correct fetal genotype predictions out of the total number of predictions made, but with a high inconclusive rate of 31%. However, based on the evidence from previous NIPD tests, which have been translated into clinical service, this inconclusive rate would be expected to reduce with repeat sampling. For the two incorrect fetal genotype predictions made by the z-score analysis, the cause remains unknown. It may be that with further refinements to the analysis method it will become possible to reduce this rate of incorrect results. High degrees of accuracy are required prior to clinical implementation of cfDNA tests, and clear information on test performance must be given to patients and physicians.