A rapid, multiplex digital PCR assay to detect gene variants and fusions in non‐small cell lung cancer

Digital PCR (dPCR) is emerging as an ideal platform for the detection and tracking of genomic variants in cancer due to its high sensitivity and simple workflow. The growing number of clinically actionable cancer biomarkers creates a need for fast, accessible methods that allow for dense information content and high accuracy. Here, we describe a proof‐of‐concept amplitude modulation‐based multiplex dPCR assay capable of detecting 12 single‐nucleotide and insertion/deletion (indel) variants in EGFR, KRAS, BRAF, and ERBB2, 14 gene fusions in ALK, RET, ROS1, and NTRK1, and MET exon 14 skipping present in non‐small cell lung cancer (NSCLC). We also demonstrate the use of multi‐spectral target‐signal encoding to improve the specificity of variant detection by reducing background noise by up to an order of magnitude. The assay reported an overall 100% positive percent agreement (PPA) and 98.5% negative percent agreement (NPA) compared with a sequencing‐based assay in a cohort of 62 human formalin‐fixed paraffin‐embedded (FFPE) samples. In addition, the dPCR assay rescued actionable information in 10 samples that failed to sequence, highlighting the utility of a multiplexed dPCR assay as a potential reflex solution for challenging NSCLC samples.


Abbreviations: ARMS, allele-refractory mutation system; dPCR, digital polymerase chain reaction; FFPE, formalin-fixed paraffin-embedded; HDPCR™, high definition polymerase chain reaction; Indel, insertion/deletion; IVT, in vitro transcription; LNA, locked nucleic acid; NAT, normal adjacent tissue; NGS, next-generation sequencing; NPA, negative percent agreement; NSCLC, non-small cell lung cancer; PPA, positive percent agreement; QNS, quantity not sufficient; RPKM, reads per kilobase of transcript per million reads mapped; SNV, single-nucleotide variant; VAF, variant allele frequency.

Introduction
Lung cancer is the leading cause of cancer death in the United States, with a projected 350 deaths per day in 2022 [1]. Fortunately, there are a growing number of advancements in screening and treatment response monitoring, as well as targeted therapies and immunotherapies, that have improved clinical management for patients with advanced non-small cell lung cancer (NSCLC) [1]. For example, there are now over a dozen different precision medicines targeting driver genes and network pathways [2]. Despite these improvements in treatment options for NSCLC patients, there remain significant challenges with current molecular test options that critically limit treating patients with the right drugs. Constraints including test accessibility, sample availability, and the lack of consistent payor reimbursement for diagnostic tests have prevented widespread utilization of precision medicines [3]. Challenges such as insufficient or poor-quality samples and slow turnaround time [4] have further hindered broad adoption. For example, in a 2022 multisource database investigation, nearly 50% of patients were unable to benefit from precision medicines due to factors linked with obtaining biomarker results; 18% received inaccurate results due to test limitations or errors; and 4% started on a less precise treatment due to prolonged test turnaround time [5]. Therefore, there is an outstanding need for rapid, comprehensive, reliable, and low-cost methods that can identify patients as eligible for precision treatment and clinical trials.
Massively parallel, or next-generation, sequencing (NGS) is the leading approach to profile both primary tumor samples and peripheral cell-free nucleic acids for clinically actionable biomarkers. A major advantage of this method is that sequence information of entire genes and regions of the genome is generated, which enables comprehensive detection of the variants present. However, there are also key challenges with sequencing-based approaches, including test failures due to insufficient specimen volume, nucleic acid isolation yields, failed library preparation [6], complex and time-consuming laboratory workflows and bioinformatics analysis [7], and high instrumentation and reagent cost [8,9]. These factors have limited both the successful processing of clinical samples and the types of institutions performing these assays. dPCR is an emerging alternative to NGS for cancer biomarker testing due to its simple workflow, low sample input requirements, high sensitivity, fast turnaround time, and low cost [10-12]. However, the clinical utility of conventional dPCR remains limited because its inherently low multiplexing capacity prevents all actionable biomarkers from being assessed in a single assay with a limited amount of sample. To overcome this, several methods have been proposed to increase digital PCR information content through amplification curve analysis [12], melt curve analysis [12,13], and amplitude modulation [14,15]. However, none of these methods has yet been developed into a comprehensive assay that generates a complete set of actionable information, because of complexities in workflows.
Here, we describe a proof-of-concept TaqMan®-based amplitude modulation dPCR panel [HDPCR, see Ref. 15] for multiplexed detection of relevant variants seen in NSCLC, including 12 single-nucleotide or insertion/deletion DNA variants, 14 RNA fusion variants, and MET exon 14 skipping (Table S1). All DNA variants and RNA fusion variants detected by this panel were selected based on NCCN guideline recommendations and the association of targeted therapies for advanced or metastatic NSCLC [16]. The predicted prevalence in Table S1 is based on the frequency of unique Sample Identifiers for each variant in the Catalog Of Somatic Mutations in Cancer (COSMIC) database [17]. This study aimed to present a proof-of-concept assay that targets relevant loci, not a finalized assay with broad inclusivity. The amplitude modulation scheme relies on standard, low-cost TaqMan probe hydrolysis that is concentration limited to deterministically program unique fluorescent signatures for each analyte. Given that modern PCR instruments incorporate photodetectors with a wide dynamic range, multiple targets, each with a corresponding unique fluorescent intensity, can be multiplexed within one channel. The panel also leverages multi-spectral signal encoding for some analytes to create a form of error detection code [18] that improves the specificity of analyte detection beyond standard TaqMan dPCR by lowering the effective background noise. Together, the dPCR panel enables a 3-h turnaround time of results from isolated nucleic acids to a complete variant analysis.

Human biological samples
De-identified, remnant, formalin-fixed paraffin-embedded (FFPE)-isolated, human biological genomic DNA from 25 NSCLC patients was provided by Dartmouth Hitchcock Medical Center (Lebanon, NH, USA). The use of pre-existing archived and de-identified samples is considered Non-human Subjects Research at Dartmouth. A second set of de-identified, remnant FFPE samples > 15 years old was sourced from Discovery Life Sciences (includes samples through acquisitions of Conversant Biologics and East West Biopharma, Huntsville, AL, USA), and the ethical, moral, and legal aspects of the studies were evaluated and approved by either the Western Institutional Review Board or a local institutional review board (Table 1, samples collected August 1994-September 2006). A third set of de-identified, remnant FFPE samples < 3 years old was sourced from Cureline (Brisbane, CA, USA), and the ethical, moral, and legal aspects of the studies were evaluated and approved by the Institutional Review Board of University of Hong Kong/Hospital Authority Hong Kong West Cluster (Table 1, samples collected February 2020-June 2020). All study methodologies conformed to the standards set by the Declaration of Helsinki, and all samples enrolled in this study were collected with the understanding and written consent of each subject and had no pathological selection criteria. The institution names at which the Discovery and Cureline samples were collected were confidential. Formalin-fixed paraffin-embedded samples were split into these three groups based on 'time in block' age (Table 1). Discovery Life Sciences and Dartmouth Hitchcock Medical Center isolated the nucleic acids (DNA and/or RNA) using validated in-house methods and performed initial quality control (QC) (quantification, sizing, and RNA quality assessment). The QC data, patient demographics, and clinical metadata for all samples are provided in Table S2. Normal adjacent tissue (NAT) FFPE curls (Discovery) were combined in sets of three curls per tube and extracted with the AllPrep® DNA/RNA FFPE Extraction Kit (PN 80234; Qiagen, Germantown, MD, USA). Isolated nucleic acids were quantified by Qubit4™ (Qubit dsDNA HS kit; Thermo Fisher Scientific, Waltham, MA, USA).

Synthetic RNA via in vitro transcription
The MEGAscript™ T7 Transcription Kit (PN AM1330; Life Technologies, Carlsbad, CA, USA) was used according to the manufacturer's protocol. First, custom DNA gBlocks with T7 promoter sequences (Integrated DNA Technologies, Inc., Coralville, IA, USA) were created for each fusion variant (Table S3). The transcription reaction was set up with the following volumes using the MEGAscript T7

Amplitude modulation dPCR assay construction
The primer-probe systems adopted one of three configurations: an allele-refractory mutation system (ARMS) with or without blocking oligonucleotides, a variant-sensitive probe, or an exon-specific design to identify exon-exon RNA fusion junctions (Fig. 1).
To begin, we synthesized and screened multiple primer-probe systems in singleplex using synthetic templates designed to represent a variant of interest.
For the DNA-specific ARMS and variant-sensitive probe systems [19], the strandedness of the system (targeting Watson or Crick), the thermodynamics of the penultimate base pair mismatch, and the orientation with respect to nearby variant sites were considered during the design phase. Once systems were identified that worked well in singleplex and in pairwise duplex (Table S4), the same principles of amplitude modulation in dPCR that have previously been demonstrated on qPCR [15] were applied. This approach allows multiple targets to be detected in the same color channel by tuning the reaction chemistry and probe concentrations, then applying Poisson statistics to interpret the observed dPCR data. Primer and probe concentrations were empirically optimized under multiple different concentrations and thermal cycling conditions to achieve terminal fluorescent amplitude values that allowed for fluorescent intensity separation of all variants (Table S5). Due to the close genomic coordinate proximity of some of the DNA variants, the DNA targets were split into two separate wells to minimize cross-target amplification. For the RNA-specific fusion targets, a separate reaction included a reverse-transcription PCR step to generate cDNA. We also sought to incorporate knowledge of the prevalence and co-occurrence of certain biomarkers into the assay design. For example, to reduce the risk of calling errors that may be elevated in co-positive samples (e.g., EGFR L858R and KRAS G12C), prevalent variants were encoded in different color channels. Complete sets of multiplex primer-probe systems were prioritized based on four criteria: responsiveness to amplitude modulation, reaction efficiency (e.g., minimal dPCR 'rain'), cross-reactivity due to proximity of targets, and specificity to discriminate between the variant and the wild-type sequences. The issue of 'rain' refers to partitions with fluorescence amplitude that falls between the expected positive partition amplitude and the negative partition amplitude. For amplitude modulation PCR, the 'rain' creates an additional issue where partitions belonging to a higher amplitude level (e.g., level 2 or 2i) are misclassified as a lower level (level 1 or 1i), thereby creating false positives in the lower-level windows and false negatives in the higher-level window. For some targets, locked nucleic acid (LNA) probe-based detection schemes [20,21] had less interaction with wild-type DNA and produced less 'rain'. Other primer and probe systems that had noticeably higher reaction efficiency (e.g., minimal 'rain') were assigned to higher intensity levels.
The nature of dPCR reduces the impact of nonspecific amplification events, as false-positive signals are contained to a few partitions. However, nonspecific amplification can still result in appreciable noise levels in the absence of target (Fig. 2A,B). This led us to implement a multi-spectral encoding strategy for some targets to further improve performance (Fig. 1A,B). Multi-spectral encoding relies on including two probes to the same target, each with a different fluorescent signature. A true positive partition therefore requires two independent probe hydrolysis events, which separates the signal from the noise created by nonspecific single-probe hydrolysis. For example, the EGFR T790M system generated positive counts in the presence of wild-type genomic DNA (Fig. 2A,B) and a similar number of counts in the presence of low copy number EGFR T790M variant (Fig. 2D,E). However, when EGFR T790M is encoded in channel 5 as well as channel 1, the T790M-positive counts are easily distinguished from the noise (Fig. 2C,F). In another example, the channel 1 probe for KRAS G12C performed better than the channel 3 probe with extracted genomic DNA, and by combining the two, a more distinct population of positive partitions is generated (Fig. 3). The layout of the first multiplex DNA assay (Fig. 4), second DNA assay (Fig. S1), and RNA assay (Fig. S1) illustrates how amplitude modulation and multi-spectral encoding work together to resolve multiple variants in one well. Figure S2 contains an illustrative example of a DNA-negative control. Refer to Table S5 for the primer-probe formulation to achieve this assay layout.
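The coincidence logic behind multi-spectral encoding can be illustrated with a minimal sketch. It assumes simple per-channel amplitude thresholds and toy partition data; the function name `count_positive`, the thresholds, and the amplitudes are all hypothetical and are not taken from the assay software.

```python
import numpy as np

def count_positive(ch_a, ch_b, thr_a, thr_b):
    """Return (single-channel count, dual-channel count) of positive partitions.

    A partition is single-channel positive if channel A exceeds its threshold;
    it is dual-channel positive only if BOTH channels exceed their thresholds.
    """
    pos_a = np.asarray(ch_a) > thr_a
    pos_b = np.asarray(ch_b) > thr_b
    return int(pos_a.sum()), int((pos_a & pos_b).sum())

# Toy amplitudes for 8 partitions (hypothetical values, arbitrary units).
ch1 = np.array([100, 900, 120, 950, 880, 90, 910, 110])
ch5 = np.array([80, 70, 60, 800, 820, 75, 65, 90])

single, dual = count_positive(ch1, ch5, thr_a=500, thr_b=500)
print(single, dual)  # 4 partitions pass channel 1 alone; 2 pass both channels
```

In this toy case, two of the four channel 1 positives lack a matching channel 5 signal and are rejected, which is the sense in which requiring two hydrolysis events lowers the effective background.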

Amplitude modulation digital PCR reaction setup and cycling (DNA)
DNA PCR reactions were set up using the following volumes: 2.4 μL 5× dPCR QuantStudio™ Absolute Q™ Master Mix (PN A52490; Thermo Fisher Scientific), 2.9 μL oligonucleotide primer-probe mix (Table S5), and 6.7 μL of isolated genomic DNA. Contrived samples and natural specimen FFPE were tested at 4.18 ng·μL⁻¹. Each dPCR reaction mix was vortexed three times for 5-s pulses, spun down in a microfuge, and 9 μL of the dPCR reaction mix was added to each well of a QuantStudio Absolute Q MAP16 Plate Kit (PN A52865; Thermo Fisher Scientific). Next, 12 μL of QuantStudio Absolute Q Isolation Buffer (PN A52730; Thermo Fisher Scientific) was added to each well on top of each reaction mix. The final quantity of genomic DNA that makes it into the system, as part of the 9 μL input, is 21 ng for contrived and FFPE samples. The wells were sealed with QuantStudio Absolute Q strip caps (PN 332101; Thermo Fisher Scientific). All testing was conducted on one of two QuantStudio Absolute Q Digital PCR Systems (Thermo Fisher Scientific). Thermal cycling was performed as follows: (a) preheating at 96 °C for 10 min and (b) 35 cycles consisting of denaturing (96 °C, 15 s), followed by annealing/extension (58 °C for 30 s). Terminal fluorescence intensity data were collected in all four available color channels. Along with the reaction mixes, every plate included a positive control (gBlocks of synthetic targets in each color channel) and a negative control (consisting of only human genomic DNA background). Positive control primers for EGFR Exon 2 (DNA) were included in each well. Primer and probe sequences are described in Table S4 (Integrated DNA Technologies, Inc. and Thermo Fisher Scientific).
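The 21 ng figure follows from the stated volumes and concentration, as the short check below shows. The function name `loaded_dna_mass_ng` is hypothetical; the numeric inputs are the ones given in the text.

```python
def loaded_dna_mass_ng(conc_ng_per_ul, sample_ul, other_ul, loaded_ul):
    """Mass of DNA (ng) in the portion of the reaction mix loaded into a well."""
    total_ul = sample_ul + sum(other_ul)      # full reaction mix volume
    total_ng = conc_ng_per_ul * sample_ul     # DNA mass in the full mix
    return total_ng * loaded_ul / total_ul    # fraction actually loaded

# 6.7 uL DNA at 4.18 ng/uL, plus 2.4 uL master mix and 2.9 uL primer-probe mix;
# 9 uL of the resulting 12 uL mix is loaded per well.
mass = loaded_dna_mass_ng(4.18, 6.7, [2.4, 2.9], 9.0)
print(round(mass, 1))  # about 21 ng
```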

Amplitude modulation digital PCR reaction setup and cycling (RNA)
RNA dPCR reactions were set up using the following volumes: 2.4 μL 5× dPCR QuantStudio Absolute Q Master Mix (PN A52490; Thermo Fisher Scientific), 2.4 μL 5× primer-probe mix (Table S5), 0.   consisting of denaturing (95 °C for 10 s), followed by annealing/extension (58 °C for 1 min). Terminal fluorescence intensity data were collected for all four available color channels. Along with the reaction mixes, every plate included a positive control (gBlocks of synthetic targets in each color channel) and a negative control (consisting of only isolated FFPE total RNA background). Positive control primers for ACTB (RNA) were included in each well. Primer and probe sequences are described in Table S4 (Integrated DNA Technologies, Inc. and Thermo Fisher Scientific).

Contrived DNA and RNA sample assembly
Contrived FFPE samples were created by combining synthetic DNA gBlocks (average size = 400 nt, containing either reference sequence or variant of interest, from Integrated DNA Technologies, Inc.) with 21 ng of extracted healthy (negative) human FFPE DNA at six different variant fractions ranging from 60 to 2300 copies (1-40% variant allele frequency (VAF)). The contrived FFPE RNA samples were created by combining the fusion IVT RNAs with the negative extracted FFPE RNA at a range of copy numbers (5000, 7500, 10 000, and 11 250), while the negative extracted FFPE RNA remained constant at 5000 copies (Table S6).

Variant calling from amplitude-modulated digital PCR data
Fig. 4 (caption fragment): ... S1). hgDNA refers to in-well positive control amplicon for EGFR Exon 2. (B) Superimposed fluorescence scatterplots for synthetic targets profiled individually at 5000 copies for EGFR E746_A750del (COSM6223), EGFR Exon 20 H773dup (COSM12377), EGFR L858R (COSM6224), EGFR T790M (COSM6240), EGFR G719S (COSM6252), and BRAF V600E (COSM476). A similar spectral layout for well #2 is shown in Fig. S1. Negative controls are shown in Fig. S2.

Once the oligo sequences and concentrations were set for each assay, a run was conducted with each genomic target present in singleplex in two replicate wells. For each target, positive partitions were identified using an amplitude cutoff, which was established by testing each target in the assay individually, and the mean and covariance of positive partition amplitudes were calculated across all four channels. The mean and covariance of partition amplitudes for all possible target combinations were predicted by assuming amplitudes would add linearly. This set of analyses generated 'expected' target amplitudes, which were used to classify partitions across all other experiments. These singleplex runs were also used to characterize the crosstalk levels of each dPCR instrument, and this crosstalk was subtracted out in all multiplex runs.
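A minimal sketch of the linear-additivity prediction is given below. It assumes amplitudes are expressed relative to the negative baseline and that the contributions of co-occurring targets are independent, so that covariances add as well as means; the second assumption is ours and is not stated in the text. The function name `predict_combinations`, the target names, and all numbers are hypothetical.

```python
from itertools import combinations
import numpy as np

def predict_combinations(singles, max_size=2):
    """Predict (mean, covariance) for every combination of up to max_size targets.

    singles maps a target name to (mean vector, covariance matrix) measured in
    singleplex, with amplitudes expressed relative to the negative baseline.
    Means add linearly; covariances add under the independence assumption.
    """
    predicted = {}
    names = sorted(singles)
    for k in range(1, max_size + 1):
        for combo in combinations(names, k):
            mean = sum(singles[n][0] for n in combo)
            cov = sum(singles[n][1] for n in combo)
            predicted[combo] = (mean, cov)
    return predicted

# Two hypothetical targets measured across four color channels.
singles = {
    "A": (np.array([1000.0, 0.0, 0.0, 0.0]), np.eye(4) * 100.0),
    "B": (np.array([2000.0, 0.0, 0.0, 500.0]), np.eye(4) * 400.0),
}
pred = predict_combinations(singles)
print(pred[("A", "B")][0])  # predicted mean amplitude for a partition holding A and B
```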
Each sample plate run contained at least one negative control well, which only had the internal EGFR Exon 2 control target present, and at least one positive control well, which had multiple synthetic targets present that would generate signal in each channel. These controls were used to perform three plate-wide corrections. First, the negative control well was used to determine the mean amplitude of partitions positive for the internal control; if this was different from the expected location, then the expectation for that target was scaled for the rest of the plate. Similarly, the positive control well was used to determine the mean amplitude of partitions that were positive in each individual channel. If a given channel differed from its expected level, the ratio between observed and expected mean was used to scale the expected amplitude for all targets in that channel. Finally, the negative control well was re-analyzed to determine how many partitions were positive for targets other than the internal control target. These levels were used to determine an expected level of spurious amplification, which occurs in the absence of target material. This set of corrections was performed on a plate-by-plate basis to correct for any differences from run to run.
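The per-channel scaling step can be sketched as follows, assuming a single multiplicative factor per channel derived from the positive control. The function name `scale_expected`, the target labels, and the amplitudes are hypothetical.

```python
def scale_expected(expected, observed_ctrl_mean, expected_ctrl_mean):
    """Rescale expected target amplitudes in one channel by observed/expected."""
    ratio = observed_ctrl_mean / expected_ctrl_mean
    return {target: amp * ratio for target, amp in expected.items()}

# Hypothetical channel: the control reads 1800 where 2000 was expected,
# so every expected amplitude in that channel is scaled by 0.9.
expected_ch1 = {"level_1": 1000.0, "level_2": 2000.0}
scaled = scale_expected(expected_ch1, observed_ctrl_mean=1800.0,
                        expected_ctrl_mean=2000.0)
print(scaled)
```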
After these plate-wide corrections, noncontrol wells were analyzed to determine target counts. Partition classification was performed using the Mahalanobis distance metric: for a partition with the 4-dimensional amplitude vector $\vec{x}$, its Mahalanobis distance to a target with expected mean amplitude $\vec{u}$ and covariance $S$ is $D = \sqrt{(\vec{x} - \vec{u})^{T} S^{-1} (\vec{x} - \vec{u})}$. This is effectively the same as classic Euclidean distance but scaled by the covariance of the expected target amplitude; this corrects for the fact that some targets generate point clouds with inherently wider spread than others. Each partition is assigned to the target or target combination to which it has the lowest Mahalanobis distance. Analyzing all partitions in this manner results in a count of positive partitions for each target, which is converted into a target concentration using Poisson statistics, where $\mathrm{copy}_{\mathrm{Target}} = -\ln\left(1 - \frac{\mathrm{counts}_{\mathrm{Target}}}{\mathrm{counts}_{\mathrm{Valid\ Partitions}}}\right)$. The expected level of spurious amplification was then subtracted to yield a final concentration for each target.
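The classification and quantification steps can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the cluster names, means, covariances, and partition amplitudes are hypothetical, and the Poisson function returns the mean copies per partition exactly as written in the formula above.

```python
import math
import numpy as np

def mahalanobis(x, mean, cov):
    """Mahalanobis distance from amplitude vector x to a cluster (mean, cov)."""
    d = np.asarray(x, dtype=float) - mean
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

def classify(partitions, clusters):
    """Assign each partition to the cluster with the smallest distance."""
    return [
        min(clusters, key=lambda name: mahalanobis(x, *clusters[name]))
        for x in partitions
    ]

def poisson_copies_per_partition(n_positive, n_valid):
    """Poisson-corrected mean copies per partition: -ln(1 - positive/valid)."""
    return -math.log(1.0 - n_positive / n_valid)

# Hypothetical expected clusters in four channels: negative and one target.
clusters = {
    "negative": (np.zeros(4), np.eye(4) * 100.0**2),
    "target": (np.array([1000.0, 0.0, 0.0, 0.0]), np.eye(4) * 150.0**2),
}
partitions = [[10, 5, 0, 0], [990, 10, 0, 0], [20, 0, 5, 0], [1010, 0, 0, 0]]

labels = classify(partitions, clusters)
lam = poisson_copies_per_partition(labels.count("target"), len(labels))
print(labels, round(lam, 3))
```

Because each distance is scaled by the cluster's own covariance, a partition lying between two clusters is assigned to the wider one more readily than plain Euclidean distance would allow.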
For the contrived and human biological sample experiments, DNA samples with EGFR Exon 2 copy numbers below 1000 copies per reaction were empirically determined to be quantity not sufficient (QNS) and excluded from the performance calculations. Similarly, RNA samples with ACTB copy numbers below 1000 copies per reaction were determined to be QNS. To quantitatively determine which samples exhibited abnormal results, wells were labeled invalid and excluded from the analysis if they had a coefficient of variation in the reference channel across all partitions of > 15% (Fig. S3). Additionally, if a well had > 100 partitions with signals < 6000 relative fluorescent units in the reference channel, it was determined to be invalid and excluded from the analysis. These exclusions led to an observed per-well failure rate of ~4.95% (33/666 total reactions) on both instruments. One of the main failure modes was images with dark patches in the QC array (Fig. S3), which could be due to optical or flow issues in the instrument. The performance of the chemistry and algorithm was determined on the contrived DNA samples down to 1% VAF and the contrived RNA samples down to 5000 total fusion copies (Table 2 and Table S6). Receiver operating characteristic (ROC) analysis was performed on the complete DNA and RNA contrived data sets to identify the optimal threshold for each target to separate positive and negative contrived samples. The ROC analysis used the ratio of the target to the in-well positive control (EGFR Exon 2 and ACTB for the DNA and RNA assays, respectively) as the predictor. The calculations were performed using the R software package pROC [22,23]. These optimized thresholds were used to calculate the performance of the clinical sample data sets (Table S7). Note that the dPCR assay measures the number of variant nucleic acid molecules at each target locus, while no wild-type molecules are measured on a per-locus basis. Instead, VAF estimates are generated by dividing the total variant molecule copy number by the EGFR Exon 2 internal control copy number. This may lead to discrepancies with sequencing due to nonuniform representation and/or DNA amplification between EGFR Exon 2 and the other regions assessed in the panel.
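The well-validity rules and the reference-normalized VAF estimate described above can be expressed as a short sketch. The thresholds are those stated in the text; the function names, the use of the population standard deviation for the coefficient of variation, and the example amplitudes are assumptions.

```python
import numpy as np

def well_is_valid(ref_amplitudes, cv_max=0.15, low_rfu=6000.0, max_low=100):
    """Apply the two reference-channel exclusion rules to one well."""
    ref = np.asarray(ref_amplitudes, dtype=float)
    cv = ref.std() / ref.mean()                 # coefficient of variation
    n_low = int((ref < low_rfu).sum())          # partitions below 6000 RFU
    return bool(cv <= cv_max and n_low <= max_low)

def vaf_estimate(variant_copies, reference_copies, min_reference=1000):
    """Variant copies divided by internal-control copies; None if QNS."""
    if reference_copies < min_reference:
        return None                             # quantity not sufficient
    return variant_copies / reference_copies

# Hypothetical wells: uniform reference signal vs. a well with a dim patch.
good_well = np.full(1000, 8000.0)
dim_well = np.concatenate([np.full(200, 5000.0), np.full(800, 8000.0)])

print(well_is_valid(good_well), well_is_valid(dim_well))
print(vaf_estimate(60, 6000), vaf_estimate(10, 500))
```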

Parallel comparator testing
DNA and RNA isolated from the Discovery and Cureline FFPE clinical samples were parallel processed through Discovery Life Sciences' QiaSeq MultiModal panel (64 DNA genes and 6 primary genes for RNA fusions, recommended input mass of 200 ng DNA and 200 ng RNA with at least DV20%). Data were processed through Qiagen's CLC Workbench bioinformatics workflow to generate variant call files and reports. DNA isolated from the Dartmouth Hitchcock samples was processed using Ion AmpliSeq™ Cancer Hotspot Panel v2 and TruSight Tumor 170. Data processing was performed using the Torrent Suite and the TruSight Tumor 170 v1.0 Local App, respectively. Sequencing summary statistics are provided in Table S8. Sequencing VAF estimates take into account the number of reads supporting the wild-type sequence at the position of each variant. Samples were considered indeterminate and excluded from the clinical concordance analysis if the sample did not generate at least 20 reads for a particular target region.
To improve comparator confidence in 31 RNA samples with low read counts for ALK and/or RET transcripts (< 100 RPKM), we ran an additional, commercially available fusion comparator based on droplet digital PCR (ddPCR). RNA from the FFPE clinical samples was processed through Bio-Rad mRNA ddPCR fusion assays for RET (ID dHsaEXD81378442; Bio-Rad, Hercules, CA, USA), ROS1 (ID dHsaEXD73338942; Bio-Rad), and ALK (ID dHsaEXD86850342; Bio-Rad). First, the clinical FFPE RNA samples were converted to cDNA using the Bio-Rad cDNA Synthesis Kit according to the manufacturer's instructions (PN 1725037). cDNA was then quantified by Qubit4 (Thermo Fisher Scientific). Three ddPCR reactions were set up for each of the mRNA fusion assays with the following volumes: 10 μL 2× ddPCR Supermix for Probes (no dUTP) (PN 186-3023; Bio-Rad), 1 μL 20× mRNA Fusions Assay, 1 μL 20× GUSB Reference Assay (ID dHsaCPE5050189; Bio-Rad), 6 μL cDNA, and 4 μL nuclease-free water (PN 10977015; Invitrogen, Waltham, MA, USA). Each ddPCR reaction mix was added to a 96-well PCR plate (PN 12001925; Bio-Rad), sealed with a PX1 PCR Plate Sealer (Bio-Rad), then vortexed three times for 10-s pulses, and spun down in a microfuge. The ddPCR fusion reactions were first run on the Bio-Rad Automated Droplet Generator (Bio-Rad) according to the manufacturer's instructions. Once the droplets were generated, a new reaction plate was generated and sealed with a PX1 PCR Plate Sealer. This new reaction plate was transferred to a C1000 Touch Thermal Cycler (Bio-Rad) and run at the following conditions: (a) preheating at 95 °C for 10 min, (b) 40 cycles of denaturing (94 °C for 30 s), followed by annealing/extension (55 °C for 1 min), and (c) enzyme deactivation at 98 °C for 10 min. Lastly, the reaction plate was run on the QX200 Droplet Reader (Bio-Rad) according to the manufacturer's instructions. The results were analyzed by GUSB counts to determine valid/invalid samples. The Bio-Rad fusion assays were then benchmarked to our fusion assay by running each of them with a titration of IVT products (0, 10, 50, 100, and 500 copies) in a 1-ng background of RNA cell line reference (PN 4307281; Applied Biosystems, Waltham, MA, USA).

Results

Contrived sample and commercial reference performance
After removing invalid samples (n = 40 DNA, n = 7 RNA), a total of 293 FFPE DNA and 314 FFPE RNA contrived reactions, each containing one or more variants at a range of variant allele frequencies (Table S6), were characterized on the multiplexed assay. These samples were constructed with no a priori knowledge of the assay performance, as we sought to understand calling accuracy at both high and low VAFs using a custom algorithm designed to automatically classify each digital partition (see Section 2). With the parameters optimized for the contrived sample set, the algorithm calling gave results in agreement with the contrived sample composition: for the contrived FFPE DNA and RNA targets, a 94% positive percent agreement (PPA)/99% negative percent agreement (NPA) and 100% PPA/97.9% NPA, respectively (Table 2). The assay generated a total of 20/1578 (1.3%) false-negative calls and 9/1578 (0.6%) false-positive calls on the contrived DNA samples, which may be partly driven by chemistry and partly by instrument noise. For example, the majority of these false-negative DNA calls were associated with the EGFR G719X target. This primer/probe system was one of the noisiest, likely because it targeted three variants with three different variant-specific primers at the same codon and required a blocker to suppress the wild-type signal. The EGFR G719X assay was not multi-spectrally encoded; such encoding could have significantly reduced the nonspecific calls and allowed for higher amplitudes to increase sensitivity.
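For reference, PPA and NPA are computed from the confusion counts against the comparator as sketched below. The function name `percent_agreement` is hypothetical, and the example counts are illustrative only; they are not the study's confusion matrix, which is reported in Table 2.

```python
def percent_agreement(tp, fn, tn, fp):
    """Return (PPA, NPA) in percent.

    PPA = TP / (TP + FN): agreement on comparator-positive calls.
    NPA = TN / (TN + FP): agreement on comparator-negative calls.
    """
    ppa = 100.0 * tp / (tp + fn)
    npa = 100.0 * tn / (tn + fp)
    return ppa, npa

# Illustrative counts only (hypothetical).
ppa, npa = percent_agreement(tp=94, fn=6, tn=198, fp=2)
print(ppa, npa)
```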
We further sought to assess the analytical accuracy of the RNA assay using an external reference standard (SeraCare). Here, three of the fusion reportables (ALK, ROS1, and MET Exon 14 skipping) were tested with the multiplex dPCR assay and found to generate copy estimates in strong agreement with the Certificate of Analysis concentration (Table S9). An additional comparison of the multiplex dPCR assay against three commercially available singleplex fusion assays for ALK, RET, and ROS1 also demonstrated similar levels of performance, with a sensitivity to detect 100 or fewer IVT RNA molecules (Fig. S4).

Human biological sample performance
Consistent with prior reports on the impact of FFPE storage time on DNA fragment length [24], 17/45 FFPE samples that were > 15 years old did not yield DNA of sufficient quality to generate libraries for sequencing or dPCR analysis. All of the 40 FFPE samples that were < 3 years old yielded sufficient DNA and RNA for sequencing (> 200 ng of DNA and RNA). However, prioritizing material for sequencing left three samples with insufficient material for subsequent dPCR testing. After filtering for samples with both passing dPCR calls and sufficient NGS read data at each target position, the assay achieved a 100% PPA and 98.5% NPA on the human biological FFPE DNA samples (n = 38), and a 100% NPA on the FFPE RNA samples (n = 31) (Table 3 and Table S10).
Of the 28 DNA and 16 RNA samples > 15 years old that generated sequencing data, we observed highly variable sequencing coverage across the variant loci interrogated by the dPCR assay (Table S8). This appears to have contributed to five samples with clinical annotation of EGFR Exon 19 del+ (based on prior sequencing or PCR assays on sister blocks) where the dPCR assay detected EGFR E746_A750del (COSM6223), and NGS re-sequencing failed to detect a variant due to lack of coverage in Exon 19. Similarly, four samples were detected to be positive for KRAS G12C by dPCR, and three had associated clinical annotation of KRAS+, but they failed to generate sequencing data due to insufficient quantity of nucleic acid for library preparation (Table S10). One sample was detected to be positive for EGFR H773dup but gave zero aligned reads in EGFR Exon 20. For the 40 DNA samples < 3 years old, one sample (DH-EGFR-048) was called dPCR positive for EGFR G719X that was not detected by NGS. Here, the comparator sequencing assay was validated for detection down to 5% G719X variant frequency, while the amplitude modulation dPCR assay measured it at 2.0% VAF, suggesting it may have been missed by sequencing (Table S10a). Unfortunately, discordant resolution could not be performed on these samples as additional nucleic acid could not be obtained. Taken together, these results highlight the potential value of a dPCR assay that is compatible with lower input mass and yet still has high sensitivity to generate actionable information from degraded or low-yielding samples.

Multi-spectral encoding improves TaqMan assay specificity
Based on the performance of single-probe TaqMan systems, we implemented multi-spectral encoding for EGFR L858R, EGFR T790M, ERBB2 Y772_A775dup, and KRAS G12C (Fig. 4, Figs S1 and S2). In the absence of multi-spectral encoding, the average single-channel background noise for these four targets was 108 positive partitions, as measured by running wild-type genomic DNA (Fig. 2). With multi-spectral encoding, however, the background noise for these four targets was reduced to an average of 3 positive partitions. Multi-spectral encoding thus allowed for the accurate counting of these targets down to as few as 9 molecules (Fig. 2).

Discussion
There are a growing number of targets and associated molecular testing methodologies to interrogate NSCLC molecular tumor profiles, ranging from single gene qPCR tests [25], easy-to-use cartridge-based systems [26], to comprehensive genomic profiling assays [27].
Here, we describe a first-of-its-kind, proof-of-concept assay that combines the speed and simplicity of a PCR test with the breadth of actionable coverage and sensitivity of a multi-gene sequencing-based test. One of the key challenges with developing a highly multiplexed oncology-focused PCR assay is being able to separately and specifically report variants that are in very close physical proximity (e.g., EGFR L858R and EGFR L861Q, only separated by two codons). Primer and probe systems for one variant can inadvertently interact with the primer and probe systems for the other, leading to false-positive signal generation. Here, we mitigated these interactions by either separating out proximal variants into separate wells, or by leveraging target-specific probes and a common, wild-type amplicon that spans multiple targets.
Additionally, we incorporated multi-spectral signal encoding to suppress wild-type amplification noise that becomes increasingly more challenging in high multiplex PCR mixtures.
For a subset of the > 15-year-old DNA FFPE cases, there was insufficient nucleic acid available to proceed with library preparation and sequencing (N = 17/45, 38%, Table 1), or there was insufficient amplicon coverage across all actionable genomic positions to enable confident calls for all reportables (N = 10/45, 22%). Amplicon coverage is a known issue for targeted sequencing panels and can be driven by a combination of isolation methods, hybridization capture probe locations, DNA fragment lengths, DNA input amount, and sequencing alignment workflows [28]. Here, the issue was particularly acute, given the age of a large fraction of the samples. The multiplex dPCR assay, less constrained by the DNA quality and input mass requirements of sequencing, was able to generate a valid result for 22 DNA samples that had insufficient DNA for sequencing or had coverage gaps (Table S10b). This highlights two important potential use cases for a multiplex dPCR panel. First, for samples that yield insufficient DNA and/or RNA for sequencing, a low-cost dPCR solution provides a feasible alternative to repeat biopsy to gain actionable information. Second, for situations where sequencing has a multiday turnaround time, a fast and comprehensive dPCR test may enable faster decision-making by the physician while waiting for more comprehensive test results [29]. This is particularly true for metastatic NSCLC patients, where there can be a real sense of urgency. To support this hypothesis, future work should explore the dPCR assay performance on additional FFPE sample types, including needle core biopsies and fine needle aspirates, where input mass is particularly challenging.
While some sequencing-based assays detect fusions through DNA measurements by attempting to identify specific breakpoints within introns, this can be computationally challenging and highly dependent on sequencing coverage [30]. For this reason, we selected a sequencing comparator that leverages RNA-Seq, which, like our assay, makes calls by detecting the presence of fusion exon-exon junctions. However, despite having a ≥ 50 ng total RNA input, we noticed that three of the RNA gene targets (MET, NTRK1, and ACTB) had low wild-type expression levels (< 100 RPKM) across all samples tested, which suggests that some combination of preanalytic and/or biological factors can create greater challenges for RNA-based fusion variant detection. The low read count held true for both the older (> 15 years) and younger (< 3 years) FFPE samples (Table S2). To investigate whether the low counts were specific to sequencing, we evaluated the human biological samples with a second fusion comparator: three commercially available ddPCR singleplex fusion assays for ALK, RET, and ROS1 (Bio-Rad). We first verified the performance of the Bio-Rad ddPCR assays by titrating the previously generated IVT products, and then proceeded with re-testing the human biological RNA samples. Of the N = 60 RNA samples tested across the three ddPCR assays (1 ng total RNA for each assay), N = 75/180 (42%) assays failed on the Bio-Rad ddPCR platform due to low reference gene GUSB counts. In contrast, the amplitude modulation dPCR assay had only 6/60 (10%) assay failures due to low reference gene copies with approximately the same total RNA input (1.5-3 ng). This highlights the importance of selecting suitable reference controls given preanalytic and biological factors, as well as assay input mass.
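The failure proportions quoted above follow directly from the reported counts, as the short sketch below shows. The `passes_reference_qc` helper and its `min_reference_copies` parameter are hypothetical and are included only to illustrate what a reference-gene QC gate might look like; the actual QC criteria are those described in the study's methods.

```python
def failure_rate(failed: int, total: int) -> float:
    """Fraction of assays that failed QC."""
    if total <= 0 or not 0 <= failed <= total:
        raise ValueError("require total > 0 and 0 <= failed <= total")
    return failed / total


def passes_reference_qc(reference_copies: float, min_reference_copies: float) -> bool:
    """Hypothetical QC gate: pass when enough reference gene copies were counted."""
    return reference_copies >= min_reference_copies


# Counts as reported in the text: 75/180 ddPCR assays and 6/60 dPCR assays failed.
ddpcr_rate = failure_rate(75, 180)
am_dpcr_rate = failure_rate(6, 60)
fold_difference = ddpcr_rate / am_dpcr_rate

print(f"singleplex ddPCR failure rate: {ddpcr_rate:.1%}")
print(f"amplitude modulation dPCR failure rate: {am_dpcr_rate:.1%}")
print(f"fold difference: {fold_difference:.1f}x")
```

The two platforms used different reference genes and slightly different input masses, so the fold difference is descriptive rather than a controlled comparison.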
In summary, the performance of the dPCR assay was evaluated using a mix of contrived and human biological NSCLC samples. The contrived samples allowed testing across all variants and reportables at a range of VAFs, and enabled algorithm development and optimization. The assay also successfully detected many of the common DNA variants in NSCLC human biological samples, including variants present in samples that were not sufficient for NGS. Although neither this assay nor the comparator assays detected any rare DNA variants or any RNA fusion-positive samples, this is not surprising given the sample size and the low prevalence of rare variants and fusions (1-4% of NSCLC patients) [31][32][33][34]. To further establish the potential of amplitude modulation dPCR in NSCLC testing, additional work is needed to (a) expand the inclusivity of the assay for insertion, deletion, and fusion variants, (b) better understand the relationship between sample input, quality, and performance, and (c) test the methods on a larger sample set containing representative rare variants and fusion-positive samples.

Conclusions
Amplitude modulation and multi-spectral encoding enable laboratories to increase the amount of information and decrease noise in dPCR reactions. Here, we illustrate how a 27-variant tumor profiling assay can be constructed for actionable biomarkers with a performance commensurate to NGS, with the benefit of compatibility with lower input mass samples. These chemical and computational approaches may help enable low-cost, fast turnaround, accessible assays in the future.

Fig. 1. Three TaqMan primer/probe configurations are leveraged in the multiplex dPCR assay. (A) One or two identical sequence probes, each with a different fluorophore/quencher pair (red and blue), hybridize specifically to the variant sequence and not to the wild-type sequence. Probes are flanked by wild-type locus-specific primers. (B) Allele-refractory mutation system (ARMS) primers specific for the single nucleotide variant (SNV) or insertion/deletion (indel) of interest undergo 3′ extension if there is a perfect sequence match. One or two identical sequence probes complementary to wild-type sequence can be labeled with different fluorophore/quencher pairs. (C) RNA-based fusion assays designed against cDNA sequences whereby one primer targets one gene exon and a second primer and probe target the exon of the fusion partner gene.
6 μL reverse transcriptase (PN M0368S; New England Biolabs, Ipswich, MA, USA), 5 μL RNA sample (1-3 ng total RNA), and 1.6 μL 1× TE Buffer (pH 8.0, Low EDTA (Tris-EDTA; 10 mM Tris base, 0.1 mM EDTA)) (PN 786-150; G-Biosciences, St. Louis, MO, USA). Each dPCR reaction mix was then vortexed three times for 5-s pulses and spun down in a microfuge, and 9 μL of the dPCR reaction mix was added to each well of a QuantStudio Absolute Q MAP16 Plate (PN A52865; Thermo Fisher Scientific). Next, 12 μL of QuantStudio Absolute Q Isolation Buffer (PN A52730; Thermo Fisher Scientific) was added to each well on top of the reaction mix. The wells were sealed with QuantStudio Absolute Q strip caps (PN 332101; Thermo Fisher Scientific). All testing was conducted on QuantStudio Absolute Q Digital PCR Systems (Thermo Fisher Scientific). Thermal cycling was performed as follows: (a) reverse transcription at 50 °C for 15 min, (b) preheating at 95 °C for 10 min, and (c) 40 cycles

Fig. 2. Multi-spectral encoding isolates background nonspecific wild-type amplification inherent to nucleic acid hybridization-reliant chemistry. Panels (A-C) show 1D and 2D plots in two channels for probe-based detection for COSM6240 (EGFR T790M). The primers and probes produce some nonspecific amplification with background wild-type DNA (N = 6090 haploid genome copies). (D, E) A contrived sample containing 0.25% COSM6240 synthetic copies in a background of wild-type DNA generates a true-positive signal in channel 1 that is indistinguishable from nonspecific amplification. (F) The same sample as in (D) and (E) leveraging multi-spectral encoding to isolate true-positive partitions from nonspecific amplification. The table on the right shows false-positive counts arising within the call windows of each of four targets from four negative control samples.

Fig. 3. Multi-spectral encoding compensates for variable channel 3 and channel 1 probe performance.(A, B) A channel 3 or 1 probe targeting COSM516 (KRAS G12C) in the presence of synthetic target and human genomic DNA (top) or synthetic target alone (bottom).(C) A mixture of channel 1 and channel 3-labeled COSM516 probes leads to a shift in the positive distribution away from the negative population in both the X and Y directions, reducing false-positive partitions and consolidating true positives.

Table 1. Human biological FFPE samples were sourced through multiple studies over varying timescales. Metadata and QC performance are described through sequencing and dPCR workflows. Approved by the Institutional Review Board of University of Hong Kong/Hospital Authority Hong Kong West Cluster.

Table 2. Contrived human biological sample performance. Algorithm performance on all contrived samples at ≥ 1% VAF. Algorithm parameters were optimized on this same sample set as described in Section 2.

Table 3. Nucleic acids from human biological NSCLC samples were isolated and underwent QC as described in Section 2. Results are shown for samples passing both NGS and dPCR QC and count criteria.