Cost‐effectiveness of precision diagnostic testing for precision medicine approaches against non‐small‐cell lung cancer: A systematic review

Precision diagnostic testing (PDT) employs appropriate biomarkers to identify cancer patients that may optimally respond to precision medicine (PM) approaches, such as treatments with targeted agents and immuno‐oncology drugs. To date, there are no published systematic appraisals evaluating the cost‐effectiveness of PDT in non‐small‐cell lung cancer (NSCLC). To address this gap, we conducted Preferred Reporting Items for Systematic Reviews and Meta‐Analyses searches for the years 2009–2019. Consolidated Health Economic Evaluation Reporting Standards were employed to screen, assess and extract data. Employing base costs, life years gained or quality‐adjusted life years, as well as willingness‐to‐pay (WTP) threshold for each country, net monetary benefit was calculated to determine cost‐effectiveness of each intervention. Thirty‐seven studies (50%) were included for analysis; a further 37 (50%) were excluded, having failed population‐, intervention‐, comparator‐, outcomes‐ and study‐design criteria. Within the 37 studies included, we defined 64 scenarios. Eleven scenarios compared PDT‐guided PM with non‐guided therapy [epidermal growth factor receptor (EGFR), n = 5; programmed death‐ligand 1 (PD‐L1), n = 6]. Twenty‐eight scenarios compared PDT‐guided PM with chemotherapy alone (anaplastic lymphoma kinase, n = 3; EGFR, n = 17; PD‐L1, n = 8). Twenty‐five scenarios compared PDT‐guided PM with chemotherapy alone, while varying the PDT approach. Thirty‐four scenarios (53%) were cost‐effective, 28 (44%) were not cost‐effective, and two were marginal, dependent on their country’s WTP threshold. When PDT‐guided therapy was compared with a therapy‐for‐all patients approach, all scenarios (100%) proved cost‐effective. Seven of 37 studies had been structured appropriately to assess PDT‐PM cost‐effectiveness. Within these seven studies, all evaluated scenarios were cost‐effective. However, 81% of studies had been poorly designed. 
Our systematic analysis implies that more robust health economic evaluation could help identify additional approaches towards PDT cost‐effectiveness, underpinning value‐based care and enhanced outcomes for patients with NSCLC.


Introduction
The World Health Organization (WHO) lists lung cancer as the most common cancer and leading cause of cancer death (1.76 million globally) [1]. In the USA, lung cancer was projected to cause 140 730 cancer deaths in 2020 (almost a quarter of total cancer deaths), with projected lung cancer deaths in the EU at 182 600 in 2020 [2,3]. Relative survival of lung cancer is poor, at 39% and 13% for 1- and 5-year survival, respectively. Non-small-cell lung cancer (NSCLC) accounts for 85% of lung cancer cases [4,5].
Since 2009, several classes of drugs for NSCLC have been approved for use, all with accompanying precision diagnostic tests (PDT). These include tyrosine kinase inhibitors (TKI) against epidermal growth factor receptor (EGFR; gefitinib, erlotinib, afatinib, dacomitinib and osimertinib), TKI against anaplastic lymphoma kinase (ALK; crizotinib, ceritinib, alectinib, brigatinib and lorlatinib), agents targeting v-raf murine sarcoma viral oncogene homolog B1 (BRAF; dabrafenib/trametinib), c-ros oncogene 1 (ROS1; crizotinib or entrectinib), mesenchymal epithelial transition factor (MET; capmatinib or tepotinib), rearranged during transfection proto-oncogene (RET; selpercatinib) and neurotrophic tropomyosin receptor kinase (NTRK; entrectinib or larotrectinib), and immuno-oncology (IO) drugs (pembrolizumab, nivolumab, atezolizumab and durvalumab). The American Society of Clinical Oncology (ASCO) and Ontario Health guidelines state that the 60% of stage IV NSCLC patients with actionable mutations (EGFR, ALK, BRAF, ROS1, MET, RET and NTRK) should be offered the corresponding precision medicine (PM) that targets these abnormalities, and the remaining 40% without driver mutations should be offered immunotherapy, dependent on programmed death-ligand 1 (PD-L1) tumour proportion score test results [6,7]. However, the costs of these new agents are proving unsustainable, in both unregulated markets and socialised healthcare systems [8,9]. This challenge has to be set against the improved patient outcomes observed with these new targeted agents [10]. Quantifying the impact requires an assessment of the value of a PM intervention to both the patient and the payer [11,12], which in turn requires some form of health economic evaluation as part of a health technology assessment process.
To understand the health economic evaluation landscape of PDT-guided PM, we undertook a systematic review of the evidence available for value-based policymaking in this domain. Our hypothesis is that PDT, while a fraction of the cost of their associated PM, provide substantial value in terms of health benefits.

Methods
The review is registered with PROSPERO (registration number: CRD42020171234) as per Preferred Reporting Items for Systematic Reviews and Meta-analyses guidelines [13]. The methodology followed was similar to a previous paper by the authors [14].

Search strategy
Utilising the PICOS framework (population, intervention, comparator, outcome, study design), we formulated the research question: 'What is the cost-effectiveness of precision diagnostic testing (PDT) for guiding therapy in non-small-cell lung cancer?' PICOS was employed to develop a search limited to studies that performed economic evaluation of patients diagnosed with NSCLC who were subsequently stratified for treatment selection based on a PDT result. The search was conducted for studies reported between 1 January 2009 and 31 December 2019. We searched MEDLINE, Embase, Cochrane Library, SCOPUS, Web of Science, NHS Economic Evaluation Database (EED) and Econlit. Meeting presentations were also searched for the same period in the ASCO and International Society for Pharmacoeconomics and Outcomes Research (ISPOR) websites (see Table 1).

Study selection
Articles were screened for eligibility based on criteria listed in Table S1. Titles and abstracts of all articles were reviewed for eligibility and only accepted if these criteria were met. Four reviewers (RH, DF, DS and ML) independently evaluated the full text of potentially eligible articles to determine whether to include or exclude. A lack of consensus over eligibility was resolved between the four reviewers. If doubts remained about study suitability (e.g. abstracts lacking peer review), they were excluded. The integrity of each study was assessed according to a checklist developed by the ISPOR Consolidated Health Economic Evaluations Reporting Standards Task Force Report [15]. This underpinned development of a quality rating for each study, thus allowing rigorous evaluation of the robustness of the data generated. Quality assessment was performed by one reviewer, checked by a second reviewer and any disagreement resolved by third/fourth reviewers. Quality ratings were assigned in five categories (Table S2).
The study selection workflow is outlined in Fig. S1. Our initial database search and other electronic searches (ASCO, ISPOR) identified 18 723 records. MEDLINE and Embase results were further searched using health economic filters, as there was a paucity of health economic studies among the identified records. A total of 18 614 records were excluded and the remaining 110 records imported into reference management software, where duplicate records (n = 36) were removed. A total of 74 articles were screened for eligibility. After full-text examination, 10 articles were either reviews or systematic reviews, which were retained for reference, whereas seven articles did not mention the terms life-years gained (LYG), quality-adjusted life-years (QALY), or incremental cost-effectiveness ratio (ICER). Seven other articles did not include cost-effectiveness analysis (CEA), cost-benefit analysis or cost-utility analysis. On further examination, 12 were abstracts without sufficiently detailed information, and one was an intervention without deployment of a PDT. In total, 37 eligible studies remained which involved economic evaluation of PDT for guiding therapeutic intervention in NSCLC.
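The full-text exclusions reported above can be tallied as a quick consistency check. The sketch below uses only the counts stated in this section; the dictionary keys are shorthand labels introduced here for illustration.

```python
# Counts are taken from the study selection text; key names are illustrative labels.
screened_full_text = 74

exclusions = {
    "reviews_or_systematic_reviews": 10,
    "no_LYG_QALY_or_ICER_reported": 7,
    "no_CEA_cost_benefit_or_cost_utility_analysis": 7,
    "abstracts_with_insufficient_detail": 12,
    "intervention_without_PDT": 1,
}

# Sum the exclusions and subtract them from the articles screened.
total_excluded = sum(exclusions.values())
included = screened_full_text - total_excluded

print(f"Excluded at full-text stage: {total_excluded}")
print(f"Included studies: {included}")
```

The two totals correspond to the even split of included and excluded studies among the 74 articles screened.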

Data extraction
We extracted empirical and methodological data and imported these data into Microsoft EXCEL. Extracted features included: author, year, country of study, NSCLC stage/advanced/not described, therapy, biomarker utilised, LYG, QALY, the current ICER (cost per LYG) and/or ICER (cost per QALY), willingness-to-pay (WTP) cost-effectiveness threshold (CET) and net monetary benefit (NMB) statistic (calculated based on LYG or QALY, costs and WTP). We also extracted PM cost, PDT cost (and calculated the PDT : PM cost ratio), perspective (healthcare payer, health insurance or hospital), modelling approach, time horizon (duration of therapy), discounting applied, one-way sensitivity analysis (OWSA), probabilistic sensitivity analysis (PSA) and the trial upon which the economic evaluation was based. While most studies listed only one scenario of PDT intervention compared with standard of care (or another PDT), some studies listed as many as 10 different scenarios, where the scenario involved variation in the PDT, therapy or country. If there were insufficient data (e.g. abstract reports from conferences), we emailed the original authors for further details.
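As an illustration of how one extracted scenario might be held in a structured form, the following sketch defines a record with a subset of the features listed above and derives the PDT : PM cost ratio. The class name, field names and example values are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScenarioRecord:
    """One extracted scenario. Field names are hypothetical and mirror the listed features."""
    author: str
    year: int
    country: str
    therapy: str
    biomarker: str
    perspective: str                      # e.g. healthcare payer, health insurance or hospital
    pm_cost: float                        # cost of the precision medicine
    pdt_cost: float                       # cost of the precision diagnostic test
    lyg: Optional[float] = None           # life-years gained, if reported
    qaly: Optional[float] = None          # quality-adjusted life-years, if reported
    wtp_threshold: Optional[float] = None

    def pdt_to_pm_cost_ratio(self) -> float:
        """Test cost expressed as a fraction of therapy cost."""
        if self.pm_cost <= 0:
            raise ValueError("PM cost must be positive")
        return self.pdt_cost / self.pm_cost


# Purely illustrative values.
example = ScenarioRecord(
    author="Hypothetical et al.", year=2015, country="Exampleland",
    therapy="TKI", biomarker="EGFR", perspective="healthcare payer",
    pm_cost=40_000.0, pdt_cost=400.0,
)
ratio = example.pdt_to_pm_cost_ratio()
print(f"PDT : PM cost ratio = {ratio:.2%}")
```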

Data synthesis
Data capture and quality analysis for each study of cost-effectiveness were represented in the data extraction and as a narrative summary. Modelling techniques used in the different studies were compared and their robustness analysed.

Sub-analysis
Net monetary benefit was calculated in each instance where a PM guided by a PDT was compared with the same PM drug administered to all patients without PDT guidance.
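To make the logic of this comparison concrete, the sketch below contrasts a test-guided strategy with a treat-all strategy for a hypothetical cohort. All parameter values are invented for illustration, the test is assumed to be perfectly accurate, and biomarker-negative patients in the guided arm are assumed to receive chemotherapy; none of these assumptions is drawn from the included studies.

```python
def strategy_totals(prevalence, drug_cost, chemo_cost, test_cost,
                    qaly_pos_on_drug, qaly_neg_on_drug, qaly_on_chemo):
    """Return per-patient (cost, QALY) for the guided and treat-all strategies."""
    # Treat-all: every patient receives the PM drug, no test is performed.
    treat_all_cost = drug_cost
    treat_all_qaly = (prevalence * qaly_pos_on_drug
                      + (1 - prevalence) * qaly_neg_on_drug)
    # Guided: every patient is tested; positives get the drug, negatives get chemotherapy.
    guided_cost = (test_cost + prevalence * drug_cost
                   + (1 - prevalence) * chemo_cost)
    guided_qaly = (prevalence * qaly_pos_on_drug
                   + (1 - prevalence) * qaly_on_chemo)
    return (guided_cost, guided_qaly), (treat_all_cost, treat_all_qaly)


# Hypothetical inputs.
(guided_cost, guided_qaly), (all_cost, all_qaly) = strategy_totals(
    prevalence=0.3, drug_cost=50_000, chemo_cost=15_000, test_cost=500,
    qaly_pos_on_drug=1.5, qaly_neg_on_drug=0.6, qaly_on_chemo=0.8,
)
wtp = 50_000  # hypothetical willingness-to-pay per QALY

delta_cost = guided_cost - all_cost
delta_qaly = guided_qaly - all_qaly
nmb_guided_vs_all = delta_qaly * wtp - delta_cost

print(f"Incremental cost: {delta_cost:,.0f}")
print(f"Incremental QALY: {delta_qaly:.2f}")
print(f"NMB of guided versus treat-all: {nmb_guided_vs_all:,.0f}")
```

With these invented inputs the guided strategy is both cheaper and more effective, so the NMB is positive. Different inputs, in particular an imperfect test or a drug that also benefits biomarker-negative patients, could reverse the result.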

Mathematical formulae employed
In cases where more than one therapy and test combination were modelled, the reported ICER might not be compared to the base case, e.g. best supportive care (BSC), or LYG and QALY might be reported without a corresponding ICER. In these instances, we calculated the ICER based on reported costings and QALY for the PDT using the standard formula:

$$\mathrm{ICER} = \frac{C_{\mathrm{PDT}} - C_{\mathrm{comparator}}}{E_{\mathrm{PDT}} - E_{\mathrm{comparator}}}$$

where $C$ denotes cost and $E$ denotes effectiveness (LYG or QALY).

For the studies that compared PDT-guided therapy with unselected therapy for all-comers and with chemotherapy, we conducted a sub-analysis using the NMB (a summary statistic that represents an intervention's value in monetary terms) with the standard formula:

$$\mathrm{NMB} = \left(E_{\mathrm{PDT}} - E_{\mathrm{comparator}}\right) \times \mathrm{WTP} - \left(C_{\mathrm{PDT}} - C_{\mathrm{comparator}}\right)$$

The WTP CET employed for each scenario corresponded to that reported in the study; if more than one WTP CET or a range was described, we conservatively chose the lowest. Additionally, if no WTP CET was disclosed, then the WTP CET from the same country in another captured study was employed, or 1× gross domestic product (GDP) per capita of that country was used.
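The calculations described in this section can be sketched as follows, assuming the standard definitions of the ICER (incremental cost divided by incremental effect) and the NMB (incremental effect multiplied by the WTP threshold, minus incremental cost). Function names and the example figures are illustrative; the threshold selection follows the order of preference stated above.

```python
from typing import Optional, Sequence


def icer(delta_cost: float, delta_effect: float) -> float:
    """Incremental cost per unit of effect (LYG or QALY)."""
    if delta_effect == 0:
        raise ZeroDivisionError("ICER is undefined when the incremental effect is zero")
    return delta_cost / delta_effect


def nmb(delta_cost: float, delta_effect: float, wtp: float) -> float:
    """Net monetary benefit: monetised health gain minus incremental cost."""
    return delta_effect * wtp - delta_cost


def choose_wtp(reported: Sequence[float],
               same_country_other_study: Optional[float],
               gdp_per_capita: float) -> float:
    """Select the WTP threshold in the order of preference described in the text."""
    if reported:                              # one value, several values, or range bounds
        return min(reported)                  # conservatively take the lowest
    if same_country_other_study is not None:  # borrow from another captured study
        return same_country_other_study
    return 1.0 * gdp_per_capita               # fall back to 1x GDP per capita


# Hypothetical scenario: +20,000 in costs for +0.5 QALY, thresholds reported as a range.
threshold = choose_wtp([50_000, 100_000], None, gdp_per_capita=60_000)
example_icer = icer(20_000, 0.5)
example_nmb = nmb(20_000, 0.5, threshold)
print(f"WTP used: {threshold:,.0f}; ICER: {example_icer:,.0f}; NMB: {example_nmb:,.0f}")
```

A positive NMB corresponds to an ICER below the chosen threshold when the incremental effect is positive, so the two statistics agree in that case.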

Results
The 37 studies were reported from Asia, Australia, Europe and North America, all of which were at least upper middle-income countries. Publications spanned the period 2009-2019. Where a negative ICER was reported or calculated, it was always due to negative costs, not negative LYG or QALY. The reader should refer to the NMB statistic before drawing conclusions, as a negative value here will always determine that the intervention was not cost-effective.

Health outcomes for each precision medicine/precision diagnostic testing combination
TKI treatment guided by EGFR status versus TKI treatment for all patients
Four of five ICER were negative (Table 2a), generated from increased QALY and less costly therapy; one of the ICER was positive [16][17][18][19][20]. A negative ICER can arise from an intervention that is either cost-effective or not, which may lead to confusion. The simpler net monetary benefit (NMB) was therefore calculated for each scenario; here a negative NMB indicates a non-cost-effective strategy. All four erlotinib studies produced positive NMB. The one gefitinib study evaluated revealed an NMB equal to an increase of Singapore dollars (S$)5800 per patient in value when the EGFR test is employed to guide gefitinib therapy, as opposed to an unselected approach (Table 2).
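The ambiguity of a negative ICER can be shown with two hypothetical scenarios that share the same ratio but sit in opposite quadrants of the cost-effectiveness plane. The figures below are invented, and the function name is ours.

```python
def describe(delta_cost: float, delta_qaly: float, wtp: float):
    """Return (ICER, NMB, label) for one comparison."""
    ratio = delta_cost / delta_qaly
    net = delta_qaly * wtp - delta_cost
    if delta_cost < 0 and delta_qaly > 0:
        label = "dominant"      # less costly and more effective
    elif delta_cost > 0 and delta_qaly < 0:
        label = "dominated"     # more costly and less effective
    else:
        label = "trade-off"     # sign of the ICER is not misleading here
    return ratio, net, label


wtp = 50_000  # hypothetical threshold per QALY
cheaper_and_better = describe(-10_000, 0.2, wtp)
dearer_and_worse = describe(10_000, -0.2, wtp)

for name, (ratio, net, label) in [("A", cheaper_and_better), ("B", dearer_and_worse)]:
    print(f"Scenario {name}: ICER = {ratio:,.0f}, NMB = {net:,.0f}, {label}")
```

Both scenarios return the same negative ICER, yet only the first has a positive NMB, which is why the NMB is the less ambiguous summary.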

Immunotherapy treatment guided by PD-L1 positivity versus immunotherapy treatment for all patients
Incremental cost-effectiveness ratios generated by the PD-L1 testing strategy were described as dominated (i.e. yielded worse health outcomes and were more costly) when compared with the no testing strategy when pembrolizumab treatment was considered. However, the corresponding incremental QALY and costs reported increased health benefits and were less expensive, respectively [21]. Nivolumab therapy accompanied by PD-L1 testing generated ICER well below Swiss francs (CHF)100,000 WTP CET (Table 2b) [22].
Sub-analysis of the nivolumab study indicated an NMB of CHF86 per patient at PD-L1 ≥ 1% and an NMB of CHF2,779 per patient at PD-L1 ≥ 10% where the PD-L1 test guides nivolumab therapy when compared with an unselected approach (Fig. 1A). Analysis of the pembrolizumab study revealed an NMB of US$32,604 per patient at PD-L1 ≥ 1% and an NMB of US$56,889 per patient at PD-L1 ≥ 50% in the USA (Fig. 1B). For China, the results indicated an NMB of US$27,039 per patient at PD-L1 ≥ 1% and an NMB of US$52,120 per patient at PD-L1 ≥ 50% (Fig. 1C).
In summary, PM treatments (erlotinib, gefitinib or immunotherapy) guided by a precision diagnostic test (EGFR or PD-L1) increased the clinical and monetary value of PM for both the patient and healthcare payer when compared with an unselected treatment approach for 'all' patients (Table 3).

TKI treatment guided by EGFR or ALK status versus chemotherapy treatment
Of the 14 studies (20 scenarios) evaluated, 13 scenarios generated ICER which breached their respective WTP CET (Table 2c); this is reflected in their corresponding negative NMB values (Table 4).

Immunotherapy treatment guided by PD-L1 positivity versus chemotherapy treatment
Six scenarios generated ICER below their WTP CET (Table 5), which correlated with positive NMB values; the remaining two scenarios had ICER which breached their WTP CET.

Treatment guided by genetic status using different testing scenarios
Twelve scenarios breached their WTP CET, two scenarios were dominated (clinically inferior and more expensive than the standard of care), and the remaining 11 scenarios were within their CET (Table 2e). The NMB indicated 13 scenarios that were not cost-effective, 10 scenarios that were cost-effective, and two scenarios which were marginal at zero (Table 6).

Analyses for each PM-PDT cost evaluated
The annual cost of each study's PM was identified and this was employed as a denominator to assess the fraction of test cost relative to precision therapy (Tables S3a-S3e). [...] status had the greatest effect on the ICER, and the PSA determined that PD-L1 testing increased cost-effectiveness of the therapy in China, Switzerland and the USA (Table S3b) [21,22].

TKI treatment guided by EGFR or ALK status versus chemotherapy treatment
Afatinib, erlotinib and gefitinib treatment guided by EGFR status
Reviewing the OWSA results, it was evident that both increasing mutation prevalence of EGFR and the health status of the patient had the greatest impact on the ICER. The PSA determined in four of seven studies that in China (with patient access programmes), Germany and Japan, EGFR-guided therapy was cost-effective, whereas in the Mexican, Thai and US studies it was not (Table S3c) [23][24][25][26][27][28][29].

Osimertinib treatment guided by EGFR-T790M status
Overall, the OWSA results showed that the patient's health status and the cost of osimertinib had the greatest effect on the ICER, whereas the PSA was inconclusive, with China and the UK studies cost-effective, but Canadian, Chinese and USA studies not cost-effective (Table S3c) [30][31][32][33][34].

Alectinib, ceritinib and crizotinib treatment guided by ALK status
The OWSA demonstrated that cost of therapy and patient's health status influenced the ICER the most. Where a PSA was conducted in the Chinese study, it was likely that ceritinib was cost-effective, whereas alectinib was not (Table S3c) [35,36].
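For readers less familiar with OWSA, the following minimal sketch varies one input at a time in a deliberately simplified model of test-guided TKI versus chemotherapy for all patients. The model structure, parameter names, base values and ranges are all assumptions made for illustration and do not reproduce any included study.

```python
def model_icer(p: dict) -> float:
    """ICER of 'test everyone, treat positives with TKI' versus chemotherapy for all."""
    delta_cost = p["test_cost"] + p["prevalence"] * (p["tki_cost"] - p["chemo_cost"])
    delta_qaly = p["prevalence"] * (p["qaly_tki"] - p["qaly_chemo"])
    return delta_cost / delta_qaly


def one_way(base: dict, ranges: dict) -> dict:
    """Vary each parameter between its low and high value, holding the others at base."""
    results = {}
    for name, (low, high) in ranges.items():
        icers = [model_icer({**base, name: value}) for value in (low, high)]
        results[name] = (min(icers), max(icers))
    return results


# Hypothetical base case and plausible ranges.
base = {"prevalence": 0.15, "test_cost": 500.0, "tki_cost": 40_000.0,
        "chemo_cost": 15_000.0, "qaly_tki": 1.2, "qaly_chemo": 0.8}
ranges = {"prevalence": (0.10, 0.40), "test_cost": (250.0, 1_000.0),
          "tki_cost": (30_000.0, 50_000.0), "qaly_tki": (1.0, 1.4)}

# Sort parameters by the width of their ICER range (the ordering of a tornado diagram).
tornado = sorted(one_way(base, ranges).items(),
                 key=lambda item: item[1][1] - item[1][0], reverse=True)
for name, (low_icer, high_icer) in tornado:
    print(f"{name:12s} ICER range: {low_icer:>10,.0f} to {high_icer:>10,.0f}")
```

In this toy model the cost of testing is spread over fewer treated patients when prevalence is low, so the ICER falls as mutation prevalence rises.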

Immunotherapy treatment guided by PD-L1 positivity versus chemotherapy treatment
The OWSA performed showed that overall survival (OS) had a major impact on the ICER in four of five studies; in the four cases where PSA was conducted, the Swiss study was likely to be cost-effective, the Hong Kong study and one USA study were inconclusive, and another USA study was not cost-effective (Table S3d) [37][38][39][40][41].
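A PSA of the kind referred to here can be sketched as a Monte Carlo simulation that reports the proportion of draws with a positive NMB at a given threshold. The example below uses normal distributions and invented means and standard deviations purely for brevity; published analyses typically choose distributions suited to each parameter.

```python
import random


def probability_cost_effective(mean_dc, sd_dc, mean_dq, sd_dq, wtp,
                               n_draws=10_000, seed=0):
    """Share of simulated draws in which the NMB is positive at the given WTP."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    favourable = 0
    for _ in range(n_draws):
        delta_cost = rng.gauss(mean_dc, sd_dc)   # sampled incremental cost
        delta_qaly = rng.gauss(mean_dq, sd_dq)   # sampled incremental QALY
        if delta_qaly * wtp - delta_cost > 0:
            favourable += 1
    return favourable / n_draws


# Hypothetical incremental cost and QALY distributions, evaluated at two thresholds.
p_low = probability_cost_effective(30_000, 5_000, 0.4, 0.1, wtp=50_000)
p_high = probability_cost_effective(30_000, 5_000, 0.4, 0.1, wtp=100_000)
print(f"P(cost-effective) at WTP 50,000:  {p_low:.2f}")
print(f"P(cost-effective) at WTP 100,000: {p_high:.2f}")
```

Repeating the calculation across a range of thresholds would trace out a cost-effectiveness acceptability curve.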

Treatment guided by genetic status using different testing scenarios
Although the OWSA in Canada, China and USA revealed that several ICER values were most sensitive to overall survival (OS), progression-free survival (PFS) and drug costs, this was not true for all studies, with certain ICER values in Australia and France more impacted by high-risk patients, inpatient care or costs alone. Most of the PSA performed suggested that these were not cost-effective strategies, although the 14-gene assay and ALK testing were cost-effective (Table S3e).

Discussion
To our knowledge, there is no systematic review of economic evaluations of NSCLC which has focused specifically on PDT-guided PM. In our analysis, we identified 64 CEA scenarios, evaluated within 37 studies, which satisfied our criteria, to determine 'What is the cost-effectiveness of PDT for guiding therapy in non-small-cell lung cancer?' Thirty-four (53%) of these scenarios were deemed cost-effective. However, only 11 of the 64 scenarios followed the correct analysis format to assess whether a PDT adds value to a PM approach. That is, only these 11 scenarios compared PM and PDT with PM administered to the patient cohort without prior use of the test to select patients. Of these, seven scenarios (63.6%) agreed with our hypothesis of PDT-guided treatment conferring measurable increased benefit. Four scenarios presented conflicting results of data from Wan et al. [21]; we believe that the authors may have mislabelled these scenarios as dominated (clinically inferior and more expensive) rather than dominant (less costly and better health outcomes), which corresponds to the incremental costs and QALY indicated in their results. The data that we have presented for these seven positive scenarios, and our conclusions, are supported by the authors of a recent systematic review of economic evaluation which focussed only on IO drugs. This study found that in NSCLC, molecular testing to help guide IO interventions provides more clinical benefit than the pharmaceutical agent alone [42]. Overall, the LYG or QALY gained for EGFR-directed therapy were greater in the Asian studies than in North American or European populations, which is to be expected, as the prevalence of EGFR mutations is greater in Asia. In 59% (24 of 41) of cases, the EGFR-guided therapy failed cost-effectiveness criteria regardless of test type; this is also true for ALK testing in 38% (five of 13) of cases, and for PD-L1 testing in 14% (two of 14) of cases (all IHC-based testing).
A number of the testing scenarios involving next-generation sequencing (NGS) have difficulty capturing more than one actionable mutation with standard CEA, as current Markov or state transition models aggregate patient data into distinct health groups [e.g. progression-free survival (PFS), progressive disease (PD) and death], neglecting heterogeneity amongst the patient cohort, and partitioned survival models (PSM) are incapable of returning patients to PFS from a PD state, where in some cases there is a distinct possibility of a 'cure'. Dynamic simulation modelling such as discrete event simulation (DES) has recently been suggested as an approach which can track individual patient pathways, incorporating results, testing and consequential therapies [43].
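The cohort-level aggregation described above can be seen in a minimal three-state Markov model (PFS, PD and death). The transition probabilities, utilities, monthly cycle length and time horizon below are hypothetical; the sketch tracks only the share of the cohort in each state, so individual patient pathways are not represented, and no transition from PD back to PFS is allowed.

```python
STATES = ["PFS", "PD", "Death"]


def run_markov_cohort(transition, utilities, cycles, cycle_length_years=1 / 12):
    """Propagate a cohort through the states and accumulate undiscounted QALYs."""
    occupancy = {"PFS": 1.0, "PD": 0.0, "Death": 0.0}  # everyone starts progression-free
    total_qaly = 0.0
    for _ in range(cycles):
        # QALYs accrued this cycle from the share of the cohort in each state.
        total_qaly += sum(occupancy[s] * utilities[s] for s in STATES) * cycle_length_years
        # Redistribute the cohort according to the transition probabilities.
        updated = {s: 0.0 for s in STATES}
        for source in STATES:
            for target in STATES:
                updated[target] += occupancy[source] * transition[source][target]
        occupancy = updated
    return occupancy, total_qaly


# Hypothetical monthly transition probabilities; each row sums to 1.
transition = {
    "PFS":   {"PFS": 0.90, "PD": 0.08, "Death": 0.02},
    "PD":    {"PFS": 0.00, "PD": 0.85, "Death": 0.15},  # no return from PD to PFS
    "Death": {"PFS": 0.00, "PD": 0.00, "Death": 1.00},
}
utilities = {"PFS": 0.8, "PD": 0.6, "Death": 0.0}  # hypothetical health-state utilities

final_occupancy, qaly = run_markov_cohort(transition, utilities, cycles=60)
print({state: round(share, 4) for state, share in final_occupancy.items()})
print(f"QALYs per patient over 60 monthly cycles: {qaly:.2f}")
```

A DES would instead sample time-to-event for each simulated patient, which allows test results and subsequent lines of therapy to alter an individual's pathway.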
Our analyses strongly suggest that health economic evaluation should be performed routinely from the start of and alongside clinical trials. This is particularly true for precision oncology, where therapeutic costs are high and improved patient outcomes achieved through application of a relatively inexpensive PDT would be beneficial, both from a clinical and a health economic viewpoint. Previously, we have demonstrated a paucity of CEA studies for PM-guided care in colorectal cancer; that same dearth of application of CEA is evident for NSCLC, with only seven of 37 studies (18.9%) adequately designed to analyse the cost-effectiveness and value of PDT [14].

Strengths and weaknesses
The principal strength of this systematic review is that we employ the NMB summary statistic rather than the ICER to assess cost effectiveness. NMB incorporates both costs and QALY at a WTP threshold particular to that country, allowing cost-effectiveness to be easily captured, thus generating more robust data. Secondly, we demonstrate that while PM is a driver of costs, PDT are a driver of value. PDT, at a fraction of the cost of a precision therapy, add value beyond the therapy, by selecting patients who will accrue greater health benefits and reducing costs by excluding patients who will not benefit from a particular PM approach.
Weaknesses of the data presented in this systematic review are that the majority of studies published are inappropriately structured to best assess effective PDT deployment in PM, which may reflect the lack of involvement of health economists and diagnostic stakeholders in setting the PM agenda.
Secondly, the generalisability of the results of this study is difficult to ascertain, as WTP CET vary not only between but also within countries where such studies are performed. WHO proposes that it is reasonable to spend income to achieve a QALY that is equivalent to the GDP per capita of a country, a recommendation followed in the UK by the National Institute for Health and Clinical Excellence with its £30,000 WTP CET, but this is adapted for end-of-life disease such as metastatic NSCLC at £50,000, and modified again for small patient subgroups, a hallmark of PM, with values of £75,000 upwards [44][45][46]. Such high WTP CET could have a significant impact on the costs of a country's healthcare system, if only the value of an intervention is considered. It would be advisable also to conduct a Budget Impact Analysis which would more robustly assess the intervention's affordability [47]. Thirdly, all these CEA are based on randomised controlled trials (RCT) which involve highly selected patient populations. For PM RCT, the small patient populations and more complex clinical pathways may increase uncertainty in CEA modelling results. Adding CEA of real-world data as an adjunct to RCT data would improve confidence in a treatment's effectiveness [48].
Fourthly, the CEA do not capture capital costs (testing equipment), personnel and their training, and reporting tool costs.
Fifthly, patient waiting times between test and therapy and the impact of first and potentially further surgical biopsies are not reported, two important aspects of PDT deployment. Turnaround time, whether for sequential single-gene testing or for NGS, is important in advanced NSCLC, where the speed of test turnaround may be crucial to a patient's survival (and likely to affect QALY). These factors are not modelled in the studies described. Liquid biopsies also add to the speed of tumour profiling, with the additional bonus of sampling being relatively painless to the patient [49].

Conclusion
Over half of the scenarios analysed presented ICER below the WTP CET, suggesting a potential publication bias which can only be addressed by increased diligence and transparency in the health economics/precision oncology evaluation. Only seven of 37 CEA studies performed to assess the benefit of PM approaches in NSCLC care were appropriately designed to assess the value of combining PDT with PM, highlighting the need for greater emphasis on precise health economic analysis to inform value-based patient care. Despite this, employing molecular tests to guide NSCLC therapy appears to be cost-effective in the majority of cases. Thus, cost-effective deployment of PDT can add substantial value to the PM approach well in excess of the cost of the test itself and should inform a more robust approach for future PM delivery for NSCLC patients.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Fig. S1. PRISMA flow diagram.
Table S1. Screening criteria and study design for systematic review.
Table S2. CHEERS criteria and quality rating.
Table S3a. Methodological characteristics and quality rating of TKI treatment guided by EGFR status versus TKI treatment for all patients.
Table S3b. Methodological characteristics and quality rating of immunotherapy guided by PD-L1 positivity versus immunotherapy for all patients.
Table S3c. Methodological characteristics and quality rating of TKI treatment guided by EGFR or ALK status versus chemotherapy.
Table S3d. Methodological characteristics and quality rating of assessment of immunotherapy guided by PD-L1 positivity versus chemotherapy.
Table S3e. Methodological characteristics and quality rating of treatment guided by genetic status using different testing scenarios.