Missing data in trial‐based cost‐effectiveness analysis: An incomplete journey

SUMMARY Cost‐effectiveness analyses (CEA) conducted alongside randomised trials provide key evidence for informing healthcare decision making, but missing data pose substantive challenges. Recently, there have been a number of developments in methods and guidelines addressing missing data in trials. However, it is unclear whether these developments have permeated CEA practice. This paper critically reviews the extent of and methods used to address missing data in recently published trial‐based CEA. Issues of the Health Technology Assessment journal from 2013 to 2015 were searched. Fifty‐two eligible studies were identified. Missing data were very common; the median proportion of trial participants with complete cost‐effectiveness data was 63% (interquartile range: 47%–81%). The most common approach for the primary analysis was to restrict analysis to those with complete data (43%), followed by multiple imputation (30%). Half of the studies conducted some sort of sensitivity analyses, but only 2 (4%) considered possible departures from the missing‐at‐random assumption. Further improvements are needed to address missing data in cost‐effectiveness analyses conducted alongside randomised trials. These should focus on limiting the extent of missing data, choosing an appropriate method for the primary analysis that is valid under contextually plausible assumptions, and conducting sensitivity analyses to departures from the missing‐at‐random assumption.


| INTRODUCTION
Cost-effectiveness analyses (CEA) conducted alongside randomised controlled trials are an important source of information for health commissioners and decision makers. However, clinical trials rarely succeed in collecting all the intended information (Bell, Fiero, Horton, & Hsu, 2014), and inappropriate handling of the resulting missing data can lead to misleading inferences (Little et al., 2012). This issue is particularly pronounced in CEA because these usually rely on collecting rich, longitudinal information from participants, such as their use of healthcare services (e.g., Client Service Receipt Inventory; Beecham & Knapp, 2001) and their health-related quality of life (e.g., EQ-5D-3L; Brooks, 1996).
Several guidelines have been published in recent years on the issue of missing data in clinical trials (National Research Council, 2010; Committee for Medicinal Products for Human Use (CHMP), 2011; Burzykowski et al., 2010; Carpenter & Kenward, 2007) and for CEA in particular (Briggs, Clark, Wolstenholme, & Clarke, 2003; Burton, Billingham, & Bryan, 2007; Faria, Gomes, Epstein, & White, 2014; Manca & Palmer, 2005; Marshall, Billingham, & Bryan, 2009). Key recommendations include:
• taking practical steps to limit the number of missing observations;
• avoiding methods whose validity rests on contextually implausible assumptions, and using methods that incorporate all available information under reasonable assumptions; and
• assessing the sensitivity of the results to departures from these assumptions.
In particular, following Rubin's taxonomy of missing data mechanisms (Little & Rubin, 2002), methods valid under a missing-at-random (MAR) assumption (i.e., when, given the observed data, missingness does not depend on the unseen values) appear more plausible than the more restrictive assumption of missing completely at random, where missingness is assumed to be entirely independent of the variables of interest. Because we cannot exclude the possibility that the missingness may depend on unobserved values (missing not at random [MNAR]), an assessment of the robustness of the conclusions to alternative missing data assumptions should also be undertaken.
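The practical difference between the three mechanisms can be illustrated with a small simulation. The sketch below is illustrative only and is not part of the review: the data-generating model, the logistic missingness functions, the function name `simulate`, and all parameter values are assumptions chosen for the example. It compares the complete-case mean with a regression estimate that adjusts for a fully observed baseline covariate.

```python
import math
import random
from statistics import mean


def simulate(mechanism, n=20000, seed=1):
    """Simulate an outcome y with a fully observed baseline covariate x.

    All modelling choices here are hypothetical and for illustration only.
    Returns the full-data mean, the complete-case mean, and a
    regression-adjusted mean based on complete cases.
    """
    rng = random.Random(seed)
    xs, ys, observed = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = 0.5 + x + rng.gauss(0, 1)
        if mechanism == "MCAR":
            p_miss = 0.4                              # unrelated to x or y
        elif mechanism == "MAR":
            p_miss = 1 / (1 + math.exp(-x))           # depends on observed x only
        elif mechanism == "MNAR":
            p_miss = 1 / (1 + math.exp(-(y - 0.5)))   # depends on y itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        xs.append(x)
        ys.append(y)
        observed.append(rng.random() >= p_miss)

    x_cc = [x for x, seen in zip(xs, observed) if seen]
    y_cc = [y for y, seen in zip(ys, observed) if seen]

    # Least-squares fit of y on x among complete cases, then predict the
    # outcome mean at the covariate mean of all participants.
    mx, my = mean(x_cc), mean(y_cc)
    slope = (sum((x - mx) * (y - my) for x, y in zip(x_cc, y_cc))
             / sum((x - mx) ** 2 for x in x_cc))
    adjusted = my + slope * (mean(xs) - mx)

    return {"full": mean(ys), "complete_case": my, "adjusted": adjusted}


if __name__ == "__main__":
    for mechanism in ("MCAR", "MAR", "MNAR"):
        result = simulate(mechanism)
        print(mechanism, {k: round(v, 3) for k, v in result.items()})
```

In this setup, the unadjusted complete-case mean should be close to the full-data mean only under MCAR, the covariate-adjusted estimate should also recover it under MAR, and neither should do so under MNAR, which is why the observed data alone cannot rule out an MNAR mechanism.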
Noble and colleagues (Noble, Hollingworth, & Tilling, 2012) have previously reviewed how missing resource use data were addressed in trial-based CEA. They found that practice fell markedly short of recommendations in several aspects. In particular, reporting was usually poor, and complete-case analysis was the most common approach. However, missing data research is a rapidly evolving area, and several of the key guidelines were published after that review. We therefore aimed to review how missing cost-effectiveness data were addressed in recent trial-based CEA.
We reviewed studies published in the National Institute for Health Research Health Technology Assessment (HTA) journal, as it provides an ideal source for assessing whether recommendations have permeated CEA practice. These reports give substantially more information than a typical medical journal article, allowing authors the space to clearly describe the issues raised by missing data in their study and the methods they used to address these. Our primary objectives were to determine the extent of missing data, how these were addressed in the analysis, and whether sensitivity analyses to different missing data assumptions were performed. We also provide a critical review of our findings and recommendations to improve practice.

| METHODS
The PubMed database was used to identify all trial-based CEA published in HTA between January 1, 2013, and December 31, 2015. We combined search terms such as "randomised," "trial," "cost," or "economic" to capture relevant articles (see Appendix A.1 for details of the search strategy). The full reports of these articles were downloaded and then screened for eligibility, excluding all studies that were pilot or feasibility studies; reported costs and effects separately (e.g., cost-consequence analysis); or did not report a within-trial CEA.
For each included study, we extracted key information about the study and the analysis to answer our primary research questions. A detailed definition of each indicator extracted is provided in Appendix B. In a second stage, we drew on published guidelines and our experience to derive a list of recommendations to address missing data, and then rereviewed the studies to assess the extent to which they followed these recommendations (see Appendix B for further details).
Data analysis was conducted with Stata version 15 (StataCorp, 2017). The data from this review are available on request (Leurent, Gomes, & Carpenter, 2017).

| Included studies
Sixty-five articles were identified in our search (Figure 1), and 52 eligible studies were included in the review (listed in Appendix A.2). The median time frame for the CEA was over 12 months, and the majority of trials (71%, n = 37) conducted a follow-up with repeated assessments over time (median of 2; Table 1). The most common effectiveness measure was the quality-adjusted life year (81%, n = 42). Other outcomes included score on clinical measures, or dichotomous outcomes such as "smoking status".

| Extent of missing data
Missing data were an issue in almost all studies, with only five studies (10%) having less than 5% of participants with missing data. The median proportion of complete cases was 63% (interquartile range, 47-81%; Figure 2). Missing data arose mostly from patient-reported (e.g., resource use and quality of life) questionnaires. The extent of missing data was generally similar for cost and effectiveness data, but 10 (19%) studies had more missing data in the latter (Table 1). The proportion of complete cases decreased as the number of follow-up assessments increased (Spearman's rank correlation coefficient ρ = −0.59, p < .001) and as the study duration increased (ρ = −0.29, p = .04).
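For readers who wish to reproduce this type of association on their own data, Spearman's coefficient is the Pearson correlation of the ranks, with tied values given their average rank. The sketch below is a minimal standard-library implementation; the function names and the example values are hypothetical and are not the data from this review (the original analysis was run in Stata).

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical studies: number of follow-ups and % of complete cases.
    follow_ups = [1, 2, 2, 3, 4, 6]
    complete_pct = [92, 80, 75, 66, 58, 41]
    print(round(spearman_rho(follow_ups, complete_pct), 3))
```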

| Approach to missing data
In the remaining assessments, we excluded the five studies with over 95% of complete cases. Three main approaches to missing data were used: complete-case analysis (CCA; Faria et al., 2014), reported in 66% of studies (n = 31); multiple imputation (MI; Rubin, 1987; 49%, n = 23); and ad hoc hybrid methods (17%, n = 8). For the primary analysis, CCA was the most commonly used method (43%, n = 20), followed by MI (30%, n = 14; Table 2). MI was more common when the proportion of missing data was high and when there were multiple follow-up assessments (see Table 3).

| Sensitivity analyses
Over half of the studies (53%, n = 25) did not conduct any sensitivity analysis around missing data, with 21% (n = 10) reporting CCA results alone and 11% (n = 5) MI results under MAR alone (Table 4). The remaining studies (n = 22, 47%) assessed the sensitivity of their primary analysis results to other approaches for the missing data. This usually meant running MI under MAR when CCA was the primary analysis, or CCA when MI was the primary analysis. Other sensitivity analyses included using last observation carried forward or regression imputation.
Only two studies (4%) conducted sensitivity analyses assuming data could be MNAR. In both studies, values imputed under a standard MI were modified to incorporate possible departures from the MAR assumption for both the cost and effectiveness data using a simplified pattern-mixture model approach (Faria et al., 2014; Leurent et al., 2018). The studies then discussed the plausibility of these departures from MAR and their implications for the cost-effectiveness inferences.
Table 5 reports the number of studies that reported evidence of following the recommendations from Figure 3 (see Section 4). Most studies reported being aware of the risk of missing data, for example, by taking active steps to reduce them (n = 35, 74%). In addition, almost two-thirds of the studies (n = 29, 62%) reported the breakdown of missing data by arm, time point, and endpoint. Only about one-third of the studies clearly reported the reasons for the missing data (n = 16, 34%) and the approach used for handling the missing data and its underlying assumptions (n = 17, 36%). Only one study (2%) appropriately discussed the implications of missing data in their cost-effectiveness conclusions.

For the five studies with less than 5% of incomplete cases, four used CCA and one an ad hoc hybrid method for their primary analysis. One of the five studies conducted a sensitivity analysis to missing data.
c Excluding 12 studies where this was unclear (n = 35).
Note. % = row percentages; CCA = complete-case analysis; MAR = assuming data missing at random; MI = multiple imputation; MNAR = assuming data missing not at random. Total may be more than 100% as some studies conducted more than one sensitivity analysis.
a Other methods used for sensitivity analysis include last observation carried forward (n = 1), regression imputation (n = 1), adjusting for baseline predictors of missingness (n = 1), imputing by average of observed values for that patient (n = 1), and an ad hoc hybrid method using multiple and mean imputation (n = 1).

| Summary of findings
Missing data remain ubiquitous in trial-based CEA. The median proportion of participants with complete cost-effectiveness data was only 63%. This reflects the typical challenges faced by CEA of randomised controlled trials, which often rely on patient questionnaires to collect key resource use and health outcome data. Despite best efforts to ensure completeness, a significant proportion of nonresponse is likely. This is consistent with other reviews, which also found no reduction of the extent of missing data in trials over time (Bell et al., 2014).
CCA remains the most commonly used approach for handling missing data in trial-based CEA, in contrast to recommendations. This approach makes the restrictive assumption that, given the variables in the analysis model, the distributions of the outcome data are the same, whether or not those outcome data are observed. This approach is also problematic because it can result in a loss in precision, as it discards participants who have partially complete data postrandomisation and who can provide important information to the analysis. Other unsatisfactory approaches based on unrealistic assumptions, such as last observation carried forward and single imputation, are also occasionally used.
MI (Rubin, 1987) assuming MAR has been widely recommended for CEA (Briggs et al., 2003; Burton et al., 2007; Faria et al., 2014; Marshall et al., 2009), allowing for baseline variables and postrandomisation data not in the primary analysis to be used for the imputation. It seems to be now more commonly used, with around half of the studies using MI for at least one of their analyses (up to 74% in 2015). Around one-third of the studies used MI for their primary CEA, which is higher than seen in primary clinical outcome analyses (8%; Bell et al., 2014).
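In MI, the analysis is run on each of the m imputed datasets and the results are combined with Rubin's rules: the pooled estimate is the average of the m estimates, and its variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The sketch below shows only this pooling step. The function name `pool_rubin` and the example numbers (hypothetical incremental net benefit estimates) are assumptions made for illustration, and the imputation model itself is not shown.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and variances with Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations of equal length")
    pooled = mean(estimates)
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Hypothetical incremental net benefit estimates from m = 5 imputations.
    inb = [410.0, 385.0, 450.0, 402.0, 431.0]
    se_squared = [150.0 ** 2, 148.0 ** 2, 155.0 ** 2, 151.0 ** 2, 149.0 ** 2]
    estimate, total_var = pool_rubin(inb, se_squared)
    print(round(estimate, 1), round(total_var ** 0.5, 1))
```

The between-imputation term is what carries the extra uncertainty due to the missing data, which single imputation ignores.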
On the other hand, sensitivity analyses to missing data remain clearly insufficient. Only two studies (4%) conducted comprehensive sensitivity analyses and assessed whether the study's conclusions were sensitive to departures from the MAR assumption (i.e., possible MNAR mechanisms). Half of the studies did not conduct any sensitivity analysis regarding the missing data. The remaining studies performed some sort of sensitivity analysis, but these usually consisted of simple variations from the primary analysis, such as reporting CCA results in addition to MI. This may be more for completeness than a proper missing data sensitivity analysis. For example, if MI is used for the primary analysis (having assumed that MAR is the realistic primary missing data assumption), a sensitivity analysis that involves CCA will make stronger missing data assumptions rather than relax them.

| Strengths and limitations
Our review follows naturally from the review of Noble et al. (2012) and gives an update of the state of play after the publication of several key guidelines. Our review, however, differs in scope and methods and cannot be directly compared with the results of Noble et al. One of the key strengths of this review is that HTA comprehensive reports allowed us to obtain a more complete picture of the missing data and the methods used to tackle it. HTA monographs are published alongside more succinct peer-reviewed papers in specialist medical journals, and they are often seen as the "gold-standard" for trial-based CEA in the UK. It seems therefore reasonable to assume that these are representative of typical practice in CEA. This review is, to our knowledge, the first to look at completeness of both cost and effectiveness data. A limitation is the use of a single-indicator "proportion of complete cases" to capture the extent of the missing data issue. This is however a clearly defined indicator and allows comparison with other reviews. The "recommendations indicators" also focused on the information reported in the study, not necessarily what might have been done in practice.

| Recommendations
A list of recommendations to address missing data in trial-based CEA is presented in Figure 3. Trial-based CEA are prone to missing data, and it is important that analysts take active steps at the design and data-collection stages to limit their extent (Bernhard et al., 2006; Brueton et al., 2013; National Research Council, 2010). Resource use questionnaires should be designed in a user-friendly way, and their completion encouraged during follow-up visits, possibly supported by a researcher (Mercieca-Bebber et al., 2016; National Research Council, 2010). Alternative sources should also be considered to minimise missing information, for example, administrative data or electronic health records (Franklin & Thorn, 2018; Noble et al., 2012).
For any study with missing data, clear reporting of the issue is required. Ideally, the study should report details of the pattern of missing data (Faria et al., 2014), possibly as an appendix. At a minimum, CEA studies should report for each analysis the number of participants included by trial arm, as recommended in the Consolidated Standards of Reporting Trials guidelines (Noble et al., 2012; Schulz et al., 2010).
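As an illustration of this minimum reporting, the number and proportion of complete cases by trial arm can be tabulated directly from participant-level data. The sketch below is one possible way to do so; the record layout, the field names (`arm`, `cost`, `qaly`), and the use of `None` to mark a missing value are assumptions made for the example.

```python
from collections import defaultdict


def complete_cases_by_arm(records, fields):
    """Count participants and complete cases per arm.

    records: list of dicts, each with an "arm" key (hypothetical layout)
    fields: names of the variables that must all be observed
    A value of None (or an absent key) is treated as missing.
    """
    counts = defaultdict(lambda: {"n": 0, "complete": 0})
    for record in records:
        arm = counts[record["arm"]]
        arm["n"] += 1
        if all(record.get(field) is not None for field in fields):
            arm["complete"] += 1
    return {
        name: {**c, "proportion": c["complete"] / c["n"]}
        for name, c in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical participant-level data.
    data = [
        {"arm": "control", "cost": 1200.0, "qaly": 0.71},
        {"arm": "control", "cost": None, "qaly": 0.65},
        {"arm": "intervention", "cost": 1500.0, "qaly": 0.80},
        {"arm": "intervention", "cost": 1350.0, "qaly": None},
        {"arm": "intervention", "cost": 1420.0, "qaly": 0.77},
    ]
    for arm, summary in complete_cases_by_arm(data, ["cost", "qaly"]).items():
        print(arm, summary)
```

The same tabulation can be repeated by time point and by endpoint to give the fuller breakdown of the missing data pattern.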
Although CCA may be justifiable in some circumstances, the choice of CCA for the primary analysis approach appears difficult to justify in the presence of repeated measurements, because the loss of power (by discarding all patients with any missing values) across the different time points tends to be large. Other approaches valid under more plausible MAR assumptions and making use of all the observed data, such as MI (Rubin, 1987); likelihood-based repeated measures models (Faria et al., 2014; Verbeke, Fieuws, Molenberghs, & Davidian, 2014); or Bayesian models (Ades et al., 2006), should be considered. In particular, MI has been increasingly used in CEA, and further guidance to support an appropriate use in this context is warranted.
An area with clear room for improvement is the conduct of sensitivity analyses. This review found that many studies used CCA for the primary analysis and MI as a sensitivity analysis, or vice-versa, and concluded that the results were robust to missing data. This is misleading because both of these methods rely on the assumption that the missingness is independent of the unobserved data. Although the MAR assumption provides a sensible starting point, it is not possible to determine the true missing-data mechanism from the observed data. Studies should therefore assess whether their conclusions are sensitive to possible departures from that assumption (National Research Council, 2010; Committee for Medicinal Products for Human Use (CHMP), 2011; Faria et al., 2014). Several approaches have been suggested to conduct analyses under MNAR assumptions. Selection models express how the probability of being missing is related to the value itself. Pattern-mixture models, on the other hand, capture how missing data could differ from the observed (Molenberghs et al., 2014; Ratitch, O'Kelly, & Tosiello, 2013). Pattern-mixture models appear attractive because they frame the departure from MAR in a way that can be more readily understood by clinical experts and decision makers and can be used with standard analysis methods such as MI (Carpenter & Kenward, 2012; Ratitch et al., 2013). MNAR modelling can be challenging, but accessible approaches have also been proposed (Faria et al., 2014; Leurent et al., 2018). Further developments are still needed to use these methods in the CEA context and to provide the analytical tools and practical guidance to implement them in practice.
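One simple way to put a pattern-mixture sensitivity analysis into practice alongside MI is to modify the values imputed under MAR by a chosen scale factor or offset, leave the observed values untouched, and repeat the analysis over a range of plausible departures. The sketch below illustrates the idea on a single imputed dataset. It is not the implementation used in the cited papers, and the function names, parameter names, and example values are hypothetical. In a full analysis, the adjustment would be applied within each imputed dataset before pooling.

```python
from statistics import mean


def apply_mnar_adjustment(values, was_imputed, scale=1.0, offset=0.0):
    """Shift values imputed under MAR to represent an MNAR scenario.

    Observed values are returned unchanged. Imputed values become
    value * scale + offset. scale=1.0 and offset=0.0 reproduce MAR.
    """
    if len(values) != len(was_imputed):
        raise ValueError("values and was_imputed must have the same length")
    return [
        v * scale + offset if imputed else v
        for v, imputed in zip(values, was_imputed)
    ]


def sensitivity_scan(values, was_imputed, scales):
    """Mean outcome under each assumed scale factor for imputed values."""
    return {
        c: mean(apply_mnar_adjustment(values, was_imputed, scale=c))
        for c in scales
    }


if __name__ == "__main__":
    # Hypothetical QALYs from one imputed dataset; True marks imputed values.
    qalys = [0.80, 0.60, 0.70, 0.50]
    imputed = [False, True, False, True]
    # Assume non-responders' QALYs are 0%, 10%, or 20% lower than under MAR.
    for c, m in sensitivity_scan(qalys, imputed, [1.0, 0.9, 0.8]).items():
        print(c, round(m, 3))
```

The range of scale factors or offsets is a judgement to be elicited from clinical experts or decision makers, and the conclusions should be reported across that range rather than at a single value.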

| CONCLUSION
Missing data can be an important source of bias and uncertainty, and it is imperative that this issue is appropriately recognised and addressed to help ensure that CEA studies provide sound evidence for healthcare decision making. Over the last decade, there have been some welcome improvements in handling missing data in trial-based CEA. In particular, more attention has been devoted to assessing the reasons for the missing data and adopting methods (e.g., MI) that can incorporate those in the analysis. However, there is substantial room for improvement. Firstly, more efforts are needed to reduce missing data. Secondly, the extent and patterns of missing data should be more clearly reported. Thirdly, the primary analysis should consider methods that make contextually plausible assumptions rather than resort automatically to CCA. Lastly, sensitivity analyses to assess the robustness of the study's results to potential MNAR mechanisms should be conducted.

CONFLICT OF INTEREST
The authors have no conflict of interest.

A.1 | PubMed search criteria and results
Wiles, N., Thomas, L., Abel, A., Barnes, M., Carroll, F., Ridgway, N., … Lewis, G. (2014). Clinical effectiveness and cost-effectiveness of cognitive behavioural therapy as an adjunct to pharmacotherapy for treatment-resistant depression in primary care: The CoBalT randomised controlled trial. Health Technology Assessment, 18(31).

Because these aspects could have been mentioned in multiple parts of the monograph, we used a systematic approach, looking for keywords and checking the most relevant paragraphs in the full report.

B.2.2 | Answers
"Yes": The recommendation was clearly mentioned, and the criteria therefore met.
"No": The recommendation was not clearly mentioned or found. The recommendation may still have been followed but not reported (or at least not found with the above strategy).
"Unclear": There was some suggestion that the criteria may have been met but not enough information to be sure.

Comment on why data are missing (e.g., "because patients were too ill"), or explore baseline factors associated with missingness.
No mention of reasons for MD in the CE section.
Have to be specific to the CE missing data, or clearly mentioning something like "reasons for MD are discussed in clinical analysis section …"

D3. Describe methods used, and underlying missing data assumptions
Clearly state the method used to address missing data, AND the underlying assumption.
No report of missing data assumption or method used

Draw overall conclusion in light of the different results and the plausibility of the respective assumptions
Conduct sensitivity analyses, and interpret results appropriately.
Did MNAR SA and appropriate conclusion.
- Did not conduct sensitivity analyses
- Conducted sensitivity analyses, but no comment/conclusion
- Did MI and CC and only say "results did not change/robust to missing data"