Keywords:

  • missing data;
  • LOCF;
  • MMRM;
  • multiple imputation

Abstract


The Points to Consider Document on Missing Data was adopted by the Committee for Medicinal Products for Human Use (CHMP) in December 2001. In September 2007 the CHMP issued a recommendation to review the document, with particular emphasis on summarizing and critically appraising the pattern of drop-outs, explaining the role and limitations of the ‘last observation carried forward’ method and describing the CHMP's cautionary stance on the use of mixed models.

In preparation for the release of the updated guidance document, PSI (Statisticians in the Pharmaceutical Industry) held a one-day expert group meeting in September 2008. Topics that were debated included minimizing the extent of missing data and understanding the missing data mechanism, defining the principles for handling missing data and understanding the assumptions underlying different analysis methods.

A clear message from the meeting was that, at present, biostatisticians tend only to react to missing data, and limited proactive planning is undertaken when designing clinical trials. Missing data mechanisms for a trial need to be considered during the planning phase and their impact on the objectives assessed. Another area for improvement is the use of plots to understand the pattern of missing data observed during a trial, and thus the missing data mechanism; for example, Kaplan–Meier curves of time to withdrawal. Copyright © 2009 John Wiley & Sons, Ltd.


1. BACKGROUND


The Points to Consider Document on Missing Data was adopted by the Committee for Medicinal Products for Human Use (CHMP) in December 2001 [1]. In September 2007 the CHMP issued a recommendation to review the document [2], with particular emphasis on the following:

  1. Summarizing and critically appraising the pattern of drop-outs.
  2. Use of sensitivity analysis, or the justification for its absence.
  3. Explaining the role and limitations of the ‘last observation carried forward’ (LOCF) method.
  4. Describing the CHMP's cautionary stance on the use of mixed models.

In preparation for the release of the updated guidance document, PSI (Statisticians in the Pharmaceutical Industry), a professional association, held a one-day expert group meeting in September 2008. A list of the meeting attendees and their affiliations is given in Appendix A. Topics that were debated included the following:

  1. Minimizing the extent of missing data and understanding the missing data mechanism.
  2. Defining the principles for handling missing data.
  3. Understanding the assumptions underlying different analysis methods.

The statistical techniques developed for handling missing data usually assume that the missing data mechanism can be one of the following:

  1. Missing completely at random (MCAR).
  2. Missing at random (MAR).
  3. Missing not at random (MNAR).

Definitions for each of these terms are provided in Table I.

Table I. MCAR, MAR and MNAR definitions.

Missing completely at random (MCAR): The missing value mechanism is unrelated to the observed or unobserved responses, or to other measurements such as baseline values and treatment group. In particular, the probability that an observation is missed does not depend on how big or small it would have been if observed, or on the size of the previous or subsequent observations on the same or any subject. Under MCAR any method of analysis that would have been valid for the complete data, such as ANCOVA, remains valid for the observed data.

Missing at random (MAR): The missing value mechanism may be dependent on observed measurements, including responses, but given these measurements, there is no remaining dependence on unobserved responses. The concept of MAR is most simply explained in the context of patient dropout in a longitudinal study. Suppose that two patients share the same treatment and covariates, and exactly the same response measurements up to the point at which one drops out and the other remains. Then the missing data from the subject who drops out are MAR if they have the same statistical behaviour as the observations from the subject who remains. Under MAR a valid analysis can be constructed that does not require knowledge of the specific form of the missing value mechanism.

Missing not at random (MNAR): Even after accounting for observed measurements, there remains dependence between the missing value mechanism and the unobserved responses. Under MNAR a valid analysis does require knowledge of the specific form of the missing value mechanism, but in practice we will almost never know this mechanism.
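
To make these definitions concrete, the short sketch below (an illustration, not part of the original discussion) simulates a hypothetical two-visit trial and applies each of the three dropout mechanisms to the visit 2 response; all variable names are invented for the example.

import numpy as np

rng = np.random.default_rng(2008)
n = 10_000

# Simple two-visit trial: y1 observed at visit 1, y2 at visit 2.
treat = rng.integers(0, 2, n)                    # 0 = control, 1 = active
y1 = 10 + rng.normal(0, 2, n)                    # visit 1 response
y2 = y1 + 1.5 * treat + rng.normal(0, 2, n)      # visit 2 response

def apply_dropout(prob):
    """Return a copy of y2 with values set to NaN where dropout occurs."""
    y2_obs = y2.copy()
    y2_obs[rng.random(n) < prob] = np.nan
    return y2_obs

# MCAR: dropout probability is constant, unrelated to any data.
y2_mcar = apply_dropout(np.full(n, 0.3))

# MAR: dropout probability depends only on observed data (y1 and treatment).
p_mar = 1.0 / (1.0 + np.exp(-(-1.5 + 0.25 * (y1 - 10) + 0.5 * treat)))
y2_mar = apply_dropout(p_mar)

# MNAR: dropout probability depends on the unobserved visit 2 value itself.
p_mnar = 1.0 / (1.0 + np.exp(-(-1.5 + 0.4 * (y2 - 10))))
y2_mnar = apply_dropout(p_mnar)

for label, y in [("MCAR", y2_mcar), ("MAR", y2_mar), ("MNAR", y2_mnar)]:
    # The complete-case mean of y2 is unbiased under MCAR but biased under
    # MAR and MNAR relative to the full-data mean.
    print(label, round(float(np.nanmean(y)), 2), "vs full data", round(float(y2.mean()), 2))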

The remainder of this paper summarizes the questions raised, the resulting discussions and the consensuses reached. It should be noted that some of the issues covered here were not discussed at the meeting itself; the meeting acted as the catalyst, and issues debated after September 2008, together with their subsequent outcomes, have also been included in this paper. After a brief review of the issues associated with each topic, the major questions raised are listed, immediately followed by a summary of the discussion and agreements. The context of the discussion is largely that of longitudinal clinical trials with dropouts or withdrawals. However, many of the points raised are applicable to other situations.

2. MINIMIZING MISSING DATA AND UNDERSTANDING THE MISSING DATA MECHANISM


The CHMP Points to Consider document on Missing Data states in Section 3: ‘In the design and conduct of a clinical trial all efforts should be directed towards minimizing the amount of missing data likely to occur’. The expert group discussed what proactive steps could be undertaken by trialists to minimize the amount of missing data in clinical trials, and how best to understand the pattern of missing data observed during a trial and thus the missing data mechanism.

In the context of this paper, the term ‘pattern’ is used to cover a multitude of issues relating to missing data, including, for example:

  1. Timing of discontinuations.
  2. Differential timing of discontinuation by treatment group.
  3. Reasons for discontinuation.
  4. Differential reasons for discontinuation by treatment group, by time, or by treatment and time.
  5. How baseline and/or post-baseline characteristics of those who discontinue differ from those who complete a trial.

Trialists often investigate the observed patterns of missing data to provide information relating to the missing data mechanism.

Q1. What practical steps can be taken to avoid the presence of missing data in (a) short-term and (b) long-term clinical trials?

Following the ICH guideline on Statistical Principles for Clinical Trials, ICH E9 [3], the expert group acknowledged that missing values represent a potential source of bias and that every effort should be undertaken to plan the study so that the amount of missing data is minimized. However, there was consensus that there will almost always be some missing data. It was agreed that the principles for minimizing the amount of missing data do not depend on the length of the trial. Suggestions to minimize the amount of missing data included the following:

  1. Design the study and write the protocol so that key data are clearly identified.
  2. The protocol should proactively plan for missing data; for example, it should unambiguously state the objectives of the study, the patient population of interest and how missing data may impact any inferences to be made. To illustrate the issues, a nephrology trial was considered in which haemoglobin data are collected weekly for 24 weeks. In such trials it is expected that about 30% of patients will withdraw from follow-up. Reasons for withdrawal include death, kidney transplantation, adverse events, loss to follow-up, etc. In most cases simply extending the trial or increasing the sample size will not adequately address missing data (see the sketch after this list). Protocols and statistical analysis plans rarely discuss the expected patterns of missing data, or consider the impact of the potential patterns on the overall scientific validity of the trial. Statisticians should proactively plan for various missing data mechanisms when determining the sample size, using existing knowledge of the disease and compound under investigation, and the likely impact on the overall inferences to be drawn.
  3. Consider a two-step withdrawal process for patients: withdrawal of consent for treatment and withdrawal of consent for observation. Once a patient has withdrawn consent for treatment, only assessments needed to address key efficacy and safety questions of interest should be undertaken. In addition, the protocol should clearly state how the collection of follow-up data would help address the key scientific questions of interest. It was acknowledged that in some disease areas (e.g. pain control, diabetes) it might be a challenge to explain the value of continuing to observe patients while they are not being treated with the study medication. In other disease areas (e.g. oncology) such practices are already standard. Broadly speaking, at present it seems that the collection of follow-up data is often undertaken for drugs that modify disease progression, but not for symptomatic treatments. It was also recognized that switching of treatments can be an issue when continuing to monitor patients after withdrawal of treatment; switching can result in confounding of treatment effects, which may be difficult to interpret or of limited value for short-acting treatments or subjective responses. Nonetheless, if data are collected after withdrawal with the aim of improving compliance, it was suggested that the amount of such data could be reduced; for example, only collect data relating to the primary endpoint and adverse events.
  4. In any clinical trial there will always be ‘necessary’ and ‘unnecessary’ discontinuations. For ethical reasons, a trial must always be designed to permit ‘necessary’ discontinuations, such as allowing a patient to discontinue due to lack of efficacy or an adverse event. These outcomes in themselves are often useful when assessing a treatment's effectiveness and safety. However, ‘unnecessary’ discontinuations, such as loss to follow-up, which do not clearly map to adverse events or lack of efficacy, can be reduced by tighter clinical protocols; that is, tighter control of the patient population through stricter inclusion or exclusion criteria, for example selecting patients who are more likely to complete the study. The disadvantage of this approach is that it can reduce the generalizability of the trial findings. It is worth noting, though, that the occurrence of missing data in any case influences the generalizability of the results obtained from the observed data. In summary, although a tall order, one important way to minimize missing data is to select a patient population that minimizes discontinuations for ‘unnecessary’ reasons, that is, those that are not causally related to the pharmacodynamic effects of the drug, while still enrolling a representative population.

Another suggestion that was made after the meeting was to reduce the amount of data being collected in individual trials and simplify case report forms (CRFs). If only key relevant data are collected, then the chance of data being captured reliably will increase, hence reducing the amount of missing data.

Q2. What methods do you think should be routinely employed to understand the nature of missing data?

To understand the nature of missing data it is important that the relevant information is collected. In a large number of clinical trials sponsored by the pharmaceutical industry, standard withdrawal or discontinuation CRFs are employed. These contain prescribed standard lists of reasons for withdrawal, such as adverse event, lack of efficacy, lost to follow up, etc. The group felt that statisticians often do not give enough thought to the customization of these CRFs for the disease under consideration or the study objectives; for example, how often are disease- or study-specific reasons included? To illustrate the point, consider ‘lost to follow up’ in oncology trials. What does this actually mean? Should study-specific reasons be provided to better understand what happens to these patients? The understanding of patient withdrawal patterns, and the associated missing data mechanism, starts with the collection of relevant information.

The expert group also felt that during the planning phase of a clinical trial it is important to identify potential predictors of missing data, both to facilitate the collection of relevant data and for potential inclusion in the analysis. For example, consider an asthma clinical trial. In such trials FEV1 is often used as the primary endpoint. It is widely recognized that ‘asthma exacerbations’ may also be an important endpoint. In fact, when such events occur a patient may visit their health care professional, who in turn may advise the patient to withdraw from the trial. Consequently, when designing asthma trials it may be important to collect data on ‘asthma exacerbations’. It is, however, important to make the distinction between potential predictors of missing data and scenarios in which drop-out rates simply differ between the treatment groups. The latter case is an example of MAR: given that treatment is always included in the model, the statistical analysis will not be biased.

The importance of eliminating practices that artificially increase the number of discontinuations when designing clinical trials was also discussed. For example, should patients be discontinued for protocol violations or for lack of compliance? The answer really depends on the precise question of interest. However, an alternative to discontinuing such patients is to permit them to stay in the study but flag their data as non-compliant. This is akin to having follow-up data that can be used in some analyses to test some hypotheses and excluded from other analyses. A specific example of an artificial increase in missing data is the use of electronic diaries for daily pain. If patients do not enter their data by the end of the day, should the device not permit them to enter the data? Alternatively, should entry be allowed but the data flagged as outside the desired time window?

Another area where the expert group felt improvements could be made was for trialists to start thinking earlier in the process about the mechanisms that cause missing data. As outlined in ICH E9, drug development spans many years and comprises an ordered program of clinical trials each with their own specific objectives. Little effort is made to understand missing data in the earlier phases of drug development. Sponsors tend to start considering the impact of missing data during late Phase II and Phase III, the pivotal clinical trials, when such issues can affect the approval of the final package by the regulatory authorities. Missing data mechanisms need to be considered when making go/no-go decisions at the end of Phase I and early Phase II. In addition the impact of missing data on later phase study design should be considered. It is important to note that the use of Phase I data can be problematic. Phase I studies often involve healthy volunteers, which may not provide useful information about what to expect in patients. However, in some therapeutic areas, such as oncology, when Phase I studies are focused on the population of interest, an insight into the missing data mechanism may be gained.

Q3. What are the relative merits of the following exploratory analyses?

  • Plotting and inspecting the raw data.
  • Analysis by pattern of missing data (drop-out cohort).
  • Logistic regression of drop-out on earlier data.

There was consensus that graphical display is one of the most important tools available to statisticians when trying to understand the causes of missing data. Although analytical methods exist for exploring missing data, a large amount of information can be ascertained by simply plotting the data: for example,

  1. Kaplan–Meier plots to look at time to withdrawal, both overall and for specific reasons.
  2. Plots of treatment means against time for cohorts of subjects with similar follow-up times.
  3. Plots of treatment means against time for those who drop out at each visit, together with the corresponding means against time for those who continue.

The latter two types of plot should use the same time scale for ease of comparison. The key to success is thinking through the question of interest and plotting the data intelligently.
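
As an illustration of the first suggestion, the sketch below plots Kaplan–Meier curves of time to withdrawal by treatment group; the data frame, its columns (treatment, weeks_on_study, withdrew) and the values are all hypothetical, and the lifelines package is used here only for convenience.

import pandas as pd
import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter

# Hypothetical analysis dataset: one row per patient.
# weeks_on_study = last week on study; withdrew = 1 if withdrawn early, 0 if completed.
df = pd.DataFrame({
    "treatment":      ["active"] * 5 + ["control"] * 5,
    "weeks_on_study": [24, 8, 24, 16, 4, 24, 24, 12, 24, 6],
    "withdrew":       [0, 1, 0, 1, 1, 0, 0, 1, 0, 1],
})

ax = plt.subplot(111)
kmf = KaplanMeierFitter()
for arm, grp in df.groupby("treatment"):
    # "Survival" here is remaining on study; withdrawal is the event of interest.
    kmf.fit(grp["weeks_on_study"], event_observed=grp["withdrew"], label=arm)
    kmf.plot_survival_function(ax=ax)

ax.set_xlabel("Weeks since randomization")
ax.set_ylabel("Proportion remaining on study")
plt.show()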

It is often useful to complement such graphs with logistic regression to explore predictors of dropout. This is especially useful in identifying key predictors from a set of candidates. Such regressions can rule out MCAR in favour of an MAR mechanism. However, no definitive statement about the exact missing data mechanism can be made: even if, from the observed data, the mechanism appears to be MCAR, the data may yet be MNAR. In other words, the direct cause of the dropouts may always be unobserved. This is discussed in more detail in Carpenter and Kenward [4].
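
A minimal sketch of such a logistic regression is given below, assuming a patient-level data frame with hypothetical columns dropped_out, baseline, week4_change and treatment read from an assumed file; statsmodels is used here, but any logistic regression routine would do.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical patient-level dataset: dropped_out = 1 if the patient withdrew
# before the final visit; the covariates are values observed before dropout.
df = pd.read_csv("trial_dropout.csv")   # assumed file with the columns below

model = smf.logit(
    "dropped_out ~ baseline + week4_change + C(treatment)",
    data=df,
).fit()
print(model.summary())

# Strong predictors among the observed data argue against MCAR in favour of MAR;
# they can never rule out MNAR, since the direct cause of dropout may be unobserved.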

Q4. How would the approach differ if the missing values were in safety data as opposed to efficacy data?

The expert group agreed that the principles for minimizing and understanding missing data should not change for safety data, but the challenges may be very different. For example, in Phase III there are often a small number of compound-specific adverse events of interest. During the design phase careful consideration needs to be given to how information about such events will be collected, and to the impact of missing data on the inferences to be drawn. It was agreed that there is a need for more than simple summary tables of adverse event incidence rates in clinical study reports; increased use of graphical displays and more in-depth analyses are required. Any interpretation should be linked to the risk management plan [5].

Although the expert group agreed that the principles for minimizing and understanding missing data should not change between efficacy and safety data, it is interesting to note that the principles for handling missing data in the analysis do seem to differ from a regulatory perspective. Regulators often seek conservative efficacy analyses, but seldom see the value in conservative methods that underestimate treatment effects for safety data.

3. DEFINING THE PRINCIPLES FOR HANDLING MISSING DATA


As discussed in the CHMP Points to Consider document on Missing Data, if missing values are handled by simply excluding any patients with missing outcomes from the analysis a large number of issues can arise, which may affect the interpretation of the trial results. The following section summarizes the discussions at the expert group meeting relating to the principles that should be applied when handling missing data.

Q5. Regulators have stated on numerous occasions that missing data from patients who drop out are different from other types of missing data. What are the principles for handling different types of missing data?

The expert group agreed that the key issue when handling any missing data is understanding the mechanism causing the missing data. The proposed method of analysis, and the associated handling of missing data, regardless of whether or not the patient discontinued, must be directly linked to and properly reflect the original objectives of the study, including any assumptions made when designing the trial. Specifically for patients who withdraw, the group felt that the critical question is what information needs to be collected for such patients, as they will occur in every trial. How missing data are handled is an integral part of the description of the primary comparison. The cost of running additional trials to investigate the effect of missing data far outweighs the cost of collecting the appropriate information in the first instance.

Q6. What are the principles for sensitivity analysis in the light of missing data?

The expert group agreed that two important principles exist when considering sensitivity analyses: transparency and relevance of the assumptions. It is important to clearly describe the original assumptions when designing the study so that all stakeholders can assess their relevance. The assumptions underlying any sensitivity analyses should be divergent from the original assumptions. It was agreed that, in contrast, a series of ‘wrong’ analyses does not properly constitute a sensitivity analysis.

Q7. Regulators seem to be favouring a requirement for sponsor companies to monitor patients after withdrawal. How should post-withdrawal data be handled in the statistical analysis?

Some literature has been published relating to the use of data collected after withdrawal [4, 6]. Although the topic is too complex to cover in detail here, the issue of collecting data after withdrawal currently seems to be a critical one from the regulatory perspective, and so it was briefly discussed by the expert group. The group agreed that the issue reinforces the need to clearly define the objectives of the study: if the objectives are defined clearly and precisely, it will become apparent whether collecting data from patients who withdraw is necessary to address the question of concern. If such data are not collected then it may be necessary to describe how any resultant selection biases will be addressed. It was noted that the mechanism for withdrawal might differ between on-treatment and off-treatment periods, which in turn may lead to further technical challenges when incorporating data from patients after withdrawal into the analysis.

4. UNDERSTANDING THE UNDERLYING ASSUMPTIONS OF THE DIFFERENT ANALYSIS METHODS


In recent years a large amount of literature has been published on the merits of the different approaches for handling missing data [7–9]. This final session of the meeting focused on clarifying the assumptions behind the different methods and how they might relate to the objectives of the trial, specifically for a longitudinal clinical trial with dropouts or withdrawals.

Q8. What are the underlying assumptions of the (a) LOCF, (b) mixed model for repeated measures (MMRM) and (c) multiple imputation (MI) methods for handling missing data in a longitudinal clinical trial with dropouts or withdrawals?

LOCF is a single-imputation method. It makes the implicit assumption that a patient would sustain the response seen at an earlier study visit for the entire remaining duration of the trial; this assumption is untestable and potentially unrealistic. Even the strong MCAR assumption does not suffice to guarantee that an LOCF analysis is valid; in fact Kenward and Molenberghs [10] have shown that the assumptions under which LOCF is valid do not fit naturally into the MCAR, MAR and MNAR framework. Further, the uncertainty of the imputation is not taken into account, and so, as discussed by Mallinckrodt et al. [7], the method can result in systematic underestimation of the standard errors.
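
For concreteness, a minimal LOCF imputation in long format is sketched below, assuming a hypothetical data frame with columns subject, visit and response; the single imputed value is then treated as if it had been observed, which is exactly why the imputation uncertainty is ignored.

import numpy as np
import pandas as pd

# Hypothetical long-format data: one row per subject per scheduled visit,
# with NaN after the subject withdraws.
df = pd.DataFrame({
    "subject":  [1, 1, 1, 2, 2, 2],
    "visit":    [1, 2, 3, 1, 2, 3],
    "response": [5.0, 6.0, np.nan, 4.0, np.nan, np.nan],
})

# LOCF: within each subject, carry the last observed response forward.
df = df.sort_values(["subject", "visit"])
df["response_locf"] = df.groupby("subject")["response"].ffill()
print(df)
# The carried-forward values are then analysed as if observed (e.g. ANCOVA at
# the final visit), with no allowance for the uncertainty of the imputation.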

MMRM and MI analyses assume that data are MAR. In an MMRM analysis, information from the observed data is used, via the within-patient correlation structure, to provide information about the unobserved data, but the missing data are not explicitly imputed. An MMRM analysis uses all the available data to provide information about the unobserved data [7]. It estimates the treatment effects assuming that withdrawn patients have the same statistical behaviour as those who continued; that is, MMRM assumes that the data observed up to the point of discontinuation are a valid predictor of the unobserved data. In MI, the imputation step is separate from the modelling step, and so there is additional flexibility to explore different assumptions about the nature of the missing data. If this flexibility is not used then MI may, in some circumstances, give essentially the same results as MMRM and so offer no advantage over that method. Further details on each of the above methods are provided in Mallinckrodt et al. [7].
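
The sketch below is a rough Python approximation of an MMRM-style analysis, assuming a hypothetical long-format data frame with columns change, baseline, treatment, visit and subject read from an assumed file. For simplicity a random intercept per subject is used as a stand-in; a full MMRM with an unstructured within-patient covariance matrix, as recommended later in Table II, would more usually be fitted with SAS PROC MIXED or the R packages nlme/mmrm.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format efficacy data: one row per subject per visit,
# with rows after withdrawal simply absent (no explicit imputation).
df = pd.read_csv("efficacy_long.csv")   # assumed columns: subject, visit, treatment, baseline, change

# Treatment-by-visit interaction, with visit fitted as a class (categorical)
# variable and baseline crossed with visit.
model = smf.mixedlm(
    "change ~ C(treatment) * C(visit) + baseline:C(visit)",
    data=df,
    groups=df["subject"],   # random intercept per subject (a simplification of MMRM,
)                           # which uses an unstructured residual covariance instead)
result = model.fit(reml=True)
print(result.summary())
# Under MAR, the treatment-by-visit contrasts use all available data, implicitly
# borrowing information about withdrawn patients from subjects with similar
# observed histories.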

It was acknowledged that if the underlying mechanisms that cause missing data are non-informative the resulting impact on the statistical analysis is far easier to handle, compared with informative missingness. The data being analysed, however, cannot provide evidence to distinguish between these two situations.

Q9. When might the assumptions for each of the methods be considered valid?

Table II outlines when it might be appropriate or inappropriate to use LOCF, MMRM or MI techniques in a longitudinal clinical trial with dropouts or withdrawals. In such studies it is important to recognize that MMRM and MI, in their most basic forms, both assume a multivariate normal distribution when providing information about the missing data, and invalid inferences can be drawn when this assumption is not met. There are, however, generalizations and modifications of these approaches which, while based on the same basic principles, are valid under other distributional assumptions.

Table II. Summary of when LOCF, MMRM and MI should be considered for a longitudinal clinical trial with a continuous response variable.

LOCF
  Situations where the technique can be considered:
  • Stable disease following the first post-treatment observation.
  • Short-term trials.
  Situations where the technique should not be considered:
  • Diseases with marked improvement or deterioration over time (e.g. Alzheimer's disease).
  • Relapsing or remitting diseases (e.g. generalized anxiety disorder).
  • Diseases involving transient treatment effects.
  • When MMRM is used, since LOCF nullifies the repeated-measures aspects of the technique.

MMRM
  Situations where the technique can be considered:
  • Trials where the objective is to make inferences about treatment effects if patients stayed on treatment, but where no post-withdrawal data have been collected. An unstructured covariance matrix should always be employed; time should always be fitted as a class variable; and the baseline response should nearly always be crossed with time*.
  Situations where the technique should not be considered:
  • When withdrawn patients do not mimic patients who continue in the study, given the same background history.
  • Trials where the objective is to make inferences about treatment effects if patients stayed on treatment, but where off-treatment data are included in the analysis.
  • If the multivariate normal assumption does not hold for providing information about the missing data.

MI
  Situations where the technique can be considered:
  • When increased flexibility is required. In MI the imputation part is separated from the modelling part, so extra variables and complexity, such as treatment withdrawals and outcomes, can be incorporated; in particular, post-randomization variables predictive of dropout can be incorporated.
  • When different imputation schemes are needed for different treatment groups.
  Situations where the technique should not be considered:
  • When Monte Carlo simulation is not appropriate.
  • If the multivariate normal assumption does not hold for providing information about the missing data.

*If the baseline response is not crossed with time then an increase (rather than a decrease) in the variability of the treatment comparison can be observed. The reason for this is that the correlation between the baseline score and the outcome variable nearly always decreases with time; that is, the serial correlation decays. If baseline is fitted as a main effect only, then the estimated regression coefficient is averaged across all visits and is larger than the correct baseline regression coefficient for the final time point. This means that the analysis over-corrects for the endpoint of interest, which is typically a comparison at the final visit, so even with no missing data one can obtain an over-corrected estimate of the treatment difference.
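
To illustrate the separation of the imputation and analysis steps that gives MI its flexibility, the sketch below performs a deliberately simple normal-model multiple imputation by hand and pools the results with Rubin's rules; in practice one would use an established implementation such as the R mice package or the MICE routines in statsmodels. The file and column names (baseline, treatment, final_change) and the contrast label are hypothetical.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
M = 20   # number of imputed data sets

def impute_once(df):
    """Draw missing final_change values from a simple normal imputation model
    fitted to the completers, separately by treatment group (so the imputation
    scheme can differ between groups). A fully 'proper' MI would also draw the
    imputation-model parameters from their posterior; this is kept simple."""
    out = df.copy()
    for _, grp in out.groupby("treatment"):
        obs = grp.dropna(subset=["final_change"])
        mis = grp[grp["final_change"].isna()]
        if mis.empty or len(obs) < 3:
            continue
        fit = smf.ols("final_change ~ baseline", data=obs).fit()
        pred = fit.predict(mis)
        sigma = np.sqrt(fit.scale)    # residual SD of the imputation model
        out.loc[mis.index, "final_change"] = pred.to_numpy() + rng.normal(0, sigma, len(mis))
    return out

df = pd.read_csv("final_visit.csv")   # assumed columns: baseline, treatment, final_change (NaN if missing)

estimates, variances = [], []
for _ in range(M):
    completed = impute_once(df)
    fit = smf.ols("final_change ~ baseline + C(treatment)", data=completed).fit()
    term = "C(treatment)[T.active]"   # hypothetical label of the active-vs-control contrast
    estimates.append(fit.params[term])
    variances.append(fit.bse[term] ** 2)

# Rubin's rules: pooled estimate; total variance = within + (1 + 1/M) * between.
q_bar = np.mean(estimates)
within, between = np.mean(variances), np.var(estimates, ddof=1)
total_var = within + (1 + 1 / M) * between
print(f"pooled treatment effect {q_bar:.3f}, SE {np.sqrt(total_var):.3f}")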

One of the main issues when determining how to handle missing data is that the true missing data mechanism will always be unknown and cannot be tested from the data; it is only possible to show that the data are not consistent with the MCAR assumption. No amount of clever modelling can overcome this. If the mechanism for missingness is informative then it may not be possible to fully evaluate the impact of the handling of missing data in the analysis, and this must be carefully considered in the interpretation of the data. Consequently, the key issues are what questions are being answered by the analysis of the trial, and under what assumptions the proposed analysis answers those questions. Doubts about aspects of the assumptions can be addressed through appropriate sensitivity analyses; that is, sensitivity analyses can be used to explore the influence of missing data when doubts exist regarding the missing data mechanism. The use of sensitivity analyses needs to be approached carefully: sensitivity analyses focused on specific assumptions are useful when determining the robustness of the inferences from a clinical trial.
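
Although not prescribed by the discussion above, one widely used sensitivity analysis of this kind is a delta-adjustment 'tipping point' assessment. A minimal sketch is given below, reusing the hypothetical column names and helper functions from the MI sketch above; the impute_mar and analyse arguments are assumed user-supplied functions.

import numpy as np

def delta_tipping_point(df, impute_mar, analyse, deltas):
    """Delta-adjustment sensitivity analysis: impute under MAR, shift the
    imputed (not the observed) values in the active arm by delta towards less
    favourable outcomes (here, subtracting delta), re-analyse and record the
    result. The smallest delta that overturns the trial's conclusion is the
    'tipping point'. In practice the shift would be applied within each of the
    M imputations and the results pooled with Rubin's rules."""
    results = {}
    for delta in deltas:
        completed = impute_mar(df)                 # e.g. impute_once() from the MI sketch
        was_missing = df["final_change"].isna() & (df["treatment"] == "active")
        completed.loc[was_missing, "final_change"] -= delta
        results[delta] = analyse(completed)        # e.g. returns the treatment p-value
    return results

# Hypothetical usage:
# pvalues = delta_tipping_point(df, impute_once, ancova_pvalue, np.arange(0.0, 3.5, 0.5))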

5. CONCLUSION


The Points to Consider Document on Missing Data was adopted by the CHMP in December 2001. Since the issuance of the guidance document there has been increased debate within the statistical community about the merits of the different approaches used to handle missing data, such as LOCF, MMRM and MI. Subsequently, in September 2007, the CHMP issued a recommendation to review the document, with particular emphasis on summarizing and critically appraising the pattern of drop-outs, explaining the role and limitations of LOCF and describing the CHMP's cautionary stance on the use of mixed models. It was clear from the one-day PSI-sponsored expert group meeting that the 2001 guideline places a great deal of emphasis on the merits of the different statistical methods available for handling missing data, and not enough on the principles that should be considered when designing trials.

The expert group also concluded that currently biostatisticians tend to react to missing data; comprehensive, proactive planning is rarely undertaken when designing trials. It is imperative that the precise objectives of the trial are documented and the potential impact of missing data, including the missing data mechanisms that may arise, is thoroughly considered during the planning phase. Sensitivity analyses investigating the robustness of the inferences to the different assumptions made should also be considered. Another identified area for improvement is the use of plots to understand the pattern of missing data observed during a trial, and hence the missing data mechanism; for example, Kaplan–Meier curves of time to withdrawal. Finally, it was concluded that the handling of missing data is a difficult area. If the mechanism for the missing data is non-informative then the issue can be addressed using relatively straightforward statistical techniques. However, if the mechanism is informative then the issues are complex, and appropriate sensitivity analysis is called for.

REFERENCES
