Summary

There are several ways in which one can assess the relationship between an intervention and an outcome. Randomized controlled trials (RCTs) are considered the gold standard for evaluating interventions. However, for many questions of clinical importance, RCTs would be impractical or unethical, and clinicians must rely on observational studies for the best available evidence when RCTs are unavailable. This article provides an overview of observational research designs to facilitate understanding and appraisal of their validity and applicability in clinical practice. Major methodological issues of observational studies, including selection bias and confounding, are also discussed, and strategies to minimize these problems in the design and analytical phases of a study are highlighted. Knowledge of the strengths, weaknesses and recent methodological advances of observational studies can help clinicians make informed decisions about whether a particular observational study provides useful information to enhance patient care.


Review Criteria

Published literature was reviewed to identify the different observational study designs used to assess the relationship between an intervention and an outcome, the major challenges and the strategies to reduce these issues.

Message for the Clinic

Observational studies play a significant role in healthcare, particularly when evidence from randomized controlled trials is unavailable. Clinicians treating patients need to recognize the important role these studies play in medical decision making. It is crucial to carefully assess studies for possible methodological flaws and to make judgments about the validity and applicability of study results to enhance patient care.

Introduction

There are several ways in which one can study the relationship between an intervention, exposure or risk factor and an outcome. Randomized controlled trials (RCTs) are seen as the gold standard for assessing the efficacy and safety of an intervention. One of the most important features of this study design is randomization, which ensures that the groups formed are similar in all respects apart from chance differences. Randomization enhances internal validity by minimizing biases and confounding. The internal validity of a study is the extent to which the observed difference in outcomes between the study groups can be attributed to the intervention rather than to other factors. However, RCTs are resource intensive and focus on the short-term effects of an intervention in relatively small populations. In addition, their generalizability is limited because strict inclusion and exclusion criteria usually under-represent vulnerable patient groups and yield results that apply only to the average patient. Further, RCTs are unavailable for many questions of clinical importance, and clinicians must therefore rely on observational studies for the best available evidence.

Observational studies play a significant role in healthcare, including the study of the use and effects of medicines in large populations, known as ‘pharmacoepidemiology’, a discipline that uses methods similar to traditional epidemiological investigations but focuses on the area of clinical pharmacology (1). Observational studies are useful methods for studying various problems, particularly where an RCT would be unethical or not feasible (2). The main difference between an RCT (experimental design) and an observational study (non-experimental design) is the absence of random allocation of the intervention by the investigator. High-quality observational studies can generate credible evidence of intervention effects. They are more clinically relevant than RCTs (i.e., better generalizability or applicability) and include the patient populations of interest to clinicians. In addition, they are more suitable for detecting rare or latent effects of interventions (2). There is an expanding body of literature using observational designs, partly because observational studies are less resource intensive than RCTs: they often use electronic healthcare data that have already been collected and that have become more widely available in the last decade. As a result, a large body of evidence has accumulated. For example, observational studies highlighted the increased risk of cardiovascular events with rofecoxib (3,4).

Clear reporting is an obligation of researchers to enable readers to understand the essential aspects of the investigation and to critically assess its strengths and weaknesses. The Consolidated Standards of Reporting Trials (CONSORT) statement provides guidance for reporting RCTs and has already been endorsed by prominent medical journals (http://www.consort-statement.org/). Recently, an international collaboration of epidemiologists, methodologists, statisticians, researchers and journal editors has produced the STROBE statement (STrengthening the Reporting of OBservational studies in Epidemiology; http://www.strobe-statement.org/). The STROBE statement (5) provides a set of reporting recommendations for the common observational designs (cohort, case–control and cross-sectional studies) and is increasingly endorsed by biomedical journals. The CONSORT and the STROBE statements serve as checklists for researchers. Their purpose is to enhance clear and transparent presentation of research rather than prescribing how research should be designed or conducted. Further, while clarity of reporting is a prerequisite to critical appraisal, these recommendations are not an instrument for evaluating the quality of research and clear reporting does not necessarily guarantee the reliability of findings (5).

The aim of this article is to provide an overview of the methods used in observational studies to facilitate understanding and appraisal of their validity and applicability in clinical practice. Different types of study design answer different types of questions. Major methodological issues of observational studies, including selection bias and confounding, are also discussed, and the main approaches to reducing these problems in the design and analytical phases of a study are highlighted. Newer methods (namely, within-subject designs and the use of propensity scores and instrumental variables to control confounding) are also described. With the use of the STROBE statement, clinicians with a good understanding of study designs and of methods for controlling biases and confounding can better recognize weaknesses and possible flaws, including inadequate reporting, in observational studies.

Observational study designs

This section discusses the range of observational study designs (Box 1). Their characteristics, advantages and methodological challenges are highlighted.

Box 1. Observational study designs
Cohort studies follow one group that is exposed to an intervention of interest and another group that is non-exposed to determine the occurrence of the outcome (the relative risk). Cohort studies can examine multiple outcomes of a single exposure.
Case–control studies compare the proportion of cases with a specific exposure to the proportion of controls with the same exposure (the odds ratio). Case–control studies can examine multiple factors that may be associated with the presence or absence of the outcome.
Within-subject methods (case-only designs):
 The self-controlled case-series method assesses the association between a transient exposure and an outcome by estimating the relative incidence of specified events in a defined time period after the exposure.
 Case-crossover design estimates the odds of an outcome by comparing the probability of exposure between the exposure and control periods.
 The case-time-control design is the case-crossover design with the addition of a traditional control group.
Cross-sectional studies are used to determine prevalence, that is, the number of cases in a population at a certain time and to examine the association between an exposure and an outcome.
Ecological studies focus on the comparison of groups. They can be used to identify associations by comparing aggregate data on risk factors and disease prevalence from different population groups.
Case reports provide anecdotal evidence by describing single cases. Description often includes the manifestations, clinical course, prognosis, how clinicians diagnosed and treated the condition and the clinical outcome.

Cohort studies

Cohort design can be prospective or retrospective and has a number of applications, including the study of incidence, causes and prognosis (6,7). In a retrospective cohort study, both the exposure and the outcome of interest have already occurred. A cohort study typically follows, over time, a group of individuals some of whom have had or have an exposure of interest, to determine the occurrence of the outcome. The exposure could be a behavioural factor or exposure to a drug or a medical intervention. Usually a control group of individuals who have not been exposed to the same intervention is also followed. The probability of developing the outcome in one group is compared with that in the other group; the ratio of these probabilities is the relative risk. Because cohort studies measure exposure and outcome in temporal sequence, they avoid the debate as to which came first, and the design can therefore demonstrate causal relationships. An advantage of the cohort design over the case–control approach is that the investigator can examine a wide range of possible outcomes in one cohort study.
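As a concrete illustration, the following minimal sketch (in Python, with purely hypothetical counts and variable names invented for this example) shows how the relative risk and an approximate 95% confidence interval are computed from the exposed and unexposed arms of a cohort.

```python
import math

# Hypothetical cohort counts (illustrative only, not taken from the article)
a, b = 30, 970    # exposed group: with outcome, without outcome
c, d = 15, 985    # unexposed group: with outcome, without outcome

risk_exposed = a / (a + b)        # incidence in the exposed group
risk_unexposed = c / (c + d)      # incidence in the unexposed group
relative_risk = risk_exposed / risk_unexposed

# Approximate 95% CI on the log scale (standard Wald-type interval)
se_log_rr = math.sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))
ci_low = math.exp(math.log(relative_risk) - 1.96 * se_log_rr)
ci_high = math.exp(math.log(relative_risk) + 1.96 * se_log_rr)

print(f"RR = {relative_risk:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

A relative risk above 1 suggests a higher incidence of the outcome in the exposed group, subject to the biases discussed below.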

Cohort design is inefficient for studying the incidence of a latent or rare outcome (e.g., cancer) because individuals would need to be followed for many years at a substantial cost. The major challenges of this design include: (i) selection bias that occurs when there are systematic differences between the study groups in factors related to the outcome, (ii) the inability to control for all extraneous factors (confounders) that might be associated with the outcome and might differ between the study groups and (iii) bias by differential loss to follow-up because of migration, death or drop-outs (7). Selection bias and confounding are discussed later in this article.

Case–control studies

Case–control studies are usually retrospective. One group comprises individuals who have the outcome of interest (i.e., cases), and they are matched with a control group who do not have the outcome (i.e., controls or non-cases). The same information on prior exposure is collected from both groups (8). The proportion of cases with a specific exposure is compared with the proportion of controls with the same exposure (the odds ratio), thereby indicating the relative importance of the exposure with respect to the presence or absence of the outcome.
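A minimal sketch of this calculation, again with hypothetical counts, illustrates how the odds ratio and an approximate 95% confidence interval (Woolf method) are derived from a standard 2 × 2 table.

```python
import math

# Hypothetical case-control counts (illustrative only)
a, b = 40, 60     # cases: exposed, unexposed
c, d = 20, 80     # controls: exposed, unexposed

odds_ratio = (a * d) / (b * c)

# Approximate 95% CI on the log scale (Woolf method)
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```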

As some of the subjects have been deliberately chosen because they have the outcome, case–control studies are more cost-efficient than cohort studies—that is, a smaller sample size is sufficient to generate adequate information because of a higher percentage of cases per study. In addition, a large number of variables can be examined at one time while the outcome being studied is limited (i.e., presence or absence of the outcome). Further, case–control studies are commonly used for initial, inexpensive evaluation of risk factors and are particularly useful when there is a long period between an exposure and the outcome, or when the outcome is rare. The major problems with case–control design are confounding, selection bias and recall bias because people with the outcome are more likely to remember certain antecedents, or exaggerate or minimize what they consider to be risk factors.

Nested case–control studies

A nested case–control study comprises subjects sampled from a cohort study; the case–control study is thus ‘nested’ inside the cohort study (9). When the outcome of interest is rare, it is convenient and cost-efficient to construct a case–control study within the cohort once a sufficient number of cases have accumulated. If records of a specific exposure for the cases and a subset of non-cases can be retrieved, from medical records for example, one can examine the association between the outcome and a factor that was not planned in advance. Analysis methods appropriate for case–control studies are applicable to nested case–control studies, with computation of an odds ratio. The advantages of a nested case–control design over a standard cohort design include better control for confounding (e.g., age, disease duration), better quantification of time-dependent exposures (i.e., accounting for unexposed person-time) and less complex analysis, because confounding is controlled through matching, which avoids sophisticated statistical techniques such as propensity scores (9).

Within-subject methods (case-only designs)

Cohort and case–control studies are useful for examining the cumulative effects of chronic exposures. However, to minimize a major challenge of these methods, confounding by indication, within-subject methods have been proposed that use self-controls to address the potential bias caused by unmeasured confounders. These include the self-controlled case-series method, the case-crossover design and the case-time-control design.

The self-controlled case series method is derived from the cohort (fixed exposure, random event) rather than case–control (fixed event, random exposure) logic (10). This method was originally published by Farrington et al. (11) to investigate the association between vaccination and acute potential adverse events and has also been used to examine effects of chronic exposures such as antidepressants (12). Using data on cases only, it is an alternative to cohort or case–control methods for assessing the association between a transient exposure and an outcome by estimating the relative incidence of specified events in a defined period after the exposure. Time within the observation period is classified as at risk or as control time in relation to the exposure. The key advantages are that it controls for individual-level confounders (measured and unmeasured) and allows for changes in the risk of exposure with time (13). It therefore provides valid inference about the incidence of events in risk periods relative to the control period and is suitable for studying recurrent outcomes.
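The sketch below conveys only the core intuition, under strong simplifying assumptions (a single risk window per case, equal observation windows, no age adjustment, and invented counts); the published method fits a conditional Poisson model rather than this crude ratio.

```python
# Simplified self-controlled case-series idea (illustrative only):
# compare the event rate in the post-exposure risk window with the rate
# in the remaining (control) observation time, using cases as their own
# controls. Assumes one risk window per case and no age adjustment.
cases = [
    # (events in risk window, risk-window days, events in control time, control days)
    (1, 30, 0, 335),
    (0, 30, 1, 335),
    (1, 30, 1, 335),
]

risk_events = sum(c[0] for c in cases)
risk_time = sum(c[1] for c in cases)
control_events = sum(c[2] for c in cases)
control_time = sum(c[3] for c in cases)

relative_incidence = (risk_events / risk_time) / (control_events / control_time)
print(f"Crude relative incidence = {relative_incidence:.2f}")
```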

Case-crossover studies are also less susceptible to confounding by indication because each patient's exposure history serves as his or her own control, thus eliminating between-person confounding (14). They are useful for examining the effects of transient exposures (e.g., use of a benzodiazepine) on acute events (e.g., car accidents) and the time relationship of immediate effects to the exposure. The design estimates the odds of an outcome by comparing the probability of exposure between the exposure and control periods. However, the underlying probability of exposure must be constant so that the exposure and control periods are comparable. Therefore, changes in prescribing over time or within-person confounding, including transient indication or changes in disease severity, may be problematic because they can influence the probability of exposure; that is, the case-crossover design may be subject to time trend bias (1). Differential recall bias can still be an issue even though exposure data for the case and control periods are provided by the same person. The case-time-control design is an elaboration of the case-crossover design that uses data from a traditional control group to estimate and adjust for time trend bias, control-time selection bias and recall bias (1).
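A minimal sketch of the matched-window logic, with hypothetical data: when each case contributes one hazard window and one control window, the Mantel-Haenszel odds ratio reduces to the ratio of discordant pairs.

```python
# Hypothetical case-crossover data: for each case, was the transient exposure
# present in the hazard window (just before the event) and in the control window?
pairs = [
    (True, False), (True, False), (False, True), (True, True),
    (False, False), (True, False), (False, True), (True, False),
]

exposed_hazard_only = sum(1 for h, c in pairs if h and not c)
exposed_control_only = sum(1 for h, c in pairs if c and not h)

# With 1:1 matched windows, the Mantel-Haenszel odds ratio is the ratio of
# discordant pairs; concordant pairs carry no information.
odds_ratio = exposed_hazard_only / exposed_control_only
print(f"Case-crossover OR = {odds_ratio:.2f}")
```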

Cross-sectional studies

Cross-sectional studies are primarily used to determine prevalence, that is, the number of cases in a population at a certain time. Prevalence is important to clinicians because it influences the chance of a particular diagnosis and the predictive value of an investigation. This method is also used to examine the association between an exposure and an outcome. The subjects are assessed at one point in time to determine whether they were exposed to an intervention or risk factor and whether they have the outcome. A difference between cross-sectional studies and cohort and case–control designs is that some of the subjects will neither have been exposed nor have the outcome of interest. The major advantage of cross-sectional studies is that they are generally quick to conduct and inexpensive because there is no follow-up. However, this method cannot differentiate between cause and effect or establish the sequence of events, and it is inefficient when the outcome is rare.
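As a small illustration (all numbers hypothetical), point prevalence and an approximate 95% confidence interval can be computed directly from a single survey.

```python
import math

# Hypothetical cross-sectional survey (illustrative only)
cases = 120
sample_size = 2000

prevalence = cases / sample_size

# Approximate 95% CI using the normal approximation to the binomial
se = math.sqrt(prevalence * (1 - prevalence) / sample_size)
ci_low, ci_high = prevalence - 1.96 * se, prevalence + 1.96 * se

print(f"Prevalence = {prevalence:.1%} (95% CI {ci_low:.1%} to {ci_high:.1%})")
```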

Ecological studies

Ecological or correlational studies focus on the comparison of groups rather than individuals and are typically based on aggregate secondary data. The unit of analysis in an ecological study is an aggregate of individuals, and the variables are often aggregate measures collected on this group. Ecological studies can be used to identify associations by comparing aggregate data on risk factors and disease prevalence from different population groups. Because all data are aggregated at the group level, relationships between exposure and outcome at the individual level cannot be empirically determined but are inferred from the group level. An error of reasoning (the ‘ecological fallacy’) occurs when conclusions are drawn about individuals on the basis of group-level data, as relationships between variables observed for groups may not necessarily hold for individuals (15). Ecological studies provide a relatively cheap and efficient means of generating or testing the plausibility of hypotheses for further investigation by case–control, cohort or experimental studies, which can test whether observations made on populations as a whole are confirmed in individuals. Despite these practical advantages, there are major methodological problems that limit causal inference, including ecological and cross-level bias, problems of confounder control, within-group misclassification, temporal ambiguity, collinearity and migration across groups (15). Therefore, ecological studies should only be conducted when individual-level data are unavailable.
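To illustrate the group-level nature of the analysis, the sketch below correlates hypothetical aggregate exposure and disease rates across regions; the figures and region structure are invented, and a strong correlation here says nothing definitive about individuals (the ecological fallacy).

```python
# Hypothetical aggregate data for an ecological comparison (illustrative only):
# per-region exposure prevalence (%) and disease rate per 100,000 population.
exposure_pct = [12.0, 18.5, 22.1, 27.3, 31.0, 35.4]
disease_rate = [45.0, 52.0, 60.5, 64.0, 71.2, 78.9]

# Group-level (ecological) Pearson correlation between aggregate measures
n = len(exposure_pct)
mean_x = sum(exposure_pct) / n
mean_y = sum(disease_rate) / n
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(exposure_pct, disease_rate))
var_x = sum((x - mean_x) ** 2 for x in exposure_pct)
var_y = sum((y - mean_y) ** 2 for y in disease_rate)
r = cov / (var_x * var_y) ** 0.5

print(f"Ecological (group-level) correlation r = {r:.2f}")
```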

Case reports

Case reports provide anecdotal evidence. A case report is a description of a single case, typically describing the manifestations, clinical course and prognosis of that case. In addition, how clinicians diagnosed and treated the condition and the clinical outcome are often described. Because of the wide range of natural biological variability in these aspects and the lack of a control group, a single case report or a number of case reports with common elements (e.g., similar clinical features and suspected common exposures) provide weak empirical evidence. However, they often raise questions of new health hazards, new clinical syndromes or unusual association between a disease and some exposure. These can generate hypotheses that need to be tested using stronger study designs.

Challenges of observational studies

It is critical to minimize the effects of bias and confounding to produce credible results. Bias and confounding are major threats to the internal validity of a study and should always be considered as alternative explanations when interpreting the relationship between an intervention and an outcome. This section discusses two major challenges of observational studies, selection bias and confounding, and the approaches that can be taken to minimize these problems. Key points are summarized in Box 2.

Box 2. Major challenges of observational studies
*Definitions from the CONSORT statement, as cited in Rochon et al. (7).

Selection bias*: a systematic error in creating intervention groups, causing them to differ with respect to prognosis. The groups differ in measured or unmeasured baseline characteristics because of the way in which participants were selected for the study or assigned to their study groups.
Confounding*: a situation in which the estimated intervention effect is biased because of some difference between the comparison groups apart from the planned interventions such as baseline characteristics, prognostic factors, or concomitant interventions. For a factor to be a confounder, it must differ between the comparison groups and predict the outcome of interest.
Strategies to reduce confounding
Design phase
 Restriction: inclusion to the study is restricted to a certain category of a confounder (e.g., male).
 Matching of controls to cases to enhance equal representation of subjects with certain confounders among study groups.
Analytical phase
 Stratification: the sample is divided into subgroups or strata on the basis of characteristics that are potentially confounding the analysis (e.g., age).
 Statistical adjustments
   Regression: estimates the association of each independent variable with the dependent variable (the outcome) after adjusting for the effects of other variables.
   Propensity score: a score that is the conditional probability of exposure to an intervention given a set of observed variables that may influence the likelihood of exposure.
   Instrumental variable: a pseudo-randomization method that divides patients according to levels of a covariate that is associated with the exposure but not associated with the outcome.

Selection bias

Selection bias is a systematic error arising from errors in the design and execution of sampling, selection or allocation methods. Factors that determined whether an individual received an intervention could result in the intervention and comparison groups differing in particular factors that affect the outcome, either because people were preferentially selected to receive the intervention or because of choices that they made (7). To minimize, assess and deal with selection bias, a recommended approach involves the selection of appropriate comparison groups, the identification and assessment of the comparability of potential confounders between the groups and the use of appropriate statistical techniques in the analysis (7). Selection bias cannot be completely excluded in case–control studies because non-participation may differ between cases and controls. Confounding by indication (also known as channelling bias) is a form of selection bias that occurs when treatments are preferentially prescribed to groups of patients on the basis of their underlying risk profile (16). Confounding by indication is one of the most important and frequent problems encountered in non-experimental studies of medication effects, because prognosis naturally differs between subjects who receive the drug and those who do not. Thus, selection of exposure is confounded with patient factors (clinical, non-clinical or both), which are also related to the outcome.

Confounding

Selection bias can result in confounding. A factor can confound an association only if it differs between the intervention and comparison groups. In addition, it must be associated with both the intervention and the outcome, and its relation to the outcome must be independent of its association with the intervention. Confounding occurs when differences in baseline characteristics between the study groups produce differences in the outcome beyond those attributable to the intervention under investigation (17). Crude, unadjusted results of non-experimental studies may therefore lead to invalid inferences about the effects of the intervention. Confounding can cause over- or under-estimation of the true relationship and may even change the direction of the observed effect.

Strategies to reduce confounding

Because observational studies often use data that were originally collected for other purposes, not all the relevant information may be available for analysis; thus, there may be unknown potential confounders. Methods to improve the comparability of the intervention and control groups in the design of an observational study include: (i) restriction, in which inclusion in the study is restricted to a certain category of a confounder (e.g., male); however, strict inclusion criteria can limit the generalizability of results to other segments of the population; and (ii) matching of controls to cases (frequency matching or one-to-one matching) to enhance equal representation of subjects with certain confounders among the study groups. A disadvantage of both approaches is that the effect of the variable used for restriction or matching cannot be evaluated.

In the data analysis phase, methods to control for confounding include: (i) stratification, in which the sample is divided into subgroups or strata on the basis of characteristics that potentially confound the analysis (e.g., age), and the effects of the intervention are measured within each subgroup (18); its disadvantages include reduced power to detect effects, because the number of participants in each stratum is smaller than the total study population, and the possibility that subgroups are not balanced with respect to other characteristics; and (ii) statistical adjustment for dissimilarities in characteristics between the study groups. Regression, propensity scores and instrumental variables are the three main methods of statistical adjustment that can be applied to improve validity.
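To make stratification concrete, the sketch below pools stratum-specific 2 × 2 tables with the Mantel-Haenszel estimator; the counts and the choice of age as the stratifying confounder are assumptions invented for the example, constructed so that the crude odds ratio is inflated while the stratum-adjusted estimate shows no association.

```python
# Hypothetical 2x2 tables stratified by a confounder (e.g., age group).
# Each tuple is (a, b, c, d) = (exposed cases, exposed non-cases,
# unexposed cases, unexposed non-cases) within one stratum.
strata = [
    (10, 90, 30, 270),   # younger stratum: stratum-specific OR = 1.0
    (40, 60, 20, 30),    # older stratum:   stratum-specific OR = 1.0
]

# Crude (unstratified) odds ratio, which mixes the two strata together
A = sum(s[0] for s in strata)
B = sum(s[1] for s in strata)
C = sum(s[2] for s in strata)
D = sum(s[3] for s in strata)
crude_or = (A * D) / (B * C)

# Mantel-Haenszel odds ratio pooled across strata
num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
mh_or = num / den

# Here the crude OR is 2.0 while the stratum-adjusted OR is 1.0,
# showing how ignoring the confounder exaggerates the association.
print(f"Crude OR = {crude_or:.2f}, Mantel-Haenszel OR = {mh_or:.2f}")
```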

Regression (e.g., linear, logistic and proportional hazard regression) is the most common method for reducing confounding in observational studies (18). Regression analyses estimate the association of each independent variable (i.e., measures of baseline characteristics and the intervention) with the dependent variable (the outcome) after adjusting for the effects of all the other variables. It is important to compare adjusted and unadjusted estimates of the effect. If these estimates differ greatly, it suggests that the differences in baseline characteristics were a source of confounding and have had a substantial effect on the outcome.
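As an illustrative sketch (assuming the statsmodels package is available; the simulated data, effect sizes and variable names are invented for this example), the comparison of unadjusted and adjusted logistic regression estimates might look like this.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
confounder = rng.normal(size=n)                             # e.g., disease severity
exposure = rng.binomial(1, 1 / (1 + np.exp(-confounder)))   # sicker patients treated more
# By construction, the outcome depends on the confounder but not on the exposure
outcome = rng.binomial(1, 1 / (1 + np.exp(-(-1 + confounder))))

unadjusted = sm.Logit(outcome, sm.add_constant(exposure)).fit(disp=False)
adjusted = sm.Logit(
    outcome, sm.add_constant(np.column_stack([exposure, confounder]))
).fit(disp=False)

# The unadjusted coefficient suggests an effect; adjusting for the
# confounder moves the exposure estimate towards zero.
print("Unadjusted exposure log-odds:", round(unadjusted.params[1], 2))
print("Adjusted exposure log-odds:  ", round(adjusted.params[1], 2))
```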

The propensity score, proposed by Rubin and Rosenbaum (19), is a statistical method increasingly used by researchers to control for confounding by indication, particularly when there are a large number of variables. The score is the conditional probability of exposure to an intervention (e.g., drug treatment) given a set of observed variables that may influence the likelihood of exposure. The propensity score can be derived from a multivariable logistic regression analysis including variables that are statistically significantly associated with the exposure. A higher score indicates a higher probability of receiving the exposure. Classifying subjects by levels of this single variable, or including it as a single covariate in a multivariable regression model, tends to balance all of the observed variables but not the unobserved ones (i.e., residual confounding remains).
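A minimal sketch of one common workflow, using simulated data and scikit-learn (the covariates, effect sizes and data-generating assumptions are illustrative): estimate the propensity score with logistic regression and then compare outcomes within propensity-score strata.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000
covariates = rng.normal(size=(n, 3))            # e.g., age, comorbidity, prior drug use
linpred = covariates @ np.array([0.8, 0.5, -0.3])
exposure = rng.binomial(1, 1 / (1 + np.exp(-linpred)))
outcome = rng.binomial(1, 1 / (1 + np.exp(-(-1 + 0.6 * covariates[:, 0]))))

# Propensity score: modelled probability of exposure given the observed covariates
ps = LogisticRegression().fit(covariates, exposure).predict_proba(covariates)[:, 1]

# Stratify on propensity-score quintiles and compare outcomes within strata,
# so that treated and untreated subjects with similar scores are compared.
quintile = np.searchsorted(np.quantile(ps, [0.2, 0.4, 0.6, 0.8]), ps)
for q in range(5):
    in_q = quintile == q
    treated = outcome[in_q & (exposure == 1)].mean()
    untreated = outcome[in_q & (exposure == 0)].mean()
    print(f"Quintile {q + 1}: risk {treated:.2f} (treated) vs {untreated:.2f} (untreated)")
```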

In recent years, the instrumental variable method, a technique that originates from the field of econometrics, has been used more commonly in pharmacoepidemiological studies to overcome the potential lack of balance on unobserved prognostic factors (e.g., health behaviour) (20). In brief, this pseudo-randomization method divides patients according to levels of a covariate that is associated with the exposure but not with the outcome. For example, Brookhart et al. (21) used the prescribing physician’s preference for cyclooxygenase-2 inhibitors or non-selective, non-steroidal anti-inflammatory drugs as an instrumental variable to compare the risk of gastrointestinal complications associated with the use of these medicines. The instrumental variable method may lead to an equal distribution of characteristics among exposed and non-exposed people and thus reduce potential confounding.
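A minimal sketch of one standard way to apply an instrumental variable, two-stage least squares on simulated data (the instrument, effect sizes and confounder are invented for illustration and do not reproduce the Brookhart et al. analysis): the first stage predicts the exposure from the instrument, and the second stage regresses the outcome on the predicted exposure.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10000
instrument = rng.binomial(1, 0.5, size=n)     # e.g., prescriber preference (assumed valid)
unmeasured = rng.normal(size=n)               # unobserved confounder
exposure = 0.5 * instrument + 0.8 * unmeasured + rng.normal(size=n)
outcome = 1.0 * exposure + 1.5 * unmeasured + rng.normal(size=n)   # true effect = 1.0

# Naive regression of outcome on exposure is biased by the unmeasured confounder
naive = np.polyfit(exposure, outcome, 1)[0]

# Stage 1: predict exposure from the instrument.
# Stage 2: regress outcome on the predicted exposure (2SLS estimate).
predicted_exposure = np.polyval(np.polyfit(instrument, exposure, 1), instrument)
iv_estimate = np.polyfit(predicted_exposure, outcome, 1)[0]

print(f"Naive estimate = {naive:.2f}, IV (2SLS) estimate = {iv_estimate:.2f}")
```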

In summary, observational studies are widely published and there are a number of methods available. These studies offer useful methods to examine various questions, particularly relating to drug effects in real clinical practice. The promise of observational studies as a valuable source of evidence in medical decision making must be balanced against concerns about the validity of that evidence. The hallmark of good research is the rigor with which it is conducted and careful analysis and interpretation. Preventing or limiting bias and confounding can be achieved in the design and data analytical phases of observational studies. When properly conducted, they are no more misleading than RCTs and produce critical, trustworthy evidence. It is crucial that clinicians critically assess studies for possible methodological flaws and make judgments about the validity and applicability of study results in the interests of optimal patient care.

Acknowledgement

Dr Lu is supported by an Australian National Health and Medical Research Council Public Health Training Fellowship.

References
