Background: Current approaches to medical science generally have not resulted in rapid, robust integration into feasible, sustainable real world healthcare programs and policies. Implementation science risks falling short of expectations if it aligns with historical norms. Fundamentally different scientific approaches are needed to accelerate such integration.
Methods: We propose that the key goal of implementation science should be to study the development, spread and sustainability of broadly applicable and practical programs, treatments, guidelines, and policies that are contextually relevant and robust across diverse settings, delivery staff, and subgroups. We recommend key conceptual and methodological characteristics needed to accomplish these goals.
Results: The methods to produce such advances should be rapid, rigorous, transparent, and contextually relevant. We recommend approaches that incorporate a systems perspective, investigate generalizability, are transparent, and employ practical measures and participatory approaches.
Conclusions: To produce different outcomes, we need to think and act differently. Implications of such an implementation science approach include fundamental changes that should be relevant to Clinical Translational Science Award investigators, comparative effectiveness researchers, those interested in pragmatic trials, grant funders, and community partners. Clin Trans Sci 2012; Volume #: 1–8
Multiple reviews and experts concur that, with few exceptions, scientific evidence has generally not translated rapidly or consistently into policy and practice.1–5 Science as usual has not worked—we posit that this has been due, at least in part, to characteristics of the types of approaches, research methods, and the types of questions asked. To outline our position, there has been a mismatch between the predominant view of “good science” and the complex health and healthcare problems demanding solutions. The field of implementation science has gathered great momentum in the last few years, expanding from a set of observations on facilitators and barriers to the use of evidence-based interventions to a cohort of investigations studying theoretically driven approaches to improve adoption, implementation, and occasionally sustainability.
Most researchers still generally adhere to a traditional linear model of research, from basic science to treatment development to efficacy and occasional effectiveness trials and then to implementation.2,3 Indeed, implementation scientists follow this same heuristic, step-by-step “pipeline” approach, based on pharmaceutical research, from discovery to safety/feasibility to efficacy to effectiveness to dissemination.6,7 This linear model, which postulates that efficacy studies must always precede effectiveness or implementation research, has not served us well, as the characteristics that predict success in efficacy trials are often different from, and sometimes inversely related to, those associated with success in “later stages.”2,3,8 We posit that the implementation science field has been held back in part by the types of questions we ask, the tools we employ, and the pressure on researchers to fit into a traditional paradigm when the world of dissemination and implementation is, by its nature, complex, dynamic, and uncontrollable. The problems we study are often at odds with the predominant view of “good” or “robust” science.
Many, including us, have criticized traditional approaches, but have not clearly articulated a compelling alternative framework.8–10 Here we outline the key characteristics of a science that would be relevant to the complexity of healthcare, busy clinicians faced with competing demands, decision makers faced with forced choices among imperfect alternatives, complex patients, and thorny multidimensional problems. The primary purposes of this paper are to: (1) outline the type of science needed to make faster headway in addressing key health and healthcare issues; (2) discuss the specific characteristics of such approaches and precisely how they would differ from current dominant paradigms; (3) provide concrete examples of how a hypothetical study would be designed using these new priorities compared to a typical “state of the art” clinical trial as well as real-world examples of such approaches; and (4) explore implications of such a change for research funding, review, and priorities.
What is Needed?
Four key issues, if addressed, will go a long way toward making health research results more usable and much more timely. These issues, summarized in Table 1, include philosophy of science perspective and worldview, fundamental goals of research, methods utilized, and flexibility of health research. We argue that fundamentally different approaches are needed from those that dominate health research today if we are to make more rapid and relevant advances. As Einstein said, “The significant problems we face cannot be solved at the same level of thinking we were at when we created them.”
Table 1. Implications of robust implementation science approaches using rigorous, rapid, and relevant scientific methods.
Context is critical: Research should focus on and describe context
Most problems and interventions are multilevel and complex
Focus on systems characteristics: More emphasis needed on interrelationships among system elements and systems rules
Robust, Practical Goals
Representativeness and reach: Focus on reaching broader segments of population and those most in need
Study generalization (or lack of such) across settings, subgroups, staff, and conditions
Pragmatic and practical: Producing answers to specific questions relevant to stakeholders
Scalability and sustainability: From outset, greater focus on scale-up potential and likelihood of sustainability
Research Methods to Enhance Relevance
Identify and address plausible threats to validity in context of question. Greater focus on replication
Approaches that produce faster answers
Best solutions usually evolve over time, as a result of informed hypotheses and mini-tests with feedback
Integration of methods; triangulation: For greater understanding, integrated quantitative and qualitative methods are often required
Relevance to stakeholders should be top priority
Encourage and support diverse approaches with the above characteristics (all models are wrong)
Respect for diverse approaches; humility: Different perspectives, goals, methods, and approaches are needed. Continuing the same existing approaches will produce the same unsatisfactory results
The scientific worldview perspective issue can be summarized as that of an interrelated systems perspective versus a mechanistic, deterministic approach to science. This issue overlaps with and leads to the scientific methodologies employed that are discussed below, but fundamentally concerns studying issues in context.11,12
Figure 1 summarizes visually a contextual approach and contains several points. First is that interventions are delivered in and surrounded by a multilevel context, and this context is important for understanding and in many cases determining outcomes. Studying and evaluating characteristics, changes, and influences at multiple levels of this context (such as the policy setting, organization, history, and community involved) is as important as studying individual participant characteristics, but these contextual factors are often ignored or not reported.13,14 Second, intervention research does not typically test principles or theories directly: rather what is tested is a package that operationalizes principles embedded in “wrappings” such as modality, language, level of interactivity, and pragmatic decisions about frequency, duration, and intensity of contacts.15,16 Third, the individual intervention components are less important than the relationships or “fit,” alignment and compatibility among a research team, a specific intervention, the target audience, and the multilevel setting in which it is implemented. This view also implies that there may not be any one invariant intervention that is “best” across diverse settings, intervention staff, populations, or time.
The goal of much existing health research has been to assess results that are assumed to generalize across a wide range of conditions. Such approaches have been developed most completely in the drug efficacy randomized trial.2,17 Most explanatory trials17,18 test average effects within conditions, with an assumption that effects apply universally, often in a linear dose–response relationship. This research is often conducted on atypical participants, for example patients not having any other comorbid conditions, and is often not seen as relevant or feasible by practitioners.2,10 The types of implementation research19 needed to increase the odds that results will apply to specific contexts and be relevant to key stakeholders, such as payers, clinicians, policy makers, and patients/families, represent a more pragmatic or realist approach to research.20,21 As shown in Table 2, such research addresses real-world questions, using real-world comparisons, and seeks to explicitly evaluate generalizability across subgroups and settings rather than assume it.23
Table 2. Key methodological characteristics of practical implementation science research that will translate.
To answer questions from patients, practitioners, and policy makers to inform real-world decision making.
B. Evaluates participation and representativeness†
To determine breadth of applicability. Assesses participation rate and representativeness of participants, settings, staff, and subgroups.
C. Comparison condition(s) are real alternatives
To address practical questions in context of currently available (and usually less expensive) alternatives.
D. Collects costs and economic data
To provide information on resources needed to adopt and replicate in different settings.
E. Assesses multiple outcomes, often using mixed methods
To provide results that recognize the different priorities of multiple audiences—e.g., behavior change, quality of life/functioning, healthcare use, impact on health disparities, unintended consequences.
F. Uses flexible research design to fit question
To consider and address key threats to internal and external validity.
G. Transparent reporting*,†,‡
To include information on implementation and modifications; numerators and denominators of settings, staff, patients invited, participating, and completing treatment.
Some have contrasted this as a difference between rigor in efficacy-type explanatory research versus relevance in the practical, pragmatic approaches summarized in Table 2.24–26 We reject this dichotomy, as does the CONSORT work group on Pragmatic trials (http://www.consort-statement.org/extensions/designs/pragmatic-trials). Rather, we think that factors of external validity and replication are key, underemphasized principles of rigorous science.25 It is possible to have both rigor and relevance (as demonstrated in published examples below), and issues of relevance (e.g., the extent to which a program or policy produces high reach, is widely adopted, and can be successfully delivered over time) are essential to integration of research to practice, and have been insufficiently studied.9,27 Much could be improved through transparency in reporting data on recruitment, implementation, and outcomes across the multiple levels of patient, staff, and setting.8 One way to efficiently and transparently report such issues would be to use an “expanded CONSORT reporting flow diagram” (Figure 2) that expands the scope of reporting to concisely present information about recruitment and sustainability within the settings where studies are conducted. By reporting on such contextual issues, research would be much more likely to address important scientific questions about robustness of effects and generalizability, and make possible reviews of the settings and conditions under which findings apply.
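The numerators and denominators that such an expanded flow diagram would report can be tabulated level by level. The following is a minimal sketch in Python, not a reporting standard: the `LevelCounts` class, the `flow_report` helper, and every count shown are hypothetical illustrations and are not data from any study cited here.

```python
from dataclasses import dataclass


@dataclass
class LevelCounts:
    """Counts for one level (settings, staff, or patients) of a flow diagram."""
    level: str
    invited: int        # denominator: all units approached
    participating: int  # units that agreed to take part
    completing: int     # units that finished the program

    def participation_rate(self) -> float:
        # Share of invited units that took part.
        return self.participating / self.invited if self.invited else 0.0

    def completion_rate(self) -> float:
        # Share of participating units that completed.
        return self.completing / self.participating if self.participating else 0.0


def flow_report(levels):
    """Return one line per level showing numerators, denominators, and rates."""
    return [
        f"{lv.level}: {lv.participating}/{lv.invited} participating "
        f"({lv.participation_rate():.0%}); "
        f"{lv.completing}/{lv.participating} completing "
        f"({lv.completion_rate():.0%})"
        for lv in levels
    ]


# Hypothetical counts, for illustration only.
example = [
    LevelCounts("settings", invited=20, participating=12, completing=10),
    LevelCounts("staff", invited=60, participating=45, completing=36),
    LevelCounts("patients", invited=800, participating=400, completing=300),
]

for line in flow_report(example):
    print(line)
```

Reporting the raw counts alongside the percentages lets readers judge how representative the participating settings, staff, and patients are, rather than only how the completers fared.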
Integration of rigor and relevance discussed above translates into distinctive methodological approaches. The dominant “reductionist” research method is a carefully controlled, heavily scrutinized top-down investigation that follows a predetermined, set protocol.28 Such methodology has been useful in establishing several “evidence-based” medications and procedures.2 It has not, however, been very successful in advancing the results of these studies into practice or policy.2,29
In addition to the evidence often not being practice-based,30 we think that a large part of the reason for the lack of adoption results from concerns about the applicability of results from methods that assume a top-down, linear translation approach7 which is implied in a “stages of research approach” (e.g., basic research to efficacy to effectiveness). Such approaches assume that the optimal solutions are those discovered in the efficacy laboratory, which then just need to be implemented with fidelity by practitioners. We suggest considering a contextual, complexity theory perspective that involves applied stakeholders as equal partners who have different but important types of knowledge, experience and practical expertise, and that assumes optimal solutions will evolve over time, as evidence continues to accrue.29,31
Table 3, which summarizes key characteristics of both traditional efficacy research methods and a context-based, rapid learning, evolving implementation science approach,32,33 illustrates how a conventional drug trial and an implementation science approach would differ across a multitude of decisions about research methods. A fundamental difference, reflected throughout the table, is that the former focuses on maximizing a single primary outcome (usually effect size) whereas the latter uses a more multifaceted set of methods that emphasizes mixed methods, adaptive learning during the trial, and multiple outcomes to address concerns of diverse stakeholders.
Table 3. Efficacy trial versus pragmatic implementation science trial investigating use of pharmacist calls for promoting adherence with “statin” medications.
Efficacy trial: Highly motivated site(s) within high performing systems having excellent EMR resources
Pragmatic implementation trial: Randomly selected site(s) from multiple, diverse delivery systems
Which clinicians to approach?
Efficacy trial: Highly motivated clinicians within those sites
Pragmatic implementation trial: All clinicians within those sites
Which patients to enroll?
Efficacy trial: Highly motivated patients with minimal comorbidity
Pragmatic implementation trial: All patients newly prescribed a statin, regardless of comorbid physical or psychosocial problems
What level of comfort with cell phones to select for?
Efficacy trial: Comfortable using wide range of cell-phone features
Pragmatic implementation trial: Include those without cell phones (need to provide one), and those with wide range of comfort with cell-phone features
How frequently to send text messages and monitor patients?
Efficacy trial: Frequently; isolated from workflow in clinic; close, highly individualized intensive monitoring
Pragmatic implementation trial: Less frequently, but consistent with workflow patterns in clinic
Clinical pharmacist involvement and training?
Efficacy trial: Single individual, highly experienced, trained in motivational interviewing
Pragmatic implementation trial: Multiple clinical pharmacists with standard training in patient counseling
What kind of advice protocol to provide?
Efficacy trial: Highly scripted, standardized
Pragmatic implementation trial: Unscripted or general guidelines and suggestions for adapting
How to monitor implementation of advice protocol?
Efficacy trial: Careful assessment of fidelity to protocol, and intensified intervention if not optimal
Pragmatic implementation trial: Qualitative assessment of advice actually delivered by pharmacists
How to monitor patient medication adherence?
Efficacy trial: Active, continuous assessment with electronic medication monitors
Pragmatic implementation trial: Surveillance by patient self-report and/or prescription refill records
How to monitor impact on lipids?
Efficacy trial: Lipid levels drawn at prespecified intervals during additional visits for that purpose
Pragmatic implementation trial: Lipid levels drawn in the course of routine practice visits
Which patient subgroups to monitor for differences in effectiveness?
Efficacy trial: Few subgroups assessed (due to exclusions in recruitment), homogeneous patients not on other medications
Pragmatic implementation trial: Multiple prespecified subgroups (particularly for subgroups that might be excluded in an efficacy trial, e.g., individuals with multimorbidity or limited cell phone comfort), low health literacy/numeracy participants
Duration of follow-up?
Efficacy trial: Short-term (e.g., 3–6 months), allowing identification of individuals who soon stop treatment
Pragmatic implementation trial: Long-term (12–24 months), allowing identification of individuals who later restart treatment
Continuation of intervention?
Efficacy trial: To end of grant funding
Pragmatic implementation trial: Long-term incorporation into clinic operations
Does this intervention have any effects, positive or negative, on clinic operations?
Efficacy trial: Not relevant to assess
Pragmatic implementation trial: Critical to assess through staff interviews, observations and qualitative assessments
Assessed from perspective of adopting organization and patient, includes cost-effectiveness indices
Cost of intervention
Across the various decision points, a traditional efficacy approach assumes that the optimal conditions have been determined ahead of time by the researchers and that the implementers just need to follow instructions in a standardized fashion across settings, populations, and other factors. This contrasts with the more evolving and adaptive, flexible approach that values information about variability in recruitment, adoption, implementation, and outcomes across settings, staff, patients and time, and lessons learned during these processes. A final key difference between the traditional trial approach and the pragmatic implementation trial is the emphasis on efficiency and cost issues shown in both Tables 2 and 3. The efficacy trial may regard efficiency and cost as irrelevant or secondary to effect size. The implementation science approach considers these issues as primary factors influencing adoption, implementation, and maintenance of interventions.
Degree of Flexibility
As illustrated in each of the tables and the figures, there is a fourth fundamental difference between the traditional health research paradigm, which is often rewarded as the “best science” by study sections, publication reviewers, FDA panels, Cochrane reviews, and guideline developers, and a more context-based implementation science approach. This difference is the degree of flexibility with which research methods are chosen, in contrast to a rigid scientific approach that often casts issues of applicability, feasibility, and appropriateness of methods as irrelevant to scientific rigor. Tenure and promotion committees and professional societies have reinforced inflexibility in methods by assuming that research that follows rigid prespecified rules, removes investigators from the data, and minimizes context is always optimal.
Admittedly, this approach, which has become ensconced into what some have termed “cookbook RCTs,”2 has identified some useful interventions. We do, however, take issue with the assumption of the unquestioned superiority of this mechanistic approach and its prototype—the randomized controlled trial over all alternatives for all questions—as is done in rating schemes such as CONSORT criteria and Cochrane criteria that heavily emphasize internal validity at the expense of external validity.25,34
This currently dominant, reductionist type of health research, as illustrated in Table 3, has not produced the advances in primary care, public health, science policy, reductions in health disparities, or local results that the public and funders have hoped for.3 Instead, we propose that a much more flexible spirit of investigation, one that includes local knowledge and expertise, carefully assesses and plans for local conditions and preferences, “follows the data,”35 and is adaptive, iterative, and evolving,26 is much more likely to transfer, and to do so much more rapidly, to today's complex health and healthcare issues.
We conclude, along with others who have considered the issue,36 that no one methodological approach is inherently superior, but that, along with stakeholders, researchers need to consider the question(s) first, and design the research methods to fit the question(s) rather than vice versa (Figure 1). What we should study is convergence across methods, delivery conditions, settings, intervention staff; and we should study boundaries to findings. Such an approach implies a respect and need for diverse approaches to questions. No one research design, method, or theory reveals the “truth”—rather all models and all methods are “wrong”,37 as they are all approximations and simplifications of reality that each have strengths and limitations. Some solutions, methods, and models are more useful than others,37, 38 but this should be determined by the specific question, context and type of answer needed—not a priori.
Table 3 summarized key differences between a completely pragmatic trial and a prototypic efficacy study. In practice, few studies are completely pragmatic or 100% explanatory.19 In Table 4, we describe five recent investigations selected to illustrate how different implementation science topics can be addressed using methods that are both rigorous and relevant. Kraschnewski et al.39 paid careful attention to the recruitment and participation of county health departments in their research on practical weight loss intervention programs. They systematically approached 81 county health departments in North Carolina meeting prespecified criteria and then transparently analyzed the percent and characteristics of those participating versus not. Such information on adoption at the setting level is seldom reported. Mosen et al.40 studied multilevel systems issues involved in delivering a population-based colorectal cancer-screening program. In a large pragmatic RCT, they demonstrated that a low-cost automated telephone reminder intervention based on their formative work and an implementation science model was highly effective in increasing screening rates among HMO members. The results demonstrated clear cost-effectiveness (estimated $40 per additional screening completed) and the program was adopted and continued for the HMO population following study completion.
Table 4. Example implementation studies by rigorous and relevant topics of interest.
Enhancing and documenting the reach of different diabetes self-management programs using RE-AIM model (Chronic illness management—HMO)
Compared DVD take-home self-management program to in-person self-management using hybrid preference design to evaluate reach and effectiveness. Found DVD produced four-fold increase in reach with no loss in effectiveness.
System-wide application and evaluation of population-based colorectal cancer screening program (feasibility—cost and efficiency—going to scale and sustainability; health system—screening)
Large HMO-wide screening program to promote colorectal cancer screening among those due for screening using automated phone calls based on formative work. RCT demonstrated clear cost-effectiveness ($40 per additional screen) and program continued by HMO.
Pragmatic trial using web-based and community health workers weight loss program for obese, hypertensive community health center patients (health disparities, partnership research, pragmatic studies, flexibility).
Practical RCT evaluated multimodality, multilevel intervention designed for low literacy, complex patients. Found intervention significantly improved weight loss and blood pressure over 2-year period among very low income, predominantly African American sample.
Quasi-experimental trial of a state-level natural experiment to implement collaborative care for depression within primary care clinics.
Quasi-experimental design examined impact of the implementation of an evidence-based depression care management intervention across 82 primary care clinics within Minnesota. Study is in its final year of funding.
Glasgow et al.41 compared the reach and effectiveness of two modalities for delivering diabetes self-management programs. Their hybrid preference-RCT design involved randomizing half of all patients in a diabetes registry who met eligibility criteria to a traditional RCT to evaluate in-person self-management training versus a DVD home self-management program. The other half of potential diabetes participants were offered their choice of the DVD or in-person class, as would be done in practice. This design revealed that the DVD attracted four times the number of participants as did the in-person class (reach and relevance), a comparison that is not possible to evaluate using the standard RCT. Also, the DVD did not sacrifice effectiveness compared to the in-person self-management training (rigor evaluated by the RCT, the evaluation of which would have been confounded if using only the self-selection portion of the preference design). Bennett et al.42 reported on a collaboratively developed practical intervention using partnership research methods. Their weight loss intervention was developed for high-risk community health center patients who were obese and also hypertensive. Their pragmatic RCT design19,23 demonstrated that a primarily technology-based intervention (web and automated phone calls), supplemented by calls and occasional meetings with a community health worker, significantly improved both weight loss and systolic blood pressure over the 2-year implementation period.
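The allocation logic of a hybrid preference design like the one described above can be sketched in a few lines of Python. This is an illustrative simplification under stated assumptions, not the procedure used in the cited study: the function names, the `preferences` mapping (standing in for what each patient would choose when offered a choice), and the equal-probability randomization are all hypothetical.

```python
import random


def assign_hybrid_preference(patient_ids, preferences, seed=0):
    """Split eligible patients into a randomized half and a choice half.

    preferences maps each patient id to "dvd", "in_person", or "decline";
    it is a hypothetical stand-in for the choice a patient makes when asked.
    Returns {patient_id: (arm, modality)}.
    """
    rng = random.Random(seed)
    ids = list(patient_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    assignments = {}
    # Randomized half: modality assigned by chance, as in a traditional RCT.
    for pid in ids[:half]:
        assignments[pid] = ("randomized", rng.choice(["dvd", "in_person"]))
    # Choice half: modality follows the patient's own preference.
    for pid in ids[half:]:
        assignments[pid] = ("choice", preferences[pid])
    return assignments


def reach_ratio(assignments):
    """Ratio of DVD to in-person uptake among patients offered a choice."""
    chosen = [mod for arm, mod in assignments.values() if arm == "choice"]
    dvd, in_person = chosen.count("dvd"), chosen.count("in_person")
    return dvd / in_person if in_person else float("inf")
```

Under this sketch, reach is estimated only from the choice half, while effectiveness comparisons would rely on the randomized half, which mirrors the division of labor between the two portions of the design described in the text.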
Finally, the DIAMOND initiative developed an innovative rigorous and relevant statewide quality improvement effort for depression treatment within primary care settings.43 The study used a quasi-experimental design, involving a staggered rollout of the initiative, with sites exposed to the intervention at different time points while continuously recording outcomes across the multiple intervention waves. This allowed for comparisons between preintervention and intervention sites, while also affording a sufficiently long observational period to evaluate changes from baseline on depression outcomes. This design, which focuses on replication of effects, maximizes rigor within a natural implementation experiment, and is powered to enable the investigators to detect impact of the initiative on both individual and clinic-level outcomes.
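A staggered rollout of this general kind can be represented as a site-by-period exposure schedule. The Python sketch below is a hypothetical illustration only: the site names, the number of waves and periods, and the round-robin assignment of sites to waves are assumptions and do not describe how the DIAMOND initiative scheduled its clinics.

```python
def staggered_rollout(sites, n_waves, n_periods):
    """Build {site: [0 or 1 per period]}, where 1 means the site is exposed.

    Sites are assigned to waves round-robin (an assumption for illustration).
    Wave k starts the intervention in period k + 1, so period 0 is a shared
    baseline for every site.
    """
    if n_waves >= n_periods:
        raise ValueError("need more periods than waves so every site is exposed")
    schedule = {}
    for i, site in enumerate(sites):
        start = (i % n_waves) + 1
        schedule[site] = [1 if p >= start else 0 for p in range(n_periods)]
    return schedule


def comparison_sites(schedule, period):
    """Return (exposed, not_yet_exposed) site lists for one period."""
    exposed = [s for s, row in schedule.items() if row[period] == 1]
    waiting = [s for s, row in schedule.items() if row[period] == 0]
    return exposed, waiting


# Hypothetical example: six sites, three waves, five measurement periods.
schedule = staggered_rollout(["A", "B", "C", "D", "E", "F"], n_waves=3, n_periods=5)
for site, row in schedule.items():
    print(site, row)
```

At any intermediate period, sites already exposed can be compared with sites still awaiting the intervention, and each site also contributes its own baseline, which is what gives such designs repeated opportunities to replicate an effect.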
We conclude that it will not be possible to more rapidly or consistently integrate science into practice without major changes in scientific perspective (toward contextual and systems perspectives), research goals (focus on pragmatic approaches and robustness across conditions), research methods (rapid, adaptive, and convergent methods), and flexibility of research. Such an approach would also imply the broadening of the types of research that are considered “gold standard,” based on the specific context of the investigation, and would change the types of research rewarded by review, publication, promotion and tenure, and scientific awards.2
Advances in the development of methods that better combine the twin paragons of rigor and relevance should emerge from the principles inherent in implementation science as discussed above. In summary: (1) there is no single research design or method to identify “truth”; (2) complexity is the rule, not the exception; efficacy research and reductionism have generally failed to create solutions that are found credible, feasible, and relevant by those policy makers and practitioners who must cope with the problems research is supposed to address; (3) variation in practice, across settings and populations, is necessary; fidelity to a prespecified protocol as a sole marker of implementation is limited; and (4) attention to the issues and approaches outlined in this paper, especially if applied beginning with the earliest stages of a research investigation and in partnership with key stakeholders, could help us out of the hole we have dug of “evidence-based interventions” and guidelines that sit on the shelf gathering dust.
Although this paper specifically focuses on the needs of a rigor and relevance methodology armamentarium for implementation science, these same principles may well be applied to the broader field of intervention and prevention science. Indeed, implementation science as it is currently conceived6,20 rests on the value of the treatment, preventive, system and policy interventions that are created and tested, and the limitations of knowledge gained through development are amplified in the challenges of real-world integration. Without continued reflection on (and perhaps alteration of) the methods and measures used in biomedical research, the impact of science will remain far short of its potential. CTSA grantees, among others, are in a position to lead such a change in perspective and methods, and to evaluate if such changes do in fact result in more rapid, relevant solutions. Among other areas prime for application, comparative effectiveness research (http://www.pcori.org) and community engagement groups should consider and debate the issues and recommendations in this paper.