Title. Reading, writing and systematic review.
Aim. This paper offers a discussion of the reading and writing practices that define systematic review.
Background. Although increasingly popular, systematic review has engendered a critique of the claims made for it as a more objective method for summing up research findings than other kinds of reviews.
Discussion. An alternative understanding of systematic review is as a highly subjective, albeit disciplined, engagement between resisting readers and resistant texts. Reviewers of research exemplify the resisting reader when they exclude reports on grounds of relevance, quality, or methodological difference. Research reports exemplify resistant texts as they do not simply yield their findings, but rather must be made docile to review. These acts of resistance make systematic review possible, but challenge claims of its greater capacity to control bias.
Conclusion. An understanding of the reading and writing practices that define systematic review still holds truth and objectivity as regulative ideals, but is aware of the reading and writing practices that both enable and challenge those ideals.
Even the most casual review of the health and social sciences literature over the last decade will show the growing interest in systematic reviews of research. Journal articles and books regularly appear promoting the need for, instructing readers on how to conduct, reporting the results of, and even reviewing such reviews. Conceived as a cornerstone of evidence-based practice, the systematic review is appealing because of its promise to permit valid (albeit provisional) conclusions to be drawn about clinical problems from the ever increasing number of research findings addressing those problems. Whether the problem is medication non-adherence, the management of chronic illness, or accounting for health and social disparities, systematic review holds out, and often fulfils, the promise of arriving at working research conclusions and workable practice solutions.
Yet like most trends in method, systematic review has engendered a critique focused on claims made for it as a more objective method for summing up research findings than other kinds of reviews. As proposed in this paper, the objectivity claimed for systematic review is challenged by an alternative understanding of it as a highly subjective, albeit disciplined, engagement between reviewers – conceived as resisting readers – and research reports, conceived as resistant texts.
On the system in, and objectivity of, systematic review
What makes a review systematic (as opposed to unsystematic) is the use of an explicit and auditable protocol for review. As typically described in instructional literature on systematic review (e.g. Higgins & Green 2006) and in the volumes of published reports of systematic reviews (e.g. see the ‘Review’ section of Journal of Advanced Nursing), this protocol demarcates the problem for which the review is undertaken; the one or more research purposes or questions addressing that problem; the reports of research that will be reviewed; the mechanisms that will be used to search for, select, and retrieve these reports; and, the techniques that will be used to appraise these reports and to analyse and synthesize the findings in them. Reviewers organize their reports of systematic reviews to conform to this sequence of stages.
If what makes a review systematic is adherence to a protocol, what makes a review unsystematic is simply that it does not adhere to a protocol. The fact that a review is unsystematic, however, does not make it a less worthy review than one that is systematic. For example, reviews for research require only that the literature selected be relevant to the case being made for a proposed or completed study, that no relevant report be excluded, and that the literature reviewed be accurately represented in making that case. There is no mandate to be systematic, that is, to move through the stages prescribed for a systematic review of research, in reviews for research (Maxwell 2006). Other types of reviews of research, such as state-of-the-science/state-of-the-art reviews, also do not require adherence to an explicit review protocol for the selection and treatment of literature.
Yet, the term systematic review is used to convey something more than the use and communication of a prescribed system to conduct reviews of research. Indeed, the term is misused in either/or comparisons aligning unsystematic or impressionistic reviews with so-called narrative or qualitative reviews. Such misalignments fail to distinguish: (a) between systematic (i.e. protocol-driven) and unsystematic (i.e. non-protocol-driven) reviews; (b) among types of studies reviewed; and, (c) among methods of analysis and synthesis used. Reviews of qualitative, quantitative and/or mixed-methods studies (comparison b) may be systematic or unsystematic (comparison a) and systematic or unsystematic reviews (comparison a) of qualitative, quantitative, or mixed-methods studies (comparison b) may consist of one or more qualitative and/or quantitative analysis and/or synthesis methods, such as metasummary, metasynthesis and meta-analysis (comparison c). The deployment of the terms qualitative or narrative – to signal unsystematic reviews – or their deployment with the term systematic, to designate reviews in which quantitative meta-analyses could not be conducted (e.g. Duedahl et al. 2006), reproduces the erroneous ideas that: (a) systematic reviews are more worthy than unsystematic reviews; (b) qualitative or narrative reviews are always non-methodical and defaulted to only when quantitative reviews are not possible; and that (c) taking P-values as reported in quantitative studies is a type of qualitative review in the same plane as the use of one or more qualitative methods or techniques (e.g. taxonomic analysis, meta-ethnography) to analyse and synthesize findings (Bushman & Wells 2001).
Systematic reviews (especially when conceived as involving the use of quantitative methods to synthesize quantitative findings) continue to be promoted for their greater objectivity than unsystematic reviews. Procedural objectivity (i.e. adherence to an auditable protocol) is presumed to optimize the validity of review outcomes, or to yield a closer approximation to ‘reality’ via the control or minimization of bias. Procedural objectivity, however, does not remove the subjectivity of the process, nor does it even guarantee the transparency or replicability of review outcomes claimed to distinguish systematic from unsystematic review (MacLure 2005). The only thing transparent and reproducible is adherence to a prescribed protocol for conducting reviews. Although systematic reviews are by definition methodical in that they mandate adherence to an orderly and communicable system for conducting them, no one method, nor one execution of any one of these methods, is used to conduct any one of the stages prescribed for them.
The activities constituting each stage of the systematic review process and its outcomes vary with reviews and reviewers. For example, the vagaries of searchers, search engines and information databases virtually ensure that no two searches, even if conducted during the same period and under identical search criteria, will yield the same reports (Sandelowski & Barroso 2007). Systematic reviews ostensibly addressing the same research question will not necessarily include the same reports or come to the same conclusions (Ezzo et al. 2001, Campbell et al. 2003, Linde & Willich 2003). Owing to the lack of consensus on what constitutes quality, the controversy surrounding the proper use of quality criteria in systematic reviews, and the sheer volume and diversity of checklists and guides available to appraise quality (e.g. West et al. 2002, Ogilvie et al. 2005, Pawson 2006), quality appraisal ends up being a largely idiosyncratic affair that operates against claims for systematic review of transparency, replicability and control of bias.
In short, like any other literature review, systematic reviews reflect the perspectives, preferences and propensities of reviewers in the very way that they conceive problems, pose research questions, select the reports of studies that will be reviewed, treat these reports, and compare and combine the findings in them. Systematic reviews are procedurally objective in that the steps taken are communicable and, therefore, repeatable as steps, but the objectivity of review outcomes ultimately resides in a disciplined subjectivity. The outcomes of systematic reviews are as ‘situated, partial and perspectival’ (Lather 1999, p. 3) as any other human activity.
Resisting readers and resistant texts
To understand the partiality of systematic review requires recognizing it as an engagement between reviewers – conceived as resisting readers – and research reports, conceived as resistant texts. As shown in ethnographic studies, and in critiques and reflexive accounts, of the systematic review process (Traynor 1999, Mykhalovskiy 2003, MacLure 2005, Moreira 2007, Sandelowski et al. 2007b), what is typically hidden in claims of the greater objectivity of systematic review are the reading and writing practices that define the process. Research reports are generally viewed as indexes of the studies conducted. The findings in these reports are generally conceived as indexes of the experiences or events researchers studied, and the results of systematic review, as indexes of these findings. Research reports are treated in systematic reviews as sources of extractable and ultimately synthesizable data that are seen to represent the experiences and events under study. Yet, reports, the findings in them, and the results of systematic reviews are also texts produced in the varied reading and writing practices constituting inquiry. The systematic review enterprise is ‘teeming with texts’ (Mykhalovskiy 2003, p. 332) that are read, re-read, re-written or never read at all.
Reviewers of research arguably exemplify the category of reader referred to as the ‘resisting reader’ (Fetterley 1978, Traynor 1999) as they are obliged to maintain a critical stance toward the research reports available for review in a domain of inquiry. Reviews of literature have been described as gatekeeping, policing and, ultimately, political enterprises (Lather 1999, MacLure 2005) whereby reviewers decide what reports are relevant to a review and, if deemed relevant, worthy to include in that review. As they encounter the volume of reports typically generated by a multi-channel search of multiple databases, reviewers continually adjust research questions, search terms, and selection criteria in order to claim comprehensiveness within the search and selection parameters they themselves created. Systematic reviews are labour-intensive, making it critical that the numbers of reports not exceed the resources available to review them. Reviewers become ‘reluctant readers’ (MacLure 2005, p. 399) when they legitimate not reading certain reports. Reviewers, thereby, actively shape what comes to be seen as the body of research in a field while simultaneously preserving the system in systematic review, that is, methodically accounting for their decisions to read or not to read the reports retrieved (Sandelowski et al. 2007b). Before they even arrive at the stage of a systematic review where findings are reduced via synthesis, reviewers will have already reduced – via their reluctance to read – the volume of findings to be synthesized. Reviewers reduce the ‘information anxiety’ (Harrison 1996, p. 224) that initially generated the urgent call for systematic reviews in the health and social sciences, ironically by reducing the number of findings requiring analytic reduction.
Reviewers may be seen to resist also by virtue of their use of checklists, standards and appraisal guides to determine which of the reports of studies retrieved as relevant to the purpose of a review will be excluded or, if included, will be treated in sensitivity and other post-hoc analyses as high- vs. low-quality studies. Research reports are written in a prescribed style intended to persuade readers that studies were conducted according to prescribed rules (Sandelowski 2003). Quality criteria enable reviewers legitimately to resist any claim to credibility made in reports they judge to be unwarranted.
As a consequence of reader reluctance to read at all and resistance to what they do read, systematic reviews often end up as rather ‘empty…reviews’ (Lang et al. 2007), that is, consisting of only a fraction of the reports retrieved. Although it makes systematic review possible, reader resistance undermines claims to minimizing selection bias. This built-in selection bias is disguised as relevance and quality appraisal. The necessarily judgmental character of the process is masked by rhetorical devices (Sandelowski 2003) that lend the process its veneer of objectivity. Most notable among these are the enumerated tables and graphs tracking the numbers of hits per database searched and the attrition in numbers of reports included per reason for exclusion. What is showcased here is a procedural objectivity – the auditing of process – not the impartiality of the process or its outcome.
Resisting to include reports
Acts of reader resistance typically operate to exclude reports, but they may also result in the inclusion of reports at risk for exclusion on the grounds of poor quality. For example, a descriptive study presented methodologically as a grounded theory study may be resisted, that is, excluded as a poor grounded theory study, or it may be re-read as a descriptive study and included. Both of these options exemplify resisting readers, but whereas the first reader resists a methodological claim by taking it at its word, the second reader resists by reading against, or re-writing, the claim to bring an account of method into alignment with its practice (Sandelowski & Barroso 2007).
Resisting by appeal to the qualitative/quantitative divide
Appeals to idealized distinctions between qualitative and quantitative research may also be seen as acts of reader resistance. The move to incorporate qualitative research findings into evidence-based practice has generated a more inclusive understanding of evidence, but it has done so primarily by reproducing accounts of qualitative and quantitative research as representing two contrasting modes of inquiry and by assuming that descriptions of method reflect the practice of method. Although the ‘quantitative/qualitative iconography’ (Law 2004, p. 4) is integral to the discourse of research, qualitative and quantitative research consist of too much within-group diversity and between-group similarity to sustain such a binary conception. The terms are variously used to describe highly disparate entities, including paradigms, methodologies, data, sampling, data collection and analysis and interpretive and representational techniques.
In the case of systematic review, the qualitative/quantitative iconography is used to justify excluding reports of studies solely on the basis of methodology. Not so long ago, qualitative research was simply excluded a priori from systematic reviews (i.e. not read at all) largely because it was perceived as yielding weak evidence. With the advent of a spate of publications promoting the strength and value of evidence produced from qualitative research (e.g. see a review of these publications in Sandelowski 2004) and the development of methods to synthesize qualitative research findings (e.g. Paterson et al. 2001, Campbell et al. 2003, Sandelowski & Barroso 2007), reports of quantitative studies are now also subject to a priori exclusion from systematic reviews.
One rationale typically offered for the a priori exclusion of quantitative studies is that they address aspects of target phenomena different from qualitative studies (Barbour & Barbour 2003). Reviewers here resist reading quantitative reports on the grounds that reports of studies measuring participant attitudes toward, beliefs about, and responses to events have no thematic overlap with studies interpreting how participants construct and live these events. A systematic review can, therefore, be seen legitimately to exclude quantitative studies outright, even though they address the same domain of inquiry (e.g. medication adherence, end-of-life caregiving) as qualitative studies.
A second rationale typically offered for the a priori exclusion of quantitative studies is that only qualitative studies can access participants’ experiences from their own perspectives (e.g. Downe et al. 2006). Reviewers here resist reading on the basis of idealized depictions of methods that are often not achieved in practice. Because methods become what they are in the hands of users, no one method by itself can be said to be more privileged in its capacity to achieve certain goals (e.g. to elicit more authentic experiences) than other methods without regard to how it is practised. For example, qualitative studies with findings in the form of surveys of data may offer descriptions at the same depth and fidelity of understanding as quantitative research (Sandelowski et al. 2007a).
A third rationale offered is that qualitative and quantitative research findings are too different to be managed in the same review. Here aggregative synthesis is aligned with quantitative research, interpretive synthesis is aligned with qualitative research, and the former mode of synthesis is deemed inappropriate for the latter (e.g. Noblit & Hare 1988). When synthesis is conceived as an aggregation of findings, research results deemed to replicate each other are literally summed up, as in meta-analysis, vote counting and metasummary (Voils et al. 2008). Repetitive findings are pooled, or assimilated, into each other. In contrast, when research synthesis is conceived as a configuration or ‘mosaic’ (Hammersley 2001, p. 548) of findings, research results deemed not to replicate, but rather to complement, challenge, or otherwise relate to, each other are configured into a coherent whole, for example, a theoretical model, meta-narrative, or line of argument. Findings are meshed, as opposed to merged (Noblit & Hare 1988, Campbell et al. 2003, Greenhalgh et al. 2005, Voils et al. 2008).
Yet all modes of synthesis, whether by aggregation or configuration, are inescapably interpretive as they consist of reviewers’ treatments and re-renderings of the research findings in reports (which themselves consist of researchers’ interpretive renderings of data obtained from research participants). Moreover, although synthesis by aggregation is based on a ‘quantitative’ logic whereby findings deemed to replicate each other are summed up and the findings with the largest magnitude are considered to have the most evidence for their existence, that logic may be entirely appropriate for qualitative findings. This is especially the case for qualitative findings in the form of surveys of data that are similar in form to surveys of data in quantitative reports (Sandelowski et al. 2007a). Similarly, although synthesis by configuration (which is what those advocating ‘interpretive’ synthesis mean) is based on a ‘qualitative’ logic whereby findings deemed to relate to each other in ways other than replication are assembled into coherence via theoretical models, conceptual maps, meta-narratives and the like, that logic may be entirely appropriate for quantitative research findings. This is especially the case for quantitative findings that are so disparate that they resist aggregation. Quantitative methods of synthesis require that at least two relationships produced by techniques meeting statistical assumptions and deemed to measure the same variables in the same way be present to produce a synthesis (because quantitative synthesis implies at least two numbers to sum up). Furthermore, a set of aggregated findings may, by virtue of having been aggregated, permit their configuration into, for example, a theoretical model or conceptual map.
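The ‘quantitative’ logic of aggregation described above can be illustrated with a minimal sketch. It is offered only as an illustration under stated assumptions and not as a method taken from any of the works cited: the function names (vote_count, pool_fixed_effect) are hypothetical, the pooling shown is a simple fixed-effect, inverse-variance weighted mean, and the numbers in the usage lines are invented. The sketch also makes concrete the requirement that at least two comparable findings be present before anything can be summed up.

```python
import math


def vote_count(directions):
    """Tally findings by sign: > 0 positive, < 0 negative, 0 null."""
    tally = {"positive": 0, "negative": 0, "null": 0}
    for d in directions:
        if d > 0:
            tally["positive"] += 1
        elif d < 0:
            tally["negative"] += 1
        else:
            tally["null"] += 1
    return tally


def pool_fixed_effect(effects, variances):
    """Inverse-variance weighted mean of effect sizes and its standard error.

    Assumes the reviewer has already judged the findings to measure the
    same relationship in the same way; that judgment is not automated here.
    """
    if len(effects) != len(variances):
        raise ValueError("each effect needs exactly one variance")
    if len(effects) < 2:
        raise ValueError("aggregation needs at least two findings")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


# Illustrative (invented) findings from three hypothetical reports.
tally = vote_count([0.2, 0.4, -0.1])
pooled, se = pool_fixed_effect([0.2, 0.4], [0.04, 0.04])
```

Nothing in the sketch decides which findings replicate each other; that decision is made by the reviewer before any number is entered, which is where the interpretive work of aggregation lies.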
In short, the distinction drawn here is not between idealized depictions of qualitative and quantitative research findings, or between aggregation for quantitative findings vs. interpretation for qualitative findings, but rather between findings that lend themselves to, enable, or may only permit aggregation or configuration. When conceived as the difference between aggregation and configuration, it is harder to resist reading studies simply because they are ‘either’ qualitative or quantitative. That is, it is easier to contemplate a set of N = 1 quantitative findings as configurable with each other and sets of repetitive qualitative findings as assimilable with each other.
Appeals to the qualitative/quantitative divide are also used to justify excluding qualitative studies in a domain of inquiry from systematic reviews of exclusively qualitative studies in that domain. The rationale offered here is that because sampling in qualitative research is purposeful, any other sampling frame would violate the imperatives of qualitative research (Barbour & Barbour 2003). Purposeful sampling would further reduce the number of reports for review.
All of these appeals to the qualitative/quantitative divide arguably represent acts of reader resistance whereby reviewers defend, albeit systematically and transparently, not reading reports of studies on the grounds of idealized accounts of methodological differences. Yet, these appeals are undermined whenever descriptions of methods are not in accord with the practice of methods, as evident in the reports of studies. Indeed, readers of reports of systematic reviews containing these appeals may themselves become resisting readers by refusing to accept these appeals as valid.
Acts of resistance are not confined to reviewers conducting systematic reviews; research reports may themselves be conceived as resistant actors. Moreover, reviewers may see their own resistance to texts, not as stemming from themselves but rather from the texts. For example, the very emphasis in qualitative research on the complexities and contradictions of individual experience is seen to make qualitative research findings as resistant to synthesis as are poems (Sandelowski & Barroso 2007).
To conceive research reports as resistant texts requires understanding research reports as after-the-fact reconstructions of studies styled to confer order on what is in actuality a rather disorderly, messy undertaking, namely empirical research (Bazerman 1988, Law 2004). By virtue of this stylized order, research reports are intended to persuade readers that valid science was conducted (Traynor 1999, Sandelowski 2003). Yet this stylized order is not amenable as given to the work of reviewing. Information is typically not presented in the form required for analysis and synthesis, or, owing to publication page limits and other constraints, information reviewers deem necessary may not be presented at all. Reviewers often have to contact authors to obtain this information. The work of reviewing, therefore, entails reconstructing these texts to make them pliable to the review process. Review outcomes are produced from textual reconstructions of researchers’/writers’ textual reconstructions of the words and deeds of research participants, which were themselves constructions of whatever experience or event was under investigation.
Data are never simply extracted from reports, but rather reviewers decide what will constitute the data for extraction. Reports are ‘deconstructed’ in order to ‘reconstruct’ the information in them in a standardized format (Harden et al. 2004, p. 796). Extracting information from reports of research is hardly the uncomplicated affair it appears to be in reports of systematic reviews (Sandelowski & Barroso 2007). Indeed, a recurring explanation for the lack of uptake of research findings into practice is the difficulty of reading research reports (e.g. Retsas 2000).
Moreover, data are never simply ‘extracted intact’ (MacLure 2005, p. 394) and then used as given. Instead, they are transformed, transposed, converted, tabulated, graphed or otherwise manipulated, modified and reconfigured to enable comparison and combination. They are ‘disentangled’ from the reports in which they are ‘embedded’ and then ‘entangled’ with the ‘machinery’ of systematic review (Moreira 2007). These data must, paradoxically, be ‘distorted into clarity’ (Law 2004, p. 2), to make them usable for synthesis. To facilitate comparison and combination, qualitative data may be quantitized and quantitative data may be qualitized. Different metaphors and concepts are translated into each other. Effect sizes are calculated from different statistical expressions of results (Voils et al. 2008). The results of these operations are further re-assembled, enumerated, narrated, tabulated, funnel- and forest-plotted, or otherwise inscribed as the results of systematic review (Latour & Woolgar 1986).
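The calculation of effect sizes from different statistical expressions of results can be made concrete with a brief sketch. It is an illustration only and not a procedure drawn from the works cited: the function names are hypothetical, and the conversions are the commonly used textbook formulas, which assume two independent groups (for the t statistic) and roughly equal group sizes (for the correlation conversions).

```python
import math


def d_from_t(t, n1, n2):
    """Standardized mean difference from an independent-groups t statistic."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def d_from_r(r):
    """Standardized mean difference from a correlation coefficient."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 2.0 * r / math.sqrt(1.0 - r * r)


def r_from_d(d):
    """Correlation coefficient from a standardized mean difference,
    assuming groups of roughly equal size."""
    return d / math.sqrt(d * d + 4.0)


# Two invented results, reported differently, placed on a common metric.
d_a = d_from_t(2.0, 8, 8)  # reported as a t statistic with two groups of 8
d_b = d_from_r(0.6)        # reported as a correlation
```

Each conversion rests on assumptions that a report may or may not meet, so even this apparently mechanical step requires the reviewer to decide how a given result is to be read.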
In short, reports do not simply yield their findings; rather, reviewers make them yield. Reviewers make ‘resistant’ texts ‘docile’ to review (Moreira 2007, p. 8).
Toward more mindful systematic reviews
In this unsystematic review of systematic review, I have summarized an alternative understanding of systematic review as a highly disciplined yet still inherently subjective interaction between resisting readers and resistant texts. This understanding contrasts with the view of systematic review as an objective method for summing up research findings, which are themselves conceived as indexes of the events and experiences studied. Judging by the burgeoning literature finding fault with published reports of systematic reviews (e.g. Gebel et al. 2007), these reports themselves constitute resistant texts and the readers of these reports, yet another group of resisting readers.
I am not proposing that the typical view of systematic review is inferior to, or that it be replaced with, the textual view featured here. Indeed, systematic review (and empirical research in general) would have no purpose in the practice disciplines if it were not possible to draw practice conclusions from knowledge held to be provisionally ‘true’. Instead, I am proposing a more mindful, or reflexive, understanding of the reading and writing practices that define systematic review. Such an attitude resists accounts of systematic review that reproduce unwarranted and divisive methodological claims. Such an attitude still holds truth and objectivity as regulative ideals, but is aware of the reading and writing practices that both enable and challenge those ideals.
This paper was made possible by a grant from the National Institute of Nursing Research, National Institutes of Health (5R01NR004907, June 3, 2005–March 31, 2010) to develop methods for ‘Integrating qualitative and quantitative research findings’.