Introduction

At an early point in the development of evidence-based health care, its advocates came to realize that the way evidence, particularly quantitative data, is presented has a major bearing on how it is interpreted and implemented. Thus, Fahey and colleagues sent 182 health-care executives and non-executives details of a hypothetical mammography programme and a cardiac rehabilitation programme.1 They presented the same results of research evidence in four different ways. Decisions on which programmes to fund were significantly influenced by the way in which data were presented. This and similar research has helped to stimulate enthusiasm for the ‘number needed to treat’ as an intuitive and clinically meaningful measurement of treatment benefit.2

Does this original experience have implications for evidence-based library and information practice? Would librarians and information officers make different decisions about everyday practices if presented with data in a different way from that to which they are accustomed?

Desperately seeking evidence?

Systematic reviews increasingly feature on the agenda of the health librarian.3 Large numbers of information specialists are employed as members of multi-disciplinary research teams to support systematic review activities. At the same time, health librarians in general are expected to advise on whether a published systematic review has taken reasonable steps to identify all relevant evidence. Of course, ‘reasonable’ is a subjective concept that is difficult to quantify. Certainly, early held preconceptions suggested that such searches should be exhaustive, if not in the strict sense of interrogating multiple data sources, then in the resultant physical and mental state of the hard-working information specialist!

More recent developments within health technology assessment have led to recognition that it is not always possible (or indeed desirable) to expend considerable resources in the pursuit of diminishing returns from the evidence.4 Time and funding for systematic searching is usually finite. In many cases, ‘good enough’ is regarded as an acceptable substitute for the ideal. Here again, ‘good enough’ is both subjective and elusive.

Current controversies

Not everyone subscribes to the notion of ‘good enough’. The well-publicized incident at Johns Hopkins University is used by some to emphasize the importance of exhaustive searching.5 Just as some clinicians refuse to acknowledge the concept of ‘medical futility’ (i.e. a point at which therapy should not be performed because available data show that it will not improve the patient's medical condition),6 so some librarians will not recognize ‘bibliographic futility’ (i.e. a point at which further literature searching should not be performed because available data show that it will not affect the overall result of retrieval).

There is also a need for agreement on what constitutes a worthwhile outcome. Librarians tend to focus on ‘numbers of items retrieved’, with the implicit assumption that, if the citations identified are topically relevant, then they are worth finding. More important for the review team is the concept of ‘appropriateness’: that the retrieved references are not simply on the right topic but are also eligible for inclusion in the review. Of course, such a viewpoint is not only fundamentally pragmatic, but can only be evaluated in retrospect—‘was this item of evidence included in the final review?’ An even less forgiving verdict would be ‘did this item of evidence make a difference to the overall finding of the review?’ Such a measure resonates with the medical field, where a research finding must not only be ‘statistically significant’ but should also be ‘clinically significant’, that is, capable of making a difference in practice.7 So it is not entirely unreasonable to argue that, if a reference will not make a difference, then it is not ‘worth’ retrieving. Certainly, this feels a sounder starting position than relying on theoretical concepts such as sensitivity and specificity or recall and precision.

Finally, we should recognize that we are dealing with considerable uncertainty—we never know how large the total population of studies on a particular topic actually is. As a consequence, we never know how close we are to this total nor can we tell how many studies are missing when we decide to ‘call off the search’. Nevertheless, this applies equally to any disease where clinicians must rely on best available estimates of incidence and prevalence. Such similarities are readily apparent, as methods such as the ‘capture–recapture technique’ have been used to estimate the size of unknown diseased populations and of unknown populations of potentially relevant studies.8
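To make the parallel concrete, the sketch below applies the simple Lincoln–Petersen capture–recapture estimate to two overlapping searches. The counts, and the Python formulation, are ours and are invented purely for illustration; they are not drawn from reference 8.

    # Simple Lincoln-Petersen capture-recapture estimate of the total number of
    # relevant studies, based on two searches and their overlap.
    # All counts below are hypothetical.
    def capture_recapture(found_by_first: int, found_by_second: int, found_by_both: int) -> float:
        """Estimate the size of the total population of relevant studies."""
        return found_by_first * found_by_second / found_by_both

    # e.g. 60 studies found by database searching, 40 by reference tracking, 25 by both
    estimated_total = capture_recapture(60, 40, 25)   # 96.0
    unique_found = 60 + 40 - 25                       # 75
    print(f"Estimated total: {estimated_total:.0f}; studies still missing: {estimated_total - unique_found:.0f}")

On these invented figures, roughly 21 eligible studies would be predicted to remain unfound, which is exactly the kind of estimate that can inform a decision to ‘call off the search’.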

Scenario

To investigate these and associated issues we need a realistic scenario and a small and usable data set.

You are planning a systematic review on a health management topic with limited resources. Your review team questions the usefulness of different search methods that are typically included in a review protocol. ‘Do we have to use them all and are they all equally valuable?’, they ask. You decide to examine the evidence. You retrieve a recent article which offers an opportunity to make comparisons across different methods9 (Table 1).

Table 1. Yield of different search methods (adapted from Greenhalgh and Peacock9)

Search method                           Total (n = 495)
Protocol driven:                        150 (30)
 Electronic database search             126 (25)
 Hand search (32 journals)               24 (5)
‘Snowballing’:                          252 (51)
 Reference tracking                     218 (44)
 Citation tracking                       34 (7)
Personal knowledge:                     119 (24)
 Sources known to research team          85 (17)
 Social networks of research team        29 (6)
 Serendipitous                            5 (1)
Raw total, including double counting    521 (105)
Total in final report                   495 (100)

Figures in brackets are percentages. Percentages add up to 105 because some items were located by more than one method.

We can first see how the figures in Table 1 relate to our earlier discussion of different types of outcome. In the above study, the authors initially scanned over 6000 electronic abstracts and book titles (number of items retrieved). They then selected 1004 full-text books or papers (appropriateness) for further analysis. After appraising all of these for quality and relevance, they finally cited 495 (items that make a difference).

Calculating the results

If we work from the approximate figure of 6000 citations in the Greenhalgh and Peacock study,9 then we have a denominator against which we can calculate a more meaningful metric. How many citations do we need to retrieve by one method, compared with all other methods, to yield one additional reference for use in the review? Intuitively, the formula is 1 divided by the difference between the ratio of relevant to non-relevant citations for our chosen method and the corresponding ratio for all other methods combined. Note that, for this example, we assume that citations are distributed across methods in proportion to the relevant references each method yields (i.e. a method that finds 25% of the relevant references is assumed to account for 25% of the total citations). This is obviously an over-simplification, as greater numbers of citations are typically retrieved through electronic database searches than through other methods.

So, starting with the electronic database search, we calculate 1/|(126/1374) − (395/4105)| = 1/|0.0917 − 0.0962| = 1/0.0045. You therefore have to retrieve approximately 221 citations (or 222 if the intermediate ratios are rounded to four decimal places, as above) by electronic database searching, compared with other methods, to find one additional reference for use in the review. The number needed to retrieve (NNTR) is thus 221 in this example (Table 2).
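For readers who prefer to check the arithmetic, a minimal sketch of this calculation in Python is given below; it simply restates the figures for the electronic database search already used above.

    # NNTR for the electronic database search, using the figures quoted above:
    # 126 relevant vs. 1374 non-relevant citations for this method, against
    # 395 relevant vs. 4105 non-relevant for all other methods combined.
    relevant_method, not_relevant_method = 126, 1374
    relevant_others, not_relevant_others = 395, 4105

    ratio_method = relevant_method / not_relevant_method    # ~0.0917
    ratio_others = relevant_others / not_relevant_others    # ~0.0962

    nntr = 1 / abs(ratio_method - ratio_others)
    print(round(nntr))                                      # 221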

Table 2. Numbers needed to retrieve (NNTR) for different search methods (computed from Greenhalgh and Peacock9)

Search method                          Relevant   Not relevant   Other methods:   Other methods:   NNTR
                                                                 relevant         not relevant
Protocol driven:                          150        1650             371              3829         167
 Electronic database search               126        1374             395              4105         221
 Hand search (32 journals)                 24         276             497              5203         117
‘Snowballing’:                            252        2808             269              2671          91
 Reference tracking                       218        2422             303              3057         110
 Citation tracking                         34         386             487              5093         133
Personal knowledge:                       119        1321             402              4158         151
 Sources known to research team            85         935             436              4544         198
 Social networks of research team          29         331             492              5148         125
 Serendipitous                              5          55             516              5424         236
Total                                     521        5479
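The whole of Table 2 can be regenerated in the same way from the Table 1 yields, the assumed 6000-citation denominator and the raw total of 521 references. The sketch below does so; note that a few NNTR values come out one citation higher than in the table owing to rounding.

    # Regenerate the NNTR column of Table 2 from the Table 1 yields, assuming that
    # the ~6000 citations screened are split across methods according to each
    # method's percentage share in Table 1 (a simplification noted above).
    TOTAL_CITATIONS = 6000
    TOTAL_RELEVANT = 521                                   # raw total, including double counting
    TOTAL_NOT_RELEVANT = TOTAL_CITATIONS - TOTAL_RELEVANT  # 5479

    # (relevant references, percentage share) per method, from Table 1
    methods = {
        "Protocol driven": (150, 30),
        "Electronic database search": (126, 25),
        "Hand search (32 journals)": (24, 5),
        "Snowballing": (252, 51),
        "Reference tracking": (218, 44),
        "Citation tracking": (34, 7),
        "Personal knowledge": (119, 24),
        "Sources known to research team": (85, 17),
        "Social networks of research team": (29, 6),
        "Serendipitous": (5, 1),
    }

    for name, (relevant, percent) in methods.items():
        screened = TOTAL_CITATIONS * percent // 100        # citations assumed for this method
        not_relevant = screened - relevant
        others_relevant = TOTAL_RELEVANT - relevant
        others_not_relevant = TOTAL_NOT_RELEVANT - not_relevant
        nntr = 1 / abs(relevant / not_relevant - others_relevant / others_not_relevant)
        print(f"{name:34s} NNTR = {round(nntr)}")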

Having computed this new metric, we can begin to explore the implications of the data for citations, time and money.

Citations

The first point to observe is that, regardless of the retrieval method used, we need to retrieve very large numbers of citations to identify one extra relevant reference. Although we have used an average number of citations across methods (in reality some methods will have much larger NNTRs and others much smaller), we are still looking at retrieving between roughly 100 and 250 citations for every extra reference found. At what point do we decide that such effort is no longer worthwhile?

We also have an index measure that allows us to compare the effectiveness of different search methods; it captures both the notion of ‘yield’ and the background number of citations retrieved. An extension of this method, beyond the scope of this column, is to perform head-to-head comparisons of different retrieval methods. This would be more meaningful than comparing one method against all other methods. Such metrics could then be compiled for different topic areas and different review circumstances.

Time

A second point is that the clock is ticking, not just when we are successfully retrieving references, but also when we are proving unsuccessful. Greenhalgh and Peacock make a major contribution by attempting to record time spent on different retrieval activities.9 So, for example, they found that

‘electronic searching, including developing and refining search strategies and adapting these to different databases, took about 2 weeks of specialist librarian time and yielded only about a quarter of the sources—an average of one useful paper every 40 min of searching’.9

By comparison they also found that

‘it took a month to hand search a total of 271 journal years, from which we extracted only 24 papers that made the final report—an average of one paper per 9 h of searching’.9

While these figures were computed separately from our NNTR, they offer the intriguing possibility that, in future, reviewers might also calculate ‘time needed to retrieve’. This would allow more accurate calculation of time required for specific types of reviews as previously attempted for meta-analyses.10
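As an illustration only, a ‘time needed to retrieve’ figure can be derived from audit data of this kind. In the sketch below the working-day assumptions (7.5 hours per day, with 10 and 22 working days respectively) are ours, which is why the results land close to, but not exactly on, the 40 minutes and 9 hours quoted above.

    # Hypothetical 'time needed to retrieve' calculation from audit data of the
    # kind reported by Greenhalgh and Peacock. Working-hour assumptions are ours.
    HOURS_PER_DAY = 7.5

    def time_needed_to_retrieve(person_days: float, useful_papers: int) -> float:
        """Average searcher hours per paper that made the final report."""
        return person_days * HOURS_PER_DAY / useful_papers

    # Electronic searching: ~2 weeks (assume 10 working days) for ~126 useful papers
    print(f"{time_needed_to_retrieve(10, 126) * 60:.0f} min per useful paper")   # ~36 min

    # Hand searching: ~1 month (assume 22 working days) for 24 useful papers
    print(f"{time_needed_to_retrieve(22, 24):.1f} h per useful paper")           # ~6.9 h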

Money

Finally, as the adage goes, ‘time is money’. Typically, systematic literature searching is calculated as a set amount of staff time to which a salary cost is attributed.11 Attaching time and retrieval success metrics to individual search methods may facilitate more accurate calculation of project requirements and help in making difficult decisions about when to start, and indeed when to stop, different strategies. We may even be able to move to pricing strategies that incorporate a ‘cost needed to retrieve’ metric.
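A ‘cost needed to retrieve’ could be derived directly from such time figures. The sketch below uses an entirely hypothetical hourly staff cost simply to show the shape of the calculation; local salary and overhead figures would be substituted in practice.

    # Illustrative 'cost needed to retrieve': searcher time per useful paper
    # multiplied by an assumed (hypothetical) hourly staff cost.
    HOURLY_STAFF_COST = 25.0   # assumed cost per hour of specialist searching, in local currency

    def cost_needed_to_retrieve(hours_per_useful_paper: float) -> float:
        """Staff cost attributable to each reference that makes the final report."""
        return hours_per_useful_paper * HOURLY_STAFF_COST

    print(f"{cost_needed_to_retrieve(40 / 60):.2f} per useful paper (electronic searching)")  # 16.67
    print(f"{cost_needed_to_retrieve(9):.2f} per useful paper (hand searching)")              # 225.00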

Conclusion

This speculative investigation has several limitations. First, as already mentioned, the denominators for each method are based on an average figure across the review as a whole. If the review had reported how many of the 6000 citations were identified through each method, we would have a more accurate picture of differential yields. Next, each method is compared with all other methods combined, whereas a more useful comparison would be head-to-head between competing methods. Furthermore, the availability of suitable data dictates that the chosen review is in an area of ‘complex evidence’, making direct comparisons with health technology assessments or more conventional Cochrane systematic reviews difficult. Indeed, the data relate to a single review, making it impossible to generalize even to other reviews of ‘complex evidence’. Finally, the calculations provided above offer only one possible way to construct a useful metric that combines both yield and background prevalence. It is conceivable that a more meaningful measure remains to be discovered.

Notwithstanding such limitations, this novel way of expressing search results is potentially more intuitive and should help to advance discussions about when a particular search method is ‘worth’ initiating or ‘worth’ pursuing. Initially, this will apply to systematic reviews, but ultimately similar measures may help health librarians in decisions relating to the use, or even purchase, of different subject databases. Turning such research into practical data for day-to-day decision making is, after all, what evidence-based information retrieval is all about!12

References

1. Fahey, T., Griffiths, S. & Peters, T. J. Evidence-based purchasing: understanding results of clinical trials and systematic reviews. British Medical Journal 1995, 311, 1056-9.
2. Cook, R. J. & Sackett, D. L. The number needed to treat: a clinically useful measure of treatment effect. British Medical Journal 1995, 310, 452-4.
3. Harris, M. R. The librarian's roles in the systematic review process: a case study. Journal of the Medical Library Association 2005, 93, 81-7.
4. Royle, P. & Milne, R. Literature searching for randomized controlled trials used in Cochrane reviews: rapid versus exhaustive searches. International Journal of Technology Assessment in Health Care 2003, 19, 591-603.
5. Robinson, J. G. & Gehle, J. L. Medical research and the Institutional Review Board: the librarian's role in human subject testing. Reference Services Review 2005, 33, 20-4.
6. Bernat, J. L. Medical futility: definition, determination, and disputes in critical care. Neurocritical Care 2005, 2, 198-205.
7. Turk, D. C. Statistical significance and clinical significance are not synonyms! Clinical Journal of Pain 2000, 16, 185-7.
8. Spoor, P., Airey, M., Bennett, C., Greensill, J. & Williams, R. Use of the capture–recapture technique to evaluate the completeness of systematic literature searches. British Medical Journal 1996, 313, 342-3.
9. Greenhalgh, T. & Peacock, R. Effectiveness and efficiency of search methods in systematic reviews of complex evidence: audit of primary sources. British Medical Journal 2005, 331, 1064-5.
10. Allen, I. E. & Olkin, I. Estimating time to conduct a meta-analysis from number of citations retrieved. Journal of the American Medical Association 1999, 282, 634-5.
11. Wade, C. A., Turner, H. M., Rothstein, H. R. & Lavenberg, J. G. Information retrieval and the role of the information specialist in producing high-quality systematic reviews in the social, behavioural and education sciences. Evidence and Policy: A Journal of Research, Debate and Practice 2006, 2, 89-108.
12. Booth, A. Evidence-based perspectives on information access and retrieval. In: Booth, A. & Brice, A. (eds). Evidence-Based Practice for Information Professionals: A Handbook. London: Facet Publishing, 2004: 231-46.