Abstract

  1. Top of page
  2. Abstract
  3. 1. INTRODUCTION
  4. 2. METHODOLOGY
  5. 3. RESULTS
  6. 4. CONCLUSIONS
  7. References

The study investigates whether legal expertise helps searchers find more relevant documents during the discovery phase of a request for information.


1. INTRODUCTION

The TREC 2007 Legal Track Interactive Task Challenge [9] involved five hypothetical legal “complaints” based on some facet of tobacco litigation. Each complaint included a request to produce relevant documents. These document production requests were broadly worded to force the opposing party to provide a maximum number of responsive documents during discovery. The resources for document production were two databases containing the tobacco litigation documents released under the terms of the Master Settlement Agreement (MSA) between the Attorneys General of several states and seven U.S. tobacco organizations. These two databases, the Legacy Tobacco Documents Library (LTDL) and Tobacco Documents Online (TDO), contain around 7,000,000 documents. The majority of these documents are not legal publications like cases, statutes, or regulations; the databases include scientific studies, corporate correspondence, periodical articles, news stories, and a mix of litigation documents.

Finding relevant documents in large databases is easier said than done. Studies have shown that researchers tend to overestimate the effectiveness of online retrieval. In their 1985 study on retrieval effectiveness, Blair and Maron showed that attorneys who were confident they had located at least 75% of the relevant documents actually had a success rate of about 20% [5]. Their findings had a major impact on information retrieval evaluation, especially of operational systems. In a sequel article, Blair [2] reflected on the impact of their study. Dabney [6], Bing [1], and Schweighofer [8] provide in-depth reviews of the problems of full-text searching for legal information and suggest solutions.
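The gap between perceived and actual retrieval success is usually expressed in terms of recall. The following is a minimal sketch of how recall and precision can be computed for a single search, assuming that the full set of relevant documents is known. The document identifiers are hypothetical and chosen only so that recall works out to 20%; they are not data from any of the studies cited here.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant documents that the search retrieved."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("recall is undefined when there are no relevant documents")
    return len(set(retrieved) & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        raise ValueError("precision is undefined when nothing was retrieved")
    return len(retrieved & set(relevant)) / len(retrieved)


# Hypothetical collection: ten relevant documents, of which the search finds two,
# along with two documents that are not relevant.
relevant_docs = [f"rel{i}" for i in range(1, 11)]
retrieved_docs = ["rel1", "rel2", "other1", "other2"]

search_recall = recall(retrieved_docs, relevant_docs)
search_precision = precision(retrieved_docs, relevant_docs)
```

A searcher who sees only the retrieved list can judge precision directly, but has no direct view of the relevant documents that were missed, which is one reason recall tends to be overestimated.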

In the past twenty years, the functionality of full-text document-retrieval systems has improved, but more evaluation of information retrieval effectiveness is needed. Attorneys and their support staff must recognize that effective information retrieval in today's complex litigation requires a variety of tools and approaches, including a combination of automated searches, sampling of large databases, and team-based review of the results.

2. METHODOLOGY

2.1 Task

As part of a class exercise, six groups of MLIS students at the University of Washington Information School were asked to search for relevant documents addressing three topics designed for the Legal Track Interactive Task Challenge. The topics searched were:

  • Topic 7: All documents discussing, referencing, or relating to company guidelines, strategies, or internal approval for placement of tobacco products in G-rated movies;
  • Topic 45: All documents that refer or relate to pigeon deaths during the course of animal studies;
  • Topic 51: All documents referencing or regarding lawsuits involving claims related to memory loss.

2.2 Search Engines

The two search engines used in the evaluation are the Legacy Tobacco Documents Library (http://legacy.library.ucsf.edu/) (LTDL) and the Tobacco Documents Online (TDO) (http://tobaccodocuments.org).

2.3 Participants

The sixteen study participants were all graduate students studying towards a Master's degree in Library & Information Science. Four of the six groups had three members; two groups had two members. Group membership was based on level of general search experience as well as familiarity with legal documents or litigation research. To avoid learning effects, the groups searched the topics in different orders (see Table 1). Searchers were instructed to focus on overall recall.

Searchers were trained using the last topic from the list of the legal TREC interactive topics (Topic 23: “All documents referencing lobbying efforts by tobacco companies against legislation (state or federal) aimed at eliminating tobacco advertising on billboards, where the document makes specific reference to the First Amendment.”). The six groups used both systems (LTDL and TDO) to familiarize themselves with the search capabilities and other features available. When appropriate, searchers created accounts in these systems to gain access to features available only to registered users.

2.4 Study Design

After the training searches were completed, the groups were directed to begin searching on the test topics in a specified order. The order of topics to be searched by groups was:

Table 1. Search Order of Topics
Group   Topics
gp1     1, 2, 3
gp2     2, 3, 1
gp3     3, 1, 2
gp4     1, 3, 2
gp5     2, 1, 3
gp6     3, 2, 1
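The design in Table 1 can be checked with a short sketch. It assumes the rows are read as six groups (gp1 to gp6), each assigned an ordering of three topic positions numbered 1 to 3. The names `SEARCH_ORDER`, `covers_all_orders`, and `position_counts` are illustrative and do not come from the study.

```python
from collections import Counter
from itertools import permutations

# Table 1, read as group -> order in which the three topics were searched.
SEARCH_ORDER = {
    "gp1": (1, 2, 3),
    "gp2": (2, 3, 1),
    "gp3": (3, 1, 2),
    "gp4": (1, 3, 2),
    "gp5": (2, 1, 3),
    "gp6": (3, 2, 1),
}


def covers_all_orders(orders, n_topics=3):
    """True if every possible topic ordering is used by exactly one group."""
    expected = set(permutations(range(1, n_topics + 1)))
    observed = list(orders.values())
    return len(observed) == len(expected) and set(observed) == expected


def position_counts(orders):
    """Count how often each topic appears in each search position (0-based)."""
    counts = Counter()
    for order in orders.values():
        for position, topic in enumerate(order):
            counts[(position, topic)] += 1
    return counts


fully_counterbalanced = covers_all_orders(SEARCH_ORDER)
counts = position_counts(SEARCH_ORDER)
```

Under this reading, the six groups use each of the six possible orderings once, so every topic is searched first, second, and third by exactly two groups.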

The groups were asked to organize their search effort in the following way:

  • (1) For each of the two systems (LTDL, TDO):
  • (2) Each searcher should become familiar with one topic at a time;
  • (3) Each searcher should develop search statements for the topic and for each system;
  • (4) Each searcher should conduct searches in each system and record the search statements, the number of records retrieved, and the first 20-30 records;
  • (5) Each group will meet to discuss each member's search strategies, compare what was retrieved, and decide how to refine the search before conducting a final search;
  • (6) Each group will then perform a final search using one or more search statements and select up to 100 relevant documents for the final document pool. At this stage groups can search together as a team using any arrangement that suits them;
  • (7) Submit the final “search statement(s)” used to retrieve the documents;
  • (8) Submit up to 100 relevant documents per topic;
  • (9) Searching will be performed using the two systems (LTDL, TDO);
  • (10) Team submissions are to be made by document identifier (Bates number).
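The pooling step in this procedure, in which individual result lists are combined into a single group submission of at most 100 documents, can be sketched as follows. This is only an illustration under stated assumptions: the study does not describe how groups actually merged their lists, the round-robin strategy is one possible choice, and the function name `pool_results` and the identifiers shown are hypothetical placeholders rather than real Bates numbers.

```python
from itertools import zip_longest


def pool_results(member_results, limit=100):
    """Merge ranked lists of document identifiers into one de-duplicated pool.

    Lists are interleaved rank by rank (round-robin) so that no single
    searcher's results dominate, and the pool stops growing at `limit`.
    """
    pooled = []
    seen = set()
    if limit <= 0:
        return pooled
    for rank_slice in zip_longest(*member_results):
        for doc_id in rank_slice:
            # zip_longest pads shorter lists with None; skip the padding.
            if doc_id is None or doc_id in seen:
                continue
            seen.add(doc_id)
            pooled.append(doc_id)
            if len(pooled) == limit:
                return pooled
    return pooled


# Hypothetical result lists from a three-member group.
searcher_a = ["ID-001", "ID-002", "ID-003"]
searcher_b = ["ID-002", "ID-004"]
searcher_c = ["ID-005"]

submission = pool_results([searcher_a, searcher_b, searcher_c])
capped = pool_results([searcher_a, searcher_b, searcher_c], limit=3)
```

In the study the groups also judged relevance before submitting, so a real workflow would filter the pooled list by the group's relevance decisions before applying the 100-document cap.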

3. RESULTS

Groups UW1 and UW2 consisted of law librarianship students, all of whom had a JD degree as well as professional experience as lawyers or legal search professionals. Groups UW3-6 consisted of first-year MLIS students without much legal search experience.

The results, scored using the Legal Track method, are presented in Figure 1. They show that UW6, a group of first-year MLIS students without any legal training, outperformed the other groups.

Figure 1. TREC-based relevance results per group.

4. CONCLUSIONS

The selected results presented here show that a group of searchers without domain expertise outperformed all other groups. That group placed first, while second and third places were taken by the groups with domain experts. Could this be an outlier? Further analysis will shed light on the issue.

References
