Case Study: Developing the PLOS thesaurus

Abstract

Editor's Summary

A thesaurus should quietly streamline website operations behind the scenes, but it can become an obstacle if it is inadequate or falls out of date, which is the situation the Public Library of Science (PLOS) recognized in 2011. The organization sought help from Access Innovations, Inc., to bring its thesaurus up to date, comply with the Z39.19 standard and support machine categorization of content. An analysis identified over- and under-used terms and yielded recommendations and specific guidance. With this knowledge and various use cases in hand, PLOS determined that a new, customized thesaurus was needed. Access Innovations expanded and organized the original PLOS vocabulary of 3,132 terms to produce a thesaurus of more than 10,000 terms covering 10 science topics, arranged in seven levels. PLOS monitored progress weekly, and subject matter experts reviewed the draft for terminology and organization. Thesaurus terms will be applied to content as subject metadata using Data Harmony's MAIstro, a rule-based categorization system that permits editorial oversight and incremental improvements. After the rules have been tested and tuned for accuracy, the system will be used to index all PLOS content automatically.
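To make the idea of rule-based categorization concrete, the following is a minimal sketch in Python. It is not MAIstro's rule syntax and does not use PLOS's actual terms: the `RULES` table, the trigger phrases and the `index_text` function are all hypothetical, chosen only to show how editable rules can map text to thesaurus terms.

```python
import re

# Hypothetical rules: each thesaurus term maps to a list of trigger phrases.
# Illustrative only; this is not MAIstro's rule format or the PLOS thesaurus.
RULES = {
    "Genetics": ["gene", "genome", "dna"],
    "Ecology": ["ecosystem", "habitat"],
    "Neuroscience": ["neuron", "synapse"],
}


def index_text(text, rules=RULES):
    """Return the sorted thesaurus terms whose trigger phrases occur in text."""
    matched = set()
    for term, phrases in rules.items():
        for phrase in phrases:
            # Whole-word, case-insensitive match with an optional plural "s".
            pattern = r"\b" + re.escape(phrase) + r"s?\b"
            if re.search(pattern, text, re.IGNORECASE):
                matched.add(term)
                break  # one matching phrase is enough to assign the term
    return sorted(matched)


if __name__ == "__main__":
    print(index_text("Neurons were mapped in a genome study."))
```

Because the rules are plain data, an editor could add or adjust a trigger phrase and re-run the indexer on test documents, which is the kind of incremental tuning the summary describes.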
