Expediting medical literature coding with query-building


  • Effective January 23, 2011, this work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.


Manually sorting published journal articles into pre-defined subsets for qualitative analysis is common practice in social science research. The process is time-consuming, requires the attention of a subject specialist, and depends on measures of inter-rater reliability to ensure that the results are valid and reproducible enough to serve as a basis for further study. We describe steelir, a system we have implemented to identify features common to one set of PubMed® articles that distinguish it from another set. The system provides users with word-level unigram and bigram features from the article title and abstract, as well as MeSH® indexing terms, and suggests robust sample queries to find similar articles. We apply the system to the task of distinguishing original research articles on functional magnetic resonance imaging (fMRI) of sensorimotor function from fMRI studies of higher cognitive functions.
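
To make the idea of word-level unigram and bigram features concrete, the following Python sketch extracts both kinds of feature from short texts, ranks them by how much more often they occur in one set than in another, and joins the top features into a PubMed-style query string. It is only an illustration under stated assumptions, not steelir's implementation: the function names, the scoring rule (difference in document frequency between the two sets), and the toy titles in the usage note are all hypothetical. The `[Title/Abstract]` field tag is standard PubMed search syntax.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def features(text):
    """Return the set of unigrams and bigrams that occur in one text."""
    tokens = tokenize(text)
    unigrams = set(tokens)
    bigrams = {" ".join(pair) for pair in zip(tokens, tokens[1:])}
    return unigrams | bigrams


def document_frequencies(docs):
    """Count, for each feature, how many documents contain it."""
    counts = Counter()
    for doc in docs:
        counts.update(features(doc))
    return counts


def distinguishing_features(target_docs, other_docs, top_n=5):
    """Rank features by (share of target docs) minus (share of other docs).

    Assumption: this simple difference stands in for whatever
    measure a real system would use.
    """
    target_df = document_frequencies(target_docs)
    other_df = document_frequencies(other_docs)
    scores = {
        feat: n / len(target_docs) - other_df.get(feat, 0) / len(other_docs)
        for feat, n in target_df.items()
    }
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


def build_query(feats):
    """Join features into an OR query restricted to title and abstract."""
    return " OR ".join(f'"{feat}"[Title/Abstract]' for feat in feats)
```

For example, with made-up titles such as `["Motor cortex activation during finger tapping", "Finger tapping and motor cortex fMRI"]` as the target set and `["Working memory load in prefrontal cortex", "Prefrontal activation during working memory fMRI"]` as the comparison set, the bigrams "motor cortex" and "finger tapping" rank among the top features, while words shared by both sets, such as "fmri", score zero.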