Query expansion using UMLS Tools for health information retrieval



Four new automatic query expansion strategies based on the UMLS Metathesaurus are proposed to improve the effectiveness of health information retrieval: String index with Concept expansion (SC), String index with Term expansion (ST), Word index with Concept expansion (WC), and Word index with Term expansion (WT). Results from a comparative evaluation study using a MedlinePlus dataset indicate that 1) the Mean Average Precisions (MAPs) with term-level expansion are higher than those with concept-level expansion by 5.6% for the 30 queries and 10.9% for short queries; 2) the MAPs based on the string index strategy are higher than those based on the word index by 15.5% for the 30 queries and 9.6% for short queries; and 3) String index with Term expansion (ST) has the highest MAP for both the 30 queries and short queries. These results help clarify the effectiveness of different automatic query expansion strategies that use the UMLS Metathesaurus and can inform the design of future healthcare IR systems.
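Because all four strategies are compared on Mean Average Precision, a minimal sketch of the standard MAP computation may help in reading the reported differences. The function names, document identifiers, and toy relevance judgments below are hypothetical and are not taken from the study; the sketch assumes binary relevance and averages over every query that has judgments.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked result list under binary relevance."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant document appears.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of per-query average precision over all judged queries."""
    if not qrels:
        return 0.0
    total = sum(
        average_precision(runs.get(query_id, []), relevant)
        for query_id, relevant in qrels.items()
    )
    return total / len(qrels)


if __name__ == "__main__":
    # Toy data (hypothetical): two queries, their ranked results, and judgments.
    runs = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5"]}
    qrels = {"q1": {"d1", "d3"}, "q2": {"d5"}}
    print(f"MAP = {mean_average_precision(runs, qrels):.4f}")
```

Comparing two strategies then amounts to computing this value once per strategy over the same query set and relevance judgments.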