• natural language processing;
• semantic analysis;
• terminology

Specialized medical ontologies and terminologies, such as SNOMED CT and the Unified Medical Language System (UMLS), have been successfully leveraged in medical information systems to provide a standard web-accessible medium for interoperability, access, and reuse. However, these clinically oriented terminologies and ontologies cannot provide sufficient support when integrated into consumer-oriented applications, because these applications must “understand” both technical and lay vocabulary, and lay vocabulary is not part of these specialized resources. In this article, we propose a two-step approach for building consumer health terminologies from text: 1) automatically extracting definitions from consumer-oriented articles and web documents, which reflect language in use, rather than relying solely on dictionaries, and 2) learning to map definitions expressed in natural language to terminological knowledge by inducing a syntactic-semantic grammar rather than using hand-written patterns or grammars. We present quantitative and qualitative evaluations of this two-step approach, which show that our framework could be used to induce consumer health terminologies from text.