Abstract

Because the information in broad-coverage knowledge bases, such as thesauri, is usually incomplete, merging information from different sources is one way to increase coverage. We propose a method for enriching a thesaurus with information acquired automatically from dictionaries. First, synonymy pairs are extracted. Then, these pairs are assigned to the most similar candidate synsets. Finally, the remaining pairs are clustered to identify new synsets. After selecting adequate experimental settings, we applied this method to enrich a Portuguese thesaurus with synonyms extracted from three dictionaries, which resulted in TRIP, a larger and broader thesaurus with new words and concepts. The steps towards the creation of this new thesaurus and its evaluation are described here.
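
The assign-then-cluster steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which similarity measure, threshold, or clustering algorithm was used. The sketch assumes Jaccard overlap between a synset and the words paired with either member of a synonymy pair, a hypothetical threshold of 0.3, and connected components as a stand-in for the clustering step. The function names `enrich` and `cluster` are likewise illustrative.

```python
from collections import defaultdict


def build_neighbours(pairs):
    """Map each word to itself plus every word it is paired with."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].update((a, b))
        neighbours[b].update((a, b))
    return neighbours


def jaccard(x, y):
    """Jaccard overlap of two sets (assumed similarity measure)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def enrich(synsets, pairs, threshold=0.3):
    """Assign each pair to its most similar candidate synset.

    Candidates are the synsets that already contain one word of the pair.
    Pairs with no candidate at or above the (hypothetical) threshold are
    returned as leftovers for the clustering step.
    """
    synsets = [set(s) for s in synsets]
    neighbours = build_neighbours(pairs)
    leftover = []
    for a, b in pairs:
        context = neighbours[a] | neighbours[b]
        candidates = [i for i, s in enumerate(synsets) if a in s or b in s]
        best = max(candidates,
                   key=lambda i: jaccard(synsets[i], context),
                   default=None)
        if best is not None and jaccard(synsets[best], context) >= threshold:
            synsets[best].update((a, b))
        else:
            leftover.append((a, b))
    return synsets, leftover


def cluster(pairs):
    """Group leftover pairs into new synsets via connected components.

    This is a simple stand-in for whatever clustering the method uses.
    """
    graph = build_neighbours(pairs)
    seen, clusters = set(), []
    for word in graph:
        if word in seen:
            continue
        component, stack = set(), [word]
        while stack:
            w = stack.pop()
            if w in component:
                continue
            component.add(w)
            stack.extend(graph[w] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Toy data, invented for illustration only.
thesaurus = [{"carro", "automóvel"}]
extracted = [("carro", "viatura"), ("feliz", "contente"), ("contente", "alegre")]
enriched, remaining = enrich(thesaurus, extracted)
new_synsets = cluster(remaining)
```

In this toy run, the pair sharing a word with an existing synset is merged into it, while the two pairs with no candidate synset are grouped into a single new synset.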