Concept-matching IR systems versus word-matching information retrieval systems: Considering fuzzy interrelations for indexing Web pages



This article presents a semantic-based Web retrieval system capable of retrieving Web pages that are conceptually related to the implicit concepts of the query. The notion of “concept” is handled from a fuzzy point of view by means of semantic areas. In this context, the proposed system improves on most search engines that are based on word matching. The key to the system is a new version of the Fuzzy Interrelations and Synonymy-Based Concept Representation Model (FIS-CRM), which is used to extract and represent the concepts contained in both the Web pages and the user query. This model, which has been integrated into other tools such as the Fuzzy Interrelations and Synonymy based Searcher (FISS) metasearcher and the fz-mail system, uses fuzzy synonymy and fuzzy generality interrelations to represent word interrelations (stored in a fuzzy synonymy dictionary and ontologies). The new version of the model, which is based on the study of the co-occurrences of synonyms, integrates a soft method for disambiguating word senses. This method also considers the context of the word to be disambiguated, together with the thematic ontologies and sets of synonyms stored in the dictionary.
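
To make the contrast between word matching and concept matching concrete, the following is a minimal sketch and not the FIS-CRM algorithm itself. It assumes a toy fuzzy synonymy dictionary (`FUZZY_SYNONYMS`, with invented degrees), a simple rule that gives each synonym the original term weight scaled by its synonymy degree, and cosine similarity as the matching function. All names and values are hypothetical and chosen only for illustration; fuzzy generality and word-sense disambiguation are left out.

```python
import math
from collections import Counter

# Hypothetical fuzzy synonymy degrees in [0, 1]; the values are illustrative only.
FUZZY_SYNONYMS = {
    "car": {"automobile": 0.9, "vehicle": 0.6},
    "automobile": {"car": 0.9, "vehicle": 0.6},
    "vehicle": {"car": 0.6, "automobile": 0.6},
}


def word_vector(text):
    """Plain word-matching representation: term -> raw occurrence count."""
    return dict(Counter(text.lower().split()))


def concept_vector(text, synonyms=FUZZY_SYNONYMS):
    """Spread each term's weight to its synonyms, scaled by synonymy degree.

    Assumed rule (not taken from the article): a synonym receives
    weight * degree, and the maximum is kept when several terms contribute.
    """
    base = word_vector(text)
    expanded = dict(base)
    for word, weight in base.items():
        for synonym, degree in synonyms.get(word, {}).items():
            expanded[synonym] = max(expanded.get(synonym, 0.0), weight * degree)
    return expanded


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


query = "cheap car"
page = "automobile dealer"

# The query and the page share no words, so word matching finds nothing.
word_score = cosine(word_vector(query), word_vector(page))

# After fuzzy synonym expansion both vectors carry weight on "car",
# "automobile", and "vehicle", so the page becomes retrievable.
concept_score = cosine(concept_vector(query), concept_vector(page))

print(f"word matching:    {word_score:.3f}")
print(f"concept matching: {concept_score:.3f}")
```

In this toy setting the page scores zero under word matching but a clearly positive score once synonym weights are spread, which is the behaviour the abstract attributes to concept-based retrieval. A faithful implementation would also need the fuzzy generality interrelations and the soft disambiguation step described above.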