Minimizing cultural differences using an ontology-based multilingual and multicultural information retrieval system

Authors


Introduction

Most information retrieval systems depend on users entering terms to initiate a search; however, these terms may or may not match those used in the system. In addition, individual differences can make information searching ineffective and inefficient. Each individual has a unique conceptualization and uses different terms (or at least different symbols) when searching for information; as a result, individuals search for and interpret information differently. Kelly's (1955) Personal Construct Theory asserts that each person constructs his or her own concepts. While cross-language information retrieval (CLIR) focuses mainly on linguistic aspects, this study examines how user performance differs when the system provides both linguistic and cultural aspects of information.

Cultural Difference and Information Retrieval

Cultural difference is one of the major factors that make individuals different. Everyone holds different concepts that are shaped by variations in culture (Taylor, 2004). Park's (2004) study demonstrates that cultural difference affects learners' information searching, analysis, and use, particularly in technology-based learning environments. The study shows that East Asians' information retrieval is based on relationships, such as associative relationships, while European Westerners' information retrieval is based on categories, such as classification. Minimizing cultural differences in information retrieval is a vital issue because technology brings diverse cultures close together. One of the most common cultural differences is language. While prior research has established that CLIR may minimize language differences, substantive empirical work remains to be done on how individual users perform a multicultural information search using an ontology-based information retrieval system.

The purpose of this study is to investigate how users perform when both multilingual and multicultural resources are provided. The poster will present our work in progress, which seeks to improve access to a movie database by translating users' information-seeking behaviors into an information retrieval system that organizes multilingual and multicultural information based on ontological relationships. The Ontology-based Multilingual and Multicultural Information Retrieval (OMIR) system is based on semantic relationships among the information resources. By gathering input from the biggest movie database, the Internet Movie Database (IMDb), about the terms and concepts used to search for movies, a map of the relationships among resources in the movie domain will be generated.

Understanding how non-native English speakers look for information allows us to structure and organize movie resources within the OMIR system in a manner that reflects the ways users currently conceptualize those resources. Organizing the movie resources within an OMIR system that capitalizes on user information behavior is expected to facilitate both locating desired information and discovering new and related areas of information that might not be found using traditional keyword-based information retrieval systems. Once the multilingual and multicultural information retrieval system has been developed, we will test how incorporating the ontology affects the success of searches for movie resources. Indices of success will include the number of resources retrieved, the relevancy of the resources retrieved, and the time spent searching.
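The three indices of success named above can be computed from a recorded search session, as the following minimal sketch illustrates. It is not the study's actual scoring procedure: the `SearchSession` class, its field names, and the sample data are hypothetical, and measuring relevancy as precision (the share of retrieved resources judged relevant) is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SearchSession:
    """One participant's search task (hypothetical structure, not from the study)."""
    retrieved: list        # resource ids the participant retrieved
    relevant: set          # ids judged relevant for the task (assumed to be available)
    start_seconds: float   # task start time
    end_seconds: float     # task end time

    def num_retrieved(self) -> int:
        # Index 1: number of resources retrieved.
        return len(self.retrieved)

    def precision(self) -> float:
        # Index 2: relevancy, approximated here as precision (an assumption).
        if not self.retrieved:
            return 0.0
        hits = sum(1 for r in self.retrieved if r in self.relevant)
        return hits / len(self.retrieved)

    def time_spent(self) -> float:
        # Index 3: time spent searching, in seconds.
        return self.end_seconds - self.start_seconds


# Made-up sample data for illustration only.
session = SearchSession(
    retrieved=["m1", "m2", "m3", "m4"],
    relevant={"m1", "m3", "m9"},
    start_seconds=0.0,
    end_seconds=95.5,
)
print(session.num_retrieved(), session.precision(), session.time_spent())
```

Other relevancy measures, such as graded relevance judgments, would fit the same structure by replacing the `precision` method.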

Methodology

Participants

Forty undergraduate students will participate in one 60-minute usability study. Following a 5- to 10-minute introduction to the OMIR system, participants will be asked to search for movie resources, and their searches will be recorded using screen capture software.

Development of the OMIR

The ontology for movies was created through domain analysis and the extraction of key concepts from keywords and subject headings in the IMDb. The terms are organized into top, broad, narrow, related, preferred, and non-preferred terms. Three types of relationships, (1) equivalence, (2) hierarchical, and (3) associative, are used to collocate the resources (ANSI/NISO Z39.19-2005, 2005). Associative relationships are added both between terms belonging to the same hierarchy and between terms belonging to different hierarchies. XML Topic Maps (XTM) is employed to implement the ontology. Omnigator, an openly accessible Topic Map browser, will be used to navigate the OMIR system (Ontopia, 2005). A screenshot of the OMIR system interface is shown below (Figure 1). In Figure 1, the first box shows the type of information, the second box shows multilingual information, the third box shows semantic relationships among resources, the fourth box shows culture-related information, and the fifth box shows general information about the movie.

Figure 1. Screenshot of the Ontology-based Multilingual and Multicultural IR System
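To make the equivalence, hierarchical, and associative relationships described in this section concrete, the following minimal sketch models them as a small in-memory vocabulary. It is not the OMIR implementation, which uses XML Topic Maps. The `Thesaurus` class, its method names, and the sample terms are hypothetical, and treating a label in another language as an equivalence link to a preferred English term is an assumption made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Thesaurus:
    """Toy controlled vocabulary with three relationship types (hypothetical)."""
    use: dict = field(default_factory=dict)      # non-preferred -> preferred term
    broader: dict = field(default_factory=dict)  # narrow term -> broad term
    related: defaultdict = field(default_factory=lambda: defaultdict(set))

    def add_equivalent(self, non_preferred, preferred):
        # Equivalence: a non-preferred term points to its preferred term.
        self.use[non_preferred] = preferred

    def add_broader(self, narrow, broad):
        # Hierarchical: each term has at most one broader term in this sketch.
        self.broader[narrow] = broad

    def add_related(self, a, b):
        # Associative: stored symmetrically, within or across hierarchies.
        self.related[a].add(b)
        self.related[b].add(a)

    def preferred(self, term):
        return self.use.get(term, term)

    def top_term(self, term):
        # Follow broader-term links until the top of the hierarchy is reached.
        term = self.preferred(term)
        while term in self.broader:
            term = self.broader[term]
        return term

    def expand(self, term):
        # Collocate: the preferred term plus its broader, narrower, and related terms.
        term = self.preferred(term)
        result = {term}
        if term in self.broader:
            result.add(self.broader[term])
        result.update(t for t, b in self.broader.items() if b == term)
        result.update(self.related.get(term, ()))
        return result


# Illustrative terms only; they are not taken from the OMIR ontology.
vocab = Thesaurus()
vocab.add_broader("Action film", "Film")
vocab.add_broader("Martial arts film", "Action film")
vocab.add_broader("Hong Kong cinema", "National cinema")
vocab.add_equivalent("Kung fu movie", "Martial arts film")
vocab.add_equivalent("映画", "Film")  # Japanese label mapped to the preferred term
vocab.add_related("Martial arts film", "Hong Kong cinema")  # across hierarchies

print(vocab.top_term("Kung fu movie"))
print(sorted(vocab.expand("Kung fu movie")))
```

In this sketch, a query entered with a non-preferred or non-English term is first mapped to its preferred term and then expanded along the hierarchical and associative links, which is one simple way such relationships could collocate resources.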

Significance of the Project

This study investigates how non-native English speakers' conceptualization of the semantic relationships among cultural resources can be incorporated into the structure of an online information retrieval system. Development of such a system is significant because users' search for relevant and timely information is increasingly complicated by the ever-growing amount and complexity of resources. Building meaningful and rich semantic relationships among movie resources holds the promise of allowing researchers to find cultural information effectively and efficiently.
