WorldWideScience.org is a noteworthy example of a federated search application that promotes diversity of ideas at an international level. It aggregates high-quality science information from government and government-sanctioned sources in 55 countries, which contribute 49 databases and portals [now 52 databases] and represent approximately 73% of the world's population. Medical librarian Hope Leman, in a review of WorldWideScience, illustrates beautifully the value of an international science portal for casting a wide research net and for finding important information in unexpected places:
“Now here is an example of why this project can lead to improvements in the quality of life for the ill worldwide. One of the results I got for ALS was from the journal Internal Medicine published by the Japanese Society of Internal Medicine. Now, I am quite interested in Japanese views of ALS, given that they have a much higher rate of full ventilation of patients than is true in the US. That is an interesting phenomenon in itself, suggesting that Japanese caregivers and clinicians have a greater willingness to care for patients under these often demanding conditions. And the article I found, “Salivary Chromogranin A: Useful and Quantitative Biochemical Marker of Affective State in Patients with Amyotrophic Lateral Sclerosis,” might sound arcane. But it actually had the very moving conclusion that it is imperative that ways of measuring mood be found for ALS patients, many of whom lose the ability to speak and some of whom become locked in. “Useful biochemical markers of the affective state in advanced patients have not yet been developed.” What a wonderful world we live in where search engines like WorldWideScience render findable scholarship produced in societies not one's own that sets you to thinking about issues that had not before entered your ken.”
WorldWideScience.org is also an excellent example of a federated search application that employs an innovative approach, hierarchical federation, to search numerous content sources efficiently. Hierarchical federation combines multiple federated search engines, each of which performs a portion of the searching, aggregating, and relevance ranking of content.
Figure I illustrates the hierarchical approach. WorldWideScience.org is a federated search portal that searches 52 sources, one of which, Science.gov, is itself a federated search portal. Science.gov searches 40 sources, one of which is a federated search portal, the E-print Network. From a single search page on WorldWideScience.org, a user can search 140 sources.
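The hierarchical approach can be sketched in a few lines of Python. This is only an illustrative toy, not how WorldWideScience.org or Science.gov is actually built: the class names `Source` and `FederatedEngine`, the sample documents, and the word-count scoring are all assumptions made for the example. The point it shows is that a federated engine can treat another federated engine as just one more source, so each level searches, merges, and ranks its own portion of the results.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union

# (score, title, source name); a hypothetical result shape for this sketch.
Result = Tuple[float, str, str]


@dataclass
class Source:
    """A leaf content source holding a few documents (title -> text)."""
    name: str
    documents: Dict[str, str]

    def search(self, query: str) -> List[Result]:
        term = query.lower()
        hits = []
        for title, text in self.documents.items():
            # Toy relevance score: how many times the term occurs as a word.
            score = text.lower().split().count(term)
            if score:
                hits.append((float(score), title, self.name))
        return hits

    def leaf_count(self) -> int:
        return 1


@dataclass
class FederatedEngine:
    """A portal whose children are leaf sources or other portals."""
    name: str
    children: List[Union[Source, "FederatedEngine"]] = field(default_factory=list)

    def search(self, query: str, limit: int = 10) -> List[Result]:
        merged: List[Result] = []
        for child in self.children:
            # A child portal has already merged and ranked its own sources.
            merged.extend(child.search(query))
        # Rank by descending score, breaking ties by title.
        merged.sort(key=lambda r: (-r[0], r[1]))
        return merged[:limit]

    def leaf_count(self) -> int:
        return sum(child.leaf_count() for child in self.children)


# Hypothetical three-level hierarchy, loosely mirroring the structure above.
eprints = FederatedEngine("eprints", [
    Source("a", {"ALS biomarkers": "salivary marker for als patients"}),
    Source("b", {"Protein folding": "folding kinetics"}),
])
national = FederatedEngine("national", [
    Source("c", {"ALS ventilation": "als ventilation rates als"}),
    eprints,
])
world = FederatedEngine("world", [
    Source("d", {"Climate": "ocean temperature"}),
    national,
])

results = world.search("ALS")
total_sources = world.leaf_count()
```

A single call to `world.search` reaches every leaf source, even though the top-level portal lists only two children directly. In this sketch each portal also truncates its own ranked list, which is one simple way for the levels to share the ranking work.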
Deep Web Technologies is expanding its science research portal, ScienceResearch.com, to federate 500 sources using the hierarchical approach by mid-2009. To make large-scale federated search a viable paradigm that other vendors and organizations can implement, Deep Web Technologies is actively researching the following elements of a complete solution:
- 1. Distributed computing to spread the computation and network loads, i.e., to balance load. In particular, the aggregation of search results from different sources and their relevance ranking lend themselves to distributed computing.
- 2. A mechanism for providing failover to redundant hardware components.
- 3. Automated source selection.
- 4. A streamlined approach for creating, testing, monitoring, and updating thousands of sources.
- 5. A mechanism to query and select a subset of sources from a federated search engine. This is required to eliminate searching duplicate sources across multiple engines.
- 6. Standards for building hierarchical federations so that federated search engines from different vendors can interoperate.
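Several of these elements can be illustrated with a small Python sketch. It is a simplified illustration under stated assumptions, not Deep Web Technologies' design: the function names `select_unique_sources`, `search_with_failover`, and `fan_out`, the engine and source names, and the stand-in search callables are all hypothetical. It shows selecting a subset of sources per engine so that no source is searched twice (item 5), falling back to a redundant replica when one fails (item 2), and spreading the searches across worker threads (item 1).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence, Set

# A search callable takes a query and returns result strings (hypothetical shape).
SearchFn = Callable[[str], List[str]]


def select_unique_sources(engines: Dict[str, Sequence[str]]) -> Dict[str, List[str]]:
    """Assign each source to the first engine that lists it, skipping repeats."""
    seen: Set[str] = set()
    plan: Dict[str, List[str]] = {}
    for engine, sources in engines.items():
        plan[engine] = []
        for source in sources:
            if source not in seen:
                seen.add(source)
                plan[engine].append(source)
    return plan


def search_with_failover(replicas: Sequence[SearchFn], query: str) -> List[str]:
    """Try each replica in order and return the first successful answer."""
    for replica in replicas:
        try:
            return replica(query)
        except Exception:
            continue  # this replica failed; fall through to the next one
    return []  # every replica failed


def fan_out(sources: Dict[str, Sequence[SearchFn]], query: str,
            max_workers: int = 4) -> Dict[str, List[str]]:
    """Search all sources concurrently and collect results per source."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(search_with_failover, replicas, query)
            for name, replicas in sources.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Stand-in replicas for the example: one always fails, one always answers.
def broken(query: str) -> List[str]:
    raise ConnectionError("primary replica unavailable")


def backup(query: str) -> List[str]:
    return [f"{query}: hit from backup"]


plan = select_unique_sources({
    "engine_a": ["s1", "s2"],
    "engine_b": ["s2", "s3"],  # s2 is already covered by engine_a
})
answers = fan_out({"s1": [broken, backup], "s3": [backup]}, "als")
```

Threads are used here only because they are the simplest way to overlap network-bound searches in one process; a production system would more plausibly distribute this work across machines.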