The main objective of this collaborative project is to develop enabling techniques for a large-scale metasearch engine that covers a much larger portion of the Web than existing search engines and metasearch engines while retrieving more up-to-date and more useful documents. A metasearch engine is a system that provides unified access to multiple existing search engines. Upon receiving a query, the metasearch engine determines which search engines to invoke, which documents to retrieve from each invoked search engine, and finally which documents to show to the user. The main problems to be studied in this project are (1) how to automatically discover useful search engines on the Web; (2) how to automatically and accurately categorize search engines into a concept hierarchy, and how to use user profiles to map user queries to the appropriate concept(s) in that hierarchy; (3) how to automatically incorporate search engines into a metasearch engine; (4) how to perform accurate database selection for longer queries; and (5) how to merge the results returned from multiple search engines. This has been added to the Deep Web Research Subject Tracer™ Information Blog.
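To make the query flow described above concrete (select engines, retrieve from each, merge), here is a minimal Python sketch. It is not the project's method: the in-memory `ENGINES` data, the function names, the document-frequency selection heuristic, and the weighted-score merge are all assumptions chosen only to illustrate the three steps.

```python
from collections import Counter

# Hypothetical toy "search engines": each maps a document id to its text.
ENGINES = {
    "medline_like": {"m1": "heart disease treatment trial",
                     "m2": "diabetes insulin therapy"},
    "sports_like": {"s1": "football league results",
                    "s2": "heart rate training for runners"},
    "cooking_like": {"c1": "pasta sauce recipe",
                     "c2": "heart healthy recipe ideas"},
}


def tokenize(text):
    return text.lower().split()


def engine_summary(docs):
    """Term -> document frequency; a stand-in for a database representative."""
    df = Counter()
    for text in docs.values():
        df.update(set(tokenize(text)))
    return df


def select_engines(query, summaries, k=2):
    """Database selection (assumed heuristic): score each engine by the summed
    document frequency of the query terms and keep the top k nonzero scores."""
    terms = tokenize(query)
    scored = [(name, sum(s[t] for t in terms)) for name, s in summaries.items()]
    scored = [(name, score) for name, score in scored if score > 0]
    scored.sort(key=lambda p: (-p[1], p[0]))
    return dict(scored[:k])


def local_search(docs, query):
    """What one engine returns: (doc_id, local score) by query-term overlap."""
    terms = set(tokenize(query))
    hits = []
    for doc_id, text in docs.items():
        overlap = len(terms & set(tokenize(text)))
        if overlap:
            hits.append((doc_id, overlap))
    hits.sort(key=lambda p: (-p[1], p[0]))
    return hits


def merge(results_by_engine, engine_scores):
    """Result merging (assumed heuristic): normalize each engine's local scores
    by its best hit, then weight by the engine's relative selection score."""
    if not engine_scores:
        return []
    best_engine = max(engine_scores.values())
    merged = []
    for name, hits in results_by_engine.items():
        if not hits:
            continue
        top = max(score for _, score in hits)
        weight = engine_scores[name] / best_engine
        merged.extend((weight * score / top, name, doc_id)
                      for doc_id, score in hits)
    merged.sort(key=lambda r: (-r[0], r[1], r[2]))
    return merged


def metasearch(query, k=2):
    summaries = {name: engine_summary(docs) for name, docs in ENGINES.items()}
    engine_scores = select_engines(query, summaries, k)
    results = {name: local_search(ENGINES[name], query)
               for name in engine_scores}
    return list(engine_scores), merge(results, engine_scores)
```

Calling `metasearch("heart disease")` returns the names of the selected engines and a single merged list of `(score, engine, doc_id)` tuples. A real system would replace the toy summaries with statistics gathered from each remote engine and would need a merge strategy that copes with engines whose local scores are not comparable.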
posted by Marcus Zillman |
4:25 AM