Marcus P. Zillman, M.S., A.M.H.A. Author/Speaker/Consultant
Internet Happenings, Events and Sources


Sunday, June 26, 2005  

Open Source Web Information Retrieval (OSWIR05)
http://www.emse.fr/OSWIR05/

The World Wide Web has grown to be a primary source of information for millions of people. Due to the size of the Web, search engines have become the major access point for this information. However, "commercial" search engines use hidden algorithms that put the integrity of their results in doubt, so there is a need for open source Web search engines. On the other hand, the Information Retrieval (IR) research community has a long history of developing ideas, models and techniques for finding results in data sources, but finding one's way through all of them is not an easy task. Moreover, their applicability to the Web search domain is uncertain. The goal of the workshop is to survey the fundamentals of the IR domain and to determine the techniques, tools, or models that are applicable to Web search. Presentations should either make strong arguments or report results supported by large-scale experiments that demonstrate the applicability of the technique to the Web domain as well as its advantage over similar techniques. Relevant topics include, but are not restricted to:

. Information Retrieval Models and Matching Function Models
- vector space, probabilistic, Boolean models and their extensions
- passage retrieval
- normalization
. Utilities for IR
- relevance feedback
- clustering
- indexing entities (N-grams, words, stemming, stop word removal, compound nouns, named entities, concepts, etc.)
- statistical regression
- query expansion (e.g. with a thesaurus)
- natural language processing (syntactical analysis, etc.)
- disambiguation
. Web (and hypertext) particulars
- links
- anchors
- HTML and/or XML structure
- document identification (URL)
- duplicates
- hidden documents
- dynamic documents
- site
. Evaluation of models
. User Interface
- Query language
- Results presentation

This has been added to Bot Research Subject Tracer™ Information Blog.

posted by Marcus Zillman | 4:00 AM