XML is increasingly used to mark up content in information repositories, and retrieval from XML document collections is now an area of active research. XML content retrieval poses a two-fold problem: first, finding effective techniques to retrieve the most useful XML elements in response to a user query; second, devising an appropriate evaluation methodology to measure the effectiveness of such techniques. This study examines both. Pivoted length normalization in the vector space model (VSM) is revisited on benchmark XML collections, where it improves retrieval performance. The work is extended by exploring query reformulation in an attempt to understand the user's intent: what she wants and what she does not want. On the evaluation side, the sensitivity and robustness of various evaluation metrics, along with the reliability and reusability of the assessment pool used at the INEX Ad-Hoc track since 2007, are studied in depth. Large-scale experiments demonstrate that the INEX collections remain usable when evaluating non-participating systems. Finally, a low-cost pooling method based on query-specific variable pool depth is proposed, which proved effective in the evaluation of both XML and document retrieval. Part of this work won the Best JASIST Paper Award 2011.