Multiple Sources of Evidence for XML Retrieval

Börkur Sigurbjörnsson, Jaap Kamps, and Maarten de Rijke.

Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2004). Pages: 554-555. 2004. [acm]

Document-centric XML collections contain text-rich documents, marked up with XML tags. The tags add lightweight semantics to the text. Querying such collections calls for a hybrid query language: the text-rich nature of the documents suggests a content-oriented (IR) approach, while the mark-up allows users to add structural constraints to their IR queries. We show how evidence for relevance from different sources helps to answer such hybrid queries. We evaluate our methods using the INEX 2003 test set, and show that structural hints in hybrid queries help to improve retrieval effectiveness.
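
To make the idea of combining evidence concrete, the following is a minimal sketch and not the method of the paper. It assumes a hybrid query that asks for elements of a given tag (for example, sections) about some topic, and it assumes that each candidate element already carries two content-based scores: one for its own text and one for its enclosing article. The names `Element`, `combined_score`, `weight`, and `structure_bonus`, as well as the linear mixture itself, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Element:
    path: str             # XPath-like location, e.g. "/article[1]/sec[2]"
    content_score: float  # assumed content-based score of the element's own text
    article_score: float  # assumed content-based score of the enclosing article


def matches_structure(path: str, required_tag: str) -> bool:
    """Return True if the last step of the path has the required tag name."""
    last_step = path.rstrip("/").split("/")[-1]
    tag = last_step.split("[")[0]  # drop a positional predicate such as "[2]"
    return tag == required_tag


def combined_score(el: Element, required_tag: str,
                   weight: float = 0.5, structure_bonus: float = 0.2) -> float:
    """Mix element and article evidence, then reward a structural match.

    The weights are illustrative assumptions, not values from the paper.
    """
    score = weight * el.content_score + (1 - weight) * el.article_score
    if matches_structure(el.path, required_tag):
        score += structure_bonus
    return score


def rank(elements: list[Element], required_tag: str) -> list[Element]:
    """Order candidate elements by their combined score, best first."""
    return sorted(elements,
                  key=lambda el: combined_score(el, required_tag),
                  reverse=True)


if __name__ == "__main__":
    candidates = [
        Element("/article[1]/sec[2]/p[1]", content_score=0.8, article_score=0.4),
        Element("/article[1]/sec[2]", content_score=0.6, article_score=0.4),
    ]
    for el in rank(candidates, required_tag="sec"):
        print(el.path, round(combined_score(el, "sec"), 3))
```

In this sketch the structural constraint acts as a hint rather than a filter: the paragraph is still returned, but the section that satisfies the requested tag is ranked above it.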

@inproceedings{10.1145/1008992.1009117,
author = {Sigurbj\"{o}rnsson, B\"{o}rkur and Kamps, Jaap and de Rijke, Maarten},
title = {Multiple sources of evidence for XML retrieval},
year = {2004},
isbn = {1581138814},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/1008992.1009117},
doi = {10.1145/1008992.1009117},
booktitle = {Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval},
pages = {554--555},
numpages = {2},
keywords = {content and structure, XPath, XML retrieval},
location = {Sheffield, United Kingdom},
series = {SIGIR '04}
}