Several methods of ranking retrieval systems with partial relevance judgment

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Measures such as mean average precision and recall level precision are considered good system-oriented measures because they concern both precision and recall, two important aspects in the effectiveness evaluation of information retrieval systems. However, such system-oriented measures suffer from some shortcomings when partial relevance judgment is used. In this paper, we discuss how to rank retrieval systems under the condition of partial relevance judgment, which is common in major retrieval evaluation events such as the TREC conferences and NTCIR workshops. Four system-oriented measures are discussed: mean average precision, recall level precision, normalized discounted cumulative gain, and normalized average precision over all documents. Our investigation shows that averaging values over a set of queries may not be the most reliable approach to ranking a group of retrieval systems. Some alternatives, such as the Borda count, Condorcet voting, and the zero-one normalization method, are investigated. Experimental results are also presented for the evaluation of these methods.
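The contrast the abstract draws, between averaging per-query scores and rank-aggregation alternatives such as the Borda count, can be illustrated with a small sketch. This is not the paper's implementation; the systems and per-query scores below are hypothetical, and the Borda scoring rule used (a system ranked i-th among n systems on a query earns n−1−i points) is the standard formulation.

```python
# Hypothetical per-query effectiveness scores (e.g. average precision)
# for three retrieval systems over three queries.
per_query_scores = {
    "q1": {"sysA": 0.42, "sysB": 0.40, "sysC": 0.10},
    "q2": {"sysA": 0.05, "sysB": 0.30, "sysC": 0.25},
    "q3": {"sysA": 0.50, "sysB": 0.45, "sysC": 0.48},
}

systems = sorted({s for scores in per_query_scores.values() for s in scores})
n_queries = len(per_query_scores)

# 1) Averaging: rank systems by their mean score over all queries.
mean_rank = sorted(
    systems,
    key=lambda s: -sum(q[s] for q in per_query_scores.values()) / n_queries,
)

# 2) Borda count: on each query, a system ranked i-th (0-based) among
#    n systems earns n-1-i points; points are summed over queries.
borda = {s: 0 for s in systems}
for scores in per_query_scores.values():
    ordered = sorted(systems, key=lambda s: -scores[s])
    for i, s in enumerate(ordered):
        borda[s] += len(systems) - 1 - i
borda_rank = sorted(systems, key=lambda s: -borda[s])

print("mean ranking :", mean_rank)   # ['sysB', 'sysA', 'sysC']
print("borda ranking:", borda_rank)  # ['sysA', 'sysB', 'sysC']
```

In this toy example the two aggregation methods disagree: sysB wins on mean score because of consistently moderate values, while sysA wins the Borda count by ranking first on more queries. The Borda count depends only on per-query orderings, which makes it insensitive to the score magnitudes that partial relevance judgment can distort.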
Language: English
Title of host publication: Unknown Host Publication
Pages: 13-18
Number of pages: 6
DOI: 10.1109/ICDIM.2007.4444193
Publication status: Published - 2007
Event: 2nd International Conference on Digital Information Management, 2007 (ICDIM '07) - Lyon, France
Duration: 1 Jan 2007 → …

Conference

Conference: 2nd International Conference on Digital Information Management, 2007 (ICDIM '07)
Period: 1/01/07 → …


Cite this

@inproceedings{d60e4c6ecad74165aace4b700b799f13,
title = "Several methods of ranking retrieval systems with partial relevance judgment",
abstract = "Some measures such as mean average precision and recall level precision are considered as good system-oriented measures, because they concern both precision and recall that are two important aspects for effectiveness evaluation of information retrieval systems. However, such good system-oriented measures suffer from some shortcomings when partial relevance judgment is used. In this paper, we discuss how to rank retrieval systems in the condition of partial relevance judgment, which is common in major retrieval evaluation events such as TREC conferences and NTCIR workshops. Four system-oriented measures, which are mean average precision, recall level precision, normalized discount cumulative gain, and normalized average precision over all documents, are discussed. Our investigation shows that averaging values over a set of queries may not be the most reliable approach to rank a group of retrieval systems. Some alternatives such as Bar da count. Condorcet voting, and the zero-one normalization method, are investigated. Experimental results are also presented for the evaluation of these methods.",
author = "Shengli Wu and Sally McClean",
year = "2007",
doi = "10.1109/ICDIM.2007.4444193",
language = "English",
isbn = "978-1-4244-1475-8",
pages = "13--18",
booktitle = "Unknown Host Publication",

}

