Using the Euclidean Distance for Retrieval Evaluation

Shengli Wu, Yaxin Bi, Xiaoqin Zeng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In information retrieval systems and digital libraries, evaluating retrieval results is an important task. To date, almost all commonly used metrics, such as average precision and recall-level precision, are ranking-based. In this work, we investigate whether a score-based method, the Euclidean distance, is a good option for retrieval evaluation. Two variations are discussed: one uses a linear model to estimate the relation between rank and relevance in result lists, and the other uses a more sophisticated cubic regression model. Our experiments with two groups of results submitted to TREC demonstrate that the new metrics correlate strongly with ranking-based metrics when averaged over all 50 queries. Our experiments also show that one of the variations (the linear model) has better overall quality than all of the ranking-based metrics considered. Another surprising finding is that a commonly used metric, average precision, may not be as good as previously thought.
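
The abstract does not give the exact formulas, so the following is only a minimal sketch of how a score-based Euclidean distance might be computed for one ranked list. It assumes a hypothetical linear rank-to-score model (rank 1 maps to 1.0, falling evenly to 1/n at rank n) and binary relevance judgments; the function names and the model itself are illustrative assumptions, not the authors' definitions.

```python
import math


def linear_scores(n):
    """Assumed linear model: rank 1 -> 1.0, decreasing evenly to 1/n at rank n."""
    return [(n - rank + 1) / n for rank in range(1, n + 1)]


def euclidean_distance(scores, judgments):
    """Euclidean distance between estimated scores and relevance judgments.

    Under the assumptions of this sketch, a smaller distance means the
    ranked list agrees more closely with the judgments.
    """
    if len(scores) != len(judgments):
        raise ValueError("scores and judgments must have the same length")
    return math.sqrt(sum((s - g) ** 2 for s, g in zip(scores, judgments)))


# Hypothetical judgments for two 4-document result lists (1 = relevant).
scores = linear_scores(4)
d_good = euclidean_distance(scores, [1, 1, 0, 0])  # relevant documents ranked first
d_bad = euclidean_distance(scores, [0, 0, 1, 1])   # relevant documents ranked last
print(d_good < d_bad)
```

In this toy setting the list that places relevant documents first yields the smaller distance, which is the behaviour one would expect from a distance-based evaluation measure.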
Language: English
Title of host publication: Unknown Host Publication
Number of pages: 14
Publication status: Published - 2011
Event: Using the Euclidean Distance for Retrieval Evaluation
Duration: 1 Jan 2011 → …

Conference

Conference: Using the Euclidean Distance for Retrieval Evaluation
Period: 1/01/11 → …

Fingerprint

Information retrieval systems
Digital libraries
Experiments

Cite this

Wu, S., Bi, Y., & Zeng, X. (2011). Using the Euclidean Distance for Retrieval Evaluation. In Unknown Host Publication.
@inproceedings{9ed6248436524267a2198e5c8fb6b6c4,
title = "Using the Euclidean Distance for Retrieval Evaluation",
author = "Shengli Wu and Yaxin Bi and Xiaoqin Zeng",
year = "2011",
language = "English",
booktitle = "Unknown Host Publication",
}
