The Linear Combination Data Fusion Method in Information Retrieval

Shengli Wu, Yaxin Bi, Xiaoqin Zeng

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Citations (Scopus)

Abstract

In information retrieval, data fusion has been investigated by many researchers. Previous investigation and experimentation demonstrate that the linear combination method is an effective data fusion method for combining multiple information retrieval results. One advantage is its flexibility: different weights can be assigned to different component systems to obtain better fusion results. However, how to obtain suitable weights for all the component retrieval systems is still an open problem. In this paper, we use multiple linear regression to obtain optimum weights for all the component systems involved. The weights are optimum in the least-squares sense: they minimize the difference between the scores estimated for all documents by linear combination and the judged scores of those documents. Our experiments with four groups of runs submitted to TREC show that the linear combination method with such weights consistently outperforms, by large margins, the best component system and other major data fusion methods such as CombSum, CombMNZ, and the linear combination method with performance-level and performance-square weighting schemes.
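
The weighting idea in the abstract can be illustrated with a minimal sketch. It assumes that each component system's scores have already been normalized to a common range, that judged scores are available for a set of training documents, and that the regression includes an intercept term, which shifts all fused scores equally and so does not affect the ranking. The function names, the toy data, and the use of NumPy are illustrative choices, not taken from the paper, and the CombSum and CombMNZ functions follow their usual definitions (sum of scores, and that sum multiplied by the number of systems giving a non-zero score).

```python
import numpy as np

def fit_weights(scores, judged):
    """Least-squares weights via multiple linear regression.

    scores: (n_docs, n_systems) array of normalized component scores.
    judged: (n_docs,) array of judged relevance scores.
    Returns (intercept, weights); the intercept is an assumption of this sketch.
    """
    # Prepend a column of ones so the regression has an intercept term.
    design = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(design, judged, rcond=None)
    return coef[0], coef[1:]

def linear_combination(scores, weights):
    """Fused score of each document: weighted sum of its component scores."""
    return scores @ weights

def comb_sum(scores):
    """CombSum baseline: unweighted sum of component scores."""
    return scores.sum(axis=1)

def comb_mnz(scores):
    """CombMNZ baseline: CombSum times the number of non-zero scores."""
    return scores.sum(axis=1) * (scores > 0).sum(axis=1)

# Hypothetical toy data: 4 documents scored by 2 component systems.
train_scores = np.array([[0.9, 0.1],
                         [0.2, 0.8],
                         [0.5, 0.5],
                         [0.0, 0.3]])
train_judged = np.array([1.0, 0.0, 1.0, 0.0])

intercept, weights = fit_weights(train_scores, train_judged)
fused = linear_combination(train_scores, weights)
ranking = np.argsort(-fused)  # document indices, best fused score first
```

In practice the weights would be fitted on one set of queries and applied to the results of unseen queries; the sketch fits and applies on the same toy data only to stay short.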
Language: English
Title of host publication: Unknown Host Publication
Number of pages: 14
Publication status: Published - 2011
Event: DEXA'11 Proceedings of the 22nd international conference on Database and expert systems applications - Volume Part II
Duration: 1 Jan 2011 → …

Fingerprint

Data fusion
Information retrieval
Linear regression
Experiments

Keywords

  • Data Fusion
  • Information Retrieval
  • TREC

Cite this

@inproceedings{e1b03c6613f14c03930aad08fe8736a3,
title = "The Linear Combination Data Fusion Method in Information Retrieval",
abstract = "In information retrieval, data fusion has been investigated by many researchers. Previous investigation and experimentation demonstrate that the linear combination method is an effective data fusion method for combining multiple information retrieval results. One advantage is its flexibility: different weights can be assigned to different component systems to obtain better fusion results. However, how to obtain suitable weights for all the component retrieval systems is still an open problem. In this paper, we use multiple linear regression to obtain optimum weights for all the component systems involved. The weights are optimum in the least-squares sense: they minimize the difference between the scores estimated for all documents by linear combination and the judged scores of those documents. Our experiments with four groups of runs submitted to TREC show that the linear combination method with such weights consistently outperforms, by large margins, the best component system and other major data fusion methods such as CombSum, CombMNZ, and the linear combination method with performance-level and performance-square weighting schemes.",
keywords = "Data Fusion, Information Retrieval, TREC",
author = "Shengli Wu and Yaxin Bi and Xiaoqin Zeng",
year = "2011",
language = "English",
booktitle = "Unknown Host Publication",

}

Wu, S, Bi, Y & Zeng, X 2011, The Linear Combination Data Fusion Method in Information Retrieval. in Unknown Host Publication. DEXA'11 Proceedings of the 22nd international conference on Database and expert systems applications - Volume Part II, 1/01/11.
