Fusion-based Methods for Result Diversification in Web Search

Shingli Wu, Chunlan Huang, Liang Li, Fabio Crestani

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

Search result diversification of text documents is especially necessary when a user issues a faceted or ambiguous query to the search engine. A variety of approaches have been proposed to deal with this issue in recent years. In this article, we propose a group of fusion-based result diversification methods with the aim of improving performance in terms of both relevance and diversity. They are linear combinations of scores that are obtained from different component search systems. The weight of each search system is determined by considering three factors: performance, dissimilarity, and complementarity. There are two major contributions. Firstly, we find that all three factors (performance, dissimilarity, and complementarity) are useful for effective weighting in linear combination. Secondly, we present a logarithmic function-based model for converting ranking information into scores. Experiments are carried out with four groups of results submitted to the TREC web diversity task. Experimental results show that some of the fusion methods that use the aforementioned techniques perform more effectively than the state-of-the-art fusion methods for result diversification.
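
The two ideas named in the abstract, a logarithmic rank-to-score conversion and a weighted linear combination of component systems, can be illustrated with a minimal Python sketch. The functional form a - b*ln(rank), the coefficients a and b, the function names, and the example weights are all assumptions made for illustration. The abstract does not give the paper's fitted model or its weighting formulas, so the weights are simply passed in rather than derived from performance, dissimilarity, and complementarity.

```python
import math
from collections import defaultdict


def rank_to_score(rank, a=1.0, b=0.2):
    """Convert a 1-based rank into a score with a logarithmic model.

    Assumed form: a - b * ln(rank), floored at 0.
    The coefficients a and b are hypothetical, not taken from the paper.
    """
    return max(0.0, a - b * math.log(rank))


def linear_combination(ranked_lists, weights):
    """Fuse ranked lists from several component search systems.

    Each document's fused score is the weighted sum of its converted
    scores across systems; documents a system did not return add 0.
    Returns document ids sorted by fused score, highest first
    (ties broken by document id).
    """
    fused = defaultdict(float)
    for docs, weight in zip(ranked_lists, weights):
        for rank, doc in enumerate(docs, start=1):
            fused[doc] += weight * rank_to_score(rank)
    return sorted(fused, key=lambda d: (-fused[d], d))


# Two hypothetical component systems and illustrative weights.
system_a = ["d1", "d2", "d3"]
system_b = ["d2", "d3", "d1"]
fused_ranking = linear_combination([system_a, system_b], [0.6, 0.4])
print(fused_ranking)
```

In this sketch a system with a larger weight pulls the fused ranking toward its own ordering, which is why the choice of weights matters for the final result list.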
Language: English
Pages: 16-26
Journal: Information Fusion
Volume: 45
Issue number: 1
Early online date: 8 Jan 2018
DOI: 10.1016/j.inffus.2018.01.006
Publication status: E-pub ahead of print - 8 Jan 2018

Keywords

  • Data fusion
  • Web search
  • Result diversification
  • Linear combination
  • Weight assignment
  • Linear score normalization

Cite this

Wu, Shingli ; Huang, Chunlan ; Li, Liang ; Crestani, Fabio. / Fusion-based Methods for Result Diversification in Web Search. In: Information Fusion. 2018 ; Vol. 45, No. 1. pp. 16-26.
@article{ce9af89e143f49b08aa66abea568d6d2,
title = "Fusion-based Methods for Result Diversification in Web Search",
abstract = "Search result diversification of text documents is especially necessary when a user issues a faceted or ambiguous query to the search engine. A variety of approaches have been proposed to deal with this issue in recent years. In this article, we propose a group of fusion-based result diversification methods with the aim of improving performance in terms of both relevance and diversity. They are linear combinations of scores that are obtained from different component search systems. The weight of each search system is determined by considering three factors: performance, dissimilarity, and complementarity. There are two major contributions. Firstly, we find that all three factors (performance, dissimilarity, and complementarity) are useful for effective weighting in linear combination. Secondly, we present a logarithmic function-based model for converting ranking information into scores. Experiments are carried out with four groups of results submitted to the TREC web diversity task. Experimental results show that some of the fusion methods that use the aforementioned techniques perform more effectively than the state-of-the-art fusion methods for result diversification.",
keywords = "Data fusion, Web search, Result diversification, Linear combination, Weight assignment, Linear score normalization",
author = "Shingli Wu and Chunlan Huang and Liang Li and Fabio Crestani",
year = "2018",
month = "1",
day = "8",
doi = "10.1016/j.inffus.2018.01.006",
language = "English",
volume = "45",
pages = "16--26",
journal = "Information Fusion",
issn = "1566-2535",
publisher = "Elsevier",
number = "1",

}

Fusion-based Methods for Result Diversification in Web Search. / Wu, Shingli; Huang, Chunlan; Li, Liang; Crestani, Fabio.

In: Information Fusion, Vol. 45, No. 1, 08.01.2018, p. 16-26.

TY - JOUR

T1 - Fusion-based Methods for Result Diversification in Web Search

AU - Wu, Shingli

AU - Huang, Chunlan

AU - Li, Liang

AU - Crestani, Fabio

PY - 2018/1/8

Y1 - 2018/1/8

AB - Search result diversification of text documents is especially necessary when a user issues a faceted or ambiguous query to the search engine. A variety of approaches have been proposed to deal with this issue in recent years. In this article, we propose a group of fusion-based result diversification methods with the aim of improving performance in terms of both relevance and diversity. They are linear combinations of scores that are obtained from different component search systems. The weight of each search system is determined by considering three factors: performance, dissimilarity, and complementarity. There are two major contributions. Firstly, we find that all three factors (performance, dissimilarity, and complementarity) are useful for effective weighting in linear combination. Secondly, we present a logarithmic function-based model for converting ranking information into scores. Experiments are carried out with four groups of results submitted to the TREC web diversity task. Experimental results show that some of the fusion methods that use the aforementioned techniques perform more effectively than the state-of-the-art fusion methods for result diversification.

KW - Data fusion

KW - Web search

KW - Result diversification

KW - Linear combination

KW - Weight assignment

KW - Linear score normalization

U2 - 10.1016/j.inffus.2018.01.006

DO - 10.1016/j.inffus.2018.01.006

M3 - Article

VL - 45

SP - 16

EP - 26

JO - Information Fusion

T2 - Information Fusion

JF - Information Fusion

SN - 1566-2535

IS - 1

ER -