A Novel Neighborhood Based Document Smoothing Model for Information Retrieval

Pawan Goyal, Laxmidhar Behera, TM McGinnity

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

This paper proposes a novel neighborhood-based document smoothing model for information retrieval. Lexical association between terms is used to assign a context-sensitive indexing weight to each document term; that is, term weights are redistributed according to their lexical association with the context words. A generalized retrieval framework is presented, and the vector space model (VSM), divergence from randomness (DFR), Okapi Best Matching 25 (BM25), and language model (LM) retrieval frameworks are shown to be special cases of it. Because it is formulated within this generalized framework, the neighborhood-based smoothing model applies to any indexing model that uses the term-document frequency scheme. At runtime, the proposed smoothing model is as efficient as the baseline retrieval frameworks. Experiments on the TREC datasets show that it consistently improves the retrieval performance of VSM, DFR, BM25, and LM, and that the improvements are statistically significant.
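The idea of redistributing term weights toward associated context words can be illustrated with a minimal sketch. This is not the formula from the paper, which the abstract does not give. It assumes a hypothetical precomputed association table `assoc`, a hypothetical sharing fraction `lam`, and a simple rule in which each term passes a fraction of its frequency to the other terms of the same document in proportion to their association scores. The total weight of the document is preserved, so the result can stand in for raw term frequency in any frequency-based weighting scheme.

```python
def smooth_document(tf, assoc, lam=0.3):
    """Redistribute term weights within one document (illustrative only).

    tf:    mapping term -> raw frequency in the document
    assoc: mapping (term_a, term_b) -> non-negative association score
           (hypothetical; assumed symmetric and precomputed from a corpus)
    lam:   fraction of each term's weight shared with its neighbours
    """
    terms = list(tf)
    # Every term first keeps (1 - lam) of its own frequency.
    smoothed = {t: (1.0 - lam) * tf[t] for t in terms}
    for u in terms:
        # Association of u with every other term in the same document,
        # looked up in either key order because assoc is assumed symmetric.
        scores = {
            t: assoc.get((u, t), assoc.get((t, u), 0.0))
            for t in terms
            if t != u
        }
        total = sum(scores.values())
        if total == 0.0:
            # No associated context words: u keeps its full weight.
            smoothed[u] += lam * tf[u]
            continue
        # Share lam * tf[u] among neighbours in proportion to association.
        for t, s in scores.items():
            smoothed[t] += lam * tf[u] * s / total
    return smoothed


if __name__ == "__main__":
    tf = {"bank": 2, "loan": 1, "river": 1}
    assoc = {("bank", "loan"): 0.8, ("bank", "river"): 0.2}
    print(smooth_document(tf, assoc, lam=0.5))
```

In this toy document, "loan" ends up with more weight than "river" although both occur once, because "loan" is more strongly associated with the context word "bank".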
Original language: English
Pages (from-to): 391-425
Journal: Information Retrieval
Volume: 16
Issue number: 3
Publication status: Published (in print/issue) - 1 Jun 2013
