A Novel Neighborhood Based Document Smoothing Model for Information Retrieval

Pawan Goyal, Laxmidhar Behera, TM McGinnity

    Research output: Contribution to journal, Article

    6 Citations (Scopus)

    Abstract

    This paper proposes a novel neighborhood-based document smoothing model for information retrieval. Lexical association between terms is used to assign context-sensitive indexing weights to document terms; that is, term weights are redistributed according to each term's lexical association with its context words. The paper presents a generalized retrieval framework and shows that the vector space model (VSM), divergence from randomness (DFR), Okapi Best Matching 25 (BM25), and language model (LM) retrieval frameworks are special cases of it. Because it is formulated within this generalized framework, the neighborhood-based smoothing model applies to all indexing models that use the term-document frequency scheme. The proposed smoothing model is as efficient as the baseline retrieval frameworks at runtime. Experiments on the TREC datasets show that it consistently improves the retrieval performance of VSM, DFR, BM25, and LM, and that the improvements are statistically significant.
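The idea of redistributing term weights according to lexical association with context words can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not give. It assumes a Jaccard-style co-occurrence measure as the lexical association, a hypothetical interpolation parameter `lam`, and a renormalization step that keeps the document's total weight unchanged. The function names `cooccurrence_association` and `smooth_document` are illustrative only.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_association(docs):
    """Build a Jaccard-style association function from document co-occurrence.

    Assumption: association(a, b) = |docs with a and b| / |docs with a or b|.
    The paper may use a different lexical association measure.
    """
    df = Counter()       # number of documents containing each term
    pair_df = Counter()  # number of documents containing each term pair
    for doc in docs:
        terms = sorted(set(doc))
        df.update(terms)
        pair_df.update(combinations(terms, 2))

    def assoc(a, b):
        if a == b:
            return 1.0
        key = (a, b) if a < b else (b, a)
        both = pair_df[key]
        union = df[a] + df[b] - both
        return both / union if union else 0.0

    return assoc


def smooth_document(doc, assoc, lam=0.5):
    """Redistribute term frequencies toward terms supported by their context.

    `lam` is a hypothetical mixing parameter: 0 leaves the raw frequencies
    untouched, larger values give more influence to the context support.
    """
    tf = Counter(doc)
    total = sum(tf.values())
    # Context support: association of each term with the other document terms,
    # weighted by how often those context terms occur in the document.
    support = {
        t: sum(assoc(t, c) * n for c, n in tf.items() if c != t)
        for t in tf
    }
    raw = {t: tf[t] * ((1 - lam) + lam * support[t]) for t in tf}
    norm = sum(raw.values())
    if norm == 0:
        return dict(tf)
    # Rescale so the weights sum to the original document length.
    return {t: total * w / norm for t, w in raw.items()}


# Toy corpus, used only to estimate associations.
corpus = [
    ["neural", "network", "training"],
    ["neural", "network", "model"],
    ["bank", "river", "water"],
    ["bank", "money", "loan"],
]
assoc = cooccurrence_association(corpus)
weights = smooth_document(["neural", "network", "bank"], assoc, lam=0.5)
print(weights)
```

In this toy example, "neural" and "network" co-occur throughout the corpus and so reinforce each other, while "bank" has no association with its context and ends up with a lower weight. Because the smoothed weights replace raw term frequencies, they could in principle feed any frequency-based scoring function, and computing them at indexing time would leave query-time cost unchanged, in line with the abstract's efficiency claim.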
    Original language: English
    Pages (from-to): 391-425
    Journal: Information Retrieval
    Volume: 16
    Issue number: 3
    Publication status: Published - 1 Jun 2013
