Abstract
Latent semantic indexing (LSI) is a popular technique in information retrieval (IR) applications. This paper presents a novel evaluation strategy based on image processing tools. The authors evaluate the discrete cosine transform (DCT) and the Cohen-Daubechies-Feauveau 9/7 (CDF9/7) wavelet transform as preprocessing steps before the singular value decomposition (SVD) stage of the LSI system. In addition, the effect of different threshold types on the search results is examined. The results show that accuracy can be increased by applying both transforms as a preprocessing step, with the hard-threshold function giving better performance. The choice of threshold value is a key factor in the transform process. The paper also describes the most effective database structure for efficient searching in the LSI system. © 2009 Elsevier B.V. All rights reserved.
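The two threshold types named in the abstract can be illustrated with a minimal sketch. It assumes a transform is applied to one column of a term-document matrix and the resulting coefficients are thresholded before the SVD stage. The abstract does not state how the transforms are applied or which threshold values were used, so the function names `dct_ortho`, `hard_threshold` and `soft_threshold`, the sample column, and the threshold `0.5` are all hypothetical and are not taken from the paper.

```python
import math

def dct_ortho(x):
    """Orthonormal DCT-II of a 1-D sequence, using the direct O(n^2) formula."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def hard_threshold(coeffs, t):
    """Zero coefficients with magnitude at most t; keep the others unchanged."""
    return [c if abs(c) > t else 0.0 for c in coeffs]

def soft_threshold(coeffs, t):
    """Zero coefficients with magnitude at most t; shrink the others toward zero by t."""
    return [math.copysign(abs(c) - t, c) if abs(c) > t else 0.0 for c in coeffs]

# Hypothetical term-frequency column for one document (illustrative values only).
column = [3.0, 0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 1.0]
coeffs = dct_ortho(column)

# Assumed threshold value; the abstract notes that this choice is a key factor.
hard = hard_threshold(coeffs, 0.5)
soft = soft_threshold(coeffs, 0.5)
```

Hard thresholding leaves the surviving coefficients untouched, while soft thresholding also reduces their magnitude, which is one plausible reason the two behave differently in retrieval. In a full pipeline the thresholded coefficients would then be passed to the SVD stage, which is omitted here.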
Original language | English |
---|---|
Pages (from-to) | 2406-2417 |
Journal | Neurocomputing |
Volume | 72 |
Issue number | 10-12 |
Publication status | Published (in print/issue) - Jun 2009 |
Keywords
- Latent semantic indexing
- Information retrieval
- Discrete cosine transform
- Singular value decomposition
- Cohen-Daubechies-Feauveau 9/7
- Hard thresholding
- Soft thresholding