Abstract
Data fusion synthesizes results from diverse sources, but the performance impact remains mysterious. This research reveals the inner workings of fusion through machine prophecy. Constructing a random forest model using TREC dataset benchmarks, we accurately predicted the performance of two fusion algorithms. The model achieved near-perfect R² scores above 0.9 by exploiting meaningful statistical features. Compared to linear regression, the tree-based ensemble provides superior insight. The importance of newly identified drivers, like P@1000 metrics, is quantified. With this prescient view, researchers can refine fusion techniques to offer better search. By uncovering the secrets of fusion success, machine learning guides the path to retrieval excellence.
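The abstract refers to two quantities without defining them: P@k (precision at rank k, here with k = 1000) as a predictive feature, and R² as the measure of prediction quality. The following is a minimal sketch of their standard definitions, not the authors' code. The function names, document identifiers, and inputs are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Set


def precision_at_k(ranked_doc_ids: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant (P@k).

    Divides by k even when fewer than k documents were retrieved,
    which is the usual convention for this metric.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_doc_ids[:k] if doc_id in relevant)
    return hits / k


def r2_score(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical usage: one ranked list and a set of relevant documents.
p_at_4 = precision_at_k(["d1", "d2", "d3", "d4"], {"d1", "d3"}, k=4)
```

An R² of 1.0 means the predictions match the observed values exactly, and 0.0 means they do no better than always predicting the mean, so scores above 0.9 indicate that most of the variance in fusion performance is explained by the model.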
Original language | English |
---|---|
Pages | 192-200 |
Number of pages | 9 |
Publication status | Published (in print/issue) - 1 Dec 2023 |
Event | 25th International Conference on Information Integration and Web Intelligence - Denpasar, Bali, Indonesia. Duration: 4 Dec 2023 → 6 Dec 2023. https://www.iiwas.org/conferences/iiwas2023/ |
Conference
Conference | 25th International Conference on Information Integration and Web Intelligence |
---|---|
Abbreviated title | iiWAS 2023 |
Country/Territory | Indonesia |
City | Bali |
Period | 4/12/23 → 6/12/23 |
Internet address | https://www.iiwas.org/conferences/iiwas2023/ |
Bibliographical note
Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Keywords
- data fusion
- performance prediction
- random forest
- information retrieval