Data Fusion Performance Prophecy: A Random Forest Revelation

Zhongmin Zhang, Shengli Wu

Research output: Contribution to conference › Paper › peer-review


Data fusion combines results from diverse sources, but its effect on retrieval performance is hard to anticipate. This research uses machine learning to predict that effect. Using a random forest model built on TREC benchmark datasets, we predicted the performance of two fusion algorithms. The model achieved R² scores above 0.9 by exploiting informative statistical features. Compared with linear regression, the tree-based ensemble proves superior. The importance of newly identified features, such as P@1000, is quantified. With these predictions, researchers can refine fusion techniques to deliver better search results. Understanding what drives fusion performance in this way can guide improvements in retrieval effectiveness.
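As an illustration only, the two measures named in the abstract, precision at a cutoff (P@k, with k = 1000 in the abstract) and the coefficient of determination (R²), can be sketched in plain Python. The function names, document identifiers, and numbers below are hypothetical and are not taken from the paper; this is a minimal sketch of the standard definitions, not the authors' implementation.

```python
from typing import Sequence, Set


def precision_at_k(ranking: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranking[:k] if doc_id in relevant)
    # The usual convention divides by k even if fewer than k documents
    # were retrieved.
    return hits / k


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical fused ranking and relevance judgments.
    fused_ranking = ["d3", "d1", "d7", "d2"]
    relevant_docs = {"d1", "d2"}
    print("P@2:", precision_at_k(fused_ranking, relevant_docs, 2))

    # Hypothetical actual vs. predicted fusion performance scores.
    actual_scores = [1.0, 2.0, 3.0]
    predicted_scores = [1.0, 2.0, 3.0]
    print("R^2:", r_squared(actual_scores, predicted_scores))
```

Under these definitions, a model that always predicts the mean of the actual values scores an R² of 0 and a perfect predictor scores 1, which is the scale on which the reported values above 0.9 should be read.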
Original language: English
Number of pages: 9
Publication status: Published (in print/issue) - 1 Dec 2023
Event: 25th International Conference on Information Integration and Web Intelligence - Denpasar, Bali, Indonesia
Duration: 4 Dec 2023 to 6 Dec 2023


Conference: 25th International Conference on Information Integration and Web Intelligence
Abbreviated title: iiWAS 2023

Bibliographical note

Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.


  • data fusion
  • performance prediction
  • random forest
  • information retrieval

