Filter and Wrapper Stacking Ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data

Sugam Budhraja, Maryam Doborjeh, Balkaran Singh, Samuel Tan, Zohreh Doborjeh, Edmund Lai, Alexander Merkin, Jimmy Lee, Wilson Goh, Nikola Kasabov

Research output: Contribution to journal › Article › peer-review

Abstract

Selecting informative features, such as accurate biomarkers for disease diagnosis, prognosis and response to treatment, is an essential task in the field of bioinformatics. Medical data often contain thousands of features, and identifying potential biomarkers is challenging because of the small number of samples in the data, method dependence and non-reproducibility. This paper proposes a novel ensemble feature selection method, named Filter and Wrapper Stacking Ensemble (FWSE), to identify reproducible biomarkers from high-dimensional omics data. In FWSE, filter feature selection methods are run on numerous subsets of the data to eliminate irrelevant features, and then wrapper feature selection methods are applied to rank the top features. The method was validated on four high-dimensional medical datasets related to mental illnesses and cancer. The results indicate that the features selected by FWSE are stable and statistically more significant than those obtained by existing methods, while also demonstrating biological relevance. Furthermore, FWSE is a generic method, applicable to various high-dimensional datasets in the fields of machine intelligence and bioinformatics.
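
The two-stage idea described in the abstract (filters on many data subsets to discard features, then wrappers to rank the survivors) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the filter score (a standardized mean difference), the wrapper (greedy forward selection around a nearest-centroid classifier scored by leave-one-out accuracy), the function names, the toy data and all thresholds (`n_subsets`, `frac`, `top_k`, `min_freq`) are assumptions chosen only to keep the example small and dependency-free.

```python
import random
import statistics


def make_toy_data(n=40, p=20, seed=1):
    """Hypothetical two-class data: features 0 and 1 are informative, the rest are noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(3.0 * label if j < 2 else 0.0, 1.0) for j in range(p)]
         for label in y]
    return X, y


def filter_score(X, y, j):
    """Assumed filter: absolute class-mean difference of feature j, scaled by spread."""
    a = [row[j] for row, label in zip(X, y) if label == 0]
    b = [row[j] for row, label in zip(X, y) if label == 1]
    spread = statistics.pstdev(a) + statistics.pstdev(b) + 1e-12
    return abs(statistics.mean(a) - statistics.mean(b)) / spread


def subsample(y, frac, rng):
    """Stratified random subset of sample indices, so both classes are always present."""
    idx = []
    for label in (0, 1):
        members = [i for i, v in enumerate(y) if v == label]
        idx += rng.sample(members, max(1, int(frac * len(members))))
    return idx


def filter_stage(X, y, n_subsets=30, frac=0.8, top_k=5, min_freq=0.6, seed=0):
    """Keep features that land in the filter's top_k on at least min_freq of the subsets."""
    rng = random.Random(seed)
    p = len(X[0])
    counts = [0] * p
    for _ in range(n_subsets):
        idx = subsample(y, frac, rng)
        Xs, ys = [X[i] for i in idx], [y[i] for i in idx]
        ranked = sorted(range(p), key=lambda j: filter_score(Xs, ys, j), reverse=True)
        for j in ranked[:top_k]:
            counts[j] += 1
    return [j for j in range(p) if counts[j] / n_subsets >= min_freq]


def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of a nearest-centroid classifier restricted to feats."""
    correct = 0
    for i in range(len(X)):
        dists = {}
        for label in (0, 1):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            centroid = [statistics.mean(r[j] for r in rows) for j in feats]
            dists[label] = sum((X[i][j] - c) ** 2 for j, c in zip(feats, centroid))
        correct += int(min(dists, key=dists.get) == y[i])
    return correct / len(X)


def wrapper_stage(X, y, candidates):
    """Assumed wrapper: greedy forward selection; the order of addition is the ranking."""
    selected, remaining = [], list(candidates)
    while remaining:
        best = max(remaining, key=lambda j: loo_accuracy(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected


X, y = make_toy_data()
kept = filter_stage(X, y)            # stage 1: eliminate unstable or irrelevant features
ranking = wrapper_stage(X, y, kept)  # stage 2: rank the surviving features
print("kept by filter stage:", kept)
print("wrapper ranking:", ranking)
```

Repeating the filter over many subsets is what targets reproducibility here: a feature survives only if it scores well consistently, not just on one particular draw of the samples. A real ensemble would presumably combine several filter and wrapper methods rather than the single ones assumed above.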
Original language: English
Article number: bbad382
Pages (from-to): 1-17
Number of pages: 17
Journal: Briefings in Bioinformatics
Volume: 24
Issue number: 6
Early online date: 26 Oct 2023
Publication status: Published (in print/issue) - Nov 2023

Bibliographical note

Funding Information:
This research is supported by the MBIE Catalyst: Strategic–New Zealand-Singapore Data Science Research Program and the National Research Foundation, Singapore, under its Industry Alignment Fund–Pre-positioning (IAF-PP) Funding Initiative. The LYRIKS study was supported by the National Research Foundation Singapore under the National Medical Research Council Translational and Clinical Research Flagship Program (NMRC/TCR/003/2008). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of National Research Foundation, Singapore.

Publisher Copyright:
© The Author(s) 2023. Published by Oxford University Press.

Keywords

  • feature selection
  • biomarker discovery
  • ensemble learning
  • high-dimensional data
  • genomics
  • proteomics
