Abstract
"Metagenomics" is the study of genomic sequences obtained directly from environmental microbial communities, with the aim of linking their structures with functional roles. The field has advanced at an unprecedented pace thanks to high-throughput omics sequencing, which yields biologically rich data sets. Metagenomic data, in which microbial species outnumber microbial samples, lead to the "curse of dimensionality". Hence the focus in metagenomics studies has moved towards developing efficient computational models using Machine Learning (ML) that reduce computational cost. In this paper, we comprehensively assessed various ML approaches to classifying high-dimensional human microbiota effectively into their functional phenotypes. We propose the application of embedded feature selection methods, namely Extreme Gradient Boosting and Penalized Logistic Regression, to determine important species. The resultant feature set enhanced the performance of one of the most popular state-of-the-art methods in metagenomic studies, Random Forest (RF). Experimental results indicate that the proposed method achieved the best results in terms of accuracy and area under the Receiver Operating Characteristic curve (ROC-AUC), together with a major improvement in processing time. It outperformed filter and wrapper feature selection methods applied with RF, as well as classifiers such as Support Vector Machine (SVM), Extreme Learning Machine (ELM), and k-Nearest Neighbors (k-NN).
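
The workflow the abstract outlines (embedded feature selection, then a Random Forest trained on the selected species) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn, uses a synthetic matrix in place of a real OTU abundance table, covers only the penalized logistic regression selector (not Extreme Gradient Boosting), and every hyperparameter value is an illustrative assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an OTU table (assumption): 100 samples, 500 features,
# so features outnumber samples as in the high-dimensional setting described.
X, y = make_classification(
    n_samples=100, n_features=500, n_informative=10,
    n_redundant=0, random_state=0,
)

# Embedded selection: an L1-penalized logistic regression drives many
# coefficients to zero; features with non-zero weight are kept.
# C=1.0 is an illustrative choice, not a value from the paper.
selector = SelectFromModel(
    LogisticRegression(penalty="l1", solver="liblinear", C=1.0, random_state=0)
)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", selector),
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
])

# Selection happens inside each fold, so the held-out samples never
# influence which features are kept.
scores = cross_val_score(pipeline, X, y, cv=5, scoring="roc_auc")

# Fit once on all data to inspect how many features survive selection.
pipeline.fit(X, y)
n_selected = int(pipeline.named_steps["select"].get_support().sum())

print(f"features kept: {n_selected} of {X.shape[1]}")
print(f"mean ROC-AUC over 5 folds: {scores.mean():.3f}")
```

Placing the selector inside the pipeline is a design choice: selecting features on the full data set before cross-validation would leak information from the test folds into the feature set.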
Original language | English |
---|---|
Pages (from-to) | 751-763 |
Number of pages | 14 |
Journal | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
Volume | 16 |
Issue number | 3 |
Early online date | 23 Jul 2018 |
DOIs | |
Publication status | Published (in print/issue) - 2018 |
Keywords
- Metagenomics
- Microbiota
- Embedded Feature Selection
- Operational Taxonomic Units (OTUs)
- Classification
Fingerprint
Dive into the research topics of 'A Comprehensive Study on Predicting Functional Role of Metagenomes Using Machine Learning Methods'. Together they form a unique fingerprint.
Profiles
Fiona Browne
- School of Computing - Lecturer
- Faculty Of Computing, Eng. & Built Env. - Lecturer
- Computer Science and Informatics Research
Person: Academic
Haiying Wang
- School of Computing - Reader
- Computer Science and Informatics Research
- Faculty Of Computing, Eng. & Built Env. - Reader
Person: Academic