PAAM-ML: A novel Phylogeny and Abundance aware Machine Learning Modelling Approach for Microbiome Classification

Jyotsna Talreja Wassan, Haiying / HY Wang, Fiona Browne, Huiru Zheng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Recent advances in high-throughput sequencing technologies have accelerated microbiome studies by profiling 16S rRNA genes present in microbial species. Identifying, analyzing, and targeting such microbial composition is important to provide an enriched analysis of microbial samples. In this paper, we propose a novel phylogeny and abundance aware machine learning modelling approach (PAAM-ML) for classifying microbial samples into their respective functional phenotypes. The approach integrates the abundance counts of microbial species with the relationships between them, which are encoded in their phylogenetic tree of life. It incorporates the underlying structural tree information into the abundance of microbial species (features) to create a phylogeny and abundance aware matrix structure (PAAM). The matrix is then used as input for machine learning (ML) models for microbiome classification. We compared the classification performance of PAAM-ML with state-of-the-art approaches using Phylogenetic Isometric Log-Ratio Transform (PhILR) and MetaPhyl across three use cases. PAAM-ML significantly improved the performance. It outperformed PhILR with p < 0.01 in Human Microbiome across 4 body sites. We also performed a comprehensive analysis of the proposed approach by applying feature engineering. Our experimental results indicate significant classification performance; for example, the highest accuracy of 0.977 and Matthews Correlation Coefficient of 0.961 were achieved when applying Random Forest and feature engineering over the PAAM associated with Human Microbiome.
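The abstract does not state how the PAAM is built, so the following is only a minimal sketch of the general idea under a stated assumption: each sample's OTU counts are extended with summed counts for every clade of a toy tree, giving a matrix that carries both abundance and tree structure. The tree, the OTU names, and the functions `phylo_abundance_row` and `matthews_corrcoef` are hypothetical and are not taken from the paper. The second function shows the standard binary form of the Matthews Correlation Coefficient reported in the results.

```python
import math

# Hypothetical toy phylogeny: each internal node maps to its children.
# Leaves are OTUs; none of these names come from the paper.
TREE = {
    "root": ["cladeA", "otu3"],
    "cladeA": ["otu1", "otu2"],
}
LEAVES = ["otu1", "otu2", "otu3"]
INTERNAL = ["cladeA", "root"]  # children are listed before their parents


def phylo_abundance_row(counts):
    """Extend one sample's OTU counts with a summed count for each clade.

    Assumption: summing leaf counts up the tree is one simple way to make
    features phylogeny-aware; the paper's actual construction may differ.
    """
    totals = dict(counts)
    for node in INTERNAL:
        totals[node] = sum(totals[child] for child in TREE[node])
    return [totals[name] for name in LEAVES + INTERNAL]


def matthews_corrcoef(y_true, y_pred):
    """Binary Matthews correlation coefficient for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Two made-up samples with raw OTU counts.
samples = [
    {"otu1": 5, "otu2": 0, "otu3": 2},
    {"otu1": 1, "otu2": 4, "otu3": 0},
]
matrix = [phylo_abundance_row(s) for s in samples]
print(matrix)
print(matthews_corrcoef([1, 1, 0, 0], [1, 0, 0, 0]))
```

Each row of `matrix` holds the three OTU counts followed by the `cladeA` and `root` totals, so the first sample becomes `[5, 0, 2, 5, 7]`. Such a matrix could then be passed to any classifier, for instance a random forest. The paper reports results across 4 body sites, which would need the multiclass generalisation of the coefficient rather than the binary form shown here.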
Language: English
Title of host publication: 2018 IEEE International Conference on Bioinformatics and Biomedicine
Pages: 44-49
Number of pages: 6
ISBN (Electronic): 978-1-5386-5488-0, 978-1-5386-5487-3
DOI: https://doi.org/10.1109/BIBM.2018.8621382
Publication status: Published - 3 Dec 2018
Event: 2018 IEEE International Conference on Bioinformatics and Biomedicine, Madrid, Spain
Duration: 3 Dec 2018 to 6 Dec 2018
http://orienta.ugr.es/bibm2018/

Conference

Conference: 2018 IEEE International Conference on Bioinformatics and Biomedicine
Abbreviated title: BIBM2018
Country: Spain
City: Madrid
Period: 3/12/18 to 6/12/18
Internet address: http://orienta.ugr.es/bibm2018/

Fingerprint

Learning systems
Mathematical transformations
Genes
Throughput
Phylogeny
Chemical analysis

Keywords

  • Classification
  • Machine learning
  • Metagenomics
  • Operational taxonomic units (OTUs)
  • Phylogeny

Cite this

Wassan, Jyotsna Talreja; Wang, Haiying / HY; Browne, Fiona; Zheng, Huiru. / PAAM-ML: A novel Phylogeny and Abundance aware Machine Learning Modelling Approach for Microbiome Classification. 2018 IEEE International Conference on Bioinformatics and Biomedicine. 2018. pp. 44-49
@inproceedings{62a68d9fc8b54f10a59bba8f2d63e6a7,
title = "PAAM-ML: A novel Phylogeny and Abundance aware Machine Learning Modelling Approach for Microbiome Classification",
abstract = "Recent advances in high-throughput sequencing technologies have accelerated microbiome studies by profiling 16S rRNA genes present in microbial species. Identifying, analyzing, and targeting such microbial composition is important to provide an enriched analysis of microbial samples. In this paper, we propose a novel phylogeny and abundance aware machine learning modelling approach (PAAM-ML) for classifying microbial samples into their respective functional phenotypes. The approach integrates the abundance counts of microbial species with the relationships between them, which are encoded in their phylogenetic tree of life. It incorporates the underlying structural tree information into the abundance of microbial species (features) to create a phylogeny and abundance aware matrix structure (PAAM). The matrix is then used as input for machine learning (ML) models for microbiome classification. We compared the classification performance of PAAM-ML with state-of-the-art approaches using Phylogenetic Isometric Log-Ratio Transform (PhILR) and MetaPhyl across three use cases. PAAM-ML significantly improved the performance. It outperformed PhILR with p < 0.01 in Human Microbiome across 4 body sites. We also performed a comprehensive analysis of the proposed approach by applying feature engineering. Our experimental results indicate significant classification performance; for example, the highest accuracy of 0.977 and Matthews Correlation Coefficient of 0.961 were achieved when applying Random Forest and feature engineering over the PAAM associated with Human Microbiome.",
keywords = "Classification, machine Learning, metagenomics, operational Taxonomic Units (OTUs), phylogeny",
author = "Wassan, {Jyotsna Talreja} and Wang, {Haiying / HY} and Browne, Fiona and Zheng, Huiru",
note = "Unable to find ISSN",
year = "2018",
month = "12",
day = "3",
doi = "10.1109/BIBM.2018.8621382",
language = "English",
isbn = "978-1-5386-5489-7",
pages = "44--49",
booktitle = "2018 IEEE International Conference on Bioinformatics and Biomedicine",

}


TY - GEN

T1 - PAAM-ML: A novel Phylogeny and Abundance aware Machine Learning Modelling Approach for Microbiome Classification

AU - Wassan, Jyotsna Talreja

AU - Wang, Haiying / HY

AU - Browne, Fiona

AU - Zheng, Huiru

N1 - Unable to find ISSN

PY - 2018/12/3

Y1 - 2018/12/3

AB - Recent advances in high-throughput sequencing technologies have accelerated microbiome studies by profiling 16S rRNA genes present in microbial species. Identifying, analyzing, and targeting such microbial composition is important to provide an enriched analysis of microbial samples. In this paper, we propose a novel phylogeny and abundance aware machine learning modelling approach (PAAM-ML) for classifying microbial samples into their respective functional phenotypes. The approach integrates the abundance counts of microbial species with the relationships between them, which are encoded in their phylogenetic tree of life. It incorporates the underlying structural tree information into the abundance of microbial species (features) to create a phylogeny and abundance aware matrix structure (PAAM). The matrix is then used as input for machine learning (ML) models for microbiome classification. We compared the classification performance of PAAM-ML with state-of-the-art approaches using Phylogenetic Isometric Log-Ratio Transform (PhILR) and MetaPhyl across three use cases. PAAM-ML significantly improved the performance. It outperformed PhILR with p < 0.01 in Human Microbiome across 4 body sites. We also performed a comprehensive analysis of the proposed approach by applying feature engineering. Our experimental results indicate significant classification performance; for example, the highest accuracy of 0.977 and Matthews Correlation Coefficient of 0.961 were achieved when applying Random Forest and feature engineering over the PAAM associated with Human Microbiome.

KW - Classification

KW - machine Learning

KW - metagenomics

KW - operational Taxonomic Units (OTUs)

KW - phylogeny

U2 - 10.1109/BIBM.2018.8621382

DO - 10.1109/BIBM.2018.8621382

M3 - Conference contribution

SN - 978-1-5386-5489-7

SP - 44

EP - 49

BT - 2018 IEEE International Conference on Bioinformatics and Biomedicine

ER -