A Metagenomic Hybrid Classifier for Paediatric Inflammatory Bowel Disease

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

Inflammatory bowel disease (IBD) is a group of inflammatory diseases of the human colon and small intestine. IBD symptoms are non-specific; diagnosis can be delayed because an invasive colonoscopy is required for confirmation. Delayed diagnosis is linked to poor growth in children. Imbalances in the human intestinal microbiome (the community of microorganisms that reside in the human gut) are thought to contribute to the development of IBD. Work done to date in classifying host health statuses from patterns in human microbiomes with supervised learning algorithms has focused on modelling what is present in the gut (i.e. a bacterial census) with the random forest algorithm. Metagenomic shotgun sequencing is required to understand what is occurring in the gut (i.e. gene functions) and is often cost-prohibitive for hundreds of samples. However, gene functions can be predicted with the Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) software package, which could represent a valuable source of new features. In this paper we investigate feature relevance across the feature set with the Boruta algorithm. We find that the majority of relevant features are from the predicted metagenome. Support vector machines (SVM) and multilayer perceptrons (MLP) are rarely used with microbiome datasets but offer several theoretical advantages. To determine if the new features and alternative algorithms are appropriate, we experiment with a range of machine learning and computational intelligence algorithms. With the best-performing algorithms we also implement a conditional multiple classifier system that can identify IBD presence, IBD subtype, and IBD activity from a non-invasive stool sample.
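
The abstract does not say how the conditional multiple classifier system is wired together. As a rough illustration only, the Python sketch below assumes that "conditional" means the subtype and activity classifiers are consulted only when a first-stage classifier reports IBD. The function name, the label strings, and the toy threshold rules standing in for trained models are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, Optional, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def classify_sample(
    sample: Features,
    presence_clf: Classifier,
    subtype_clf: Classifier,
    activity_clf: Classifier,
) -> Dict[str, Optional[str]]:
    """Run a three-stage cascade (assumed design, not the paper's).

    Stage 1 decides IBD presence. Stages 2 and 3 run only if stage 1
    returns the hypothetical label "IBD"; otherwise they stay None.
    """
    result: Dict[str, Optional[str]] = {
        "presence": presence_clf(sample),
        "subtype": None,
        "activity": None,
    }
    if result["presence"] != "IBD":
        return result
    result["subtype"] = subtype_clf(sample)
    result["activity"] = activity_clf(sample)
    return result


# Toy stand-ins for trained models: simple thresholds on made-up features.
def toy_presence(x: Features) -> str:
    return "IBD" if x[0] > 0.5 else "healthy"


def toy_subtype(x: Features) -> str:
    return "CD" if x[1] > 0.5 else "UC"


def toy_activity(x: Features) -> str:
    return "active" if x[2] > 0.5 else "remission"


ibd_case = classify_sample([0.9, 0.2, 0.7], toy_presence, toy_subtype, toy_activity)
healthy_case = classify_sample([0.1, 0.9, 0.9], toy_presence, toy_subtype, toy_activity)
```

In practice each stage could be any trained model exposing a predict-style call, such as the SVM or MLP models mentioned in the abstract; the cascade only fixes the order in which they are consulted.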
Language: English
Title of host publication: Unknown Host Publication
Pages: 1083-1089
Number of pages: 7
DOI: 10.1109/IJCNN.2016.7727318
Publication status: E-pub ahead of print - 3 Nov 2016
Event: World Congress on Computational Intelligence 2016 - Vancouver
Duration: 3 Nov 2016 → …

Conference

Conference: World Congress on Computational Intelligence 2016
Period: 3/11/16 → …

Fingerprint

Pediatrics
Classifiers
Genes
Supervised learning
Multilayer neural networks
Software packages
Learning algorithms
Artificial intelligence
Support vector machines
Learning systems
Health
Costs

Keywords

  • classification
  • metagenomes

Cite this

@inproceedings{1187e72a4e1048a29ae5b162031d5bbe,
title = "A Metagenomic Hybrid Classifier for Paediatric Inflammatory Bowel Disease",
abstract = "Inflammatory bowel disease (IBD) is a group of inflammatory diseases of the human colon and small intestine. IBD symptoms are non-specific; diagnosis can be delayed because an invasive colonoscopy is required for confirmation. Delayed diagnosis is linked to poor growth in children. Imbalances in the human intestinal microbiome (the community of microorganisms that reside in the human gut) are thought to contribute to the development of IBD. Work done to date in classifying host health statuses from patterns in human microbiomes with supervised learning algorithms has focused on modelling what is present in the gut (i.e. a bacterial census) with the random forest algorithm. Metagenomic shotgun sequencing is required to understand what is occurring in the gut (i.e. gene functions) and is often cost-prohibitive for hundreds of samples. However, gene functions can be predicted with the Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) software package, which could represent a valuable source of new features. In this paper we investigate feature relevance across the feature set with the Boruta algorithm. We find that the majority of relevant features are from the predicted metagenome. Support vector machines (SVM) and multilayer perceptrons (MLP) are rarely used with microbiome datasets but offer several theoretical advantages. To determine if the new features and alternative algorithms are appropriate, we experiment with a range of machine learning and computational intelligence algorithms. With the best-performing algorithms we also implement a conditional multiple classifier system that can identify IBD presence, IBD subtype, and IBD activity from a non-invasive stool sample.",
keywords = "classification, metagenomes",
author = "Benjamin Wingfield and SA Coleman and T. Martin McGinnity and AJ Bjourson",
year = "2016",
month = "11",
day = "3",
doi = "10.1109/IJCNN.2016.7727318",
language = "English",
isbn = "978-1-5090-0620-5",
pages = "1083--1089",
booktitle = "Unknown Host Publication",

}

A Metagenomic Hybrid Classifier for Paediatric Inflammatory Bowel Disease. / Wingfield, Benjamin; Coleman, SA; McGinnity, T. Martin; Bjourson, AJ.

Unknown Host Publication. 2016. p. 1083-1089.

TY - GEN

T1 - A Metagenomic Hybrid Classifier for Paediatric Inflammatory Bowel Disease

AU - Wingfield, Benjamin

AU - Coleman, SA

AU - McGinnity, T. Martin

AU - Bjourson, AJ

PY - 2016/11/3

Y1 - 2016/11/3

AB - Inflammatory bowel disease (IBD) is a group of inflammatory diseases of the human colon and small intestine. IBD symptoms are non-specific; diagnosis can be delayed because an invasive colonoscopy is required for confirmation. Delayed diagnosis is linked to poor growth in children. Imbalances in the human intestinal microbiome (the community of microorganisms that reside in the human gut) are thought to contribute to the development of IBD. Work done to date in classifying host health statuses from patterns in human microbiomes with supervised learning algorithms has focused on modelling what is present in the gut (i.e. a bacterial census) with the random forest algorithm. Metagenomic shotgun sequencing is required to understand what is occurring in the gut (i.e. gene functions) and is often cost-prohibitive for hundreds of samples. However, gene functions can be predicted with the Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) software package, which could represent a valuable source of new features. In this paper we investigate feature relevance across the feature set with the Boruta algorithm. We find that the majority of relevant features are from the predicted metagenome. Support vector machines (SVM) and multilayer perceptrons (MLP) are rarely used with microbiome datasets but offer several theoretical advantages. To determine if the new features and alternative algorithms are appropriate, we experiment with a range of machine learning and computational intelligence algorithms. With the best-performing algorithms we also implement a conditional multiple classifier system that can identify IBD presence, IBD subtype, and IBD activity from a non-invasive stool sample.

KW - classification

KW - metagenomes

U2 - 10.1109/IJCNN.2016.7727318

DO - 10.1109/IJCNN.2016.7727318

M3 - Conference contribution

SN - 978-1-5090-0620-5

SP - 1083

EP - 1089

BT - Unknown Host Publication

ER -