Stratification of Type-2 Diabetes comorbidities using genotypic array and Machine Learning

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Background: The treatment of comorbidities remains costly and represents a major priority in Evidence-Based Medicine (EBM). Determining the molecular subclasses of proinflammatory comorbid conditions genetically is important for stratifying patients who may respond more effectively to specific treatment interventions. The objective of this study is to develop a Machine Learning (ML) based classifier to stratify patients with Type-2 Diabetes and different comorbidities.

Methods: Samples from a preliminary dataset of 254 people with Type-2 Diabetes recruited at NICSM were genotyped with an Affymetrix UKBioBank Axiom Array. SNP results for 80 patient samples of class DCM1 (i.e. Type-2 Diabetes associated with comorbidities of the circulatory system) and 90 patient samples of class DCM2 (i.e. Type-2 Diabetes associated with comorbidities of the digestive system) were filtered through feature selection using ANOVA, Chi-square and Fast Correlation Based Filter. The top 10 SNPs, along with information from Electronic Care Records (ECR), were selected for building 5 ML binary classifiers, using Support Vector Machine, Random Forest, Artificial Neural Network, Decision Tree and Naive Bayes algorithms, and their performance was tested with 10-fold cross-validation.
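
The 10-fold cross-validation step can be illustrated with a minimal sketch. The abstract does not state whether the folds were stratified or which software was used, so the stratified round-robin split below, the helper name `stratified_folds`, and the random seed are assumptions made purely for illustration. Only the class sizes (80 DCM1, 90 DCM2) come from the abstract.

```python
import random
from collections import Counter


def stratified_folds(labels, k=10, seed=0):
    """Assign each sample index to one of k folds, balancing classes.

    Hypothetical helper (not from the study): the indices of each class
    are shuffled and dealt round-robin, so per-fold class counts differ
    by at most one.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Class sizes taken from the abstract: 80 DCM1 and 90 DCM2 samples.
labels = ["DCM1"] * 80 + ["DCM2"] * 90
folds = stratified_folds(labels, k=10)

for fold_no, test_idx in enumerate(folds, start=1):
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    counts = Counter(labels[i] for i in test_idx)
    # A classifier would be fitted on train_idx and scored on test_idx here.
    print(fold_no, len(train_idx), counts["DCM1"], counts["DCM2"])
```

With these class sizes, each fold holds out 8 DCM1 and 9 DCM2 samples (17 in total) and leaves the remaining 153 for training.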

Results: Of the 5 classifiers, the Naive Bayes algorithm outperformed all others with an Area Under the Curve score of 0.681, overall Classification Accuracy of 65.68% and Matthews Correlation Coefficient of 0.316.
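
For readers unfamiliar with the reported metrics, the sketch below shows how Classification Accuracy and the Matthews Correlation Coefficient are computed from a binary confusion matrix. The counts are hypothetical, chosen only to match the class sizes of 80 and 90; they are not the study's confusion matrix, which the abstract does not report. Treating DCM1 as the positive class is likewise an assumption.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts (NOT from the study), with DCM1 as the positive class.
tp, fn = 50, 30  # the 80 DCM1 samples
fp, tn = 28, 62  # the 90 DCM2 samples

acc = accuracy(tp, fp, tn, fn)
mcc = matthews_corrcoef(tp, fp, tn, fn)
print(f"accuracy={acc:.4f}  MCC={mcc:.4f}")
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, which makes it a useful complement to accuracy when the two classes are not the same size.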

Conclusion: Further improvement in the performance of our ML classifier is currently in progress. With the inclusion of further data from ECR, as well as data from public repositories, we hope to build a better classifier.
Language: English
Title of host publication: 21st Meeting of the Irish Society of Human Genetics
Pages: 70
Volume: 88(1)
Publication status: Published - 22 Jan 2019
Event: 21st Meeting of the Irish Society of Human Genetics - Dublin, Ireland
Duration: 21 Sep 2018 - 21 Sep 2018
Conference number: 21

Conference

Conference: 21st Meeting of the Irish Society of Human Genetics
Country: Ireland
City: Dublin
Period: 21/09/18 - 21/09/18

Fingerprint

Type 2 Diabetes Mellitus
Comorbidity
Single Nucleotide Polymorphism
Digestive System
Decision Trees
Evidence-Based Medicine
Cardiovascular System
Area Under Curve
Analysis of Variance
Machine Learning
Therapeutics

Keywords

  • Machine Learning
  • Microarrays
  • Personalised Medicine
  • Diabetes
  • Comorbidities

Cite this

@inproceedings{ba3da9cba9d64e639804ef32f1baa67c,
title = "Stratification of Type-2 Diabetes comorbidities using genotypic array and Machine Learning",
abstract = "Background: The treatment of comorbidities remains costly and represents a major priority in Evidence-Based Medicine (EBM). Determining the molecular subclasses of proinflammatory comorbid conditions genetically is important for stratifying patients who may respond more effectively to specific treatment interventions. The objective of this study is to develop a Machine Learning (ML) based classifier to stratify patients with Type-2 Diabetes and different comorbidities. Methods: Samples from a preliminary dataset of 254 people with Type-2 Diabetes recruited at NICSM were genotyped with an Affymetrix UKBioBank Axiom Array. SNP results for 80 patient samples of class DCM1 (i.e. Type-2 Diabetes associated with comorbidities of the circulatory system) and 90 patient samples of class DCM2 (i.e. Type-2 Diabetes associated with comorbidities of the digestive system) were filtered through feature selection using ANOVA, Chi-square and Fast Correlation Based Filter. The top 10 SNPs, along with information from Electronic Care Records (ECR), were selected for building 5 ML binary classifiers, using Support Vector Machine, Random Forest, Artificial Neural Network, Decision Tree and Naive Bayes algorithms, and their performance was tested with 10-fold cross-validation. Results: Of the 5 classifiers, the Naive Bayes algorithm outperformed all others with an Area Under the Curve score of 0.681, overall Classification Accuracy of 65.68{\%} and Matthews Correlation Coefficient of 0.316. Conclusion: Further improvement in the performance of our ML classifier is currently in progress. With the inclusion of further data from ECR, as well as data from public repositories, we hope to build a better classifier.",
keywords = "Machine Learning, Microarrays, Personalised Medicine, Diabetes, Comorbidities",
author = "Angelina Villikudathil and {Mc Guigan}, Declan and Andrew English and Catriona Kelly and Paula McClean and AJ Bjourson and Priyank Shukla",
year = "2019",
month = "1",
day = "22",
language = "English",
volume = "88(1)",
pages = "70",
booktitle = "21st Meeting of the Irish Society of Human Genetics",

}

Villikudathil, A, Mc Guigan, D, English, A, Kelly, C, McClean, P, Bjourson, AJ & Shukla, P 2019, Stratification of Type-2 Diabetes comorbidities using genotypic array and Machine Learning. in 21st Meeting of the Irish Society of Human Genetics. vol. 88(1), pp. 70, 21st Meeting of the Irish Society of Human Genetics, Dublin, Ireland, 21/09/18.

Stratification of Type-2 Diabetes comorbidities using genotypic array and Machine Learning. / Villikudathil, Angelina; Mc Guigan, Declan; English, Andrew; Kelly, Catriona; McClean, Paula; Bjourson, AJ; Shukla, Priyank.

21st Meeting of the Irish Society of Human Genetics. Vol. 88(1) 2019. p. 70.

TY - GEN

T1 - Stratification of Type-2 Diabetes comorbidities using genotypic array and Machine Learning

AU - Villikudathil, Angelina

AU - Mc Guigan, Declan

AU - English, Andrew

AU - Kelly, Catriona

AU - McClean, Paula

AU - Bjourson, AJ

AU - Shukla, Priyank

PY - 2019/1/22

Y1 - 2019/1/22

AB - Background: The treatment of comorbidities remains costly and represents a major priority in Evidence-Based Medicine (EBM). Determining the molecular subclasses of proinflammatory comorbid conditions genetically is important for stratifying patients who may respond more effectively to specific treatment interventions. The objective of this study is to develop a Machine Learning (ML) based classifier to stratify patients with Type-2 Diabetes and different comorbidities. Methods: Samples from a preliminary dataset of 254 people with Type-2 Diabetes recruited at NICSM were genotyped with an Affymetrix UKBioBank Axiom Array. SNP results for 80 patient samples of class DCM1 (i.e. Type-2 Diabetes associated with comorbidities of the circulatory system) and 90 patient samples of class DCM2 (i.e. Type-2 Diabetes associated with comorbidities of the digestive system) were filtered through feature selection using ANOVA, Chi-square and Fast Correlation Based Filter. The top 10 SNPs, along with information from Electronic Care Records (ECR), were selected for building 5 ML binary classifiers, using Support Vector Machine, Random Forest, Artificial Neural Network, Decision Tree and Naive Bayes algorithms, and their performance was tested with 10-fold cross-validation. Results: Of the 5 classifiers, the Naive Bayes algorithm outperformed all others with an Area Under the Curve score of 0.681, overall Classification Accuracy of 65.68% and Matthews Correlation Coefficient of 0.316. Conclusion: Further improvement in the performance of our ML classifier is currently in progress. With the inclusion of further data from ECR, as well as data from public repositories, we hope to build a better classifier.

KW - Machine Learning

KW - Microarrays

KW - Personalised Medicine

KW - Diabetes

KW - Comorbidities

UR - https://www.ums.ac.uk/umj088/088(1)062.pdf

M3 - Conference contribution

VL - 88(1)

SP - 70

BT - 21st Meeting of the Irish Society of Human Genetics

ER -

Villikudathil A, Mc Guigan D, English A, Kelly C, McClean P, Bjourson AJ et al. Stratification of Type-2 Diabetes comorbidities using genotypic array and Machine Learning. In 21st Meeting of the Irish Society of Human Genetics. Vol. 88(1). 2019. p. 70