Stratification of type-2 diabetes comorbidities using genotypic array and machine learning

Angelina Villikudathil, Declan Mc Guigan, Andrew English, Catriona Kelly, Paula McClean, AJ Bjourson, Priyank Shukla

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Background: The treatment of comorbidities remains costly and is a major priority in Evidence-Based Medicine (EBM). Genetically determining the molecular subclasses of proinflammatory comorbid conditions is important for stratifying patients who may respond more effectively to specific treatment interventions. The objective of this study is to develop a Machine Learning (ML) based classifier to stratify patients with Type-2 Diabetes and different comorbidities.

Methods: Samples in a preliminary dataset from 254 people with Type-2 Diabetes recruited at NICSM were genotyped with an Affymetrix UKBioBank Axiom Array. SNP results for 80 patient samples of class DCM1 (i.e. Type-2 Diabetes associated with comorbidities of the circulatory system) and 90 patient samples of class DCM2 (i.e. Type-2 Diabetes associated with comorbidities of the digestive system) were filtered through feature selection using ANOVA, Chi-square and Fast Correlation Based Filter. The top 10 SNPs, along with information from Electronic Care Records (ECR), were used to build 5 binary ML classifiers with the Support Vector Machine, Random Forest, Artificial Neural Network, Decision Tree and Naive Bayes algorithms, and their performance was tested with 10-fold cross-validation.
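
The abstract does not say which software was used or how the three filters were combined, so the following is only a minimal sketch of two steps named above: ranking SNPs by a chi-square statistic against the DCM1/DCM2 label, and splitting samples into 10 stratified folds. The genotype coding (0/1/2), the SNP names and the synthetic data are assumptions made for illustration; only the class sizes (80 and 90) come from the text.

```python
import random
from collections import Counter


def chi_square_score(genotypes, labels):
    """Pearson chi-square statistic of one SNP's genotype codes vs. class labels."""
    n = len(labels)
    observed = Counter(zip(genotypes, labels))
    genotype_totals = Counter(genotypes)
    class_totals = Counter(labels)
    stat = 0.0
    for g, g_n in genotype_totals.items():
        for c, c_n in class_totals.items():
            expected = g_n * c_n / n  # count expected under independence
            stat += (observed[(g, c)] - expected) ** 2 / expected
    return stat


def top_k_snps(snp_table, labels, k=10):
    """Return the names of the k SNPs with the highest chi-square score."""
    scores = {name: chi_square_score(col, labels) for name, col in snp_table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign sample indices to folds so each fold keeps the class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for i, c in enumerate(labels):
        by_class.setdefault(c, []).append(i)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[(j + offset) % n_folds].append(i)  # deal round-robin
        offset += len(idxs)
    return folds


# Synthetic example (hypothetical SNP names, not study data).
labels = ["DCM1"] * 80 + ["DCM2"] * 90
snp_table = {
    "snp_noise": [i % 3 for i in range(170)],       # unrelated to class
    "snp_informative": [0] * 80 + [2] * 90,         # perfectly class-linked
}
selected = top_k_snps(snp_table, labels, k=1)
folds = stratified_folds(labels, n_folds=10)
```

With these class sizes each fold holds 8 DCM1 and 9 DCM2 samples. In practice, feature selection is often repeated inside each training fold to avoid information leaking into the held-out fold; the abstract does not state whether this was done.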

Results: Of the 5 classifiers, Naive Bayes performed best, with an Area Under the Curve (AUC) score of 0.681, an overall Classification Accuracy of 65.68% and a Matthews Correlation Coefficient (MCC) of 0.316.
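
For readers unfamiliar with the three metrics reported above, the sketch below shows how each is commonly computed for a binary classifier. The confusion-matrix counts and scores in the example are hypothetical and are not the study's data; the abstract reports only the final metric values.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; ranges from -1 to +1, 0 is chance level."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a positive sample outscores a negative one."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts and scores, for illustration only.
example_acc = accuracy(tp=6, tn=3, fp=1, fn=2)
example_mcc = matthews_corrcoef(tp=6, tn=3, fp=1, fn=2)
example_auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3])
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it informative when the two classes differ in size, as DCM1 and DCM2 do here.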

Conclusion: Further improvement in the performance of our ML classifier is currently in progress. With the inclusion of further data from ECR, as well as data from public repositories, we hope to build a better classifier.
Original language: English
Title of host publication: 21st Meeting of the Irish Society of Human Genetics
Publisher: Ulster Medical Journal
Publication status: Published (in print/issue) - 22 Jan 2019
Event: 21st Meeting of the Irish Society of Human Genetics - Dublin, Ireland
Duration: 21 Sept 2018 - 21 Sept 2018
Conference number: 21


Conference: 21st Meeting of the Irish Society of Human Genetics


  • Machine Learning
  • Microarrays
  • Personalised Medicine
  • Type 2 diabetes
  • Comorbidities
  • Multimorbidity
  • Genomics
  • Bioinformatics

