Computational approaches in biomarker discovery for treatment response and comorbidity in type-2 diabetes

  • Angelina Villikudathil

Student thesis: Doctoral Thesis


Type-2 Diabetes Mellitus (T2DM) is a major global health concern affecting millions of people worldwide. An estimated 35% of individuals with T2DM do not benefit from metformin therapy, and on average 40% of diabetic individuals do not benefit from GLP-1 therapy. Prolonged elevation of HbA1c levels in patients contributes to the development of detrimental secondary complications. Comorbidities are common in individuals with T2DM: in a survey conducted in 2004, 88.6% of individuals had one comorbid condition, while 15% had four or more. The objectives of this thesis are: (i) to identify and understand biomarkers associated with treatment response and comorbidities in T2DM using clinical, genomic and proteomic approaches; (ii) to identify and understand subgroups that respond to specific treatment interventions, namely metformin monotherapy and GLP-1 therapy, and subgroups within the comorbidities of T2DM; (iii) to develop Machine Learning (ML) based models that can effectively stratify treatment groups, non-response groups and comorbid groups before diagnosis using a genomic approach, and after diagnosis using clinical and proteomic approaches.

This thesis reports analyses of a T2DM dataset from the Diastrat cohort, recruited at the Northern Ireland Centre for Stratified Medicine (NICSM), Clinical Translational Research and Innovation Centre (C-TRIC), Altnagelvin Area Hospital, Ulster University. The SNPs rs6551649, rs6551654, rs4495065, rs7940817, rs6724083, rs7596517, rs7593167, rs12911054, rs2569431, rs9844730, rs73590361 and rs964680 are novel biomarker findings of this thesis; they may differentiate and characterise the treatment response groups and the T2DM comorbid groups, and help in understanding them. The possible subgroups identified in this thesis using a clinical approach can greatly aid the application of personalised medicine research. The genomic ML models built in this thesis, with a classification accuracy of 83%, can aid early prediction of the comorbid T2DM groups. The clinical and proteomic ML models, with classification accuracies of 91% and 95% respectively, can aid in identifying, after diagnosis of T2DM, the subgroups of patients who would not respond to the treatment regime, based on their circulating protein and bioclinical levels.
Date of Award: Nov 2022
Original language: English
Supervisor: Paula McClean (Supervisor), Tony Bjourson (Supervisor) & Priyank Shukla (Supervisor)


  • Machine learning
  • Biomarker discovery
  • Type-2 diabetes
  • Genomics
  • Proteomics
  • Clinical data analysis
  • Feature selection
