Autoencoder imputation of missing heterogeneous data for Alzheimer's disease classification

Research output: Contribution to journal › Article › peer-review

Abstract

Missing data are prevalent in Alzheimer's disease (AD) datasets and pose significant challenges for AD diagnosis. Previous studies have explored various imputation approaches on AD data, but systematic evaluation of deep learning algorithms for imputing heterogeneous and comprehensive AD data remains limited. This study investigates the efficacy of denoising autoencoder-based imputation of missing key features in heterogeneous data comprising tau-PET, MRI, cognitive and functional assessments, genotype, sociodemographic, and medical history data. The authors focused on extreme (≥40%) missing-at-random levels in key features that depend on AD progression, identified as history of the mother having AD, APoE ε4 alleles, and clinical dementia rating. Along with features selected using traditional feature selection methods, latent features extracted from the denoising autoencoder are incorporated for subsequent classification. Using random forest classification with 10-fold cross-validation, the imputed datasets show robust AD predictive performance (accuracy: 79%–85%; precision: 71%–85%) across missingness levels, with high recall values at 40% missingness. Further, the dataset reduced using feature selection methods, including the autoencoder, achieved a higher classification score than the original complete dataset. These results highlight the effectiveness and robustness of the autoencoder in imputing crucial information for reliable AD prediction in AI-based clinical decision support systems.
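
As an illustration of the missing-at-random setting described in the abstract, the sketch below masks 40% of one key feature so that the chance of being masked depends only on an observed progression variable. It is a minimal sketch under stated assumptions, not the authors' procedure: the function `mask_mar`, the column names `stage` and `cdr`, the linear weighting, and the toy data are all hypothetical. The study's own code is in the repository named in the Data Access Statement.

```python
import random


def mask_mar(rows, target, driver, rate=0.4, skew=1.0, seed=0):
    """Return copies of `rows` with `target` set to None in a `rate` share of rows.

    Missingness depends only on the observed `driver` column (missing at
    random): rows with a larger driver value get a larger sampling weight.
    The weighting scheme is an assumption made for illustration.
    """
    rng = random.Random(seed)
    n_missing = round(rate * len(rows))
    # Weight grows with the driver, so later stages are masked more often.
    weights = [1.0 + skew * row[driver] for row in rows]
    # Weighted sampling without replacement (Efraimidis-Spirakis):
    # draw a key u ** (1 / w) per row and keep the n_missing largest keys.
    keys = [rng.random() ** (1.0 / w) for w in weights]
    order = sorted(range(len(rows)), key=lambda i: keys[i], reverse=True)
    chosen = set(order[:n_missing])
    return [
        {**row, target: None} if i in chosen else dict(row)
        for i, row in enumerate(rows)
    ]


# Toy data: stage 0, 1, 2 stands in for increasing disease progression.
rows = [{"stage": i % 3, "cdr": 0.5 * (i % 3)} for i in range(300)]
masked = mask_mar(rows, target="cdr", driver="stage", rate=0.4)

# Count masked values per stage to see that missingness tracks progression.
missing_by_stage = {
    s: sum(r["cdr"] is None for r in masked if r["stage"] == s)
    for s in range(3)
}
print(missing_by_stage)
```

Sampling a fixed number of rows keeps the overall missingness level exact, which makes comparisons across missingness levels easier to read. An imputation model would then be trained to reconstruct the masked values from the remaining columns.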

Original language: English
Pages (from-to): 452-460
Number of pages: 9
Journal: Healthcare Technology Letters
Volume: 11
Issue number: 6
Early online date: 15 Sept 2024
Publication status: Published (in print/issue) - 31 Dec 2024

Bibliographical note

© 2024 The Author(s). Healthcare Technology Letters published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology.

Data Access Statement

The original data that support the findings of this study are available from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset, more specifically the ADNIMERGE-3 open repository at the https://adni.loni.usc.edu portal, on request, and were accessed after approval by the Data Sharing and Publication Committee of the Image and Data Archive (IDA). Processed data are available upon reasonable request. Code for the current study is available at https://github.com/NamithaHaridas/Denoising_AE_ADNI

Keywords

  • Dementia
  • Alzheimer's disease (AD)
  • data imputation
  • heterogeneous data
  • extreme missing data
  • Tau-PET Brain Imaging
  • gender
  • feature extraction
  • machine learning
  • deep learning
  • denoising autoencoder
  • Medical diagnostics
  • decision support system
  • classification
  • medical diagnostic computing
  • data mining
  • decision support systems
  • data reduction
  • learning (artificial intelligence)
  • feature selection
  • neural nets
