Abstract
Background: Accurate diagnosis is crucial to the treatment and management of Alzheimer’s disease (AD). However, clinical data can be incomplete or inconsistent, and the resulting “missing data” can affect computational algorithms that seek to identify disease severity objectively. In this work, we employed several computational methods to impute missing data and tested whether the imputed data can improve classification of cognitive impairment level.

Materials & Methods: We used data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), focusing on cognitive/functional assessments because they are often used in clinical decision making. We performed independent simulations in which portions of the values were randomly removed in various systematic ways, to reflect the factors that may underlie missing values. We then applied several missing-data imputation methods, including mean/median/mode substitution, multiple imputation (MI), k-nearest neighbours (k-NN), and random forest (RF) algorithms. The effect of the imputed values on the accuracy of predictive models was evaluated using a support vector machine (SVM) classification algorithm with respect to Clinical Dementia Rating Sum of Boxes (CDRSB).

Results: Overall, the RF algorithm performed best across the missing-data conditions. The performance of every method decreased as the proportion of missing data increased. With 20% of data missing, RF was the best method, with R² of 0.796 ± 0.016 and RMSE of 5.576; in the 10% condition it improved to R² of 0.854 ± 0.009 and RMSE of 2.090. For classification of CDRSB, the accuracy of the linear SVM model was 0.77 (95% CI 0.746-0.794) on the unmodified dataset, 0.669 (95% CI 0.577-0.753) with 20% missing data, and 0.732 (95% CI 0.705-0.757) with RF-imputed data.

Conclusions: Computational methods for missing-data imputation can add value to existing imperfect AD data by improving the classification accuracy of cognitive impairment level. Further work will investigate missing data in actual clinical datasets and in a more comprehensive way.
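The simulation design described in the abstract (remove values, impute them, score the imputed values against the originals) can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic rather than ADNI, values are removed completely at random only, and just mean substitution and a simple k-NN imputer are shown, with RF imputation and the SVM classification step left out. All function names, the four score scales, and the way RMSE and R² are computed over the hidden cells are assumptions made for illustration.

```python
import math
import random
from statistics import mean


def make_scores(n_rows, rng):
    """Hypothetical data: four correlated scores driven by one latent severity."""
    scales = (30.0, 18.0, 10.0, 70.0)  # made-up score ranges, not ADNI values
    rows = []
    for _ in range(n_rows):
        severity = rng.random()
        rows.append([s * severity + rng.gauss(0.0, 1.0) for s in scales])
    return rows


def mask_at_random(rows, fraction, rng):
    """Hide `fraction` of all cells (set to None); return masked copy and hidden positions."""
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    hidden = sorted(rng.sample(cells, round(fraction * len(cells))))
    hidden_set = set(hidden)
    masked = [[None if (i, j) in hidden_set else v for j, v in enumerate(row)]
              for i, row in enumerate(rows)]
    return masked, hidden


def column_means(rows):
    """Mean of the observed (non-None) values in each column."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(mean(observed) if observed else 0.0)
    return means


def impute_mean(rows):
    """Replace each missing cell with the mean of its column."""
    means = column_means(rows)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def impute_knn(rows, k=5):
    """Replace each missing cell with the average of that column over the k nearest rows."""
    means = column_means(rows)

    def distance(a, b):
        # Root-mean-square difference over the columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, v in enumerate(row):
            if v is None:
                donors = sorted((distance(row, other), other[j])
                                for m, other in enumerate(rows)
                                if m != i and other[j] is not None)
                nearest = [val for d, val in donors[:k] if d < math.inf]
                # Fall back to the column mean when no comparable row exists.
                new_row[j] = mean(nearest) if nearest else means[j]
        filled.append(new_row)
    return filled


def rmse_and_r2(truth, imputed, hidden):
    """Score imputed values against the true values at the hidden positions only."""
    actual = [truth[i][j] for i, j in hidden]
    predicted = [imputed[i][j] for i, j in hidden]
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    centre = mean(actual)
    ss_tot = sum((a - centre) ** 2 for a in actual)
    return math.sqrt(ss_res / len(actual)), 1.0 - ss_res / ss_tot


def run_demo(fraction=0.2, seed=0):
    """Mask synthetic data, impute with each method, and return {method: (RMSE, R2)}."""
    rng = random.Random(seed)
    truth = make_scores(200, rng)
    masked, hidden = mask_at_random(truth, fraction, rng)
    return {name: rmse_and_r2(truth, impute(masked), hidden)
            for name, impute in (("mean", impute_mean), ("knn", impute_knn))}


if __name__ == "__main__":
    for fraction in (0.1, 0.2):
        for name, (rmse, r2) in run_demo(fraction).items():
            print(f"{fraction:.0%} missing, {name}: RMSE={rmse:.3f}, R2={r2:.3f}")
```

Calling `run_demo(0.1)` and `run_demo(0.2)` mirrors the 10% and 20% conditions in spirit only. The numbers it prints describe the synthetic data and are not comparable to the reported results.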
Original language | English |
---|---|
Number of pages | 1 |
Publication status | Published (in print/issue) - 12 Sept 2018 |
Event | TMED 9 Conference - City Hotel, Derry, United Kingdom. Duration: 12 Sept 2018 → … |
Conference
Conference | TMED 9 Conference |
---|---|
Country/Territory | United Kingdom |
City | Derry |
Period | 12/09/18 → … |
Keywords
- Missing data
- Data imputation
- Dementia
- Alzheimer's disease (AD)