Evaluating the Impact of a Two-Stage Multivariate Data Cleansing Approach to Improve to the Performance of Machine Learning Classifiers: A Case Study in Human Activity Recognition

Dionicio Neira-Rodado, CD Nugent, I Cleland, Javier Velasquez, Amelec Viloria

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)
118 Downloads (Pure)

Abstract

Human activity recognition (HAR) is a popular field of study. Projects in this area have the potential to impact the quality of life of people with conditions such as dementia. HAR focuses primarily on applying machine learning classifiers to data from low-level sensors such as accelerometers. The performance of these classifiers can be improved through an adequate training process. To that end, multivariate outlier detection was used to improve the quality of the data in the training set and, subsequently, the performance of the classifier. The impact of the technique was evaluated with k-nearest neighbours (KNN) and random forest (RF) classifiers. In the case of KNN, the performance of the classifier improved from 55.9% to 63.59%.
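
The idea of cleansing the training set before fitting a classifier can be illustrated with a minimal sketch. The abstract does not say which outlier detector or which two stages the authors used, so everything below is an assumption: squared Mahalanobis distance computed per activity class, an empirical quantile as the cutoff, and a plain NumPy KNN. The names `clean_training_set`, `knn_predict`, and `keep_quantile` are hypothetical and do not come from the paper.

```python
import numpy as np


def clean_training_set(X, y, keep_quantile=0.95):
    """Drop rows that are multivariate outliers within their own class.

    Assumption: outlyingness is the squared Mahalanobis distance to the
    class mean, and rows above the per-class empirical quantile are removed.
    Each class needs at least two samples for the covariance estimate.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    keep = np.zeros(len(y), dtype=bool)
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        cls = X[idx]
        centered = cls - cls.mean(axis=0)
        # Pseudo-inverse tolerates a singular covariance matrix.
        cov_inv = np.linalg.pinv(np.atleast_2d(np.cov(cls, rowvar=False)))
        # Row-wise quadratic form: d2[i] = centered[i] @ cov_inv @ centered[i]
        d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
        keep[idx] = d2 <= np.quantile(d2, keep_quantile)
    return X[keep], y[keep]


def knn_predict(X_train, y_train, X_test, k=5):
    """Majority-vote k-nearest-neighbour prediction with Euclidean distance."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train)
    # Pairwise distances, shape (n_test, n_train).
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    preds = []
    for neighbour_labels in y_train[nearest]:
        labels, counts = np.unique(neighbour_labels, return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)
```

In such a setup, only the training data would be passed through `clean_training_set`; the test set stays untouched so that the comparison with the uncleaned baseline remains fair. Note that a quantile cutoff always discards a fixed share of each class, whether or not true outliers are present.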
Original language: English
Article number: 1858
Pages (from-to): 1-23
Number of pages: 23
Journal: Sensors
Volume: 20
Issue number: 7
Early online date: 27 Mar 2020
Publication status: Published online - 27 Mar 2020

Bibliographical note

Funding Information:
European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 734355.

Publisher Copyright:
© 2020 by the authors. Licensee MDPI, Basel, Switzerland.

Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.

Keywords

  • Dataset quality
  • HAR
  • Machine learning
  • Multivariate analysis
