Collection of a Diverse, Realistic and Annotated Dataset for Wearable Activity Recognition

Research output: Contribution to conference (Paper)

Abstract

This paper discusses the opportunities and
challenges associated with the collection of a large-scale, diverse
dataset for Activity Recognition. The dataset was collected by 141
undergraduate students, in a controlled environment. Students
collected triaxial accelerometer data from a wearable
accelerometer whilst each carrying out 3 of the 18 investigated
activities, categorized into 6 scenarios of daily living. This data was
subsequently labelled, anonymized and uploaded to a shared
repository. This paper presents an analysis of data quality through
outlier detection, and assesses the suitability of the dataset
for the creation and validation of Activity Recognition models.
This is achieved through the application of a range of common
data-driven machine learning approaches. Finally, the paper
describes challenges identified during the data collection process
and discusses how these could be addressed. Issues surrounding
data quality were identified, in particular the detection and
correction of poorly calibrated data. Results highlight the
potential of harnessing these diverse data for Activity Recognition.
Based on a comparison of six classification approaches, a Random
Forest provided the best classification (F-measure: 0.88). In future
data collection cycles, participants will be encouraged to collect a
set of “common” activities, to support generation of a larger
homogeneous dataset. Future work will seek to refine the
methodology further and to evaluate the model on new, unseen data.
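
The abstract reports classifier performance as an F-measure. As a reminder of what that figure summarizes, the following is a minimal sketch of a per-class and macro-averaged F-measure computed from paired true and predicted activity labels. The activity names and label lists are hypothetical, and the abstract does not state which averaging scheme (macro, micro or weighted) the authors used, so macro averaging here is an assumption made for illustration only.

```python
from collections import Counter


def per_class_f_measure(y_true, y_pred):
    """Return {label: F1} from paired true and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for true_label, pred_label in zip(y_true, y_pred):
        if true_label == pred_label:
            tp[true_label] += 1
        else:
            fp[pred_label] += 1   # predicted class gains a false positive
            fn[true_label] += 1   # true class gains a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = per_class_f_measure(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels, not taken from the dataset described above.
y_true = ["walking", "walking", "sitting", "sitting", "standing", "standing"]
y_pred = ["walking", "sitting", "sitting", "sitting", "standing", "walking"]

print(per_class_f_measure(y_true, y_pred))
print(round(macro_f_measure(y_true, y_pred), 3))
```

With imbalanced activity classes, a weighted average can differ noticeably from the macro average, so any comparison against the reported 0.88 would need the averaging scheme confirmed from the paper itself.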

Conference

Conference: 2018 IEEE International Conference on Pervasive Computing and Communications (PerCom Workshops)
Country: Greece
City: Athens
Period: 19/03/18 - 23/03/18

Keywords

  • Activity Recognition
  • Crowd Sourcing
  • Data Annotation
  • Data Collection
  • Data Quality
  • Data Sharing

Cite this

Cleland, I., Donnelly, MP., Nugent, CD., Hallberg, J., Espinilla, M., & Garcia-Constantino, M. (2018). Collection of a Diverse, Realistic and Annotated Dataset for Wearable Activity Recognition. 522-527. Paper presented at 2018 IEEE International Conference on Pervasive Computing and Communications (PerCom Workshops), Athens, Greece. https://doi.org/10.1109/PERCOMW.2018.8480322
