Abstract
This paper discusses the opportunities and challenges associated
with the collection of a large-scale, diverse dataset for Activity
Recognition. The dataset was collected by 141 undergraduate
students in a controlled environment. Students collected triaxial
accelerometer data from a wearable accelerometer whilst each
carrying out 3 of the 18 investigated activities, categorized into
6 scenarios of daily living. These data were subsequently labelled,
anonymized and uploaded to a shared repository. This paper
presents an analysis of data quality through outlier detection and
assesses the suitability of the dataset for the creation and
validation of Activity Recognition models. This is achieved through
the application of a range of common data-driven machine learning
approaches. Finally, the paper describes challenges identified
during the data collection process and discusses how these could be
addressed. Issues surrounding data quality were identified, in
particular the identification and correction of poorly calibrated
data. Results highlight the potential of harnessing these diverse
data for Activity Recognition. Based on a comparison of six
classification approaches, a Random Forest provided the best
classification (F-measure: 0.88). In future data collection cycles,
participants will be encouraged to collect a set of “common”
activities, to support generation of a larger homogeneous dataset.
Future work will seek to refine the methodology further and to
evaluate the model on new, unseen data.
Language | English |
---|---|
Pages | 522-527 |
DOIs | 10.1109/PERCOMW.2018.8480322 |
Publication status | Published - 23 Mar 2018 |
Event | 2018 IEEE International Conference on Pervasive Computing and Communications (PerCom Workshops) - Athens, Greece. Duration: 19 Mar 2018 → 23 Mar 2018 |
Conference
Conference | 2018 IEEE International Conference on Pervasive Computing and Communications (PerCom Workshops) |
---|---|
Country | Greece |
City | Athens |
Period | 19/03/18 → 23/03/18 |
Keywords
- Activity Recognition
- Crowd Sourcing
- Data Annotation
- Data Collection
- Data Quality
- Data Sharing
Cite this
Collection of a Diverse, Realistic and Annotated Dataset for Wearable Activity Recognition. / Cleland, I; Donnelly, MP; Nugent, CD; Hallberg, J; Espinilla, M; Garcia-Constantino, Matias.
2018. 522-527. Paper presented at 2018 IEEE International Conference on Pervasive Computing and Communications (PerCom Workshops), Athens, Greece.
Research output: Contribution to conference › Paper
TY - CONF
T1 - Collection of a Diverse, Realistic and Annotated Dataset for Wearable Activity Recognition
AU - Cleland, I
AU - Donnelly, MP
AU - Nugent, CD
AU - Hallberg, J
AU - Espinilla, M
AU - Garcia-Constantino, Matias
PY - 2018/3/23
Y1 - 2018/3/23
N2 - This paper discusses the opportunities and challenges associated with the collection of a large scale, diverse dataset for Activity Recognition. The dataset was collected by 141 undergraduate students, in a controlled environment. Students collected triaxial accelerometer data from a wearable accelerometer whilst each carrying out 3 of the 18 investigated activities, categorized into 6 scenarios of daily living. This data was subsequently labelled, anonymized and uploaded to a shared repository. This paper presents an analysis of data quality, through outlier detection and assesses the suitability of the dataset for the creation and validation of Activity Recognition models. This is achieved through the application of a range of common data driven machine learning approaches. Finally, the paper describes challenges identified during the data collection process and discusses how these could be addressed. Issues surrounding data quality, in particular, identifying and addressing poor calibration of the data were identified. Results highlight the potential of harnessing these diverse data for Activity Recognition. Based on a comparison of six classification approaches, a Random Forest provided the best classification (F-measure: 0.88). In future data collection cycles, participants will be encouraged to collect a set of “common” activities, to support generation of a larger homogeneous dataset. Future work will seek to refine the methodology further and to evaluate model on new unseen data.
KW - Activity Recognition
KW - Crowd Sourcing
KW - Data Annotation
KW - Data Collection
KW - Data Quality
KW - Data Sharing
U2 - 10.1109/PERCOMW.2018.8480322
DO - 10.1109/PERCOMW.2018.8480322
M3 - Paper
SP - 522
EP - 527
ER -