Abstract
Assessing the activities an individual performs, through smart devices or proprietary equipment, is a research area that is currently attracting a high level of interest. Current implementations of these systems use either subjective methods, for instance questionnaires administered in medical environments, or automated systems that use pre-captured data. Each of these approaches has its drawbacks: questionnaires are limited in accuracy by the patient’s recollection, and automated systems are limited by the data they are provided.
In automated systems and the wider field of machine learning, a major problem is the collection of accurate and comprehensive labels. Sensors can be worn by individuals and data captured ambiently; however, the difficulty lies in capturing the context for this data, that context being the activity that the user is currently performing. In many implementations, data-collection experiments are performed; these have practicality issues and, as mentioned, are limited to only the activities captured during the experiment. A potential solution is to reduce the burden of providing a fully labelled dataset by allowing a weaker labelling structure. Normally, a supervised classifier requires that each piece of data is provided with an associated label; however, weakly supervised techniques provide methods for handling inaccurate or incomplete annotations, and the literature has shown their effectiveness for classifying activity data.
Most people now carry some form of smart device, many of which are laden with an array of embedded sensors. This provides an ideal platform not only for recognition of activities but also for the ambient collection of data and a method of user-centric label collection. This work therefore focuses, firstly, on providing an overview of the current state of the art in weak supervision and activity recognition. It then investigates methods of collecting labels from users and efficiently applying these labels to large quantities of otherwise unlabelled data. Finally, methods of increasing the ambience of these implementations, by reducing label requests or otherwise, are tested.
These methodologies, when combined, provide a novel and significantly more feasible method of collecting activities from users when compared to the current state of the art. The combination of experience sampling and clustering, along with ideas derived from weak supervision, allowed a significant reduction in the overall number of labels required and removed the requirement to track the beginning and end times of activities, while providing classification performance similar to that of a fully supervised dataset. Studying the confidence of classifiers also allowed further reductions in the number of labels required and could be used to improve classification performance by propagating labels to unlabelled data points. It was also found that deep learning could be used to remove the requirement for a domain expert for feature analysis and extraction, while maintaining classification performance.
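The idea of propagating labels to unlabelled data points based on classifier confidence can be illustrated with a minimal sketch. It is not the method used in this work: the nearest-centroid classifier, the softmax-style confidence score, the `threshold` value and all function names are assumptions chosen only to keep the example short and dependency-free.

```python
import math


def centroids(points, labels):
    """Mean feature vector per class, computed from labelled points only."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        if y is None:  # None marks an unlabelled point
            continue
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_with_confidence(point, cents):
    """Return (label, confidence) from a softmax over negative distances."""
    scores = {y: math.exp(-math.dist(point, c)) for y, c in cents.items()}
    total = sum(scores.values())
    best = max(scores, key=scores.get)
    return best, scores[best] / total


def propagate_labels(points, labels, threshold=0.9, max_rounds=10):
    """Assign predicted labels to unlabelled points, but only when the
    (hypothetical) classifier's confidence reaches the threshold."""
    labels = list(labels)  # work on a copy
    for _ in range(max_rounds):
        cents = centroids(points, labels)
        changed = False
        for i, (p, y) in enumerate(zip(points, labels)):
            if y is not None:
                continue
            guess, conf = predict_with_confidence(p, cents)
            if conf >= threshold:
                labels[i] = guess
                changed = True
        if not changed:  # nothing new was labelled, so stop early
            break
    return labels


# Two user-provided labels and five unlabelled sensor feature vectors.
points = [(0, 0), (0.1, 0.2), (0.2, 0.1), (5, 5), (5.1, 4.9), (4.8, 5.2), (2.5, 2.5)]
labels = ["sit", None, None, "walk", None, None, None]
print(propagate_labels(points, labels))
```

Points close to a labelled example inherit its label, while an ambiguous point lying between the two groups stays unlabelled; in a user-centric system, such low-confidence points would be natural candidates for a label request.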
Date of Award | Jul 2020 |
---|---|
Original language | English |
Supervisor | Daniel Kelly (Supervisor), Tom Lunney (Supervisor) & Kevin Curran (Supervisor) |