Abstract
Human activity recognition has been an open problem in computer vision for almost two decades. Many approaches have been proposed in that time, but very few are computationally efficient enough for real-time applications. This has recently changed, with keypoint-based methods demonstrating high accuracy at low computational cost. These approaches take an image and return a set of joint locations for each individual in it. To achieve real-time performance, classification requires a sparse representation of these features over a given time frame. Previous methods have achieved this by using a reduced number of keypoints, but this gives a less robust representation of the individual’s body pose and may limit the types of activity that can be detected. We present a novel method for reducing the size of the feature set by calculating the Euclidean distance and the direction of keypoint changes across a number of frames. This allows for a meaningful representation of the individual’s movements over time. We show that this method achieves accuracy on par with current state-of-the-art methods while demonstrating real-time performance.
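The feature reduction described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how direction is encoded, how many frames are compared, or whether any normalisation is applied, so the choices here are assumptions: direction is taken as the `atan2` angle of the displacement, and only the first and last frame of a window are compared. The names `motion_features`, `Keypoint`, and `Frame` are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple

Keypoint = Tuple[float, float]   # (x, y) image coordinates of one joint
Frame = Sequence[Keypoint]       # all joints of one person in one frame


def motion_features(first: Frame, last: Frame) -> List[Tuple[float, float]]:
    """Return one (distance, direction) pair per keypoint.

    Assumption: the window is summarised by its first and last frame only,
    and direction is the displacement angle in radians in [-pi, pi].
    """
    if len(first) != len(last):
        raise ValueError("frames must contain the same number of keypoints")
    features = []
    for (x0, y0), (x1, y1) in zip(first, last):
        dx, dy = x1 - x0, y1 - y0
        distance = math.hypot(dx, dy)    # Euclidean distance moved
        direction = math.atan2(dy, dx)   # 0.0 when the keypoint did not move
        features.append((distance, direction))
    return features


# Two keypoints: the first moves by (3, 4), the second stays still.
start = [(0.0, 0.0), (1.0, 1.0)]
end = [(3.0, 4.0), (1.0, 1.0)]
feats = motion_features(start, end)
```

Under these assumptions, a window of any length collapses to two numbers per keypoint instead of one coordinate pair per keypoint per frame, which is the kind of size reduction the abstract refers to.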
Original language | English |
---|---|
Pages | 91-98 |
Publication status | Accepted/In press - 9 Feb 2021 |
Event | International Conference on image processing and vision engineering - Online Virtual. Duration: 27 Apr 2021 → 30 Apr 2021. http://www.improve.scitevents.org/ |
Conference
Conference | International Conference on image processing and vision engineering |
---|---|
Abbreviated title | IMPROVE |
Period | 27/04/21 → 30/04/21 |
Internet address | http://www.improve.scitevents.org/ |
Keywords
- human activity recognition
- computer vision
- pose estimation
- social signal processing