Version 1.0

The InSync data set was collected at the Pervasive Computing lab at Ulster University. It consists of subjects performing activities of daily living (ADLs) in a setting that mimics a real-life environment while data is collected using three different sensing technologies: inertial, image, and audio. The data set can be used to research human activity recognition algorithms for problems such as classification, transfer learning, data fusion, data segmentation, and feature extraction.

Number of instances:
16,959 (inertial data points) + 650 (thermal images) + 16,986 (audio files)

Relevant information:
InSync contains 12 hours of data from ten subjects, consisting of 78 runs (a run is one time that a subject performed the scripted protocol). Sensor data from three different technologies (inertial, image, and audio) captured the subjects actually performing, not simulating, the ADLs. All the activities were annotated a posteriori using a video stream.


Because the data set aimed to record the subjects' physical activity performance, the tasks consisted of ADLs set in well-known scenarios. Three general scenarios were chosen: a bedroom-related scenario, in which the subjects performed two ADLs, namely personal hygiene and dressing; a breakfast-related scenario, chosen to cover the ADL of feeding, which has been used extensively in the literature; and an obstacle-free scenario, in which the subjects could walk along to demonstrate their transferring capabilities.

The script was designed with nine high-level activities:

(1) Napping
(2) Wearing joggers
(3) Combing hair
(4) Brushing teeth

(5) Operating door

(6) Drinking water
(7) Eating cereal

(8) Transporting (i.e. walking)
(9) Resting (i.e. sitting in a chair)

Details of the room's dimensions and sensor locations are available in the Relevant Papers.


The deployed sensing technology included thirteen Shimmer devices equipped with 3-axis accelerometers, four Matrix Voice ESP32 boards, each with eight embedded microphones, and four Thermal Vision Sensors (TVS). The sensors were placed as follows:

Shimmers worn by the subject:
- Right wrist
- Left wrist
- Lower back
- Upper back
- Right shoe

Shimmers mounted on everyday items:
- Comb
- Toothbrush
- Glass
- Spoon
- Jogger
- Belt
- Strap to mimic a watch
- Strap to mimic smart shoe

Matrix Voice ESP32 (one located in each room):
- Bedroom
- Corridor
- Kitchen
- Living room

Thermal sensor (one located in each room):
- Bedroom
- Corridor
- Kitchen
- Living room
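
The placement lists above can be written down as a small lookup table, which makes it easy to check the device counts against the description. The sketch below is illustrative only: the dictionary name, the key names, and the helper function are hypothetical and are not part of the data set.

```python
from typing import Dict, List

# Hypothetical layout table transcribed from the placement lists above.
# Key names and location labels are illustrative, not data set identifiers.
DEPLOYMENT: Dict[str, List[str]] = {
    "shimmer_worn": [
        "right wrist", "left wrist", "lower back", "upper back", "right shoe",
    ],
    "shimmer_on_item": [
        "comb", "toothbrush", "glass", "spoon", "jogger", "belt",
        "strap mimicking a watch", "strap mimicking a smart shoe",
    ],
    "matrix_voice": ["bedroom", "corridor", "kitchen", "living room"],
    "thermal_sensor": ["bedroom", "corridor", "kitchen", "living room"],
}


def count_devices(deployment: Dict[str, List[str]]) -> Dict[str, int]:
    """Return the number of devices listed for each sensor group."""
    return {group: len(places) for group, places in deployment.items()}


counts = count_devices(DEPLOYMENT)
total_shimmers = counts["shimmer_worn"] + counts["shimmer_on_item"]
print(counts, total_shimmers)
```

The two Shimmer groups together account for the thirteen Shimmer devices mentioned in the description.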

Attribute information:
The data set comprises the readings of inertial sensors, thermal images, and audio files recorded while the subjects performed ADLs.

There are 60 attributes in total for the inertial data, which include the mean value and root-mean-square (RMS) of the x, y, and z axes. The thermal data consists of grayscale images of 32x32 pixels, and the audio data consists of waveform audio files sampled at 44.1 kHz.
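
As an illustration of the inertial attributes, the sketch below computes the mean and RMS of each axis over one window of accelerometer samples. It is a minimal example under stated assumptions: the function name, the feature names, and the idea of a fixed window of (x, y, z) tuples are hypothetical, since the description does not state how the readings were windowed or how the attributes are named in the files.

```python
import math
from typing import Dict, Sequence, Tuple

# One accelerometer reading as (x, y, z); this layout is an assumption.
Sample = Tuple[float, float, float]


def mean_rms_features(window: Sequence[Sample]) -> Dict[str, float]:
    """Compute the mean and root-mean-square of each axis over a window."""
    if not window:
        raise ValueError("window must contain at least one sample")
    features: Dict[str, float] = {}
    n = len(window)
    for index, axis in enumerate(("x", "y", "z")):
        values = [sample[index] for sample in window]
        features[f"mean_{axis}"] = sum(values) / n
        # RMS is the square root of the mean of the squared values.
        features[f"rms_{axis}"] = math.sqrt(sum(v * v for v in values) / n)
    return features


# Hypothetical two-sample window, chosen only to make the arithmetic easy to follow.
example = mean_rms_features([(3.0, 0.0, -1.0), (-3.0, 4.0, -1.0)])
print(example)
```

Each device contributes six values under this scheme (mean and RMS for three axes), so the mean shows the average level of an axis while the RMS also reflects the signal's magnitude when the samples alternate in sign.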

Videos of the experiment can be viewed at the following links.

(1) napping:
(2) wearing joggers:
(3) combing hair:
(4) brushing teeth:

(5) operating door:

(6) drinking water:
(7) eating cereal:

(8) transporting (i.e. walking):
(9) resting (i.e. sitting in a chair):

IMPORTANT: The videos listed above were recorded using conventional webcams. The videos were used as ground truth; they were used neither for training nor for testing. Note that the participants' identities have been protected by blurring their faces. The speed of the videos varies because different sampling rates were used when recording them.

Date made available: 18 Aug 2021
Publisher: Ulster University
Date of data production: Jan 2020 - May 2020
