Abstract
In the healthcare sector, there are
many challenges to accessing and sharing data.
These include data usage agreements, obtaining
ethical approvals and, most importantly, privacy [1].
Recently, researchers have been investigating how
these challenges can be overcome using Synthetic
Data (SD). SD are machine-generated data that can
be synthesised, either based on real-world data or
from statistical rules, with variations depending on
the generating methods and use cases. In this study,
a real-world dataset, mHealth (Mobile Health) [2], is
used to generate privacy-preserving SD. The
dataset was obtained by monitoring 10 participants
conducting 12 activities, whilst wearing three
accelerometers. It is important that the SD mimic
the distributions found in the real data whilst
preserving privacy. To achieve this, three
synthetic data generation techniques are investigated
and compared in the current study: SD
Vault Probabilistic Autoregressive (SDV-PAR)
[3], Time-series Generative Adversarial Network
(TGAN) [4], and Conditional Tabular Generative
Adversarial Network (CTGAN) [5].
Original language | English |
---|---|
Pages | 1 |
Number of pages | 1 |
Publication status | Published (in print/issue) - 23 May 2024 |
Event | Northern Ireland Biomedical Engineering Society Annual Symposium (NIBES) 2024 - Ulster University, Belfast, Northern Ireland. Duration: 23 May 2024 → 23 May 2024 |
Conference
Conference | Northern Ireland Biomedical Engineering Society Annual Symposium (NIBES) 2024 |
---|---|
Country/Territory | Northern Ireland |
City | Belfast |
Period | 23/05/24 → 23/05/24 |
Keywords
- Synthetic data generation
- machine learning
- human activity recognition