Synthetic Subject Generation with Coupled Coherent Time Series Data

Xabat Larrea, Mikel Hernandez, Gorka Epelde, Andoni Beristain, Cristina Molina, Ane Alberdi, Debbie Rankin, Panagiotis Bamidis, Evdokimos Konstantinidis

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

5 Citations (Scopus)
89 Downloads (Pure)

Abstract

A large amount of health and well-being data is collected daily, but little of it reaches its research potential because personal data privacy needs to be protected as an individual’s right, as reflected in the data protection regulations. Moreover, the data that do reach the public domain will typically have undergone anonymization, a process that can result in a loss of information and, consequently, research potential. Lately, synthetic data generation, which mimics the statistics and patterns of the original, real data on which it is based, has been presented as an alternative to data anonymization. As the data collected from health and well-being activities often have a temporal nature, these data tend to be time series data. The synthetic generation of this type of data has already been analyzed in different studies. However, in the healthcare context, time series data have reduced research potential without the subjects’ metadata, which are essential to explain the temporal data. Therefore, in this work, the option to generate synthetic subjects using both time series data and subject metadata has been analyzed. Two approaches for generating synthetic subjects are proposed. Real time series data are used in the first approach, while in the second approach, time series data are synthetically generated. Furthermore, the first proposed approach is implemented and evaluated. The generation of synthetic subjects with real time series data has been demonstrated to be functional, whilst the generation of synthetic subjects with synthetic time series data requires further improvements to demonstrate its viability.
Original language: English
Title of host publication: Engineering Proceedings
Subtitle of host publication: Proceedings, International Conference on Time Series and Forecasting (ITISE)
Publisher: MDPI
Pages: 7
Number of pages: 10
Volume: 18
Edition: 1
DOIs
Publication status: Published (in print/issue) - 21 Jun 2022

Publication series

Name: Engineering Proceedings
Publisher: MDPI

Bibliographical note

Funding Information:
This research was partly funded by the VITALISE (Virtual Health and well-being Living Lab Infrastructure) project, funded by the Horizon 2020 Framework Program of the European Union for Research and Innovation (grant agreement 101007990).

Publisher Copyright:
© 2022 by the authors.

Keywords

  • time series
  • synthetic data
  • shareable data
  • privacy
