Complementing real datasets with simulated data: a regression-based approach

Jonathan Synnott, Miguel Ortiz-Barrios, Jens Lundstrom, Eric Jarpe, Anita Sant'Anna

Research output: Contribution to journal › Article › peer-review


Activity recognition in smart environments is essential for ensuring the wellbeing of older residents. By tracking activities of daily living (ADLs), a person’s health status can be monitored over time. Accurate activity classification, however, must cope with the fact that each person performs ADLs differently and in homes with different layouts. One possible solution is to collect large amounts of data to train a supervised classifier, but data collection in real environments is very expensive and cannot capture every possible variation in how ADLs are performed. A more cost-effective solution is to generate a variety of simulated scenarios and synthesize large amounts of data. Simulated data, though, can differ considerably from real data. This paper therefore proposes the use of regression models to better approximate real observations from simulated data. To this end, ADL data from a smart home were first compared with equivalent ADLs performed in a simulator. The comparison considered the number of events per activity, the number of events per type of sensor per activity, and activity duration. Different regression models were then assessed for estimating real data from simulated data. The results showed that simulated data can be transformed into estimates of real data with a coefficient of determination of R² = 97.03%.
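
The idea of mapping simulated measurements onto real ones can be illustrated with a minimal sketch. It assumes a simple linear model as a stand-in (the abstract does not state which regression model was selected, and the keywords mention non-linear models), and the numbers below are made-up placeholder values, not data from the study. The names `fit_line`, `r_squared`, `simulated_duration`, and `real_duration` are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y ~= a + b * x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(y)
    ss_res = sum((yi - hi) ** 2 for yi, hi in zip(y, y_hat))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Placeholder values for illustration only (NOT taken from the paper):
# durations in seconds of the same activities in a simulator and in a real home.
simulated_duration = [30.0, 45.0, 60.0, 90.0, 120.0]
real_duration = [38.0, 52.0, 71.0, 99.0, 140.0]

# Fit real ~= a + b * simulated, then transform the simulated values.
a, b = fit_line(simulated_duration, real_duration)
predicted = [a + b * s for s in simulated_duration]
r2 = r_squared(real_duration, predicted)

print(f"real ~= {a:.2f} + {b:.2f} * simulated, R^2 = {r2:.4f}")
```

The same pattern would apply to the other compared quantities (events per activity, events per sensor type per activity), with one fitted model per quantity; a non-linear model could be substituted for `fit_line` without changing how R² is computed.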
Original language: English
Pages (from-to): 34301-34324
Number of pages: 24
Journal: Multimedia Tools and Applications
Publication status: Published (in print/issue) - 16 Jan 2020

Bibliographical note

Funding Information: The authors wish to acknowledge support from the REMIND Project from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 734355. Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Publisher Copyright: © 2020, Springer Science+Business Media, LLC, part of Springer Nature. Copyright: Copyright 2020 Elsevier B.V., All rights reserved.


  • Activity duration
  • Activity recognition
  • Determination coefficient
  • Non-linear models
  • Quantile-quantile plots
  • Regression analysis

