Machine learning using synthetic and real data: Similarity of evaluation metrics for different healthcare datasets and for different algorithms

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


Abstract

Sharing data is often a risk in terms of security and privacy, especially if the data is sensitive. Algorithms can be used to generate synthetic data from an original raw dataset so that the shared data are considered more ‘privacy preserving’ and offer a higher level of anonymity. In this paper, we carry out an experiment to study the validity of conducting machine learning on synthetic data. We compare the evaluation metrics produced by machine learning models trained on synthetic data with the metrics yielded by models trained on the corresponding real data.
Original language: English
Title of host publication: Data Science and Knowledge Engineering for Sensing Decision Support
Publisher: World Scientific Publishing
Pages: 1281-1291
Number of pages: 11
Volume: 11
ISBN (Electronic): 978-981-3273-24-5
ISBN (Print): 978-981-3273-22-1
Publication status: Published - 24 Aug 2018
