Abstract
Sharing data often poses security and privacy risks, especially when the data are sensitive. Algorithms can generate synthetic data from an original raw dataset so that the data shared are considered more ‘privacy preserving’ and offer a higher level of anonymity. In this paper, we carry out an experiment to study the validity of conducting machine learning on synthetic data. We compare the evaluation metrics of machine learning models trained on synthetic data with those of models trained on the corresponding real data.
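The comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy two-class dataset (`make_real_data`), the naive per-class Gaussian generator (`synthesise`), the nearest-centroid classifier, and the choice to score both models on the same held-out real records. It only shows the shape of such an experiment, not the authors' datasets, generators, or algorithms.

```python
import random
import statistics


def make_real_data(n, rng):
    """Hypothetical stand-in for a real dataset: two features, binary label."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 2.0 if label else 0.0
        rows.append(([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label))
    return rows


def group_by_class(rows):
    """Collect feature vectors under their class label."""
    by_class = {}
    for x, y in rows:
        by_class.setdefault(y, []).append(x)
    return by_class


def synthesise(rows, n, rng):
    """Naive generator (assumption): per-class, per-feature Gaussians fitted to the real rows."""
    by_class = group_by_class(rows)
    labels = [y for _, y in rows]
    out = []
    for _ in range(n):
        y = rng.choice(labels)  # sampling real labels keeps the class balance
        cols = zip(*by_class[y])
        out.append(([rng.gauss(statistics.mean(c), statistics.stdev(c)) for c in cols], y))
    return out


def fit_centroids(rows):
    """Nearest-centroid 'model': the mean feature vector of each class."""
    return {y: [statistics.mean(c) for c in zip(*xs)]
            for y, xs in group_by_class(rows).items()}


def predict(centroids, x):
    """Return the class whose centroid is closest to x (squared Euclidean distance)."""
    return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))


def accuracy(centroids, rows):
    """Fraction of rows whose predicted class matches the true label."""
    return sum(predict(centroids, x) == y for x, y in rows) / len(rows)


def compare(seed=0):
    """Train once on real and once on synthetic data; score both on held-out real data."""
    rng = random.Random(seed)
    real = make_real_data(600, rng)
    train, test = real[:400], real[400:]
    synthetic = synthesise(train, len(train), rng)
    acc_real = accuracy(fit_centroids(train), test)
    acc_syn = accuracy(fit_centroids(synthetic), test)
    return acc_real, acc_syn


if __name__ == "__main__":
    acc_real, acc_syn = compare()
    print(f"trained on real: {acc_real:.3f}, trained on synthetic: {acc_syn:.3f}")
```

A small gap between the two accuracies would suggest that, for this toy setup, the synthetic data supports a similar model; other metrics (precision, recall, and so on) could be compared in the same way.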
Field | Value |
---|---|
Original language | English |
Title of host publication | Data Science and Knowledge Engineering for Sensing Decision Support |
Publisher | World Scientific Publishing |
Pages | 1281-1291 |
Number of pages | 11 |
Volume | 11 |
ISBN (Electronic) | 978-981-3273-24-5 |
ISBN (Print) | 978-981-3273-22-1 |
Publication status | Published (in print/issue) - 24 Aug 2018 |
Title: Machine learning using synthetic and real data: Similarity of evaluation metrics for different healthcare datasets and for different algorithms
Profiles
- Michaela Black
  - School of Computing, Eng & Intel. Sys - Professor of Artificial Intelligence
  - Faculty Of Computing, Eng. & Built Env. - Full Professor
  - Person: Academic
- Debbie Rankin
  - School of Computing, Eng & Intel. Sys - Senior Lecturer in Computer Science
  - Faculty Of Computing, Eng. & Built Env. - Senior Lecturer
  - Person: Academic