attacker and the similarity threshold used by the
attacker. For example, reducing the size of the
original subset lowers the attacker's performance.
Nevertheless, the results obtained with half of the
original dataset were reported, as this represents a
meaningful test case.
4 CONCLUSIONS
This paper presents a dashboard application that,
through a simple and intuitive GUI, allows users to
conduct a quality analysis of a synthetic dataset
obtained with any generative method. The
application implements several evaluation metrics
covering three assessment aspects of synthetic data
quality: resemblance, utility and privacy
preservation. Furthermore, users can download
summary reports from the different evaluation
panels. The application is freely available for
download at (Santangelo, 2023).
In order to assess the performance of the proposed
metrics, they were used to evaluate the quality of
synthetic datasets obtained from two SDG methods,
namely HealthGAN and SDV. The original dataset
used is MIMIC-II, which contains EHR information
from ICU patients. In general, the synthetic data
successfully replicate the statistical properties of the
original data and the performance metrics of ML
classifiers trained on the original dataset. However,
the privacy aspect is not fully respected, since the
synthetic data are too similar to the original data.
Furthermore, the HealthGAN method appears to
outperform the SDV method.
A first limitation of this work concerns the type of
synthetic data considered, which includes only
tabular data, whereas EHRs may also include
bioimages and biosignals. All the implemented
metrics were designed for the evaluation of tabular
synthetic data, so evaluating synthetic data of a
different nature would require modifying them or
adding new metrics. Another limitation is the
handling of missing data: the application assumes
that input datasets do not contain missing values, so
datasets with missing values need to be imputed
before use, as sketched below.
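As a minimal illustration of this preprocessing step (not part of the dashboard itself), a dataset could be imputed with scikit-learn before being uploaded; the file names and the choice of imputation strategy here are assumptions for illustration only.

# Minimal sketch: impute missing values before loading a dataset
# into the application (hypothetical file names).
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.read_csv("original_dataset.csv")

# Mean imputation for numeric columns, most-frequent for the rest.
num_cols = df.select_dtypes(include="number").columns
cat_cols = df.columns.difference(num_cols)

df[num_cols] = SimpleImputer(strategy="mean").fit_transform(df[num_cols])
if len(cat_cols) > 0:
    df[cat_cols] = SimpleImputer(strategy="most_frequent").fit_transform(df[cat_cols])

df.to_csv("original_dataset_imputed.csv", index=False)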
Regarding future developments of the implemented
metrics, it would be advantageous to integrate an
explainability (XAI) component into some of the
analyses. For example, in the case of DLA, which
relies on ML algorithms, it would be useful to
identify which features had the greatest or smallest
impact on the final results, allowing for a detailed
inspection of these features (see the sketch below).
Moreover, it would be useful to add a section for the
evaluation of missing-data patterns when they are
present in the input datasets.
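As a hedged sketch of how such an XAI component could work (the file names, classifier choice and hyperparameters below are assumptions, not the application's actual implementation), a real-vs-synthetic discriminator can expose per-feature importances:

# Illustrative sketch of the proposed XAI extension for DLA:
# train a real-vs-synthetic classifier and inspect which features
# drive its decisions. Assumes all columns are numeric or already encoded.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

real = pd.read_csv("original_dataset.csv")      # hypothetical file names
synth = pd.read_csv("synthetic_dataset.csv")

# Label real records as 0 and synthetic records as 1.
X = pd.concat([real, synth], ignore_index=True)
y = np.concatenate([np.zeros(len(real)), np.ones(len(synth))])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Features with high importance are those that make synthetic records
# easy to distinguish from real ones and deserve closer inspection.
importance = pd.Series(clf.feature_importances_, index=X.columns)
print(importance.sort_values(ascending=False).head(10))
print("Discrimination accuracy:", clf.score(X_test, y_test))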
ACKNOWLEDGEMENTS
Gabriele Santangelo is a PhD student enrolled in the
National PhD program in Artificial Intelligence,
XXXIX cycle, course on Health and Life Sciences,
organized by Università Campus Bio-Medico di
Roma. This work was supported by “Fit4MedRob-
Fit for Medical Robotics” Grant B53C22006950001.
REFERENCES
Azizi, Z., Lindner, S., Shiba, Y., Raparelli, V., Norris, C. M.,
Kublickiene, K., Herrero, M. T., Kautzky-Willer, A.,
Klimek, P., Gisinger, T., Pilote, L., & El Emam, K.
(2023). A comparison of synthetic data generation and
federated analysis for enabling international evaluations
of cardiovascular health. Scientific Reports, 13(1),
11540. https://doi.org/10.1038/s41598-023-38457-3
Buuren, S. van, & Groothuis-Oudshoorn, K. (2011). mice:
Multivariate Imputation by Chained Equations in R.
Journal of Statistical Software, 45, 1–67.
https://doi.org/10.18637/jss.v045.i03
Chen, A., & Chen, D. O. (2022). Simulation of a machine
learning enabled learning health system for risk
prediction using synthetic patient data. Scientific
Reports, 12(1), 17917.
https://doi.org/10.1038/s41598-022-23011-4
Giomi, M., Boenisch, F., Wehmeyer, C., & Tasnádi, B.
(2023). A Unified Framework for Quantifying Privacy
Risk in Synthetic Data. Proceedings on Privacy
Enhancing Technologies, 2023(2), 312–328.
https://doi.org/10.56553/popets-2023-0055
Goncalves, A., Ray, P., Soper, B., Stevens, J., Coyle, L., &
Sales, A. P. (2020). Generation and evaluation of
synthetic patient data. BMC Medical Research
Methodology, 20(1), 108.
https://doi.org/10.1186/s12874-020-00977-1
Hernadez, M., Epelde, G., Alberdi, A., Cilla, R., & Rankin,
D. (2023). Synthetic Tabular Data Evaluation in the
Health Domain Covering Resemblance, Utility, and
Privacy Dimensions. Methods of Information in
Medicine, 62(S 01), e19–e38.
https://doi.org/10.1055/s-0042-1760247
Hernandez, M., Epelde, G., Alberdi, A., Cilla, R., & Rankin,
D. (2022). Synthetic data generation for tabular health
records: A systematic review. Neurocomputing, 493, 28–
45. https://doi.org/10.1016/j.neucom.2022.04.053
Johnson, A. (2023). Challenge2012 [Jupyter Notebook].
https://github.com/alistairewj/challenge2012 (Original
work published 2018)