Using Miniature Visualizations of Descriptive Statistics to Investigate the Quality of Electronic Health Records

Roy A. Ruddle, Marlous S. Hall

2019

Abstract

Descriptive statistics are typically presented as text, but that quickly becomes overwhelming when datasets contain many variables or analysts need to compare multiple datasets. Visualization offers a solution, but is rarely used except to show cardinalities (e.g., the percentage of missing values) or the distributions of a small set of variables. This paper describes dataset- and variable-centric designs for visualizing three categories of descriptive statistics (cardinalities, distributions and patterns). The designs scale to more than 100 variables and use multiple channels to encode important semantic differences (e.g., zero vs. one or more missing values). We evaluated our approach using large (multi-million record) primary and secondary care datasets. The miniature visualizations provided our users with a variety of important insights, including differences in character patterns that indicate data validation issues, missing values for a variable that should always be complete, and inconsistent encryption of patient identifiers. Finally, we highlight the need for research into methods of identifying anomalies in the distributions of dates in health data.
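To make the three categories of descriptive statistics concrete, the following is a minimal sketch of how they might be computed for a single variable before being visualized. It is not the authors' implementation: the function names, the pattern encoding (letters mapped to `A`, digits mapped to `9`) and the sample identifiers are all assumptions made for illustration.

```python
import re
from collections import Counter


def character_pattern(value: str) -> str:
    """Map ASCII letters to 'A' and digits to '9'; keep other characters.

    This encoding is an assumption for illustration, not the paper's scheme.
    """
    pattern = re.sub(r"[A-Za-z]", "A", value)
    return re.sub(r"[0-9]", "9", pattern)


def describe_variable(values):
    """Summarize one variable: cardinalities, distribution and patterns."""
    # Treat None and the empty string as missing (an assumption).
    present = [v for v in values if v is not None and v != ""]
    return {
        "n_records": len(values),
        "n_missing": len(values) - len(present),
        "n_distinct": len(set(present)),
        "distribution": Counter(present),
        "patterns": Counter(character_pattern(v) for v in present),
    }


# Hypothetical patient identifiers, invented for this example.
ids = ["AB1234", "CD5678", "", "12-3456", None, "AB1234"]
summary = describe_variable(ids)
```

In this sketch, a variable whose values yield more than one character pattern (here `AA9999` and `99-9999`) would be a candidate for the kind of data validation issue the abstract mentions, and a non-zero `n_missing` for a variable that should always be complete would be flagged in the same way.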


Paper Citation


in Harvard Style

Ruddle R. and Hall M. (2019). Using Miniature Visualizations of Descriptive Statistics to Investigate the Quality of Electronic Health Records. In Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2019) - Volume 5: HEALTHINF; ISBN 978-989-758-353-7, SciTePress, pages 230-238. DOI: 10.5220/0007354802300238


in Bibtex Style

@conference{healthinf19,
author={Roy A. Ruddle and Marlous S. Hall},
title={Using Miniature Visualizations of Descriptive Statistics to Investigate the Quality of Electronic Health Records},
booktitle={Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2019) - Volume 5: HEALTHINF},
year={2019},
pages={230-238},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007354802300238},
isbn={978-989-758-353-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2019) - Volume 5: HEALTHINF
TI - Using Miniature Visualizations of Descriptive Statistics to Investigate the Quality of Electronic Health Records
SN - 978-989-758-353-7
AU - Ruddle R.
AU - Hall M.
PY - 2019
SP - 230
EP - 238
DO - 10.5220/0007354802300238
PB - SciTePress