An Efficient, Robust, and Customizable Information Extraction and Pre-processing Pipeline for Electronic Health Records

Topics: Bioinformatics & Pattern Discovery; Clustering and Classification Methods; Concept Mining; Foundations of Knowledge Discovery in Databases; Information Extraction; Machine Learning; Mining Text and Semi-Structured Data; Pre-Processing and Post-Processing for Data Mining

Authors: Eva K. Lee 1 ; Yuanbo Wang 1 ; Yuntian He 1 and Brent M. Egan 2

Affiliations: 1 Center for Operations Research in Medicine and HealthCare; H. Milton Stewart School of Industrial and Systems Engineering; School of Biological Sciences, Georgia Institute of Technology, U.S.A. ; 2 University of South Carolina School of Medicine–Greenville; Care Coordination Institute, Greenville, U.S.A.

Keyword(s): Electronic Health Record, Information Extraction, Encryption, Data Standardization, Clustering, Time Series.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; BioInformatics & Pattern Discovery ; Clustering and Classification Methods ; Computational Intelligence ; Concept Mining ; Evolutionary Computing ; Foundations of Knowledge Discovery in Databases ; Information Extraction ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Machine Learning ; Mining Text and Semi-Structured Data ; Pre-Processing and Post-Processing for Data Mining ; Soft Computing ; Symbolic Systems

Abstract: Electronic Health Records (EHR) containing large amounts of patient data present both opportunities and challenges to industry, policy makers, and researchers. These data, when extracted and analyzed effectively, can reveal critical factors that can improve clinical practices and decisions. However, the inherently complex, heterogeneous, and rapidly evolving nature of these data makes them extremely difficult to analyze effectively. In addition, Protected Health Information (PHI), which contains sensitive yet valuable information for clinical research, must first be anonymized. In this paper we identify current challenges with obtaining and pre-processing information from EHR. We then present a comprehensive, efficient “pipeline” for extracting, de-identifying, and standardizing EHR data. We demonstrate the use of this pipeline, based on software from EPIC Systems, in analysing chronic kidney disease, prostate cancer, and cardiovascular disease. We also address challenges associated with temporal laboratory time series data and natural text data and develop a novel approach for clustering irregular Multivariate Time Series (MTS). The pipeline organizes data into a structured, machine-readable format which can be effectively applied in clinical research studies to optimize processes, personalize care, and improve quality and outcomes.

CC BY-NC-ND 4.0


Paper citation in several formats:
Lee, E.; Wang, Y.; He, Y. and Egan, B. (2019). An Efficient, Robust, and Customizable Information Extraction and Pre-processing Pipeline for Electronic Health Records. In Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2019) - KDIR; ISBN 978-989-758-382-7; ISSN 2184-3228, SciTePress, pages 310-321. DOI: 10.5220/0008071303100321

@conference{kdir19,
author={Eva K. Lee and Yuanbo Wang and Yuntian He and Brent M. Egan},
title={An Efficient, Robust, and Customizable Information Extraction and Pre-processing Pipeline for Electronic Health Records},
booktitle={Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2019) - KDIR},
year={2019},
pages={310-321},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008071303100321},
isbn={978-989-758-382-7},
issn={2184-3228},
}

TY - CONF

JO - Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2019) - KDIR
TI - An Efficient, Robust, and Customizable Information Extraction and Pre-processing Pipeline for Electronic Health Records
SN - 978-989-758-382-7
IS - 2184-3228
AU - Lee, E.
AU - Wang, Y.
AU - He, Y.
AU - Egan, B.
PY - 2019
SP - 310
EP - 321
DO - 10.5220/0008071303100321
PB - SciTePress