The h-ANN Model: Comprehensive Colonoscopy Concept Compilation using Combined Contextual Embeddings

Shorabuddin Syed, Adam Jackson Angel, Hafsa Bareen Syeda, Carole France Jennings, Joseph VanScoy, Mahanazuddin Syed, Melody Greer, Sudeepa Bhattacharyya, Meredith Zozus, Benjamin Tharian, Fred Prior

2022

Abstract

Colonoscopy is a screening and diagnostic procedure for detecting colorectal carcinomas, with specific quality metrics used to monitor and improve adenoma detection rates. These quality metrics are stored in disparate documents, i.e., colonoscopy, pathology, and radiology reports. The lack of integrated, standardized documentation is impeding colorectal cancer research. Clinical concept extraction using Natural Language Processing (NLP) and Machine Learning (ML) techniques is an alternative to manual data abstraction. Contextual word embedding models such as BERT (Bidirectional Encoder Representations from Transformers) and FLAIR have enhanced performance on NLP tasks. Combining multiple clinically trained embeddings can improve word representations and boost the performance of clinical NLP systems. The objective of this study is to extract comprehensive clinical concepts from the consolidated colonoscopy documents using concatenated clinical embeddings. We built high-quality annotated corpora for three report types. BERT and FLAIR embeddings were trained on unlabeled colonoscopy-related documents. We built a hybrid Artificial Neural Network (h-ANN) to concatenate and fine-tune BERT and FLAIR embeddings. To extract concepts of interest from the three report types, three models were initialized from the h-ANN and fine-tuned using the annotated corpora. The models achieved best F1-scores of 91.76%, 92.25%, and 88.55% for colonoscopy, pathology, and radiology reports, respectively.
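As an illustration of what concatenating two contextual embeddings means at the token level, the following minimal sketch joins two per-token vector sequences along the feature axis. It is not the authors' h-ANN implementation: the function name `concatenate_embeddings`, the toy vectors, and their dimensions are all assumptions made for this example, and a real system would obtain the vectors from trained BERT and FLAIR models rather than hard-coded lists.

```python
from typing import List

Vector = List[float]


def concatenate_embeddings(
    bert_vecs: List[Vector], flair_vecs: List[Vector]
) -> List[Vector]:
    """Join two per-token embedding sequences along the feature axis.

    Hypothetical helper for illustration only; both inputs must hold
    exactly one vector per token of the same tokenized sentence.
    """
    if len(bert_vecs) != len(flair_vecs):
        raise ValueError("both embedders must produce one vector per token")
    # List addition appends the second vector's features to the first's.
    return [b + f for b, f in zip(bert_vecs, flair_vecs)]


# Made-up vectors for a two-token input (assumed dimensions: 3 and 2).
bert_toy = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
flair_toy = [[1.0, 2.0], [3.0, 4.0]]

combined = concatenate_embeddings(bert_toy, flair_toy)
print(len(combined), len(combined[0]))
```

Each token in `combined` carries the features of both sources, so a downstream tagging layer would see a vector whose length is the sum of the two embedding dimensions.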


Paper Citation


in Harvard Style

Syed S., Angel A., Syeda H., Jennings C., VanScoy J., Syed M., Greer M., Bhattacharyya S., Zozus M., Tharian B. and Prior F. (2022). The h-ANN Model: Comprehensive Colonoscopy Concept Compilation using Combined Contextual Embeddings. In Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - Volume 5: HEALTHINF; ISBN 978-989-758-552-4, SciTePress, pages 189-200. DOI: 10.5220/0010903300003123


in Bibtex Style

@conference{healthinf22,
author={Shorabuddin Syed and Adam Jackson Angel and Hafsa Bareen Syeda and Carole France Jennings and Joseph VanScoy and Mahanazuddin Syed and Melody Greer and Sudeepa Bhattacharyya and Meredith Zozus and Benjamin Tharian and Fred Prior},
title={The h-ANN Model: Comprehensive Colonoscopy Concept Compilation using Combined Contextual Embeddings},
booktitle={Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - Volume 5: HEALTHINF},
year={2022},
pages={189-200},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010903300003123},
isbn={978-989-758-552-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - Volume 5: HEALTHINF
TI - The h-ANN Model: Comprehensive Colonoscopy Concept Compilation using Combined Contextual Embeddings
SN - 978-989-758-552-4
AU - Syed S.
AU - Angel A.
AU - Syeda H.
AU - Jennings C.
AU - VanScoy J.
AU - Syed M.
AU - Greer M.
AU - Bhattacharyya S.
AU - Zozus M.
AU - Tharian B.
AU - Prior F.
PY - 2022
SP - 189
EP - 200
DO - 10.5220/0010903300003123
PB - SciTePress