Development, Implementation and Validation of a Stochastic Prediction Model of UICC Stages for Missing Values in Large Data Sets in a Hospital Cancer Registry
Sebastian Appelbaum, Daniel Krüerke, Stephan Baumgartner, Marianne Schenker, Thomas Ostermann
2023
Abstract
Cancer is still a fatal disease in many cases, despite intensive research into prevention, treatment and follow-up. In this context, an important parameter is the stage of the cancer. The TNM/UICC classification is an established method for describing a cancer. It dates back to the surgeon Pierre Denoix and is an important prognostic factor for patient survival. Despite its importance, the TNM/UICC classification is often poorly documented in cancer registries. The aim of this work is to investigate whether UICC stages can be predicted from cancer registry data using statistical learning methods. Data from the Cancer Registry Clinic Arlesheim (CRCA) were used for this analysis. The registry contains a total of 5,305 records, of which 1,539 cases were eligible for data analysis. Classification and regression trees, random forests, gradient tree boosting and logistic regression were used as prediction methods. Performance was measured by the mean misclassification error (mmce), the area under the receiver operating characteristic curve (AUC) and Cohen’s kappa. Misclassification rates ranged from 28.0% to 30.4%, AUCs from 0.73 to 0.80 and Cohen’s kappa from 0.39 to 0.44, which indicates only moderate predictive performance. However, with only 1,539 records, the data set considered here is considerably smaller than those of larger cancer registries, so the results should be interpreted with caution.
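As an illustration of two of the performance measures named in the abstract, the following minimal Python sketch computes the mean misclassification error and Cohen's kappa for a set of predicted stage labels. The function names and the eight example labels are hypothetical and chosen only for illustration; they are not taken from the CRCA data or from the authors' implementation, and the AUC is not covered here.

```python
from collections import Counter


def mmce(y_true, y_pred):
    """Mean misclassification error: share of predictions that differ from the truth."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between truth and prediction, corrected for chance."""
    n = len(y_true)
    p_observed = 1.0 - mmce(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected by chance if both label sets were independent.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one class occurs")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical UICC stage labels, invented for this example.
y_true = ["I", "I", "II", "II", "III", "III", "IV", "IV"]
y_pred = ["I", "II", "II", "II", "III", "IV", "IV", "IV"]

example_mmce = mmce(y_true, y_pred)
example_kappa = cohen_kappa(y_true, y_pred)
print(f"mmce = {example_mmce:.3f}, kappa = {example_kappa:.3f}")
```

In this toy example two of eight predictions are wrong, so the misclassification error is 0.25, while kappa is lower than the raw agreement of 0.75 because part of that agreement would be expected by chance alone.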
Paper Citation
in Harvard Style
Appelbaum S., Krüerke D., Baumgartner S., Schenker M. and Ostermann T. (2023). Development, Implementation and Validation of a Stochastic Prediction Model of UICC Stages for Missing Values in Large Data Sets in a Hospital Cancer Registry. In Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) - Volume 5: HEALTHINF; ISBN 978-989-758-631-6, SciTePress, pages 117-123. DOI: 10.5220/0011667700003414
in Bibtex Style
@conference{healthinf23,
author={Sebastian Appelbaum and Daniel Krüerke and Stephan Baumgartner and Marianne Schenker and Thomas Ostermann},
title={Development, Implementation and Validation of a Stochastic Prediction Model of UICC Stages for Missing Values in Large Data Sets in a Hospital Cancer Registry},
booktitle={Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) - Volume 5: HEALTHINF},
year={2023},
pages={117-123},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011667700003414},
isbn={978-989-758-631-6},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) - Volume 5: HEALTHINF
TI - Development, Implementation and Validation of a Stochastic Prediction Model of UICC Stages for Missing Values in Large Data Sets in a Hospital Cancer Registry
SN - 978-989-758-631-6
AU - Appelbaum S.
AU - Krüerke D.
AU - Baumgartner S.
AU - Schenker M.
AU - Ostermann T.
PY - 2023
SP - 117
EP - 123
DO - 10.5220/0011667700003414
PB - SciTePress