Making Data Big for a Deep-learning Analysis: Aggregation of Public COVID-19 Datasets of Lung Computed Tomography Scans

Francesca Lizzi, Francesca Brero, Raffaella Cabini, Maria Fantacci, Stefano Piffer, Ian Postuma, Lisa Rinaldi, Alessandra Retico

2021

Abstract

Lung Computed Tomography (CT) is an imaging technique used to assess the severity of COVID-19 infection in symptomatic patients and to monitor its evolution over time. Deep learning methods can support the analysis of lung CT for both tasks. We have developed a U-net based algorithm to segment COVID-19 lesions. Unfortunately, no public dataset containing a large number of labelled CT scans of patients affected by COVID-19 is available. In this work, we first review the currently available public datasets of COVID-19 CT scans and describe their characteristics in detail. Then, we describe the design of the U-net we developed for the automated identification of COVID-19 lung lesions. Finally, we discuss the results obtained with the different publicly available datasets. In particular, we trained the U-net on the dataset made available within the COVID-19 Lung CT Lesion Segmentation Challenge 2020, and we tested it on data from the MosMed and the COVID-19-CT-Seg datasets to explore the transferability of the model and to assess whether the image annotation process affects the detection performance. We evaluated the lesion segmentation performance of the system in terms of the Dice index, which measures the overlap between the ground truth and the predicted masks. The proposed U-net segmentation model reaches a Dice index of 0.67, 0.42 and 0.58 on the independent validation sets of the COVID-19 Lung CT Lesion Segmentation Challenge 2020, the MosMed and the COVID-19-CT-Seg datasets, respectively. This lesion segmentation work is a preliminary step towards a more accurate analysis of COVID-19 lesions, based for example on the extraction and analysis of radiomic features.
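For readers unfamiliar with the metric, the Dice index mentioned above can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `dice_index`, the plain nested-list mask representation, and the convention that two empty masks score 1.0 are all assumptions made here for illustration. It uses the standard definition, twice the number of voxels labelled as lesion in both masks divided by the total number of lesion voxels in the two masks.

```python
from itertools import chain


def dice_index(truth, pred):
    """Dice index between two binary masks given as equally shaped 2D lists of 0/1.

    Hypothetical helper for illustration; real pipelines would typically
    operate on 3D arrays rather than nested lists.
    """
    t = list(chain.from_iterable(truth))
    p = list(chain.from_iterable(pred))
    if len(t) != len(p):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels marked as lesion in both the ground truth and the prediction.
    intersection = sum(1 for a, b in zip(t, p) if a and b)
    # Total lesion voxels across the two masks.
    total = sum(1 for a in t if a) + sum(1 for b in p if b)
    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: each has three lesion voxels, two of which coincide.
truth = [[0, 1, 1],
         [0, 1, 0]]
pred = [[0, 1, 0],
        [0, 1, 1]]
score = dice_index(truth, pred)  # 2 * 2 / (3 + 3)
```

A value of 1 means the predicted mask coincides with the ground truth, and 0 means the two masks share no lesion voxels.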

Paper Citation


in Harvard Style

Lizzi F., Brero F., Cabini R., Fantacci M., Piffer S., Postuma I., Rinaldi L. and Retico A. (2021). Making Data Big for a Deep-learning Analysis: Aggregation of Public COVID-19 Datasets of Lung Computed Tomography Scans. In Proceedings of the 10th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-521-0, pages 316-321. DOI: 10.5220/0010584403160321


in Bibtex Style

@conference{data21,
author={Francesca Lizzi and Francesca Brero and Raffaella Cabini and Maria Fantacci and Stefano Piffer and Ian Postuma and Lisa Rinaldi and Alessandra Retico},
title={Making Data Big for a Deep-learning Analysis: Aggregation of Public COVID-19 Datasets of Lung Computed Tomography Scans},
booktitle={Proceedings of the 10th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2021},
pages={316-321},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010584403160321},
isbn={978-989-758-521-0},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 10th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Making Data Big for a Deep-learning Analysis: Aggregation of Public COVID-19 Datasets of Lung Computed Tomography Scans
SN - 978-989-758-521-0
AU - Lizzi F.
AU - Brero F.
AU - Cabini R.
AU - Fantacci M.
AU - Piffer S.
AU - Postuma I.
AU - Rinaldi L.
AU - Retico A.
PY - 2021
SP - 316
EP - 321
DO - 10.5220/0010584403160321