
tion 5, presents the evaluation and the performance of
the proposed approach. Finally, Section 6 gives
conclusions and outlines potential directions for
future work.
2 RELATED WORK
In recent years, there has been an increasing trend
in the prevalence of idiopathic pulmonary fibrosis
(IPF) in the USA (Nalysnyk et al., 2012). In a study
of US Medicare patients aged 65 years and older,
Raghu et al. (2014) found that patients with a median
age of 79.4±7.2 years had a survival time of 3.8 years.
The study also reported an overall incidence rate of
93.7 cases per 100k persons. Another study
(Hutchinson et al., 2015) reported an increase in
mortality due to pulmonary fibrosis, ranging from
4.64 per 100k for Spain to 8.28 per 100k for England
and Wales.
The integration of assistive vision and AI in healthcare
has become essential due to the abundance of
clinical data. CT scans are particularly useful for
visually estimating the stage of lung deterioration.
A clinical association between interstitial lung
abnormalities (ILA) assessed by high-resolution
computed tomography (HRCT) and idiopathic
pulmonary fibrosis (IPF) has been reported (Wells and
Kokosi, 2016; Scatarige et al., 2003).
Motivated by recent advances in deep learning and
computer vision, researchers have investigated various
architectures for interstitial lung disease (ILD)
detection, segmentation, and classification (Soffer et al.,
2022). For instance, Walsh et al. (2018) used deep
learning to classify fibrotic lung diseases from
high-resolution CT scans; the algorithm outperformed
60 out of 91 radiologists, with a median accuracy of
73.3% compared to the physicians’ accuracy of 70%.
In another study, Comelli et al. (2020) evaluated the
UNet and E-Net segmentation models on 10 patients
with idiopathic pulmonary fibrosis (IPF) and achieved
a segmentation accuracy of 96%, measured by the Dice
similarity coefficient, without any radiologist
intervention.
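Since several of the studies above report the Dice similarity coefficient, a minimal sketch of how it is usually computed for binary masks may help. It is not taken from any of the cited papers. The function name `dice_coefficient`, the flat 0/1 mask representation, and the convention of returning 1.0 for two empty masks are all assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 values of equal length (illustrative choice).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Count voxels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 1, 0]
    print(dice_coefficient(predicted, reference))
```

A value of 1.0 means the predicted and reference masks coincide, and 0.0 means they share no foreground voxel, so a reported score of 96% indicates near-complete overlap.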
Kido et al. (Kido et al., 2022) developed a deep
neural-network architecture for three-dimensional
segmentation of lung nodules for lung cancer diagnosis
from CT images. The 3D UNet model’s performance
was comparable to that of human experts, with Dice
similarity coefficients of 84.5% and 82.2%,
respectively. The authors found that traditional machine
learning techniques such as watershed and graph-cut
provided lower accuracy than neural-network-based
models, with Dice similarity coefficients of only
62.8% and 56.6%, respectively. In another study
by (Christe et al., 2019), an integrated computer-aided
diagnosis system for IPF was developed using deep
learning on CT images. The system’s performance
was similar to that of radiologists under certain
evaluation criteria. Zucker et al. (2020) used a DCNN
model based on ResNet-18 to predict Brasfield scores,
which are indicative of various lung function features
such as air trapping, linear markings, nodular cystic
lesions, large lesions, and overall severity. The authors
reported minimal differences between the model’s
Brasfield scores and those of the experts, except for
the large lesion features, which had an average
Spearman correlation of only 32% between the model
and the radiologists. However, the correlation for
large lesion scores was higher, at 80.2%. In (Agarwala
et al., 2020), a convolutional neural network was first
trained on natural images and then fine-tuned on CT
images to automatically segment interstitial lung disease
(ILD) patterns such as emphysema, consolidation,
and fibrosis. The authors reported a classification
rate of 90% for fibrosis pattern segmentation.
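The agreement between model and radiologist scores discussed above is expressed as a Spearman correlation. The following is a minimal sketch of that statistic, computed as the Pearson correlation of average ranks. It is not the evaluation code of the cited study, and the helper names `average_ranks`, `pearson`, and `spearman` are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation applied to the ranks."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    model_scores = [1, 2, 3, 4, 5]
    expert_scores = [1, 4, 9, 16, 25]
    print(spearman(model_scores, expert_scores))
```

Because only the ordering of the scores matters, any monotonic relationship between model and expert scores yields a correlation of 1, which makes the statistic suitable for ordinal scores such as severity grades.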
Several studies have used the OSIC data (Osic, 2023)
to predict the decline in lung function, which is
assessed by measuring the forced vital capacity with
a spirometer (Watters et al., 1986; Noth et al., 2021).
The best performance was obtained using a bimodal
deep learning model that processes CT images with
an EfficientNet backbone and patient clinical
metadata with a neural network regressor, optimized
using a multiple quantile loss function.
Wong et al. (2021) developed Fibros-Net, an architecture
designed to predict fibrosis progression from chest
scans. The model used CT images, spirometry
measurements, and patient clinical metadata to estimate
forced vital capacity (FVC) over a specific time
interval from the OSIC data. The model achieved a
Laplace log-likelihood score of -6.8188. In contrast,
FVC-Net (Yadav et al., 2022) is a different architecture
that estimates FVC from derived honeycombing
features, CT scans, and metadata of the OSIC dataset.
The model achieved a higher Laplace log-likelihood
score of -6.64. Mandal et al. (2020) compared the
performance of machine learning models with a CNN
architecture in predicting FVC from CT images and
patient metadata. The experiments showed good
results with an Elastic-Net regression method, which
achieved a higher likelihood score of -6.73 on the
OSIC dataset.
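The scores quoted above are easier to compare with the metric written out. The text does not define it, so the sketch below assumes the modified Laplace log-likelihood commonly associated with the OSIC challenge, in which the predicted uncertainty is floored at 70 mL and the absolute error is capped at 1000 mL. Both thresholds, all function names, and the example quantiles (0.2, 0.5, 0.8) are assumptions for illustration and do not come from the cited papers. A pinball loss averaged over several quantiles is included as one possible reading of the "multiple quantile loss".

```python
import math

SIGMA_FLOOR = 70.0   # assumed lower bound on the predicted uncertainty (mL)
ERROR_CAP = 1000.0   # assumed upper bound on the absolute FVC error (mL)


def laplace_log_likelihood(fvc_true, fvc_pred, sigma):
    """Modified Laplace log-likelihood for one FVC prediction (higher is better)."""
    sigma_clipped = max(sigma, SIGMA_FLOOR)
    delta = min(abs(fvc_true - fvc_pred), ERROR_CAP)
    return (-math.sqrt(2.0) * delta / sigma_clipped
            - math.log(math.sqrt(2.0) * sigma_clipped))


def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss for a single prediction at quantile q in (0, 1)."""
    diff = y_true - y_pred
    return max(q * diff, (q - 1.0) * diff)


def multi_quantile_loss(y_true, preds_by_quantile):
    """Mean pinball loss over a mapping {quantile: predicted value}."""
    losses = [pinball_loss(y_true, pred, q)
              for q, pred in preds_by_quantile.items()]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical patient: true FVC 2500 mL, predicted 2400 mL, sigma 200 mL.
    print(laplace_log_likelihood(2500.0, 2400.0, 200.0))
    # Hypothetical quantile predictions around the same measurement.
    print(multi_quantile_loss(2500.0, {0.2: 2300.0, 0.5: 2400.0, 0.8: 2600.0}))
```

Under this definition the score is always negative and values closer to zero are better, which is why -6.64 counts as higher than -6.8188. The spread between the upper and lower predicted quantiles can serve as the uncertainty estimate passed to the metric, although whether the cited models do so is not stated here.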
VISAPP 2024 - 19th International Conference on Computer Vision Theory and Applications