steps: (1) semi-automatic labeling of structural faults
of the target building in UAV imagery using AI tools,
(2) 3D point cloud image mapping, and classification
using geometric and radiometric features. The main
contribution of the proposed method lies in the de-
velopment of a methodology for recognizing target
materials in 3D point clouds in real-world scenarios
and identifying harmful elements in historic build-
ings. This is of vital importance in architectural con-
servation tasks and strategies.
This document is structured as follows: section 2
presents the current state-of-the-art, reviewing rele-
vant technologies and methodologies related to dig-
italization for preserving cultural heritage. Then,
section 3 describes datasets to which the proposed
method is targeted. Section 4 outlines the proposed
method, whereas the obtained results and the exper-
iments conducted to validate our proposal are pre-
sented in section 5. Finally, the main contributions
of this work are summarized in section 6, including
insights toward future work that aid in further enhanc-
ing the proposed methodology.
2 PREVIOUS WORK
Built heritage changes over time through erosion,
degradation, deformation from natural phenomena,
human intervention, inappropriate restoration, etc.
(Li et al., 2023). More formally, the ISO 19208:2016
standard (recently withdrawn; a new standard is
pending) (ISO, 2016) classifies the agents causing
these defects into five major groups: mechanical,
electromagnetic, thermal, chemical and biological
agents. These degradation factors conflict with the
goal of preserving built heritage and thus underline
the importance of this work.
A thorough review of built heritage preservation
using multiple technologies is provided by Li et al.
(2023). Across these technologies, preservation and
conservation of cultural heritage are not understood
only as the extraction of faults and defects. The
digitization of cultural heritage and its
dissemination have also been extensively reviewed
(Mendoza et al., 2023), although they are not the
main goal of this work. In this regard,
photogrammetry, Light Detection and Ranging (LiDAR)
and CAD modelling are frequent acquisition
techniques. Their results are sometimes integrated
into Building Information Modelling (BIM), which
makes it possible to keep a record of repairs and
changes in cultural heritage (Moyano et al., 2020;
Rocha et al., 2020). Still, digitization is an
indirect result of our work, since 3D point clouds
are reconstructed.
Regarding the detection of building anomalies,
current trends involve applying Convolutional Neural
Networks (CNNs) to imagery from UAVs and
close-sensing technology. Further insight into this
field is given by Cumbajin et al. (2023). CNN-based
approaches are categorized according to the target
surface, kind of problem (classification, semantic
segmentation, instance segmentation, etc.), network
and training methodology. Accordingly, models that
detect defects on metal surfaces are trained
differently from building-oriented ones, as each
requires specialized datasets. This even applies to
individual defects: Perez et al. (2019) experimented
with a shallow CNN composed of convolutional and
dense layers to identify moisture. For this purpose,
a small collaborative dataset of copyright-free
Internet images was used.
Transfer learning is significantly more common in
building supervision than custom CNNs. Amongst the
most frequent CNN architectures, pre-trained VGG,
YOLO, U-Net, AlexNet, GoogLeNet, Inception and
Xception networks stand out. The work of Kumar et al.
(2021) outputs the bounding boxes of cracks in
close-sensing building images, which helps to monitor
them in real time with UAVs coupled with a Jetson TX2.
Alternatively, images can be semantically segmented
to highlight cracks (Mouzinho and Fukai, 2021).
Region-based CNNs are also frequent in the
literature, using R-CNN (Xu et al., 2021), Fast
R-CNN, Faster R-CNN (Maningo et al., 2020) and YOLO
(Kumar et al., 2021). In these works, the objective
is to detect regions with cracks.
From the reviewed literature, it is clear that there
are some gaps in current building monitoring. First,
it is mainly carried out using close-sensed imagery,
which does not allow large areas to be monitored.
Thus, surveys are far slower, as they need to capture
small regions of buildings. Second, CNNs are
specialized in specific materials and defects. This
drawback is caused not only by learning limitations
but also by the lack of available datasets. It is
even more visible for defects such as moisture, where
RGB imagery is used instead of more suitable data
sources (e.g., thermography). Unlike our work, some
of the reviewed studies are intended for real-time
tracking, transmitting information through Internet
of Things (IoT) communication (Kumar et al., 2021).
The main drawback of the latter is that it requires
planning the location of a few devices, for instance
by addressing the optimal sensor placement (OSP)
problem. In contrast, our case study provides a
long-term monitoring tool for large buildings that,
however, does not offer continuous tracking.
Therefore, it is intended for cultural heritage
whose changes are not relevant in the short term.
Although this study is conducted
VISAPP 2024 - 19th International Conference on Computer Vision Theory and Applications