Stachniss, 2017): change detection is obtained by re-projecting a novel image onto the previous views by exploiting the 3D model, so as to highlight 2D misalignments that indicate a change in 3D. However, these solutions find not only structural changes but also changes due to moving objects (e.g. cars, pedestrians) that can appear in the new sequence: in order to discard such nuisances, object recognition methods have been trained and used to identify the corresponding areas and exclude them (Taneja et al., 2011). In (Taneja et al., 2013), a similar solution is adopted, using instead a cadastral 3D model and panoramic images. Differently, in (Qin and Gruen, 2014) the reference 3D point cloud is obtained with an accurate yet expensive laser scanning technology; changes are then detected by re-projection in images captured at later times. Note that, in order to register the laser-based point cloud with the images, control points have to be selected manually. More recently, deep networks have also been used to tackle change detection (Alcantarilla et al., 2016), taking as input registered images from the old and new sequences.
Differently from the state of the art, in this paper we propose a simple yet effective solution based on the analysis of 3D reconstructions computed from image collections acquired at different times. In this way, our method focuses on detecting structural changes and avoids problems related to illumination differences, since it exploits only geometric information from the scene. Moreover, 3D reconstruction methods such as Structure from Motion (SfM) (Szeliski, 2010), Simultaneous Localization and Mapping (SLAM) (Fanfani et al., 2013) or Visual Odometry (Fanfani et al., 2016) build 3D models of fixed structures only, thus automatically discarding any moving elements in the scene. By detecting differences between the 3D models, the system is able to produce an output 3D map that outlines the changed areas. Our change detection algorithm is fully automatic and is composed of two main steps: (i) initially, a rigid registration at six degrees of freedom is estimated in order to align the temporally ordered 3D maps; (ii) then, the actual change detection is performed by comparing the local 3D structures of corresponding areas. The detected changes can also be transported onto the input photos/videos to highlight the image areas with altered structures. It is worth noting that our method is easy to implement and can be paired with any SfM software, such as VisualSFM (Wu, 2013) or COLMAP (Schönberger and Frahm, 2016), both freely available, letting even non-expert users build their own change detection system.
2 METHOD DESCRIPTION
Let $I_0$ and $I_1$ be two image collections of the same scene acquired at different times $t_0$ and $t_1$. At first, our method exploits SfM approaches to obtain estimates of the intrinsic and extrinsic camera parameters and a sparse 3D point cloud representing the scene, for both $I_0$ and $I_1$. We also retain all the correspondences that link 2D points in the images with 3D points in the model. Then, the initial point clouds are enriched with region growing approaches (Furukawa and Ponce, 2010) to obtain denser 3D data. Note that the camera positions and both the sparse and dense models obtained from $I_0$ and $I_1$ are expressed in two independent and arbitrary coordinate systems, since no particular calibration is used to register the two collections.
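To fix ideas, the bookkeeping above can be captured by a minimal data structure; the following Python sketch is purely illustrative (all class and field names are our own, not part of the method):

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Observation:
    """A 2D projection of a 3D point in one image of a collection."""
    image_id: int           # index of the image within I0 or I1
    xy: np.ndarray          # 2D keypoint position, shape (2,)
    descriptor: np.ndarray  # photometric descriptor of its neighbourhood

@dataclass
class TrackedPoint:
    """A sparse 3D point together with the 2D observations that link it
    to the images (the retained 2D-3D correspondences)."""
    xyz: np.ndarray                                   # 3D position, shape (3,)
    observations: list = field(default_factory=list)  # list[Observation]

# One such list of TrackedPoint per epoch (for I0 and for I1), each expressed
# in its own arbitrary coordinate frame until registration.
```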
Hereafter we present the two main steps of the change detection method: (i) to estimate the rigid transformation that maps the model of $I_0$ onto that of $I_1$, implicitly exploiting the common and fixed structures in the area; (ii) to detect possible changes in the scene by comparing the registered 3D models.
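Before detailing each step, a compact sketch of the overall pipeline may help; this is a minimal illustration under our own assumptions (NumPy point arrays, placeholder implementations), not the authors' code:

```python
import numpy as np

def register_rigid(P0, Q1):
    """Placeholder for step (i): return a 6-DoF rigid transform (R, t)
    aligning point set P0 to Q1; Section 2.1 describes the actual procedure."""
    return np.eye(3), np.zeros(3)

def compare_models(P0_aligned, Q1, radius=0.5):
    """Placeholder for step (ii): flag points of the old model with no
    neighbour in the new model within `radius` (a crude local comparison)."""
    d = np.linalg.norm(P0_aligned[:, None, :] - Q1[None, :, :], axis=2)
    return d.min(axis=1) > radius  # True where structure appears changed

def change_detection(P0, Q1):
    R, t = register_rigid(P0, Q1)          # step (i): rigid registration
    P0_aligned = P0 @ R.T + t              # express old model in new frame
    return compare_models(P0_aligned, Q1)  # step (ii): local 3D comparison
```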
2.1 Photometric Rigid Registration
Since 3D models obtained through automatic reconstruction methods typically include wrongly estimated points, before using a global registration approach, such as the Iterative Closest Point (ICP) algorithm (Besl and McKay, 1992), our system exploits the computed correspondences between image (2D) and model (3D) points to obtain an initial estimate of the rigid transformation between the two 3D reconstructions.
Once the sparse reconstructions $S_0$ and $S_1$ have been computed from $I_0$ and $I_1$ respectively, for each 3D point we can retrieve a list of 2D projections and, additionally, for each 2D projection a descriptor vector based on the photometric appearance of its neighbourhood is also recovered. Exploiting this information, we can establish correspondences among images in $I_0$ and $I_1$, as follows. Each image in $I_0$ is compared against all images in $I_1$, and putative matches are found using the descriptor vectors previously computed. More in detail, let $f^i_0 = \{m^i_0, \dots, m^i_N\}$ be the set of $N$ 2D features in image $I^i_0 \in I_0$ that have an associated 3D point, and $f^j_1 = \{m^j_0, \dots, m^j_M\}$ the set relative to $I^j_1 \in I_1$. For each 2D point in $f^i_0$ we compute its distance w.r.t. all points in $f^j_1$ by comparing their associated descriptor vectors. Then, starting from the minimum distance match, every point in $f^i_0$ is put in correspondence with a point in $f^j_1$.
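The greedy assignment described above can be sketched as follows; this is an illustrative NumPy version under the assumption of L2 descriptor distances (the paper does not specify the metric):

```python
import numpy as np

def greedy_match(desc0, desc1):
    """Greedy one-to-one matching of descriptor sets of shape (n0, d) and
    (n1, d): pairs are accepted in order of increasing L2 distance, each
    feature being used at most once."""
    # Pairwise squared L2 distances between all descriptor pairs.
    d2 = ((desc0[:, None, :] - desc1[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(d2, axis=None)            # all pairs, best first
    used0, used1, matches = set(), set(), []
    for flat in order:
        i, j = np.unravel_index(flat, d2.shape)
        if i not in used0 and j not in used1:    # enforce one-to-one matching
            matches.append((int(i), int(j)))
            used0.add(i); used1.add(j)
        if len(matches) == min(d2.shape):        # smaller set fully matched
            break
    return matches
```

Since every matched 2D feature is linked to a 3D point, each accepted 2D-2D pair directly induces a putative 3D-3D correspondence between $S_0$ and $S_1$, which feeds the initial rigid estimate discussed above.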