2. improve performance, as fewer matches need to be computed for calibration;
3. combine as many layers as necessary to perform
robust guided matching.
In principle, any feature detector that computes the location of a feature in the image along with a local descriptor of its neighborhood could be employed, such as (Lowe, 2004), (Bay et al., 2008) or (Tola et al., 2009). Additionally, a similarity function is required so that descriptors can be compared. In this work, we employ the same feature detector as proposed in (Pagani et al., 2011) and refer to it as Spherical Affine SIFT (SASIFT). SASIFT was chosen for its robustness against the distortion imposed by the longitude-latitude representation of spherical images, which is especially important near the image poles.
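As an illustration, a minimal sketch of such a similarity function is given below, assuming L2-normalized SIFT-like descriptors compared with a cosine score; the exact score used with SASIFT is not specified here, so the function name and the 128-dimensional descriptor size are assumptions.

```python
import numpy as np

def similarity(d1: np.ndarray, d2: np.ndarray) -> float:
    """Cosine similarity between two unit-norm descriptors (higher is better)."""
    return float(np.dot(d1, d2))

# Usage: two random unit-norm 128-D descriptors.
rng = np.random.default_rng(0)
d1 = rng.random(128); d1 /= np.linalg.norm(d1)
d2 = rng.random(128); d2 /= np.linalg.norm(d2)
print(similarity(d1, d2))
```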
4 ROBUST GUIDED MATCHING
In this section, the main contribution of our approach is detailed. The goal is to robustly add 3D points to the SPC, either to increase the number of seed points for dense 3D reconstruction or to improve the current (sparse) representation of the scene.
Theoretically, an arbitrary number of layers could be computed per image. In practice, few layers are computed, because this is already sufficient to achieve both precise calibration (using the first layer) and dense image sampling (using the remaining layers). Yet these layers may contain several thousand descriptors, and handling numerous images simultaneously is not optimal, as computational resources are limited. Thus, we devise the method for pairs of images, so that only the corresponding layers have to be handled. The image pairs are determined according to their neighborhood relation, which is encoded in a binary upper triangular matrix $N$. If $N(i, j) = 1$, images $I_i$ and $I_j$ are considered neighbors and matches are computed between them.
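As a minimal sketch, the pair enumeration implied by $N$ could look as follows; the function name `neighbor_pairs` is chosen for illustration only.

```python
import numpy as np

def neighbor_pairs(N: np.ndarray):
    """Yield the image index pairs (i, j), i < j, marked as neighbors in N."""
    n = N.shape[0]
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only
            if N[i, j] == 1:
                yield (i, j)

# Usage: four images with consecutive pairs marked as neighbors.
N = np.zeros((4, 4), dtype=int)
N[0, 1] = N[1, 2] = N[2, 3] = 1
print(list(neighbor_pairs(N)))  # [(0, 1), (1, 2), (2, 3)]
```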
Our algorithm combines multiple feature layers, the 3D points from calibration, and a set of constraints such as epipolar geometry, thresholding and symmetric matching. Moreover, it enforces the consistency of new 3D points and may be applied recursively, allowing the number of points to be pushed even further; an illustrative outline is sketched below.
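The outline below is an illustrative sketch of this recursive loop, not the paper's actual interface; `match_pair` and `filter_consistent` are hypothetical placeholders standing in for the constrained search and the consistency filter detailed in the following subsections.

```python
def guided_matching(anchors, image_pairs, rounds=2):
    """Grow the anchor-point set by repeated guided matching over image pairs."""
    for _ in range(rounds):                      # recursive application
        new_points = []
        for (ref, tgt) in image_pairs:           # one image pair at a time
            candidates = match_pair(ref, tgt, anchors)   # constrained search
            new_points += filter_consistent(candidates)  # consistency filter
        if not new_points:                       # nothing added: stop early
            break
        anchors = anchors + new_points           # new anchors guide next round
    return anchors

def match_pair(ref, tgt, anchors):               # stub, for illustration only
    return []

def filter_consistent(candidates):               # stub, for illustration only
    return candidates

print(guided_matching(anchors=[], image_pairs=[(0, 1)]))  # []
```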
4.1 The Anchor Points
After calibration, most 3D points in the SPC are correctly triangulated. However, some outliers remain. Thus, before applying our guided matching, outliers are removed according to a local density computed for each point in the SPC. We denote the filtered point cloud by $S_0$. After filtering, all remaining points are assumed to be inliers; a sketch of such a filter is given below.
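A minimal sketch of the density-based filtering, assuming the mean k-nearest-neighbor distance as an (inverse) local density measure and a global median-based cutoff; the exact density estimate and threshold are assumptions, not the paper's definition.

```python
import numpy as np
from scipy.spatial import cKDTree

def filter_outliers(points: np.ndarray, k: int = 8, factor: float = 2.0):
    """Keep points whose mean k-NN distance stays below factor * median."""
    tree = cKDTree(points)
    # Query k + 1 neighbors: the nearest neighbor of a point is itself.
    dists, _ = tree.query(points, k=k + 1)
    mean_knn = dists[:, 1:].mean(axis=1)          # low value = dense region
    keep = mean_knn < factor * np.median(mean_knn)
    return points[keep]                           # the filtered cloud S_0

# Usage: a dense cluster plus one far-away outlier.
rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(size=(200, 3)), [[50.0, 50.0, 50.0]]])
print(filter_outliers(cloud).shape)               # the far point is removed
```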
These filtered points are regarded as reference and we refer to them as anchor points. We define an anchor point $A$ as a 3D point in Euclidean coordinates along with a set $\Theta$ holding the images and the respective features where $A$ is observed:
$$
A = \begin{cases}
P_W \in \mathbb{R}^3 \\
\Theta = \{ (I_i, f) \mid \lambda p = R_i P_W + t_i \}
\end{cases}
\qquad (3)
$$
In Equation 3, $p$ is the image point associated with $f$, and $R_i$, $t_i$ are the rotation and translation of image $I_i$. We also define the SPC as the set $S$ of all anchor points. To improve readability, we sometimes write $A$ instead of its 3D coordinates $P_W$ throughout the text.
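For illustration, Equation 3 translates directly into a small data structure; the field names below are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class AnchorPoint:
    """An anchor point: 3D coordinates P_W plus its set Theta of observations."""
    P_W: tuple[float, float, float]                    # 3D point, world frame
    # Each observation is a pair (image index i, feature index f),
    # related to P_W by lambda * p = R_i P_W + t_i (Equation 3).
    observations: set[tuple[int, int]] = field(default_factory=set)

# Usage: a point observed as feature 17 in image 0 and feature 42 in image 3.
A = AnchorPoint((1.0, 0.5, 2.0), {(0, 17), (3, 42)})
print(A)
```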
4.2 Matching based on Anchor Points
In the literature, the term guided matching usually refers to the class of methods that search for correspondences given a constraint. This constraint could be imposed by epipolar geometry, a disparity range on aligned images, a predefined or estimated search region, or any other criterion that restricts the search for correspondences to a subset of the image pixels.
Our guided matching algorithm is driven not by a single constraint but by a set of constraints, as described below. Given a reference image $I_r$, a target image $I_t$, and a feature $f_r$ detected on $I_r$, we search for a feature $f_t$ on $I_t$ under the following constraints (a sketch combining them is given after the list):
1. Epipolar geometry: $p_r^T E p_t = 0$, with $E$ the essential matrix defined by $I_r$ and $I_t$, where $p_r$ and $p_t$ are the unit vectors corresponding to $f_r$ and $f_t$;
2. Threshold: the matching score $\delta$ between the descriptors of $f_r$ and $f_t$ is above a given threshold $\tau$, i.e. $\delta(f_r, f_t) > \tau$;
3. Symmetry: $\delta(f_r, f_t)$ is the highest score when symmetric matching is performed, that is, $f_r$ and $f_t$ are the best match when the roles of reference and target images are swapped.
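A minimal sketch combining the three constraints, assuming unit-vector image points, a known essential matrix $E$, and a cosine descriptor score; the tolerance `eps` on the epipolar residual is an assumption, since exact equality never holds for noisy detections.

```python
import numpy as np

def epipolar_ok(p_r, p_t, E, eps=1e-2):
    """Constraint 1: p_r^T E p_t = 0, up to a noise tolerance."""
    return abs(p_r @ E @ p_t) < eps

def delta(d_r, d_t):
    """Matching score on unit-norm descriptors (cosine, higher is better)."""
    return float(d_r @ d_t)

def best_match(d, candidates):
    """Index of the candidate with the highest score against d."""
    return int(np.argmax([delta(d, c) for c in candidates]))

def accept(r, t, descs_r, descs_t, tau):
    """Constraints 2 and 3: score above tau, and mutually best match."""
    return (delta(descs_r[r], descs_t[t]) > tau          # threshold
            and best_match(descs_r[r], descs_t) == t     # best r -> t
            and best_match(descs_t[t], descs_r) == r)    # best t -> r

# Usage: identical descriptor sets, so every feature matches itself.
rng = np.random.default_rng(0)
descs_r = rng.random((5, 128))
descs_r /= np.linalg.norm(descs_r, axis=1, keepdims=True)
descs_t = descs_r.copy()
print(accept(2, 2, descs_r, descs_t, tau=0.9))           # True
```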
However, these constraints are usually not sufficient to achieve robust matching, because the set of features $f_t$ complying with the first two criteria above is in general large. As a result, the search has to be done in a large set of potentially ambiguous features.
We propose an approach to overcome this issue. The robustness of guided matching is improved by combining the constraints outlined above, the anchor points and a consistency filter. Our method works as follows: for each feature $f_r$, a set of anchor points projecting onto a region $\Omega$ centered at $p_r$ is selected. These points form a subset $S_\Omega$ of $S$. Assuming depth continuity for the points in $S_\Omega$, they can be used to determine a depth range $[\lambda_{min}, \lambda_{max}]$ in which the 3D