image into a grid of M × M buckets and selecting only
a small number of features from each bucket, the
bucketing technique ensures a well-distributed
selection of features. In particular, this uniform
distribution of features has the potential to improve
both the accuracy and computational efficiency of
pose estimation.
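The bucketing idea can be illustrated with a minimal sketch. It assumes that each feature carries a detector response score and that the strongest features in each bucket are the ones kept. The function name `bucket_select` and the parameters `m` and `k` are hypothetical and are not taken from the cited works.

```python
import numpy as np

def bucket_select(points, scores, width, height, m, k):
    """Return sorted indices of up to k highest-score features per bucket.

    points: (N, 2) array of (x, y) pixel coordinates.
    scores: (N,) array of detector responses (assumed ranking criterion).
    The image of size width x height is split into an m x m grid.
    """
    points = np.asarray(points, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # Map each point to its grid column and row, clamping border points.
    col = np.clip((points[:, 0] * m / width).astype(int), 0, m - 1)
    row = np.clip((points[:, 1] * m / height).astype(int), 0, m - 1)
    bucket = row * m + col
    keep = []
    for b in np.unique(bucket):
        idx = np.flatnonzero(bucket == b)
        # Keep the k strongest features of this bucket.
        best = idx[np.argsort(scores[idx])[::-1][:k]]
        keep.extend(best.tolist())
    return sorted(keep)

if __name__ == "__main__":
    pts = [(10, 10), (20, 20), (30, 30), (90, 90)]
    resp = [0.1, 0.9, 0.5, 0.2]
    print(bucket_select(pts, resp, width=100, height=100, m=2, k=1))
```

Capping the number of features per bucket prevents a highly textured region from dominating the selection, which is the property the bucketing technique relies on.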
The feature selection approaches described above
aim to select dependable features by removing
outliers or by considering the distribution of feature
points. While these methods have proven effective
in enhancing the accuracy of VO, they are not
derived from a mathematical formulation that
guarantees the minimum uncertainty of VO, and
therefore they do not ensure optimality.
Recently, the concept of the Orthogonality Index
(Nguyen and Lee, 2019) has been introduced to
analytically derive optimal feature selection. This
approach demonstrates optimal feature selection
through a well-defined mathematical format instead
of random selection. The process increases the
orthogonal exponent of individual equations and
applies constraints to computation to reduce
uncertainty when estimating Essential, Fundamental,
or Homography matrices associated with visual
odometry. However, while the Orthogonality Index
provides a mathematical method for optimal feature
selection, it does not account for uncertainty in
feature points due to measurement or other noise.
This issue must be addressed because it significantly
impacts VO estimation. Therefore, a method that
reflects this uncertainty is necessary to ensure optimal
feature selection.
To this end, our study capitalizes on insights
gained from simulation experiments, which have
shown that the measurement error variance and the
spatial distribution of the extracted feature points
significantly affect pose estimation. We propose a
novel approach that incorporates both of these
factors.
Our approach can be summarized as follows. If
the matched feature point pairs have minimal
measurement error and are uniformly distributed
throughout the image, the estimated essential matrix
is expected to be close to the ground truth (GT)
essential matrix. However, because of the
uncertainty of the matched feature point pairs used to
estimate the essential matrix and the resulting error in
the equations generated from them, the estimated
essential matrix follows a stochastic distribution
centered on the GT. We experimentally demonstrate
that the degree of dispersion depends on the
magnitude of the uncertainty in estimating the
essential matrix. We also found that the spatial
distribution of the matched feature point pairs should
be taken into account when selecting them, and we
present a novel "Uncertainty Hypervolume" approach
that takes both factors into account.
The estimated essential matrix is stochastically
distributed around the reference ground truth
essential matrix, and we quantify this dispersion with
a hypervolume. Through experiments, we show a
significant correlation between the hypervolume and
the error of the pose derived from the essential matrix.
Based on these results, we propose a mathematically
well-structured Uncertainty Hypervolume-based
approach for feature point pair selection to obtain the
optimal solution.
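The dispersion described above can be reproduced in a small Monte Carlo sketch. This section does not state how the hypervolume is computed, so the sketch makes its own assumptions: the estimator is the plain linear eight-point method, and the volume proxy is the log of the square root of the product of the eight largest eigenvalues of the sample covariance of the vectorized essential matrix. Both are illustrative stand-ins and not the paper's definition. The scene geometry, noise levels, and all function names are hypothetical.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def make_scene(n, rng):
    """Synthetic two-view scene: normalized points p, q and unit-norm GT E."""
    a = 0.1
    R = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(a), 0.0, np.cos(a)]])
    t = np.array([1.0, 0.0, 0.1])
    X1 = rng.uniform([-2.0, -2.0, 4.0], [2.0, 2.0, 8.0], size=(n, 3))
    X2 = (X1 - t) @ R          # rows are R^T (X1 - t), i.e. X1 = R X2 + t
    p = X1 / X1[:, 2:3]        # homogeneous normalized coordinates, view 1
    q = X2 / X2[:, 2:3]        # homogeneous normalized coordinates, view 2
    E = skew(t) @ R            # satisfies p^T E q = 0 for this convention
    return p, q, E / np.linalg.norm(E)

def eight_point(p, q):
    """Linear least-squares E (unit Frobenius norm) from p_i^T E q_i = 0."""
    # Row i holds the products p_i[a] * q_i[b], matching E.ravel() order.
    A = np.einsum('ni,nj->nij', p, q).reshape(len(p), 9)
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 3)

def dispersion_log_volume(p, q, e_gt, sigma, trials, rng):
    """Log-volume proxy of the spread of estimated E under Gaussian noise."""
    samples = []
    for _ in range(trials):
        pn, qn = p.copy(), q.copy()
        pn[:, :2] += rng.normal(0.0, sigma, size=(len(p), 2))
        qn[:, :2] += rng.normal(0.0, sigma, size=(len(q), 2))
        e = eight_point(pn, qn)
        if np.sum(e * e_gt) < 0:   # resolve the sign ambiguity against GT
            e = -e
        samples.append(e.ravel())
    cov = np.cov(np.array(samples), rowvar=False)
    # Unit-norm E has 8 degrees of freedom, so drop the smallest eigenvalue.
    top8 = np.linalg.eigvalsh(cov)[1:]
    return 0.5 * np.sum(np.log(top8))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, q, e_gt = make_scene(30, rng)
    for sigma in (1e-3, 1e-2):
        print(sigma, dispersion_log_volume(p, q, e_gt, sigma, 200, rng))
```

Under these assumptions the proxy grows with the measurement noise, which mirrors the qualitative claim that the dispersion of the estimated essential matrix depends on the uncertainty of the matched pairs.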
In the following sections, we detail our
methodology, experimental setup, and results,
culminating in a comprehensive analysis of the
interplay between feature selection, spatial
distribution, and pose estimation accuracy.
2 PROBLEM DEFINITION AND APPROACH
2.1 Preliminary
Figure 1: Epipolar geometry. A 3D point 𝑃 is projected
onto the normalized image plane of each camera at 𝑝 and 𝑞.
The points 𝑒 and 𝑒′, where the line connecting the two
camera origins meets the image planes, are called epipoles,
and the straight lines 𝑙 and 𝑙′ connecting the projection
points to the epipoles are called epilines (epipolar lines).
In epipolar geometry (Deriche et al., 1994), given a
point 𝑃 in space, cameras 𝐶₁ and 𝐶₂ view the point 𝑃
from two different perspectives. The point 𝑃 is then
projected onto the normalized image plane of each
camera 𝐶₁ and 𝐶₂ as 𝑝 and 𝑞 (𝑝 and 𝑞 are
homogeneous normalized image coordinates). It is
known that there is always a 3×3 essential matrix 𝑬
(Nistér, 2004) relating the projected points 𝑝 and 𝑞
that satisfies the epipolar constraint 𝑝ᵀ𝑬𝑞 = 0.
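The epipolar constraint can be checked numerically on synthetic data. The sketch below assumes a particular pose convention that the text does not specify: a point with coordinates X2 in the second camera frame has coordinates X1 = R X2 + t in the first, in which case E = [t]×R satisfies the constraint with 𝑝 from the first view and 𝑞 from the second. The rotation, translation, and point ranges are arbitrary illustrative values, and the helper names are hypothetical.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def epipolar_residuals(p, E, q):
    """Return p_i^T E q_i for every correspondence (rows of p and q)."""
    return np.einsum('ni,ij,nj->n', p, E, q)

def synthetic_pairs(n, seed=0):
    """Noise-free normalized correspondences and the matching E (assumed pose)."""
    rng = np.random.default_rng(seed)
    a = 0.1                                    # rotation about the y axis
    R = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(a), 0.0, np.cos(a)]])
    t = np.array([1.0, 0.0, 0.1])
    X1 = rng.uniform([-2.0, -2.0, 4.0], [2.0, 2.0, 8.0], size=(n, 3))
    X2 = (X1 - t) @ R                          # rows are R^T (X1 - t)
    p = X1 / X1[:, 2:3]                        # homogeneous normalized coords
    q = X2 / X2[:, 2:3]
    return p, q, skew(t) @ R

if __name__ == "__main__":
    p, q, E = synthetic_pairs(20)
    print(np.max(np.abs(epipolar_residuals(p, E, q))))
```

With noise-free correspondences the residuals vanish up to floating-point rounding. With measured points they do not, and that deviation is the source of the uncertainty discussed in the introduction.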