STEREO PAIR MATCHING OF ARCHAEOLOGICAL SCENES
USING PHASE DOMAIN METHODS
Manthos Alifragis and Costas S. Tzafestas
National Technical University of Athens, School of Electrical and Computer Engineering
Division of Signals, Control, and Robotics, Zographou Campus, 15773 Athens, Greece
Keywords:
Stereo matching, Phase congruency, Monogenic signal, SIFT keypoints.
Abstract:
This paper conducts an experimental study on the application of some recent theories of image preprocessing
and analysis in the frequency domain, particularly the phase congruency and monogenic filtering methods.
Our goal was to examine the performance of such methods in a stereo matching problem setting, with photos
of complicated scenes. Two scenes were used: an ancient Greek temple on the Acropolis and the outside view of the gate of an ancient theatre. Due to the complex structure of the photographed objects, classic
this paper is based on the phase-congruency method for feature extraction, together with monogenic filtering
and a new correlation measure in the frequency domain for image correspondence and stereo matching. Com-
parative results show that the three-dimensional models of the scene computed when applying these phase
domain methods are much more detailed and consistent as compared to the models obtained when using clas-
sic approaches or the SIFT based techniques, which give poor depth representation and less accurate metric
information.
1 INTRODUCTION
The problem of stereo matching and depth estimation has become in recent years the focus of considerable research in the field of computer vision. Reliable edge or feature detection techniques constitute the precursors of three-dimensional structure or scene reconstruction methods. Throughout the years, there has been significant progress in the development of image correspondence analysis and feature detection methods. As far as feature (edge or corner) detection is concerned, the traditional approach endorsed in most applications applies gradient-based methods, such as those developed by Canny (Canny, 1986) and Marr & Hildreth (Marr and Hildreth, 1980). These methods have the drawback of sensitivity to image illumination, contrast, blurring and magnification. Another disadvantage of these methods is the non-automatic determination of appropriate thresholds for feature detection. More recently, Fleck (Fleck, 1992) used a priori knowledge of the noise characteristics of the camera in order to set feature detection thresholds.
A remarkable study on the detection of image fea-
tures invariant to image scale and rotation has been
made by Lowe (Lowe, 2004). This specific approach
has been named Scale Invariant Feature Transform
(SIFT). This approach introduces a local image descriptor which is highly distinctive and invariant to image scale and rotation, as well as robust to changes in illumination and 3D viewpoint.
In the work presented in this paper, we made use
of Fourier transformations of the images, and Ga-
bor filtering, together with the phase congruency ap-
proach proposed by Kovesi (Kovesi, 1999). This ap-
proach utilizes the local frequency spread and uses
this information to weigh the phase congruency mea-
sure of the image. Concerning image correspondence,
we extend the approach proposed in (Kovesi, 1995),
using monogenic filtering together with a new cor-
relation measure in the frequency domain. This ap-
proach was enriched by a normalized expression of
the correlation measure, as well as by the additional
information of line detection results and an approx-
imately known camera motion. The combination of
the above methods led us to the development and
implementation of a filter in the frequency domain,
which was experimentally applied in a stereo match-
ing problem.
Alifragis, M. and Tzafestas, C. S. (2009). Stereo Pair Matching of Archaeological Scenes Using Phase Domain Methods. In Proceedings of the First International Conference on Computer Imaging Theory and Applications (IMAGAPP 2009), pages 21-28. DOI: 10.5220/0001791500210028. Copyright © SciTePress.

The rest of the paper is organized as follows: Section 2.1 presents the computational approach applied
in the frequency domain for image processing and
edge detection. Section 2.2 presents the new stereo-
matching approach that is based on the application of
monogenic filtering. Section 2.3 makes a brief dis-
cussion on camera calibration issues, also presenting
the image rectification methods that were used in our
experiments. Finally, an evaluation of all the above
mentioned methods in comparison with the traditional
approaches as applied to the problem of stereo match-
ing and depth estimation for 3D scene reconstruction,
is presented in Section 3.
2 PHASE DOMAIN METHODS
2.1 Feature Extraction using Phase
Congruency
The first phase in a stereo analysis problem is the ex-
traction of useful information from the images. This
phase consists of detecting edges or other geometrical
structures of interest on the images. The detection of
sporadic or texture points is avoided. The traditional approach to the edge detection problem is based on intensity information (Canny, 1986). It is known, however, that such intensity-based methods are very sensitive to uncontrolled lighting of the photographs and to the optical characteristics of the camera, and are strongly dependent on the chosen threshold levels.
The development of local energy models to de-
tect image features, such as edges or curves, under
random lighting conditions is presented in (Morrone
and Owens, 1987). According to these models, local frequency and phase information is obtained through a two-dimensional Fourier transformation of the image. The phase congruency definition by Morrone and Owens (Morrone and Owens, 1987) is given by the Fourier series expansion of a one-dimensional signal at some location x as:
PC(x) = \max_{\bar{\varphi}(x) \in [0,2\pi]} \frac{\sum_n A_n \cos(\varphi_n(x) - \bar{\varphi}(x))}{\sum_n A_n}    (1)
The quantity ϕ_n(x) denotes the local phase angle of the n-th Fourier component at location x, and A_n its amplitude. The value ϕ̄(x) that maximizes Eq. (1) is the amplitude-weighted mean phase angle of all Fourier components in a local neighborhood of the image. In the approach followed in this paper, we use complex Gabor functions (sine and cosine functions modulated by a Gaussian). In order to measure the amplitude and the phase at a specific location of the signal, we can apply two linear-phase filters in quadrature, as complex Gabor filters, for a specific scale and frequency. The Gabor filter is composed of two main components: a complex sinusoidal carrier and a Gaussian envelope. Alternatively, log-Gabor filters can be applied, as proposed by Field (Field, 1987). Log-Gabor filters have a Gaussian transfer function on a logarithmic frequency axis and allow the construction of large-bandwidth filters with odd symmetry and a DC component equal to zero; a zero DC component cannot be maintained in Gabor filters with bandwidth greater than one octave. The log-Gabor function has a frequency response described by:
G(f) = \exp\!\left( -\frac{[\log(f/f_0)]^2}{2\,[\log(\sigma/f_0)]^2} \right)    (2)
The frequency response of a log-Gabor filter is thus a Gaussian on a logarithmic frequency axis. Here f_0 defines the centre frequency of the sinusoid and represents the scaling factor of the filter, while σ is a scaling factor of the bandwidth. In order to obtain constant-shape-ratio filters, the ratio σ/f_0 should be kept constant. The first step of this analysis consists of convolving the signal with each quadrature part of the filter. Let I be the signal, and M_n^e, M_n^o the even (cosine) and odd (sine) symmetry waveforms, respectively, at scale n. The response vector then consists of the responses of each quadrature pair of filters and is written as:
(e_n(x),\, o_n(x)) = (I(x) * M_n^e,\; I(x) * M_n^o)    (3)
where e_n(x) and o_n(x) are the real and imaginary parts of the Fourier frequency component, and * denotes convolution. The amplitude of the transformed signal at a specific scale n of the Gabor filter is given by:
A_n(x) = \sqrt{ e_n(x)^2 + o_n(x)^2 }    (4)
The phase is given by:
\Phi_n(x) = \arctan( e_n(x),\, o_n(x) )    (5)
at each point x in a signal. The arctan is the four
quadrant inverse tangent.
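The quadrature filtering steps of Eqs. (2)-(5) can be sketched for a one-dimensional signal as follows (a minimal NumPy illustration, not the authors' implementation; the parameter values are assumptions):

```python
import numpy as np

def log_gabor_response(signal, f0, sigma_ratio=0.55):
    """Filter a 1-D signal with a log-Gabor filter (Eq. 2) and return the
    even/odd quadrature responses, amplitude (Eq. 4) and phase (Eq. 5).
    f0: centre frequency in cycles/sample; sigma_ratio: sigma/f0."""
    n = len(signal)
    freqs = np.fft.fftfreq(n)                      # signed frequencies
    G = np.zeros(n)
    pos = freqs > 0                                # zero DC by construction
    G[pos] = np.exp(-np.log(freqs[pos] / f0) ** 2
                    / (2 * np.log(sigma_ratio) ** 2))
    # Keeping only positive frequencies yields an analytic response whose
    # real part is the even (cosine) output e_n and whose imaginary part
    # is the odd (sine) output o_n of the quadrature pair.
    resp = np.fft.ifft(np.fft.fft(signal) * G)
    e, o = resp.real, resp.imag
    amplitude = np.hypot(e, o)                     # Eq. (4)
    phase = np.arctan2(o, e)                       # Eq. (5), four-quadrant
    return e, o, amplitude, phase

t = np.arange(256)
sig = np.cos(2 * np.pi * (16 / 256) * t)           # pure tone at FFT bin 16
e, o, A, phi = log_gabor_response(sig, f0=16 / 256)
# For a pure tone at the filter's centre frequency the local amplitude is
# constant (0.5: half the energy sits in the discarded negative bin).
```

For a broadband signal, repeating this over several values of f0 yields the per-scale responses entering the phase congruency sums.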
A response vector was thus formed for each scale
of the filter. The array of all these vectors repre-
sents the localized information of a signal. The above
expression (1) of the phase congruency measure has
poor localization on blurred features. This measure
is also sensitive to noise. These issues led Kovesi in
(Kovesi, 1999) to develop a new measure for phase
congruency and to extend the one-dimensional fil-
ters described previously into two dimensions. This
new measure includes the additional sum of the log-
Gabor filters response amplitudes multiplied in the
frequency domain with some spreading function over
all orientations and scales at a specific location in the
image.
Gaussian functions were also used as spreading
functions across the filter perpendicular to its orien-
tation, according to Kovesi (Kovesi, 1999). This preserves the phase information, because any signal convolved with a Gaussian function has its amplitude components modulated while its phase remains unaltered. Towards this
direction, the two dimensional filters used in the fre-
quency domain are Gaussian with geometrically in-
creasing center frequencies and bandwidths. Their
transfer function is:
G(\theta) = \exp\!\left( -\frac{(\theta - \theta_0)^2}{2\sigma_\theta^2} \right)    (6)
where θ_0 is the orientation angle of the filter, and σ_θ is the standard deviation of the Gaussian spreading function in the angular direction.
The 2-D phase congruency measure for two-dimensional signals, such as images, can be obtained as follows:
PC_2(x) = \frac{ \sum_o \sum_n \lfloor W_o(x)\, A_{no}(x)\, \Delta\Phi_{no}(x) - T \rfloor }{ \sum_o \sum_n A_{no}(x) + \varepsilon }    (7)
at each two-dimensional image location x, where ΔΦ_no(x) = cos(ϕ_no(x) − ϕ̄(x)) − |sin(ϕ_no(x) − ϕ̄(x))|, and o, n refer to the filter's orientation and scale, respectively. The symbols ⌊·⌋ denote that the enclosed quantity is set to zero when it is non-positive. The numerator of
the above fraction represents the total energy of the
2D signal at a local point of the image. This amount
of energy is an approximation of the local energy function defined for an analytic signal, according to
Venkatesh and Owens (Venkatesh and Owens, 1989).
The term W
o
(x) weighs the frequency spread. Kovesi
made use of W
o
(x) as a component to cope with
the lack of reliability of phase congruency measures
in image areas with less frequency spread (e.g.
smoothed images). The role of ε is to avoid division
by zero. Finally, only the values which are above a
threshold T (the expected influence of noise) are used
to calculate the final result. The appropriate threshold
T for the noise is set experimentally, according to the
response of the smallest scale filter on each image.
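For a single orientation, the core of Eq. (7) can be sketched from per-scale quadrature responses as follows (a simplified NumPy illustration; the weight W and noise threshold T are taken as given here, whereas the implementation referenced by the paper estimates them from the filter responses):

```python
import numpy as np

def phase_congruency_1o(even, odd, W=1.0, T=0.0, eps=1e-4):
    """Phase congruency (Eq. 7) for a single orientation.
    even, odd: arrays of shape (scales, H, W) holding the quadrature
    filter responses e_n and o_n of the image at each scale n."""
    A = np.hypot(even, odd)                        # per-scale amplitudes A_n
    sum_e, sum_o = even.sum(0), odd.sum(0)         # local energy vector
    norm = np.hypot(sum_e, sum_o) + eps
    ce, so = sum_e / norm, sum_o / norm            # cos/sin of mean phase
    # A_n * DeltaPhi_n = A_n [cos(phi_n - phibar) - |sin(phi_n - phibar)|]
    AdPhi = (even * ce + odd * so) - np.abs(odd * ce - even * so)
    num = np.maximum(W * AdPhi - T, 0.0).sum(0)    # floor operator of Eq. (7)
    return num / (A.sum(0) + eps)
```

When the responses share the same phase across scales the measure approaches 1; for fully incongruent phases it drops to 0, independently of the local amplitude.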
It must be also noted here that the type of an image
feature detected, such as a line or a corner correspond-
ing to maximum phase congruency value, needs to be
classified accordingly. Towards this end, the phase congruency feature maps were calculated according to the implementation in (Kovesi, n.d.).
2.2 Feature Matching on the Phase
Domain
The image correspondence problem has been extensively studied as a fundamental problem of low-level computer vision. In order to track corresponding points across images, intensity correlation processes can be applied. However, the apparent complexity of the images used in our experiments dictated a different approach to this issue. In the work presented in this paper, we employ an approach based on two-dimensional analytic signal theory and monogenic signal theory, inspired by the work of Felsberg and Sommer (Felsberg and Sommer, 2001).
The two-dimensional analytic signal is based on a two-dimensional generalization of the Hilbert transform, also known as the Riesz transform. The expression of the Riesz-transformed signal F_R(u) in the frequency domain is:

F_R(\mathbf{u}) = i \frac{\mathbf{u}}{|\mathbf{u}|} F(\mathbf{u})

where \mathbf{u} is the two-dimensional frequency vector (u_1, u_2) and |\mathbf{u}| = \sqrt{u_1^2 + u_2^2}.
The Fourier transform I_F of each image was first computed. The next stage was the introduction of a log-Gabor filter (see Eq. (2)), which contributes to the construction of bandpass expressions of the signal F_R in the frequency domain:

H_s^R = F_R \, G(f)    (8)

where H_s^R = [H_s^{1R}, H_s^{2R}]^T. The above image filters are applied in the frequency domain and, after applying an inverse Fourier transform, the real part of the resulting signals is obtained as follows:
I_s^F = \mathrm{Re}\{ \mathcal{F}^{-1}\{ I_F \, G(f) \} \}    (9)

H_s^{1F} = \mathrm{Re}\{ \mathcal{F}^{-1}\{ I_F \, H_s^{1R} \} \}    (10)

H_s^{2F} = \mathrm{Re}\{ \mathcal{F}^{-1}\{ I_F \, H_s^{2R} \} \}    (11)
This leads to a generalized complex 2D analytic signal, whose real part is the signal I_s^F of Eq. (9) and whose complex part is the mathematical expression of its Riesz transform, according to Felsberg and Sommer (Felsberg and Sommer, 2001). Therefore, the complex part of the 2D analytic signal consists of two signals, H_s^{1F} of Eq. (10) and H_s^{2F} of Eq. (11). Consequently, at each point (x, y) of the image, at a specific scale and orientation, we have a 3D vector x(x, y) consisting of the three above signals (9), (10) and (11). In addition, the amplitude of the local energy is given by:
A = \sqrt{ (H_s^{1F})^2 + (H_s^{2F})^2 + (I_s^F)^2 }    (12)
Given two locations (x_1, y_1) and (x_2, y_2) in the first and second images (Im_1 and Im_2), respectively, the correlation measure is then given by:
C_{12}(x_1^{Im_1}, x_2^{Im_2}) = \frac{ \sum_{m=-k}^{+k} \sum_{n=-l}^{+l} x_1^{Im_1}(x_1+m,\, y_1+n) \cdot x_2^{Im_2}(x_2+m,\, y_2+n) }{ \sum_{m=-k}^{+k} \sum_{n=-l}^{+l} A^{Im_1}(x_1+m,\, y_1+n)\, A^{Im_2}(x_2+m,\, y_2+n) }    (13)
where x_1^{Im_1} and x_2^{Im_2} are the three-dimensional monogenic filter response vectors of the candidate matching points (x_1, y_1) and (x_2, y_2). The correlation measure in (13) is computed as the dot product of the above vectors over a window spanning −l to l and −k to k in the two-dimensional image plane. In addition, this measure is normalized by the sum of the amplitude responses, according to (12). The matching is considered successful for the pairs of points where the above correlation measure is maximized (i.e. argmax |C_{12}(x_1^{Im_1}, x_2^{Im_2})|).
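The normalized correlation of Eq. (13) reduces to a windowed sum of 3-D dot products divided by a windowed sum of amplitude products. A minimal sketch (NumPy, single scale and orientation, image borders ignored; the array layout is an assumption):

```python
import numpy as np

def monogenic_correlation(vec1, A1, vec2, A2, p1, p2, k=3, l=3):
    """Correlation measure of Eq. (13). vec1, vec2: (H, W, 3) arrays with
    the monogenic response vector (I_s^F, H_s^1F, H_s^2F) per pixel;
    A1, A2: (H, W) amplitude maps from Eq. (12); p1, p2: (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    w1 = vec1[y1 - l:y1 + l + 1, x1 - k:x1 + k + 1]
    w2 = vec2[y2 - l:y2 + l + 1, x2 - k:x2 + k + 1]
    num = np.sum(w1 * w2)                          # sum of 3-D dot products
    den = np.sum(A1[y1 - l:y1 + l + 1, x1 - k:x1 + k + 1]
                 * A2[y2 - l:y2 + l + 1, x2 - k:x2 + k + 1])
    return num / den
```

Since A is the per-pixel norm of the response vector, the Cauchy-Schwarz inequality bounds the measure by 1, a value attained only for identical windows.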
2.3 Camera Calibration and Image
Rectification
In the first phase of our experiments, the camera was calibrated using the algorithm presented by Zhang (Zhang, 1999) with planar patterns. Regarding the camera model, we assumed that there is no skew. The focal length per distance unit along the two image-plane directions is represented by the values α_x and α_y, while p_x and p_y denote the coordinates of the principal point in the x and y directions, respectively. Following these assumptions, the matrix of the camera intrinsic parameters takes the following form:
K = \begin{pmatrix} \alpha_x & 0 & p_x \\ 0 & \alpha_y & p_y \\ 0 & 0 & 1 \end{pmatrix}    (14)
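As an illustration, the intrinsic matrix of Eq. (14) maps normalized camera coordinates to pixels, and its inverse performs the opposite mapping (a small NumPy sketch with hypothetical parameter values, not the calibrated values of our camera):

```python
import numpy as np

# Hypothetical intrinsics in the form of Eq. (14): no skew, focal
# lengths a_x, a_y and principal point (p_x, p_y), all in pixels.
a_x, a_y, p_x, p_y = 800.0, 810.0, 320.0, 240.0
K = np.array([[a_x, 0.0, p_x],
              [0.0, a_y, p_y],
              [0.0, 0.0, 1.0]])

def to_normalized(pixel):
    """Map a pixel (u, v) to normalized camera coordinates via K^-1."""
    u, v, w = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    return np.array([u / w, v / w])

# The principal point lies on the optical axis and maps to (0, 0).
```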
The process of image rectification simplifies the
matching problem by transforming the whole search
area for each image from 2D to 1D. Therefore, the
epipolar lines become parallel and coincide with the
scan lines used to find matching pairs. The rectifi-
cation transformation used to place the epipolar lines
in parallel, was based on the above assumptions con-
cerning the camera intrinsic parameters, according to
Hartley (Hartley, 1997) and Koch et al. (Koch et al.,
1998).
Figure 1: Successive photos, in grayscale, of a temple in
Acropolis and the gate of an ancient theatre, captured with
a camera moving in a straight line.
3 IMPLEMENTATION - RESULTS
3.1 Feature Detection - Matching
The camera used for the experiments was a Nikon D70s with an 18-74mm lens. Successive photographs of the side views of the two subjects (namely, a temple on the Acropolis of Athens and the outside scene of an ancient theatre's gate) were used as experimental data, while the camera was sliding along an almost straight line. A short displacement was used to avoid large occluded areas. All the images were transformed to grayscale, and color information was not used for depth estimation. The photos captured and used for the experiments are shown in Fig. 1.
The initial phase in a stereo matching process is
feature detection for each image frame. The three
main approaches for feature detection that were ex-
perimentally tested (in comparison) are presented in
the sequel. The first approach consisted of applying classic filters based on image intensity gradient computations, such as the Canny edge detector. In the second approach, a scale-space representation of the image was utilized in order to extract features (keypoints) that can be repeatedly detected under slightly different views or changes in image scale, rotation, or illumination conditions. The candidate keypoints were detected according to Lowe (Lowe, 2004), using scale-space extrema of the difference-of-Gaussian function convolved with the image. In the
third approach, edge detection on the rectified images was performed based on phase congruency filtering. In this case, edges were detected on the images through the calculation of the maximum value
of the moments of phase congruency covariance, as
it has already been described. From the group of de-
tected edges, we chose those with length greater than
a selected threshold. The feature detection results for the three approaches are shown in Fig. 2.
For the task of edge detection based on phase congruency concepts, log-Gabor functions were used, with Gaussian transfer functions on a logarithmic frequency scale. This filter was applied at six orientations and four scales, with a constant one-octave bandwidth, according to Equations (7) and (2), following the analysis presented in (Field, 1987). Observing the first row in Fig. 2, one can conclude that the application of traditional techniques, like Canny filtering, to such a complicated scene results in poor localization of the detected edges, as compared to the results obtained by applying phase congruency methods, shown in the third row of Fig. 2. The second row in Fig. 2 shows the detected candidate SIFT keypoints at a specific level of the constructed scale-space pyramid. In this case, it is evident that the difference-of-Gaussian operator, which is based on gradient measurements, emphasizes edge features, even those with low contrast. Such low-contrast features are subsequently excluded from the SIFT features as being non-distinctive. The apparent complexity of the scene resulted in a large number of these features; therefore, the matching process is based on fewer candidate points.
The next stage in depth estimation is the process of matching corresponding points between successive images, which is known to be an ill-conditioned problem in low-level vision. The quality of the solution
of the matching problem has a direct impact on the
quality of the scene reconstruction. The matching
process was again performed based on three meth-
ods, in comparison: (a) The first approach used Canny
filtering for feature detection and a typical intensity-
based correlation method for the matching process.
(b) The second approach consisted of SIFT keypoints
detection, based on image gradient amplitude and ori-
entation measurements, the construction of invariant
keypoint descriptors for each image of the stereo pair
and, finally, the matching process which was based on
these descriptors correspondence, through Euclidean
distance measurements. (c) Finally, the third approach followed in this paper was based on the application of monogenic filters, as described in Section 2.2. The known, single direction of camera motion indicated the direction along which candidate matching features move on the image plane. The rectification process also places corresponding points on the same scan line. Based on these remarks, the search area for image correspondences was radically reduced, effectively yielding much more reliable matching points.
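The restriction of the search to the rectified scan line amounts to an argmax of the Eq. (13) measure along one image row; a NumPy sketch (the search radius and window sizes are assumptions):

```python
import numpy as np

def match_along_scanline(vec1, A1, vec2, A2, p1, search=40, k=3, l=3):
    """Best match for p1 = (x1, y1) of image 1 along the same scan line
    of the rectified image 2, maximizing the normalized monogenic
    correlation of Eq. (13). Returns (x2, score)."""
    x1, y1 = p1
    w1 = vec1[y1 - l:y1 + l + 1, x1 - k:x1 + k + 1]
    a1 = A1[y1 - l:y1 + l + 1, x1 - k:x1 + k + 1]
    best_x2, best_c = None, -np.inf
    lo = max(k, x1 - search)                       # stay inside the image
    hi = min(vec2.shape[1] - 1 - k, x1 + search)
    for x2 in range(lo, hi + 1):                   # epipolar line = row y1
        w2 = vec2[y1 - l:y1 + l + 1, x2 - k:x2 + k + 1]
        a2 = A2[y1 - l:y1 + l + 1, x2 - k:x2 + k + 1]
        c = abs(np.sum(w1 * w2)) / np.sum(a1 * a2)
        if c > best_c:
            best_x2, best_c = x2, c
    return best_x2, best_c
```

In the full method the candidate positions would additionally be limited to detected feature points rather than every pixel of the row.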
During our experiments, we observed that the matching of sporadic points created many problems, especially when an intensity-based correlation method was applied. This occurs because the photos were taken outdoors, depict very complicated scenes, and were captured under uncontrolled lighting and illumination conditions. This means that the intensity values of specific points carry a lot of uncertainty. Variations in shading (in one or more photos), repeated patterns on the images and uniform texture all result in very close intensity values for certain pixel neighborhoods, leading to a large number of candidate points for matching.
The first attempt to overcome the uncertainty prob-
lems was based on the use of a correlation measure
for whole geometric primitives like lines, excluding
from the correlation process sporadic points.
The prior knowledge of the camera motion was used in order to search for probable corresponding lines in the direction opposite to the camera's movement. The search area was basically restricted to a horizontal band of the image, due to the known horizontal motion of the camera. The search window was chosen with its width (horizontal dimension) almost equal to half the image width, and its height (vertical dimension) equal to a few pixels. However, line characteristics like length and direction may deviate considerably between corresponding images; it is, for instance, possible for a line detected in one image to break into two or more parts in the other. Therefore, we decided,
instead, to use point correlation techniques on these
candidate lines. Such an approach improved indeed
the obtained results. The confirmation of the match-
ing validity in the neighborhood of each point, was
achieved by the implementation of classic relaxation
methods, as in (Faugeras, 1993).
3.2 3D Reconstruction Results
Our main goal in this study was to evaluate the perfor-
mance of the phase domain methods, in comparison
to the classical (intensity based) filtering techniques
and the SIFT local keypoint descriptors in a stereo
matching and 3D calibrated reconstruction problem.
The feature matching process led to the acquisition of
matching pairs in the two images. Hence, the estima-
tion of the fundamental matrix becomes feasible. The
calibration matrix was recovered by implementing Zhang's method, as briefly described in Section 2.3.
Consequently, the projection matrices of the camera
were computed in both configurations through the es-
sential matrix according to the known relation intro-
duced by Hartley and Zisserman (Hartley and Zis-
serman, 2000). The metric information of the scene
was recovered using the camera calibration matrix.
Figure 2: Edge detection results for the two photos. First row: application of the Canny filter. Second row: candidate SIFT keypoints detected with the difference-of-Gaussian filter. Third row: application of phase congruency methods.
Depth estimation for each matching pair of points was performed by point triangulation, and was further refined by minimizing the reprojection error using the Levenberg-Marquardt method.
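The triangulation step can be sketched with the standard linear (DLT) method of Hartley and Zisserman (a NumPy sketch with hypothetical projection matrices; the Levenberg-Marquardt refinement would follow as a nonlinear least-squares pass):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one matched pair (x1, x2) from two
    3x4 projection matrices P1, P2; returns the inhomogeneous 3D point."""
    u1, v1 = x1
    u2, v2 = x2
    A = np.array([u1 * P1[2] - P1[0],
                  v1 * P1[2] - P1[1],
                  u2 * P2[2] - P2[0],
                  v2 * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)                    # null vector of A
    X = Vt[-1]
    return X[:3] / X[3]

# Hypothetical rectified pair: identity intrinsics, unit baseline along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
x1 = X_true[:2] / X_true[2]
x2 = (X_true[:2] + np.array([-1.0, 0.0])) / X_true[2]
# triangulate(P1, P2, x1, x2) recovers X_true exactly for noise-free data.
```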
Comparative results for the three different methods are presented in Figures 3(a) and 3(b). The first row of both subfigures presents the results obtained when applying classic methods for both feature detection and point matching, that is, Canny edge detection and intensity-based correlation techniques. For the results depicted in the second row, the difference-of-Gaussian filter was applied for SIFT keypoint detection; the matching of image points was approached by constructing local image descriptors, which assign gradient magnitude and orientation to each keypoint according to (Lowe, 2004). The best candidate match was found by identifying the point, in the other image of the stereo pair, with minimum Euclidean distance of the invariant keypoint descriptor vector. In the third row, the results are obtained using both the phase congruency method for edge detection and the monogenic filtering approach for point correlation.
The results presented in Figures 3(a) and 3(b) are also organized in three columns. The first column of
each subfigure shows the 3D scene reconstruction re-
sults, for the three methods mentioned above, while
the second column depicts the same reconstructed
scenes but rotated by a small angle (approximately 20
degrees, to better illustrate the estimated scene depth).
Finally, the third column presents a colored illustra-
tion of the scene depth, with red being the color of the
nearest points and blue the color of the most distant
ones (with linear color to distance variation).
By observing these experimental results, it can
be seen that a more dense and accurate representa-
tion of the scene structure is obtained when all com-
putations are conducted in the phase domain (third
row in Fig. 3(a) and 3(b)). In addition, we observe
that a better variance of the depth values is obtained
when the matching process is implemented in the
phase domain. This observation becomes more evi-
dent when the reconstructed scene is rotated as shown
in the second column of Figures 3(a) and 3(b). The
classic methods give results that are evidently non-
satisfactory in this case, regarding the structure and
depth estimation of the scene.

(a) Ancient Temple Reconstruction
(b) Gate of Ancient Theatre
Figure 3: Scene reconstruction results obtained from two experimental stereo pairs. The organization of the results in rows and columns is the same in both subfigures. First row: classic techniques applied using only the intensity values of each image. Second row: scene reconstruction results using gradient measurements for edge detection and SIFT keypoint matching. Third row: computations conducted solely in the phase domain. First column: scene reconstruction results using the different edge detection and matching methods. Second column: reconstruction results rotated by a small angle. Third column: colored illustration of the scene depth, with red being the color of the nearest and blue the color of the most distant points.

The computation of the
three-dimensional points by an intensity-based correlation method (first-row results in Figures 3(a) and 3(b)) leads, in this case, to a structure where all points are almost co-planar, with a very vague reconstructed scene structure and a lot of outliers. Furthermore, the results in the second row of both subfigures indicate that the use of SIFT keypoint descriptors is also not an appropriate procedure for stereo matching in such cases of highly complex images. The computation of keypoint descriptors is based on gradient measurements in the image at different levels of Gaussian blur. Such measurements can detect features with poorly defined peaks in the difference-of-Gaussian function, which are then rejected from keypoint descriptor computation. This results in the detection of sporadic keypoints that do not follow a specific structure. The same conclusion is confirmed by the very poor depth variation, without a clear sense of the scene structure, in the second column of the second-row results of both Figures 3(a) and 3(b).
4 CONCLUSIONS AND FUTURE
WORK
This paper presents an approach for depth estimation
and scene reconstruction using phase domain meth-
ods, based on concepts that involve local representa-
tion of image features. We implement recent ideas
of local energy models for each stage of the 3D re-
construction process, comprising mainly the tasks of
feature detection and image correspondence. More
specifically, for the task of detecting edges as image
features, we applied the phase congruency method,
introduced by Kovesi (Kovesi, 1999), while for the
image correspondence task we implemented a new
version of a correlation measure based on monogenic filtering. An appropriate normalization was performed, based on localized amplitude responses of log-Gabor signals and prior knowledge of the camera motion, in order to enrich the mathematical expression of the new correlation measure. Experimental results showed that feature estimation in the frequency domain remains largely invariant to changes in the lighting conditions of the scene. It was concluded that the proposed approach leads to more reliable results, producing more accurate metric information of the scene and a denser structure as the outcome of the 3D scene reconstruction process. The good behavior of such phase-domain models was confirmed; they appear to be a preferred choice for the task of image-based 3D reconstruction of complex sceneries, such as the archaeological site and the outdoor scene used in the experimental study of this work. In the future, we plan to extend the approach presented in this paper to extract dense disparity maps from multiple camera views, integrated within probabilistic frameworks.
ACKNOWLEDGEMENTS
This work was supported by grant ΠENE-2003-
E865 [co-financed by E.U.-European Social Fund
(80%) and the Greek Ministry of Development-GSRT
(20%)].
REFERENCES
Canny, J. (1986). A computational approach to edge detection. IEEE Trans. Pattern Analysis and Machine Intelligence, 8(6):679–698.
Faugeras, O. (1993). Three-Dimensional Computer Vision:
A Geometric Viewpoint. MIT Press, Cambridge, Mas-
sachussets.
Felsberg, M. and Sommer, G. (2001). The monogenic signal. IEEE Transactions on Signal Processing, 49(12):3136–3144.
Field, D. J. (1987). Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A, 4(12):2379–2394.
Fleck, M. M. (1992). Multiple widths yield reliable finite differences. IEEE T-PAMI, 14(3):337–345.
Hartley, R. (1997). In defence of the eight-point algorithm.
IEEE T-PAMI, 19(6):580–593.
Hartley, R. and Zisserman, A. (2000). Multiple View Geometry in Computer Vision. Cambridge University Press.
Koch, R., Pollefeys, M., and Gool, L. V. (1998). Automatic 3D model acquisition from uncalibrated image sequences. In Proceedings Computer Graphics International, pages 597–604, Hannover.
Kovesi, P. D. MATLAB and Octave functions for computer vision and image processing. Available from: http://www.csse.uwa.edu.au/~pk/research/matlabfns/.
Kovesi, P. D. (1995). Image correlation from local fre-
quency information. In The Australian Pattern Recog-
nition Society Conference: DICTA’95, pages 336–
341, Brisbane.
Kovesi, P. D. (1999). Image features from phase congru-
ency. Videre: A Journal of Computer Vision Research,
MIT Press.
Lowe, D. (2004). Distinctive image features from scale-
invariant keypoints. International Journal of Com-
puter Vision, 60(2):91–110.
Marr, D. and Hildreth, E. C. (1980). Theory of edge detec-
tion. In Proceedings of the Royal Society, London B,
pages 187–217.
Morrone, M. C. and Owens, R. A. (1987). Feature detec-
tion from local energy. Pattern Recognition Letters,
6:303–313.
Venkatesh, S. and Owens, R. (1989). An energy feature de-
tection scheme. In International Conference on Image
Processing, pages 553–557, Singapore.
Zhang, Z. (1999). A flexible new technique for camera cali-
bration. In International Conference on Computer Vi-
sion, Kerkyra, Greece.