a selected threshold. The feature detection results for
both approaches are shown in Fig. 2.
For the task of edge detection based on phase congruency concepts, log-Gabor functions were used, with Gaussian transfer functions on a logarithmic frequency scale. This filter bank was applied in six orientations and at four scales, with a constant one-octave bandwidth, according to Equations (7) and (2), and following the analysis presented in (Field, 1987).
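As an illustration, a minimal sketch of such a log-Gabor filter bank (Gaussian transfer function on a logarithmic frequency scale, six orientations, four scales, roughly one-octave bandwidth) is given below; the function name, normalization and angular-spread constants are our own assumptions, not the exact implementation used in the experiments.

```python
import numpy as np

def log_gabor_bank(rows, cols, n_scales=4, n_orient=6,
                   min_wavelength=3.0, mult=2.0, sigma_on_f=0.75):
    """Build a bank of 2-D log-Gabor transfer functions in the frequency
    domain: a Gaussian on a logarithmic frequency scale (radial part)
    times a Gaussian angular spread (orientation part).
    sigma_on_f = 0.75 corresponds to roughly a one-octave bandwidth."""
    # Normalised frequency coordinates in [-0.5, 0.5)
    u = (np.arange(cols) - cols // 2) / cols
    v = (np.arange(rows) - rows // 2) / rows
    U, V = np.meshgrid(u, v)
    radius = np.sqrt(U ** 2 + V ** 2)
    radius[rows // 2, cols // 2] = 1.0            # avoid log(0) at DC
    theta = np.arctan2(-V, U)

    bank = []
    for s in range(n_scales):
        f0 = 1.0 / (min_wavelength * mult ** s)   # centre frequency of scale s
        radial = np.exp(-(np.log(radius / f0)) ** 2 /
                        (2.0 * np.log(sigma_on_f) ** 2))
        radial[rows // 2, cols // 2] = 0.0        # zero DC component
        for o in range(n_orient):
            angle0 = o * np.pi / n_orient
            # angular distance wrapped to (-pi, pi]
            d_theta = np.arctan2(np.sin(theta - angle0),
                                 np.cos(theta - angle0))
            angular = np.exp(-d_theta ** 2 /
                             (2.0 * (np.pi / n_orient / 1.2) ** 2))
            bank.append(radial * angular)
    return bank
```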
By observing the first row of Fig. 2, one can conclude that the application of traditional techniques, such as Canny filtering, to such a complicated scene results in poor localization of the detected edges, compared to the results obtained with phase congruency methods, shown in the third row of Fig. 2. The second row of Fig. 2 shows the detected candidate SIFT keypoints at a specific level of the constructed scale-space pyramid. In this case, it is evident that the difference-of-Gaussians operator, which is based on gradient measurements, emphasizes edge features, including those with low contrast. Such low-contrast features are excluded from the final SIFT features as non-distinctive. The apparent complexity of the scene produced a large number of these features; as a result, the matching process has to rely on fewer candidate points.
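As an illustration of this step, the sketch below builds a small difference-of-Gaussians stack, detects its local extrema and discards those whose response falls below a contrast threshold; the threshold value and helper names are assumptions for illustration, not the parameters used in the experiments.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter

def dog_keypoints(image, sigma0=1.6, k=2 ** 0.5, n_levels=5,
                  contrast_thresh=0.03):
    """Detect candidate keypoints as local extrema of a
    difference-of-Gaussians (DoG) stack and reject low-contrast ones.
    The contrast threshold (0.03 on a [0, 1] image) is a typical value,
    assumed here for illustration."""
    img = image.astype(np.float64)
    img = (img - img.min()) / (np.ptp(img) + 1e-12)

    # Gaussian stack within one octave, then successive differences
    blurred = [gaussian_filter(img, sigma0 * k ** i) for i in range(n_levels)]
    dog = np.stack([blurred[i + 1] - blurred[i] for i in range(n_levels - 1)])

    # Local 3-D extrema over (scale, row, col)
    is_max = dog == maximum_filter(dog, size=3)
    is_min = dog == minimum_filter(dog, size=3)
    extrema = is_max | is_min

    # Low-contrast rejection: weak DoG responses are dropped,
    # exactly the kind of candidate discarded as non-distinctive.
    keep = extrema & (np.abs(dog) > contrast_thresh)
    scales, rows, cols = np.nonzero(keep)
    return list(zip(rows, cols, scales))
```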
The next stage in depth estimation is the matching of corresponding points between successive images, a problem well known to be ill-conditioned in low-level vision. The quality of the solution to the matching problem has a direct impact on the quality of the scene reconstruction. The matching process was again carried out with three methods, for comparison: (a) The first approach used Canny filtering for feature detection and a typical intensity-based correlation method for the matching process.
(b) The second approach consisted of SIFT keypoint detection, based on image gradient magnitude and orientation measurements, the construction of invariant keypoint descriptors for each image of the stereo pair and, finally, a matching process based on the correspondence of these descriptors through Euclidean distance measurements. (c) Finally, the third approach followed in this paper was based on the application of monogenic filters, as described in Section 2.2. The single, known direction of camera motion indicated the direction along which candidate matches move on the image plane, and rectification additionally places corresponding points on the same scanline. Based on these observations, the search area for image correspondences was drastically reduced, yielding considerably more reliable matching points.
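For illustration, the descriptor matching step of approach (b) can be sketched as follows; the nearest/second-nearest ratio test with a 0.8 threshold is a common convention and an assumption here, not a value reported in this work.

```python
import numpy as np

def match_descriptors(desc1, desc2, ratio=0.8):
    """Match two sets of SIFT-like descriptors (N1 x 128 and N2 x 128)
    by Euclidean distance, keeping only matches that pass the usual
    nearest/second-nearest ratio test (0.8 assumed for illustration)."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)   # Euclidean distances
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, best))
    return matches
```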
During our experiments, we observed that the matching of isolated points created many problems, especially when an intensity-based correlation method was applied. This is because the photographs depict outdoor scenes, are highly complex, and were taken under uncontrolled lighting and illumination conditions, so the intensity values at specific points carry considerable uncertainty. Variations in shading (in one or more photographs), repeated patterns in the images and uniform texture all result in very similar intensity values for certain pixel neighborhoods, leading to a large number of candidate points for matching. Our first attempt to overcome these uncertainty problems was to apply a correlation measure to whole geometric primitives, such as lines, excluding isolated points from the correlation process.
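The intensity-based correlation referred to above is typically a zero-mean normalized cross-correlation (NCC) over a small window; a minimal sketch of such a measure follows, with the implementation details assumed. On repeated patterns or uniform texture, many candidate windows obtain nearly identical NCC scores, which is exactly the ambiguity described above.

```python
import numpy as np

def ncc(patch_a, patch_b, eps=1e-12):
    """Zero-mean normalized cross-correlation of two equally sized
    image patches; returns a value in [-1, 1]."""
    a = patch_a.astype(np.float64) - patch_a.mean()
    b = patch_b.astype(np.float64) - patch_b.mean()
    return float((a * b).sum() /
                 (np.sqrt((a ** 2).sum() * (b ** 2).sum()) + eps))
```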
Prior knowledge of the camera motion was used to search for probable corresponding lines in the direction opposite to that of the camera motion. The search area was essentially restricted to a horizontal band of the image, due to the known horizontal motion of the camera. The search window was chosen with a width (horizontal dimension) of almost half the image width and a height (vertical dimension) of a few pixels. However, line characteristics such as length and direction may deviate considerably between corresponding images; it is possible, for instance, for a line detected in one image to break into two or more parts in the other image. We therefore decided to use point correlation techniques on these candidate lines instead. This approach indeed improved the obtained results. The validity of each match was then confirmed in the neighborhood of each point by means of classic relaxation methods, as in (Faugeras, 1993).
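The constrained search can be sketched as follows: for a point (x, y) in the first image, candidates are examined only on (roughly) the same scanline, within about half the image width, and only on the side consistent with the camera motion. The window size, band height and assumed direction of point displacement are illustrative choices, not necessarily those used in the experiments.

```python
import numpy as np

def match_on_scanline(left, right, x, y, win=7, band=2, max_search=None):
    """For a point (x, y) in the left image, search for its best match in
    the right image only along the (rectified) scanline y, plus or minus
    'band' pixels, over roughly half the image width, and only towards one
    side, assuming here that camera motion shifts points to the left in
    the second image. Assumes (x, y) lies at least win // 2 pixels from
    the image border."""
    h, w = left.shape
    r = win // 2
    if max_search is None:
        max_search = w // 2                       # about half the image width
    ref = left[y - r:y + r + 1, x - r:x + r + 1].astype(np.float64)

    best_score, best_pos = -np.inf, None
    for dy in range(-band, band + 1):             # a few pixels vertically
        yy = y + dy
        if yy < r or yy >= h - r:
            continue
        for xx in range(max(r, x - max_search), min(x + 1, w - r)):
            cand = right[yy - r:yy + r + 1, xx - r:xx + r + 1].astype(np.float64)
            # Zero-mean normalized cross-correlation as similarity score
            score = np.corrcoef(ref.ravel(), cand.ravel())[0, 1]
            if score > best_score:
                best_score, best_pos = score, (xx, yy)
    return best_pos, best_score
```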
3.2 3D Reconstruction Results
Our main goal in this study was to evaluate the perfor-
mance of the phase domain methods, in comparison
to the classical (intensity-based) filtering techniques
and the SIFT local keypoint descriptors in a stereo
matching and 3D calibrated reconstruction problem.
The feature matching process yielded pairs of matching points in the two images, so that estimation of the fundamental matrix became feasible. The calibration matrix was recovered using Zhang's method, as briefly described in Section 2.3. The projection matrices of the camera in both configurations were then computed through the essential matrix, according to the well-known relation introduced by Hartley and Zisserman (Hartley and Zisserman, 2000). The metric information of the scene was recovered using the camera calibration matrix.
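For completeness, this reconstruction chain can be sketched with OpenCV as a stand-in for the actual implementation; the matched point arrays and the calibration matrix K (from Zhang's method) are assumed to be available, and the pose is recovered through the standard relation P = K[I | 0], P' = K[R | t] of Hartley and Zisserman.

```python
import numpy as np
import cv2

def reconstruct(pts1, pts2, K):
    """Metric reconstruction from matched points of a calibrated stereo
    pair. pts1, pts2 are N x 2 arrays of corresponding image points and K
    is the 3 x 3 calibration matrix (here assumed to come from Zhang's
    method). A sketch of the standard pipeline, not the exact code used
    in this work."""
    pts1 = np.asarray(pts1, dtype=np.float64)
    pts2 = np.asarray(pts2, dtype=np.float64)

    # Fundamental matrix from the correspondences (RANSAC for robustness)
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)
    inl1 = pts1[mask.ravel() == 1]
    inl2 = pts2[mask.ravel() == 1]

    # Essential matrix from F and the calibration matrix: E = K^T F K
    E = K.T @ F @ K

    # Relative pose (R, t) of the second camera with respect to the first
    _, R, t, _ = cv2.recoverPose(E, inl1, inl2, K)

    # Projection matrices P = K[I | 0] and P' = K[R | t] (Hartley & Zisserman)
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])

    # Triangulate and convert from homogeneous to Euclidean coordinates
    X_h = cv2.triangulatePoints(P1, P2, inl1.T, inl2.T)
    return (X_h[:3] / X_h[3]).T
```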