DEVELOPMENT OF A COMPUTER PLATFORM FOR OBJECT
3D RECONSTRUCTION USING COMPUTER VISION
TECHNIQUES
Teresa Azevedo
INEGI – Inst. de Eng. Mecânica e Gestão Industrial, LOME – Lab. Óptica e Mecânica Experimental
FEUP – Faculdade de Engenharia da Universidade do Porto
Rua Dr. Roberto Frias s/n, 4200-465 Porto, Portugal
João Manuel R. S. Tavares, Mário A. Vaz
INEGI / LOME, FEUP
Rua Dr. Roberto Frias s/n, 4200-465 Porto, Portugal
Keywords: Computer Vision, Active Vision, 3D Reconstruction, Structure from Motion.
Abstract: In this paper we describe the development of a Computer Platform whose goal is to recover the
three-dimensional (3D) structure of a scene or the shape of an object, using Structure From Motion (SFM)
techniques. SFM is an Active Computer Vision technique that does not require contact or energy projection.
The main objective of this project is to recover the 3D shape of an object or scene using the movement of the
camera(s) or of the object, without imposing any restrictions on it. Starting with an uncalibrated sequence
of images, that movement is extracted, as well as the camera(s) parameters, and finally the 3D
geometry of the object or scene is inferred. Briefly, the paper first defines the goals, then presents the
computer platform and some experimental results, and finally draws the conclusions of the study and work
done and gives some perspectives of future work.
1 INTRODUCTION
Computer Vision continuously seeks to develop
theories and methods for the automatic extraction of
useful information from images, in a way as similar
as possible to the complex human visual system.
Contactless techniques to recover the 3D
geometry of an object are usually divided into two
classes: active techniques, which require some kind of
energy projection or the movement of the camera or
of the object to obtain information about the object's
shape, and passive techniques, which use only ambient
light and therefore usually make the extraction of 3D
information more difficult.
The main goal of this work is to obtain 3D
models of objects using a Structure From Motion
(SFM) methodology, which is an Active Vision
technique (Pollefeys, 1998). Over time, the SFM
technique has received several contributions and
diverse approaches. In the present case, we do not
want to impose any restrictions on the
movement involved: starting from an uncalibrated
image sequence of an object, we intend to extract
that movement (of the camera(s) or of the object), to
calibrate the camera(s) used, and finally to obtain the
3D geometry of the object in question.
To help accomplish our goals, a modular
computer platform with a graphical interface has
been developed, into which functions from
several public-domain libraries are being
integrated.
The functions already integrated cover several
Computer Vision techniques, such as feature point
extraction and matching between images, epipolar
geometry determination, rectification and dense
matching (Azevedo, 2005). These techniques are
commonly used to extract the 3D shape of objects
from an image sequence (Figure 1).
Azevedo T., Manuel R. S. Tavares J. and A. Vaz M. (2006).
DEVELOPMENT OF A COMPUTER PLATFORM FOR OBJECT 3D RECONSTRUCTION USING COMPUTER VISION TECHNIQUES.
In Proceedings of the First International Conference on Computer Vision Theory and Applications, pages 383-388
DOI: 10.5220/0001360003830388
Copyright © SciTePress
Figure 1: Adopted methodology for 3D reconstruction
of objects from a sequence of images.
2 COMPUTER PLATFORM
The computer platform is being developed in
C++ with Microsoft Visual Studio, using the
MFC (Microsoft Foundation Classes) libraries. It
has a graphical interface and a modular structure,
allowing the user to compare the
performance of each function.
Several functions for 3D reconstruction are
already available, integrated from five software
programs and one computational library, all open
source (for more information about them, see
(Azevedo, 2005)). From here on, these will be
referred to generically as Programs 1 to 6:
1. Peter’s Matlab Functions for Computer Vision
and Image Analysis (Kovesi, 2004): Matlab
functions for image processing and analysis and
3D Vision;
2. Torr’s Matlab Toolkit (Torr, 2002): Matlab
software with a graphical interface, that applies
some SFM techniques between two images;
3. OpenCV (2004): library of C++ functions that
implement the most common algorithms in the
Computer Vision domain;
4. Kanade-Lucas-Tomasi (KLT) Feature Tracker
(Birchfield, 2004): functions in C that implement
the KLT algorithm for feature points extraction
and matching in sequences of images;
5. Projective Rectification without Epipolar
Geometry (Isgrò, 1999): functions in C that
perform rectification of stereo image pairs,
without previous determination of the epipolar
geometry;
6. Depth Discontinuities by Pixel-to-Pixel Stereo
(Birchfield, 1999): C program that returns
disparity and discontinuity maps between two
rectified images.
In order to integrate them conveniently into the
computer platform, programs originally written in
Matlab were ported to C, using the Matlab Compiler
toolbox.
In the platform, for each
available Active Vision technique, the user can
choose the algorithm to use (Figure 2) and
define its parameters (Figure 3).
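The per-step choice of algorithm can be pictured as a small lookup table. The sketch below is only an illustration in Python, not the platform's C++/MFC code: the table contents follow what Section 3 reports for each step, while the names `PROGRAMS`, `STEP_PROGRAMS` and `choices_for` are hypothetical.

```python
# Illustrative sketch only; the real platform is written in C++ with MFC.
# Names below (PROGRAMS, STEP_PROGRAMS, choices_for) are hypothetical.

# Programs 1 to 6 as numbered in this section.
PROGRAMS = {
    1: "Peter's Matlab Functions",
    2: "Torr's Matlab Toolkit",
    3: "OpenCV",
    4: "KLT Feature Tracker",
    5: "Projective Rectification without Epipolar Geometry",
    6: "Pixel-to-Pixel Stereo",
}

# Which programs offer each step, as reported in Section 3.
STEP_PROGRAMS = {
    "feature points detection": [1, 2, 3, 4],
    "matching": [1, 2, 3, 4],
    "epipolar geometry": [2, 3],
    "rectification": [5],
    "disparity map": [3, 6],
}


def choices_for(step):
    """Return the names of the programs a user could pick for a step."""
    return [PROGRAMS[number] for number in STEP_PROGRAMS[step]]


if __name__ == "__main__":
    for step in STEP_PROGRAMS:
        print(step, "->", ", ".join(choices_for(step)))
```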
Figure 2: For each platform menu item,
several algorithms are available.
Figure 3: Example of setting the parameters for
the OpenCV matching algorithm.
VISAPP 2006 - MOTION, TRACKING AND STEREO VISION
3 EXPERIMENTAL RESULTS
For the experimental results, stereo images were used,
with dimensions of 540 × 612 pixels, obtained from
several real objects and captured with an off-the-shelf
digital camera.
Some of the experimental results obtained are
presented here following the sequence indicated in
Figure 1.
3.1 Feature Points Detection
Feature points, or interest points, are those whose
intensity values differ markedly from those of
their neighbours. Usually, these
points represent vertices of objects, and their
detection allows subsequent matching between the
images of the sequence.
Only programs 1 to 4 have algorithms for feature
point detection. As an example, Figure 4 shows
some results obtained with these programs, using
one of the test images. It is possible to observe that,
in this case, program 2 presents the weakest results,
mainly because some areas have a high
density of detected feature points. This happens
because this technique does not enforce a minimum
distance between detected feature points, as
programs 3 and 4 do.
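The effect of a minimum-distance rule can be shown with a short sketch. It assumes candidate points already carry a detector response ("strength"), and it uses a simple greedy selection; this is a generic illustration, not the code of programs 3 or 4, and the function name and data layout are hypothetical.

```python
def select_with_min_distance(candidates, min_dist):
    """Greedily keep the strongest points that are at least min_dist apart.

    candidates: list of (x, y, strength) tuples (hypothetical layout).
    Returns the kept points, strongest first.
    """
    kept = []
    # Visit candidates from the strongest response to the weakest.
    for x, y, strength in sorted(candidates, key=lambda c: c[2], reverse=True):
        far_enough = all(
            (x - kx) ** 2 + (y - ky) ** 2 >= min_dist ** 2
            for kx, ky, _ in kept
        )
        if far_enough:
            kept.append((x, y, strength))
    return kept


if __name__ == "__main__":
    points = [(10, 10, 0.9), (11, 10, 0.8), (12, 11, 0.7), (40, 40, 0.5)]
    # The three clustered points collapse to the strongest one.
    print(select_with_min_distance(points, min_dist=5))
```

Without such a rule, all three clustered points would be reported, which is the kind of dense grouping described for program 2.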
3.2 Matching
Matching is the association, between
sequential images, of the 2D points that are projections of the
same 3D object point. Automatic detection of
matching points between images can be achieved
using several matching measures (Azevedo, 2005).
Again, only programs 1 to 4 have algorithms for
feature point matching. Experimental results
obtained using the matching techniques integrated
into the computer platform are shown in Figure 5.
It is possible to observe that matching is a critical
stage in 3D Vision. For all techniques there are
correlation mismatches, and program 2 is again
the one that presents the weakest results.
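As an illustration of a matching measure, the sketch below uses normalised cross-correlation (NCC) between flattened intensity patches. NCC is one common choice and is an assumption here: the text does not state which measure each program applies. The function names are hypothetical.

```python
import math


def ncc(patch_a, patch_b):
    """Normalised cross-correlation of two equally sized intensity patches.

    Patches are flat lists of grey levels. The score lies in [-1, 1];
    0.0 is returned when either patch has no intensity variation.
    """
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    dev_a = [v - mean_a for v in patch_a]
    dev_b = [v - mean_b for v in patch_b]
    denom = math.sqrt(sum(v * v for v in dev_a) * sum(v * v for v in dev_b))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dev_a, dev_b)) / denom


def best_match(patch, candidate_patches):
    """Return (index, score) of the candidate most correlated with patch."""
    scores = [ncc(patch, candidate) for candidate in candidate_patches]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    reference = [10, 20, 30, 40]
    candidates = [[40, 30, 20, 10], [15, 25, 35, 45], [10, 10, 10, 10]]
    print(best_match(reference, candidates))
```

Because the score ignores a constant brightness offset, the second candidate matches the reference even though its grey levels are shifted. A mismatch arises when an unrelated patch happens to correlate more strongly than the true one.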
3.3 Epipolar Geometry
Epipolar geometry corresponds to the geometrical
structure between two distinct points of view and
is expressed mathematically by the fundamental
matrix $F$, of size $3 \times 3$: $m'^{T} F m = 0$, where $m$ and
$m'$ are the matching points.
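A small numerical check makes the constraint concrete. The sketch assumes points in homogeneous pixel coordinates (x, y, 1) and uses, purely as an example, the fundamental matrix of an ideal rectified pair (pure horizontal translation), for which the constraint reduces to equal row coordinates. The function name and the example matrix are illustrative and are not taken from the platform.

```python
def epipolar_residual(F, m, m_prime):
    """Evaluate m'^T F m for homogeneous points m and m' (3-element tuples)."""
    # F m is the epipolar line of m in the second image.
    line = [sum(F[r][c] * m[c] for c in range(3)) for r in range(3)]
    return sum(m_prime[r] * line[r] for r in range(3))


# Example F for an ideal rectified pair (assumption, for illustration only):
# the skew-symmetric matrix of the translation direction (1, 0, 0).
F_RECTIFIED = [
    [0, 0, 0],
    [0, 0, -1],
    [0, 1, 0],
]

if __name__ == "__main__":
    # Same row in both images: the constraint holds.
    print(epipolar_residual(F_RECTIFIED, (120, 80, 1), (95, 80, 1)))
    # Rows differ by 3 pixels: the residual is non-zero.
    print(epipolar_residual(F_RECTIFIED, (120, 80, 1), (95, 83, 1)))
```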
Figure 4: Results of feature points detection in one
image out of the stereo pair: red crosses are the
detected interesting points (top to bottom:
results from programs 1 to 4).
Figure 5: Red crosses are the matched interesting
points in the other image of the stereo pair (top
to bottom: results from programs 1 to 4).
Determining the epipolar geometry means
obtaining relative pose information between two
different views (images) of an object. That
information also allows some earlier
wrong matches (outliers) to be eliminated, and makes it easier
to obtain new matching points (dense matching).
Only programs 2 and 3 have algorithms for this
purpose. Some experimental results obtained by the
techniques integrated in our computer platform are
given in Figure 6. Because of the weak results
obtained in the prior step and presented in the
previous subsection, it was not possible to determine
the epipolar geometry using program 2, since
the number of outliers was considerably higher than
the number of inliers. Program 3 (OpenCV) performs this
step very well, although good matching points
remain essential to estimate the
epipolar geometry correctly.
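The inlier/outlier distinction can be sketched as a distance test against the epipolar line. This is only the consensus check that robust estimators build on, not a full RANSAC or LMedS implementation. The fundamental matrix is taken as given, the threshold of one pixel is an arbitrary assumption, and all names are hypothetical.

```python
import math


def epipolar_distance(F, m, m_prime):
    """Distance in pixels from m' to the epipolar line F m.

    m and m' are homogeneous points (x, y, 1) in the first and second image.
    """
    a, b, c = (sum(F[r][k] * m[k] for k in range(3)) for r in range(3))
    norm = math.hypot(a, b)
    if norm == 0:
        return float("inf")  # degenerate line, treat as inconsistent
    return abs(a * m_prime[0] + b * m_prime[1] + c * m_prime[2]) / norm


def split_inliers(F, matches, threshold=1.0):
    """Split (m, m') pairs into inliers and outliers for a given F."""
    inliers, outliers = [], []
    for m, m_prime in matches:
        if epipolar_distance(F, m, m_prime) <= threshold:
            inliers.append((m, m_prime))
        else:
            outliers.append((m, m_prime))
    return inliers, outliers


if __name__ == "__main__":
    # Example F of an ideal rectified pair (assumption, illustration only).
    F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
    matches = [
        ((120, 80, 1), (95, 80.4, 1)),    # 0.4 px from its epipolar line
        ((200, 150, 1), (170, 162, 1)),   # 12 px away: a wrong match
    ]
    good, bad = split_inliers(F, matches)
    print(len(good), "inlier(s),", len(bad), "outlier(s)")
```

When wrong matches dominate, as reported for program 2, too few pairs agree with any candidate matrix and the estimation fails.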
Figure 6: Epipolar lines (green) and inliers (blue) in the
first image of the stereo pair, after epipolar geometry
determination; top to bottom: program 1's result using
the RANSAC algorithm (Fischler, 1981), program 1's
result using the LMedS algorithm.
3.4 Rectification
The operation of transforming two stereo images
so that they become coplanar is usually known as
rectification. Performing this step makes dense
matching much easier to obtain.
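Rectification helps because the search for a match collapses to a single image row. The sketch below runs a one-dimensional block-matching search with a sum of absolute differences (SAD) cost on one pair of rectified rows. It is a generic illustration with hypothetical names and synthetic data, not the pixel-to-pixel algorithm of program 6.

```python
def scanline_disparity(left_row, right_row, max_disp, half_win=1):
    """Per-pixel disparity along one rectified row using a SAD window.

    A left pixel at column x is compared with right pixels at x - d,
    for d in 0..max_disp. Border pixels keep disparity 0.
    """
    n = len(left_row)
    disparities = [0] * n
    for x in range(half_win, n - half_win):
        best_cost, best_d = None, 0
        for d in range(max_disp + 1):
            if x - d - half_win < 0:
                break  # window would leave the right image
            cost = sum(
                abs(left_row[x + k] - right_row[x - d + k])
                for k in range(-half_win, half_win + 1)
            )
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        disparities[x] = best_d
    return disparities


if __name__ == "__main__":
    # Synthetic rows: the left row is the right row shifted by 2 pixels.
    right = [10, 50, 20, 90, 30, 70, 40, 60, 15, 85]
    left = [0, 0] + right[:-2]
    print(scanline_disparity(left, right, max_disp=3))
```

Without rectification the same search would have to follow a slanted epipolar line for every pixel.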
Program 5 is the only one that performs
rectification, and it does so without previous determination of the
epipolar geometry. The results from this
program are shown in Figure 7.
It is possible to observe that the quality of the
results is proportional to the quality of the epipolar
geometry determination.
Figure 7: Rectification result obtained with program 5.
3.5 Disparity Map
A disparity map encodes the distance between the
object and the camera(s): closer points have
maximal disparity and farther points have zero
disparity. In short, a disparity map gives some
perception of discontinuity in terms of depth.
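The inverse relation between disparity and distance can be written out for the idealised case of two parallel pinhole cameras, where depth Z = f·B/d for focal length f (in pixels), baseline B and disparity d (in pixels). This standard relation is an assumption added for illustration: the paper does not state it, and the numeric values below are made up.

```python
def depth_from_disparity(focal_px, baseline, disparity_px):
    """Depth Z = f * B / d for an ideal rectified stereo pair.

    focal_px: focal length in pixels; baseline: camera separation
    (the result uses the same unit); disparity_px: disparity in pixels.
    Zero disparity corresponds to a point at infinity.
    """
    if disparity_px == 0:
        return float("inf")
    return focal_px * baseline / disparity_px


if __name__ == "__main__":
    # Hypothetical camera: f = 800 px, baseline = 0.5 m.
    for d in (40, 20, 10, 0):
        print(d, "px ->", depth_from_disparity(800, 0.5, d), "m")
```

Halving the disparity doubles the estimated depth, which is why distant points collapse towards zero disparity.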
Only programs 3 and 6 perform this operation,
with the difference that program 6 also returns a
discontinuity map.
This step was tested with already rectified
images included in the program 6 package
(Figure 8). The results obtained are presented in
Figure 9 and Figure 10. Given two rectified images,
it is possible to observe that both programs 3 and 6
perform very well in determining the
disparity map.
Figure 8: One of the original images used by
programs 3 and 6 to obtain the disparity map
(associated to dense matching).
Figure 9: Disparity map, obtained by program 3.
Figure 10: Disparity (top) and discontinuity
(bottom) maps, obtained by program 6.
4 CONCLUSIONS
In this work, the development of a computer
platform was initiated, with an appropriate
graphical interface and a modular design, to allow
the application and study of several 3D Active Vision
techniques, with the final goal of 3D object
reconstruction. In short, the functions already
integrated in the computer platform and
experimentally analysed obtain good results when
applied to objects with strong features. From
the same results, it is possible to conclude that
low-quality results are closely tied to the
detection and matching of strong points, since the
functions in the later steps of the adopted 3D
reconstruction methodology (Figure 1) are
based on those points.
5 FUTURE WORK
The next steps of this work will focus on improving
the results obtained when the objects to be
reconstructed have smooth and continuous surfaces.
To do so, the approach will be:
o inclusion of space carving techniques for object
reconstruction (see, for example, (Kutulakos, 1998),
(Sainz, 2002), (Montenegro, 2004));
o detection of the strong points used to define the
object in 3D space by means of a
small number of markers placed on the object;
o inclusion of a camera calibration technique, as
well as pose and motion estimation algorithms;
some of the techniques to consider are (Meng,
2000) and (Zhang, 2000).
Finally, the computer platform will be used in the
3D reconstruction and characterization of
external human shapes.
ACKNOWLEDGMENTS
This work was partially done in the scope of the
project "Segmentation, Tracking and Motion
Analysis of Deformable (2D/3D) Objects using
Physical Principles", reference POSC/EEA-
SRI/55386/2004, financially supported by FCT -
Fundação de Ciência e Tecnologia in Portugal.
REFERENCES
M. Pollefeys, R. Koch, M. Vergauwen, L. V. Gool,
Flexible acquisition of 3D structure from motion,
Proceedings of the IEEE Workshop on Image and
Multidimensional Digital Signal Processing, Alpbach,
Austria, pp. 195-198, 1998.
T. Azevedo, J. M. R. S. Tavares, M. Vaz, Obtenção da
Forma 3D de Objectos usando Metodologias de
Reconstrução de Estruturas a partir do Movimento,
Congreso de Métodos Numéricos en Ingeniería,
Granada, Espanha, 2005.
P. Kovesi, MATLAB Functions for Computer Vision and
Image Analysis,
http://www.csse.uwa.edu.au/~pk/Research/MatlabFns,
2004.
P. H. S. Torr, A Structure and Motion Toolkit in Matlab,
http://wwwcms.brookes.ac.uk/~philiptorr/Beta/torrsam.zip, 2002.
OpenCV - Open Computer Vision Library, beta 4,
http://sourceforge.net/projects/opencvlibrary, 2004.
S. Birchfield, KLT: An Implementation of the Kanade-
Lucas-Tomasi Feature Tracker,
http://www.ces.clemson.edu/~stb/klt/index.html, 2004.
F. Isgrò, E. Trucco, Projective rectification without
epipolar geometry, Proceedings of the IEEE
International Conference on Computer Vision and
Pattern Recognition, Fort Collins (Colorado), USA,
vol. 1, pp. 94-99, 1999.
S. Birchfield, Depth Discontinuities by Pixel-to-Pixel
Stereo, International Journal of Computer Vision, vol.
35, no. 3, pp. 269-293,
http://vision.stanford.edu/~birch/p2p/, 1999.
M. A. Fischler, R. Bolles, Random Sample Consensus:
a paradigm for model fitting with applications to image
analysis and automated cartography, Communications
of the Association for Computing Machinery, vol. 24,
no. 6, pp. 381-395, 1981.
K. N. Kutulakos, S. M. Seitz, A Theory of Shape by Space
Carving, Technical Report TR692, Computer Science
Department, University of Rochester, New York,
USA, 1998.
M. Sainz, N. Bagherzadeh, A. Susin, Carving 3D Models
from Uncalibrated Views, Proceedings of the 5th
IASTED International Conference Computer Graphics
and Imaging, Hawaii, USA, pp. 144-149, 2002.
A. Montenegro, P. Carvalho, M. Gattass, L. Velho, Space
Carving with a Hand-Held Camera, Proceedings of
the SIBGRAPI'2004 - International Symposium on
Computer Graphics, Image Processing and Vision,
Curitiba, Brasil, 2004.
X. Meng, H. Li, Z. Hu, A New Easy Camera Calibration
Technique Based on Circular Points, Proceedings of
the British Machine Vision Conference, University of
Bristol, UK, pp. 496-501, 2000.
Z. Zhang, A Flexible New Technique for Camera
Calibration, IEEE Transactions on Pattern Analysis
and Machine Intelligence, vol. 22, no. 11, pp. 1330-
1334, 2000.