Genetic Algorithm for Stereo Correspondence with a Novel Fitness Function and Occlusion Handling

Alvaro Arranz, Manuel Alvar, Jaime Boal, Alvaro Sanchez-Miralles and Arturo de la Escalera

Institute for Research in Technology (IIT), ICAI School of Engineering, C/ Alberto Aguilera 23, 28015 Madrid, Spain

Intelligent Systems Lab, University Carlos III of Madrid, C/ Butarque 15, 28911 Leganes, Madrid, Spain

Keywords: Stereo Reconstruction, Genetic Algorithm.

Abstract:

This paper proposes a genetic algorithm for solving the stereo correspondence problem. Applied to stereo, genetic algorithms are flexible in the choice of cost function and permit global reasoning. The main contributions of this paper are new crossover and mutation operators that account for occlusion management, and a new fitness function that considers occluded pixels and photometric derivatives. Both left and right disparity images are analysed in order to classify occluded pixels correctly. The proposed fitness function is compared to the traditional energy function based on the Markov Random Field framework. The results show that a 32% bad-pixel error reduction can be achieved on average using the proposed fitness function. The results have been uploaded to the Middlebury ranking webpage, as the first evolutionary algorithm evaluated there.

1 INTRODUCTION

Passive stereo has received a huge amount of attention from the research community over the past two decades. The first algorithms that dealt with stereo correspondence were sparse, feature-based algorithms. Because some applications benefit more from a per-pixel estimate of the disparity for the reference image, dense stereo algorithms started to show their value. New approaches such as global methods, dynamic programming, multi-resolution and cooperative algorithms soon appeared to deal with dense disparity estimation. A taxonomy and evaluation of the most important dense stereo algorithms was proposed in (Alahari et al., 2010). That paper presented a methodology for comparing different stereo algorithms and incorporated a ranking system (Middlebury).

Results shown in (Alahari et al., 2010) suggest that global methods are the most accurate, while local methods are ideal for real-time applications due to their simplicity and parallelizable nature. However, new algorithms that outperform them, such as (Mei et al., 2011), have since been published. Generally speaking, the best algorithms in the Middlebury ranking use some kind of optimization process for global reasoning followed by a refinement process for outliers and occluded pixels.

In this paper, a stereo algorithm using a genetic optimization approach is proposed. The main contributions are the occlusion handling procedure, which is included as part of the genetic algorithm, and the proposed fitness function, which is shown to reduce the number of bad pixels in the solution.

This document is organized as follows. In Section 2, previous genetic algorithms in stereo are briefly reviewed. In Section 3, the proposed genetic algorithm is described in detail. Section 4 presents the experiments carried out as well as a comparison between algorithms. Finally, in Section 5 the main conclusions are drawn.

2 GENETIC ALGORITHMS IN STEREO

Genetic algorithms are a class of evolutionary algorithms that have been widely used as a heuristic search for optimization problems in many different applications. In (Saito and Mori, 1995) a genetic algorithm is used to combine solutions of window-based methods with different window sizes while favouring photo-consistency and smoothness. In order to reduce the size of the problem, the authors also propose dividing the solution into blocks and finding optimal disparity maps block by block. The authors of (Han et al., 2001) use a region extraction algorithm for dividing the image. Their fitness function is made of an intensity similarity term and a smoothing term between regions. A multi-resolution approach is proposed in (Gong and Yang, 2001) and (Gong and Yang, 2002), where a quad-tree structure is used for representing each individual and a Markov Random Field (MRF) based fitness function is used for global reasoning. In (Wang et al., 2003) the whole disparity map is used as the genome representation, with no mutation operation. More recently, (Dai et al., 2008) use adaptive crossover and mutation, while their fitness function does not include any smoothing term. Finally, (Zhang et al., 2009) use a pyramidal propagation stratagem for solution representation and (Nie et al., 2009) implement a stereo correspondence genetic algorithm on the GPU for performance enhancement. Genetic algorithms have also been used for matching sparse features; for instance, in (Issa et al., 2002) a genetic algorithm is employed to match edges.

Arranz A., Alvar M., Boal J., Sanchez-Miralles A. and de la Escalera A. Genetic Algorithm for Stereo Correspondence with a Novel Fitness Function and Occlusion Handling. In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP-2013), pages 294-299. DOI: 10.5220/0004291202940299. ISBN: 978-989-8565-48-8. © 2013 SCITEPRESS (Science and Technology Publications, Lda.)

The use of genetic algorithms in stereo correspondence has some advantages over more traditional methods. Firstly, genetic algorithms can optimize an energy function for global reasoning. In this sense they resemble global methods based on MRFs such as graph-cuts (Kolmogorov and Zabih, 2004). They have the advantage of flexibility, given that practically any fitness function can be used, although getting close to the optimum is not guaranteed. Secondly, unlike most global or local methods, genetic algorithms can provide various local solutions during the optimization process.

Our approach uses some of the ideas found in the literature and proposes new crossover and mutation operators. One of the main contributions of this paper is that a method for occlusion handling is included natively in the genetic algorithm, not only as a refinement process. The disparity estimation is performed on both right and left images. This makes it possible to calculate an occlusion map for both images and to treat occluded and non-occluded pixels differently. The new crossover combines a large number of pixels at the same time while favouring that the child inherits the best regions of both parents. The new mutation operator radically changes the disparity values of some regions to enable large jumps in the solution space. Finally, another contribution is the proposed fitness function, whose minimization effectively reduces the number of bad pixels in the image while taking occlusions into account.

As mentioned before, (Middlebury) has become one of the main resources for the evaluation and comparison of stereo correspondence algorithms. None of the genetic stereo correspondence methods cited above used this standard set of stereo images or compared their results with state-of-the-art stereo algorithms. In this paper a comparison between several stereo methods is shown, and the results have been uploaded to the ranking system in (Middlebury) for future comparison.

3 PROPOSED ALGORITHM

Generally, stereo correspondence has been demonstrated to be an ill-posed, NP-hard problem. Moreover, considering a common image size of 400 by 300 pixels and sixty different disparity labels, the number of possible solutions is completely overwhelming. A naive implementation of a genetic algorithm, with highly random disparity assignments to each pixel, does not perform well because almost no random disparity image even makes sense as an image. Hence, due to the huge search space involved in stereo correspondence, properly guiding the genetic algorithm towards feasible disparity maps is fundamental to making the algorithm computationally tractable.
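To give a feeling for the size of this search space, the short sketch below counts the candidate disparity maps for a single 400 by 300 image with sixty labels. It only assumes that every pixel may take any label independently; the variable names are illustrative.

```python
import math

width, height = 400, 300      # image size quoted in the text
labels = 60                   # number of disparity labels quoted in the text

pixels = width * height       # each pixel takes one of the 60 labels
# labels**pixels is far too large to print, so its number of decimal
# digits is obtained from a logarithm instead.
digits = math.floor(pixels * math.log10(labels)) + 1
print(f"{labels}^{pixels} has {digits} decimal digits")
```

The count has more than two hundred thousand decimal digits, which is why an unguided random search is hopeless.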

In the following subsections the genetic stereo

correspondence algorithm proposed in this paper is

explained in more detail.

3.1 Genome Representation

The most notable genome representations used in the literature are the quad-tree and the disparity map representations. Given that no multi-resolution is used in the method proposed here, and that the disparity map representation makes it easier to compute the fitness function, the whole disparity map representation has been chosen. An important novelty is the inclusion of both left and right disparity images in the genome representation.

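$$
\bar{g} = \begin{bmatrix} \bar{g}_l \\ \bar{g}_r \end{bmatrix}, \qquad
\bar{g}_l = \{X_1^l, \ldots, X_N^l\}, \qquad
\bar{g}_r = \{X_1^r, \ldots, X_N^r\}, \qquad
X_i^l, X_i^r \in \mathcal{L} \tag{1}
$$

where $\bar{g}$ is the genome, $\bar{g}_l$ and $\bar{g}_r$ are the representations of the left and right disparity images respectively, $X_i^l$ and $X_i^r$ are the disparities estimated for pixel $i$ on the left and right disparity images respectively, $N$ is the total number of pixels in each image and $\mathcal{L}$ is the set of labels representing the disparities analysed.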
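The genome of Eq. (1) can be pictured as a pair of equally sized label grids. The following is a minimal sketch, not the authors' implementation: the class name `Genome`, the helper `is_valid` and the convention that labels are the integers 0 to `num_labels - 1` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

DisparityMap = List[List[int]]   # row-major grid, one label per pixel


@dataclass
class Genome:
    """One individual: a left and a right disparity map of equal size."""
    left: DisparityMap
    right: DisparityMap

    def __post_init__(self) -> None:
        shape_l = (len(self.left), len(self.left[0]))
        shape_r = (len(self.right), len(self.right[0]))
        if shape_l != shape_r:
            raise ValueError("left and right maps must have the same size")

    @property
    def num_pixels(self) -> int:
        """N in Eq. (1): number of pixels in each disparity image."""
        return len(self.left) * len(self.left[0])


def is_valid(genome: Genome, num_labels: int) -> bool:
    """Check that every gene is one of the labels 0 .. num_labels - 1."""
    return all(0 <= d < num_labels
               for grid in (genome.left, genome.right)
               for row in grid for d in row)
```

Plain nested lists are used here only to keep the sketch dependency-free; an array type would be the natural choice for real images.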

3.2 Initialization

Some algorithms in the literature use random sampling for their initialization process. The probability of each pixel having a certain disparity value is based on a photo-consistency measurement. Others use the solution of a local window-based algorithm with a random window size. This approach is similar to the one proposed in (Saito and Mori, 1995), with the main difference that the disparity range is not restricted to the range obtained by the local methods.

For the initialization process we have used two different window-based algorithms with different window sizes: the adaptive support-weight approach (Yoon and Kweon, 2006) with random parameters, and a census-based method with window-cost aggregation. For the census transform, a constant window size of 9x7 was used, as suggested in (Mei et al., 2011). During the initialization process, the disparity of each pixel is sampled with a probability proportional to the number of times that disparity value has been proposed for the pixel by the local window-based algorithms. It is important to notice that neither algorithm alone performs well in discontinuities, occluded or untextured areas.
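The sampling rule described above can be sketched as follows, assuming the results of the local methods are available as a list of candidate disparity maps of equal size. The function name `sample_initial_map` is hypothetical. Choosing one candidate map uniformly at random for each pixel picks a disparity value with probability proportional to the number of candidates that proposed it.

```python
import random
from typing import List, Optional

DisparityMap = List[List[int]]


def sample_initial_map(candidates: List[DisparityMap],
                       rng: Optional[random.Random] = None) -> DisparityMap:
    """Draw one disparity map from several local-method results.

    For every pixel one candidate map is chosen uniformly at random, so a
    value proposed by k of the K candidates is selected with probability k/K.
    """
    rng = rng or random.Random()
    height, width = len(candidates[0]), len(candidates[0][0])
    return [[rng.choice(candidates)[y][x] for x in range(width)]
            for y in range(height)]
```

Pixels on which all local methods agree keep that value in every individual, while disputed pixels provide the diversity of the initial population.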

3.3 Fitness Function

As the stereo correspondence problem can be formulated as an MRF, the fitness function is assigned the related energy value of the left disparity image. A classical formulation is given by the following equations:

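$$
E_{classic}(\bar{g}) = E_{data-classic}(\bar{g}_l) + E_{smooth-classic}(\bar{g}_l) \tag{2}
$$

$$
E_{data-classic}(\bar{g}_l) = \sum_{i \in \bar{g}_l} \left| I_l(x_i, y_i) - I_r(x_i - X_i^l, y_i) \right| \tag{3}
$$

$$
E_{smooth-classic}(\bar{g}_l) = \sum_{\{p,q\} \in \mathcal{N}} V_{\{p,q\}}(X_p^l, X_q^l) \tag{4}
$$

where $\bar{g}$ is a given individual, $\bar{g}_l$ is its left disparity image, $I_l$ and $I_r$ are the left and right images of the stereo pair, $x_i$ and $y_i$ are the image coordinates of pixel $i$, $\mathcal{N}$ is the set of neighbouring pixel pairs, and $V_{\{p,q\}}$ is a smoothing function favouring neighbouring pixels that have the same disparity.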

This energy configuration has been widely used in the literature by global stereo algorithms. It is simple and it can be optimized by heuristic algorithms such as graph-cuts (Kolmogorov and Zabih, 2004). As stated in (Kolmogorov and Zabih, 2004), however, not every energy function can be minimized in this way.
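As an illustration of the classic formulation, the sketch below evaluates a data term based on absolute intensity differences plus a truncated linear smoothness term over 4-connected neighbours, the smoothing function used for the classic energy in the experiments. Several details are assumptions rather than the authors' choices: grey-level images stored as nested lists, clamping of matches that fall outside the right image, and the reading of "cost" and "truncation" as `cost * min(|difference|, trunc)`.

```python
from typing import List

Image = List[List[float]]        # grey-level image, row-major
DisparityMap = List[List[int]]


def classic_energy(disp: DisparityMap, left: Image, right: Image,
                   cost: float = 1.0, trunc: float = 10.0) -> float:
    """Data term plus smoothness term for the left disparity map."""
    h, w = len(disp), len(disp[0])
    data = 0.0
    for y in range(h):
        for x in range(w):
            xr = max(x - disp[y][x], 0)        # assumption: clamp at the border
            data += abs(left[y][x] - right[y][xr])
    smooth = 0.0
    for y in range(h):
        for x in range(w):
            # Count each 4-connected neighbour pair once (right and down).
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w:
                    diff = abs(disp[y][x] - disp[ny][nx])
                    smooth += cost * min(diff, trunc)
    return data + smooth
```

Lower values are better, so a genetic algorithm would minimize this quantity or maximize its negative.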

The classic energy function may present some problems in the disparity estimation along discontinuities and occluded areas. Due to the flexible nature of genetic algorithms, other more complicated functions can be chosen as fitness functions. Herein, an energy function that considers discontinuities and occlusions in the energy evaluation is proposed:

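$$
E(\bar{g}) = E_{data}(\bar{g}_l) + E_{smooth}(\bar{g}_l) \tag{5}
$$

$$
E_{data}(\bar{g}_l) = \sum_{i \in \bar{g}_l}
\begin{cases}
\lambda_{occ} & \text{if } i \text{ is occluded} \\
\left| I_l(x_i, y_i) - I_r(x_i - X_i^l, y_i) \right| & \text{otherwise}
\end{cases} \tag{6}
$$

$$
E_{smooth}(\bar{g}_l) = \sum_{\{p,q\} \in \mathcal{N}} V_{\{p,q\}} \left| X_p^l - X_q^l \right| \tag{7}
$$

$$
V_{\{p,q\}} = \max\left(\lambda_s,\; \gamma_s - \left| I_l(p) - I_l(q) \right|\right) \tag{8}
$$

where the $\lambda$, $\gamma$ and $\varphi$ parameters are constants shared by every pixel.

The main modification in the $E_{data}$ term is that occluded pixels, as they have no corresponding pixel in the right image, contribute to the energy with a constant value $\lambda_{occ}$. This enables low-energy configurations in which the occluded pixels are labelled correctly.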

This new smoothness term relates the colour consistency of neighbouring pixels to the weight given to their disparity difference. For neighbouring pixels that are very different in colour, a low weight is assigned to their disparity difference, while neighbouring pixels that are very similar in colour are strongly encouraged to have the same disparity.
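The behaviour of the proposed energy can be sketched in a few lines: occluded pixels pay a constant cost instead of a matching cost, and the penalty on a disparity jump between two neighbours shrinks as their colour difference grows, down to a floor. This is a sketch under stated assumptions, not the authors' code: the parameter names `lam_occ`, `lam_s` and `gamma_s`, the grey-level images, the 4-connected neighbourhood and the border clamping are all illustrative choices, and the parameter values must be supplied by the caller.

```python
from typing import List

Image = List[List[float]]
DisparityMap = List[List[int]]
OcclusionMap = List[List[int]]   # 1 = occluded, 0 = visible


def proposed_energy(disp: DisparityMap, occluded: OcclusionMap,
                    left: Image, right: Image,
                    lam_occ: float, lam_s: float, gamma_s: float) -> float:
    """Occlusion-aware data term plus colour-weighted smoothness term."""
    h, w = len(disp), len(disp[0])
    energy = 0.0
    for y in range(h):
        for x in range(w):
            if occluded[y][x]:
                energy += lam_occ               # constant cost, no matching
            else:
                xr = max(x - disp[y][x], 0)     # assumption: clamp at border
                energy += abs(left[y][x] - right[y][xr])
            # Each 4-connected neighbour pair is visited once (right, down).
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w:
                    colour_diff = abs(left[y][x] - left[ny][nx])
                    weight = max(lam_s, gamma_s - colour_diff)
                    energy += weight * abs(disp[y][x] - disp[ny][nx])
    return energy
```

With a large colour difference the weight falls to `lam_s`, so a depth discontinuity that coincides with a colour edge is cheap, whereas the same jump inside a uniformly coloured area costs close to `gamma_s` per disparity level.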

At this point it is very important to emphasize that the genetic algorithm is not guaranteed to find the optimum energy configuration with either the classic energy function or the proposed one. Moreover, how close it gets to the optimum depends on the fitness function, the crossover and mutation operators, their parameters, the population size, etc. In either case the genetic algorithm will probably get close to the optimum, but the experiments shown in Section 4 suggest that the proposed energy function is more adequate than the classic one for both discontinuity and occlusion management.

3.4 Crossover

Initially, our method employed a crossover algorithm very similar to uniform crossover. For each crossover, a random block size is selected, defining regions on each disparity image $\bar{g}_l$ and $\bar{g}_r$. Then, each parent block is randomly assigned to one of the children.

While this stochastic approach to the crossover operation is inherent to genetic algorithms, some tests with a deterministic crossover were also carried out. A new crossover was defined: instead of assigning the blocks to the children randomly, it first evaluates the fitness function on each parent block and then puts the blocks with the best fitness into the same child. This approach runs against the stochastic nature of the genetic algorithm and might lead to getting stuck in local minima. However, after testing both approaches, the deterministic crossover achieved much better fitness values than the stochastic one, so it was used in our final tests. This is also suggested in (Wang et al., 2003).
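The deterministic crossover can be sketched for a single disparity map as follows; in the full genome it would be applied to both the left and the right map. The function name and the `block_energy` callback, which returns the energy of a rectangular region of a map, are hypothetical, and the block size is expected to be drawn at random by the caller.

```python
from typing import Callable, List, Tuple

DisparityMap = List[List[int]]
# Hypothetical signature: energy of the region [y0, y1) x [x0, x1) of a map.
BlockEnergy = Callable[[DisparityMap, int, int, int, int], float]


def deterministic_crossover(a: DisparityMap, b: DisparityMap, block: int,
                            block_energy: BlockEnergy
                            ) -> Tuple[DisparityMap, DisparityMap]:
    """Return (best, rest); 'best' collects the lower-energy block of each pair."""
    h, w = len(a), len(a[0])
    best = [row[:] for row in a]
    rest = [row[:] for row in b]
    for y0 in range(0, h, block):
        for x0 in range(0, w, block):
            y1, x1 = min(y0 + block, h), min(x0 + block, w)
            if block_energy(b, y0, x0, y1, x1) < block_energy(a, y0, x0, y1, x1):
                # Parent b wins this block: swap it into the 'best' child.
                for y in range(y0, y1):
                    best[y][x0:x1] = b[y][x0:x1]
                    rest[y][x0:x1] = a[y][x0:x1]
    return best, rest
```

Evaluating blocks independently ignores smoothness interactions across block borders, which is one reason such a greedy recombination can get trapped in local minima.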

3.5 Mutation

Three different mutation operations that may be applied to each individual have been defined. Firstly, one possible mutation is to re-initialize some pixels of either the left or the right image following the steps explained in Section 3.2. That is, the disparity of the pixel is changed stochastically with a probability proportional to those suggested by local methods. This mutation operation is applied with a probability $P_{init}$.

Secondly, a median filter with a random window size is also applied as a mutation. It is not novel, but it is sometimes effective for removing sparse outliers. This median filter operation is performed with a probability $P_{median}$.

Finally, occlusion detection and handling is also included as a mutation with probability $P_{occ}$. This process is a two-step operation: occlusion detection followed by occlusion management. Given that both left and right disparity images are being estimated by our algorithm, the right image disparities can be used to estimate which pixels of the left image cannot have a match, and vice versa. The left occlusion map is defined as:

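$$
O_l(p) =
\begin{cases}
0 & \text{if } \exists\, i \text{ such that }
\begin{bmatrix} x(i) + \bar{g}_r(i) \\ y(i) \end{bmatrix} =
\begin{bmatrix} x(p) \\ y(p) \end{bmatrix}, \quad p, i \in P \\
1 & \text{otherwise}
\end{cases} \tag{9}
$$

where $O_l$ is the left occlusion map, $\bar{g}_r(i)$ is the disparity of point $i$ in the right disparity image, $x(p)$ and $y(p)$ are the x and y coordinates of point $p$ respectively, and $P$ is the set of disparity image points. A value of 1 therefore marks a pixel that no pixel of the other image maps onto, i.e. an occluded pixel.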

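Similarly, the right occlusion map is:

$$
O_r(p) =
\begin{cases}
0 & \text{if } \exists\, i \text{ such that }
\begin{bmatrix} x(i) - \bar{g}_l(i) \\ y(i) \end{bmatrix} =
\begin{bmatrix} x(p) \\ y(p) \end{bmatrix}, \quad p, i \in P \\
1 & \text{otherwise}
\end{cases} \tag{10}
$$

where $O_r$ is the right occlusion map and $\bar{g}_l(i)$ is the disparity of point $i$ in the left disparity image.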
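Both occlusion maps follow the same pattern: every pixel of one view is projected into the other view along its row, and any pixel that receives no projection is marked as occluded. A minimal sketch, assuming integer disparities stored as nested lists; the function name and the `sign` argument are illustrative.

```python
from typing import List

DisparityMap = List[List[int]]
OcclusionMap = List[List[int]]   # 1 = occluded, 0 = has a match


def occlusion_map(other_disp: DisparityMap, sign: int) -> OcclusionMap:
    """Mark the pixels that no pixel of the other view maps onto.

    sign = +1 with the right disparity map gives the left occlusion map,
    sign = -1 with the left disparity map gives the right occlusion map.
    """
    h, w = len(other_disp), len(other_disp[0])
    occluded = [[1] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            target = x + sign * other_disp[y][x]
            if 0 <= target < w:            # projections outside are ignored
                occluded[y][target] = 0
    return occluded
```

For example, `occlusion_map(right_disp, +1)` produces the left map and `occlusion_map(left_disp, -1)` the right one, where `right_disp` and `left_disp` stand for the two halves of a genome.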

Once the occlusion maps have been calculated for both images, a very simple occlusion management step is performed. We follow an iterative process based on the neighbouring disparities of the occluded pixels. For the left image, pixels are visited from left to right; each occluded pixel is assigned the disparity value of its most photo-consistent non-occluded neighbour and is afterwards marked as non-occluded. If it has no non-occluded neighbours, it keeps its occluded status for the next iteration. Occluded pixels whose x(p) coordinate is less than the number of disparities analysed are a special case: for them the iteration is made from right to left and bottom-up. The iteration finishes when no occluded pixels are left in the left occlusion map.

For the right image the procedure is mirrored: right to left for common pixels, and left to right for pixels whose x(p) lies within the number of disparities analysed from the right image border. This fast and simple algorithm proves to be effective in Section 4.
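A simplified version of this filling step for the left image is sketched below. It makes several assumptions that the text does not fix: neighbours are the 4-connected pixels, photo-consistency is measured as the absolute grey-level difference in the reference image, and the special scan order for pixels near the image border is omitted. The function name is hypothetical.

```python
from typing import List

Image = List[List[float]]
DisparityMap = List[List[int]]
OcclusionMap = List[List[int]]   # 1 = occluded, 0 = visible


def fill_occlusions(disp: DisparityMap, occluded: OcclusionMap,
                    image: Image) -> None:
    """Fill occluded pixels in place from their most similar visible neighbour."""
    h, w = len(disp), len(disp[0])
    progress = True
    while progress:                      # stops when a full pass changes nothing
        progress = False
        for y in range(h):
            for x in range(w):           # left-to-right scan
                if not occluded[y][x]:
                    continue
                visible = [(ny, nx)
                           for ny, nx in ((y, x - 1), (y, x + 1),
                                          (y - 1, x), (y + 1, x))
                           if 0 <= ny < h and 0 <= nx < w
                           and not occluded[ny][nx]]
                if not visible:
                    continue             # stays occluded until a later pass
                by, bx = min(visible,
                             key=lambda p: abs(image[y][x] - image[p[0]][p[1]]))
                disp[y][x] = disp[by][bx]
                occluded[y][x] = 0
                progress = True
```

Because a filled pixel is immediately marked as visible, disparities propagate along the scan direction within a single pass; the loop also ends if a pass makes no progress, which covers the degenerate case of a map with no visible pixel at all.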

4 EXPERIMENTAL RESULTS

The proposed genetic algorithm has been applied to the standard Middlebury stereo dataset (Middlebury), which consists of four stereo pairs. The parameters of the new energy function are shown in Table 1, while the parameters of the genetic algorithm are shown in Table 2. For the local methods, window sizes between 3 and 45 have been used, together with random values for the adaptive-weight parameters. All test cases were run using the same parameters.

The resulting left disparity images, together with images marking their bad pixels, are shown in Figures 1 and 2. Looking at the bad-pixel images, it is clear that Tsukuba and Venus obtain the best results. Although the algorithm performs quite well in non-occluded and discontinuity regions, in other areas such as untextured regions it fails substantially. This can be attributed to the fact that the local algorithms used in the initialization process also fail in these untextured regions, so the genetic algorithm is unable to generate individuals with proper disparities in those regions.
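The bad-pixel measurement used in this evaluation counts the pixels whose disparity differs from the ground truth by more than a threshold. A minimal sketch follows; the threshold of 1.0 and the optional region mask (for example non-occluded or discontinuity pixels) are assumptions about the evaluation settings, which the text does not state, and the function name is illustrative.

```python
from typing import List, Optional


def bad_pixel_percentage(disp: List[List[float]], truth: List[List[float]],
                         threshold: float = 1.0,
                         mask: Optional[List[List[bool]]] = None) -> float:
    """Percentage of evaluated pixels whose absolute error exceeds threshold."""
    bad = total = 0
    for y, row in enumerate(truth):
        for x, d_true in enumerate(row):
            if mask is not None and not mask[y][x]:
                continue                 # pixel excluded from this region
            total += 1
            if abs(disp[y][x] - d_true) > threshold:
                bad += 1
    return 100.0 * bad / total if total else 0.0
```

Unlike the energy functions of Section 3.3, this measure needs ground-truth disparities, so it can be used to judge a fitness function but not to replace it.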

All four stereo pairs were uploaded and evaluated using the Middlebury website. The algorithm achieved an average rank of 38.5 and an average percentage of bad pixels of 5.81. This is an improvement over, for example, the adaptive-weight algorithm used in its initialization step, which has an average rank of 61.4 and an average percentage of bad pixels of 6.67. Moreover, the proposed genetic algorithm achieved the best rank in the discontinuity areas of the Tsukuba image. Compared to the best reported algorithm (Mei et al., 2011), our algorithm is 1.84 percentage points worse in average bad-pixel percentage. This can be attributed mainly to the untextured regions already discussed, where the local methods fail considerably.

Figure 1: Tsukuba and Venus results. Disparity images (first row) and bad pixels (second row).

Table 1: Parameters for the new energy function

10.0 50.0 2.0 10.0

Table 2: Parameters for the genetic algorithm.

Population   Generations   P_cross   P (mutation operators)
50.0         1000          0.9       0.1, 0.1, 0.5

In order to evaluate the performance of the energy functions described in Section 3.3, some tests were carried out using exactly the same genetic algorithm but applying the classic energy formulation as the fitness function instead. The truncated linear function was used as the smoothing function, with a cost of 1.0 and a truncation value of 10.0. The results were uploaded to the Middlebury stereo webpage, following the same steps as in the previous case. The average percentage of bad pixels increases from 5.81 to 8.56, that is, by 2.75 percentage points, when the classic energy function is used.
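The relation between these two averages and the 32% error reduction quoted in the abstract can be checked with a short calculation. The variable names are illustrative; the two input values are the averages reported above.

```python
classic_error = 8.56    # average bad-pixel percentage, classic energy
proposed_error = 5.81   # average bad-pixel percentage, proposed energy

absolute_gain = classic_error - proposed_error          # percentage points
relative_gain = 100.0 * absolute_gain / classic_error   # percent of classic error
print(f"{absolute_gain:.2f} points, {relative_gain:.1f}% relative reduction")
```

The reduction is therefore expressed relative to the error obtained with the classic energy function.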

The evolution of each energy function during the optimization process, compared to the bad-pixel error measurement for the Tsukuba stereo pair, is shown in Figure 3. The plot in the first row shows the total energy being minimized during the first 500 generations. Both algorithms follow a similar descending curve. However, they cannot be compared in terms of the minimum energy achieved, given that different functions and parameters are used.

The plot in the second row shows the evolution of the bad-pixel measurement of the best individual in each population. These charts were selected because they show empirically how the real disparity error evolves when each energy function is minimized. Figure 3 shows that a reduction of the classic energy function does not always translate into an effective reduction of the bad pixels. In fact, in this test case it produces somewhat unstable behaviour. The experiments carried out with the rest of the stereo pairs show the same trend. In contrast, the proposed energy function yields much more stable behaviour and a much better final error in all the tests carried out.

Figure 2: Teddy and Cones results. Disparity images (first row) and bad pixels (second row).

However, it is important to note that it cannot be stated that the proposed energy function represents the real disparity images better, i.e. that the true disparity images obtain lower energy values than others. Genetic algorithms work well for finding good approximations to the real optimum only when all the genetic operators are well tuned. It cannot be guaranteed that the genetic algorithm will perform better using the proposed energy function for every genetic configuration. Nor can it be guaranteed that the optimum in one case has less error than the optimum in the other. Nevertheless, this trend has appeared in every test carried out.

5 CONCLUSIONS

A new genetic algorithm has been proposed for stereo correspondence. Applying genetic algorithms has some benefits, such as global reasoning and an unrestricted choice of fitness function. The contribution of this paper is twofold. Firstly, compared to other genetic algorithms previously proposed, it uses new crossover and mutation operators that account for occlusion handling. Both left and right disparity images are estimated in order to manage occlusions adequately. Secondly, a new energy function has been proposed and analysed that includes occluded pixel handling in its formulation and enables depth discontinuities at pixels with high photometric derivatives.

Figure 3: Evolution of the energy functions and the bad pixels in Tsukuba.

The genetic algorithm has been evaluated on the standard Middlebury stereo dataset using both the classic and the proposed energy functions. The proposed energy function outperformed the classic one by 2.75 percentage points of bad pixels on average, which represents a 32% error reduction. Moreover, an analysis of the evolution of the bad-pixel error measurement suggests that the new formulation is more adequate for representing real disparities. The proposed algorithm obtained an average rank of 38.5 in the Middlebury ranking and, as far as we know, is the first evolutionary algorithm included in this table.

REFERENCES

Alahari, K., Kohli, P., and Torr, P. H. S. (2010). Dynamic hybrid algorithms for MAP inference in discrete MRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(10):1846–1857.

Boykov, Y., Veksler, O., and Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11):1222–1239.

Dai, C., Wu, X., and Liu, J. (2008). Stereo matching using adaptive genetic algorithm. In International Conference on Audio, Language and Image Processing (ICALIP 2008), pages 1225–1228.

Gong, M. and Yang, Y.-H. (2001). Multi-resolution stereo matching using genetic algorithm. In Proceedings of the IEEE Workshop on Stereo and Multi-Baseline Vision (SMBV 2001), pages 21–29.

Gong, M. and Yang, Y.-H. (2002). Genetic-based stereo algorithm and disparity map evaluation. International Journal of Computer Vision, 47(1):63–77.

Han, K.-P., Song, K.-W., Chung, E.-Y., Cho, S.-J., and Ha, Y.-H. (2001). Stereo matching using genetic algorithm with adaptive chromosomes. Pattern Recognition, 34(9):1729–1740.

Issa, H., Ruichek, Y., and Postaire, J. G. (2002). Stereo correspondence using a genetic scheme with a new solution encoding. In 2002 IEEE International Conference on Systems, Man and Cybernetics, volume 6, 5 pp.

Kolmogorov, V. and Zabih, R. (2004). What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2):147–159.

Mei, X., Sun, X., Zhou, M., Jiao, S., Wang, H., and Zhang, X. (2011). On building an accurate stereo matching system on graphics hardware. In 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pages 467–474.

Middlebury. http://vision.middlebury.edu/stereo/.

Nie, D.-H., Han, K.-P., and Lee, H.-S. (2009). Stereo matching algorithm using population-based incremental learning on GPU. In International Workshop on Intelligent Systems and Applications (ISA 2009), pages 1–4.

Saito, H. and Mori, M. (1995). Application of genetic algorithms to stereo matching of images. Pattern Recognition Letters, 16(8):815–821.

Wang, B., Chung, R., and Shen, C.-L. (2003). Genetic algorithm-based stereo vision with no block-partitioning of input images. In Proceedings of the 2003 IEEE International Symposium on Computational Intelligence in Robotics and Automation, volume 2, pages 830–836.

Yoon, K. J. and Kweon, I. S. (2006). Adaptive support-weight approach for correspondence search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(4):650–656.

Zhang, Z., Hou, C., and Yang, J. (2009). A stereo matching algorithm based on genetic algorithm with propagation stratagem. In International Workshop on Intelligent Systems and Applications (ISA 2009), pages 1–4.

GeneticAlgorithmforStereoCorrespondencewithaNovelFitnessFunctionandOcclusionHandling

299