Iterative Human Segmentation from Detection Windows

using Contour Segment Analysis

Cyrille Migniot, Pascal Bertolino and Jean-Marc Chassery

CNRS Gipsa-Lab DIS, 961 rue de la Houille Blanche, BP 46-38402, Grenoble Cedex, France

Keywords:

Pedestrian, Segmentation, Silhouette, Contour Segment, Oriented Graph.

Abstract:

This paper presents a new algorithm for human segmentation in images. The human silhouette is estimated

in positive windows that are already obtained with an existing efﬁcient detection method. This accurate seg-

mentation uses the data previously computed in the detection. First, a pre-segmentation step computes the

likelihood of contour segments as being a part of a human silhouette. Then, a contour segment oriented graph

is constructed from the shape continuity cue and the prior cue obtained by the pre-segmentation. Segmentation
is thus posed as the computation of the shortest-path cycle, which corresponds to the human silhouette.

Additionally, the process is achieved iteratively to eliminate irrelevant paths and to increase the segmenta-

tion performance. The approach is tested on a human image database and the segmentation performance is

evaluated quantitatively.

1 INTRODUCTION

Human segmentation is of fundamental interest in

computer vision due to the variations in human pose

and clothing. Moreover, an accurate segmentation is

needed in many applications such as human-computer

interaction, video indexing, image editing or movie

special effects.

A person cannot be recognized from color and texture alone. By contrast, since a person can be recognized from the silhouette alone, shape is a more descriptive cue. In the proposed method, the contour map of an image is obtained with the Canny operator (see Figure 1(b)). Nearly linear contour fragments

are modeled with segments (see Figure 1(c)) which

are relevant parts of the image for our study. Indeed,

the human silhouette can be rebuilt from them. The

segmentation is then performed by a reconstruction

of the silhouette from the contour segments.

Traditionally, when the detection and the segmen-

tation are performed simultaneously, the detection

process is chosen to be well adapted to the segmentation. Conversely, we aim to perform the segmentation from one of the most efficient existing detection methods. Because it gives good performance, the algorithm of Dalal and Triggs (Dalal and Triggs, 2005), based on the Histograms of Oriented Gradients (HOG) descriptor with a Support Vector Machine (SVM) classifier, is used in numerous papers (Felzenszwalb et al., 2010) (Alonso et al., 2007) (Bertozzi et al., 2007) (Zhu et al., 2006).

Figure 1: A detection window containing a pedestrian (a), contour image computed with the Canny algorithm (b), contour pixels gathered in contour segments (c), cell likelihoods provided by the SVM (d), likely segments computed by the pre-segmentation (e) and segmented silhouette obtained by our method (f).

Migniot C., Bertolino P. and Chassery J.
Iterative Human Segmentation from Detection Windows using Contour Segment Analysis.
DOI: 10.5220/0004209404050412
In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP-2013), pages 405-412
ISBN: 978-989-8565-47-1
© 2013 SCITEPRESS (Science and Technology Publications, Lda.)

In our work, the segmentation is carried out from the detection. Indeed, the detector of Dalal and Triggs provides detection windows (see Figure 1(a)) from the computation of the HOG. Our method then uses these HOG to achieve the segmentation in the detection windows. This novel segmentation method is composed of two steps:

Firstly, the HOG and SVM detection process com-

puted for the whole window in (Dalal and Triggs,

2005) is used in sub-parts of the window to pro-

vide more local shape information. The likelihood of

each contour segment of the window as being a part

of a human silhouette is computed and gives a pre-segmentation (see Figure 1(e), where the gray level of a segment is proportional to its likelihood).

Secondly, the contour segment cycle that is the most representative of a pedestrian is obtained with Dijkstra's algorithm in an oriented graph. This graph takes the contour segments as vertices and the adjacency between pairs of nearby contour segments as edges. Knowledge of the target class (here, pedestrians) is integrated by weighting the edges of the graph with the pre-segmentation data. The optimal cycle finally gives the

human silhouette and provides the segmentation (see

Figure 1(f)).

Due to the human shape complexity, errors fre-

quently appear in the obtained results. Nevertheless,

some of them can be easily located. To this end the

process depicted above is iterated in the problematic

areas with updated graph features. Thus, each itera-

tion may improve the result.

The remainder of the paper is organized as fol-

lows: Section 2 reviews the human segmentation and

the use of contour segments. Section 3 describes the

pre-segmentation process. Section 4 presents the ori-

ented graph approach used for segmentation. Section

5 develops the iterative algorithm. Experimental re-

sults are presented in Section 6, followed by conclu-

sions in Section 7.

2 RELATED WORK

Human Detection and Segmentation. The descrip-

tor and classiﬁer combination is the most used frame-

work in human detection. The descriptor converts an

image into a vector of discriminative features and the

classiﬁer compares the features of a tested image to

the features of images of an annotated database. HOG

(Dalal and Triggs, 2005) and Haar wavelets (Oren

et al., 1997) are the most used descriptors. SVM

(Vapnik, 1995) and Adaboost (Freund and Schapire,

1995) are the most used classiﬁers.

Simultaneous detection and segmentation can stem from the search for a region of interest (ROI), which can be based on depth (with stereo as in (Kang et al., 2002)) or color (by normalized cut as in (Mori et al., 2007)). Otherwise, the silhouette can be found by template matching (Lin and Davis, 2010) (Munder and Gavrila, 2006), where the image contours are compared to the silhouettes of a codebook. The relevance of the ROI or the similarity to a template of the codebook gives the detection. Assembling the ROI or finding the matching template delineates the silhouette and thereby also achieves the segmentation.

Hernandez (Hernandez et al., 2010) uses face detection and a skin color model for seed initialization in a graph cut process. In (Pishchulin et al., 2012), this initialization is provided by a previously computed pose estimation. Finally, Wang (Wang and Koller, 2011) minimizes an energy that simultaneously takes into account the localization of the body parts and the segmentation.

Contour Segment Approaches. As silhouette shape

is well-descriptive of the human class, there is a range

of methods based on the analysis of its parts. Indeed

Shotton (Shotton et al., 2008) demonstrates that a small number of outline contour fragments suffices for human recognition. For segmentation, Ferrari (Ferrari

et al., 2006) focuses on the succession of descriptive

contour segments. Wu (Wu and Nevatia, 2007) builds

a classiﬁer to recognize human parts from edgelet fea-

tures (detection) and a classiﬁer to recognize the fore-

ground pixels (segmentation). The two classiﬁers are

used together to carry out the two processes simulta-

neously. Gao (Gao et al., 2009) generates from the

contour a feature named Adaptive Contour Feature

that at the same time deﬁnes a weak classiﬁer for hu-

man detection and segmentation. Hariharan (Hariha-

ran et al., 2011) combines information from different

part detectors to classify category-speciﬁc object con-

tours. Lastly, Sharma (Sharma and Davis, 2007) ﬁnds

the relevant contour segment cycles from an oriented

graph. Then, the cycles are integrated in a Markov Random Field and a graph cut selects those that are related to the silhouette and achieves the segmentation.

We want the segmentation to build on a common and efficient detection method. Our approach, which, in contrast to (Sharma and Davis, 2007), searches for the prior cue first and then for the cycle, is therefore well adapted.

3 PRE-SEGMENTATION

In (Dalal and Triggs, 2005), the HOG and SVM com-

bination allows the detection. For each detection win-

dow, the only obtained information is the decision

VISAPP2013-InternationalConferenceonComputerVisionTheoryandApplications

406

about the presence of a person in the window. Segmentation needs a process at a smaller scale. The detection window is partitioned into square areas named cells, and the HOG of each cell is computed. Regular groups of cells are formed and named blocks. Thus, several HOG are associated with a single block to increase the descriptiveness of its features. Then, the SVM gives a classification for each block and provides a value $S^{SVM}_{block}$, which corresponds to the likelihood that the elements contained in the block are part of the human silhouette. Blocks overlap, so each cell belongs to several blocks. A value $S^{SVM}_{cell}$ is associated with each cell (Figure 1(d)). It is the mean of the classification values of all the blocks that contain the cell.
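The averaging of overlapping block scores onto cells can be sketched as follows. This is only an illustration: the block geometry (blocks of 2 x 2 cells moved with a stride of one cell), the function name `cell_scores` and the sample values are assumptions made here, not settings taken from the paper.

```python
def cell_scores(block_scores, block_size=2):
    """Average overlapping block scores onto cells.

    block_scores[r][c] is the SVM score of the block whose top-left cell is
    (r, c). Blocks are block_size x block_size cells with a stride of one
    cell (an assumption made for this sketch).
    """
    n_block_rows = len(block_scores)
    n_block_cols = len(block_scores[0])
    n_rows = n_block_rows + block_size - 1
    n_cols = n_block_cols + block_size - 1
    total = [[0.0] * n_cols for _ in range(n_rows)]
    count = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_block_rows):
        for c in range(n_block_cols):
            # Spread the score of this block over every cell it covers.
            for dr in range(block_size):
                for dc in range(block_size):
                    total[r + dr][c + dc] += block_scores[r][c]
                    count[r + dr][c + dc] += 1
    # Each cell receives the mean score of the blocks that contain it.
    return [[total[r][c] / count[r][c] for c in range(n_cols)]
            for r in range(n_rows)]


# Hypothetical 2 x 2 grid of block scores, giving a 3 x 3 grid of cells.
scores = cell_scores([[1.0, 3.0],
                      [5.0, 7.0]])
```

The central cell belongs to all four blocks and therefore receives their mean, while a corner cell belongs to a single block and keeps that block's score.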

The classification gives information on each cell, but our study needs to deal with elements directly related to the silhouette. The Canny algorithm determines the contour pixels (Figure 1(b)), which are gathered in segments (Figure 1(c)). We aim to reconstruct the silhouette from these segments. For each contour segment seg, $L_{seg}$ is the likelihood of the segment as being a part of the silhouette (Figure 1(e)). It is defined by:

$$L_{seg} = \underset{p \in seg}{\operatorname{mean}} \left( G(p)\, S^{SVM}_{c_p} \right) \qquad (1)$$

where $c_p$ is the cell that contains the pixel p and G(p) is the gradient magnitude at the pixel p.
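Equation 1 can be illustrated with a few lines of code. The sketch below assumes that the gradient magnitude and the per-cell SVM values are already available as nested lists; the name `segment_likelihood`, the `cell_size` parameter and the toy data are hypothetical.

```python
def segment_likelihood(pixels, gradient, cell_score, cell_size):
    """Equation 1: mean over the pixels p of a segment of G(p) * S_cell(p).

    pixels     : list of (x, y) pixel coordinates of the segment
    gradient   : gradient[y][x] is the gradient magnitude G(p)
    cell_score : cell_score[row][col] is the SVM value of a cell
    cell_size  : side of a square cell in pixels (hypothetical parameter)
    """
    values = []
    for x, y in pixels:
        # Find the cell that contains the pixel.
        cell_row, cell_col = y // cell_size, x // cell_size
        values.append(gradient[y][x] * cell_score[cell_row][cell_col])
    return sum(values) / len(values)


# Toy 4 x 4 image split into 2 x 2 cells of 2 pixels each.
gradient = [[1, 1, 1, 1],
            [1, 2, 1, 1],
            [1, 1, 3, 1],
            [1, 1, 1, 1]]
cell_score = [[0.5, 0.0],
              [0.0, 1.0]]
likelihood = segment_likelihood([(0, 0), (1, 1), (2, 2)],
                                gradient, cell_score, cell_size=2)
```

In this toy case the three pixels contribute 0.5, 1.0 and 3.0, so the segment likelihood is 1.5.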

Figure 2: Product of the likelihood provided by SVM with

the gradient of the image for 4 examples. These values used

in equation 1 give accurate clues on the silhouette contour.

These data provide information that guides the

segmentation (Figure 2). A pre-segmentation of the

person is achieved.

4 SILHOUETTE REFORMATION

The most likely silhouette segments must be linked to form the silhouette contour. Similarly to (Elder and Zucker, 1996), an oriented graph built from contour parts (here, contour segments) is studied. The graph edges are weighted so as to express both the continuity of the silhouette and its likelihood of being human. Hence, the likelihood computed during the pre-segmentation step is integrated in the process.

4.1 Building the Graph of Contour

Segments

We create an oriented graph G(V , E) in which a path

is a sequence of connected contour segments. The

weights are set so that the searched shortest path cor-

responds to the silhouette.

A contour map of the detection window is ob-

tained with the Canny’s edge detector and is vector-

ized to provide the contour segments which are the

vertices V of the graph.

Nevertheless, some contours of the silhouette may be absent due to a lack of local contrast, which introduces gaps in the silhouette (Figure 3(a)). As in (Elder and Zucker, 1996), the contour needs to be closed. Additional segments called transitions are therefore introduced. They connect the extremities of the contour segments (Figure 3(b)). These transitions are the edges E of the graph. Connecting the right segments with the right transitions is computationally expensive. Moreover, contour segments that are spatially far apart have little probability of being consecutive in the silhouette path. Consequently, only the transitions whose size is under a given threshold T become edges in E. The sequence of vertices and edges of a path in G represents a sequence of contour segments and transition segments, which gives a silhouette.
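The thresholded construction of the transitions can be sketched as below. It is a simplified illustration: each segment is assumed to be oriented from a start point to an end point, a transition is taken from the end of one segment to the start of another, and the names `build_transitions` and `max_length` are hypothetical.

```python
import math


def build_transitions(segments, max_length):
    """Return directed edges (i, j, size) for transitions shorter than max_length.

    Each segment is ((x0, y0), (x1, y1)). A transition goes from the end
    point of segment i to the start point of segment j (the orientation
    convention is an assumption of this sketch).
    """
    edges = []
    for i, (_, end_i) in enumerate(segments):
        for j, (start_j, _) in enumerate(segments):
            if i == j:
                continue
            size = math.dist(end_i, start_j)
            # Keep only the transitions whose size is under the threshold.
            if size < max_length:
                edges.append((i, j, size))
    return edges


# Three hypothetical segments: the first two nearly touch, the third is far.
segments = [((0, 0), (10, 0)),
            ((12, 0), (20, 0)),
            ((50, 50), (60, 50))]
edges = build_transitions(segments, max_length=14)
```

Only the transition from the first to the second segment survives the threshold here, which shows how the threshold keeps the graph sparse.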

Figure 3: The lacks in the contour detection generate gaps

in the silhouette (a). A transition (dotted line) connects the

extremities of two existing segments (b). The correspond-

ing piece of graph (c).

Finally, weights are associated with the edges. To do so, a local interaction term is defined: the affinity is related to the probability that a contour segment $seg_j$ follows a contour segment $seg_i$. It combines two notions: continuity and likelihood.

IterativeHumanSegmentationfromDetectionWindowsusingContourSegmentAnalysis

407

Continuity corresponds to the path coherence from $seg_i$ to $seg_j$. It is usually related to the spatial distance, the magnitude difference or the orientation difference between the two segments. Here, since only the contours are studied, the magnitude difference is not available. Moreover, the irregularities of the human silhouette prevent the use of orientation continuity. Consequently, only the spatial distance is used.

Likelihood takes into account the knowledge on

the searched class. If a segment is likely to be a part

of a human silhouette, it should be promoted.

The affinity of the path from segment $seg_i$ to segment $seg_j$ is first defined by:

$$\mathrm{Affinity}(seg_i, seg_j) = e^{-[\ldots]\,-\,\alpha S_t} \qquad (2)$$

where $L_{seg_j}$ is the likelihood of $seg_j$ as defined in equation 1, $S_t$ is the size of the transition segment that connects $seg_i$ to $seg_j$ and α is related to the influence of the transition segment.

However, the size of the contour segments varies. Long segments must carry more weight in the graph because they represent important parts of the path. Moreover, the transition segments that are unlikely to be part of a human silhouette should be penalized. Thus, the definition of affinity is modified as follows:

$$\mathrm{Affinity}(seg_i, seg_j) = e^{-[\ldots]\,-\,\alpha [\ldots]} \qquad (3)$$

where $L_t$ is the transition segment likelihood as defined in equation 1 and $S_{seg_j}$ is the size of $seg_j$.

The negative logarithm is finally used to compute the weight ω associated with the edge between the two contour segments:

$$\omega = -\log(\mathrm{Affinity}(seg_i, seg_j)) \qquad (4)$$
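As a short worked step, the role of the logarithm in equation 4 can be made explicit. Writing $A_k$ for the affinity of the k-th edge of a path with n edges and $\omega_k$ for its weight (the index k and the symbol $A_k$ are notation introduced here for illustration), the total weight of the path is:

```latex
\sum_{k=1}^{n} \omega_k \;=\; -\sum_{k=1}^{n} \log A_k \;=\; -\log \prod_{k=1}^{n} A_k
```

Minimizing the summed weights is therefore equivalent to maximizing the product of the affinities along the path. Provided each affinity lies in (0, 1], every weight is non-negative, which is the condition that Dijkstra's algorithm requires.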

4.2 The Silhouette as an Optimal Cycle

Once the graph is built, the segmentation is seen as a shortest-path problem where the goal is, starting from a high-confidence segment, to find the shortest path in the graph (in terms of edge weights) that makes a cycle. To do so, we use the well-known Dijkstra's algorithm.

Since the human silhouette is complex and because cumulative weights are used, Dijkstra's algorithm favors spatially short paths that can miss large parts of the silhouette. To avoid this bias, the path is forced to pass through the two segments that are spatially farthest apart. Thus, two shortest paths linking these two segments are actually searched. The concatenation of these two paths then provides the optimal cycle. As we are dealing with pedestrians, it is assumed that these segments correspond to the top of the head (top) and the bottom of the feet (bottom).
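The search for two shortest paths between top and bottom can be sketched with a standard Dijkstra implementation. This is a minimal illustration under stated assumptions: the graph is a dictionary of non-negative edge weights, and the interior vertices of the first path are excluded from the return path so that the two halves do not overlap. That exclusion rule and all names (`shortest_path`, `silhouette_cycle`, the toy graph) are choices made for this sketch and are not described in the text.

```python
import heapq


def shortest_path(graph, source, target, forbidden=frozenset()):
    """Dijkstra on a dict {u: {v: weight}}; returns the vertex list or None."""
    dist = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    done = set()
    while queue:
        d, u = heapq.heappop(queue)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            if v in forbidden or v in done:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                previous[v] = u
                heapq.heappush(queue, (nd, v))
    if target not in done:
        return None
    # Walk back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1]


def silhouette_cycle(graph, top, bottom):
    """Concatenate a top->bottom path and a bottom->top path into a cycle."""
    down = shortest_path(graph, top, bottom)
    if down is None:
        return None
    # Assumption of this sketch: the return path avoids the first path.
    used = frozenset(down[1:-1])
    up = shortest_path(graph, bottom, top, forbidden=used)
    if up is None:
        return None
    return down + up[1:]


# Hypothetical graph whose vertices stand for contour segments.
graph = {
    "top": {"a": 1.0, "b": 4.0},
    "a": {"bottom": 1.0, "top": 0.5},
    "b": {"bottom": 1.0},
    "bottom": {"c": 1.0, "a": 0.5},
    "c": {"top": 1.0},
}
cycle = silhouette_cycle(graph, "top", "bottom")
```

On this toy graph the first path goes through `a`, so the return path has to go through `c` even though the edge back through `a` is cheaper, and the result is a closed cycle through both forced segments.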

The segments top and bottom are found automatically. Their choice is made using the location, orientation and likelihood of the segments. For a segment seg, let (x, y) be the coordinates of its middle, θ its orientation and L its likelihood. A Gaussian function whose parameters are set experimentally is defined for each of these four features f:

$$G_{\mu,\sigma}(f) = e^{-\frac{(f-\mu)^2}{2\sigma^2}} \qquad (5)$$

where µ is the mean value of the feature in the dataset and σ its standard deviation.

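Then the probabilities $P_{top}(seg)$ of being an appropriate top segment and $P_{bottom}(seg)$ of being an appropriate bottom segment are defined by:

$$\begin{cases}
P_{top}(seg) = G_{\mu^{top}_{x},\sigma^{top}_{x}}(x) \cdot G_{\mu^{top}_{y},\sigma^{top}_{y}}(y) \cdot G_{\mu^{top}_{\theta},\sigma^{top}_{\theta}}(\theta) \cdot G_{\mu^{top}_{L},\sigma^{top}_{L}}(L) \\
P_{bottom}(seg) = G_{\mu^{bottom}_{x},\sigma^{bottom}_{x}}(x) \cdot G_{\mu^{bottom}_{y},\sigma^{bottom}_{y}}(y) \cdot G_{\mu^{bottom}_{\theta},\sigma^{bottom}_{\theta}}(\theta) \cdot G_{\mu^{bottom}_{L},\sigma^{bottom}_{L}}(L)
\end{cases} \qquad (6)$$

The segments that maximize these probabilities are chosen:

$$\begin{cases}
top = \underset{seg}{\arg\max}\; P_{top}(seg) \\
bottom = \underset{seg}{\arg\max}\; P_{bottom}(seg)
\end{cases} \qquad (7)$$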
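The selection described by equations 5 to 7 can be sketched as follows. The feature names, the dictionary layout and every numeric value in the example model are hypothetical; in the paper the Gaussian parameters are set experimentally.

```python
import math


def gaussian(f, mu, sigma):
    """Equation 5: unnormalized Gaussian score of a feature value f."""
    return math.exp(-((f - mu) ** 2) / (2.0 * sigma ** 2))


def anchor_score(segment, model):
    """Equation 6: product of the four per-feature Gaussian scores."""
    score = 1.0
    for name in ("x", "y", "theta", "likelihood"):
        mu, sigma = model[name]
        score *= gaussian(segment[name], mu, sigma)
    return score


def pick_anchor(segments, model):
    """Equation 7: index of the segment with the highest score."""
    return max(range(len(segments)),
               key=lambda i: anchor_score(segments[i], model))


# Hypothetical (mu, sigma) pairs for a "top" model; not values from the paper.
top_model = {"x": (32.0, 8.0), "y": (10.0, 5.0),
             "theta": (0.0, 0.5), "likelihood": (1.0, 0.5)}
candidates = [
    {"x": 32.0, "y": 12.0, "theta": 0.1, "likelihood": 0.9},   # near the head
    {"x": 30.0, "y": 110.0, "theta": 0.0, "likelihood": 0.8},  # near the feet
]
top_index = pick_anchor(candidates, top_model)
```

The same functions select the bottom segment when they are given a second model with its own means and standard deviations.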

5 ITERATIVE PROCESS

When some contours of the silhouette are missing, the transitions between successive segments in the optimal cycle may need to be long. However, the threshold T defined in Section 4.1 prevents overly long transitions. Moreover, the path found by Dijkstra's algorithm does not perfectly fit the pre-segmentation. New iterations of the process, using the data and the models of the pre-segmentation, are therefore performed to improve the segmentation. They are applied to the wrong parts of the segmentation with better-adapted features. The iterative process is summarized in Figure 5.

Segmentation Evaluation. The segments of the cycle provide a segmentation mask that is presented to the SVM classifier already used in the pre-segmentation step. The likelihood $L_{seg}$ of each segment is calculated using equation 1.

Updating the Cycle. The segments of the cycle whose likelihood is under a likelihood threshold T are considered to be wrong. For each sequence of successive wrong segments in the cycle, a shortest-path search is performed using a relaxed threshold and locally updated weights: the transition-size threshold T of Section 4.1 is increased at each iteration to permit longer transitions. On the other hand, the wrong segments of the sequence are penalized.

VISAPP2013-InternationalConferenceonComputerVisionTheoryandApplications

408

Figure 4: Segmentation of 12 pedestrians in cluttered scenes. From left to right: initial detection window, likelihood of

contour segments as being a part of a human silhouette and segmentation obtained with the iterative process.

Figure 5: An overview of the iterative process. A new iter-

ation is achieved on wrong parts of the cycle as long as it

improves the segmentation quality.

Consider an edge of G that goes to a wrong segment seg ($L_{seg} \leq T$). The weight ω of the edge is updated as follows:

$$\omega \leftarrow \frac{1 + T}{1 + L_{seg}}\, \omega \qquad (8)$$

The new path replaces the previous one in the current cycle.
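The weight update of equation 8 can be sketched as below. For brevity the sketch applies the rule to every edge of the given graph that enters a wrong segment, whereas the text applies it locally to sequences of wrong segments; the function name and the toy values are hypothetical, and the code assumes $1 + L_{seg} > 0$.

```python
def penalize_wrong_segments(graph, likelihood, threshold):
    """Apply equation 8 to every edge that enters a wrong segment.

    graph      : {u: {v: weight}} oriented graph of contour segments
    likelihood : maps a segment to its likelihood L_seg
    threshold  : likelihood threshold T; a segment is wrong if L_seg <= T
    Assumes 1 + L_seg > 0 so that the weights stay positive.
    """
    updated = {}
    for u, neighbours in graph.items():
        updated[u] = {}
        for v, w in neighbours.items():
            if likelihood[v] <= threshold:
                # Scaling factor (1 + T) / (1 + L_seg) is >= 1 here.
                w = (1.0 + threshold) / (1.0 + likelihood[v]) * w
            updated[u][v] = w
    return updated


# Hypothetical example: "b" is below the threshold, "c" is above it.
toy_graph = {"a": {"b": 2.0, "c": 2.0}}
toy_likelihood = {"a": 0.5, "b": -0.5, "c": 0.3}
new_graph = penalize_wrong_segments(toy_graph, toy_likelihood, threshold=0.0)
```

Because a wrong segment satisfies $L_{seg} \leq T$, the scaling factor is at least 1, so the update can only make edges into wrong segments more expensive.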

Process Termination. At each iteration, the segments of the new current cycle are evaluated to check whether it is better than the previous one (Figure 7(d)). The new cycle is evaluated by calculating the mean likelihood L of the cycle weighted by the segment size $S_{seg}$:

$$L = \frac{\sum_{k \in cycle} S_k L_k}{\sum_{k \in cycle} S_k} \qquad (9)$$

If the value of L is smaller than or equal to the previous one, the previous cycle is kept and the process is stopped. Otherwise, an improvement is still possible and a new iteration is performed.
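The stopping rule can be sketched as a small loop around equation 9. The callable `refine` stands for the cycle update step and is a placeholder, and `max_iterations` is a safety cap added for this sketch; neither is part of the described method.

```python
def cycle_quality(cycle, likelihood, size):
    """Equation 9: mean likelihood of the cycle weighted by segment size."""
    total_size = sum(size[k] for k in cycle)
    return sum(size[k] * likelihood[k] for k in cycle) / total_size


def iterate(initial_cycle, refine, likelihood, size, max_iterations=20):
    """Refine the cycle until its quality stops increasing.

    refine is a hypothetical callable that returns an updated cycle;
    max_iterations is a safety cap added for this sketch.
    """
    best = initial_cycle
    best_quality = cycle_quality(best, likelihood, size)
    for _ in range(max_iterations):
        candidate = refine(best)
        quality = cycle_quality(candidate, likelihood, size)
        if quality <= best_quality:
            break  # no improvement: keep the previous cycle and stop
        best, best_quality = candidate, quality
    return best, best_quality


# Toy data: replacing the unlikely segment "b" by "c" improves the cycle.
toy_likelihood = {"a": 1.0, "b": -1.0, "c": 0.5}
toy_size = {"a": 10, "b": 10, "c": 20}
best, quality = iterate(["a", "b"], lambda cyc: ["a", "c"],
                        toy_likelihood, toy_size)
```

In the toy run the first refinement raises the quality from 0 to 2/3, the second one brings no further gain, and the loop stops while keeping the improved cycle.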

6 EXPERIMENTS

The training of the SVM model uses 400 positive examples from a binary human silhouette image database created for this work and 200 negative examples from the INRIA Static Person Data Set. The algorithm used for classification in the pre-segmentation step is SVM-light (Joachims, 1999). It requires the likelihood threshold T = 0. The Canny operator parameters are adapted to the dataset. Thus we choose the ones taken in (Dalal and Triggs, 2005).

The evaluation of the segmentation is based on

three measures advised by (Philipp-Foliguet and

Guigues, 2008). They involve a ground truth which

constitutes a reference segmentation. We manually

made the ground truths of all the testing windows.

• The $F_{measure}$ considers the trade-off between the precision and the recall of the assignment of pixels to the foreground or the background:

$$F_{measure} = \frac{2 \cdot precision \cdot recall}{precision + recall} \qquad (10)$$

• The Martin measure checks the true assignation

of important regions.

• The Yasnoff measure computes the distance be-

tween ill-assigned pixels and the nearest pixel be-

longing to its true region. It is closely related to

the human perception of the quality of the seg-

mentation.
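As an illustration of the first measure, the F-measure of equation 10 can be computed from a predicted mask and a ground-truth mask as sketched below. The masks are assumed to be nested lists of 0 and 1 with the same shape, and the function name and the convention of returning 0 when there is no true positive are choices made for this sketch.

```python
def f_measure(predicted, truth):
    """Equation 10 on two binary masks given as lists of rows of 0/1."""
    tp = fp = fn = 0
    for row_p, row_t in zip(predicted, truth):
        for p, t in zip(row_p, row_t):
            if p and t:
                tp += 1      # foreground pixel correctly assigned
            elif p and not t:
                fp += 1      # background pixel assigned to the foreground
            elif t and not p:
                fn += 1      # foreground pixel assigned to the background
    if tp == 0:
        return 0.0           # convention chosen for this sketch
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


# Toy masks: two true positives, one false positive, one false negative.
predicted = [[1, 1, 0],
             [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
score = f_measure(predicted, truth)
```

With these toy masks both precision and recall equal 2/3, so the F-measure is also 2/3.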

The $F_{measure}$ and the Martin measure give a value in [0, 1], whereas the Yasnoff measure gives a value in [0, +∞). The Martin and Yasnoff measures decrease as the segmentation performance improves, whereas the $F_{measure}$ increases with it.

In the experiments, 400 images from the INRIA

Static Person Data Set were tested and compared to

the manually made ground-truths. The evaluation of

the segmentation is estimated from the mean of the

measures for all the tested images.

6.1 Single Iteration Evaluation

First, the experiments are conducted on the method without iteration. To optimize the algorithm, the appropriate value of the threshold T, related to the maximum distance between two consecutive segments, and the appropriate influence factor α of the transition in the graph weights (see equation 3) need to be fixed. Figure 6 shows the $F_{measure}$ for various values of these two parameters. The $F_{measure}$ favors the values T = 14 and α = 4, which are used in the remainder of the paper.

Using a non-optimized C++ implementation on a 3 GHz Pentium D machine, the first iteration, excluding the pre-segmentation stage, is processed in a mean time of 23 ms.

Figure 6: Segmentation evaluation by the $F_{measure}$ to determine the optimal value of the threshold T and the factor α. From these evaluations, we chose a threshold of 14 and a factor α of 4.

6.2 Evaluation of the Iterative Process

Over the 400 tests of the experiments, the mean number of required iterations is 2.14. This shows that convergence is fast and that the computational cost remains moderate.

Multiple iterations eliminate the illogical paths (see Figure 7). The three evaluation measures (see Table 1) confirm that the iterative process improves the segmentation. The iterations eliminate the false detections and particularly the distant ill-assigned pixels. Some examples of segmentation obtained by this method are shown in Figure 4.

Table 1: Evaluation measures with the Dijkstra’s algorithm

for a single iteration and with the iterative process. The

three measures demonstrate that several iterations improve

the segmentation. In order to facilitate the reading, the sign

↓ indicates a measure to minimize and the sign ↑ a measure

to maximize.

Measure              First iteration   Iterative process
$F_{measure}$ (↑)    0.8386            0.8405
Martin (↓)           0.0495            0.0490
Yasnoff (↓)          0.6482            0.6346

VISAPP2013-InternationalConferenceonComputerVisionTheoryandApplications

410

Figure 7: Segmentation improvement by the iterative pro-

cess for 8 examples: initial detection window (a), segmen-

tation obtained by a single iteration (b), evaluation of the

segments of the cycle in the ﬁrst iteration (c) and segmen-

tation obtained at the end of the iterative process (d). Mul-

tiple iterations remove wrong parts of the silhouette when

needed.

7 CONCLUSIONS

In this paper, we have proposed to directly adapt a

new human segmentation process to an already ex-

isting efﬁcient human detection method. Both pro-

cesses are closely related and data previously com-

puted in the detection are used in the segmentation.

In this way, a pre-segmentation based on a HOG and

SVM framework gives local information on the con-

tour segments. Then, the segmentation is performed

with the integration of the pre-segmentation cue in an

oriented graph. Detection and segmentation can thus

be achieved simultaneously. The quality of the seg-

mentation is increased by an iterative process.

Future research directions involve several issues. First of all, we have only studied static images. We could enrich the descriptiveness by integrating a human motion cue. Additionally, the same process could be adapted to classes other than “human” (for example cows, cars or dogs). Nevertheless, “human” is one of the classes best described by shape, so the same process is likely to be less effective with other classes. Finally, some interaction with the user could improve performance and better deal with hard cases.

REFERENCES

Alonso, I., Llorca, D., Sotelo, M., Bergasa, L., Toro, P. D.,

Nuevo, J., Ocania, M., and Garrido, M. (2007). Com-

bination of feature extraction methods for svm pedes-

trian detection. IEEE Transactions on Intelligent

Transportation Systems, 30:292–307.

Bertozzi, M., Broggi, A., Rose, M. D., Felisa, M., Rako-

tomamonjy, A., and Suard, F. (2007). A pedestrian

detector using histograms of oriented gradients and

a support vector machine classiﬁer. IEEE Intelligent

Transportation Systems Conference, pages 143–148.

Dalal, N. and Triggs, B. (2005). Histograms of oriented gra-

dients for human detection. IEEE International Con-

ference on Computer Vision and Pattern Recognition,

2:886–893.

Elder, J. and Zucker, S. (1996). Computing contour closure.

European Conference on Computer Vision, 1:399–

412.

Felzenszwalb, P., Girshick, R., McAllester, D., and Ramanan, D. (2010). Object detection with discriminatively trained part based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32:1627–1645.

Ferrari, V., Tuytelaars, T., and Gool, L. V. (2006). Ob-

ject detection by contour segment networks. European

Conference on Computer Vision, 3953:14–28.

Freund, Y. and Schapire, R. (1995). A decision-theoretic

generalization of on-line learning and an application

to boosting. European Conference on Computational

Learning Theory, pages 23–37.

IterativeHumanSegmentationfromDetectionWindowsusingContourSegmentAnalysis

411

Gao, W., Ai, H., and Lao, S. (2009). Adaptive contour fea-

tures in oriented granular space for human detection

and segmentation. IEEE International Conference

on Computer Vision and Pattern Recognition, pages

1786–1793.

Hariharan, B., Arbelaez, P., Bourdev, L., Maji, S., and Ma-

lik, J. (2011). Semantic contours from inverse detec-

tors. IEEE International Conference in Computer Vi-

sion, pages 991–998.

Hernandez, A., Reyes, M., Escalera, S., and Radeva, P.

(2010). Spatio-temporal grabcut human segmentation

for face and pose recovery. IEEE International Con-

ference on Computer Vision and Pattern Recognition,

pages 33–40.

Joachims, T. (1999). Making large-scale svm learning prac-

tical. Advances in Kernel Methods - Support Vector

Learning.

Kang, S., Byun, H., and Lee, S. (2002). Real-time pedes-

trian detection using support vector machines. Inter-

national Journal of Pattern Recognition and Artiﬁcial

Intelligence, pages 268–277.

Lin, Z. and Davis, L. (2010). Shape-based human detec-

tion and segmentation via hierarchical part-template

matching. IEEE Transactions on Pattern Analysis and

Machine Intelligence, 32:604–618.

Mori, G., Ren, X., Efros, A., and Malik, J. (2007). Re-

covering human body conﬁgurations: Combining seg-

mentation and recognition. IEEE International Con-

ference on Computer Vision and Pattern Recognition,

2:326–333.

Munder, S. and Gavrila, D. (2006). An experimental study

on pedestrian classiﬁcation. IEEE Transactions on

Pattern Analysis and Machine Intelligence, 28:1863–

1868.

Oren, M., Papageorgiou, C., Sinha, P., Osuna, E., and Pog-

gio, T. (1997). Pedestrian detection using wavelet

templates. IEEE Computer Society Conference on

Computer Vision and Pattern Recognition, pages 193–

199.

Philipp-Foliguet, S. and Guigues, L. (2008). Multi-scale

criteria for the evaluation of image segmentation al-

gorithms. Journal of Multimedia, pages 42–56.

Pishchulin, L., Jain, A., Andriluka, M., Thormaehlen, T.,

and Schiele, B. (2012). Articulated people detection

and pose estimation: Reshaping the future. IEEE Con-

ference on Computer Vision and Pattern Recognition,

pages 1–8.

Sharma, V. and Davis, J. (2007). Integrating appearance and

motion cues for simultaneous detection and segmenta-

tion of pedestrians. IEEE International Conference on

Computer Vision, pages 1–8.

Shotton, J., Blake, A., and Cipolla, R. (2008). Mul-

tiscale categorical object recognition using contour

fragments. IEEE Transactions on Pattern Analysis

and Machine Intelligence, 30:1270–1281.

Vapnik, V. (1995). The nature of statistical learning theory.

Springer-Verlag.

Wang, H. and Koller, D. (2011). Multi-level inference by

relaxed dual decomposition for human pose segmen-

tation. IEEE Conference on Computer Vision and Pat-

tern Recognition, pages 2433–2440.

Wu, B. and Nevatia, R. (2007). Simultaneous object detec-

tion and segmentation by boosting local shape feature

based classiﬁer. IEEE Conference on Computer Vision

and Pattern Recognition, pages 1–8.

Zhu, Q., Yeh, M., Cheng, K., and Avidan, S. (2006). Fast

human detection using a cascade of histograms of ori-

ented gradients. IEEE Conference on Computer Vision

and Pattern Recognition, 2:1491–1498.

VISAPP2013-InternationalConferenceonComputerVisionTheoryandApplications

412