Toward Moving Objects Detection in 3D-Lidar and Camera Data
Clement Deymier and Thierry Chateau
Lasmea-UMR-CNRS 6602, Campus des Cezeaux, 24 Avenue des Landais, 63177 Aubiere Cedex, France
Keywords:
Moving Objects Detection, Photo-consistency.
Abstract:
In this paper, we propose a major improvement of an algorithm named IPCC (Clement Deymier, 2013a), for Iterative Photo-Consistency Check. Its goal is to detect, a posteriori, moving objects in both camera and rangefinder data. The range data may be provided by different sensors such as Velodyne, Riegl or Kinect, with no distinction. The main idea is that range data acquired on static objects are photo-consistent, i.e. they have the same color and texture in all the camera images, whereas range data acquired on moving objects are not photo-consistent. The central difficulty is that the range sensor and the camera are not synchronous, so what the camera sees is not what the range sensor acquires. This work proposes to estimate the photo-consistency of range data by using the 3D neighborhood of each point as a texture descriptor, which is a major improvement over the original method based on texture patches. A Gaussian mixture method has been developed to deal with occluded background. Moreover, we show how to remove non-photo-consistent range data from the scene by an erosion process and how to repair images by inpainting. Finally, experiments show the relevance of the proposed method in terms of both accuracy and computation time.
1 INTRODUCTION
Sequences acquired with sensors such as panoramic range finders (Lidar, Velodyne, Riegl) or depth cameras (Kinect, SwissRanger) are increasingly common. These data are typically realigned in a common reference frame, using a localization system, to form a coherent point cloud. For this step, the ICP (Iterative Closest Point) algorithm is often used (Rusinkiewicz and Levoy, 2001), but this processing sometimes requires the help of other sensors. It is indeed usual to perform this transformation with the assistance of an inertial unit coupled to a GPS, or with visual SLAM (Simultaneous Localization And Mapping) (Royer et al., 2007). Once the 3D point cloud is realigned, it can be used for navigation of autonomous robots, but also for mapping and point-based 3D rendering of a scene. However, in some applications such as mobile robotics exploration, we need to detect and remove the moving objects from these data in order to build a digital map of the environment. To this end, several methods have been developed based on the probability of occupancy, using only range data. More precisely, it is important to know whether a point in space lies inside a solid object or in empty space (air). This topic is very popular and often used in moving objects detection (Himmelsbach, 2008; Wurm et al., 2010; Clement Deymier, 2013b). Indeed, an object is moving if measurements on its surface lie in a region of space that is empty at a previous (or future) time. However, it is unnecessary to know the occupancy probability at every position; only the positions corresponding to impact points (measurements) matter. This way, it is possible to "filter" moving objects within the data by measuring the occupancy probability computed for each detection: if a given measurement has a low probability, then it is very likely that the detection has been taken on a moving object.
On the other hand, the computer vision community works on moving object detection in cameras. There are many ways to find moving objects in a sequence captured by a mobile camera. One of them is to compute the optical flow or a dense matching (Jung and Sukhatme, 2004) and analyze the result with a motion segmentation algorithm (Zografos et al., 2010). Another way is to use a classification algorithm with a large knowledge database applied to each image, or even a deformable contour approach (Yilmaz et al., 2004). Unfortunately, all these solutions make strong hypotheses on the moving objects or the scene: their motion must be small, they cannot remain static for a while, they cannot be deformable, and so on. So a pure 3D approach cannot remove an object from the camera images, and a camera-only approach cannot help us analyze the range data because, in this paper, range data are considered not synchronous with the images.
Our motivation is to unify the range-finding and camera methods within a global solution named IPCC (Clement Deymier, 2013a). It detects moving objects a posteriori on a sequence of range data and color camera images by exploiting a photo-consistency criterion. After the application of this method, moving objects are detected in both camera and range data. Moreover, it is possible to restore the background in the images where a moving object was present.
This paper proposes a major improvement of the original version of the algorithm, based on a new estimation principle for photo-consistency and a new approach to inpaint image data. Section 2 presents the principle of the method and focuses on photo-consistency estimation, scene erosion and image inpainting. The benefits of the proposed method are analyzed and discussed in detail in section 3, where experiments show the relevance of this solution.
2 PRINCIPLE OF THE IPCC ALGORITHM

2.1 A Global Presentation of the Method
Our algorithm takes as input two types of data: $P \doteq \{P_j\}_{j=1,..,n}$, a set of 3D points coming from the range finder sensor and expressed in a common reference frame, and $I \doteq \{I_k\}_{k=1,..,n}$, the set of color images provided by one or more cameras, together with $\pi_k$, the projection function into the image $I_k$.
Because the range finder acquires measurements on object surfaces, it is possible to estimate a photo-consistency criterion. If a 3D point is seen in more than one image, the colors of its re-projections can be compared, and a photo-consistency score can be computed and used to classify moving objects. So, the first step of the IPCC algorithm is to determine which points are visible in each image $I_k$.
To this end, a clipping method (2.2) is applied to determine which points lie in the camera field of view of $I_k$. Then, all the visible points are projected into a z-buffer by disk-splatting to estimate the occlusion constraints over the scene. Finally, the generated z-buffers are used to build an occurrence list, which maps each 3D point $P_j$ to the images where it is seen (2.4).
Figure 1: Top view of the example scene. The 3D point cloud $P$ is composed of a green wall, a red wall and a blue car. The blue car is moving and has been scanned three times by the range finder sensor. Three images $I_1$, $I_2$ and $I_3$ come from a camera. The colors of the 3D points are given only as an indication; they are not known at this step.

The second phase is to compute a photo-consistency criterion. Unfortunately, the standard deviation of the pixel colors of a projected 3D point, as used by (Slabaugh et al., 2003), is not robust and is very sensitive to noise. Usually, to avoid this problem, a rectangular patch is taken tangent to the 3D surface and projected into the images. The corresponding quadrilaterals in the images are compared by a ZNCC computation (Zeng et al., 2005). In this way, the system is more robust and takes more information into account. But the IPCC algorithm takes as input the point cloud of an entire sequence, moving objects included, so the surface is not differentiable: the normal vector cannot be computed. Instead, the 3D neighborhood of a 3D point is used as a local descriptor (2.5), by projecting it into the images and collecting the corresponding pixel color vectors. In this way, no normal is needed, no assumption is made on the surface regularity, and the descriptor is invariant to rotation, translation, scale and projection.
The main difficulty is that moving objects are present in the images, so even static objects get a bad photo-consistency score whenever a moving object passes in front of them. To deal with this problem, the color vectors are compared with a robust Gaussian mixture and a principal mode extraction to obtain the photo-consistency score (2.5). A threshold is then used to classify moving objects, and all the corresponding 3D points are deleted from the scene.
All these steps are computed iteratively until all the remaining 3D points are photo-consistent. For a better understanding of the algorithm, a simple example is used to illustrate each step (figure 1). When the algorithm ends, the photo-consistent points are re-projected into all images to detect where a moving object was present. We can optionally choose to highlight the moving objects or restore the image background by inpainting. This part is developed and explained in detail in subsection 2.7.
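The sketch below summarizes the iterative structure of the method as a minimal, runnable skeleton. The per-iteration scoring (clipping, z-buffer, occurrence retrieval and photo-consistency, sections 2.2 to 2.5) is abstracted as a user-supplied callable score_fn; this function name and the dictionary-based point cloud representation are my assumptions, not the paper's.

```python
# Minimal sketch of the IPCC erosion loop (sections 2.2-2.6), under the assumption that
# the per-iteration scoring is provided as a callable.
def ipcc_erosion_loop(points, score_fn, S_e):
    """points: dict {point index: 3D point}; score_fn: callable returning {index: J_j}
    for the current cloud; S_e: erosion threshold (section 2.6)."""
    removed = []
    while True:
        scores = score_fn(points)                                   # clipping, z-buffers, occurrences, J_j
        moving = [j for j, s in scores.items() if s < S_e]          # points below the threshold are moving
        if not moving:
            return points, removed                                   # remaining cloud is photo-consistent
        removed.extend(moving)                                       # deleted points are kept for inpainting
        points = {j: p for j, p in points.items() if j not in moving}
```

A call such as `static_cloud, moving_points = ipcc_erosion_loop(cloud, my_score_fn, S_e=0.4)` would reproduce the loop described above, with the threshold value used in the experiments of section 3.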
ICPRAM2014-InternationalConferenceonPatternRecognitionApplicationsandMethods
810
Figure 2: An application of the clipping algorithm to image $I_2$. Dashed lines are outside the frustum.
2.2 Image Clipping
This first step consists in finding out which 3D points are inside the camera field of view of each image $I_k$. This phase, named clipping or frustum culling, requires a lot of computation time: if the scene contains ten million points and the camera took one hundred images, one billion projections are needed. To solve this problem, an octree is used to speed up clipping. After all 3D points have been inserted into the octree leaves, the solution given by (Fujimoto et al., 2012), which consists in computing the intersection between the camera frustum (four planes) and the cubic octree nodes, is used. If a node is intersected by a camera frustum plane, then its child nodes are explored. If the node center, projected with $\pi_k$, falls inside the image $I_k$, then its child nodes are explored as well. At the end, the algorithm returns a list of leaves $N_k$ which contains all the octree nodes seen in the k-th image. Figure 2 shows an example of clipping.
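As a rough sketch of this traversal, and under assumed conventions (a dictionary-based octree node, frustum side planes given as (normal, d) pairs with the normal pointing inside the frustum), the culling could look like the following; it is an illustration of the test described above, not the exact (Fujimoto et al., 2012) implementation.

```python
import numpy as np

# Hedged sketch of octree frustum culling (section 2.2). A node is a dict with 'center' (3,),
# 'half' (half edge length), 'children' (list of nodes, empty for leaves) and 'points'
# (indices stored in a leaf); these structures are assumptions.
def frustum_cull(node, planes, project, image_size, out):
    intersected = False
    for normal, d in planes:
        dist = float(np.dot(normal, node['center'])) + d          # signed distance of the cube center
        radius = node['half'] * float(np.sum(np.abs(normal)))     # cube extent along the plane normal
        if dist < -radius:
            return                                                 # entirely outside one plane: reject
        if dist < radius:
            intersected = True                                     # the node straddles this plane
    uv = project(node['center'])                                   # pi_k applied to the node center
    inside_image = (uv is not None and
                    0.0 <= uv[0] < image_size[0] and 0.0 <= uv[1] < image_size[1])
    if not (intersected or inside_image):
        return
    if node['children']:
        for child in node['children']:                             # explore the child nodes
            frustum_cull(child, planes, project, image_size, out)
    else:
        out.extend(node['points'])                                 # visible leaf: keep its point indices
```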
2.3 Z-buffer Computation
Once the clipping is done, we need to determine the occlusion culling for each camera pose (i.e. for each image): which object is hidden by another. Occlusion culling is very important because only points which are really visible in a camera can have a photo-consistency score. For our method, we need a pessimistic depth-buffer: if a point appears to be seen through a solid object, it will be declared non-photo-consistent and removed instantly, so we need to be careful. Moreover, the surface is not differentiable because of the presence of moving objects at different times, so computing normal vectors to the surface is not possible. To avoid normal estimation, we use a Z-buffer with a disk-splatting algorithm.
Figure 3: An example of Z-buffer for the image $I_2$. The points under the pink line are selected by this phase.

For an image $I_k$, the process is quite simple. We initialize a depth image $Z_k$ with the same size as $I_k$. For all the points $P_j$ contained in $N_k$, obtained during the clipping step, the coordinates of the projected point are computed by $(u_j, v_j) = \pi_k(P_j)$. If $P_j$ has a depth smaller than the value of $Z_k(u_j, v_j)$, then a disk is drawn, centered on $(u_j, v_j)$, with the same depth value as $P_j$ and a radius of $\frac{\rho \times f}{\text{depth}}$, where $f$ is the focal length of the camera. At the end, the algorithm returns the list of points $LP_k$ corresponding to the 3D points $P_j$ whose depth is smaller than or equal to $Z_k(u_j, v_j)$.
$\rho$ is the metric size of the disk in the disk-splatting algorithm. Its value is chosen to match the density of the acquired range data, in order to avoid holes in an object or a wall. The result on the example scene can be seen in figure 3.
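A minimal sketch of this disk-splatting pass is given below. It assumes the points have already been projected (pixel coordinates and depths are passed in), and the small tolerance used in the final selection is my addition, not a parameter of the paper.

```python
import numpy as np

# Hedged sketch of the pessimistic disk-splatting z-buffer (section 2.3).
# uv: (N, 2) pixel coordinates from pi_k; depth: (N,) depths; rho: metric disk size;
# f: focal length in pixels.
def zbuffer_select(uv, depth, width, height, rho, f, tol=1e-3):
    Z = np.full((height, width), np.inf)
    for i in range(len(depth)):
        u, v = int(round(uv[i, 0])), int(round(uv[i, 1]))
        if not (0 <= u < width and 0 <= v < height):
            continue
        if depth[i] < Z[v, u]:                                  # point is in front of the current value
            r = max(1, int(round(rho * f / depth[i])))          # projected disk radius in pixels
            y0, y1 = max(0, v - r), min(height, v + r + 1)
            x0, x1 = max(0, u - r), min(width, u + r + 1)
            yy, xx = np.mgrid[y0:y1, x0:x1]
            mask = (yy - v) ** 2 + (xx - u) ** 2 <= r * r
            patch = Z[y0:y1, x0:x1]
            patch[mask] = np.minimum(patch[mask], depth[i])     # keep the nearest depth per pixel
    keep = []                                                   # select points not hidden by the splats
    for i in range(len(depth)):
        u, v = int(round(uv[i, 0])), int(round(uv[i, 1]))
        if 0 <= u < width and 0 <= v < height and depth[i] <= Z[v, u] + tol:
            keep.append(i)
    return keep                                                  # indices forming LP_k
```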
2.4 Occurrence Retrieval
For each image, a list $LP_k$ of visible 3D points is given by the Z-buffer. In order to estimate the photo-consistency of the scene, we need to know in which images each 3D point is seen. This is the role of the occurrence list $O \doteq \{O_j\}_{j=1,..,n}$, where $O_j$ is the list of all images containing the point $P_j$. To build this structure, the lists $LP_k$ are concatenated into an array and sorted by point index. Then, the occurrences of the same point index in multiple views are regrouped into $O_j$ and sent to the photo-consistency estimation.
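As a small sketch, the occurrence list amounts to a sort followed by a group-by, assuming each $LP_k$ is a list of global point indices:

```python
from collections import defaultdict

# Hedged sketch of occurrence retrieval (section 2.4): map each point index to the images seeing it.
def build_occurrences(visible_lists):
    """visible_lists: dict {image index k: list of point indices LP_k}."""
    pairs = [(j, k) for k, lp in visible_lists.items() for j in lp]
    pairs.sort()                                      # sort by point index, as in the paper
    occurrences = defaultdict(list)                   # O_j: images where point j is seen
    for j, k in pairs:
        occurrences[j].append(k)
    # Keep only points seen in at least 3 views, as required by section 2.5.
    return {j: imgs for j, imgs in occurrences.items() if len(imgs) >= 3}
```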
2.5 Photo-consistency Estimation
In this part, to simplify indices and formulas, the photo-consistency estimation is explained for a single 3D point called $P_j$. The same computation is assumed to be done for all the points visible in at least 3 images (i.e. $\text{size}(O_j) \geq 3$).
TowardMovingObjectsDetectionin3D-LidarandCameraData
811
2.5.1 Building the Neighborhood Set V
As said previously, the 3D neighborhood of a point $P_j$ is used as a local texture descriptor to compute the photo-consistency score of $P_j$. Let $V_k \in V$ be the 3D neighborhood of the point $P_j$ among the 3D points visible in the view $I_k \in O_j$. So, for a 3D point $P_j$ and for each view $I_k$, we look for the 3D points visible in this view (i.e. belonging to $LP_k$) and lying in a neighborhood of radius $\sigma$ centered on $P_j$. This step is done by searching the list of points visible in the view $I_k$, which is exactly the list provided by the z-buffer: $LP_k$. The free parameter $\sigma$ determines the 3D sphere inside which the probability density of being a moving object is considered constant. It is chosen empirically to fit the approximate minimal size of a moving object (usually $\sigma = 0.15\,$m).
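A minimal sketch of this per-view radius search is given below; the KD-tree (scipy's cKDTree, built once over the whole cloud) is my choice of spatial index, since the paper only requires a radius query around $P_j$.

```python
from scipy.spatial import cKDTree  # tree = cKDTree(points), built once for the cloud

# Hedged sketch of building the neighborhood sets V_k (section 2.5.1).
def neighborhood_sets(points, tree, j, views_of_j, visible_in_view, sigma=0.15):
    """points: (N, 3) array; tree: cKDTree over points; views_of_j: O_j, the image indices
    where P_j is seen; visible_in_view: {k: set of point indices LP_k}."""
    around_j = tree.query_ball_point(points[j], r=sigma)          # points within radius sigma of P_j
    return {k: [i for i in around_j if i in visible_in_view[k]]   # keep only points visible in I_k
            for k in views_of_j}
```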
2.5.2 Extracting Color Vector Descriptor Set C
In a second step, for each $V_k \in V$, the 3D points in this neighborhood are projected into the corresponding image $I_k$ by $\pi_k$ to obtain a color value (via a bi-cubic interpolation), building a color descriptor vector $C_k$. The color vector $C_k$ contains, for each 3D point of the neighborhood $V_k$, a color value given by the image $I_k$. We thus obtain a set of texture descriptors for the considered point $P_j$: $C \doteq \{C_k\}_{k=1,..,n}$.
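The extraction of one such descriptor can be sketched as follows; nearest-pixel sampling is used here as a simplification of the bi-cubic interpolation mentioned above.

```python
import numpy as np

# Hedged sketch of extracting the color descriptor C_k (section 2.5.2).
def color_descriptor(points, neighbor_idx, image, project):
    """points: (N, 3) array; neighbor_idx: indices of V_k; image: (H, W, 3) array;
    project: pi_k mapping a 3D point to pixel coordinates (u, v)."""
    colors = []
    for i in neighbor_idx:
        u, v = project(points[i])
        colors.append(image[int(round(v)), int(round(u))])   # sample the pixel color (nearest pixel)
    return np.asarray(colors, dtype=float).ravel()            # C_k as a flat color vector
```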
2.5.3 Photo-consistency Criterion Computation
It is very important to understand that, since moving objects are present both in the camera images and in the 3D point cloud, the photo-consistency is corrupted for every point, static or not. A static point will have a bad photo-consistency error in the images where a moving object stands in front of it. Conversely, a 3D point acquired on a moving object is photo-consistent only in the images taken while the object was at that position. This problem is really hard to solve because the image data are biased by the moving objects, but there is a solution: a point on a static object is photo-consistent (figure 5) everywhere except in the images where a moving object stands in front of it, whereas a point on a moving object is photo-consistent only in the images where this moving object was there (figure 4). So the two types of points can be distinguished by a principal mode extraction, as shown in figure 5.
Figure 4: A point taken on a moving object is only photo-consistent in the image where it is visible ($I_2$). In the other images it is never consistent; here it has three different colors: blue, green and red.

Figure 5: A static point can be partially photo-consistent: its colors are red, red and blue. But there is a consistent principal mode: red.

To this end, a kernel density estimator is used, as a Gaussian mixture model, to find the principal mode in the set of color descriptor vectors $C$. Let $C_k \in C$ and $C_l \in C$ be two color descriptor vectors coming respectively from $I_k$ and $I_l$. An SSD score, noted $D_{k,l}$, estimates the similarity between them using the $L_2$ norm of the difference of colors. The $D_{i,j}$ are computed for all the $(i, j)$ pairs and regrouped in a distance matrix:
\[
D_c =
\begin{pmatrix}
D_{1,1} & \cdots & D_{1,n} \\
\vdots & \ddots & \vdots \\
D_{n,1} & \cdots & D_{n,n}
\end{pmatrix}
\tag{1}
\]
To extract a sample consensus from this matrix, a Gaussian kernel with a standard deviation of $\sigma$ is applied to each term of the distance matrix, each column is summed over its rows, and the column $cmax$ with the highest score is selected. This column represents the descriptor nearest to the principal mode of the estimated probability density of the descriptor distribution. The value of this mode (the maximal column sum) is our photo-consistency criterion, associated with the 3D point $P_j$:
\[
J_j = \frac{1}{n} \sum_{i=1}^{n} D_{i,cmax}.
\]
In many photo-consistency algorithms, the criterion is not robust to outliers and does not tolerate any disturbance. With this solution, the criterion is robust to outliers, deals with occlusion, and is totally symmetric from the point of view of the images: no image is used as a reference, and no preference or a priori is imposed on the image set or on the surface. The standard deviation $\sigma$ of the Gaussian kernel is chosen as $6\sigma_{ph}$, where $\sigma_{ph}$ is the standard deviation of the sensor noise (i.e. the photogrammetric noise on pixel values).
This phase is computationally intensive with millions of 3D points and hundreds of images because, for a point seen in $n$ images, $\frac{n(n-1)}{2}$ descriptor distances are computed. We therefore chose the SSD as distance because it is very fast. SSD is not robust to illumination changes, but in our case the images are taken in a short time, so the illumination is similar. As an example, for a scene with 20 images and 1 million 3D points seen in an average of 7 images per point, the total computational time, including clipping and z-buffer, is 10 s on a Quad-Core 3.2 GHz, where 80% of the time is spent in photo-consistency score estimation.
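A rough sketch of the principal mode extraction follows. Two assumptions are made that the paper does not spell out: the descriptors compared for a given point are taken to have the same length, and the Gaussian kernel is written in the usual exponential form applied to the SSD distances.

```python
import numpy as np

# Hedged sketch of the principal-mode photo-consistency criterion (section 2.5.3).
def photo_consistency_score(descriptors, sigma_kernel):
    """descriptors: list of n color vectors C_k (assumed same length).
    Returns (J_j, cmax): the criterion value and the index of the most consistent view."""
    C = np.asarray(descriptors, dtype=float)
    n = len(C)
    diff = C[:, None, :] - C[None, :, :]
    D = np.sum(diff ** 2, axis=2)                      # pairwise SSD distances D_{i,l}
    K = np.exp(-D / (2.0 * sigma_kernel ** 2))         # assumed kernel form; D is already squared (SSD)
    column_sums = K.sum(axis=0)                        # sum each column over its rows
    cmax = int(np.argmax(column_sums))                 # column closest to the principal mode
    J = column_sums[cmax] / n                          # criterion J_j, between 0 and 1
    return J, cmax
```

With this form the criterion lies between 0 and 1, which is consistent with the probability threshold $S_e = 0.4$ used in the experiments.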
2.6 Scene Erosion
Figure 6: The eroded scene after one iteration. The 3D points on the blue car are disappearing, letting the photo-consistent background appear.

The photo-consistency score $J_j$ is analyzed to decide whether a point must be removed or not. For this step, a simple threshold $S_e$ is used: all the points with a score inferior to $S_e$ are deleted from the point database and from the octree; otherwise nothing is done (figure 6). In this way, the 3D scene is eroded and the process restarts at step 2.3, until all the scene points are photo-consistent. At the end of the IPCC algorithm, the photo-consistent scene is returned and the deleted points are stored in a list. Static and moving objects are thus classified in the 3D world, but not yet in the images; the classification in the images is addressed in the next section.
2.7 Image Inpainting
After the end of the Iterative Photo-Consistency Check algorithm, the point cloud $P$ contains only static, photo-consistent points. The main idea to detect moving objects in the images is to run all the steps of the method one last time and to stop at the photo-consistency criterion step (2.5). This time, we do not want to estimate whether a 3D point is moving or not, but in which images it is not photo-consistent, and to inpaint these areas. Image inpainting is thus computed in three steps: first, a photo-consistency score is computed over all the pixels of the image $I_k$, defining a probability field $M_k$. Secondly, this probability is thresholded to obtain a binary map $B_k$. Then, all the areas of $B_k$ with value 0 are inpainted by 3D surface regression and pixel re-projection from the image $I_{cmax}$. An application on the example can be seen in figure 7.
2.7.1 Probability Field Estimation
In fact, the solution is already in the distance matrix $D_{i,j}$: after the application of the Gaussian kernel, the column with the largest sum defines the best photo-consistent color descriptor, and the probability that this best photo-consistent descriptor is photo-consistent in the image $I_k$ is given by the value $F_k = D_{k,cmax}$, where $k$ is the image index and $cmax$ is the column index of the principal mode.
So, for each point $P_j \in P$ seen in $I_k$, a photo-consistency score is available for this image. But the image is a pixel matrix, and photo-consistency is only computed on the sparse set of 3D points visible in $I_k$. It is therefore necessary to project the 3D point cloud into $I_k$ and to interpolate, at each pixel $(i, j)$, the probability of observing a photo-consistent object, to obtain a dense probability map $M_k$. To this end, the sparse data are interpolated with a robust method: the median of the nearest-neighbor values. This solution consists in finding the 3D points projected into $I_k$ near the coordinate $(i, j)$, collecting their probability values, and taking the median as the value of the pixel $(i, j)$ of $M_k$. An example of probability field $M_k$ can be seen in the experimental section 3.
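A minimal sketch of this dense interpolation is given below, using a KD-tree over the projected pixel coordinates and the 15-neighbor setting mentioned in section 2.7.2; the distance cutoff and the minimum-neighbor rule marking pixels as "unknown" are my assumptions, and the per-pixel loop is written for clarity rather than speed.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hedged sketch of the dense probability field M_k (section 2.7.1):
# median of the photo-consistency values of the nearest projected points.
def probability_field(uv, scores, width, height, k_neighbors=15, max_dist=20.0):
    """uv: (N, 2) projections of the visible 3D points in I_k; scores: (N,) values F_k."""
    tree = cKDTree(uv)
    M = np.full((height, width), np.nan)                 # NaN marks "unknown" pixels
    for v in range(height):
        for u in range(width):
            dist, idx = tree.query((u, v), k=k_neighbors)
            idx = idx[dist < max_dist]                   # ignore pixels far from any projected point
            if len(idx) >= k_neighbors // 2:
                M[v, u] = np.median(scores[idx])         # robust median interpolation
    return M
```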
2.7.2 Thresholding and Inpainting
The probability map $M_k$ is then thresholded with the same value as the 3D points in the scene erosion: $S_e$. We obtain a binary map $B_k$ where all the pixels with value 0 need to be inpainted. For each pixel $p_{i,j} \in B_k$ with value 0, we want to find a color to replace the old one. This new color can be found in another image where the scene is photo-consistent, but this choice is local: one part of the inpainted area may be better taken from an image $I_q$ and another part from $I_m$. To determine which image will be used to inpaint a pixel $(i, j)$ of image $I_k$, we look for the visible 3D point projected into $I_k$ that is nearest to the coordinate $(i, j)$. Let $P_s$ be this point; the locally best photo-consistent image is then $I_{cmax}$, where $cmax$ is the index of the column corresponding to the principal mode of the point $P_s$.

Figure 7: Principle of image inpainting. The colors $C_{k1}$ from image $I_1$ and $C_{k2}$ from $I_2$ are averaged to obtain the color $C_k$ of the point $P_k$. This color is projected into the non-consistent image $I_3$, and the image pixels are fixed.

Once the image $I_{cmax}$ is found for the pixel $p_{i,j}$, we need to determine which pixel of $I_{cmax}$ will be used to fill the pixel $(i, j)$ of $I_k$. The only solution is to back-project a ray from $(i, j)$ in image $I_k$ onto the 3D surface to obtain a virtual 3D point $P_v$, and to project it into the image $I_{cmax}$ using $\pi_{cmax}$ to find its color (via a bi-cubic interpolation). This is possible because the erosion has deleted all moving objects and left a photo-consistent scene: the 3D surface becomes differentiable, and a 3D surface estimation can be computed. To this end, we use the 3D points projected into $I_k$ nearest to the coordinate $(i, j)$, as in 2.7.1, and we compute a moving least squares (Cheng et al., 2008) regression of the depth values of this point set, using a polynomial of degree two. An interpolated depth is thus computed at $(i, j)$; the inverse projection function gives the needed virtual 3D point $P_v$, and the pixel color is locally fixed by the color of $\pi_{cmax}(P_v)$ in the image $I_{cmax}$. In all our experiments, the nearest-neighbor interpolation and the regression are done with the 15 nearest data points.
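The depth regression at the heart of this step can be sketched as follows; the Gaussian weighting scheme and the full quadratic basis in $(u, v)$ are assumptions, since the paper only states that a degree-two polynomial is fitted to the 15 nearest points in a moving least squares sense.

```python
import numpy as np

# Hedged sketch of the MLS depth regression used for inpainting (section 2.7.2).
def mls_depth(uv, depths, query_uv, bandwidth=10.0):
    """uv: (15, 2) pixel coordinates of the nearest visible 3D points projected in I_k;
    depths: (15,) their depths; query_uv: the pixel (i, j) to inpaint."""
    du = uv[:, 0] - query_uv[0]
    dv = uv[:, 1] - query_uv[1]
    # Degree-two polynomial basis, centered on the query pixel.
    A = np.stack([np.ones_like(du), du, dv, du * dv, du ** 2, dv ** 2], axis=1)
    # Moving-least-squares weights: closer samples count more (Gaussian weighting is an assumption).
    w = np.exp(-(du ** 2 + dv ** 2) / (2.0 * bandwidth ** 2))
    sw = np.sqrt(w)
    coeffs, *_ = np.linalg.lstsq(A * sw[:, None], depths * sw, rcond=None)
    return coeffs[0]        # polynomial value at the query pixel (du = dv = 0)
```

The virtual point $P_v$ would then be obtained by back-projecting the pixel $(i, j)$ at this interpolated depth with the inverse of $\pi_k$, and its color read from $I_{cmax}$.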
3 APPLICATION TO A SEQUENCE
3.1 Methodology
In order to test our algorithm and demonstrate the possibility of detecting moving objects in both camera and range data with non-synchronous sensors, our method has been applied to a synthetic dataset. We consider an outdoor scene, in front of a shop, where a pedestrian enters the camera field from the right and walks to the left. This represents the type of scene that motivated our approach, in the context of robotic applications, where a vehicle takes pictures of its environment. This sequence was generated by a realistic sensor simulator (4D-Virtualiz (Delmas, 2011; Malartre, 2011)) which produces data from a color camera and a Velodyne HDL-64E. Both sensors are mounted on a mobile vehicle going forward. The Velodyne is a 3D panoramic lidar with 64 lasers, an angular resolution of 0.09 degrees and precise depth information ($\sigma = 0.05\,$m). This sensor acquires a massive point cloud, with 1.3 million points per second all around it. The color camera has a resolution of $1024 \times 768$, a frame rate of 15 fps, and is positioned to look at the left side of the vehicle. For all these data, the sensor pose ground truth is used to realign the range data and the camera poses in the common reference system. This test sequence contains $4 \times 10^6$ 3D points acquired by the Velodyne and 52 pictures. An example image can be seen in figure 8. For this experiment, we chose a probability threshold of $S_e = 0.4$.
After the execution of our method, points are classified as static or moving. In order to analyze our results, we define four statistical indicators:
1. Pt: the number of 3D points classified as static which really belong to a static object (true positives);
2. Nt: the number of 3D points classified as moving which really belong to a moving object (true negatives);
3. Pf: the number of 3D points classified as static which do not belong to a static object (false positives);
4. Nf: the number of 3D points classified as moving which do not belong to a moving object (false negatives).
3.2 Results, Analysis and Discussion
Figure 8: Example image of the simulated sequence.

Figure 9: An image corresponding to the interpolated probability field $M_k$, in image $I_k$. Gray is moving, white is static, and black is unknown because there are not enough neighbors.

Figure 8 shows an original image from the experimental sequence. The IPCC algorithm is applied to this image set and to the corresponding point cloud to delete all non-photo-consistent 3D points. Table 1 presents information about the dataset for each iteration of the IPCC algorithm. We can see that only $263 \times 10^3$ points are selected by the clipping and the Z-buffers. These points are seen in an average of 6 images, which is sufficient to find a principal mode with our photo-consistency criterion. The first iteration removes 8314 points, the second 467, and the scene is totally photo-consistent at the 6th iteration. Table 2 presents statistical information about the point cloud classification. Our filter is very strict with static objects: the positive predictive value is equal to 99.40%, and the negative predictive value reaches 96.80%. So our filter is better at removing objects than at extracting moving objects, which is due to the higher number of false negatives.
Table 1: Algorithm statistics over iterations.

    Iteration                        1      2      3
    Occurrence list size (10^3)      217    262    263
    Average occurrences per point    6.4    6.99   7.66
    Number of removed points         8314   467    78
    Computation time (s)             12     12     11
    Image inpainting time (s): 10
Figure 10: An image corresponding to the thresholded probability field $B_k$. Gray is static, white zones need to be inpainted, and black is unknown because there are not enough neighbors.

Figure 11: An inpainted image using MLS (Moving Least Squares) interpolation of the surface and pixel color re-projection.
Table 2: Point cloud classification statistics (in 10^3 points).

    (10^3)     Positive      Negative
    True       Pt = 264      Nt = 1.6
    False      Pf = 0.1      Nf = 7.1
Figure 9 shows an example of the interpolated probability field, and Figure 10 is the thresholded version of $M_k$. In this qualitative result we can see that the detection of the person is precise, but that his feet were not detected as a moving object. The reason for this miss-detection is that the color of the floor is gray from any point of view, so the 3D points near the floor (the feet of the pedestrian) are considered photo-consistent. Figure 11 shows the image after inpainting the pixel color values with values from another image, via the MLS-interpolated surface. Classification performance and image inpainting are clearly improved in comparison to the original article (Clement Deymier, 2013a).
The IPCC algorithm complexity is evaluated as $O(n^3 m \log(m))$, where $n$ is the number of cameras and $m$ is the number of points in the scene. But if we consider that a 3D point is only seen in a small number of cameras $\tau$ (which is often the case when the mobile vehicle is acquiring), then the algorithm complexity becomes $O(\tau^2 n m \log(m))$. The method then scales to very long sequences with millions of points and hundreds of images. The number of iterations necessary to remove all non-photo-consistent 3D points depends on the volume and trajectory of the moving objects, but in our experiments, after six iterations the scene was always close to totally photo-consistent.
4 CONCLUSIONS AND FUTURE WORK
The objective of this article was to improve the photo-consistency estimation and to demonstrate the efficacy of the IPCC algorithm. It presents an original method to detect moving objects both in camera and in range data. The core of the problem, the non-synchronous acquisition of the data, is solved by using a time-independent photo-consistency criterion applied to the entire sequence. This criterion uses a principal mode extraction in the color descriptor vector space to find the static points in the sequence; the non-photo-consistent points are classified as moving and deleted from the scene. The iterative aspect of this algorithm allows detecting all the 3D points on a moving object, not only its surface. Moreover, this method is very flexible because more than one range finder or camera can be used, provided the point cloud is dense enough and the cameras are color cameras.
REFERENCES
Cheng, Z.-Q., Wang, Y.-Z., Li, B., Xu, K., Dang, G., and
Jin, S.-Y. (2008). A survey of methods for mov-
ing least squares surfaces. In Proceedings of the
Fifth Eurographics / IEEE VGTC conference on Point-
Based Graphics, SPBG’08, pages 9–23, Aire-la-Ville,
Switzerland, Switzerland. Eurographics Association.
Clement Deymier, T. C. (2013a). IPCC algorithm: Moving
object detection in 3D-lidar and camera data. In IEEE
Intelligent Vehicles Symposium.
Clement Deymier, Damien Vivet, T. C. (2013b). Non-
parametric occupancy map using millions of range
data. In IEEE Intelligent Vehicles Symposium.
Delmas, P. (2011). Génération active des déplacements d'un
véhicule agricole dans son environnement. PhD thesis,
École Doctorale Sciences Pour l'Ingénieur de Clermont-Ferrand.
Fujimoto, K., Kimura, N., and Moriya, T. (2012). Method
for fast detecting the intersection of a plane and a cube
in an octree structure to find point sets within a convex
region. In Society of Photo-Optical Instrumentation
Engineers (SPIE) Conference Series, volume 8301 of
Society of Photo-Optical Instrumentation Engineers
(SPIE) Conference Series.
Himmelsbach, M. (2008). Lidar-based 3d object percep-
tion. Proceedings of 1st International Workshop on
Cognition for Technical Systems.
Jung, B. and Sukhatme, G. S. (2004). Detecting moving
objects using a single camera on a mobile robot in an
outdoor environment. In International Conference
on Intelligent Autonomous Systems, pages 980–987.
Malartre, F. (2011). Perception intelligente pour la navigation
rapide de robots mobiles en environnement naturel. PhD thesis,
École Doctorale Sciences Pour l'Ingénieur de Clermont-Ferrand.
Royer, E., Lhuillier, M., Dhome, M., and Lavest, J. (2007).
Monocular vision for mobile robot localization and
autonomous navigation. International Journal of
Computer Vision, 74:237–260. 10.1007/s11263-006-
0023-y.
Rusinkiewicz, S. and Levoy, M. (2001). Efficient variants
of the ICP algorithm. In International Conference on
3-D Digital Imaging and Modeling.
Slabaugh, G. G., Culbertson, W. B., Malzbender, T.,
Stevens, M. R., and Schafer, R. W. (2003). Methods
for volumetric reconstruction of visual scenes. Inter-
national Journal of Computer Vision, 57:179–199.
Wurm, K. M., Hornung, A., Bennewitz, M., Stachniss, C.,
and Burgard, W. (2010). OctoMap: A probabilis-
tic, flexible, and compact 3D map representation for
robotic systems. In Proc. of the ICRA 2010 Workshop
on Best Practice in 3D Perception and Modeling for
Mobile Manipulation, Anchorage, AK, USA.
Yilmaz, A., Li, X., and Shah, M. (2004). Contour based
object tracking with occlusion handling in video ac-
quired using mobile cameras. IEEE Transactions on
Pattern Analysis and Machine Intelligence, 26:1531–
1536.
Zeng, G., Paris, S., Quan, L., and Sillion, F. (2005). Pro-
gressive surface reconstruction from images using a
local prior. In ICCV, pages 1230–1237.
Zografos, V., Nordberg, K., and Ellis, L. (2010). Sparse
motion segmentation using multiple six-point consis-
tencies. CoRR, abs/1012.2138.
ICPRAM2014-InternationalConferenceonPatternRecognitionApplicationsandMethods
816