Line-based SLAM Considering Directional Distribution of Line Features in an Urban Environment

Kei Uehara¹, Hideo Saito¹ and Kosuke Hara²

¹Graduate School of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, Kanagawa, 223-0061, Japan
²Denso IT Laboratory, Cross Tower 28F, 2-15-1 Shibuya, Shibuya-ku, Tokyo, 150-0002, Japan
{kei-uehara, saito}@hvrl.ics.keio.ac.jp, khara@d-itlab.co.jp
Keywords: Line-based SLAM, Manhattan World Assumption, Road Markings, Gaussian Distribution.
Abstract: In this paper, we propose a line-based SLAM from an image sequence captured by a vehicle, taking into account the directional distribution of line features detected in an urban environment. The proposed SLAM is based on line segments detected from objects in an urban environment, for example road markings and buildings, which are conspicuous enough to be detected reliably. We use additional constraints on the line segments to improve the accuracy of the SLAM. We assume that the angle between the direction vector of a line segment and the vehicle's direction of travel conforms to a four-component Gaussian mixture distribution. We define a new cost function considering this distribution and optimize the relative camera pose, position, and the 3D line segments by bundle adjustment. In addition, we make digital maps from the detected line segments. Our method increases the accuracy of localization and revises tilted lines in the digital maps. We implement our method in both a single-camera system and a multi-camera system. A single-camera system with our constraint achieves SLAM accuracy comparable to that of a multi-camera system without our constraint.
1 INTRODUCTION
Currently, advanced driver assistance systems (ADAS) are being actively researched. An autonomous car is an example of an ADAS that will enable people to go anywhere without any driving operation. To achieve this, the car requires localization to calculate its trajectory. (Teramoto et al., 2012) indicated that the accuracy of the localization required for practical use is from several dozen centimeters to several meters. One of the main methods for localization is to use GPS. Although POSLV, one of the high-end integrated accurate positioning systems, achieves accuracy within several dozen centimeters using an RTK-GPS receiver, it is not appropriate for use in an autonomous car. One reason is that the cost of RTK-GPS is very high. Another reason is that the GPS system depends on the strength of microwaves from the satellites; if the vehicle is in a tunnel, it cannot work.
Meanwhile, visual simultaneous localization and mapping (SLAM) has been attracting the attention of researchers because it does not have these problems. To determine its position, the system detects feature points from the environment around the vehicle using a camera as an input. When enough feature points can be detected, the accuracy of SLAM is comparable to that of a method using a laser range scanner. Since the accuracy of SLAM decreases without enough feature points, there are many studies trying to solve that problem, for example, studies that use multiple cameras or that detect not only points but also lines. However, using many feature points or lines increases the calculation cost, which is critical because an autonomous car requires real-time computation. Adding equipment also increases production costs. The less equipment attached to the vehicle, the better, as long as the accuracy of the autonomous car is maintained.
In this paper, we propose a line-based SLAM considering the directional distribution of line features in an urban environment, which is based on the Manhattan world assumption. Based on the fact that many line segments in road markings are parallel or vertical to a vehicle's direction of travel, we define the distribution as a four-component Gaussian mixture distribution. To prove our method's effectiveness, we conduct four experiments: 1) a single-camera system with line-based SLAM, 2) a multi-camera system with line-based SLAM,
3) a single-camera system with line-and-point-based SLAM, and 4) a multi-camera system with line-and-point-based SLAM. Our method achieves higher SLAM accuracy in all experiments.
In addition, we simultaneously make a digital map. Although digital maps have been used in ADAS, they must be accurate and up-to-date. Roads and road markings degrade and are repaired frequently, so the maps must be updated. Our digital map can be generated while driving on streets, and it requires only cameras; no other equipment is needed. SLAM often suffers from intrinsic or extrinsic errors, so the line segments in the map often tilt. However, this is not a problem in our method because the directional distribution is considered.
2 PREVIOUS WORKS
Our goal is to realize SLAM using the distribution of line segments. There are previous SLAM works that use lines and road markings. Therefore, we discuss line-based SLAM and SLAM using road markings separately in this section.
2.1 Line-based SLAM
There are previous works on line-based SLAM that
have taken different approaches. (Smith et al., 2006)
proposed a real-time line-based SLAM. They added
straight lines to a monocular extended Kalman filter
(EKF) SLAM, and realized the real-time system using
a new algorithm and a fast straight-lines detector that
did not insist on detecting every straight line in the
frame.
Another approach for the line-based SLAM is us-
ing lines and other features. (Koletschka et al., 2014)
proposed a method of motion estimation using points
and lines by stereo line matching. They developed a
new stereo matching algorithm for lines that was able
to deal with textured and textureless environments.
In addition, (Zhou et al., 2015) proposed a vi-
sual 6-DOF SLAM (using EKF) based on the struc-
tural regularity of building environments, which is
called the Manhattan world assumption (Coughlan
and Yuille, 1999). By introducing a constraint about buildings, the system reduced position and orientation errors.
2.2 SLAM using Road Markings
There are also previous works using road markings for vehicle SLAM. (Wu and Ranganathan, 2013) proposed a method for SLAM using road markings that were learned beforehand. They detected feature points from the learned road markings and achieved high accuracy in estimating the camera pose.

Figure 1: Examples of intra- and inter-camera correspondences: (a) on roads (inter), (b) on buildings (inter), (c) on roads (intra), (d) on buildings (intra).
(Hata and Wolf, 2014) proposed a method that robustly detects road markings for SLAM. They developed a line detector that is not affected by illumination conditions by adapting Otsu's thresholding method.
In addition, (Z. Tao et al., 2013) combined GPS, proprioceptive sensors, and road markings for SLAM.
3 SYSTEM OVERVIEW
We provide an overview of a line-based SLAM considering the distribution of road markings. We implement our method in both a single-camera system and a multi-camera system. We explain the method for the multi-camera system, because the underlying method for the single-camera and multi-camera systems is almost the same.
As a premise, our SLAM method requires wheel odometry. A relative camera pose and position in each frame are estimated from the data from the wheel sensors. Then, we obtain line segments from the input images for each frame t by the line segment detector (LSD) algorithm. To find correspondences among the detected line segments, we consider the following three cases: matching between the front-image features at frames t and t-1, matching between the rear images at frames t and t-1, and matching between the rear image at frame t and the front image at frame t-s. In the third case the viewing directions are very different, so it is especially hard to find matching segments. For a robust match, we warp the feature patches around the line segments into the other frame. Due to this warping, the perspective appearance of the patch resembles the target image. In addition, searching for the appropriate value of s for the best match enables us to find the correspondences
between a rear image and multiple front images more robustly. As long as corresponding line segments are detected, each line segment can be tracked over frames. Figure 1 shows examples of matching.
Using these corresponding line segments, 3D line segments are initially estimated using a method based on the Manhattan world assumption. After that, bundle adjustment is applied to the relative camera poses, positions, and 3D line segments to optimize them. Into the cost function of the bundle adjustment, which is used for the optimization, we introduce a new term considering the distribution of road markings, which is defined by a four-component Gaussian mixture distribution. Due to this term, the accuracy of localization improves and tilted line segments, which contain errors, in the generated map are revised.
The difference between the single-camera system and the multi-camera system is that the former does not consider the third matching case: matching between the rear image at frame t and the front image at frame t-s.
4 NOTATION
We briefly explain the notation before discussing our method in detail. First, we define the four coordinate systems $W$, $C^1_t$, $C^2_t$, and $V_t$, which indicate the world, front camera, rear camera, and vehicle coordinate systems, respectively. The relative transformation from the world coordinate system to the vehicle coordinate system at frame $t$ is expressed as $R^{vw}_t$, $T^{vw}_t$, and those from the vehicle coordinate system to the front or rear camera coordinate systems are expressed as $R^{c_1 v}$, $T^{c_1 v}$ and $R^{c_2 v}$, $T^{c_2 v}$. The relative pose and position of the front and rear cameras are calibrated beforehand. Using these notations, the projection from a world point $p = (x, y, z)^T$ to a camera point $q = (q_x, q_y, q_z)^T$ is calculated as follows:

$$q^k = \begin{pmatrix} q_x \\ q_y \\ q_z \end{pmatrix} = R^{c_k v}\left(R^{vw}_t p + T^{vw}_t\right) + T^{c_k v} \qquad (1)$$

where $k$ denotes camera 1 or camera 2; $k = 1$ indicates the front camera. Then, the projection from a camera point $q$ to an image point $u = (u, v)^T$ is calculated as follows:

$$u^k = \begin{pmatrix} u \\ v \end{pmatrix} = \pi\!\begin{pmatrix} q_x \\ q_y \\ q_z \end{pmatrix} = \begin{pmatrix} q_x / q_z \\ q_y / q_z \end{pmatrix} \qquad (2)$$
Secondly, a 3D line $L_k$ is expressed as a six-vector $(p_k^T, r_k^T)^T$. The three-vector $p_k$ is the center point of the 3D line and $r_k$ is its direction. The 2D line $l_k$, which is the projection of $L_k$ onto the image plane, is given as a four-vector $(u_k^T, d_k^T)^T$. The point $u$ is calculated in the same way as for the world point $p$, and the direction $d$ in the image plane is calculated as follows:

$$D_k = \begin{pmatrix} D_x \\ D_y \\ D_z \end{pmatrix} = R^{c_k v} R^{vw}_t r_k \qquad (3)$$

$$d_k = \begin{pmatrix} d_x \\ d_y \end{pmatrix} = \begin{pmatrix} q_z D_x - q_x D_z \\ q_z D_y - q_y D_z \end{pmatrix} \qquad (4)$$

where $D_k$ is the line direction expressed in the camera coordinate system.
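To make the notation concrete, the following minimal Python/NumPy sketch evaluates Equations (1)-(4); the extrinsic parameters and the sample point and direction are placeholder values, not calibration data from the experiments.

```python
import numpy as np

def project_point(p_w, R_vw, T_vw, R_cv, T_cv):
    """Eq. (1)-(2): world point -> camera point -> normalized image point."""
    q = R_cv @ (R_vw @ p_w + T_vw) + T_cv          # Eq. (1)
    u = np.array([q[0] / q[2], q[1] / q[2]])        # Eq. (2), pi(q)
    return q, u

def project_line_direction(r_w, q, R_vw, R_cv):
    """Eq. (3)-(4): 3D line direction -> 2D line direction in the image."""
    D = R_cv @ R_vw @ r_w                           # Eq. (3)
    d = np.array([q[2] * D[0] - q[0] * D[2],
                  q[2] * D[1] - q[1] * D[2]])       # Eq. (4)
    return d / np.linalg.norm(d)

# Toy example: identity extrinsics, a point 5 m ahead, line direction along x.
R_vw, T_vw = np.eye(3), np.zeros(3)
R_cv, T_cv = np.eye(3), np.zeros(3)
q, u = project_point(np.array([1.0, 1.5, 5.0]), R_vw, T_vw, R_cv, T_cv)
d = project_line_direction(np.array([1.0, 0.0, 0.0]), q, R_vw, R_cv)
print(u, d)
```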
Finally, the wheel odometry $x$ is expressed as follows:

$$x_{t+1} = \begin{pmatrix} x_{t+1} \\ z_{t+1} \\ \theta_{t+1} \end{pmatrix} = x_t + \Delta x_t + \varepsilon_{x_t} \qquad (5)$$

where $\Delta x_t$ denotes the relative movement from the previous time and $\varepsilon_{x_t}$ denotes the noise of $\Delta x_t$. In the proposed method, we estimate the 3-DOF motion of the vehicle coordinates $(x, z, \theta)$ in a 2D environment. When $x_0$, the initial position of the vehicle, coincides with the origin of the world coordinate system, $x_t$ corresponds to $R^{vw}_t$ and $T^{vw}_t$. We suppose that $\varepsilon_{x_t}$ is zero-mean Gaussian white noise with covariance $\Sigma_x$:

$$\varepsilon_{x_t} \sim \mathcal{N}(0, \Sigma_x) \qquad (6)$$

$$\Sigma_x = \begin{pmatrix} \sigma_x^2 & 0 & 0 \\ 0 & \sigma_z^2 & 0 \\ 0 & 0 & \sigma_\theta^2 \end{pmatrix} t \qquad (7)$$

where $\sigma_x$, $\sigma_z$, and $\sigma_\theta$ denote error variances. The covariance increases in proportion to time, supposing that $\varepsilon_{x_t}$ simply accumulates.
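As a concrete illustration of Equations (5)-(7), the sketch below propagates the 3-DOF odometry with zero-mean Gaussian noise whose covariance grows linearly with time; the variance values and motion increments are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = np.array([0.02, 0.02, 0.005])    # sigma_x, sigma_z, sigma_theta (placeholders)

def propagate(x_t, delta_x_t, t):
    """Eq. (5)-(7): x_{t+1} = x_t + dx_t + eps, with eps ~ N(0, Sigma_x * t)."""
    cov = np.diag(sigma ** 2) * t         # covariance grows in proportion to time
    eps = rng.multivariate_normal(np.zeros(3), cov)
    return x_t + delta_x_t + eps

x = np.zeros(3)                           # (x, z, theta)
for t in range(1, 11):
    x = propagate(x, np.array([0.0, 1.0, 0.0]), t)   # drive 1 m forward per frame
print(x)
```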
5 PROPOSED METHOD
The purpose of our method is to enhance the accuracy
of line-based SLAM by bundle adjustment, which
considers the distribution of the line segments. Our
system consists of two parts: line matching and bun-
dle adjustment. In this section, we explain how to
realize them individually.
5.1 Line Matching
Our method has two matching algorithms: match-
ing inter-camera correspondences and matching intra-
camera correspondences, the basis of which is warp-
ing patches based on the Manhattan world assump-
tion. We explain them individually.
5.1.1 Inter-Camera Correspondences
This method of matching is used between the front and rear images, and is not used in the single-camera system. The blue rectangle shown in Figure 2(a) is a patch, which is 20 pixels wide and has a detected line segment (shown in red) at its center. The patch is transformed into the target image plane by a warp function, which depends on the surface on which the line segment was detected. Conceivable surfaces include the front walls of buildings and the road surface, as shown in Figure 1(a) and (b). Therefore, two types of warping are performed for every line segment in this matching algorithm.
Consider first the case of line segments on the road surface: according to the Manhattan world assumption, the y-coordinate of all the points in the patch in $C^2_t$ is $h$, the height of the attached camera above the road, as shown in Figure 1(a). Therefore, the 3D point in the patch is calculated as follows:

$$p_r = \begin{pmatrix} h(u_{m,2} + \alpha)/(v_{m,2} + \beta) \\ h \\ h/(v_{m,2} + \beta) \end{pmatrix} \qquad (8)$$
We define the coordinates of a point in the patch as $u_{m,2} = (u_{m,2} + \alpha, v_{m,2} + \beta)^T$, where $u_{m,2}$ denotes the center point of the line segment and $\alpha$ and $\beta$ are the Euclidean distances from the center point.
In the case of line segments detected on the front walls of buildings, the x-coordinate of all the points in the patch in $C^2_t$ is $d_x$, as shown in Figure 1(b), under the same assumption, and the 3D point in the patch is calculated as follows:

$$p_{bf} = \begin{pmatrix} d_x \\ d_x(v_{m,2} + \beta)/(u_{m,2} + \alpha) \\ d_x/(u_{m,2} + \alpha) \end{pmatrix} \qquad (9)$$
The result of the warping is shown in Figure 2(b); the perspective appearance becomes similar to the target image (Figure 2(c)). Subsequently, we judge whether the warped patch and a line segment in the target image correspond using an error ellipse based on the EKF proposed by (Davison et al., 2007). We raster-scan the error ellipse with the warped patch and calculate a zero-mean normalized cross-correlation (ZNCC) score. At the position where the ZNCC is highest, we calculate two more values: the angle between the warped line segment and the line segment in the target image, and the distances from the endpoints of the warped line segment to the line segment in the target image. If these two values are lower than their thresholds, the pair is regarded as a correspondence.
Figure 2: (a) A rear image with a patch around the detected line (in red). (b) The warped patch converted from (a). (c) The front image with a detected line segment.
This matching method has two ambiguities: we cannot precisely distinguish whether a line segment lies on a building or on the road surface, and the precise value of $d_x$ cannot be calculated. To resolve these ambiguities, we test both calculations, for the building and for the road surface, and vary $d_x$ at regular intervals in each experiment. We then decide the correspondence based on the highest ZNCC score.
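A minimal sketch of this correspondence test follows, assuming the patches have already been warped and sampled; the `zncc` helper and the angle/endpoint-distance thresholds are illustrative placeholders (the paper does not give their values), and the raster scan over the error ellipse is abstracted into a list of candidate patches.

```python
import numpy as np

def zncc(a, b):
    """Zero-mean normalized cross-correlation between two equally sized patches."""
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def best_match(warped_patch, candidate_patches, angles_deg, endpoint_dists,
               max_angle=5.0, max_dist=3.0):
    """Pick the candidate with the highest ZNCC score, then verify the angle and
    endpoint-distance thresholds (placeholder values) before accepting it."""
    scores = [zncc(warped_patch, c) for c in candidate_patches]
    i = int(np.argmax(scores))
    if angles_deg[i] < max_angle and endpoint_dists[i] < max_dist:
        return i, scores[i]          # accepted correspondence
    return None, scores[i]           # rejected despite the best ZNCC
```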
5.1.2 Intra-Camera Correspondences
This method of matching is used between pairs of front images or pairs of rear images. There are three conceivable surfaces on which the line segments can be detected in this method: two of the three are the same as for the inter-camera correspondences, and the last is the side walls of buildings, as shown in Figure 1(c) and (d).
The 3D point in the patch is calculated as follows:

$$p_{bs} = \begin{pmatrix} d_z(u_{m,1} + \alpha) \\ d_z(v_{m,1} + \beta) \\ d_z \end{pmatrix} \qquad (10)$$

Under the Manhattan world assumption, the z-coordinate of all the 3D points in the patch is $d_z$, as shown in Figure 1(d), so Equation (10) can be defined. As with the inter-camera correspondences, we vary $d_z$ at regular intervals and find the best value for matching.
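The three back-projection hypotheses of Equations (8)-(10) can be written compactly as below; here u and v stand for the normalized image coordinates of a patch point (the center plus the α and β offsets), and the sweep range for the candidate wall distances is a placeholder mirroring the regular-interval search described above.

```python
import numpy as np

def backproject_road(u, v, h):
    """Eq. (8): point on the road plane y = h (camera height above the road)."""
    return np.array([h * u / v, h, h / v])

def backproject_building_front(u, v, d_x):
    """Eq. (9): point on a building front wall x = d_x."""
    return np.array([d_x, d_x * v / u, d_x / u])

def backproject_building_side(u, v, d_z):
    """Eq. (10): point on a building side wall z = d_z."""
    return np.array([d_z * u, d_z * v, d_z])

# Sweep candidate side-wall distances at a regular interval (placeholder values).
candidates = [backproject_building_side(0.2, -0.1, d_z)
              for d_z in np.arange(2.0, 20.0, 1.0)]
```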
5.1.3 The Matching Result
The results of matching are shown in Figure 3.
Figure 3: Examples of matching using our method (a)-(f) and LEHF (Hirose and Saito, 2012) (g)-(l). (a)-(d) and (g)-(j) are matched pairs of front and rear images; (e), (f), (k), and (l) are matched pairs of front images. Red lines and green lines show correct and incorrect matches, respectively.
Figures 3(a) to (d) show inter-camera correspondences obtained with our method, and Figures 3(g) to (j) show correspondences obtained with LEHF (Hirose and Saito, 2012). Although the number of detected line segments is not large, the matching accuracy clearly increases with our method.
Figures 3(e) and (f) show intra-camera correspondences obtained with our method, and Figures 3(k) and (l) show correspondences obtained with LEHF (Hirose and Saito, 2012). Their accuracy is almost the same.
Using our method, we can classify every matched line segment as lying on a building wall or on the road, which LEHF cannot do. This place information is important for the next step.
5.2 Initial Estimation of 3D Line
We use two methods to estimate the 3D line segments, which are explained individually in this section.
First, we estimate line segments using a method based on the line of intersection of the planes that pass through the camera centers and the line segment. Figure 4(a) shows an example of a road map generated from the 3D information calculated by this method; most of the segments lie in the wrong place. The reason is that when a line segment runs parallel to the travel direction, the angle between the planes passing through the segment becomes too small, so the 3D information contains large errors.
Secondly, we estimate the line segments using Equations (8), (9), and (10), as explained in Section 5.1. With this method, we can obtain several 3D candidates per line segment, because only one observation is required to calculate the 3D information and each line segment has several corresponding segments. Therefore, we choose one candidate by the minimum median method. The element used in the minimum median method is the reprojection error, which is calculated from the perpendicular distances from a reprojected 3D line to the endpoints of a detected line in the image plane. This method is widely used and is defined by (Bartoli and Sturm, 2005). Figure 4(b) shows a result of this method, which is clearly better than the first method.
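A minimal sketch of this minimum-median selection under the reprojection-error criterion of (Bartoli and Sturm, 2005) is shown below; `reproject_line` is a hypothetical stand-in for projecting a candidate 3D line into an image with Equations (1)-(4).

```python
import numpy as np

def point_line_distance(pt, line_pt, line_dir):
    """Perpendicular distance from a 2D point to a 2D line (point + unit direction)."""
    diff = pt - line_pt
    return abs(diff[0] * line_dir[1] - diff[1] * line_dir[0])

def select_by_minimum_median(candidates, observations, reproject_line):
    """candidates: 3D line hypotheses; observations: ((g1, g2), pose) per frame.
    reproject_line(candidate, pose) -> (line_pt, line_dir) in the image (hypothetical)."""
    best, best_med = None, np.inf
    for cand in candidates:
        errs = []
        for (g1, g2), pose in observations:
            line_pt, line_dir = reproject_line(cand, pose)
            errs += [point_line_distance(g1, line_pt, line_dir),
                     point_line_distance(g2, line_pt, line_dir)]
        med = float(np.median(errs))
        if med < best_med:
            best, best_med = cand, med
    return best, best_med

# Tiny demo with a stand-in reprojection (identity pose, line stored as (pt, dir)).
reproject = lambda cand, pose: cand
obs = [((np.array([0.0, 0.1]), np.array([1.0, -0.1])), None)]
cands = [(np.array([0.0, 0.0]), np.array([1.0, 0.0])),
         (np.array([0.0, 0.5]), np.array([1.0, 0.0]))]
print(select_by_minimum_median(cands, obs, reproject))
```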
5.3 Optimization by Bundle Adjustment
We optimize the relative camera poses and positions ($R^{vw}_t$, $T^{vw}_t$) and the 3D line segments $L_j$ by bundle adjustment. We define the set of corresponding line segments as:

$$\Omega = \{\omega_i = (t, k, j, p) \mid t \in \{1, \dots, T\},\ k \in \{1, 2\},\ j \in \{1, \dots, J\},\ p \in \{1, 2, 3\}\} \qquad (11)$$

where the $i$th element indicates that a 3D line $j$ is observed at place $p$ by camera $k$ at frame $t$. In our method, the objectives are to minimize the reprojection errors of all the line segments and to minimize the angle errors of the line segments observed on the road surface. The cost function is defined as follows:

$$E = e_l^2 + e_\theta^2 = \frac{1}{2}\sum_i \sum_{n=1}^{2} d_\perp^2(g^n_i, l_i) + \frac{1}{2\sigma^2}\sum_m \left(e^m_\theta\right)^2 \qquad (12)$$
Figure 4: Examples of road maps (axes: x [m], z [m]). (a) Standard initial estimation. (b) Our initial estimation. (c) Without the constraint. (d) With the constraint.
where $d_\perp(a, b)$ denotes the perpendicular distance from a point $a$ to a line $b$ in the image, $g^1_i$ and $g^2_i$ denote the endpoints of the observed line segment, and $e_\theta$ denotes the angle error, which is calculated from the difference between the travel direction of the vehicle and the direction vector of the line segment. The objective is to find the values of $R^{vw}_t$, $T^{vw}_t$, and $L_j$ that minimize $E$. We use the iterative non-linear Levenberg-Marquardt optimization algorithm with numerical differentiation, based on the method proposed by (Madsen et al., 1999). We explain the calculation of the reprojection error and the angle error individually.
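To illustrate how the two terms of Equation (12) interact inside a Levenberg-Marquardt solver, the toy sketch below uses SciPy's `least_squares` as a stand-in for the solver described above. It fits a single 2D line segment to observed endpoints while an angle residual pulls it toward the nearest parallel/vertical direction; the weight, the data, and the wrapped-angle selection of µ are simplifications of Equations (13)-(24), not the authors' implementation.

```python
import numpy as np
from scipy.optimize import least_squares

SIGMA = 0.1  # weight of the angle term (placeholder, cf. Eq. (12))

def residuals(params, observed_endpoints):
    """Stack Eq. (12)-style residuals for one 2D line given by its endpoints:
    endpoint fitting errors plus an angle error pulling the line toward the
    nearest of {0, +pi/2, -pi/2}."""
    p = params.reshape(2, 2)                          # two 2D endpoints
    point_res = (p - observed_endpoints).ravel()      # "reprojection" part
    d = p[1] - p[0]
    theta = np.arctan2(d[1], d[0])
    theta = (theta + np.pi / 2) % np.pi - np.pi / 2   # wrap to [-pi/2, pi/2)
    mu = min((0.0, np.pi / 2, -np.pi / 2), key=lambda m: abs(theta - m))
    return np.concatenate([point_res, [(theta - mu) / SIGMA]])

# A slightly tilted, roughly vertical line segment.
obs = np.array([[0.0, 0.0], [0.3, 5.0]])
sol = least_squares(residuals, obs.ravel(), args=(obs,), method='lm')
print(sol.x.reshape(2, 2))   # endpoints pulled toward an exactly vertical line
```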
5.3.1 Reprojection Error
We use the following equation to calculate $d_\perp(g^n_i, l_i)$:

$$e^i_l = d_\perp(g^n_i, l_i) = \frac{d_y\,(g^i_x - u_i) - d_x\,(g^i_y - v_i)}{\sqrt{d_x^2 + d_y^2}} \qquad (13)$$
To solve the bundle adjustment, we build Jacobian matrices from the derivatives of Equation (13) with respect to the relative camera poses and positions ($R^{vw}_t$, $T^{vw}_t$) and the 3D line segments $L_j$. Using the chain rule, the derivatives are expressed as follows:

$$\frac{\partial e^i_l}{\partial R^{vw}_t} = \frac{\partial e^i_l}{\partial q_i} \frac{\partial q_i}{\partial R^{vw}_t} \qquad (14)$$

$$\frac{\partial e^i_l}{\partial T^{vw}_t} = \frac{\partial e^i_l}{\partial q_i} \frac{\partial q_i}{\partial T^{vw}_t} \qquad (15)$$

$$\frac{\partial e^i_l}{\partial p_j} = \frac{\partial e^i_l}{\partial q_i} \frac{\partial q_i}{\partial p_j} \qquad (16)$$

$$\frac{\partial e^i_l}{\partial r_j} = \frac{\partial e^i_l}{\partial D_i} \frac{\partial D_i}{\partial r_j} \qquad (17)$$

where $q$ and $D$ are defined in Equations (1) and (4), respectively.
Figure 5: A schematic diagram of $d_\theta$.
In addition, we include a geometric constraint in the cost function to enhance the accuracy of the optimization. Under the Manhattan world assumption, the y-coordinate of the line segments detected on the road surface is exactly $h$, so we keep it constant during bundle adjustment. Since $d_x$ and $d_z$ are ambiguous, we do not incorporate the corresponding constraints. Figures 4(c) and (d) show the mapping results after bundle adjustment without and with the constraint, respectively. The figures show that the constraint works well.
5.3.2 Angle Error
First, $d_\theta$, shown in Figure 5, is the angle between the travel direction and the direction vector of the 3D line segment. It can take a positive or a negative value; the maximum value is $\frac{\pi}{2}$ and the minimum value is $-\frac{\pi}{2}$. It is calculated as follows:

$$d_\theta = \arctan\frac{F_z}{F_x} - \arctan\frac{r_z}{r_x} \qquad (18)$$
$$\left((F_x \ge 0 \text{ and } F_z \ge 0) \text{ or } (F_x \le 0 \text{ and } F_z \le 0)\right)$$

$$d_\theta = \arctan\frac{F_z}{F_x} - \arctan\frac{r_z}{r_x} - \pi \qquad (19)$$
$$\left((F_x \ge 0 \text{ and } F_z \le 0) \text{ or } (F_x \le 0 \text{ and } F_z \ge 0)\right)$$

where $F = (F_x, F_y, F_z)$ denotes the vehicle's direction of travel. To keep the sign of $d_\theta$ consistent, we subtract $\pi$ from $d_\theta$ in Equation (19).
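Under our reading, the quadrant handling of Equations (18)-(19) can be summarized by an equivalent wrapped-angle formulation; the sketch below computes $d_\theta$ from the travel direction F and a line direction r projected onto the x-z plane, with placeholder values.

```python
import numpy as np

def angle_to_travel_direction(F, r):
    """d_theta: signed angle between the vehicle travel direction F and the 3D line
    direction r, both projected onto the x-z plane, wrapped into [-pi/2, pi/2).
    This wrapped form is an equivalent reformulation of Eqs. (18)-(19)."""
    d_theta = np.arctan2(F[2], F[0]) - np.arctan2(r[2], r[0])
    return (d_theta + np.pi / 2) % np.pi - np.pi / 2

print(angle_to_travel_direction(np.array([0.0, 0.0, 1.0]),    # driving along +z
                                np.array([0.05, 0.0, 1.0])))  # nearly parallel lane line
```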
Figure 6: A four-component Gaussian mixture distribution.

In our method, we add a new constraint on $d_\theta$. As discussed in Section 1, we suppose that most road markings are parallel or vertical to the vehicle's direction of travel. Although some markings include diagonal lines, many markings consist of parallel or vertical lines, for example lane lines and crosswalk markings. In the case of a parallel line, we assume that $d_\theta$ conforms to a Gaussian distribution. In the case of a vertical line, we assume that $d_\theta \pm \frac{\pi}{2}$ conforms to a Gaussian distribution. Figure 6 shows their distributions. Considering the diagonal lines, we define one more Gaussian distribution; it prevents the diagonal lines from being corrected to parallel or vertical lines. The equations can be expressed as follows:
$$P_1\!\left(d^m_\theta \mid \mu = 0, \sigma = \sigma_\alpha\right) = \frac{1}{\sqrt{2\pi\sigma_\alpha^2}} \exp\!\left(-\frac{1}{2\sigma_\alpha^2}\left(d^m_\theta\right)^2\right) \qquad (20)$$

$$P_2\!\left(d^m_\theta \mid \mu = \tfrac{\pi}{2}, \sigma = \sigma_\alpha\right) = \frac{1}{\sqrt{2\pi\sigma_\alpha^2}} \exp\!\left(-\frac{1}{2\sigma_\alpha^2}\left(d^m_\theta - \tfrac{\pi}{2}\right)^2\right) \qquad (21)$$

$$P_3\!\left(d^m_\theta \mid \mu = -\tfrac{\pi}{2}, \sigma = \sigma_\alpha\right) = \frac{1}{\sqrt{2\pi\sigma_\alpha^2}} \exp\!\left(-\frac{1}{2\sigma_\alpha^2}\left(d^m_\theta + \tfrac{\pi}{2}\right)^2\right) \qquad (22)$$

$$P_4\!\left(d^m_\theta \mid \mu = 0, \sigma = \sigma_\beta\right) = \frac{1}{\sqrt{2\pi\sigma_\beta^2}} \exp\!\left(-\frac{1}{2\sigma_\beta^2}\left(d^m_\theta\right)^2\right) \qquad (23)$$

The numbers in the legend of Figure 6 correspond to these equations.
Based on these equations, the angle cost is calculated as follows:

$$e^m_\theta = d_\theta - \mu \qquad (24)$$

This equation indicates that if a line segment is vertical but slightly tilted, it is corrected toward an exact vertical line, and if it is parallel but slightly tilted, it is corrected toward an exact parallel line. The value of $\mu$ is decided by the values of $P(d_\theta)$; for example, if $P_2(d_\theta)$ is greater than the other $P$ values, $\mu$ is $\frac{\pi}{2}$.
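A minimal sketch of this µ selection from the four components of Equations (20)-(23) and the residual of Equation (24) follows. The σ values are placeholders, and returning the chosen component's σ reflects one plausible reading, namely that the broad fourth component downweights diagonal lines rather than pulling them toward 0 or ±π/2.

```python
import numpy as np

SIGMA_ALPHA, SIGMA_BETA = 0.05, 0.5       # placeholder spreads for Eqs. (20)-(23)
COMPONENTS = [(0.0, SIGMA_ALPHA),          # P1: parallel
              (np.pi / 2, SIGMA_ALPHA),    # P2: vertical (+)
              (-np.pi / 2, SIGMA_ALPHA),   # P3: vertical (-)
              (0.0, SIGMA_BETA)]           # P4: broad component for diagonal lines

def gaussian(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)

def angle_residual(d_theta):
    """Eq. (24): choose mu from the most likely component, return the residual and
    the component's sigma (used to weight the residual in the cost)."""
    densities = [gaussian(d_theta, mu, s) for mu, s in COMPONENTS]
    mu, sigma = COMPONENTS[int(np.argmax(densities))]
    return d_theta - mu, sigma

print(angle_residual(0.03))               # near-parallel: mu = 0, tight sigma
print(angle_residual(np.pi / 2 - 0.04))   # near-vertical: mu = pi/2, tight sigma
print(angle_residual(0.7))                # diagonal: broad component, weak pull
```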
As with the reprojection error, we build a Jacobian matrix. The derivatives are expressed as follows:

$$\frac{\partial e^j_\theta}{\partial r} = \left(\frac{\partial e^j_\theta}{\partial r_x},\ \frac{\partial e^j_\theta}{\partial r_y},\ \frac{\partial e^j_\theta}{\partial r_z}\right) \qquad (25)$$

$$\frac{\partial e^j_\theta}{\partial r_x} = \frac{r_z}{r_x^2 + r_z^2} \qquad (26)$$

$$\frac{\partial e^j_\theta}{\partial r_y} = 0 \qquad (27)$$

$$\frac{\partial e^j_\theta}{\partial r_z} = -\frac{r_x}{r_x^2 + r_z^2} \qquad (28)$$

In addition, we obtain the best values of $\sigma_\alpha$ and $\sigma_\beta$ by changing them at regular intervals for the best optimization in each experiment.
6 EXPERIMENT
In this section, we present a practical experiment using a real vehicle driving in an urban environment. Two cameras and an RTK-GPS, which provides highly accurate self-positioning, are attached to the vehicle. The frame rate of the cameras is 10 fps, and we prepare two datasets: one is a straight scene with 72 frames and the other is a curved scene with 200 frames. We use the GPS as the ground truth. We evaluate the accuracy of localization and mapping individually.
6.1 Evaluation of Localization
In our experiment, we apply our method to four cases: 1) a single-camera system with line-based SLAM (called line (S)), 2) a single-camera system with point-and-line-based SLAM (point-line (S)), 3) a multi-camera system with line-based SLAM (line (M)), and 4) a multi-camera system with point-and-line-based SLAM (point-line (M)). We check how much the accuracy of localization improves when the directional distribution of road markings is considered in each case.
Figure 7 shows the trajectories of the vehicle in the case of line (S) for the two datasets. We compare four types of data in each case: the ground truth, the odometry, the optimized data without the directional constraint, and the optimized data with the directional constraint. Close-up area (1) of Figures 7(a) and (b) shows that the optimized data with the constraint are the closest to the ground truth, with higher accuracy than without the constraint. However, area (2) of Figure 7(a) shows that the accuracy decreases when the constraint is added.
Figure 7: The trajectory results obtained from datasets 1 and 2 in the line (S) experiment: (a) dataset 1, (b) dataset 2. Comparison of the ground truth, odometry, optimized data without the constraint, and optimized data with the constraint.

Figure 8: Comparison of position error in each frame (dataset 1): (a) line (S), (b) line (M), (c) line-point (S), (d) line-point (M).

Figure 9: Comparison of position error in each frame (dataset 2): (a) line (S), (b) line (M), (c) line-point (S), (d) line-point (M).
Table 1: The sum of the position error and the improvement rate ("without"/"with" refer to the directional constraint).

                    Dataset 1                              Dataset 2
                    Without [m]  With [m]  Improv. [%]     Without [m]  With [m]  Improv. [%]
  line (S)            13.57        6.52       52.0           176.11      107.32      39.1
  line (M)             6.22        5.54       10.9           205.56       65.16      68.3
  line-point (S)       8.45        4.32       48.9           186.18      125.96      32.3
  line-point (M)       2.62        1.58       39.7           267.90      129.20      51.8
Figure 10: Digital maps made from detected line segments: (a) dataset 1, (b) dataset 2.
Table 2: The rate of inlier lines ("without"/"with" refer to the directional constraint).

                    Dataset 1                              Dataset 2
                    Without [%]  With [%]  Change [%]      Without [%]  With [%]  Change [%]
  line (S)             28.5        47.5      +19.0            40.1        35.9      -4.2
  line (M)             35.1        41.6       +6.5            39.6        36.9      -2.7
  line-point (S)       39.1        49.7      +10.6            41.5        33.4      -8.1
  line-point (M)       49.6        52.8       +3.2            42.2        39.1      -3.1
Figure 11: Mapping results for line (S): (a) without the constraint, (b) with the constraint. Red lines indicate inlier lines and green lines indicate outlier lines.
The reason is that the vehicle cannot detect enough line segments; in fact, there are few line segments in that area.
Figures 8 and 9 provide a quantitative analysis of the localization accuracy. They show the position errors, i.e., the perpendicular distances to the ground truth. In addition, Table 1 shows the sum of the position errors in each case and the rate of improvement in percent. The accuracy improves in all experiments when the constraint is considered. For dataset 1, the accuracy improves by about 50% in the single-camera experiments. The accuracy of line (S) with the constraint is comparable to that of line (M) without it. For dataset 2, the accuracy improves especially in the multi-camera experiments. Since our SLAM method is based on the Manhattan world assumption, it is less suited to the curved scene, where the assumption does not hold well. However, by using the
directional constraint, the error generated in the curved scene is revised. This may be a reason for the high improvement rate seen in dataset 2.
6.2 Evaluation of Mapping
The detected line segments can be used to make digital maps. Figure 10 shows the resulting digital maps. For a quantitative analysis, we judge whether each generated line segment is an inlier or an outlier. Inlier lines are defined as those whose endpoints lie within 100 mm (perpendicular distance) of the corresponding line in a professionally produced map. Table 2 shows the rate of inliers in each experiment. The results for dataset 1 become more accurate, while those for dataset 2 get worse. The reason is that dataset 2 contains more diagonal lines, which are wrongly corrected to parallel or vertical lines. Dataset 1 has many vertical and parallel lines, so the rate of inliers increases in every
experiment. Figure 11 shows whether the line segments in the digital map are inliers or outliers for line (S) of dataset 1. Red lines indicate inlier lines and green lines indicate outlier lines. Tilted lines in Figure 11(a) are revised, so they change to inlier (red) lines in Figure 11(b).
7 CONCLUSIONS
In this paper, we propose a line-based SLAM considering the directional distribution of line features in an urban environment. We regard the directional distribution of road markings as a combination of Gaussian distributions and add a new constraint to the cost function of bundle adjustment. In a practical experiment, we show that the accuracy of SLAM improves in all cases. With our method, the single-camera SLAM is as accurate as the multi-camera SLAM. In addition, we make digital maps from the detected line segments. Tilted lines are revised by our method, but diagonal lines are badly corrected in some cases. We will improve our method to handle such cases.
REFERENCES
Bartoli, A. and Sturm, P. (2005). Structure-from-motion using lines: representation, triangulation, and bundle adjustment. In Computer Vision and Image Understanding. Elsevier.
Coughlan, J. M. and Yuille, A. L. (1999). Manhattan world:
Compass direction from a single image by bayesian
inference. In Computer Vision, 1999. The Proceed-
ings of the Seventh IEEE International Conference on.
IEEE.
Davison, A. J., Reid, I. D., Molton, N. S., and Stasse, O.
(2007). Monoslam: Real-time single camera slam.
In Pattern Analysis and Machine Intelligence, IEEE
Transactions on. IEEE.
Hata, A. and Wolf, D. (2014). Road marking detection using lidar reflective intensity data and its application to vehicle localization. In Intelligent Transportation Systems (ITSC), 2014 IEEE 17th International Conference on. IEEE.
Hirose, K. and Saito, H. (2012). Fast line description for
line-based slam. In Proceedings of the British Ma-
chine Vision Conference. BMVA.
Koletschka, T., Puig, L., and Daniilidis, K. (2014).
Mevo: Multi-environment stereo visual odometry us-
ing points and lines. In Intelligent Robots and Systems
(IROS), 2014. IEEE.
Madsen, K., Bruun, H., and Tingleff, O. (1999). Methods
for non-linear least squares problems. In Informatics
and Mathematical Modelling, Technical University of
Denmark. Citeseer.
Smith, P., Reid, I. D., and Davison, A. J. (2006). Real-time
monocular slam with straight lines. In Proceedings of
the British Machine Vision Conference. BMVC.
Teramoto, E., Kojima, Y., Meguro, J., and Suzuki, N.
(2012). Development of the “precise” automotive in-
tegrated positioning system and high-accuracy digital
map generation. In R&D Review of Toyota CRDL.
IEEE.
Wu, T. and Ranganathan, A. (2013). Vehicle localization using road markings. In Intelligent Vehicles Symposium (IV), 2013 IEEE. IEEE.
Zhou, H., Zou, D., Pei, L., Ying, R., Liu, P., and Yu, W.
(2015). Structslam: Visual slam with building struc-
ture lines. In Vehicular Technology, IEEE Transac-
tions on.
Tao, Z., Bonnifait, P., Fremont, V., and Ibanez-Guzman, J. (2013). Mapping and localization using GPS, lane markings and proprioceptive sensors. In Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on. IEEE.