Handling Delayed Fusion in Vision-Augmented Inertial Navigation
Ehsan Asadi and Carlo L. Bottasso
Department of Aerospace Engineering, Politecnico di Milano, Milano, Italy
Keywords:
Delayed Fusion, Vision-Augmented INS, State Estimation.
Abstract:
In this paper we consider the effects of the delay caused by real-time image acquisition and feature tracking in
a previously documented Vision-Augmented Inertial Navigation System. First, the paper illustrates how the
delay caused by image processing, if not explicitly taken into account, can lead to an appreciable performance
degradation of the estimator. Next, three different existing methods of delayed fusion are considered and
compared. Simulations and Monte Carlo analyses are used to assess the estimation error and computational
effort of the various methods. Finally, the best performing formulation is identified, which properly handles the
fusion of delayed measurements in the estimator without increasing the time burden of the filter.
1 INTRODUCTION
In this paper we consider the handling of delay asso-
ciated with tracked feature points within a previously
documented Vision-Augmented Inertial Navigation
System (VA-INS) (Bottasso and Leonello, 2009).
Navigation approaches often use vision systems,
since these are among the most information-rich sen-
sors for autonomous positioning and mapping pur-
poses (Bonin-Font et al., 2008). Vision-based
navigation systems have been used in numerous
applications, such as Autonomous Ground Vehicles
(AGV) and underwater environments (Dalgleish
et al., 2005). Recently, they have been gaining in-
creased attention also in the field of Unmanned Aerial
Vehicles (UAV) (Liu and Dai, 2010). Vision sys-
tems provide long range, high resolution measure-
ments with low power consumption and low cost. On
the other hand, they are usually associated with rather
low sample rates, since they often require complex
processing of the acquired images, and this limits and
hinders their usability in fast and real-time applica-
tions such as UAVs.
Several attempts have already been documented in
the design and implementation of robust visual odom-
etry systems (Nister et al., 2006; Goedeme et al.,
2007). Some authors have proposed the incorporation
of inertial measurements as model inputs (Roumeli-
otis et al., 2002) or states (Qian et al., 2001; Veth
et al., 2006; Mourikis and Roumeliotis, 2007), us-
ing variants of the Kalman filtering approach to ro-
bustly estimate the vehicle motion. The VA-INS of
Bottasso and Leonello (2009) combined in a syner-
gistic way vision-based sensors together with classi-
cal inertial navigation ones. The method made use of
an Extended Kalman Filter (EKF), assuming that all
measurements were available with no delay.
However, latency due to the extraction of infor-
mation from images in real-time applications is one
of the factors affecting accuracy and robustness of
vision-based navigation systems (Jones and Soatto,
2011). Since image processing procedures required
for tracking feature points between stereo images and
across time steps are time consuming tasks, visual
observations are generated with delay. If delays are
small or the estimation is performed off-line, then the
use of a classic filtering approach leads to acceptable
results. Otherwise, the quality of the estimates is af-
fected by the magnitude of the delay. Consequently, it
becomes important to understand how to account for
such delay in a consistent manner, without at the same
time excessively increasing the computational burden
of the filter.
Measurement delay has been the subject of nu-
merous investigations, for example in the context of
systems requiring long time visual processing (Porn-
sarayouth and Wongsaisuwan, 2009). If the delay
is rather small, a simple solution is to ignore it, but
this implies that the estimates are not optimal and
their quality may be affected. Another straightfor-
ward method to handle delay is to completely recal-
culate the filter during the delay period as measure-
ments arrive. Usually this method cannot be used in
practical applications because of its large storage cost
and computational burden. In chemical and biochem-
ical processes, methods have been proposed based
on the augmentation of the states (Gopalakrishnan
et al., 2011; Tatiraju et al., 1999). Other documented
methods fuse delayed measurements as they arrive
(Alexander, 1991; Larsen et al., 1998). These meth-
ods are effectively implemented in tracking and nav-
igation systems for handling delays associated with
the Global Positioning System (GPS).
The aim of this paper is to present a modification
of the VA-INS of Bottasso and Leonello (2009), based
on a delayed fusion process. In the new formulation,
tracked feature points associated with delay are in-
corporated as delayed measurements in a multi-rate
multi-sensor data fusion process using a non-linear
estimator. More specifically, the paper:
- analyzes the effects of delay caused by image processing on state estimation, when such delay is not explicitly accounted for in the estimator;
- considers implementation issues and assesses the performance of three existing delayed fusion methods to incorporate delayed vision-based measurements in the estimator;
- assesses the quality of the various formulations and identifies the most promising one, in terms of the computational burden of the filter and of the quality of its estimates, using simulation experiments and Monte Carlo analysis.
2 VISION-AUGMENTED
INERTIAL NAVIGATION
Bottasso and Leonello (2009) proposed a VA-INS to
achieve higher precision in the estimation of the vehi-
cle motion. Their implementation used relatively low
resolution and relatively high noise low-cost small-
size cameras, that can be mounted on-boardsmall Ro-
torcraft Unmanned Aerial Vehicles (RAUVs). In this
approach, the sensor readings of a standard inertial
measurement unit (a triaxial accelerometer and gyro,
a triaxial magnetometer, a GPS and a sonar altimeter)
are fused within an EKF together with the outputs of
so-called vision-based motion sensors. The overall
architecture of the system is briefly reviewed here.
2.1 Kinematics
The sensor configuration and reference frames used
in the kinematic modeling of the system are depicted
in Fig. 1.
The inertial frame of reference is centered at point O and denoted by a triad of unit vectors $\mathcal{E} \doteq (\boldsymbol{e}_1, \boldsymbol{e}_2, \boldsymbol{e}_3)$, pointing North, East and down (NED navigational system). A body-attached frame has origin in the generic material point B of the vehicle and has a triad of unit vectors $\mathcal{B} \doteq (\boldsymbol{b}_1, \boldsymbol{b}_2, \boldsymbol{b}_3)$.

Figure 1: Reference frames and location of sensors.

The components of the acceleration in the body-attached frame are sensed by an accelerometer located at point A on the vehicle. The accelerometer yields a reading $\boldsymbol{a}_{acc}$ affected by noise $\boldsymbol{n}_{acc}$:

$\boldsymbol{a}_{acc} = \boldsymbol{g}^B - \boldsymbol{a}^B_A + \boldsymbol{n}_{acc}$. (1)
In this expression, $\boldsymbol{g}^B$ indicates the body-attached components of the acceleration of gravity, where $\boldsymbol{g}^B = \boldsymbol{R}^T \boldsymbol{g}^E$ with $\boldsymbol{g}^E = (0, 0, g)^T$, while $\boldsymbol{R} = \boldsymbol{R}(\boldsymbol{q})$ are the components of the rotation tensor which brings triad $\mathcal{E}$ into triad $\mathcal{B}$.
Gyroscopes measure the body-attached components of the angular velocity vector, yielding a reading $\boldsymbol{\omega}_{gyro}$ affected by a noise disturbance $\boldsymbol{n}_{gyro}$:

$\boldsymbol{\omega}_{gyro} = \boldsymbol{\omega}^B + \boldsymbol{n}_{gyro}$. (2)
The kinematic equations, describing the motion of the body-attached reference frame with respect to the inertial one, can be written as

$\dot{\boldsymbol{v}}^E_B = \boldsymbol{g}^E - \boldsymbol{R}\,[\boldsymbol{a}_{acc} + \boldsymbol{\omega}^B \times \boldsymbol{\omega}^B \times \boldsymbol{r}^B_{BA} + \boldsymbol{\alpha}^B \times \boldsymbol{r}^B_{BA}] + \boldsymbol{R}\,\boldsymbol{n}_{acc}$, (3a)
$\dot{\boldsymbol{\omega}}^B = \boldsymbol{\alpha}_h(\boldsymbol{\omega}_{gyro} + \boldsymbol{n}_{gyro})$, (3b)
$\dot{\boldsymbol{r}}^E_{OB} = \boldsymbol{v}^E_B$, (3c)
$\dot{\boldsymbol{q}} = \boldsymbol{T}(\boldsymbol{\omega}^B)\,\boldsymbol{q}$, (3d)
where $\boldsymbol{v}^E_B$ is the velocity of point B, $\boldsymbol{\omega}^B$ is the angular velocity and $\boldsymbol{\alpha}^B$ the angular acceleration, while $\boldsymbol{r}^B_{BA}$ is the position vector from point B to point A and $\boldsymbol{r}^E_{OB}$ is from point O to point B. Finally, $\boldsymbol{q}$ are rotation parameters, which are chosen here as quaternions, so that matrix $\boldsymbol{T}$ can be written as

$\boldsymbol{T}(\boldsymbol{\omega}^B) = \dfrac{1}{2} \begin{pmatrix} 0 & -\boldsymbol{\omega}^{B\,T} \\ \boldsymbol{\omega}^B & -\boldsymbol{\omega}^B \times \end{pmatrix}$. (4)
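To make the kinematics concrete, the following Python sketch evaluates the right-hand side of Eqs. (3) and (4) (a minimal sketch, not the authors' implementation; the state layout, the scalar-first quaternion convention and the helper names are assumptions):

```python
import numpy as np

def rotation_from_quaternion(q):
    """R(q): components of the rotation tensor bringing triad E into triad B
    (scalar-first unit quaternion assumed)."""
    q0, q1, q2, q3 = q
    return np.array([
        [1 - 2*(q2*q2 + q3*q3), 2*(q1*q2 - q0*q3),     2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3),     1 - 2*(q1*q1 + q3*q3), 2*(q2*q3 - q0*q1)],
        [2*(q1*q3 - q0*q2),     2*(q2*q3 + q0*q1),     1 - 2*(q1*q1 + q2*q2)],
    ])

def T_matrix(omega_B):
    """Quaternion rate matrix of Eq. (4), so that q_dot = T(omega) q."""
    wx, wy, wz = omega_B
    return 0.5 * np.array([
        [0.0, -wx, -wy, -wz],
        [wx,  0.0,  wz, -wy],
        [wy, -wz,  0.0,  wx],
        [wz,  wy, -wx,  0.0],
    ])

def state_derivative(x, a_acc, alpha_B, r_BA, g=9.81):
    """Noise-free right-hand side of Eqs. (3), with x = (v_E, omega_B, r_OB, q)."""
    v_E, omega_B, q = x[0:3], x[3:6], x[9:13]
    R = rotation_from_quaternion(q)
    g_E = np.array([0.0, 0.0, g])            # gravity points down in NED
    v_dot = g_E - R @ (a_acc                 # Eq. (3a)
                       + np.cross(omega_B, np.cross(omega_B, r_BA))
                       + np.cross(alpha_B, r_BA))
    omega_dot = alpha_B                      # Eq. (3b), alpha from the gyro stencil
    r_dot = v_E                              # Eq. (3c)
    q_dot = T_matrix(omega_B) @ q            # Eq. (3d)
    return np.concatenate([v_dot, omega_dot, r_dot, q_dot])
```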
Gyro measures are used in Eq. (3b) for computing an estimate of the angular acceleration. Since this implies a differentiation of the gyro measures, assuming a constant (or slowly varying) bias over the differentiation interval, knowledge of the bias becomes unnecessary. Hence, the angular acceleration is computed as

$\boldsymbol{\alpha}^B \approx \boldsymbol{\alpha}_h(\boldsymbol{\omega}_{gyro})$, (5)

where $\boldsymbol{\alpha}_h$ is a discrete differentiation operator. The angular acceleration at time $t_k$ is computed according to the following three-point stencil formula, based on a parabolic interpolation: $\boldsymbol{\alpha}_h(t_k) = \big(3\boldsymbol{\omega}_{gyro}(t_k) - 4\boldsymbol{\omega}_{gyro}(t_{k-1}) + \boldsymbol{\omega}_{gyro}(t_{k-2})\big)/(2h)$, where $h = t_k - t_{k-1} = t_{k-1} - t_{k-2}$.
A GPS is located at point G on the vehicle (see Fig. 1). The velocity and position vectors of point G, noted respectively $\boldsymbol{v}^E_G$ and $\boldsymbol{r}^E_{OG}$, can be expressed as

$\boldsymbol{v}^E_G = \boldsymbol{v}^E_B + \boldsymbol{R}\,\boldsymbol{\omega}^B \times \boldsymbol{r}^B_{BG}$, (6a)
$\boldsymbol{r}^E_{OG} = \boldsymbol{r}^E_{OB} + \boldsymbol{R}\,\boldsymbol{r}^B_{BG}$. (6b)
The GPS yields measurements of the position and velocity of point G affected by noise, i.e.

$\boldsymbol{v}_{gps} = \boldsymbol{v}^E_G + \boldsymbol{n}_{v_{gps}}$, (7a)
$\boldsymbol{r}_{gps} = \boldsymbol{r}^E_{OG} + \boldsymbol{n}_{r_{gps}}$. (7b)
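In code, the lever-arm relations of Eqs. (6), i.e. the noise-free part of Eqs. (7), read as follows (a sketch; the variable names are placeholders):

```python
import numpy as np

def gps_observation(v_E_B, omega_B, r_E_OB, R, r_B_BG):
    """Predicted GPS velocity and position at the antenna point G,
    Eqs. (6a)-(6b): lever-arm compensation from the body point B."""
    v_E_G = v_E_B + R @ np.cross(omega_B, r_B_BG)   # Eq. (6a)
    r_E_OG = r_E_OB + R @ r_B_BG                    # Eq. (6b)
    return v_E_G, r_E_OG
```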
A sonar altimeter measures the distance h along the body-attached vector $\boldsymbol{b}_3$, between its location at point S and point T on the terrain. The sonar altimeter yields a reading $h_{sonar}$ affected by noise $n_{sonar}$, i.e.

$h = -r^E_{OB_3}/R_{33} - s$, (8)
$h_{sonar} = h + n_{sonar}$, (9)

where $r^E_{OB_3} = \boldsymbol{r}_{OB} \cdot \boldsymbol{e}_3$ and $\boldsymbol{R} = [R_{ij}]$, $i,j = 1,2,3$.
Furthermore, we consider a magnetometer which senses the magnetic field $\boldsymbol{m}$ of the Earth in the body-attached system $\mathcal{B}$. The inertial components $\boldsymbol{m}^E$ are assumed to be known and constant in the area of operation of the vehicle. The magnetometer yields a measurement $\boldsymbol{m}_{magn}$ affected by noise $\boldsymbol{n}_{magn}$, i.e.

$\boldsymbol{m}^B = \boldsymbol{R}^T \boldsymbol{m}^E$, (10)
$\boldsymbol{m}_{magn} = \boldsymbol{m}^B + \boldsymbol{n}_{magn}$. (11)
Finally, considering a pair of stereo cameras located on the vehicle (see Fig. 2), a triad of unit vectors $\mathcal{C} \doteq (\boldsymbol{c}_1, \boldsymbol{c}_2, \boldsymbol{c}_3)$ has origin at the optical center C of the left camera, where $\boldsymbol{c}_1$ is directed along the horizontal scanlines of the image plane, while $\boldsymbol{c}_3$ is parallel to the optical axis, pointing towards the scene. Considering that P is a fixed point, the vision-based observation model, discretized across two consecutive time instants $t_k$ and $t_{k+1} = t_k + \Delta t$, is

$\boldsymbol{d}(t_{k+1})^{C_{k+1}} = -\Delta t\,\boldsymbol{C}^T \big[\boldsymbol{R}(t_{k+1})^T \boldsymbol{v}^E(t_{k+1}) + \boldsymbol{\omega}^B(t_{k+1}) \times (\boldsymbol{c}^B + \boldsymbol{C}\,\boldsymbol{d}(t_k)^{C_k})\big] + \boldsymbol{d}(t_k)^{C_k}$, (12)

Figure 2: Geometry for the derivation of the discrete vision-based motion sensor.
where $\boldsymbol{C}$ are the components of the rotation tensor which brings triad $\mathcal{B}$ into triad $\mathcal{C}$. The tracked feature point distances are noted $\boldsymbol{d}^C$ and $\boldsymbol{d}^{C'}$ for the left and right cameras, respectively, and are obtained by stereo reconstruction using $\boldsymbol{d}^C = \boldsymbol{p}^C b/d$, where $\boldsymbol{p}$ is the position vector of the feature point on the image plane, b is the stereo baseline and d the disparity. This process yields at each time step $t_{k+1}$ an estimate $\boldsymbol{d}_{vsn}$ affected by noise $\boldsymbol{n}_{vsn}$:

$\boldsymbol{d}_{vsn} = \boldsymbol{d}(t_{k+1})^{C_{k+1}} + \boldsymbol{n}_{vsn}$, (13)

for the left camera, and a similar expression for the right one.
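For reference, the stereo reconstruction $\boldsymbol{d}^C = \boldsymbol{p}^C b/d$ can be sketched as follows (a minimal sketch assuming a rectified pinhole pair; the intrinsic parameters are illustrative placeholders, not values from the original system):

```python
import numpy as np

def stereo_reconstruct(u, v, disparity, fx, fy, cx, cy, baseline):
    """Feature-point vector d^C in the left camera triad C, from a pixel
    (u, v) and its stereo disparity d (pixels), for a baseline b (meters)."""
    z = fx * baseline / disparity        # depth along c3 (optical axis)
    x = (u - cx) * z / fx                # along c1 (horizontal scanlines)
    y = (v - cy) * z / fy                # along c2
    return np.array([x, y, z])
```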
2.2 Process Model and Observations
The estimator is based on the following state-space model

$\dot{\boldsymbol{x}}(t) = \boldsymbol{f}\big(\boldsymbol{x}(t), \boldsymbol{u}(t), \boldsymbol{\nu}(t)\big)$, (14a)
$\boldsymbol{y}(t_k) = \boldsymbol{h}\big(\boldsymbol{x}(t_k)\big)$, (14b)
$\boldsymbol{z}(t_k) = \boldsymbol{y}(t_k) + \boldsymbol{\mu}(t_k)$, (14c)
where the state vector $\boldsymbol{x}$ is defined as

$\boldsymbol{x} \doteq (\boldsymbol{v}^{E\,T}_B, \boldsymbol{\omega}^{B\,T}, \boldsymbol{r}^{E\,T}_{OB}, \boldsymbol{q}^T)^T$. (15)
Function $\boldsymbol{f}(\cdot,\cdot,\cdot)$ in Eq. (14a) represents in compact form the rigid body kinematics expressed by Eqs. (3). The input vector $\boldsymbol{u}$ appearing in Eq. (14a) is defined by the measurements provided by the accelerometers, $\boldsymbol{a}_{acc}$, and gyros, $\boldsymbol{\omega}_{gyro}$, while $\boldsymbol{\nu}$ is their associated measurement noise vector.
Similarly, Eqs. (6), (8), (10) and (12) may be gathered together and written in compact form as the observation model $\boldsymbol{h}(\cdot)$ expressed by Eq. (14b), where the vector of outputs $\boldsymbol{y}$ is defined as

$\boldsymbol{y} = (\boldsymbol{v}^{E\,T}_G, \boldsymbol{r}^{E\,T}_{OG}, h, \boldsymbol{m}^{B\,T}, \ldots, \boldsymbol{d}^{C\,T}, \boldsymbol{d}^{C'\,T}, \ldots)^T$. (16)
ICINCO2012-9thInternationalConferenceonInformaticsinControl,AutomationandRobotics
396
The definition of model (14) is complemented by the vector of measurements $\boldsymbol{z}$ and the associated noise vector $\boldsymbol{\mu}$:

$\boldsymbol{z} \doteq (\boldsymbol{v}^T_{gps}, \boldsymbol{r}^T_{gps}, h_{sonar}, \boldsymbol{m}^T_{magn}, \ldots, \boldsymbol{d}^T_{vsn}, \boldsymbol{d}'^T_{vsn})^T$, (17a)
$\boldsymbol{\mu} \doteq (\boldsymbol{n}^T_{v_{gps}}, \boldsymbol{n}^T_{r_{gps}}, n_{sonar}, \boldsymbol{n}^T_{magn}, \ldots, \boldsymbol{n}^T_{vsn}, \boldsymbol{n}'^T_{vsn})^T$. (17b)
The state estimation problem expressed by
Eqs. (14–17) was solved using the EKF approach, ini-
tially assuming that all measurements are available
with no delay.
2.3 Classic State Estimation using EKF
The EKF formulation is briefly reviewed here using the time-discrete form of Eqs. (14) and assuming $\boldsymbol{\nu}$ and $\boldsymbol{\mu}$ to be white noises with covariances $\boldsymbol{Q}$ and $\boldsymbol{U}$. The prediction stage of states and observations is performed by using the non-linear model equations,

$\hat{\boldsymbol{x}}^-_k = \hat{\boldsymbol{x}}_{k-1} + \boldsymbol{f}(\hat{\boldsymbol{x}}_{k-1}, \boldsymbol{u}_{k-1}, 0)\,\Delta t$, (18a)
$\boldsymbol{y}^-_k = \boldsymbol{h}(\hat{\boldsymbol{x}}^-_k)$, (18b)

whereas a linear approximation is used for estimating the error covariance and computing the Kalman gain matrices,

$\boldsymbol{P}^-_k = \boldsymbol{A}_k \boldsymbol{P}_{k-1} \boldsymbol{A}^T_k + \boldsymbol{G}_k \boldsymbol{Q}_k \boldsymbol{G}^T_k$, (19a)
$\boldsymbol{K}_k = \boldsymbol{P}^-_k \boldsymbol{H}^T_k \big[\boldsymbol{H}_k \boldsymbol{P}^-_k \boldsymbol{H}^T_k + \boldsymbol{U}_k\big]^{-1}$. (19b)

Matrices $\boldsymbol{A}_k$, $\boldsymbol{G}_k$ and $\boldsymbol{H}_k$ are computed by linearizing the non-linear model about the current estimate,

$\boldsymbol{A}_k = \boldsymbol{I} + \Delta t\,\partial\boldsymbol{f}/\partial\boldsymbol{x}$, $\quad \boldsymbol{G}_k = \Delta t\,\partial\boldsymbol{f}/\partial\boldsymbol{\nu}$, $\quad \boldsymbol{H}_k = \partial\boldsymbol{h}/\partial\boldsymbol{x}$. (20)

Finally, covariance updates and state estimates are computed as

$\boldsymbol{P}_k = [\boldsymbol{I} - \boldsymbol{K}_k \boldsymbol{H}_k]\,\boldsymbol{P}^-_k$, (21a)
$\hat{\boldsymbol{x}}_k = \hat{\boldsymbol{x}}^-_k + \boldsymbol{K}_k \big(\boldsymbol{z}_k - \boldsymbol{h}(\hat{\boldsymbol{x}}^-_k)\big)$. (21b)
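One cycle of Eqs. (18)–(21) can be sketched in Python/numpy as follows (a minimal sketch; the model callbacks f, h and the Jacobian callbacks are hypothetical placeholders for the VA-INS models):

```python
import numpy as np

def ekf_step(x, P, u, z, f, h, jac_A, jac_G, jac_H, Q, U, dt):
    """One EKF predict/update cycle, Eqs. (18)-(21)."""
    n = len(x)
    # Prediction, Eqs. (18a)-(18b): explicit Euler step of the nonlinear model
    x_pred = x + f(x, u, np.zeros(n)) * dt
    # Linearization about the current estimate, Eq. (20)
    A = np.eye(n) + dt * jac_A(x, u)          # I + dt * df/dx
    G = dt * jac_G(x, u)                      # dt * df/dnu
    H = jac_H(x_pred)                         # dh/dx
    # Covariance prediction and Kalman gain, Eqs. (19a)-(19b)
    P_pred = A @ P @ A.T + G @ Q @ G.T
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + U)
    # Covariance and state update, Eqs. (21a)-(21b)
    P_new = (np.eye(n) - K @ H) @ P_pred
    x_new = x_pred + K @ (z - h(x_pred))
    return x_new, P_new
```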
2.4 Image Processing and Tracking
The idea of VA-INS is based on tracking scene points
between stereo images and across time steps, to ex-
press the apparent motion of the tracked points in
terms of the motion of the vehicle. The identification and matching of feature points begins with the acquisition of the images; then, strong corners are extracted from the left image with the feature extractor of the KLT tracker (Shi and Tomasi, 1994), and a dense disparity map is obtained. Identified feature points are encoded using the BRIEF descriptor (Calonder et al., 2010), and subsequently matches in a transformed image are found by computing distances between descriptors. This descriptor is in general competitive with algorithms like SURF and SIFT, and much faster in terms of generation and matching.
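As an indication of how such a front-end can be assembled from off-the-shelf components, the sketch below uses OpenCV's Shi-Tomasi corner detector and the BRIEF extractor from the opencv-contrib package (a sketch only, not the authors' on-board code; all parameter values are illustrative):

```python
import cv2

def extract_and_describe(gray_left):
    """Strong corners (Shi-Tomasi 'good features to track') encoded with BRIEF."""
    corners = cv2.goodFeaturesToTrack(gray_left, maxCorners=100,
                                      qualityLevel=0.01, minDistance=10)
    keypoints = [cv2.KeyPoint(float(x), float(y), 7) for [[x, y]] in corners]
    brief = cv2.xfeatures2d.BriefDescriptorExtractor_create()
    keypoints, descriptors = brief.compute(gray_left, keypoints)
    return keypoints, descriptors

def match_features(desc_a, desc_b):
    """Match binary BRIEF descriptors by Hamming distance with cross-checking."""
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    return matcher.match(desc_a, desc_b)
```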
A real-time implementation of the system was based on an on-board PC-104 with a 1.6 GHz CPU and 512 MB of volatile memory, with the purpose of analyzing the performance and computational effort of the feature tracking process. Images were captured by a Point Grey Bumblebee XB3 stereo vision camera, and resized images with a resolution of 640×480 were used for tracking 100 points between frames. These tests indicate the presence of a 490 ms latency (see Table 1) between the instant the image is captured and the time the state estimator receives the required visual information.
Table 1: Time consumption of image processing tasks.
Process Task Computing Time
Image acquisition 100 ms
Resizing, rectification 40 ms
Dense disparity mapping 150 ms
Feature extraction 130 ms
Feature description 50 ms
Feature matching 20 ms
TOTAL 490 ms
3 DELAYED FUSION IN VA-INS
Simulation analyses, presented later, show that the half-second delay of the system is too significant to be neglected. In other words, directly feeding these delayed vision-based measurements to the EKF estimator will affect the quality of the estimates.
The outputs of the vision-based motion sensors $\boldsymbol{d}_{vsn}(s)$ and $\boldsymbol{d}'_{vsn}(s)$, computed from an image captured at time s, will only be available at time $k = s + N$, where N is the sample delay. Such delayed outputs are labeled $\bar{\boldsymbol{d}}_{vsn}(k)$ and $\bar{\boldsymbol{d}}'_{vsn}(k)$. On the other hand, measurements from other sensors are not affected by such a delay, and are available at each sampling time. For the purpose of handling multi-rate estimation and delay, observations are here partitioned in two groups, one collecting the multi-rate non-delayed GPS, sonar and magnetometer readings (labeled rt, for real-time), and the other collecting the delayed vision-based observations (labeled vsn, for vision):

$\boldsymbol{z}^{rt} \doteq (\boldsymbol{v}^T_{gps}, \boldsymbol{r}^T_{gps}, h_{sonar}, \boldsymbol{m}^T_{magn})^T$, (22a)
$\boldsymbol{z}^{vsn} \doteq (\boldsymbol{d}^T_{vsn(1)}, \boldsymbol{d}^T_{vsn(2)}, \ldots, \boldsymbol{d}^T_{vsn(n)})^T$. (22b)
The state estimation process is based on using a proper EKF update for each group. More specifically, the recalculation, Alexander (Alexander, 1991) and Larsen (Larsen et al., 1998) methods are surveyed here for fusing the delayed tracked points in the VA-INS structure as they arrive. All methods are briefly reviewed in the following.
3.1 Recalculation Method
A straightforward estimate can be obtained simply by
recalculating the filter throughout the delay period.
As the vision-based measurements are not available in the time interval between s and k, one may update states and covariance using only the non-delayed measurements in this interval. As soon as the vision measurements originally captured at time s are received with delay at time k, the estimation procedure restarts from s, repeating the update procedure while incorporating both the non-delayed measurements and the lagged vision-based measurements.
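In outline, the recalculation filter can be organized as below (a sketch reusing the ekf_step function sketched in Sec. 2.3; the buffer layout is an assumption, and the delayed vision update is written in linearized form for brevity):

```python
import numpy as np

def recalculate_from_s(x_s, P_s, z_vsn, H_vsn, U_vsn, buffer):
    """Replay the filter from time s to k once the delayed vision data arrive.
    `buffer` stores, for every step in (s, k], the inputs, the real-time
    measurements and the model callbacks already used during the delay."""
    # Fuse the lagged vision measurements against the stored state at time s
    K = P_s @ H_vsn.T @ np.linalg.inv(H_vsn @ P_s @ H_vsn.T + U_vsn)
    x = x_s + K @ (z_vsn - H_vsn @ x_s)
    P = (np.eye(len(x_s)) - K @ H_vsn) @ P_s
    # Re-run every stored prediction/update of the delay interval
    for step in buffer:
        x, P = ekf_step(x, P, step["u"], step["z_rt"], step["f"], step["h"],
                        step["jac_A"], step["jac_G"], step["jac_H"],
                        step["Q"], step["U"], step["dt"])
    return x, P
```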
The computational burden of this implementation of the filter in the VA-INS is critical, because of the need to fuse a fairly large set of measurements. Therefore the approach, although rigorous and straightforward, is not a good candidate for implementation on-board small-size aerial vehicles.
3.2 Alexander Method
In this method (Alexander, 1991), a correction term $\boldsymbol{M}$ is calculated based on Kalman information and added to the filter estimates when the delayed measurements are received:

$\boldsymbol{M} = \prod_{i=1}^{N} \big(\boldsymbol{I} - \boldsymbol{K}_{s+i} \boldsymbol{H}^{rt}_{s+i}\big) \boldsymbol{A}_{s+i-1}$. (23)
In the above equation, $\boldsymbol{K}_{s+i}$ is used, to distinguish it from $\boldsymbol{K}^{rt}_{s+i}$. The Kalman gain and error covariance are updated at time s as if the measurements were available without delay. Then, at time k, when the measurements $\boldsymbol{z}^{vsn}_k$ become available with delay, their incorporation and the state update are obtained by using the following correction term in the Kalman equation:

$\delta\hat{\boldsymbol{x}}_k = \boldsymbol{M}\,\boldsymbol{K}^{vsn}_s \big(\boldsymbol{z}^{vsn}_k - \boldsymbol{H}^{vsn}_s \hat{\boldsymbol{x}}_s\big)$. (24)
The problem of implementing the Alexander method in the VA-INS arises since it is not possible to identify which points are tracked until all image processing tasks at time k are completed. Consequently, the global measurement model $\boldsymbol{H}^{vsn}_s$, including the sub-models of all tracked feature points in a new scene, is unknown at time s. Moreover, the uncertainty $\boldsymbol{U}^{vsn}_s$ related to each point is unknown, since it changes with the distance of the point and its position in the image plane.
3.3 Larsen Method
In our VA-INS, the successively tracked points, their uncertainty and consequently the exact measurement model remain unknown until the images are completely processed. Therefore, a method is needed that does not require information about $\boldsymbol{z}^{vsn}_k$ until the new measurements arrive.

Larsen extended the Alexander approach by extrapolating the delayed measurements to the present time (Larsen et al., 1998):

$\boldsymbol{z}^{vsn}_{k(int)} = \boldsymbol{z}^{vsn}_k + \boldsymbol{H}^{vsn}_k \hat{\boldsymbol{x}}_k - \boldsymbol{H}^{vsn}_s \hat{\boldsymbol{x}}_s$. (25)
Larsen shows that the correction term is calculated based on Kalman information in a way that closely resembles the Alexander method, i.e.

$\boldsymbol{M} = \prod_{i=1}^{N} \big(\boldsymbol{I} - \boldsymbol{K}^{rt}_{s+i} \boldsymbol{H}^{rt}_{s+i}\big) \boldsymbol{A}_{s+i-1}$, (26)

where the Kalman gain and covariance are kept frozen until the delayed measurements arrive. Once this happens, they are updated in a simple and fast manner:

$\boldsymbol{K}^{vsn}_k = \boldsymbol{M}\,\boldsymbol{P}_s \boldsymbol{H}^{vsn\,T}_s \big[\boldsymbol{H}^{vsn}_s \boldsymbol{P}_s \boldsymbol{H}^{vsn\,T}_s + \boldsymbol{U}^{vsn}_k\big]^{-1}$, (27a)
$\delta\boldsymbol{P}_k = -\boldsymbol{K}^{vsn}_k \boldsymbol{H}^{vsn}_s \boldsymbol{P}_s \boldsymbol{M}^T$, (27b)
$\delta\hat{\boldsymbol{x}}_k = \boldsymbol{K}^{vsn}_k \big(\boldsymbol{z}^{vsn}_k - \boldsymbol{H}^{vsn}_s \hat{\boldsymbol{x}}_s\big)$. (27c)
3.4 Flow of EKF-Larsen Processing
Fig. 3 shows an overview of the measurement processing procedures for the standard EKF and the Larsen method. The image processing routines are started at time s, tracking feature points in new scenes; however, no vision-based measurement is available until time $k = s + N$.

Figure 3: Flow of sequential EKF/Larsen processing.

Meanwhile, the multi-rate real-time measurements $\boldsymbol{z}^{rt}_{s+i}$, $1 \le i \le N$, are fused through the EKF Eqs. (19b–21b) as they arrive, using $\boldsymbol{H}^{rt}_{s+i}$. This produces the Kalman gains $\boldsymbol{K}^{rt}_{s+i}$, state estimates $\hat{\boldsymbol{x}}^{I}_{s+i}$ and covariances $\boldsymbol{P}^{I}_{s+i}$. Implementing the Larsen approach requires only the state vector and error covariance at time s to be stored, and the correction term $\boldsymbol{M}_{s+i}$ to be calculated during the delay period as

$\boldsymbol{M}_{s+i} = \boldsymbol{M}_{s+i-1} \big(\boldsymbol{I} - \boldsymbol{K}^{rt}_{s+i} \boldsymbol{H}^{rt}_{s+i}\big) \boldsymbol{A}_{s+i-1}$. (28)

At time k, when the vision-based measurements become available, the Larsen equations are used to incorporate the delayed measurement $\boldsymbol{z}^{vsn}_k$ in the estimation procedure. The Kalman gain $\boldsymbol{K}^{vsn}_k$ is calculated using Eq. (27a). Finally, the visual measurement corrections $\delta\boldsymbol{P}_k$ and $\delta\hat{\boldsymbol{x}}_k$, obtained from Eqs. (27b–27c), are added to the covariance matrix and state vector of the real-time measurement updates, $\boldsymbol{P}^{I}_k$ and $\hat{\boldsymbol{x}}^{I}_k$, to obtain the new quantities $\boldsymbol{P}^{II}_k$ and $\hat{\boldsymbol{x}}^{II}_k$.
4 SIMULATION EXPERIMENTS
A Matlab/Simulink simulator was developed, which includes a flight mechanics model of a small RUAV, together with models of the inertial navigation sensors, magnetometer, GPS and their noise. The simulator is used in conjunction with the OGRE graphics engine (Junker, 2006) for rendering a virtual environment scene and simulating the image acquisition process. All sensor measurements are simulated (see Table 2) as the helicopter flies at an altitude of 2 m, following a rectangular path at a constant speed of 2 m/s within a small village, composed of houses and several other objects with realistic textures (see Fig. 4).
Table 2: Sensors and vibration noise levels.
Sensors        Noise Level
Gyro           50 deg/s
Accelerometer  0.5 m/s²
Magnetometer   1×10⁻⁴ Gauss
GPS            2 m

Figure 4: View of the simulated small village environment and flight trajectory.
Navigation measurements are provided at a rate of 100 Hz, while stereo images are provided at a rate of 2 Hz. The GPS, available at a rate of 1 Hz, is turned off 5 s into the flight, to highlight the effects of the visual measurement delay. The state estimates are obtained by four parallel data fusion processes: classic EKF with non-delayed measurements, classic EKF with delayed measurements, and the recalculation and EKF/Larsen methods in the presence of delay.
Fig. 5(a) shows the effects of delay on the EKF estimates, presenting a comparison of the positions obtained by the classic EKF fed with delayed and with non-delayed visual measurements. Fig. 5(b) shows the position estimates obtained by the recalculation and sequential EKF/Larsen methods in the presence of delayed visual measurements, in comparison with the ideal classic EKF (without delay). The results clearly show the negative effects of delay on the standard EKF estimation, which are optimally compensated by the recalculation filter and by the sequential EKF/Larsen filtering.
4.1 Monte Carlo Simulation
A Monte Carlo simulation is used here to consider the effects of random variations on the performance of the approaches, as well as to evaluate the computational time burden of each method. The analysis consisted of 70 runs, which is the number of simulations that were necessary in this case to bring the statistics to convergence. For each simulation run, measurements and stereo images are generated for the 100 s maneuver described above, each with randomly generated errors due to noise and random walks.
The average errors in the position, velocity and attitude estimates are shown in Figs. (6–8), for the four implementations of the vision-augmented data fusion procedures explained above. The average estimation errors for each approach obtained by the Monte Carlo simulation are reported in Table 3.

Considering random noise variations, the recalculation and EKF/Larsen methods show a good performance. In fact, the average errors of these methods are very close to those obtained by the classic EKF with no delay on the visual measurements. However, the processing time of the filter recalculation increases twofold, as shown by Table 3, implying a considerable additional computational burden. On the other hand, the EKF/Larsen approach does not significantly affect the processing time of the filter, and therefore combines high quality estimation with a low computing time.
HandlingDelayedFusioninVision-AugmentedInertialNavigation
399
Figure 5: Comparison of position estimates in the X-Y plane. (a) Effects of delay on the classic EKF estimate: EKF with (dark line) and without (light line) delay on the visual measurements; (b) performance of the delayed fusion methods: recalculation and EKF/Larsen in the presence of delayed visual measurements.
5 CONCLUSIONS
In this work, a previously documented VA-INS was
extended by implementing various approaches to han-
dle feature tracking delays in a multi-rate multi-sensor
data fusion process. Simulation experiments were
used together with Monte Carlo analyses to assess the
estimation error and the computational burden of the
methods.
The paper shows that the delay caused by image processing, if not properly handled in the state estimator, can lead to an appreciable performance degradation. Furthermore, the recalculation and sequential EKF/Larsen methods restore the estimate accuracy in the presence of delay, while the Alexander method is not a suitable solution in this case because of tracking uncertainties. Finally, the results of the paper indicate that recalculation implies a significant computational burden, while the Larsen method is as expensive as the standard EKF.

This study concluded that the Larsen method, for the present application, provides estimates that have the same quality and computational cost as the non-delayed case.
ACKNOWLEDGEMENTS
The authors would like to thank Mehmet Suat Kay
for his help in the development of the feature tracking
code.
Figure 6: Position estimate errors in X-Y-Z directions.
Table 3: Monte Carlo simulation results.
Method                 Position RMSE (m)  Velocity RMSE (m/s)  Attitude RMSE (deg)  Filter Burden (sec)
EKF, no delay          3.7355             0.8173               3.4061               0.0111
EKF, delay             6.8368             1.1069               3.9330               0.0111
Recalculation, delay   3.8156             0.8477               3.4397               0.0208
EKF/Larsen, delay      3.7690             0.8457               3.4336               0.0111
ICINCO2012-9thInternationalConferenceonInformaticsinControl,AutomationandRobotics
400
Figure 7: Velocity estimate errors in X-Y-Z directions.
Figure 8: Attitude estimate errors in X-Y-Z directions.
REFERENCES
Alexander, H. L. (1991). State estimation for distributed
systems with sensing delay. SPIE, 1470(1), 103-111.
doi:10.1117/12.44843.
Bonin-Font, F., Ortiz, A., and Oliver, G. (2008). Visual
navigation for mobile robots: A survey. Journal of
Intelligent and Robotic Systems, 53(3), 263-296.
Bottasso, C. L. and Leonello, D. (2009). Vision-augmented
inertial navigation by sensor fusion for an autonomous
rotorcraft vehicle. In AHS International Specialists
Meeting on Unmanned Rotorcraft. 324-334.
Calonder, M., Lepetit, V., Strecha, C., and Fua, P. (2010).
Brief: Binary robust independent elementary features.
In 11th European Conference on Computer Vision.
LNCS Springer, 6314(3), 778-792.
Dalgleish, F. R., Tetlow, J. W., and Allwood, R. L. (2005).
Vision-based navigation of unmanned underwater ve-
hicles: a survey. Part 2: Vision-based station-keeping
and positioning. In IMAREST Proceedings, Part B:
Journal of Marine Design and Operations, 8, 13-19.
Goedeme, T., Nuttin, M., Tuytelaars, T., and Gool, L. V.
(2007). Omnidirectional vision based topological nav-
igation. International Journal of Computer Vision,
74(3), 219-236.
Gopalakrishnan, A., Kaisare, N., and Narasimhan, S.
(2011). Incorporating delayed and infrequent mea-
surements in extended kalman filter based nonlinear
state estimation. Journal of Process Control, 21(1),
119-129.
Shi, J. and Tomasi, C. (1994). Good features to track.
In IEEE Computer Society Conference on Computer
Vision and Pattern Recognition. IEEE, 593-600.
Jones, E. S. and Soatto, S. (2011). Visual-inertial nav-
igation, mapping and localization: A scalable real-
time causal approach. The International Journal of
Robotics Research, 30(4), 407-430.
Junker, G. (2006). Pro OGRE 3D Programming. Springer-
Verlag, New York.
Larsen, T. D., Andersen, N. A., Ravn, O., and Poulsen, N.
(1998). Incorporation of time delayed measurements
in a discrete-time kalman filter. In 37th IEEE Confer-
ence on Decision and Control. IEEE, 3972-3977.
Liu, Y. C. and Dai, Q. H. (2010). Vision aided unmanned
aerial vehicle autonomy: An overview. In 3rd Inter-
national Congress on Image and Signal Processing.
IEEE, 417-421.
Mourikis, A. I. and Roumeliotis, S. I. (2007). A multi-state
constraint kalman filter for vision-aided inertial navi-
gation. In IEEE International Conference on Robotics
and Automation. IEEE, 3565-3572.
Nister, D., Naroditsky, O., and Bergen, J. (2006). Visual
odometry for ground vehicle applications. Journal of
Field Robotics, 23(1), 3-20.
Pornsarayouth, S. and Wongsaisuwan, M. (2009). Sensor
fusion of delay and non-delay signal using kalman fil-
ter with moving covariance. In ROBIO’09, IEEE In-
ternational Conference on Robotics and Biomimetics.
2045-2049.
Qian, G., Chellappa, R., and Zheng, Q. (2001). Robust
structure from motion estimation using inertial data.
Journal of the Optical Society of America, 18(12),
2982-2997.
Roumeliotis, S. I., Johnson, A. E., and Montgomery, J. F.
(2002). Augmenting inertial navigation with image-
based motion estimation. In ICRA’02, IEEE Interna-
tional Conference on Robotics and Automation. IEEE,
4326-4333.
Tatiraju, S., Soroush, M., and Ogunnaike, B. A. (1999). Mul-
tirate nonlinear state estimation with application to a
polymerization reactor. AIChE Journal, 45(4), 769-
780.
Veth, M. J., Raquet, J. F., and Pachter, M. (2006). Stochastic
constraints for efficient image correspondence search.
Journal of IEEE Transactions on Aerospace Elec-
tronic Systems, 42(3), 973-982.
HandlingDelayedFusioninVision-AugmentedInertialNavigation
401