An Algorithm for Extended Dynamic Range Video in Embedded Systems
Alberto Ferrante¹, Massimo Chelodi¹, Francesco Bruschi¹ and Valeria Mozzetti²
¹ALaRI, Faculty of Informatics, Università della Svizzera italiana, Lugano, Switzerland
²ViSSee AG, Zurich, Switzerland
Keywords:
Video Camera, Extended Dynamic Range, Dynamic Range, Video, Visual Sensor.
Abstract:
Video cameras are gaining popularity in embedded systems, where they are used either directly or as sensing devices. Since embedded systems are often portable, they can be employed in different environments with varying lighting conditions; changing lighting conditions may pose problems for conventional video cameras. At the same time, high-dynamic-range video cameras can be too expensive for certain applications. In this paper we propose an algorithm that extends the dynamic range of common video cameras using limited computational resources, while exploiting the original frame rate of the camera. Furthermore, we discuss some results obtained by using our algorithm coupled with a commercial webcam.
1 INTRODUCTION
Video cameras are used in many applications, ranging from the consumer market to video surveillance and other industrial applications. Among these applications, video cameras can also be used as input devices for different kinds of sensors. In this case, the final user is not interested in the camera output but in the final output of the sensor: image quality as perceived by the human eye is not important. It is fundamental, though, that the output of the image sensor exhibits the characteristics required by the subsequent processing performed in the sensor. For example, ViSSee AG (vis, 2009) produces speed sensors that are based on visual inputs. When this sensor is used, the output is a speed measurement and the images from the video camera are normally not visible to the final application or user.
One of the fundamental characteristics of image sensors is dynamic range. Dynamic range represents the maximum difference in luminosity among different zones of the image that the sensor can register. Unfortunately, many typical scenes contain ranges of luminosities that cannot be represented by the camera sensor (i.e., the dynamic range of the scene is greater than the one allowed by the sensor); this causes some zones of the images to be overexposed (i.e., completely white, with no details) or underexposed (i.e., completely black, with no details). Overexposed or underexposed portions of the images are unusable in most cases, as they contain little or no detail. Furthermore, due to the reduced dynamic range of image sensors, most cameras require their exposure to be adjusted depending on the lighting conditions. Most automatic exposure adjustment algorithms, though, require some time to change the exposure appropriately. During this time most of the images may turn out to be completely overexposed or underexposed.
A well known photographic technique for increas-
ing dynamic range relies on merging multiple expo-
sures (typically 2 or 3) of the same scene taken with
different exposure times. This technique is sometimes
also used in videos.
In this paper we discuss an algorithm aimed at
cameras used in embedded devices; video-cameras,
in this case, are not required to provide a good vi-
sual output. Our algorithm increases dynamic range
by relying on the merge of different exposures, yet
maintaining the original frame rate of the camera.
The remainder of this paper is organized as fol-
lows: in Section 2 we discuss the related work; in
Section 3 we describe the algorithms that we devel-
oped; in Section 4 we discuss an implementation of
the algorithms and we show the results that we have
obtained.
2 RELATED WORK
Extending dynamic range of image sensors has been
the subject of a number of research and industrial
works. In many currently available digital cameras
there is the possibility of creating extended dynamic
range images by merging different frames (e.g., the
Canon EOS 5D Mark III can automatically produce
an extended dynamic range image from up to 3 pho-
tos (EOS, 2012)). This functionality, though, is not
available in current consumer video cameras.
Hardware solutions are based on extended dy-
namic range image sensors. One of these sensors, for
example, is the IcyCAM (Arm et al., 2008). This image sensor, though, is only available as an IP core that must be implemented on a chip.
One software method that requires low level ac-
cess to the sensor is discussed in (Bandoh et al.,
2010). Other methods, such as the ones discussed
in (Schulz et al., 2007; Jinno and Okuda, 2012;
Guarnieri et al., 2011; Xinqiao and Gamal, 2003;
Mertens et al., 2007; Lu et al., 2009; Castro et al.,
2011; Wang et al., 2011), are software only, even
though they may require the knowledge of different
low level parameters of the sensors. All the software
techniques are based on the combination of multiple
frames to obtain a final image with extended dynamic
range.
In the present paper we propose an algorithm for extended dynamic range videos that relies on general purpose video cameras; our algorithm is inspired by the one presented in (Schulz et al., 2007). Our method differs in that it requires no knowledge of the physical sensor parameters and in that it produces videos with the same frame rate as the camera. Our method also has a low demand for computational resources and can be easily used in embedded systems.
3 ALGORITHMS
We have designed two different variants of our algorithm, both of which allow the system to produce final images with the same frame rate as the video camera employed as image source. The first variant merges two frames as explained in (Schulz et al., 2007); the sequence of images to be merged, though, is different from the original one. The second variant allows considering three different exposure values and quickly switching among them to optimize dynamic range. We have tuned the parameters of the algorithm without any specific knowledge of the image sensor and by using, as a reference, only the images obtained from the camera.
In this section we introduce the two variants of the algorithm that we have designed and we discuss a methodology to determine the correct exposure values for the frames to be considered during the merge. We have designed our algorithm with black-and-white images in mind, as our applications do not require color images; the algorithm, though, can also be applied to color images without any modification.
3.1 Two-frame Merge
In this variant of the algorithm two frames are considered, one with a shorter exposure time, named dark, and another one with a longer exposure time, named light. The dark frame is merged with the light one with the goal of extracting the maximum number of details from both images: the loss of detail in the black areas of the dark frame is compensated with the details that are in the same zones of the light frame, while the details lost in the white areas of the light frame are compensated by using the details taken from the dark frame. The merge is performed by applying the following formula to each pixel j of the image:
$$m_j = \begin{cases} l_j, & l_j < L_1 \\ \left(1 - \frac{l_j - L_1}{L_2 - L_1}\right) l_j + \frac{l_j - L_1}{L_2 - L_1}\, d_j, & L_1 \le l_j \le L_2 \\ d_j, & l_j > L_2 \end{cases} \qquad (1)$$
where $l_j$ and $d_j$ represent the pixels of the light and of the dark images, respectively; $L_1$ is a threshold on the luminosity that defines underexposed pixels; similarly, $L_2$ is a threshold that defines overexposed pixels.
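As an illustration, a minimal C sketch of the per-pixel merge of Equation 1 is given below; the fixed-point weighting and the function and variable names are our own assumptions rather than the authors' implementation, and the threshold values are the ones reported in Section 4.

#include <stdint.h>
#include <stddef.h>

/* Luminosity thresholds of Equation 1; the values are the ones used in
 * Section 4 and depend on the luminosity scale of the camera frames. */
#define L1 330u
#define L2 165000u

/* Merge one pixel of the light frame (l) with the corresponding pixel of
 * the dark frame (d) according to Equation 1. The blending weight
 * (l - L1) / (L2 - L1) is computed in Q16 fixed point; as noted in the
 * text, it could also be precomputed in a lookup table. */
static inline uint32_t merge_pixel(uint32_t l, uint32_t d)
{
    if (l < L1)
        return l;            /* dark zone: keep the long-exposure value  */
    if (l > L2)
        return d;            /* light frame overexposed: take dark frame */
    uint64_t w = ((uint64_t)(l - L1) << 16) / (L2 - L1);
    return (uint32_t)((((uint64_t)(65536 - w) * l) + (uint64_t)w * d) >> 16);
}

/* Apply the merge to whole frames; dst may alias one of the inputs (e.g.,
 * the oldest frame), so only two frame buffers need to be kept in memory. */
static void merge_frames(uint32_t *dst, const uint32_t *light,
                         const uint32_t *dark, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = merge_pixel(light[i], dark[i]);
}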
Let us name each frame generated by the camera $F_i$. When $i$ is odd, the shortest exposure time is set and, therefore, $F_i$ corresponds to a dark image; when $i$ is even, the longest exposure time is set and, therefore, $F_i$ corresponds to a light image. In our algorithm, differently from the one explained in (Schulz et al., 2007), we adopted a way of merging images that allows obtaining a resulting image at each $i$ (i.e., at each image acquired by the camera), instead of only at even values of $i$:
$$R_i = \begin{cases} \mathrm{merge}(F_i, F_{i-1}) & i \text{ odd},\ i > 1 \\ \mathrm{merge}(F_{i-1}, F_i) & i \text{ even} \end{cases} \qquad (2)$$
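A sketch of the acquisition loop implied by Equation 2 is shown below; the exposure-setting and frame-acquisition hooks, the buffer layout, and the frame size are our own illustration (merge_frames() is the helper sketched above), not the paper's implementation.

#include <stdint.h>
#include <stddef.h>

#define WIDTH  640
#define HEIGHT 480

/* Hypothetical camera and output hooks, not part of the paper. */
void set_exposure_us(unsigned exposure_us);
void acquire_frame(uint32_t *buf);
void output_frame(const uint32_t *buf);
/* Helper from the previous sketch. */
void merge_frames(uint32_t *dst, const uint32_t *light,
                  const uint32_t *dark, size_t n);

/* Exposure times used in Section 4.3: 1/1000 s (dark), roughly 1/30 s (light). */
#define EXP_DARK_US   1000u
#define EXP_LIGHT_US  33333u

void capture_loop(void)
{
    /* frame[1] always holds the latest dark frame (odd i),
     * frame[0] the latest light frame (even i). */
    static uint32_t frame[2][WIDTH * HEIGHT];
    static uint32_t result[WIDTH * HEIGHT];  /* R_i; could also overwrite the
                                                oldest buffer to save memory */
    for (int i = 1; ; i++) {
        set_exposure_us((i & 1) ? EXP_DARK_US : EXP_LIGHT_US);
        acquire_frame(frame[i & 1]);         /* capture F_i                  */
        if (i > 1) {                         /* Equation 2: one R_i per i    */
            merge_frames(result, frame[0], frame[1],
                         (size_t)WIDTH * HEIGHT);
            output_frame(result);
        }
    }
}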
The merge parameters need to be determined by considering the characteristics of the camera and the lighting conditions foreseen for the usage of the device. In particular, the two exposure values need to be determined in such a way that the dynamic range of the final image is maximized and that the mid-tones of the image are also present. In fact, if the exposure values are too far from one another, details in the middle gray levels are lost. An example of these parameters is shown in Section 4.
The merge function needs to be applied to all pixels of the image and, supposing that the values of the weighting function of Equation 1 are tabulated instead of computed every time, it uses two integer sums and two integer multiplications per pixel. Two frames at a time (if
SENSORNETS2013-2ndInternationalConferenceonSensorNetworks
182
the oldest frame is overwritten with the results from
the merge) need to be kept in memory. Therefore, the
resources used by the algorithms are limited and com-
patible with most of the embedded systems where the
algorithm can be adopted.
3.2 Two-frame Adaptive Merge
The dynamic range of a frame obtained by merging two images is higher than that of a single frame obtained from the camera, but it is still limited. In some situations it may be desirable to increase the dynamic range even further. For this purpose, more than two frames can be merged (e.g., three). The solution that we propose here, though, uses only two frames for merging, even though it relies on three fixed exposure values. The algorithm that we propose, in fact, is able to choose dynamically which couple of exposure values should be considered at a given time. As mentioned earlier, three exposure values are considered instead of two: one for the darkest frame, one for the lightest frame, and one that sits in the middle. The distance between the exposure time of the dark frame and that of the medium frame, as well as the distance between the exposure time of the medium frame and that of the light frame, can be kept similar to the one adopted between the dark and the light frames in the non-adaptive version of the two-frame merge. Considering the three exposure values, there are two different kinds of sequences of exposure times that can be used: one corresponds to dark and medium frames; the other one to medium and light frames. Whether to use the light or the dark images for the merge is decided by comparing the previously obtained merged image with the one exposed for mid-tones: if the image obtained through the merge has an average luminosity similar to that of the image exposed for mid-tones, it means that the merge operation has revealed limited detail (i.e., both the merge and the image exposed for mid-tones are possibly overexposed or underexposed). The average luminosity can be computed by summing up the luminosity of each pixel; no division by the total number of pixels is necessary, as it is a constant that can be omitted in the comparison.
This evaluation is only performed for even values of $i$. Let us call $L^M_i$ the average luminosity of the frame obtained from the merge and $L^m_i$ the average luminosity of the frame with mid-level exposure. Let us also call $S$ the threshold that we use to define similarity between $L^M_i$ and $L^m_i$; $S$ is expressed as a percentage.
The camera will start acquiring sequences of frames in which the dark frames are considered for odd values of $i$ and the medium frames are considered for even values of $i$. For even values of $i$, $L^M_i$ and $L^m_i$ are evaluated, with the following possibilities:
1. $|L^M_i - L^m_i| \ge S \cdot L^m_i$: no adaptation required;
2. $|L^M_i - L^m_i| < S \cdot L^m_i$: the new sequence will be composed of medium frames for even values of $i$ and light frames for odd values of $i$; the resulting images will be obtained as follows:
$$R_i = \begin{cases} \mathrm{merge}(F_i, F_{i-1}) & i \text{ odd},\ i > 1 \\ \mathrm{merge}(F_{i-1}, F_i) & i \text{ even} \end{cases} \qquad (3)$$
A change from medium and light frames back to dark and medium ones is performed in a similar way when $|L^M_i - L^m_i| < S \cdot L^m_i$.
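A compact sketch of this adaptation decision, under the same assumptions as the previous listings (the names and the integer comparison are ours, not the paper's implementation), could look as follows.

#include <stdint.h>

/* Exposure pairs available to the adaptive variant. */
enum pair { DARK_MEDIUM, MEDIUM_LIGHT };

#define S_PERCENT 15u   /* similarity threshold S; set to 15% in Section 4 */

/* Average-luminosity comparison of Section 3.2. The division by the pixel
 * count is omitted, so plain sums of pixel luminosities are compared.
 * Returns the exposure pair to be used for the following frames. */
enum pair adapt(enum pair current, uint64_t lum_merge, uint64_t lum_mid)
{
    uint64_t diff = (lum_merge > lum_mid) ? lum_merge - lum_mid
                                          : lum_mid - lum_merge;
    if (diff * 100u < (uint64_t)S_PERCENT * lum_mid) {
        /* Merge and mid-tone frame are similar: little detail was gained,
         * so switch to the other exposure pair. */
        return (current == DARK_MEDIUM) ? MEDIUM_LIGHT : DARK_MEDIUM;
    }
    return current;   /* no adaptation required */
}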
As in the two-frame merge, the merge function needs to be applied to all pixels of the image and it uses two integer sums and two integer multiplications per pixel. Two additional sums are required, when even frames are considered, to compute the average luminosity values of the medium frame and of the frame resulting from the merge operation. Two frames at a time need to be kept in memory. Therefore, the resources used by the algorithm are limited and compatible with most of the embedded systems where the algorithm can be adopted.
3.3 Exposure Values
By using our algorithm, two frames are merged to obtain a single image with extended dynamic range; therefore, the exposure parameters of the two frames must be chosen carefully to obtain the best possible results over the largest possible range of lighting conditions. Exposure is determined by the couple exposure time and gain; the latter is also called ISO in photography (ISO, 2012). Our main goal is to preserve as many details as possible in the light areas, in the dark areas, and in the mid-tones. The main constraint in computing the exposure values is that the longest exposure time must be shorter than 1/(frame rate); this constraint may be problematic when lighting is scarce. Gain can be increased to obtain shorter exposure times at the price of increased image noise.
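For example, with a camera delivering 20 frames per second, as the webcam used in Section 4.3, the longest exposure time must stay below 1/20 s = 50 ms.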
The exposure parameters are determined by means of experiments performed in the different lighting conditions considered. Different images of a reference scene are taken by considering these lighting conditions and different couples of exposure parameters. The reference scene is represented by a Kodak/Tiffen Q-13 gray scale. This printed quality-control chart consists of 20 zones, labeled 0-19, which have optical densities from 0.0 (white) to 1.90 (practical printing black) in steps of 1/3 of an EV (f-stop). The total range of values represented by the target is 6 and 2/3 f-stops.
AnAlgorithmforExtendedDynamicRangeVideoinEmbeddedSystems
183
The obtained images are evaluated by considering the number of gray patches they retain, both in the highlights and in the shadows. In particular, we consider, for the dark scene, the most underexposed image in which all gray patches can still be distinguished from one another, and, for the light scene, the most overexposed such image. If the exposure parameters of the underexposed image of the dark scene are compatible with the ones of the overexposed image of the light scene, the correct value for the exposure parameters is chosen among the ones belonging to the intersection of the results obtained in the two extreme lighting conditions; otherwise, it is not possible to cover the considered EV range.
4 IMPLEMENTATION AND
RESULTS
In this section we discuss the results obtained with our algorithm.
A Logitech Webcam Pro9000 has been used for testing. The algorithms discussed in Section 3 have been implemented in C under Linux, and the Video4Linux2 (v4l2) (Schimek et al., 2008) API has been used to control the webcam and to acquire the video frames. The ImageMagick suite of programs has been used for image manipulation and conversion. The values chosen for $L_1$ and $L_2$ of Equation 1 are 330 and 165,000, respectively; the threshold $S$ of Section 3.2 has been set to 15%.
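As an illustration of how the two fixed exposure times could be set from user space through v4l2, a minimal sketch is given below; the device path, the availability of these controls on the specific webcam, and the omitted error handling are assumptions of ours (values of V4L2_CID_EXPOSURE_ABSOLUTE are expressed in units of 100 µs).

#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

/* Set a single v4l2 control; returns 0 on success, -1 on failure. */
static int set_ctrl(int fd, unsigned int id, int value)
{
    struct v4l2_control ctrl = { .id = id, .value = value };
    return ioctl(fd, VIDIOC_S_CTRL, &ctrl);
}

int main(void)
{
    int fd = open("/dev/video0", O_RDWR);   /* hypothetical device node */
    if (fd < 0)
        return 1;

    /* Disable automatic exposure so that the two fixed exposure times can
     * be alternated frame by frame (assuming the driver exposes these
     * standard camera-class controls). */
    set_ctrl(fd, V4L2_CID_EXPOSURE_AUTO, V4L2_EXPOSURE_MANUAL);

    /* Dark frame: 1/1000 s = 10 x 100 us; light frame: about 1/30 s =
     * 333 x 100 us, i.e., the exposure times used in Section 4.3. */
    set_ctrl(fd, V4L2_CID_EXPOSURE_ABSOLUTE, 10);   /* before an odd frame  */
    /* ... acquire the dark frame ... */
    set_ctrl(fd, V4L2_CID_EXPOSURE_ABSOLUTE, 333);  /* before an even frame */
    /* ... acquire the light frame, then merge as in Equation 2 ... */

    return 0;
}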
4.1 Dynamic Range Measurement
The dynamic range measurement methodology that we have used relies on the aforementioned Kodak/Tiffen Q-13 gray scale. The target needs to be front-lit evenly; in our case we used a diffused light lamp. Keeping the light level constant, a series of pictures of the target is shot with different exposure values; we used exposure values in steps of 1/3 f-stop. Dynamic range is obtained by counting the number of gray patches that are visible in the images. Since the dynamic range of the sensor is, in most cases, larger than 6 and 2/3 f-stops, multiple images need to be used for this purpose. Using multiple images obtained with different exposure levels is approximately equivalent to having a longer grayscale. Multiple images are used in the following way:
1. The set A of the images in which all the 20 gray levels are distinguishable is identified.
2. The images with the lowest ($a_l$) and the highest ($a_h$) exposure values in set A are selected.
Figure 1: Overlapping of two different photos of the target to compute the dynamic range. Please note that some gray levels may not be distinguishable from one another on screen or on paper, but they are when measuring the luminosity with appropriate software.
3. The smallest set (I) of images with intermediate exposure values is selected from set A.
4. A single image (g) is formed by joining $a_l$, $a_h$, and the images in the set I such that the values of the gray levels overlap, as shown in Figure 1.
5. The number of gray levels present in g is counted. As 3 gray patches correspond to 1 EV, the dynamic range in dB is obtained by multiplying the number of gray levels of g by 2.0667.
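For example, 23 distinguishable gray levels in g correspond to approximately 23 × 2.0667 ≈ 47.5 dB of dynamic range.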
If no image shows the full range of gray levels (i.e.,
the dynamic range of the camera is lower than 6 and
2/3 f-stop), the dynamic range can be simply obtained
by selecting one of the images where the highest num-
ber of gray patches can be distinguished and counting
their number. To validate the procedure we performed
a test by using a camera with a known dynamic range.
The procedure to measure the dynamic range of images resulting from the merge is different, as the notion of exposure time is lost in the final images. Also in this case a set of images, obtained through the merge, is considered. Of all the obtained pictures, the brightest valid merge (one in which all 20 gray levels can be seen) and the darkest valid merge are selected. The darker original (i.e., pre-merge) frame of the darkest merged image and the lighter original frame of the brightest merged image then need to be considered. A gap in the gray scale will be present between these two images; this gap needs to be filled, with the same procedure described above for measuring the dynamic range of non-merged images, by using other (non-merged) images taken with exposure settings between those of the darker and of the lighter images. The total number of gray patches is obtained as 40 plus the number of gray patches used to fill the gap between the two original images.
SENSORNETS2013-2ndInternationalConferenceonSensorNetworks
184
Figure 2: Ghosting effect resulting from the merge of two
frames; the camera was moved horizontally. The ghosting
effect is visible on the borders of the lamp.
4.2 Ghosting Evaluation
Ghosting is due to motion of the camera and/or of the subject between successive frames that are later merged; due to this motion, the borders of some zones of the image are repeated multiple times in the resulting image, thus creating a ghost image, as shown in Figure 2. Ghosting depends on different parameters such as the speed of movement, the frames per second of the camera, the merge algorithm, the subject considered, and the lighting conditions. The higher the speed and the lower the frames per second, the higher the ghosting may potentially be. Ghosting is concentrated around the borders between zones with high luminosity and zones with low luminosity.
Some of the software methods for extending the
dynamic range of cameras discussed in Section 2 in-
clude techniques to prevent or mitigate the problem
of ghosting; hardware solutions do not usually have
ghosting problems. (Khan et al., 2006) describes an
approach to removing ghosting artifacts from high dy-
namic range images, without the need for explicit ob-
ject detection and motion estimation.
The presence of ghosting in the images is very difficult to quantify without considering a proper reference scene. We considered artificially created scenes, composed of gray patches. This leaves as variables only those connected to the merge algorithm considered. In particular, we considered two different merges obtained from: a dark frame with half of the patches underexposed and a light frame with half of the patches overexposed; and a dark frame with all the bins underexposed and a light frame with all bins correctly exposed. To simulate motion we shifted the second frame to the right by a certain number of pixels.
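A simple way to perform this horizontal shift on a frame before the merge, assuming the frame layout of the earlier sketches (the border handling is an arbitrary choice of ours), is the following.

#include <stdint.h>
#include <string.h>

/* Simulate horizontal camera motion by shifting a frame to the right by
 * `shift` pixels; the vacated left border is filled by replicating the
 * first remaining column (an arbitrary choice for this test). */
static void shift_right(uint32_t *frame, int width, int height, int shift)
{
    for (int y = 0; y < height; y++) {
        uint32_t *row = frame + (size_t)y * width;
        memmove(row + shift, row, (size_t)(width - shift) * sizeof(*row));
        for (int x = 0; x < shift; x++)
            row[x] = row[shift];
    }
}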
4.3 Results
We measured the dynamic range and we evaluated ghosting for each variant of the algorithm introduced in Section 3. The dynamic range measured for the Logitech Webcam Pro9000 that we used in our experiments is 47.5 dB, as shown in Figure 1.
Figure 3: Original pictures and the merge result obtained at EV 15 (full sunlight at noon): (a) starting light picture, (b) starting dark picture, (c) merge.
When considering the two-frame merge, we have chosen exposure times that can be used in most of the normal natural lighting conditions that can occur outdoors during the day in summer (from dark, EV 1, to full light, direct sunlight at noon, EV 15). Considering that the frame rate of our webcam is 20 frames per second, the longer exposure time must be shorter than 1/20 s. The webcam does not provide accurate results for exposure values below EV 7 (typical light 10 minutes after sunset); this greatly limits the possibility of obtaining exposure times that can be used in dark environments. The exposure values chosen for these experiments are 1/1000 s and 1/30 s (5 f-stops apart).
The dynamic range measured when applying the merge algorithm is 93 dB, which is almost double that of the considered camera (47.5 dB).
From the ghosting standpoint, the worst case happens when half of the bins of the dark image are underexposed and half of the bins of the light image are overexposed. In this case, as shown in Figure 4, halos are present at all vertical borders of the image (the movement is simulated to be horizontal in these experiments). Due to the sequence of dark and light images considered during the merge, ghosting is not constant in one direction: halos will be in the same direction as the motion in one frame and in the opposite direction in the successive frame.
The dynamic range of the IcyCAM is 132 dB (Rüedi and Gray, 2008). The IcyCAM is also free from ghosting, thus exhibiting better performance than our system. The IcyCAM, though, is much more expensive than a standard video-camera sensor.
Two-frame adaptive merge provides the same results, in terms of dynamic range and ghosting, as the two-frame merge when single final frames are considered. The main difference, in this case, is that there are two different possible sets of exposure values that can be considered, and this virtually doubles the number of lighting conditions supported.
AnAlgorithmforExtendedDynamicRangeVideoinEmbeddedSystems
185
Figure 4: Halos when half of the bins of the dark image are underexposed and half of the bins of the light image are overexposed: (a) half bins under-exposed, (b) half bins over-exposed, (c) merge.
This algorithm does not provide the same dynamic range, in a single image, as a merge obtained from three images. Merging three frames, though, significantly increases the problems with ghosting.
5 CONCLUSIONS AND FUTURE
WORK
In this paper we discussed two variants of an algorithm aimed at extending the dynamic range of video cameras. The main characteristic of this algorithm is that it allows using a normal video camera (i.e., with no special sensors and with no specific knowledge about it) while maintaining the same frame rate of which the camera is capable. Furthermore, the algorithm is quite simple and, therefore, it uses limited computational resources. In the paper we showed, through experimental results, the effectiveness of the algorithm in enlarging the dynamic range of the camera adopted.
Future work will concentrate on methods for re-
moving ghosting in the final frames. The method of
choice must be compatible with low-cost and low-
resources embedded systems: it must not be compu-
tationally too expensive.
REFERENCES
(2009). ViSSee AG. http://www.vissee.ch.
(2012). Canon EOS 5D Mark III Instruction Manual, pages
173–176.
(2012). ISO 5800:1987 - Colour negative films for still photography - Determination of ISO speed. http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=11948.
Arm, C., Gyger, S., Heim, P., Kaess, F., Nagel, J. L., Rüedi, P. F., and Todeschini, S. (2008). IcyCAM - a high dynamic range vision system. Technical report, CSEM - Centre Suisse d'Electronique et de Microtechnique.
Bandoh, Y., Qiu, G., Okuda, M., Daly, S., Aach, T., and Au, O. C. (2010). Recent advances in high dynamic range imaging technology. In Proceedings of the IEEE International Conference on Image Processing (ICIP), pages 3125–3128.
Castro, T., Chapiro, A., Cicconet, M., and Velho, L. (2011).
Towards Mobile HDR Video. Poster: IEEE Interna-
tional Conference on Computational Photography.
Guarnieri, G., Marsi, S., and Ramponi, G. (2011). High dynamic range image display with halo and clipping prevention. IEEE Transactions on Image Processing, 20(5):1351–1362.
Jinno, T. and Okuda, M. (2012). Multiple exposure fusion for high dynamic range image acquisition. IEEE Transactions on Image Processing, 21(1):358–365.
Khan, E., Akyuz, A., and Reinhard, E. (2006). Ghost Re-
moval in High Dynamic Range Images, pages 2005–
2008. IEEE.
Lu, P.-y., Wu, M.-s., Cheng, Y.-t., and Chuang, Y.-y. (2009).
High dynamic range image reconstruction from hand-
held cameras. IEEE Conference on Computer Vision
and Pattern Recognition (2009), 54(1):509–516.
Mertens, T., Kautz, J., and Van Reeth, F. (2007). Exposure
fusion. 15th Pacific Conference on Computer Graph-
ics and Applications PG07, pages 382–390.
Rüedi, P.-F. and Gray, S. (2008). IcyCAM - high dynamic range system-on-chip for vision systems combining a 132 dB QVGA pixel array and a 32-bit DSP/MCU processor. Technical report, CSEM - Centre Suisse d'Electronique et de Microtechnique.
Schimek, M. H., Dirks, B., Verkuil, H., and Rubli, M.
(2008). Video for linux two API specification.
Schulz, S., Grimm, M., and Grigat, R.-R. (2007). Using brightness histogram to perform optimum auto exposure. In WSEAS Transactions on Systems and Control, volume 2, pages 93–100. World Scientific and Engineering Academy and Society.
Wang, H., Cao, J., Tang, L., and Tang, Y. (2011). HDR image synthesis based on multi-exposure color images. In Tan, H., editor, Informatics in Control, Automation and Robotics, volume 132 of Lecture Notes in Electrical Engineering, pages 117–123. Springer Berlin Heidelberg.
Xinqiao, L. and Gamal, A. E. (2003). Synthesis of high dynamic range motion blur free image from multiple captures. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, 50(4):530–539.
SENSORNETS2013-2ndInternationalConferenceonSensorNetworks
186