Safeguarding Privacy by Reliable Automatic Blurring of Faces in Mobile Mapping Images

Steven Puttemans, Stef Van Wolputte and Toon Goedemé

EAVISE Research Group, KU Leuven, Jan Pieter De Nayerlaan 5, Sint-Katelijne-Waver, Belgium

Keywords:

Cycloramic Imagery, Mobile Mapping, Pedestrian Detection, Application Speciﬁc Constraints, Soft Blurring.

Abstract:

When capturing images in the wild containing pedestrians, privacy issues remain a major concern for indus-

trial applications. Our application, collecting cycloramic mobile mapping data in crowded environments, is an

example of this. If the data is processed and accessed by third parties, privacy of pedestrians must be ensured.

This is where pedestrian detectors come into play, used to detect individuals and privacy mask them through

blurring. The problem of undesired false positive detections, typical for pedestrian detectors and unavoidable,

still leaves undesired areas of the images being blurred. We tackled this problem using application-speciﬁc

scene constraints, modelled by a height-position mapping based on scene-speciﬁc pedestrian annotation data,

combined with reducing the ﬁeld of interest and case-speciﬁc false positive elimination classiﬁers. We ap-

plied a soft blurring technique to avoid the artiﬁcial look of simply applying Gaussian blurring to the found

detections, which results in an effective fully-automated masking pipeline for privacy safeguarding in mobile

mapping images. We show that we can use pre-trained pedestrian detection models and that, by collecting a limited

amount of application-specific annotations and by exploiting scene-specific constraints, we are able to boost

the detection accuracy considerably.

1 INTRODUCTION

In mobile mapping applications, a vehicle equipped

with cameras is used to grab images in order to give

the user a digital view of the surroundings. This is

repeated at preset intervals in order to ensure that the

complete surroundings of the car are being captured.

Companies like Google, but also local land surveying

ofﬁces, are carrying out such measuring campaigns

to make digital images of streets across the globe.

When collecting all this data, one can imagine that

the amount of data increases drastically once someone

is capturing larger projects, e.g. the ‘Google Street

View’ application. The goal of capturing all this data

is providing users with fast, accurate and detailed data

measurements for producing all kinds of 2D and 3D

geographical information systems.

Avoiding pedestrians walking around when capturing

mobile data is nearly impossible, which raises

privacy issues whenever they are recorded. Especially

when this data is shared with or sold to industrial

partners, it is important that the privacy of these

pedestrians is guaranteed. Therefore companies are

continuously looking for robust solutions able to ﬁlter

out privacy-sensitive content from the captured data.

One solution could be to manually browse the

data, indicating every pedestrian and making them

privacy-safe by applying a blurring ﬁlter to the an-

notations. In the case that the amount of data is rather

limited, this might be the fastest and most accurate

solution. However when the data size rises over sev-

eral millions of captured images a week, one immedi-

ately notices that this approach is no longer suitable.

In those cases an automated unsupervised approach is

preferred. One of the most frequently used techniques

in tackling this problem is applying pedestrian detec-

tion algorithms like (Dalal and Triggs, 2005; Viola

and Jones, 2001; Dollár et al., 2009; Felzenszwalb

et al., 2008) on the captured mobile mapping data,

marking possible pedestrian-like areas in the image.

These in turn can then be blurred or cut out, to avoid

transferring privacy-sensitive data.

A major downside of existing pedestrian detectors

is that they require the manual selection of a thresh-

old on the detection certainty score to ﬁnd a good bal-

ance between ﬁnding actual pedestrians in the image

and avoiding false positive detections. If the thresh-

old is set too strict, we will only detect pedestrians

but be unable to ﬁnd all of them, and thus privacy is-

sues arise again. If we put the threshold too sloppy, all


Puttemans, S., Wolputte, S. and Goedemé, T.

Safeguarding Privacy by Reliable Automatic Blurring of Faces in Mobile Mapping Images.

DOI: 10.5220/0005784304060415

In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016) - Volume 3: VISAPP, pages 408-417

ISBN: 978-989-758-175-5

pedestrians will be found, but similar objects or areas

will trigger a false positive detection such that other

objects will be blurred. The mobile mapping commu-

nity wants to avoid this at all costs, because most data

is used to derive GIS systems, which need to be as

accurate and complete as possible.

In this paper, we propose an effective post-

ﬁltering step using scene-speciﬁc constraints, by set-

ting a sloppy detection certainty threshold, avoiding

false negative detections (missed pedestrians), but ad-

ditionally ensuring the removal of false positive de-

tections using several effective post-processing steps.

Furthermore we expand the system with additional

small color based classiﬁers able to remove even

more false positives. Finally we provide an elegant

soft blurring approach for safeguarding the privacy of

pedestrians inside the mobile mapping images.

The remainder of this paper is structured as fol-

lows. Section 2 presents related research, while sec-

tion 3 discusses the data collection. This is followed

by section 4 in which the proposed approach is dis-

cussed in detail. Finally section 5 elaborates on the

obtained results while section 6 sums up conclusions

and possible future improvements.

2 RELATED WORK

Pedestrian detectors come in different types and ﬂa-

vors. The main difference lies in the ﬂexibility of the

model, where we distinguish between rigid and non-rigid

approaches. Rigid approaches focus on an object al-

ways being in the same constellation, with only one

large part trained as a model. Such an approach is suggested

by (Dalal and Triggs, 2005; Viola and Jones,

2001), where a rigid model based on gradients is fed

to a support vector machine or a boosting step. A

downside is that they are trained on a ﬁxed frontal

view of the object. Non-rigid approaches on the other

hand try to model objects as a combination of deformable

parts, consisting of several rigid parts (arms,

head, torso and legs), and a deformation relationship

between them (Felzenszwalb et al., 2008). As pedes-

trians tend to move and change position frequently,

we decided to use a non-rigid detector. Most pedes-

trian detectors discard color information, because of

the wide variation in clothes and appearance. However,

more recent techniques like (Dollár et al., 2009;

Dollár et al., 2010) show that including color information

can yield a significant increase in performance.

(Van Beeck et al., 2012) introduces a warping

window approach where fast real-time vision-based

pedestrian detection is obtained by calibrating the

height and orientation of the pedestrian at each spe-

ciﬁc image location. We prove that we can apply a

similar technique, as a post-processing step after the

detection phase, by learning a relation between the

height and position of an average pedestrian from a

limited set of application-speciﬁc annotations.

(Puttemans and Goedemé, 2013) show that using

application-specific information is one way to

improve the accuracy of object detection algorithms.

Similar rules apply for pedestrian detection, as far as

the application allows you to ﬁnd some application-

speciﬁc constraints. In our application we exploit the

fact that the camera is mounted on top of a car, at a fixed

position with respect to the ground plane, resulting in

a relation between the position and the height of any

given pedestrian. Furthermore we exploit the anno-

tated training data to learn regions of interest, avoid-

ing the processing of undesired image regions, like

the sky or on top of buildings. (Cho et al., 2012; Peng

et al., 2015; Dibra et al., 2015) describe a similar use

of a ground plane assumption for 3D modeling and

multiple camera view processing.

For privacy masking, several solutions have been

proposed. (Tanaka et al., 2015) tries to deﬁne how

much blurring is needed to reach a certain level of pri-

vacy. (Panagiotis, 2015) applies a simple block based

blurring, whereas (Nakashima et al., 2015) suggests

to use image melding, replacing a person’s face with

a ﬁxed neutral expression instead of blurring. Our

application still demands masking, but to avoid the

hardness of block based blurring, we propose to use a

smooth soft blurring approach.

3 DATASETS

This research is developed on top of two mobile mapping

datasets, which are made publicly available, to

encourage further research in this area.

The ﬁrst dataset is a series of mobile mapping cy-

cloramic images with a resolution of 4800×2400 pix-

els, captured using a LadyBug 1 camera setup, in a

quiet and calm urban area in the Netherlands. The

captured images give a full 360 degree view of the

surroundings of the car at any given position. The

camera itself is ﬁxed and mounted on the top of the

roof of the car. The set has 450 images under day-

light conditions. The dataset is used to develop and

ﬁne-tune the suggested approach.

The second dataset was captured using a LadyBug

2 camera, having a resolution of 8000 × 4000 pixels,

again mounted on top of the roof of the car, containing

45 images of a train and bus station in Belgium. We

http://www.eavise.be/MobileMappingDataset


Figure 1: Example frames for both datasets used: (top)

Dataset 1 - urban area in the Netherlands; (bottom) Dataset

2 - Belgian train and bus station.

used this dataset to prove that the developed approach

is independent of the application-speciﬁc settings like

camera setup and application environment, except for

deﬁning the actual height-position relation used to im-

prove the detection success rate. An example of both

mobile mapping datasets can be seen in Figure 1.

All database images were manually annotated to

provide ground truth data for the actual locations of

pedestrians. For the ﬁrst dataset this led to 240 pedes-

trian annotations while the second dataset contained

1630 pedestrian annotations. The large difference is

mainly due to the recorded surroundings, where a

train and bus station is likely to have more pedestrians

walking around in each mobile mapping image.

4 APPROACH

Our approach can be split into several processing

blocks, as seen in Figure 2. First we create a limited

amount of ground truth annotations for both datasets,

needed for both building the height-position relation

and inferring the color-speciﬁc constraints for learn-

ing the false positive elimination classiﬁers. The an-

notations are also used for validating each additional

post-processing step. At runtime, we apply a multi-scale

pedestrian detection algorithm to each input image in a

sliding window manner, storing the detection results and

their detection certainty scores. Based on the annotations

we apply a valid region reduction, a height-position

relation and a certainty score thresholding, all leading

to an efficient pruning of the obtained detections. For

specific object classes that still trigger false positive

detections, we design specific color based false positive

elimination classifiers, which in turn further improve

the results. The main goal is thus to remove as many

false positives as possible and increase the resulting

detection accuracy. The detections found are then

passed on to an elegant soft blurring step to ensure

privacy safeguarding.

Figure 2: Block diagram of the suggested approach.

For our research we used an implementation

(De Smedt et al., 2012) of the cascaded Felzenszwalb

LatentSVM4 detector, a part based object detector

for efficient pedestrian detection. The reasons for this

choice are straightforward. First, a part based

detector is non-rigid and thus captures the different

poses of pedestrians efficiently. Second, a fast and

optimized C++ implementation is available. However,

our post-processing is independent of the pedestrian

detector used, so basically

it could be replaced by any out-of-the-box pedestrian

detector. This is one of the major beneﬁts of our sys-

tem, encouraging cross-dataset evaluation.

A downside to every pedestrian detection algo-

rithm is that one must ﬁlter the output based on the

object detection certainty score, by selecting a spe-

ciﬁc threshold and thus locking down on a speciﬁc

point on the precision recall curve of that detector. If

we decide to put the threshold too sloppy, we get an

increase in false positive detections, while at the same

time reducing the number of false negative detections

(every single pedestrian will likely be returned). This

would lead to an enormous amount of privacy mask-

ing, however also removing a lot of useful informa-

tion from the image. This is unacceptable for the

mobile mapping community, where the data is used

to create high quality 2D and 3D GIS systems. On

the other hand, if we put the threshold very strict,

the amount of false positives will decrease drastically,

but we will get an increase in false negative detec-

VISAPP 2016 - International Conference on Computer Vision Theory and Applications


tions, which in our application of privacy masking

would not be acceptable. Our approach therefore uses

a sloppy threshold for obtaining every possible pedes-

trian as a true positive detection, and subsequently

uses smart post-processing to efficiently remove as

many false positive detections as possible.

4.1 Scale-space Location Relation

When considering our application of mobile mapping

using a fixed 360 degree cycloramic camera, we know

that the height of the camera relative to its

environment is fixed, provided that we assume a flat

ground plane that never changes

drastically. This is a crucial scene

constraint, allowing us to take into account that every

pedestrian in the image, walking on the street or on

the sidewalks, will have an average ﬁxed height in re-

lation to the position in the ﬁnal cycloramic images.

People closer to the car and thus to the camera will be

larger, while people further away will appear closer to

the horizon of the image and thus be smaller. For

any given horizontal line in the image, we can state

that all pedestrians on that line will have the same av-

erage height, of course keeping in mind that we have

a natural height variance within pedestrians.

4.1.1 Mapping Ground Truth Annotations

In order to model a height-position relation we started

by mapping out the ground truth annotations collected

on the ﬁrst dataset. The effort of annotating a smaller

part of application-speciﬁc data, to be used for deriv-

ing scene constraints, is small compared to training a

complete new pedestrian detector (which needs much

more annotations and processing time). The result of

these manual annotations can be seen in Figure 4(a),

where the height of each annotation is mapped in re-

lation to the position, deﬁned as the center of gravity.

During the annotation phase only pedestrians on side-

walks, parks and roads were annotated. If a pedes-

trian would be standing on a balcony of a building,

this pedestrian was not taken into account.

Figure 3: Applying borders for minimal and maximal

pedestrian height, as deﬁned by the blue borders in 4(a).

4.1.2 Model Fitting and Region Reduction

In relation to the data mapping seen in Figure 4(a) we

ﬁt a linear model to the data points and apply a search

for image region boundaries. The red curve is the

ﬁtted linear relation to the mapped annotation data,

which models the relation between pedestrian height

and pedestrian position, relative to the camera posi-

tion. The green borders are based on the assumption

that we have a Gaussian data distribution compared to

the ﬁtted model, and that these borders should capture

99.7% of all detections using the rule of [−3σ, +3σ].

The reasons for this allowed model deviation are quite

straightforward. First of all we have a natural devia-

tion in pedestrian height, while secondly, due to the

car's suspension, the camera height is not completely

ﬁxed to the ground plane. Thirdly, there is a possi-

ble deviation from the ﬂat ground plane assumption

caused by height differences due to sidewalks, defects

in the road, speed bumps, etc. The blue borders deﬁne

allowed position regions for pedestrians in the image,

assuming the training data covers a wide variety of

pedestrian positions specific to the application. This

is visualized in Figure 3 and allows us to immediately

ignore detections that are outside these regions, re-

moving about 50% of the image, and thus lowering

the chance of false positive detections occurring.
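
The model fitting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the position of an annotation is taken to be the vertical pixel coordinate of its center, the blue borders are approximated by the smallest and largest annotated position, and the function names and sample values are hypothetical.

```python
import math


def fit_height_position(annotations):
    """Least-squares fit of height = slope * y + intercept.

    annotations: list of (y_center, height) pairs in pixels (assumed layout).
    Returns (slope, intercept, sigma), where sigma is the standard
    deviation of the residuals around the fitted line.
    """
    n = len(annotations)
    if n < 3:
        raise ValueError("need at least three annotations")
    mean_y = sum(y for y, _ in annotations) / n
    mean_h = sum(h for _, h in annotations) / n
    sxx = sum((y - mean_y) ** 2 for y, _ in annotations)
    if sxx == 0:
        raise ValueError("all annotations share the same position")
    sxy = sum((y - mean_y) * (h - mean_h) for y, h in annotations)
    slope = sxy / sxx
    intercept = mean_h - slope * mean_y
    sq_res = sum((h - (slope * y + intercept)) ** 2 for y, h in annotations)
    sigma = math.sqrt(sq_res / (n - 2))
    return slope, intercept, sigma


def height_bounds(model, y_center, n_sigma=3.0):
    """Lower and upper allowed height at a position (the 3-sigma band)."""
    slope, intercept, sigma = model
    expected = slope * y_center + intercept
    return expected - n_sigma * sigma, expected + n_sigma * sigma


def valid_position_range(annotations):
    """Approximate the allowed region by the annotated position extremes."""
    ys = [y for y, _ in annotations]
    return min(ys), max(ys)


# Hypothetical annotations, not taken from the published datasets.
annotations = [(1250, 110), (1300, 160), (1350, 205), (1400, 260), (1450, 300)]
model = fit_height_position(annotations)
print(model, height_bounds(model, 1325), valid_position_range(annotations))
```

Keeping the fit and the bounds as separate functions means the band width (here three standard deviations) can be changed without refitting.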

4.1.3 Applying Constraints on Detection Data

Figure 4(b) visualizes the detections obtained by our

pedestrian detection algorithm. When applying the

realistic pedestrian occurring boundaries, calculated

from the annotated data in the previous subsection,

we obtain the green dots, representing pedestrian de-

tections in reasonable and allowed positions in the

image. We do notice that this allows us to drop a

signiﬁcant amount of false positive detections. Sub-

sequently we force the green borders on top of the

green data, demanding that our detections also ﬁt our

height-position relation created from the manually annotated

data. This in turn removes a large part of the

false positive detections, keeping only the red detec-

tions as acceptable pedestrian detections.

4.1.4 Updating the Distribution Constraint

We acknowledge that assuming a Gaussian distribu-

tion around the ﬁtted linear height-position relation

might not always be the best choice, especially when

you consider the fact that when moving further from

the car, differences in pedestrian height become less

obvious to notice, certainly at pixel level, whereas

close to the car height differences are clearly visible.

Therefore we updated the green borders, to closely


(a) Ground truth annotations with general and narrow bounds. (b) Detections with pruning steps applied.

Figure 4: The height-position relation building process.

map the correct distribution of the annotation data,

which can be seen in Figure 4(a) as the magenta bor-

ders. Applying those updated constraints on the ac-

tual detection output, again removes several false pos-

itive detections, resulting in the magenta colored de-

tections seen in Figure 4(c).

4.1.5 Detection Certainty Thresholding

Before applying all the constraints deﬁned in the pre-

vious subsections on top of the detection output, we

decided to put the detection certainty threshold very

sloppy, to ensure that the amount of false negative

detections is close to 0%. Now that we have auto-

matically removed multiple false positive detections,

we can look back to this setting and adapt it to our

application-speciﬁc needs. Due to less cluttered im-

ages ﬁlled with detections, since most false positives

are removed now, it becomes easier to select a decent

score threshold for our application. From experience

in using pedestrian detectors in the wild, we learned

that the used LatentSVM4 detector almost never re-

turns valid pedestrian detections when the certainty

score is below 0. Of course this value is application

speciﬁc and can change drastically when considering

other application ﬁelds. In our application, detections

with lower scores mainly resemble objects that have

similar feature descriptions, like a smaller tree or a

trafﬁc sign, but in 99% of the cases, they do not match

actual pedestrians. Since we want to avoid blurring

too much valuable image information, we enforce an

extra pruning rule, demanding a detection certainty

score equal to or above 0. This results in the black

detections, seen in Figure 4(d).
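
The successive pruning rules of this section (valid region, height-position band, certainty score) can be chained as in the sketch below. It is a hedged illustration rather than the pipeline used in the paper: the `Detection` record, the `(slope, intercept, sigma)` model tuple and all numeric values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    y_center: float  # vertical position of the box center, in pixels
    height: float    # box height, in pixels
    score: float     # detection certainty score


def prune_detections(detections, y_range, model, n_sigma=3.0, min_score=0.0):
    """Keep detections that pass all three constraints, in order."""
    y_min, y_max = y_range
    slope, intercept, sigma = model
    kept = []
    for det in detections:
        # 1. valid region: drop detections outside the allowed positions
        if not (y_min <= det.y_center <= y_max):
            continue
        # 2. height-position relation: drop heights outside the band
        expected = slope * det.y_center + intercept
        if abs(det.height - expected) > n_sigma * sigma:
            continue
        # 3. certainty score: drop detections scoring below the threshold
        if det.score < min_score:
            continue
        kept.append(det)
    return kept


# Hypothetical model and detections for illustration only.
model = (0.5, 0.0, 10.0)
detections = [
    Detection(50, 25, 1.0),     # outside the valid region
    Detection(200, 100, 0.5),   # passes every rule
    Detection(200, 200, 0.5),   # far too tall for its position
    Detection(300, 150, -0.5),  # score below zero
    Detection(300, 160, 0.0),   # passes every rule
]
print(prune_detections(detections, (100, 400), model))
```

Because the rules only read the box geometry and the score, the same function can be applied to the output of any detector that reports those three values.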

4.1.6 Visually Verifying the Filtered Detections

When visually checking the data, we wondered why

very small pedestrians in the background were ignored

by the pedestrian detection interface. As seen in

Figure 4(d) we calculated the smallest retrieved detec-

tion height by the DPM detector, which had a height


of 105 pixels. Considering this in relation to the pre-

trained pedestrian model, this is actually normal, be-

cause the model is trained with a ﬁxed training sample

height of 124 pixels, keeping a small area of back-

ground around the pedestrian, also called padding. At

detection time, the model’s dimensions always limit

the smallest possible detection height, so if we would

like to include these smaller pedestrians, we should

ﬁrst upscale the input data. However we should keep

in mind that this introduces image artifacts which

could interfere with the pedestrian detector. In our

application this is no problem, since pedestrians with

a height smaller than 100 pixels are already privacy

secure and impossible to recognize when looking at

the complete mobile mapping image of 8000 × 4000

pixels (Tanaka et al., 2015).

4.2 Color-based Removal of Pedestrian-like Detections

Even with all the proposed post-ﬁltering steps ap-

plied, we noticed that some object classes contin-

uously succeeded in triggering false positive detec-

tions. Take for example the case of small trafﬁc signs

indicating the trafﬁc ﬂow when entering a round-

about, as seen in Figure 5. As humans we clearly see

the difference between a pedestrian and this rigid traf-

ﬁc sign. However due to the speciﬁc nature of pedes-

trian detection algorithms, it is normal that these false

positive detections occur. First of all, the used al-

gorithm (Felzenszwalb et al., 2008) ignores color in-

formation, since the variety of color in pedestrians is

enormous. Secondly as feature it uses edge informa-

tion of deformable parts. And this is exactly where

the biggest problems occur. The mentioned trafﬁc

sign has a top part that is very similar to a head and

a middle and bottom part that have similar feature re-

sponses as a human body. Since the body and the head

are parts with a big weight in part-based pedestrian

models, it is important to add an extra pruning step

to remove these false positive detections that are still

classiﬁed as valid detections by our pipeline. Espe-

cially in the context of mobile mapping it is important

that crucial road information is not ﬁltered or blurred

out due to privacy reasons, because many clients interested

in this data are looking for exact locations of

traffic signs like this, e.g. to keep an automated index

of road sign conditions.

Figure 5: Example of the need of an extra filter for traffic signs still passing the post-processing steps.

(a) Positive training set.

(b) Negative training set.

red = pedestrian).

Figure 6: Positive, negative training and test set for traffic sign filtering.

To avoid this kind of problem we propose a simple

pruning step using a small Naive Bayes classifier.

This machine learning technique takes a limited set

of positive and negative training samples and, based

on some very simple color-based features calculated

from the training data, decides whether a valid de-

tection should still be classiﬁed as pedestrian or not.

We prefer using a machine learning approach over

setting hard thresholds on basic features, because it is

more robust in ﬁnding the optimal separation between

classes once more training data is supplied.
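
To make the idea concrete, the following is a minimal Gaussian Naive Bayes classifier written from scratch. It is a sketch under assumptions: the paper does not state which Naive Bayes variant or library was used, the Gaussian likelihood and the variance smoothing term are choices made here, and the training values are invented for illustration.

```python
import math


class GaussianNaiveBayes:
    """Tiny Gaussian Naive Bayes for small, low-dimensional training sets."""

    def __init__(self, smoothing=1e-3):
        # smoothing keeps variances away from zero with very few samples
        self.smoothing = smoothing
        self.stats = {}
        self.priors = {}

    def fit(self, samples, labels):
        for label in set(labels):
            rows = [s for s, l in zip(samples, labels) if l == label]
            n = len(rows)
            columns = list(zip(*rows))
            means = [sum(col) / n for col in columns]
            variances = [
                sum((v - m) ** 2 for v in col) / n + self.smoothing
                for col, m in zip(columns, means)
            ]
            self.stats[label] = (means, variances)
            self.priors[label] = n / len(samples)
        return self

    def _log_posterior(self, x, label):
        means, variances = self.stats[label]
        log_p = math.log(self.priors[label])
        for v, m, var in zip(x, means, variances):
            log_p -= 0.5 * math.log(2 * math.pi * var)
            log_p -= (v - m) ** 2 / (2 * var)
        return log_p

    def predict(self, x):
        return max(self.stats, key=lambda label: self._log_posterior(x, label))


# Hypothetical 3-dimensional color features, not measured from the datasets.
signs = [(0.05, 0.50, 0.90), (0.10, 0.45, 0.95), (0.02, 0.55, 0.85)]
pedestrians = [(0.40, 0.35, 0.30), (0.50, 0.50, 0.45), (0.30, 0.40, 0.50)]
clf = GaussianNaiveBayes().fit(
    signs + pedestrians, ["sign"] * 3 + ["pedestrian"] * 3
)
print(clf.predict((0.05, 0.45, 0.90)), clf.predict((0.40, 0.40, 0.40)))
```

Adding training samples only changes the per-class means and variances, which is why the decision boundary adapts without any hand-tuned threshold.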

As seen in Figure 6, we use a small positive and

negative training set (both containing only 5 samples),

where we tried to include as many traffic-sign-like

pedestrians in the negative set as possible (by looking

for matching colors), to prevent those from

being filtered out, e.g. when someone is wearing

a bright jacket. Finally we constructed a small test

set to evaluate the success rate of our classiﬁer. From

each training sample a set of simple visual features

is calculated. In this case the most distinctive feature is

the bright yellow color of the ‘body’ part of the traffic

sign. We separate the top 30% of the image and

then split up the bottom 70% in 3 equal regions, as

seen in Figure 7(a). The middle area is then transferred to

the CMYK color space (Figure 7(b)) because the traffic

sign has a very low response in the C layer (Figure

7(c)), an average response in the M layer (Figure 7(d))

and a high response in the Y layer (Figure 7(e)).

Pedestrians do not show this behavior. We take the average

CMY values for this smaller window and use

them as feature vector for each positive and negative

sample. The K channel is simply ignored.

Figure 7: Naive Bayes Features: (a) original (b) CMY(K)
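
The feature computation can be sketched as follows. Several details are assumptions rather than facts from the paper: the three regions are taken to be stacked vertically, the RGB to CMYK conversion is the simple profile-free formula, and the detection window is represented as a list of rows of `(r, g, b)` tuples.

```python
def rgb_to_cmyk(r, g, b):
    """Convert 8-bit RGB to CMYK components in [0, 1] (naive conversion)."""
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    k = 1.0 - max(r, g, b)
    if k >= 1.0:
        # pure black: C, M and Y are undefined, report them as zero
        return 0.0, 0.0, 0.0, 1.0
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return c, m, y, k


def cmy_feature(window):
    """Average C, M, Y over the middle third of the bottom 70% of a window.

    window: list of rows, each row a list of (r, g, b) tuples.
    """
    height = len(window)
    top = (3 * height) // 10          # skip the top 30% ('head' part)
    band = (height - top) // 3        # height of one of the 3 regions
    if band == 0:
        raise ValueError("window too small to split into regions")
    middle = window[top + band: top + 2 * band]
    sums = [0.0, 0.0, 0.0]
    count = 0
    for row in middle:
        for r, g, b in row:
            c, m, y, _k = rgb_to_cmyk(r, g, b)  # the K channel is ignored
            sums[0] += c
            sums[1] += m
            sums[2] += y
            count += 1
    return tuple(s / count for s in sums)


# Hypothetical 10 x 2 window: blue everywhere, yellow in the middle band.
blue, yellow = (0, 0, 255), (255, 255, 0)
window = [[blue, blue] for _ in range(10)]
window[5] = [yellow, yellow]
window[6] = [yellow, yellow]
print(cmy_feature(window))
```

The resulting three numbers are what a classifier such as Naive Bayes would receive for each training and test sample.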

Finally when running the classiﬁer on the test set

provided, the samples were all classiﬁed correctly ei-

ther as pedestrian or as trafﬁc sign and thus the simple

classiﬁer proved to work as an effective post-ﬁltering

step. Similar behavior was observed for specific kinds

of bushes; in this case too, an extra small Naive

Bayes ﬁlter could be constructed. The advantage of

this approach is that at post-processing time, the cal-

culation of these extra ﬁlters is computationally very

cheap ( 1ms) due to the very simple features used and

thus a small cost for a better classiﬁcation result.

4.3 Soft Blurring Approach

The ﬁnal step of our proposed pipeline is to obtain

the valid detected pedestrian regions and apply a local

privacy-safeguarding filter to them. In our cooperation

with mobile mapping companies it became

clear that they want to manually deﬁne which part of

the detection is being blurred. Therefore we provided

the option for both pedestrian and face region blur-

ring. An intuitive way to apply privacy-safeguarding

would be to apply a standard Gaussian blurring ﬁl-

ter. One of the main downsides to this is the exis-

tence of very prominent edge artifacts which cannot

be removed, as seen in the left part of Figure 8. We

would prefer a blurring ﬁlter that is not as strong on

the edges, as seen in the right part of Figure 8, but

which is strong in the middle and softens up towards

the edges of a detection. This ensures privacy but the

end result is visually more pleasing.

Figure 8: Blurring filters (left) standard Gaussian blur (right) smooth blurring filter.

Instead of convolving the image region with a

Gaussian kernel with a fixed size and sigma, we propose

a convolution with an adaptable Gaussian kernel,

where the sigma ($\sigma_{\mathrm{kernel}}$) is defined as a function

of the normalized pixel distance Ψ to the center of

the detection itself as described in equations (1), (2)

and (3). To ensure that the blurring is proportional for

differently sized pedestrian detections, we add an extra

size dependency ∆, which takes into account the

area of the detection found compared to the area of

the original image. This ensures that in the end each

detection is equally blurred.

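$$\Psi = 1 - d(\mathrm{center}_{\mathrm{detection}}, \mathrm{position}) \qquad (1)$$

$$\Delta = \frac{\mathrm{area}(\mathrm{pedestrian})}{\mathrm{area}(\mathrm{image})} \qquad (2)$$

$$\sigma_{\mathrm{kernel}} = 0.1 + (\Delta \Psi) \qquad (3)$$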
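
A small pure-Python sketch of a spatially varying Gaussian blur driven by equations (1) to (3) is given below. It rests on assumptions that are not stated in the text: the distance d is normalized by the half-diagonal of the detection box, the image is a grayscale list of rows, and a hypothetical `gain` factor scales the ∆Ψ term, since an area ratio on its own yields very small sigma values.

```python
import math


def sigma_map(box_w, box_h, img_w, img_h, gain=1.0):
    """Per-pixel kernel sigma inside a detection box, following Eq. (1)-(3)."""
    delta = (box_w * box_h) / float(img_w * img_h)            # Eq. (2)
    cx, cy = (box_w - 1) / 2.0, (box_h - 1) / 2.0
    max_dist = math.hypot(cx, cy) or 1.0                      # assumed normalizer
    sigmas = []
    for y in range(box_h):
        row = []
        for x in range(box_w):
            psi = 1.0 - math.hypot(x - cx, y - cy) / max_dist  # Eq. (1)
            row.append(0.1 + gain * delta * psi)              # Eq. (3), gain assumed
        sigmas.append(row)
    return sigmas


def soft_blur(image, box, gain=1.0):
    """Blur the box (x0, y0, w, h) strongly at its center, softly at its edges.

    The box is assumed to lie fully inside the image.
    """
    x0, y0, w, h = box
    img_h, img_w = len(image), len(image[0])
    sigmas = sigma_map(w, h, img_w, img_h, gain)
    out = [list(row) for row in image]
    for by in range(h):
        for bx in range(w):
            sigma = sigmas[by][bx]
            radius = int(math.ceil(3 * sigma))
            px, py = x0 + bx, y0 + by
            total = weight_sum = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # clamp neighbors to the image border
                    qx = min(max(px + dx, 0), img_w - 1)
                    qy = min(max(py + dy, 0), img_h - 1)
                    wgt = math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
                    total += wgt * image[qy][qx]
                    weight_sum += wgt
            out[py][px] = total / weight_sum
    return out


# Hypothetical 9 x 9 image with one bright pixel in the middle of the box.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 255.0
blurred = soft_blur(image, (2, 2, 5, 5), gain=20.0)
print(round(blurred[4][4], 2), blurred[2][2], blurred[0][0])
```

Since sigma falls back to 0.1 at the box corners, the filtered region fades into the untouched image instead of ending in a hard edge.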

We apply this soft blurring ﬁlter to every pedes-

trian detection in a given input image, blur out the

detected pedestrians or their associated face region

and make the captured mobile mapping image pri-

vacy safe. In our application we applied face blurring

which can be seen in Figures 9 and 10. This is simply

passed as an extra parameter to our smooth blurring

function. In order to make the blurring regions more

visible we also visualized the actual detections.

Figure 9: Close-up of privacy smoothing using only the

face region of the detection.


Figure 10: Applying soft blurring on ﬁltered pedestrian detections, but limiting the blurring to face regions only.

5 RESULTS

We applied the same post-processing steps discussed

in the previous sections (enforcing valid pedestrian re-

gions, applying a height-position relation and adding

a scoring threshold) to the second dataset and got

similarly good improvements. Only the color-based

Naive Bayes classiﬁcation was left out, since the spe-

ciﬁc object class (roundabout trafﬁc sign) did not oc-

cur inside the second dataset. We did not explicitly

look into an object class speciﬁc Naive Bayes ﬁlter for

the second dataset, but if such false positive trigger-

ing object class would occur, one could simply train a

classiﬁer for that class using our software. The result

of pruning the detection output can be seen in Figure

11, while visual results of applying these constraints

can be seen in Figure 13. Especially pay attention to

the false positive detections on the car that are disap-

pearing as well as some of the double detections.

In order to make sure that we actually achieved

an improvement over simply using the out-of-the-box

Figure 11: Applying all post-processing steps to the out-

put of the LatentSVM 4 INRIA based pedestrian detector

applied on dataset 2.

available pedestrian detection algorithm, we evalu-

ated the number of true positive, false positive and

false negative detections after applying the different

post-processing steps discussed in section 4. The re-

sult of this comparison can be seen in Table 1. Subse-

quently, using precision-recall curves, we visualized

the accuracy gained by applying the mentioned post-

processing steps to the detection results on dataset

2, which can be seen in Figure 12. Notice that an

out-of-the-box object detector already experiences a

large accuracy drop when doing cross-dataset eval-

uation, and that there is a substantial accuracy gain

when applying our post-processing steps. Since the

detection analysis for dataset 2, shown in Figure 11,

proves that the minimum object size found by the

used DPM model is 101 pixels, we ignored possible

ground truth annotations on pedestrians smaller than

Figure 12: Precision-Recall curves generated for dataset 2

with all post-processing steps applied and the reported ac-

curacy using the area under the curve measurement.


Figure 13: Example of applying post-processing steps to data from the second dataset. (top) original detections at score

threshold -1; (bottom left) after pruning; (bottom right) after height-position relationship and score > 0.

100 pixels, to make the precision-recall curve as

accurate as possible. The remaining false negative

detections are mainly due to people sitting on benches,

riding bikes or motorcycles, which are less likely to

get detected by the used pedestrian DPM model and

which is a known issue.
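
For readers who want to relate the counts of Table 1 to precision and recall, the short sketch below applies the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN), to the three rows of the table. The helper name is hypothetical; the counts are copied from Table 1.

```python
def precision_recall(tp, fp, fn):
    """Standard precision and recall from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# (TP, FP, FN) per post-processing step, as listed in Table 1.
TABLE_1 = {
    "DPM orig.": (928, 4159, 349),
    "After pruning": (928, 3182, 349),
    "After height-position": (852, 1015, 384),
}

for step, (tp, fp, fn) in TABLE_1.items():
    p, r = precision_recall(tp, fp, fn)
    print(f"{step:<22} precision={p:.3f} recall={r:.3f}")
```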

We do acknowledge that this solution is far from

100% fail-proof. There are still some bottlenecks that

should be taken into account. The overall approach

is generated to improve the output of any available

pedestrian detector without retraining the actual DPM

model speciﬁc to the application. However, up till

now there is not yet a single of-the-shelf pedestrian

detector which is able to detect every single pedes-

trian out there in any given application, especially

when performing cross-dataset validation (Torralba

et al., 2011). While our approach focuses mainly on

improving the recall rate of our detector, as seen in

Figure 12, getting the precision of pedestrian detec-

tors to 100% in any given application is still a very

challenging task and an actively researched topic.

During this research we made a video visualizing the influence of changing the threshold on the detection certainty, from a very lenient value to a very strict value, on the number of false positive detections produced. This clearly shows the influence of this parameter when searching for an ideal setting for a given application. The video can be found at: https://youtu.be/-xrBg8sDDOQ.

Table 1: Comparison of TP, FP and FN values after each post-processing step for the first dataset. To show a clear benefit of applying these techniques, we ran the original DPM detector at a score threshold of -1, as in the visual results shown in Figure 13.

                         #TP    #FP    #FN
DPM orig.                928   4159    349
After pruning            928   3182    349
After height-position    852   1015    384
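The counts reported in Table 1 can be converted into precision and recall for each post-processing step. The short sketch below does this with the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN); the function and variable names are illustrative only.

```python
def precision(tp, fp):
    """Fraction of detections that are real pedestrians."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of annotated pedestrians that were detected."""
    return tp / (tp + fn)


# (TP, FP, FN) counts copied from Table 1.
table_1 = {
    "DPM orig.": (928, 4159, 349),
    "After pruning": (928, 3182, 349),
    "After height-position": (852, 1015, 384),
}

for step, (tp, fp, fn) in table_1.items():
    print(f"{step:<22} precision={precision(tp, fp):.3f} recall={recall(tp, fn):.3f}")
```

With these counts, precision rises from roughly 0.18 for the original DPM output to roughly 0.46 after the height-position step, while recall drops slightly from roughly 0.73 to roughly 0.69.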

6 CONCLUSIONS AND FUTURE WORK

The goal of this paper is to efficiently blur pedestrians in mobile mapping images to avoid privacy-related issues while preserving as much image information as possible. By using an off-the-shelf pedestrian detector trained on a different dataset and setting a lenient confidence threshold, we showed that applying efficient post-processing filters based on application-specific constraints, e.g. a height-position relation, can greatly improve the detection outcome. In addition to the proposed height-position filtering step, we supply easy-to-train, lightweight Naive Bayes filters for objects that still trigger false positive detections, e.g. roundabout traffic signs, without the need for large amounts of annotated training data.
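To make the height-position idea concrete, the following is a minimal sketch, assuming the relation between the vertical image position of a pedestrian and its expected height can be approximated by a straight line fitted to annotated pedestrians with least squares. The linear model, the relative tolerance, the dictionary keys and all function names are assumptions made for this illustration and are not necessarily the mapping described in section 4; only the final score > 0 condition follows the caption of Figure 13.

```python
def fit_height_position(annotations):
    """Least-squares line height = slope * y + intercept.

    annotations: list of (y_position, height) pairs in pixels, taken from
    manually annotated pedestrians (at least two distinct y positions).
    """
    n = len(annotations)
    mean_y = sum(y for y, _ in annotations) / n
    mean_h = sum(h for _, h in annotations) / n
    var_y = sum((y - mean_y) ** 2 for y, _ in annotations)
    cov = sum((y - mean_y) * (h - mean_h) for y, h in annotations)
    slope = cov / var_y
    intercept = mean_h - slope * mean_y
    return slope, intercept


def filter_detections(detections, model, tolerance=0.25, min_score=0.0):
    """Keep detections whose height is plausible for their position.

    detections: list of dicts with keys "y", "height" and "score".
    tolerance: allowed relative deviation from the expected height
               (hypothetical value, to be tuned per application).
    """
    slope, intercept = model
    kept = []
    for det in detections:
        expected = slope * det["y"] + intercept
        plausible = abs(det["height"] - expected) <= tolerance * expected
        if plausible and det["score"] > min_score:
            kept.append(det)
    return kept


# Toy annotations and detections, not taken from the paper.
model = fit_height_position([(400, 100), (500, 150), (600, 200)])
detections = [
    {"y": 500, "height": 160, "score": 0.4},   # plausible height, kept
    {"y": 450, "height": 300, "score": 1.0},   # far too tall for its position
    {"y": 600, "height": 190, "score": -0.5},  # plausible, but score too low
]
print(filter_detections(detections, model))
```

Because the model is fitted on a small set of scene-specific annotations only, the detector itself does not need to be retrained.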

We show that, in a specific situation, pre-trained pedestrian detection models can be used as-is and that, given a limited amount of manual annotations on a situation-specific dataset, the detection accuracy can be boosted enormously by exploiting scene-specific constraints, e.g. a known ground plane assumption. Finally, we proposed an efficient soft blurring alternative to a standard Gaussian blurring filter for privacy masking, which adaptively changes the parameters of the Gaussian kernel used for the convolution with the found pedestrian detections.
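One possible reading of such an adaptive Gaussian kernel is sketched below: the standard deviation of the kernel grows linearly from zero at the border of a detection box to a maximum at its centre, so the blur fades in gradually instead of showing a hard rectangular edge. This is only an illustrative interpretation; the linear ramp, the value of max_sigma, the grayscale nested-list image format, the assumption that the box lies fully inside the image, and all names are assumptions and not the implementation used in this work.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def soft_blur(image, box, max_sigma=3.0):
    """Blur a detection box with a position-dependent Gaussian.

    image: 2D list of grayscale values; box: (x, y, width, height),
    assumed to lie fully inside the image.
    """
    x0, y0, w, h = box
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    half = max(1.0, (min(w, h) - 1) / 2.0)
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            # Distance in pixels to the nearest box border (0 on the border).
            edge = min(x - x0, x0 + w - 1 - x, y - y0, y0 + h - 1 - y)
            sigma = max_sigma * min(1.0, edge / half)
            if sigma < 0.1:
                continue  # close to the border: keep the original pixel
            radius = max(1, int(math.ceil(2 * sigma)))
            kernel = gaussian_kernel(sigma, radius)
            acc = 0.0
            for dy, ky in zip(range(-radius, radius + 1), kernel):
                yy = min(rows - 1, max(0, y + dy))  # clamp at image border
                for dx, kx in zip(range(-radius, radius + 1), kernel):
                    xx = min(cols - 1, max(0, x + dx))
                    acc += ky * kx * image[yy][xx]
            out[y][x] = acc
    return out


# Toy image: a single bright pixel in the middle of a 9x9 black image.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 100.0
blurred = soft_blur(image, box=(2, 2, 5, 5))
print(blurred[4][4], blurred[3][3], blurred[2][2])
```

The per-pixel kernel makes this sketch slow; it is meant to show the principle, not to be used on full-resolution cycloramic images.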

VISAPP 2016 - International Conference on Computer Vision Theory and Applications


Since the processing of mobile mapping images is done off-line, and time and resource management was not the focus of this research, we did not need to worry about running the detector on every image location, which is computationally quite expensive. One could argue that running the out-of-the-box object detector at multiple scales on every image position is actually a waste of resources and computing time. As future work we suggest integrating our post-processing steps into the actual pedestrian detection algorithm, which would greatly reduce the processing time needed for a single mobile mapping image. This might open up the possibility of doing the processing on-line, while capturing the actual data. This would be preferable for industrial partners, since the privacy-sensitive data would no longer be physically stored, which would solve the privacy issues completely.

Our application focuses on detecting pedestrians walking on the modeled ground plane, which raises a new problem. People standing on a balcony, sitting on a bench, lying on the grass or riding a bike do not fit this ground plane assumption and will thus simply be filtered out by our approach. We could improve our approach by using multiple detection models for these different pedestrian classes and then applying separate post-filtering rules for each detector.

Even with the current limitations, this work is valuable for people handling privacy-sensitive mobile mapping data. It allows users to automatically remove privacy-sensitive data from their captured datasets without manually handling each image, which would be very costly and time-consuming. It allows users to take off-the-shelf pedestrian detectors, add them to the system, and use limited manual input from their application field to derive the post-processing rules. This greatly benefits companies, because they do not need to invest huge amounts of time and resources in building an application-specific pedestrian detector themselves, which would require thousands of manually annotated pedestrians.

ACKNOWLEDGEMENTS

This work is supported by the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT) via the IWT-TETRA project TOBCAT and via the IWT-TETRA project RaPiDo. We would also like to thank Vansteelandt BVBA and Grontmij Belgium, the companies who provided the cycloramic image datasets during these projects, which were used to develop and test this approach.

REFERENCES

Cho, H., Rybski, P. E., Bar-Hillel, A., and Zhang, W. (2012). Real-time pedestrian detection with deformable part models. In IVS, pages 1035–1042. IEEE.

Dalal, N. and Triggs, B. (2005). Histograms of oriented gradients for human detection. In CVPR, volume 1, pages 886–893. IEEE.

De Smedt, F., Struyf, L., Beckers, S., Vennekens, J., De Samblanx, G., and Goedemé, T. (2012). Is the game worth the candle? Evaluation of OpenCL for object detection algorithm optimization. PECCS, pages 284–291.

Dibra, E., Maye, J., Diamanti, O., Siegwart, R., and Beardsley, P. (2015). Extending the performance of human classifiers using a viewpoint specific approach. In WACV, pages 765–772. IEEE.

Dollár, P., Belongie, S., and Perona, P. (2010). The fastest pedestrian detector in the west. In BMVC, volume 2, page 7. Citeseer.

Dollár, P., Tu, Z., Perona, P., and Belongie, S. (2009). Integral channel features. In BMVC, volume 2, page 5.

Felzenszwalb, P., McAllester, D., and Ramanan, D. (2008). A discriminatively trained, multiscale, deformable part model. In CVPR, pages 1–8. IEEE.

Nakashima, Y., Koyama, T., Yokoya, N., and Babaguchi, N. (2015). Facial expression preserving privacy protection using image melding. In ICME, pages 1–6. IEEE.

Panagiotis, I. (2015). Preventing privacy leakage from photos in social networks. In CCS2015. ACM.

Peng, P., Tian, Y., Wang, Y., Li, J., and Huang, T. (2015). Robust multiple cameras pedestrian detection with multi-view bayesian network. Pattern Recognition, 48(5):1760–1772.

Puttemans, S. and Goedemé, T. (2013). How to exploit scene constraints to improve object categorization algorithms for industrial applications. In VISAPP, volume 1, pages 827–830.

Tanaka, Y., Kodate, A., Ichifuji, Y., and Sonehara, N. (2015). Relationship between willingness to share photos and preferred level of photo blurring for privacy protection. In ASE BigData & SocialInformatics, page 33. ACM.

Torralba, A., Efros, A., et al. (2011). Unbiased look at dataset bias. In CVPR, pages 1521–1528. IEEE.

Van Beeck, K., Goedemé, T., and Tuytelaars, T. (2012). A warping window approach to real-time vision-based pedestrian detection in a truck's blind spot zone. In ICINCO, volume 2, pages 561–568.

Viola, P. and Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. In CVPR, pages I–511.
