On the Effectiveness of Dynamic Taint Analysis for Protecting against Private Information Leaks on Android-based Devices∗

Golam Sarwar (Babil)¹,², Olivier Mehani, Roksana Boreli¹,² and Mohamed-Ali Kaafar¹,³

¹ NICTA, Eveleigh, Sydney, NSW, Australia
² UNSW, Kensington, Sydney, NSW, Australia
³ Inria, Grenoble, Rhône-Alpes, France

Keywords:

Dynamic Taint Analysis, Privacy, Malware, Anti-Taint-Analysis, Anti-TaintDroid, Android.

Abstract:

We investigate the limitations of using dynamic taint analysis for tracking privacy-sensitive information on

Android-based mobile devices. Taint tracking keeps track of data as it propagates through variables, inter-

process messages and ﬁles, by tagging them with taint marks. A popular taint-tracking system, TaintDroid,

uses this approach in Android mobile applications to mark private information, such as device identiﬁers or

users' contact details, and subsequently issue warnings when this information is misused (e.g., sent to an undesired third party). We present a collection of attacks on Android-based taint tracking. Specifically, we apply

generic classes of anti-taint methods in a mobile device environment to circumvent this security technique. We

have implemented the presented techniques in an Android application, ScrubDroid. We successfully tested our

app with the TaintDroid implementations for Android OS versions 2.3 to 4.1.1, both using the emulator and

with real devices. Finally, we evaluate the success rate and time to completion of the presented attacks. We

conclude that, although taint tracking may be a valuable tool for software developers, it will not effectively

protect sensitive data from the black-box code of a motivated attacker applying any of the presented anti-taint

tracking methods.

1 INTRODUCTION

Mobile devices have become an integral part of our

daily lives, with hugely increased usage of various

applications and services in addition to their origi-

nal purpose of enabling mobile communications. The

reliance on such devices has also resulted in an in-

creased amount of personal information which is ei-

ther stored locally, or potentially available through

various peripherals such as built-in GPS or camera.

Lists of contacts, personal or work emails, brows-

ing history and other private data can be accessed

by the software running on such devices and for-

warded to external entities. With their ability to eas-

ily access, install and run applications from various

sources, these mobile devices have, perhaps unsur-

prisingly, become a prime target for private data-

collecting applications bundled with, or sometimes

masquerading as, legitimate software (Egele et al.,

2011; Hornyack et al., 2011). Collecting information

from users' mobile devices has actually become a line of business (e.g., 201, 2011). Such data may be used for a number of purposes, ranging from identity theft to profiling and tracking for purposes of targeted advertising (Grace et al., 2012).

∗ This paper is a shortened version of the technical report available at http://www.nicta.com.au/pub?id=7091

The Android mobile operating system includes a

permissions framework whereby, upon installation,

an application has to explicitly request access to spe-

ciﬁc resources from the user. However, it is not un-

common that application developers request access to

a greater number of resources than what is needed

for the application to perform the intended function-

ality (Felt et al., 2011), and users are usually unable

to properly evaluate these requests (Felt et al., 2012).

Moreover, users do not have a choice in regards to

speciﬁc permissions, as an app can only be installed

if the user agrees to all that is requested. Therefore,

additional methods to protect the privacy of users’

data are required. A number of tools to achieve this have been developed in recent years (for example, PDroid and LBE Privacy Guard, available from Google Play). Within the research community, the TaintDroid (Enck et al., 2012) tool has received a lot of attention and a number of extensions have also been proposed and implemented (Hornyack et al., 2011; Russello et al., 2012). This patch for the Android system uses dynamic taint analysis (Newsome and Song, 2005; Schwartz et al., 2010) to track sensitive data as it is used by (untrusted) apps. It “taints” sensitive data, and warns the user when these variables are leaked.

Sarwar G., Mehani O., Boreli R. and Kaafar M.
On the Effectiveness of Dynamic Taint Analysis for Protecting against Private Information Leaks on Android-based Devices.
DOI: 10.5220/0004535104610468
In Proceedings of the 10th International Conference on Security and Cryptography (SECRYPT-2013), pages 461-468
ISBN: 978-989-8565-73-0
© 2013 SCITEPRESS (Science and Technology Publications, Lda.)

Prior work on taint analysis has already identiﬁed

both conceptual and technical limitations (Cavallaro

et al., 2007, 2008; Schwartz et al., 2010), that can be

exploited to avoid detection. Dynamic anti-taint tech-

niques have been classiﬁed by Cavallaro et al. (2008).

In this paper, we investigate the level of protection that dynamic taint tracking delivers to users' sensitive data in the Android environment. We identify the evasive attacks on taint tracking that malicious code can perform to create taint-free variables from tainted objects. To the best of our knowledge, this is the first paper that systematically evaluates the applicability of dynamic anti-taint tracking techniques in the mobile device environment. Our focus here is on dynamic taint analysis; the use of static analysis, which is sometimes suggested as a complementary technique in these contexts (e.g., Graa et al., 2012), is out of the scope of this paper.

Our contributions are as follows. We evaluate the effectiveness of generic anti-taint tracking methods within the Android OS architecture (on versions 2.3 to 4.1.1 of the patched OS), by implementing a series of attacks in a proof-of-concept application, ScrubDroid. Specifically, we evaluate the following classes of attacks: control dependence, which exploits conditional constructs to breach the taint propagation mechanism; subversion of benign code, in which the attacker uses existing code trusted by the host, abusing its functionality to remove taint marks; and side channels, which exploit media that are not considered capable of carrying information (e.g., non-monitored memory). We evaluate experimentally the success rates for all presented attacks. Finally, we characterise the time to complete the attacks for two types of leaked data: the mobile device's International Mobile Station Equipment Identity (IMEI) number and a 5 s audio recording from the mobile device's microphone. We conclude that dynamic taint tracking techniques are not sufficient to provide adequate levels of protection against software that is designed to evade taint tracking.

The organisation of the rest of this paper is as fol-

lows: in Section 2, we review the background and re-

lated work. In Section 3 we introduce our attacker

model and, in the following Section 4, detail our spe-

ciﬁc anti-taint attacks which can be successfully ap-

plied to circumvent taint tracking with TaintDroid.

We provide our experimental evaluation of the at-

tacks, including the success rate and time to complete

in Section 5. In Section 6 we discuss our ﬁndings and

conclude this paper in Section 7.

2 BACKGROUND

2.1 Taint Tracking

Taint analysis was originally proposed as a method to

track the lifetime of data in a program (Chow et al.,

2004). It is an information ﬂow analysis technique

which works by keeping track of variables contain-

ing data with some property by tagging them with

taint marks. The taint tracking system follows all the

marked variables and their derivatives until the end

of their life-cycle. Dynamic taint analysis (Newsome

and Song, 2005) is an extension of the technique to

perform this data tracking in real-time, as the pro-

gram is executed. Taint tracking mechanisms have

been implemented in a number of programming lan-

guages (e.g. Thomas and Hunt, 2001; 201, 2012), as

a way to support the developer’s task of writing valid

code.

More recently, the use of the technique has seen

a renewed interest for malware analysis and detec-

tion. Ho et al. (2006) proposed to track input from

the network to untrusted code running locally, to en-

sure it does not get executed (e.g., commands from

a command and control system). The Panorama system (Yin et al., 2007) flags potentially malicious code by identifying how it uses the sensitive data it captures.

Similar concepts are applied to prevent Android ap-

plications from accessing private data and silently

leaking it to unwanted third-parties, either in real-time

on the device with TaintDroid (Enck et al., 2012), or

even earlier on in the App markets, with AppInspec-

tor (Gilbert et al., 2011).

A noteworthy property of this second class of ap-

proaches is that they have fundamentally different as-

sumptions in regards to trust in the various elements

involved in the system. While in the initial proposals,

taint analysis was a support tool for the developer, in

the context of malware analysis it is actually a tool

to use against the (malware) developer; conversely,

input data, previously untrusted, is now the item to

protect.

2.2 TaintDroid

TaintDroid (Enck et al., 2012) is an implementation

of dynamic taint analysis for the Android platform.

SECRYPT2013-InternationalConferenceonSecurityandCryptography

462

It is implemented as an extension to the Dalvik vir-

tual machine, and can oversee all activity which runs

above it.

TaintDroid uses the concepts of taint sources,

from which sensitive information (e.g., IMEI, text

messages, contacts, GPS data or picture from the

mobile device’s camera) is obtained, and taint sinks,

which are interfaces to the outside world (e.g., us-

ing data networks or sending SMSs) where tainted in-

formation is usually not expected to be sent. When

tainted data reaches a taint sink, TaintDroid issues a

warning to the user. A noteworthy point is that only

system Java Native Interface (JNI) calls to known sys-

tem libraries are allowed, excluding all third-party

ones.

As TaintDroid uses dynamic taint tracking to pro-

tect sensitive user information from untrusted code, it

shares the limitations of dynamic taint analysis (Cav-

allaro et al., 2007, 2008; Schwartz et al., 2010).

Enck et al. (2012) acknowledge that TaintDroid is

vulnerable to control dependence attacks as well as

some side-channel attacks. Nonetheless, user data-

protection solutions like AppFence (Hornyack et al.,

2011) and MOSES (Russello et al., 2012) have been

built based on TaintDroid, with the added functionality of blocking data leaks, rather than just issuing

warnings. Both the generic anti-taint tracking meth-

ods and the speciﬁc attacks we present in Section 4

will also apply to these systems and can be used to

bypass the security they provide.

3 ATTACK MODEL

Our attack model is summarised in Figure 1. The at-

tacker is a developer, who produces an application to

be executed on a third-party system. The goal of the

application is to extract sensitive information from

this system and send it to a collection system they

control. We assume the application is willingly in-

stalled by the user (step 1), and do not consider po-

tential infection vectors. However, we also assume

this user is wary of such applications, and runs them

under a dynamic taint tracking system to ensure none

of the private data is transferred to the network.

Rather than subverting the taint sources (step 2)

or sinks (step 4), our attacker focuses on the taint-

propagation chain (step 3). The attacker's objective is therefore to exploit the limitations we identify in the next section to remove the mark of a tainted variable $X_{\text{Tainted}}$, transforming it into $Y_{\text{Untainted}}$ and silently leaking it to the network.

Next, we present the algorithms of the attacks that we have implemented in our PoC application, discussed in Section 5. While some attacks exploit components which are explicitly not protected by TaintDroid, others rely on the intrinsic (generic) limitations of using dynamic taint tracking for malware analysis.

[Figure 1 diagram; labels: Private Data, Attacks against Taint Analysis, Malicious App, Taint Tracking System, Attacker, Network Access, Networked Database Server]

Figure 1: Our attack model against dynamic taint analysis used for detection of malware leaking sensitive information.

4 ANTI-TAINT-ANALYSIS TECHNIQUES

In this section we introduce the generic classes of attacks against taint-based data leak protection. In the following, we assume that $X_{\text{Tainted}}$ is a single byte; however, the attacks presented are applicable to any type of data.

4.1 Control Dependence

Basic taint propagation is usually limited to direct assignments. Assignments such as $Y \leftarrow f(X_{\text{Tainted}})$ will effectively propagate the taint to $Y$. As acknowledged by many (Newsome and Song, 2005; Enck et al., 2012), this can be defeated with a trivial, if convoluted, construct using the tainted variable $X_{\text{Tainted}}$ in a conditional and assigning a known-untainted value to $Y_{\text{Untainted}}$.

4.1.1 Simple Encoding Attack

Array indexing attacks, where $X_{\text{Tainted}}$ is used to index an array of untainted variables to assign to $Y_{\text{Untainted}}$, can be successfully avoided by propagating the taint of both the array and the index to the assigned variable. However, a taint-free version of the index can be obtained using control-dependent assignment. This is shown in Algorithm 1, where each value of an untainted array (e.g., the table of ASCII characters) is compared with $X_{\text{Tainted}}$ and, when it matches, is assigned to $Y_{\text{Untainted}}$. Since there is no direct assignment nor propagation of data from $X_{\text{Tainted}}$ to $Y_{\text{Untainted}}$, variable $Y_{\text{Untainted}}$ is never tainted.

OntheEffectivenessofDynamicTaintAnalysisforProtectingagainstPrivateInformationLeaksonAndroid-based

Devices

463

Algorithm 1: Simple Encoding Attack.

    for each symbol ∈ AsciiTable do
        if symbol = X_Tainted then
            Y_Untainted ← symbol
        end if
    end for
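To make Algorithm 1 concrete, the following is a minimal Python sketch under a toy taint model. The names `TaintedInt`, `is_tainted` and `simple_encoding_attack` are hypothetical and introduced only for illustration; they are neither TaintDroid APIs nor ScrubDroid code, and the toy model only mimics explicit-flow propagation through addition.

```python
class TaintedInt(int):
    """Toy taint mark (an assumption of this sketch): the result of direct
    arithmetic on a TaintedInt stays tainted, as with explicit-flow tracking."""

    def __add__(self, other):
        return TaintedInt(int(self) + int(other))


def is_tainted(value):
    return isinstance(value, TaintedInt)


def simple_encoding_attack(x_tainted):
    """Rebuild the value of x_tainted from an untainted table (Algorithm 1)."""
    y_untainted = None
    for symbol in range(256):      # untainted table of all byte values
        if symbol == x_tainted:    # the tainted value only steers control flow
            y_untainted = symbol   # the copied value comes from the table
    return y_untainted


secret = TaintedInt(ord("A"))
leaked = simple_encoding_attack(secret)
```

In this model `secret + 1` is still marked, whereas `leaked` holds the same value as `secret` without the mark, because it was only ever assigned from the untainted table.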

4.1.2 Count-to-X Attack

Instead of traversing an array in search of the value related to $X_{\text{Tainted}}$, the count-to-X attack recreates the value one increment at a time, until $Y_{\text{Untainted}}$ matches $X_{\text{Tainted}}$.
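The count-to-X idea can be sketched in a few lines of Python. This is only an illustration under assumptions: `TaintedInt` is a hypothetical marker class standing in for a taint tag, the value is assumed to be a non-negative integer, and the code is not taken from ScrubDroid.

```python
class TaintedInt(int):
    """Toy taint mark (an assumption of this sketch, not a TaintDroid type)."""


def count_to_x_attack(x_tainted):
    """Increment an untainted counter until it equals the tainted value."""
    y_untainted = 0
    while y_untainted != x_tainted:  # comparison only; no data is copied
        y_untainted += 1             # untainted counter plus untainted constant
    return y_untainted


leaked = count_to_x_attack(TaintedInt(42))
```

The counter is built exclusively from untainted constants, so a tracker that follows only assignments and arithmetic has nothing to propagate to it.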

4.1.3 Deliberate Exception Attack

Another way to alter the control flow depending on the value of a tainted variable is by deliberately introducing execution paths which will reliably terminate with an exception. The exception handler can then be used to unconditionally set taint-free variables to values related to the known value of $X_{\text{Tainted}}$ leading to that exception. It can, for example, keep count of how many times it has been called as the representation of $X_{\text{Tainted}}$.
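One possible shape of such an attack is sketched below in Python. It is an illustrative assumption rather than the ScrubDroid implementation: the handler records the candidate that triggered the exception instead of counting calls, and `TaintedInt` is a hypothetical marker class. A real tracker would taint the intermediate difference, but that value is discarded and never reaches the output.

```python
class TaintedInt(int):
    """Toy taint mark (an assumption of this sketch, not a TaintDroid type)."""


def deliberate_exception_attack(x_tainted, max_value=255):
    """Recover x_tainted from the iteration in which an exception is raised."""
    y_untainted = None
    for candidate in range(max_value + 1):
        try:
            # Divides by zero, and therefore raises, only when the candidate
            # equals the tainted value; the quotient itself is thrown away.
            1 // (x_tainted - candidate)
        except ZeroDivisionError:
            # The handler assigns unconditionally: it knows which untainted
            # candidate led to the exception.
            y_untainted = candidate
    return y_untainted


leaked = deliberate_exception_attack(TaintedInt(200))
```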

4.2 Subversion of Benign Code

Rather than writing code to manipulate tainted data

directly, benign code, that is, code trusted by the host,

can be subverted into manipulating and leaking sen-

sitive data. Either data structures or their contents

can be modiﬁed, so that the information intended

for transfer to a legitimate peer is instead leaked to

the attacking third-party. In this class of attacks we leverage unprotected system code to temporarily store $X_{\text{Tainted}}$, and extract it as $Y_{\text{Untainted}}$.

4.2.1 System Command Attack

It is possible to leverage system commands to scrub the mark off the variables. The goal here is to subvert a system utility to print the value of $X_{\text{Tainted}}$ somewhere in its output stream for capture, taint-free, in $Y_{\text{Untainted}}$. The echo system command is the most straightforward, but many other utilities can be used for the same purpose, as long as their output contains the value of their input (or command line arguments). Any shell command that simply produces an error message containing the input is vulnerable. We have analysed the Android Linux binaries present in the /system/bin/ directory of Android Jelly Bean (version 4.1.1) and found more than 40 executables to be vulnerable to this kind of attack. None of these commands requires the Android device to be rooted, nor super-user permission to execute.
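The round trip through an external process can be illustrated with the Python sketch below. It rests on several assumptions: `TaintedStr` is a hypothetical marker class, the 15-digit string is a placeholder rather than a real IMEI, and a child Python interpreter stands in for the echo utility so that the sketch does not depend on a particular shell.

```python
import subprocess
import sys


class TaintedStr(str):
    """Toy taint mark (an assumption of this sketch, not a TaintDroid type)."""


def system_command_attack(x_tainted):
    """Pass the tainted value to a child process and capture its output."""
    # Stand-in for `echo`: the child simply prints its first argument.
    child_code = "import sys; print(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, x_tainted],
        capture_output=True,
        text=True,
        check=True,
    )
    # The returned string is built from the pipe contents, outside the
    # reach of the toy taint mark.
    return result.stdout.strip()


leaked = system_command_attack(TaintedStr("000000000000000"))
```

The value survives the trip through the child process, but the mark does not, since nothing in the external process tracks it.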

4.2.2 System–File Hybrid Attack

The previous attack can be further extended by sep-

arating the write and read steps needed to obtain a

taint-free variable. A ﬁle can be created in some stor-

age area, with the tainted information as its content,

and later be read. If either the read or write step does

not properly propagate taint markings, the resulting

variable is taint-free.

As described by Enck et al. (2012), ﬁle tainting

is implemented in a way similar to variable tainting.

Whenever a tainted variable is written to a ﬁle, that

ﬁle is also marked as tainted. Any subsequent reading

of data from that ﬁle into a new variable will mark

that variable as tainted. Using a system command attack (e.g., cat /path/X_tainted) to read the file back into the malicious application makes it possible to break the taint-propagation chain and produce $Y_{\text{Untainted}}$.

4.3 Side Channels

Side channel attacks are a generic class covering the use of any medium that can be abused to represent information, even if this is not its prime purpose. Such media are often overlooked by taint-checking mechanisms, and not effectively protected. These attacks might be the hardest to protect against as they cover the entire system.

4.3.1 Timing Attack

Timing attacks rely on the speciﬁc side channel cre-

ated by the time it takes to perform some task. They

can be performed from within a program actively try-

ing to leak tainted data by using delay loops with a

variable duration depending on the value of a tainted

variable. They are based on the availability of a system clock readable without tainting. The difference in time readings before and after a waiting period, whose duration is based on the value of a tainted variable, is not itself tainted, and can be assigned to our taint-free output variable.

Depending on the system, a millisecond resolution may be sufficient for accurate results. In our PoC, we observed period inaccuracies of around 3–10 ms, resulting in $Y_{\text{Untainted}} = X_{\text{Tainted}} + \varepsilon$ where $\varepsilon \in [0, 10]$ ms. Using a one-second resolution solved the problem (but obviously made data collection longer). Another option was to repeat the attack until $Y_{\text{Untainted}} = X_{\text{Tainted}}$ before continuing; while this solution worked reliably, its structure made the attack closer to a control dependence one.
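The timing channel and the effect of the clock resolution can be sketched as follows. This is a minimal Python illustration under assumptions: the function name, the 50 ms `unit` and the injectable `clock` and `sleep` parameters are choices made for this sketch (the latter two so that the logic can be exercised without real delays), not details of the PoC.

```python
import time


def timing_attack(x_tainted, unit=0.05, clock=time.monotonic, sleep=time.sleep):
    """Encode x_tainted as a delay and decode it from two clock readings."""
    start = clock()
    sleep(x_tainted * unit)          # delay proportional to the tainted value
    elapsed = clock() - start        # clock readings carry no taint mark
    # Dividing by a unit much larger than the expected jitter and rounding
    # absorbs small inaccuracies in the measured period.
    return round(elapsed / unit)
```

With a unit of 50 ms, a positive error of up to 10 ms, as reported above, still rounds to the intended value, at the cost of a longer transfer time per symbol.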

4.3.2 File Length Attack

While a file could be marked due to its contents, its metadata can be used as an intermediary to evade taint tracking. In Algorithm 2, random data is written, one byte at a time, to a file until its size equals the value of $X_{\text{Tainted}}$. The size can then conveniently be read without resulting in a marked output variable.

Algorithm 2: File Length Attack.

    F ← CreateNewFileHandle()
    z ← 0
    while z < X_Tainted do
        WriteOneByte(F)
        z ← z + 1
    end while
    Y_Untainted ← ReadFileLength(F)

Each symbol in $X_{\text{Tainted}}$ is set to be represented by the length of an arbitrary file. Its total length is then obtained from the system, and results in a taint-free variable containing the desired element, from which the full $Y_{\text{Untainted}}$ can be obtained.
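Algorithm 2 translates almost directly into runnable code. The Python sketch below is an illustration only; the function name and the use of a temporary file are assumptions of this example, not the ScrubDroid implementation.

```python
import os
import tempfile


def file_length_attack(x_tainted):
    """Encode one symbol as the length of a scratch file (Algorithm 2)."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            z = 0
            while z < x_tainted:             # tainted value only bounds the loop
                handle.write(os.urandom(1))  # the content written is random
                z += 1
        # The size is file metadata reported by the operating system.
        return os.path.getsize(path)
    finally:
        os.remove(path)


leaked = file_length_attack(97)
```

Since the bytes written are random, even a tracker that marks the file contents would have no tainted data to attach to the reported length.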

If the system provides a clipboard for applications

to store and exchange temporary data, a very similar

technique can be used: the Clipboard Length At-

tack.

4.3.3 Bitmap Cache Attack

Systems with graphical output usually rely on a cache

of the currently displayed screen. This makes it possible to render the value of $X_{\text{Tainted}}$ on the screen, then access the bitmap cache, and literally read the value from there, for example using OCR techniques.

In our PoC, we used the standard Android API for

widget manipulation in order to output the text in a

graphical widget, then retrieve the cached image of

its rendering. OCR was then performed using off-the-shelf tools, by sending the bitmap data to a cloud service providing OCR over HTTP. It should however be possible to write a simple bitmap parser using the Android Java API without risk of keeping the taint marking, as it is already removed when the bitmap is obtained from the cache.

A more subtle technique involving interface widgets and bitmap rendering consists in changing only one pixel of the image to represent the current value to untaint, then rereading it into a fresh, taint-free, $Y_{\text{Untainted}}$. This is shown in Algorithm 3, which modifies the arbitrarily chosen pixel at coordinates (10, 10).

Algorithm 3: Bitmap Pixel Attack.

    B ← CreateNewBitmap()
    // set the pixel at coordinate (10, 10) with X_Tainted
    SetPixel([10, 10], X_Tainted → B)
    Y_Untainted ← GetPixel(B, [10, 10])
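The effect of Algorithm 3 can be mimicked with a toy bitmap whose pixels live in raw storage. In the Python sketch below, `ToyBitmap` and `TaintedInt` are hypothetical classes introduced for illustration; they are not the Android Bitmap API, and the one-byte-per-pixel layout is a simplifying assumption.

```python
class TaintedInt(int):
    """Toy taint mark (an assumption of this sketch, not a TaintDroid type)."""


class ToyBitmap:
    """Hypothetical one-byte-per-pixel bitmap backed by raw storage."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self._pixels = bytearray(width * height)  # raw bytes, no taint tags

    def set_pixel(self, x, y, value):
        self._pixels[y * self.width + x] = value

    def get_pixel(self, x, y):
        return self._pixels[y * self.width + x]


def bitmap_pixel_attack(x_tainted):
    bitmap = ToyBitmap(32, 32)
    bitmap.set_pixel(10, 10, x_tainted)   # only the byte value is stored
    return bitmap.get_pixel(10, 10)       # read back as a plain integer


leaked = bitmap_pixel_attack(TaintedInt(123))
```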

4.3.4 Text Scaling Attack

This side-channel attack represents a combination of

the last two types: using the properties, rather than

the contents, of graphical elements. The method pre-

sented in Algorithm 4 consists in setting an arbitrary

property of a graphical widget, here the scaling, then

retrieving it through the standard API. Note that the

content of the widget is never changed during this at-

tack.

Algorithm 4: Text Scaling Attack.

    T ← TextViewWidget()
    T ← SetTextScalingValue(X_Tainted)
    Y_Untainted ← GetTextScalingValue(T)

4.3.5 Direct Buffer Attack

Pointer indirection attacks target the low level mem-

ory access features of the system. In this particular

attack, shown in Algorithm 5, we ﬁrst create a mem-

ory buffer. We then write a tainted variable to that

buffer at a specific, known, address. Later, the content at that address is read back using another direct memory access. This is sufficient to obtain a taint-free version of the data.

Algorithm 5: Direct Buffer Attack.

    D ← NewDirectAccessBuffer()
    // write X_Tainted at location 0xXX of buffer D
    DirectMemoryWrite(X_Tainted → D)
    // read from memory location 0xXX of buffer D
    Y_Untainted ← DirectMemoryRead(D, 0xXX)

In ScrubDroid, this attack works due to an im-

plementation limitation of TaintDroid that has been

mentioned by Enck et al. (2012). We include this attack, in line with the classification of Cavallaro et al. (2008), to demonstrate how easy it is to perform this type of indirection attack by manipulating pointers.

In our implementation, we have used Android’s Java

New I/O interface (Google Inc., 2012) to achieve di-

rect memory access. In a more general context, this

attack however remains hard to deﬂect, save for keep-

ing a taint mark for each byte of memory, which we

consider impractical.
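The principle behind Algorithm 5 can be illustrated outside Android with an anonymous memory mapping. The Python sketch below is an analogy under assumptions, not the Java New I/O code used in ScrubDroid: `TaintedInt` is a hypothetical marker class, and the buffer size and offset are arbitrary.

```python
import mmap


class TaintedInt(int):
    """Toy taint mark (an assumption of this sketch, not a TaintDroid type)."""


def direct_buffer_attack(x_tainted, offset=0x10):
    """Write a byte into raw memory and read it back from the same offset."""
    buffer = mmap.mmap(-1, 64)      # anonymous mapping, 64 bytes of raw memory
    try:
        buffer[offset] = x_tainted  # only the byte value reaches the memory
        return buffer[offset]       # a second direct access reads it back
    finally:
        buffer.close()


leaked = direct_buffer_attack(TaintedInt(77))
```

Unless a mark is kept for every byte of the mapping, the value read back cannot be told apart from any other untainted byte.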

We also believe a new class of anti-taint tracking

OntheEffectivenessofDynamicTaintAnalysisforProtectingagainstPrivateInformationLeaksonAndroid-based

Devices

465

methods is to be watched out for, where code execu-

tion is delegated to another component of the system.

With GPUs becoming more powerful at all-purpose

computation, malware could be envisioned that del-

egates removal of taint marks to the graphical unit,

rather than performing this task directly on the CPU.

5 EVALUATION

We have instrumented ScrubDroid, our proof-of-

concept implementation of the attacks presented in

Section 4,

in order to evaluate various aspects of the

attacks that target TaintDroid.

5.1 Methodology

For the evaluation of a speciﬁc attack, the attacker

attempts to obtain tainted data, then performs a series of untainting steps specific to the attack before finally sending it over the network to a collection server. We evaluate two aspects of the attacks:

whether they are successful (including the potential

for false positives and negatives), and the time it takes

for an attacker to leak a certain amount of data. We

consider an attack successful if the data has reached

the server without triggering an alert.

Our experimental framework is as follows. For

each attack, we ﬁrst query non-sensitive (untainted)

information. We then query for speciﬁc sensitive in-

formation, which should be tainted and generate a

warning upon reaching a taint sink; this allows us to

identify false negatives, where our attacks succeed.

The script ﬁnally asks the system for a second non-

sensitive piece of information, through the same at-

tack; if it is tainted due to the previous, sensitive, data

which was passed through the particular method, this

is a false positive. Finally, we evaluate how practical

it is for the attacker to conduct the various proposed

attacks by measuring the time it takes to obtain the

leaked variables.
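The three-query procedure above amounts to a small decision rule, sketched here in Python. The function name and the labels are assumptions of this example; treating a warning on the first non-sensitive query as a false positive as well is our reading, since the text only discusses the second one.

```python
def classify_run(warn_first, warn_sensitive, warn_second):
    """Label one three-query run; each argument is True if TaintDroid warned."""
    labels = []
    if not warn_sensitive:
        labels.append("FN")   # sensitive data reached the sink unnoticed
    if warn_first or warn_second:
        labels.append("FP")   # a warning was raised for non-sensitive data
    return labels
```

For instance, a run in which only the sensitive query raises a warning yields no label, while a run with no warning at all yields a false negative, which is the outcome an attacker is after.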

In the experiments, for sensitive data we use the

mobile device's IMEI number or a 5 s audio recording acquired from the device's internal microphone.

5.2 Experimental Results

We report, in Table 1(a), the results of our experiments evaluating success rates of representative attacks from Section 4 when the attacker is attempting to obtain the IMEI. As a reference, we first tested two naive approaches, which do not try to remove taint marks: sending the variable directly from a taint source to a taint sink (Tainted Variable), and writing it to a file prior to reading it into the taint sink (File R/W); we consider two cases for the latter where we either overwrite the contents of the file with subsequent calls, or append new data (tainted or otherwise).

The code for ScrubDroid is available at http://nicta.info/scrubdroid

We can verify that TaintDroid correctly identiﬁes

the naive approaches, but fails to ﬂag any of our spe-

ciﬁc attacks. We note however that the effective-

ness of the Direct Buffer attack differs in experiments

with the two versions of TaintDroid, the 2012-10-

06 release for Android 4.1.1r6, and a later revision,

17d49f89 in Git. The earlier version is vulnerable

to the attack, while the later Git revision properly

ﬂags the Direct Buffer attack, however at the cost of a

false positive on the subsequent non-sensitive variable

passed in the same way. This behaviour is similar to

the naive File R/W technique where data is appended

to a ﬁle rather than overwritten: once some element of

the system has been identiﬁed as potentially tainted,

all variables transiting through it get tainted too, re-

gardless of their sensitivity. All other attacks behaved

similarly with both versions.

For timing measurements, we report results for both the IMEI, a 15-byte identifier for GSM devices, and 5 s of audio captured from the internal microphone, with an average size of 11 kB (a variable bitrate codec is used). Table 1(b) shows the results for selected attacks (some attacks take a prohibitively long time for the 11 kB of the audio sample and were consequently not run). All measurements have been run multiple times to ensure the standard error was less than 5% of the mean (resulting in 50–200 runs).
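The stopping criterion on the standard error can be written down as a short check. The Python sketch below assumes the usual definition of the standard error of the mean (sample standard deviation divided by the square root of the number of runs); the function names are hypothetical and the measurement script itself is not described in this paper.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean of a list of timing samples."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


def enough_runs(samples, threshold=0.05):
    """True once the standard error is below `threshold` times the mean."""
    if len(samples) < 2:
        return False
    return standard_error(samples) < threshold * statistics.mean(samples)
```

Under this definition, the number of runs needed grows with the square of the ratio between the spread and the mean, so techniques with noisier timings require more repetitions.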

The Simple Encoding attack is clearly the most

efﬁcient way to obtain large amounts of private data

(with a speed of 13.82 kBps for audio) while the Direct Buffer technique would have been the fastest attack for smaller variables (with a fairly constant 3.72 kBps).
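The quoted speeds can be checked against the averages in Table 1(b). The short Python calculation below assumes 1 kB = 1000 B; how the 3.72 kBps figure was obtained is not stated, and averaging the two Direct Buffer throughputs is only one reading that happens to reproduce it.

```python
def throughput_kbps(size_bytes, time_ms):
    """Throughput in kB/s with 1 kB = 1000 B (bytes per millisecond)."""
    return size_bytes / time_ms


# Average sizes and times taken from Table 1(b).
simple_encoding_audio = throughput_kbps(11_000, 795.72)
direct_buffer_imei = throughput_kbps(15, 4.00)
direct_buffer_audio = throughput_kbps(11_000, 2988.70)

# Assumption: the "fairly constant" rate is the mean of the two cases.
direct_buffer_mean = (direct_buffer_imei + direct_buffer_audio) / 2
```

The Direct Buffer throughput is about 3.75 kB/s for the IMEI and 3.68 kB/s for the audio sample, which is consistent with the description of a fairly constant rate.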

Table 1: Experimental results: (a) Success rates and potential for errors. Checks indicate TaintDroid warnings, while “FP” and “FN” identify false positives or negatives. (b) Time to leak information of different sizes using various techniques.

(a) Success rates

| Technique | $Y_{\text{Untainted}}$ | $X_{\text{Tainted}}$ | $Y'_{\text{Untainted}}$ |
|---|---|---|---|
| Tainted Variable | – | ✓ | – |
| File R/W (overwrite) | – | ✓ | – |
| File R/W (append) | – | ✓ | ✓ (FP) |
| Simple Encoding | – | – (FN) | – |
| Count-to-X | – | – (FN) | – |
| Exception-Error | – | – (FN) | – |
| Shell Command | – | – (FN) | – |
| File-Shell Hybrid | – | – (FN) | – |
| Timekeeper | – | – (FN) | – |
| File Length | – | – (FN) | – |
| Clipboard Length | – | – (FN) | – |
| Bitmap Cache | – | – (FN) | – |
| Bitmap Pixel | – | – (FN) | – |
| Text Scaling | – | – (FN) | – |
| Direct Buffer (release) | – | – (FN) | – |
| Direct Buffer (Git) | – | ✓ | ✓ (FP) |

(b) Timing measurements

| Technique | IMEI (15 B): avg. [ms] | IMEI: σ | 5 s audio (11.00 kB, σ = 50.8 B): avg. [ms] | 5 s audio: σ |
|---|---|---|---|---|
| Tainted Variable | 3.48 | 4.07 | 364.97 | 67.31 |
| File R/W | 47.62 | 19.56 | 386.01 | 49.85 |
| Simple Encoding | 9.55 | 4.55 | 795.72 | 49.12 |
| Count-to-X | 10.14 | 5.41 | 8278.64 | 84.20 |
| Exception-Error | 53.22 | 22.09 | — | — |
| Shell Command | 72.22 | 12.69 | — | — |
| File-Shell Hybrid | 78.10 | 25.80 | — | — |
| Timekeeper | 1037.66 | 82.60 | — | — |
| File Length | 72.37 | 21.78 | — | — |
| Clipboard Length | 84.89 | 18.61 | — | — |
| Bitmap Cache | 312.27 | 24.45 | — | — |
| Bitmap Pixel | 35.95 | 12.35 | 2899.80 | 172.56 |
| Text Scaling | 12.92 | 5.91 | 3022.58 | 84.12 |
| Direct Buffer | 4.00 | 3.67 | 2988.70 | 87.69 |

6 POTENTIAL COUNTERMEASURES AND DISCUSSION

Clause et al. (2007) and Kang et al. (2011) have proposed techniques to fight control dependence attacks by over-marking all the variables involved in conditional statements. This, while reducing the number of false negatives, increases the number of false positives, where variables that convey no information about tainted data are marked. Implicit control dependence attacks (or implicit flow attacks, as referred to in Clause et al., 2007; Kang et al., 2011) are more difficult to detect than explicit attacks, as the untainted variable is not actively manipulated in the control path it is relevant to. These can be mitigated by techniques similar to Perl's is_tainted() function, which marks all enclosed variables (201, 2012). This, however, requires that the developer explicitly marks the parts of their code potentially susceptible to such attacks, and is also prone to false positives. Without such developer cooperation, and to the best of our knowledge, there is no mitigation technique for taint evasion using implicit flows. It should also be noted that most of the presented control dependence attacks rely on replacing direct assignment with comparisons between the tainted and untainted variables. Propagating taint on comparison might therefore be an interesting improvement to consider. Finally, although the higher false positive rate may only impact the accuracy of TaintDroid, which merely issues warnings, related systems that actively block data leaks (such as AppFence (Hornyack et al., 2011) or MOSES (Russello et al., 2012)) would see an unacceptable reduction of functionality.

Protection against benign code-subversion attacks is also prone to false positives; moreover, implementing this protection may not even be a viable option. Attacks involving subversion of system utilities

would be effectively blocked by preventing the appli-

cations from using them; once again, the consequence

for many applications would be that they would not

be able to function as designed. Another option, in

the case of TaintDroid, would be to instrument not

only the Dalvik VM, but the entire system for taint-

tracking, so low level utilities are also watched. This,

however, would require a large development effort

with a set of additional challenges yet to be explored

(e.g., patching the system libraries and/or the kernel

itself). Additionally, as noted in Section 4.3.5, effec-

tively preventing pointer indirection attacks would re-

quire being able to mark each memory address, which

is likely impractical.

The side channel attacks can be mitigated by

techniques similar to those used against control

dependence attacks, i.e., by tainting a larger scope of

variables, though with the same consequence of

increasing the number of false positives. The evolution

of TaintDroid’s code provides a good example

of this problem: the Direct Buffer attack was initially

successful, but later additions to the TaintDroid code

rendered it ineffective. Yet, the same additions also

increased the rate of false positives when using Direct

Buffers.
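
The trade-off between coverage and false positives can be illustrated with a toy model. The sketch below is a hypothetical example, not TaintDroid's implementation: the CoarseBuffer class and its methods are assumptions made for illustration. The buffer keeps a single taint flag for the whole container, so one tainted write marks every later read, including reads of data that was never sensitive.

```java
import java.util.ArrayList;
import java.util.List;

public class CoarseTaintSketch {

    // Toy container with one taint flag for all of its contents (assumption).
    static final class CoarseBuffer {
        private final List<String> items = new ArrayList<>();
        private boolean tainted = false;

        // Storing any tainted value taints the whole buffer.
        void put(String value, boolean valueTainted) {
            items.add(value);
            tainted |= valueTainted;
        }

        String get(int index) {
            return items.get(index);
        }

        // Every read reports the container-wide flag, whatever the element.
        boolean isReadTainted(int index) {
            return tainted;
        }
    }

    public static void main(String[] args) {
        CoarseBuffer buffer = new CoarseBuffer();
        buffer.put("weather=sunny", false);   // harmless data
        buffer.put("id=000000000000000", true); // placeholder sensitive data

        // prints true: the sensitive element is correctly reported
        System.out.println(buffer.isReadTainted(1));
        // prints true: the harmless element is reported too (false positive)
        System.out.println(buffer.isReadTainted(0));
    }
}
```

Widening the tainted scope in this way closes the evasion path, at the cost of flagging the harmless element as well.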

We note that most of the presented attacks (save

for the speciﬁc details of the side channel attacks) are

more generally applicable to dynamic taint tracking

systems at large, rather than only to Android-based

systems. On a more generic note, and as already al-

luded to by Kang et al. (2011), a number of issues are

inherent to using taint analysis against the developer

and therefore cannot be easily side-stepped. Consequently,

dynamic taint analysis is unlikely to be effective

in this context when used alone, as any single gap

in the protection is precisely where a malware developer,

aware of such protection, is most likely to attack.

7 CONCLUSIONS

We have argued that dynamic taint tracking is unlikely

OntheEffectivenessofDynamicTaintAnalysisforProtectingagainstPrivateInformationLeaksonAndroid-based

Devices

467

to be effective in detecting privacy leaks in malicious

applications written with the expectation of such close

scrutiny in the context of the Android architecture.

Indeed, a malware developer can use simple programmatic

constructs in the code to remove taint marks

without losing the information.

We have provided the algorithms for a number

of different attacks, and evaluated their performance

on the Android platform with the TaintDroid patch.

Though only a few lines of code each, they were

shown to be sufﬁcient to completely bypass Taint-

Droid, and allow silent leaking of sensitive informa-

tion. While some of the attacks were targeting self-

reported limitations of TaintDroid, which can be cor-

rected by new versions, others have highlighted an

essential problem of using taint analysis against the

developer of the code under study.

REFERENCES

(2011). Understanding Carrier IQ technology. White paper,

Carrier IQ.

(2012). perlsec - Perl security.

Cavallaro, L., Saxena, P., and Sekar, R. (2007). Anti-taint-

analysis: Practical evasion techniques against infor-

mation ﬂow based malware defense. Technical report,

Stony Brook University.

Cavallaro, L., Saxena, P., and Sekar, R. (2008). On the

limits of information flow techniques for malware

analysis and containment. In Detection of Intrusions

and Malware, and Vulnerability Assessment (DIMVA 2008),

chapter 8.

Chow, J., Pfaff, B., Garﬁnkel, T., Christopher, K., and

Rosenblum, M. (2004). Understanding data lifetime

via whole system simulation. In Security 2004.

Clause, J., Li, W., and Orso, A. (2007). Dytan: a generic

dynamic taint analysis framework. In ISSTA 2007.

Egele, M., Kruegel, C., Kirda, E., and Vigna, G. (2011).

PiOS: Detecting privacy leaks in iOS applications. In

NDSS 2011.

Enck, W., Gilbert, P., Chun, B.-G., Cox, L. P., Jung, J., Mc-

Daniel, P., and Sheth, A. N. (2012). TaintDroid: An

information-ﬂow tracking system for realtime privacy

monitoring on smartphones. In OSDI 2010.

Felt, A. P., Chin, E., Hanna, S., Song, D., and Wagner, D.

(2011). Android permissions demystiﬁed. In CCS

2011.

Felt, A. P., Ha, E., Egelman, S., Haney, A., Chin, E., and

Wagner, D. (2012). Android permissions: User atten-

tion, comprehension, and behavior. In SOUPS 2012.

Gilbert, P., Chun, B. G., Cox, L. P., and Jung, J. (2011).

Vision: Automated security validation of mobile apps

at app markets. In MCS 2011.

Google Inc. (2012). Android Java New I/O interface. An-

droid 4.2 r1.

Graa, M., Cuppens-Boulahia, N., Cuppens, F., and Cav-

alli, A. (2012). Detecting control ﬂow in smarphones:

Combining static and dynamic analyses. In CCS 2012.

Grace, M. C., Zhou, W., Jiang, X., and Sadeghi, A.-R.

(2012). Unsafe exposure analysis of mobile in-app

advertisements. In WiSec 2012.

Ho, A., Fetterman, M., Clark, C., Warﬁeld, A., and Hand, S.

(2006). Practical taint-based protection using demand

emulation. In EuroSys 2006.

Hornyack, P., Han, S., Jung, J., Schechter, S., and Wether-

all, D. (2011). “These aren’t the droids you’re looking

for:” retroﬁtting Android to protect data from imperi-

ous applications. In CCS 2011.

Kang, M. G., McCamant, S., Poosankam, P., and Song, D.

(2011). DTA++: Dynamic taint analysis with targeted

control-ﬂow propagation. In NDSS 2011.

Newsome, J. and Song, D. (2005). Dynamic taint analysis

for automatic detection, analysis, and signature gen-

eration of exploits on commodity software. In NDSS

2005.

Russello, G., Conti, M., Crispo, B., and Fernandes, E.

(2012). MOSES: Supporting operation modes on

smartphones. In SACMAT 2012.

Schwartz, E. J., Avgerinos, T., and Brumley, D. (2010). All

you ever wanted to know about dynamic taint analysis

and forward symbolic execution (but might have been

afraid to ask). In SP 2010.

Thomas, D. and Hunt, A. (2001). Locking Ruby in the Safe,

chapter 20.

Yin, H., Song, D., Egele, M., Kruegel, C., and Kirda, E.

(2007). Panorama: Capturing system-wide informa-

tion ﬂow for malware detection and analysis. In CCS

2007.

SECRYPT2013-InternationalConferenceonSecurityandCryptography

468