Identification of Android Malware Families with Model Checking

Pasquale Battista, Francesco Mercaldo, Vittoria Nardone,

Antonella Santone and Corrado Aaron Visaggio

Department of Engineering, University of Sannio, Benevento, Italy

Keywords:

Malware, Android, Security, Formal Methods, Process Algebras.

Abstract:

Android malware is steadily growing in complexity. Current signature-based antimalware mechanisms cannot detect zero-day attacks, and even trivial code transformations may evade detection. Malware writers usually add functionality to existing malware or merge different pieces of malware code: this is why Android malware is grouped into families, where the members of a family share the same malicious behavior. In this paper we present a model checking based approach to detecting Android malware families by analysing and verifying the Java Bytecode that is produced when the source code is compiled. A preliminary investigation has also been conducted to assess the validity of the proposed approach.

1 INTRODUCTION

Malware, like any other software, evolves. Evidence exists that the majority of newly detected malware are tweaked variants of well-known malware (Bailey et al., 2009; Hu et al., 2009; Jang et al., 2011).

As a matter of fact, attackers tend to modify existing malware by adding new behaviors or by merging parts of the code of different existing malware.

Existing malware can be embedded in apparently benign programs (usually popular apps) through repackaging (Zhou and Jiang, 2012): malware authors locate and download popular apps, disassemble them, enclose malicious payloads, re-assemble them, and then submit the new apps to official and/or alternative Android markets. This scenario leads to grouping malware into families, where a family defines a set of behaviors common to all its members. Identifying the family

a malware belongs to is of primary importance as it

helps to discover new malware families (Khoo and

Lio, 2011; Ma et al., 2006), create models of prove-

nance and lineage (Dumitras and Neamtiu, 2011), and

generate phylogeny models (Karim et al., 2005). Rec-

ognizing a malware family is at the basis of a variety

of security tasks, from malware characterization to

threat detection and cyber-attack prevention. In mal-

ware triage (Bailey et al., 2009; Hu et al., 2009; Jang

et al., 2011), lineage can be used by malware analysts

to understand trends over time and make informed decisions about how to dissect the malware samples. This is particularly important since

the order in which the variants of a malware are cap-

tured does not necessarily mirror its evolution. In

software security, lineage can help to ﬁnd vulnerabili-

ties in software when the source code is not available.

For example, if we know that a vulnerability is present

in an earlier version of an application, then it may also

reside in applications that are derived from it.

Although the literature provides several proposals to detect Android malware (Canfora et al., 2013; Arp et al., 2014), the proposed techniques are not able to isolate the payload responsible for the malicious action, and this impedes the recognition of the family.

Moreover, in the mobile malware landscape, malware is becoming more aggressive and hundreds of families are spreading at a very fast pace (Zhou and Jiang, 2012): simple forms of polymorphic attacks (i.e., malware that mutates at each infection) targeting the Android platform have already been seen. An example of polymorphic behaviour is the Opfake family. The authors demonstrated (Canfora et al., 2015) that applying simple code transformations to existing malware that is well recognized by malware detectors turns it into a version that is no longer recognized by most malware detectors.

DroidKungFu is a widespread malware family. Its payload is able to install a backdoor that allows attackers to access the smartphone when they want and use the device as they please. Since DroidKungFu contains root exploits, this family represents one of the most serious threats to mobile users.

http://www.symantec.com/connect/blogs/server-side-polymorphic-android-applications

Battista, P., Mercaldo, F., Nardone, V., Santone, A. and Visaggio, C.
Identification of Android Malware Families with Model Checking.
DOI: 10.5220/0005809205420547
In Proceedings of the 2nd International Conference on Information Systems Security and Privacy (ICISSP 2016), pages 542-547
ISBN: 978-989-758-167-0

Starting from these considerations, there is an urgent need to study new techniques that can effectively recognize the family a malware belongs to.

In this paper we investigate whether model checking can properly detect payloads and resist the common obfuscations used by attackers to generate malware variants belonging to the same family.

Thus, we pose the following research questions:

• RQ1: is our method able to correctly identify the

malware family?

• RQ2: is our method able to correctly identify mor-

phed versions of known malware?

The paper proceeds as follows: comparisons with

related work are made in Section 2. Section 3 is a

review of the basic concepts of formal methods, while

Section 4 describes our methodology. In Section 5

the experimental results we obtained are reported and

discussed; and, ﬁnally, conclusions are drawn in the

last section.

2 RELATED WORK

In this section, consistently with the research questions stated in the introduction, we review the related literature on malware detection, with particular emphasis on studies using formal methods. As our method performs a static analysis, we discuss related works that do not require running the applications, i.e., static ones.

Kinder et al. (2005) introduce the specification language CTPL (Computation Tree Predicate Logic), which extends the well-known logic CTL, and describe an efficient model checking algorithm. They confirm the malicious behavior of thirteen Windows malware variants, using as dataset a set of worms dating from the years 2002-2004.

Song and Touili (2001) present an approach to model Microsoft Windows XP binary programs as a PushDown System (PDS). They evaluate 200 malware variants (generated by the NGVCK and VCL32 engines) and 8 benign programs.

The tool PoMMaDe (Song and Touili, 2013) is able to detect 600 real malware samples and 200 malware samples generated by two malware generators (NGVCK and VCL32), and proves the reliability of benign programs: a Microsoft Windows binary program is modeled as a PDS, which makes it possible to track the stack of the program.

Song and Touili (2014) model mobile applications using a PDS in order to discover private data leaks. They identify information leaks by working at the Smali code level.

https://www.csc.ncsu.edu/faculty/jiang/DroidKungFu3/

Jacob and colleagues (Jacob et al., 2010) pro-

vide a basis for a malware model, founded on the

Join-Calculus: the process-based model supports the

fundamental notion of self-replication but also inter-

actions, concurrency and non-termination to cover

evolved malware. They consider the system call se-

quences to build the model.

As emerges from this discussion, and to the best of the authors' knowledge, the payload identification in the Android environment proposed in this paper has never been used in any of the works on mobile malware detection in the literature.

3 PRELIMINARIES ON FORMAL METHODS

In this section we introduce the basic concepts of for-

mal methods. For applying formal methods, we need:

1. A Precise Notation for Defining Systems: Specification is the process of describing a system. We assume that the system behaviour is represented as an automaton. It basically consists of a set of nodes together with a set of labelled edges between these nodes. A node represents a system state, while a labelled edge represents a transition from one system state to the next. That is, if the automaton contains an edge $s \xrightarrow{a} s'$, then the system can evolve from state $s$ into state $s'$ by the execution of action $a$. One state is

selected to be the root state (initial state). However,

for the purpose of mathematical reasoning it is often

convenient to represent the automaton algebraically in

the form of processes. For this aim, we use Milner's Calculus of Communicating Systems (CCS) (Milner, 1989), one of the most well-known process algebras. CCS contains basic operators to build finite processes, communication operators to express concurrency, and some notion of recursion to capture infinite behaviour. The syntax of processes is the following:

$p ::= nil \mid \alpha.p \mid p + p \mid p|p \mid p \backslash L \mid p[f] \mid x$

where $\alpha$ ranges over a finite set of actions $\mathcal{A} = \{\tau, a, \bar{a}, b, \bar{b}, \ldots\}$. Input actions are labeled with "non-barred" names, e.g., $a$, while output actions are "barred", e.g., $\bar{a}$. The action $\tau \in \mathcal{A}$ is called the internal action. The set $L$ ranges over sets of visible actions ($\mathcal{A} - \{\tau\}$), $f$ ranges over functions from actions to actions, while $x$ ranges over a set of constant names: each constant $x$ is defined by a constant definition $x \stackrel{def}{=} p$.

We give the semantics for CCS by induction over

the structure of processes.


• The process nil can perform no actions.

• The process α.p can perform the action α and

thereby become the process p.

• The process p + q can behave either as p or as q.

• The operator | expresses parallel composition: if the process $p$ can perform $\alpha$ and become $p'$, then $p|q$ can perform $\alpha$ and become $p'|q$, and similarly for $q$. Furthermore, if $p$ can perform a visible action $l$ and become $p'$, and $q$ can perform $\bar{l}$ and become $q'$, then $p|q$ can perform $\tau$ and become $p'|q'$.

• The operator \ expresses the restriction of actions. If $p$ can perform $\alpha$ and become $p'$, then $p \backslash L$ can perform $\alpha$ to become $p' \backslash L$ only if $\alpha, \bar{\alpha} \notin L$.

• The operator $[f]$ expresses the relabeling of actions. If $p$ can perform $\alpha$ and become $p'$, then $p[f]$ can perform $f(\alpha)$ and become $p'[f]$.

• Each relabeling function $f$ has the property that $f(\tau) = \tau$.

• A constant $x$ behaves as $p$ if $x \stackrel{def}{=} p$.

The operational semantics of a process p is a la-

belled transition system, i.e., an automaton whose

states correspond to processes (the initial state cor-

responds to p) and whose transitions are labelled by

actions in A.
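The operational rules listed above can be read as a successor function from a process to the set of transitions it can perform. The following Python fragment is only an illustrative sketch under stated assumptions, not the tool used in this paper: the class names, the `~a` spelling for barred (output) actions, the string `tau` for the internal action, and the omission of the relabelling operator are all choices made here for brevity. Constants are assumed to be guarded by a prefix, otherwise the unfolding would not terminate.

```python
from dataclasses import dataclass

# Process terms for a fragment of CCS (relabelling p[f] is omitted in this sketch).
@dataclass(frozen=True)
class Nil:
    pass

@dataclass(frozen=True)
class Prefix:
    action: str
    cont: object

@dataclass(frozen=True)
class Sum:
    left: object
    right: object

@dataclass(frozen=True)
class Par:
    left: object
    right: object

@dataclass(frozen=True)
class Restrict:
    proc: object
    labels: frozenset  # the set L of restricted visible names

@dataclass(frozen=True)
class Const:
    name: str

TAU = "tau"

def bar(a):
    """Complement of a visible action; output actions are spelled with a leading '~'."""
    return a[1:] if a.startswith("~") else "~" + a

def steps(p, defs):
    """Return the set of (action, successor) pairs that process p can perform."""
    if isinstance(p, Nil):
        return set()
    if isinstance(p, Prefix):
        return {(p.action, p.cont)}
    if isinstance(p, Sum):
        return steps(p.left, defs) | steps(p.right, defs)
    if isinstance(p, Par):
        left, right = steps(p.left, defs), steps(p.right, defs)
        out = {(a, Par(l2, p.right)) for a, l2 in left}
        out |= {(a, Par(p.left, r2)) for a, r2 in right}
        # complementary visible actions synchronise into the internal action
        out |= {(TAU, Par(l2, r2)) for a, l2 in left for b, r2 in right
                if a != TAU and b == bar(a)}
        return out
    if isinstance(p, Restrict):
        return {(a, Restrict(q, p.labels)) for a, q in steps(p.proc, defs)
                if a == TAU or (a not in p.labels and bar(a) not in p.labels)}
    if isinstance(p, Const):
        return steps(defs[p.name], defs)
    raise TypeError(f"not a process term: {p!r}")

# x = a.x in parallel with ~a.nil, with a restricted: only the synchronisation survives.
defs = {"x": Prefix("a", Const("x"))}
system = Restrict(Par(Const("x"), Prefix("~a", Nil())), frozenset({"a"}))
print(sorted(a for a, _ in steps(system, defs)))  # ['tau']
```

Applying `steps` repeatedly from the initial process enumerates the states and labelled edges of the transition system described in the paragraph above.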

2. A Precise Notation for Defining Properties: This need can be met using a temporal logic. Temporal logics provide constructs that make it possible to state formally that, for instance, all scenarios will respect some property at every step, or that some particular event will eventually happen, and so on. A model checker then accepts two inputs, a system described, for example, in process-algebraic notation and a temporal formula, and returns "true" if the system satisfies the formula and "false" otherwise. In this paper we use the selective mu-calculus logic (Barbuti et al., 1999; Santone and Vaglini, ). It was defined

with the goal of reducing the number of states of the

transition systems in such a way that the reduction is

driven by the formulae to be checked, and in partic-

ular by the syntactic structure of the formulae. The

selective mu-calculus is a variant of the mu-calculus

(Stirling, 1989), and differs from it in the deﬁnition of

the modal operators. The syntax of the selective mu-

calculus is the following, where K and R range over

sets of actions, while Z ranges over a set of variables:

$\phi ::= tt \mid ff \mid Z \mid \phi \vee \phi \mid \phi \wedge \phi \mid [K]_R\,\phi \mid \langle K \rangle_R\,\phi \mid \nu Z.\phi \mid \mu Z.\phi$

The satisfaction of a formula $\phi$ by a state $s$ of a transition system is defined as follows:

• each state satisfies $tt$ and no state satisfies $ff$;

• a state satisfies $\phi_1 \vee \phi_2$ ($\phi_1 \wedge \phi_2$) if it satisfies $\phi_1$ or (and) $\phi_2$;

• $[K]_R\,\phi$ and $\langle K \rangle_R\,\phi$ are the selective modal operators. $[K]_R\,\phi$ is satisfied by a state which, for every performance of a sequence of actions not belonging to $R \cup K$, followed by an action in $K$, evolves into a state obeying $\phi$. $\langle K \rangle_R\,\phi$ is satisfied by a state which can evolve to a state obeying $\phi$ by performing a sequence of actions not belonging to $R \cup K$ followed by an action in $K$.
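The two selective modal operators can be evaluated on a finite transition system with a simple graph search. The sketch below is an assumption-laden illustration, not the algorithm implemented in the CWB-NC: the dictionary encoding of the transition system, the helper name `_reach`, and the representation of formulae as Python predicates are all hypothetical, and the fixed point operators are left out.

```python
def _reach(lts, state, K, R):
    """Yield every state reached by a run of actions outside R ∪ K followed by one action in K."""
    seen, stack = {state}, [state]
    while stack:
        s = stack.pop()
        for action, t in lts.get(s, ()):
            if action in K:
                yield t                      # the run ends with an action in K
            elif action not in R and t not in seen:
                seen.add(t)                  # action outside R ∪ K: keep walking
                stack.append(t)
            # actions in R (and not in K) block the run

def diamond(lts, state, K, R, phi):
    """<K>_R phi: at least one such run ends in a state satisfying phi."""
    return any(phi(t) for t in _reach(lts, state, K, R))

def box(lts, state, K, R, phi):
    """[K]_R phi: every such run ends in a state satisfying phi."""
    return all(phi(t) for t in _reach(lts, state, K, R))

tt = lambda s: True
ff = lambda s: False

# A linear transition system: 0 -load-> 1 -push-> 2 -store-> 3 -invoke-> 4
lts = {0: [("load", 1)], 1: [("push", 2)], 2: [("store", 3)], 3: [("invoke", 4)], 4: []}

# <push>_{} <invoke>_{} tt holds in state 0: the intermediate load and store are skipped.
print(diamond(lts, 0, {"push"}, set(), lambda s: diamond(lts, s, {"invoke"}, set(), tt)))  # True

# With R = {store} on the inner operator, the store action blocks the run.
print(diamond(lts, 0, {"push"}, set(), lambda s: diamond(lts, s, {"invoke"}, {"store"}, tt)))  # False
```

The second call shows the role of the set R: actions in R may not be skipped, so a smaller R makes a formula tolerant of more interleaved actions.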

One of the most popular environments for verifying concurrent systems is the Concurrency Workbench of the New Century (CWB-NC) (Cleaveland and Sims, 1996), which supports several different specification languages, including CCS. In the CWB-NC the verification of temporal logic formulae is based on model checking (Clarke et al., 2001).

4 THE METHODOLOGY

In this section we present our methodology for the

detection of Android malware families using model

checking. It is based on two main steps:

Step 1: Java Bytecode-to-CCS Transform Operator

The first step generates a CCS specification from the Java Bytecode of the .class files derived from the analysed apps. This is obtained by defining a Java Bytecode-to-CCS transform operator $T$. The function $T$ applies directly to the Java Bytecode and translates it into CCS process specifications. The function $T$ is defined for each instruction of the Java Bytecode.

In the following, a Java Bytecode program $P$ is a sequence $c$ of instructions, numbered starting from address 0; $\forall i \in \{0, \ldots, \sharp c\}$, $c[i]$ is the instruction at address $i$, where $\sharp c$ denotes the length of $c$. All Java Bytecode instructions have been translated into CCS; below we show only a few, just to give the reader the flavor of the approach followed.

Instruction: $c[i] = goto\ j$

$T(i) = x_i \stackrel{def}{=} gotoj.x_j$

The instruction $c[i] = goto\ j$ is translated into a CCS process $x_i$ that performs the action $gotoj$ and then jumps to the instruction $j$, corresponding to the CCS process $x_j$.

Instruction: $c[i] = tstore\ x$

$T(i) = x_i \stackrel{def}{=} store.x_{i+1}$


Each $tstore\ x$ instruction is translated, regardless of the type $t$ and of the name of the variable $x$, as $store$ followed by the constant process $x_{i+1}$ representing the CCS translation of the successive instruction.
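The two translation rules shown above can be sketched as a function from a list of instructions to CCS constant definitions. This is a minimal illustration under stated assumptions rather than the operator implemented in the prototype: the function name `translate`, the textual mnemonic input, the `x0, x1, ...` naming of constants, and the restriction to `goto` and the local-variable store instructions are all hypothetical choices. Instruction indices are used as addresses, as in the text.

```python
import re

# Local-variable stores of the JVM instruction set: istore, lstore, fstore,
# dstore, astore and their short forms such as istore_0 .. istore_3.
STORE = re.compile(r"[ilfda]store(_[0-3])?")

def translate(code):
    """Map instruction i to the definition of the CCS constant x_i (hypothetical naming)."""
    defs = {}
    for i, instr in enumerate(code):
        op, *args = instr.split()
        if op == "goto":
            j = int(args[0])
            # perform the goto action, then continue as the process of instruction j
            defs[f"x{i}"] = f"goto{j}.x{j}"
        elif STORE.fullmatch(op):
            # type and variable name are abstracted away: only 'store' remains
            defs[f"x{i}"] = f"store.x{i + 1}"
        else:
            raise NotImplementedError(f"no rule sketched for {op!r}")
    return defs

program = ["istore 1", "astore_2", "goto 0"]
for name, body in translate(program).items():
    print(f"{name} = {body}")
# x0 = store.x1
# x1 = store.x2
# x2 = goto0.x0
```

Because the store rule discards the type and the variable name, renaming identifiers in the source leaves the generated process unchanged, which is the kind of abstraction the translation relies on.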

Step 2: Expressing Android Malware Families into Temporal Logic

The second step aims at detecting Android malware families, whose characteristic behaviours are expressed in temporal logic. The CCS processes obtained in the first step are used to prove properties: using model checking, we determine whether an app belongs to a malware family. Code described as CCS processes is first mapped to labelled transition systems, which are then verified with the CWB-NC. Different properties have been defined to characterize the behaviour of the families.

Table 1 lists the malicious behaviours of the analysed families and the resulting translation into logic rules. The logic rules model the malicious behavior so that it can be searched for in the model.

The distinctive features of this methodology are:

(i) the use of formal methods; (ii) the detection on

Java Bytecode and not on the source code; (iii) the

detection of malicious payloads; (iv) the use of static

analysis; (v) the capture of malicious payloads at a

ﬁner granularity.

In practice, from the Java Bytecode we derive CCS processes, which are subsequently used for checking properties expressing the major characteristics of a malware family. Moreover, our methodology exploits the Bytecode representation of the analysed apps. Performing Android malware family detection on the Bytecode and not directly on the source code has several advantages: (i) independence of the source programming language; (ii) detection of malware families without decompilation, even when the source code is unavailable; (iii) ease of parsing a lower-level code; (iv) independence from obfuscation.

5 RESULTS AND DISCUSSION

The malware samples used in the evaluation were collected from the Drebin project (Arp et al., 2014; Spreitzenbarth et al., 2013). Each malware sample is labelled according to its malware family: each family contains samples that share the same payload.

In the following preliminary study we consider the DroidKungFu and the Opfake families, with 100 samples for each family. Furthermore, we developed a framework able to inject several levels of obfuscation into Android applications: (i) changing the package name; (ii) identifier renaming; (iii) data encoding; (iv) call indirections; (v) code reordering; (vi) junk code insertion. The reader can refer to (Canfora et al., 2015) for further details. We produced a morphed version of each of the 200 applications, so the full dataset is composed of 400 different applications. To highlight the effectiveness of the proposed solution, we submitted the dataset to the top 5 ranked mobile antimalware products from AV-TEST, an independent IT security institute.

Table 2 shows the results obtained with the DroidKungFu and Opfake families and with their morphed versions (DroidKungFuMorph and OpfakeMorph).

We count as identified only the samples assigned to the right family (column ident. in Table 2). Samples detected as malicious but not assigned to the right family, and samples not recognized as malware at all, are both reported in column unident. of Table 2. In line with our research questions, identifying the malicious payload should be a further research direction in malware analysis. Due to the novelty of the problem, antimalware products are not yet specialized in family identification, and some of them are therefore unable to identify families, as the unident. columns of Table 2 show.

Another problem is that current antimalware products are not able to detect malware when the signature mutates: their performance decreases dramatically with morphed samples. To address the above problems, we introduce our methodology and discuss preliminary results.

5.1 Empirical Evaluation Procedure

To estimate the detection performance of our methodology we compute the metrics of Precision (PR), Recall (RC), F-measure (Fm) and Accuracy (Acc), defined as follows:

$$PR = \frac{TP}{TP + FP}; \quad RC = \frac{TP}{TP + FN};$$

$$Fm = \frac{2\,PR\,RC}{PR + RC}; \quad Acc = \frac{TP + TN}{TP + FN + FP + TN}$$

where $TP$ is the number of malware samples that are correctly identified as belonging to the family (True Positives), $TN$ is the number of malware samples correctly identified as not belonging to the family (True Negatives), $FP$ is the number of malware samples that are incorrectly identified as belonging to the family (False Positives), and $FN$ is the number of malware samples that were not identified as belonging to their family (False Negatives).
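As a worked check of these definitions, the short Python sketch below recomputes the four metrics from the TP, FP, FN and TN counts reported in Table 3. The function name `metrics` and the dictionary layout are illustrative assumptions; the counts and the reported values are copied from Table 3. The recomputed values agree with the reported ones to within 0.01, the small differences being consistent with the table mixing truncation and rounding to two decimals.

```python
def metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure and accuracy from the four confusion counts."""
    pr = tp / (tp + fp)
    rc = tp / (tp + fn)
    fm = 2 * pr * rc / (pr + rc)
    acc = (tp + tn) / (tp + fn + fp + tn)
    return pr, rc, fm, acc

# family -> ((TP, FP, FN, TN), reported (PR, RC, Fm, Acc)), both taken from Table 3
TABLE3 = {
    "DroidKungFu":      ((87, 6, 13, 94), (0.93, 0.87, 0.90, 0.91)),
    "Opfake":           ((73, 8, 27, 92), (0.90, 0.73, 0.80, 0.83)),
    "DroidKungFuMorph": ((89, 1, 11, 99), (0.98, 0.89, 0.93, 0.94)),
    "OpfakeMorph":      ((73, 8, 27, 92), (0.90, 0.73, 0.80, 0.83)),
}

for family, (counts, reported) in TABLE3.items():
    computed = metrics(*counts)
    # every recomputed value lies within 0.01 of the value printed in the table
    assert all(abs(c - r) <= 0.01 + 1e-9 for c, r in zip(computed, reported))
    print(family, " ".join(f"{c:.4f}" for c in computed))
```

For DroidKungFu, for instance, PR = 87 / (87 + 6) ≈ 0.9355 and Acc = (87 + 94) / 200 = 0.905.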

https://www.av-test.org/en/antivirus/mobile-devices/


Table 1: Families Description and Corresponding Logic Rules.

DroidKungFu
  Description: device rooting; IMEI; OS type; device ID; network type; C&C server
  Rule (selective mu-calculus formulae):
    ϕ = ϕ ∨ ϕ
    where:
    = ⟨pushphone⟩ ⟨invokegetSystemService⟩ ⟨checkcastandroidtelephonyTelephonyManager⟩ ⟨invokegetDeviceId⟩
    = ⟨pushIMEI⟩ ⟨load⟩ ⟨invokeinit⟩ ⟨invokeadd⟩
    = ⟨pushchmod⟩ ⟨invokeinit⟩ ⟨store⟩ ⟨load⟩

Opfake
  Description: SMS sending; SMS monitoring; download file; phonebook
  Rule (selective mu-calculus formulae):
    ψ = ψ ∨ ψ
    where:
    = ⟨load⟩ ⟨invokesendTextMessage⟩
    = ⟨push⟩ ⟨anewarray⟩ ⟨invokegetMethod⟩
    = ⟨pushsendTextMessage⟩ ⟨load⟩ ⟨invokegetMethod⟩

Table 2: Antimalware Evaluation for DroidKungFu, Opfake, DroidKungFuMorph and OpfakeMorph Families.

AntiMalware   DroidKungFu        Opfake             DroidKungFuMorph   OpfakeMorph
              ident.  unident.   ident.  unident.   ident.  unident.   ident.  unident.
AhnLab        2       98         66      34         0       100        44      56
Alibaba       0       100        0       100        0       100        0       100
Antiy         93      7          96      4          38      62         49      51
Avast         89      11         0       100        24      76         0       100
AVG           4       96         0       100        0       100        0       100
Our Method    87      13         73      27         89      11         73      27

Table 3: Preliminary Performance Evaluation.

Family             TP   FP   FN   TN   PR     RC     Fm     Acc
DroidKungFu        87   6    13   94   0.93   0.87   0.90   0.91
Opfake             73   8    27   92   0.90   0.73   0.80   0.83
DroidKungFuMorph   89   1    11   99   0.98   0.89   0.93   0.94
OpfakeMorph        73   8    27   92   0.90   0.73   0.80   0.83

5.2 Preliminary Evaluation

We have implemented a prototype tool and conducted experiments as a proof of concept of our methodology. Table 3 shows the results obtained using our prototype tool: we obtain an accuracy ranging from 0.83 to 0.94.

RQ1 response: The results in Table 2 show that our method is promising for identifying malware payloads. On non-morphed malware, its performance is broadly in line with that of the top 5 mobile antimalware products. On morphed samples, instead, the gap between our approach and signature-based detection widens in our favour.

RQ2 response: As shown in Table 2, we outperform the top 5 current signature-based products in identifying morphed samples. On non-morphed samples, instead, Antiy achieves better results.

It should be underlined that the method we propose is robust: Table 3 shows that the Accuracy and F-measure values are not degraded by code obfuscation. Accuracy and F-measure are higher on the morphed DroidKungFu samples than on the non-morphed ones (0.94 and 0.93 versus 0.91 and 0.90), while they are identical for the Opfake and OpfakeMorph samples: this is why we consider our method transparent with respect to obfuscation, unlike the antimalware products, whose identification rates decrease dramatically on morphed samples.


6 CONCLUDING REMARKS AND FUTURE WORK

Since previous works in mobile malware detection focus on discriminating a malicious application from a trusted one, in this paper we propose an approach to localize the malicious behaviour at a finer grain, i.e., at payload level.

We use model checking to test our model against two of the most widespread malware families in the Android environment: the DroidKungFu and the Opfake families. In addition, we test the robustness of our solution by generating morphed malware and checking it against the model. Results seem to be promising: we identify malicious payloads with an accuracy between 0.83 and 0.94 and in a reasonable time. This suggests that our methodology is efficient and scalable.

As future work we are going to extend our prelimi-

nary evaluation to other widespread families. Further-

more, we plan to track the phylogenesis of malware to

characterize the payload family tree and to foresee the

possible payload evolution.

ACKNOWLEDGEMENTS

The Authors thank Domenico Martino for helping in

the implementation of the prototype tool used to test

the methodology.

REFERENCES

Arp, D., Spreitzenbarth, M., Huebner, M., Gascon, H., and

Rieck, K. (2014). Drebin: Efﬁcient and explainable

detection of android malware in your pocket. In Pro-

ceedings of 21th Annual Network and Distributed Sys-

tem Security Symposium (NDSS). IEEE.

Bailey, U., Comparetti, P., Hlauschek, C., Kruegel, C., and

Kirda, E. (2009). Scalable, behavior-based malware

clustering. In Network and Distributed System Secu-

rity Symposium. IEEE.

Barbuti, R., Francesco, N. D., Santone, A., and Vaglini,

G. (1999). Selective mu-calculus and formula-based

equivalence of transition systems. Elsevier.

Canfora, G., Di Sorbo, A., Mercaldo, F., and Visaggio,

C. (2015). Obfuscation techniques against signature-

based detection: a case study. In Proceedings of Work-

shop on Mobile System Technologies. IEEE.

Canfora, G., Mercaldo, F., and Visaggio, C. A. (2013). A

classiﬁer of malicious android applications. In Pro-

ceedings of the 2nd International Workshop on Secu-

rity of Mobile Applications, in conjunction with the In-

ternational Conference on Availability, Reliability and

Security. IEEE.

Clarke, E. M., Grumberg, O., and Peled, D. (2001). Model

checking. MIT Press.

Cleaveland, R. and Sims, S. (1996). The ncsu concurrency

workbench. In Alur, R. and Henzinger, T. A., editors,

CAV, volume 1102 of Lecture Notes in Computer Sci-

ence. Springer.

Dumitras, T. and Neamtiu, I. (2011). Experimental chal-

lenges in cyber security: A story of provenance and

lineage for malware. ACM.

Hu, X., Chiueh, T., Shin, K., Kruegel, C., and Kirda, E.

(2009). Large-scale malware indexing using function

call graphs. In ACM Conference on Computer and

Communications Security. ACM.

Jacob, G., Filiol, E., and Debar, H. (2010). Formalization of

viruses and malware through process algebras. In In-

ternational Conference on Availability, Reliability and

Security (ARES 2010). IEEE.

Jang, J., Brumley, D., and Venkataraman, S. (2011). Bit-

shred: feature hashing malware for scalable triage and

semantic analysis. In ACM Conference on Computer

and Communications Security. ACM.

Karim, M. E., Walenstein, A., Lakhotia, A., and Parida, L.

(2005). Malware phylogeny generation using permu-

tations of code. Springer.

Khoo, W. and Lio, P. (2011). Unity in diversity:

Phylogenetic-inspired techniques for reverse engi-

neering and detection of malware families. In SysSec

Workshop. Springer.

Kinder, J., Katzenbeisser, S., Schallhart, C., and Veith, H.

(2005). Detecting malicious code by model checking.

Springer.

Ma, J., Dunagan, J., Wang, H. J., Savage, S., and Voelker,

G. M. (2006). Finding diversity in remote code in-

jection exploits. In Proceedings of the 6th ACM SIG-

COMM conference on Internet measurement. ACM.

Milner, R. (1989). Communication and concurrency. PHI

Series in computer science. Prentice Hall.

Santone, A. and Vaglini, G. Abstract reduction in directed

model checking CCS processes. Springer.

Song, F. and Touili, T. (2001). Efﬁcient malware detection

using model-checking. Springer.

Song, F. and Touili, T. (2013). Pommade: Pushdown

model-checking for malware detection. In Proceed-

ings of the 2013 9th Joint Meeting on Foundations of

Software Engineering. ACM.

Song, F. and Touili, T. (2014). Model-checking for android

malware detection. Springer.

Spreitzenbarth, M., Echtler, F., Schreck, T., Freling, F. C.,

and Hoffmann, J. (2013). Mobilesandbox: Looking

deeper into android applications. In 28th International

ACM Symposium on Applied Computing (SAC). ACM.

Stirling, C. (1989). An introduction to modal and temporal

logics for ccs. In Yonezawa, A. and Ito, T., editors,

Concurrency: Theory, Language, And Architecture,

LNCS, pages 2–20. Springer.

Zhou, Y. and Jiang, X. (2012). Dissecting android mal-

ware: Characterization and evolution. In Proceed-

ings of 33rd IEEE Symposium on Security and Privacy

(Oakland 2012). IEEE.
