Analyzing a Concurrent Self-Modifying Program: Application to Malware Detection*

Walid Messahel and Tayssir Touili

LIPN, CNRS and University Sorbonne Paris Nord, France

IRIF, CNRS and University Paris Cité, France

Keywords:

Model Checking, Self-Modifying Code, Pushdown Systems, Malware Detection.

Abstract:

We tackle the problem of analyzing multi-threaded parallel programs that contain self-modifying code, i.e., code that can rewrite itself during execution. This kind of code is commonly used to hide malicious portions of a program so that they cannot be detected by anti-virus software. In (Messahel and Touili, 2024), we introduced a new model called Self Modifying Dynamic Pushdown Network (SM-DPN) to model such programs. An SM-DPN is a network of Self-Modifying Pushdown Systems, i.e., Pushdown Systems that can modify their instructions on the fly during execution. We proposed an algorithm to perform the backward reachability analysis of SM-DPNs. However, no concrete example was provided in (Messahel and Touili, 2024). In this paper, we go one step further: we consider a case study and show concretely how this approach and this model can be applied to represent and analyse a multi-threaded self-modifying program infected with malware.

1 INTRODUCTION

Self-modifying code is a programming technique that allows computer programs to change their behavior during execution, without any external intervention. This technique is used for different purposes: some software engineers use it to protect their products from being reverse engineered (code obfuscation), while others use it to evade detection by anti-malware systems.

Concurrent programming, on the other hand, is a software engineering technique that allows multiple tasks to be performed simultaneously in order to make better use of hardware resources and to improve efficiency and performance. Analyzing concurrent programs can be challenging due to the complex interactions and inter-dependencies between concurrent threads and processes.

In this work, we consider the analysis problem of

programs that present these two sources of difﬁcul-

ties: (1) concurrency and thread creation, and (2) self-

modifying code. Analyzing this kind of code is chal-

lenging due to the complex features that it involves.

Multiple techniques have been introduced to analyse this kind of program, such as:

1. Reverse engineering: This technique involves analyzing the binary code and manually understanding and anticipating its behavior using disassembling and decompiling tools. This approach can deliver certain results, but it is hard and time consuming.

2. Emulation techniques, which simulate the program's execution in a virtual environment using emulators. This can be used to observe the code's behaviour and detect malfunctioning.

* This work was partially funded by the ERGANEO grant MALWARE and the French ANR grant Defmal "ANR-22-PECY-0007".

However, these techniques have serious limitations. Indeed, reverse engineering is not automatic and requires human interaction, which can be very tedious. As for emulation techniques, they can analyse the program only over a limited time interval. To sidestep these difficulties, we introduced in (Messahel and Touili, 2024) a completely automatic and static approach to analyse concurrent, self-modifying code: we introduced a new model called Self Modifying Dynamic Pushdown Network (SM-DPN) to model such programs. An SM-DPN is a network of Self-Modifying Pushdown Systems, i.e., Pushdown Systems that can modify their instructions on the fly during execution. We proposed in (Messahel and Touili, 2024) an algorithm to perform the backward reachability analysis of SM-DPNs. Our algorithm is based on (1) representing infinite sets of configurations of an SM-DPN using finite state automata, and (2) applying a saturation procedure on these automata in order to apply, in a backward way, the different transitions of the SM-DPN (these transitions correspond to the different instructions of the program).

Messahel, W. and Touili, T.
Analyzing a Concurrent Self-Modifying Program: Application to Malware Detection.
DOI: 10.5220/0013103900003899
In Proceedings of the 11th International Conference on Information Systems Security and Privacy (ICISSP 2025) - Volume 1, pages 176-182
ISBN: 978-989-758-735-1; ISSN: 2184-4356

However, the work of (Messahel and Touili, 2024) is purely theoretical, and no running example was given to explain how the SM-DPN model and the automata-based algorithm can be concretely applied to the analysis of real concurrent self-modifying code.

In this paper, we go one step further and consider a case study. We show concretely how the approach and the SM-DPN model of (Messahel and Touili, 2024) can be applied to represent and analyse an example of a multi-threaded self-modifying program infected with malware.

2 RELATED WORK

Binary code analysis has long attracted the interest of computer scientists, especially for security purposes, either to disclose vulnerabilities or to detect hidden malware. Part of the community has used static techniques to analyze binary code, usually pre-processing the binary code before conducting the analysis (Schwartz et al., 2018; Chen et al., 2017; Zhang et al., 2018; Arzt et al., 2014; Biondo et al., 2018; Wu et al., 2019; Wang et al., 2017).

Analyzing self modifying code can be challenging

due to the changing nature of the code. Some works

use dynamic analysis approaches like (Dawei et al.,

2018; Ugarte-Pedrero et al., 2015; Guizani et al.,

2009; Bruschi et al., 2006; Anckaert et al., 2007;

Blazy et al., 2016; Touili and Ye, 2019). However,

none of these techniques can handle in a completely

automatic way concurrent self-modifying code.

On the other hand, the ability to analyze concur-

rent and parallel programs is essential for understand-

ing and improving the performance of these concur-

rent systems. Several works were proposed such as

(Nethercote et al., 2007; Maisuradze et al., 2010; Liu

et al., 2022; Alglave et al., 2010). However, none of

these techniques consider concurrent self-modifying

code.

The only work that we are aware of and that

can handle self-modifying and concurrent programs

is (Messahel and Touili, 2024). In this paper, we go

one step further and show how the approach of (Mes-

sahel and Touili, 2024) can be applied in a concrete

manner for the analysis of a self-modifying concur-

rent program infected with a malware.

3 MOTIVATING EXAMPLE

Executable instructions are stored in memory as byte

code. Thus, changing the byte code will change the

instruction itself. There are several kinds of self-

modifying code. In this work, we consider self-

modifying code caused by self-modifying instruc-

tions, which are often mov instructions that can ac-

cess and modify the byte code stored in memory lo-

cations.

Consider the assembly code fragment shown

in Listing 1. The program contains several self-

modifying instructions that change its execution ﬂow.

It also contains a thread creation instruction intro-

duced by the self-modifying instructions. You can

see that the mov instruction was able to modify the

instructions of the program successfully via its abil-

ity to read and write the memory. Let us explain step

by step how this code contains thread creation and is

self-modifying:

• Instruction mov [0xb80201],0x2 will replace the content at address 0xb80201 with 0x2. Thus, the instruction Mov eax,74e2h at address 0xb80200 is replaced by mov eax,0x2.

• Instruction mov [0xb80206],0xcd will replace the content at address 0xb80206 with 0xcd, and instruction mov [0xb80207],0x80 will replace the content at address 0xb80207 with 0x80. Thus, these two instructions will replace the instruction Mov eax,eax at address 0xb80206 with the instruction Int 0x80. This instruction will call the kernel with the parameter 0x2 in the eax register. This is a process creation function that will create a new child process that begins its execution right after the instruction int 0x80.

    Address    Bytecode          Assembly

    0xb80200   b8e2740000        Mov eax, 74e2h
    0xb80206   89c0              Mov eax, eax
    0xb80208   83f800            Cmp eax, 0x0
    0xb8020b   7420              Jz child
    0xb8020d   e821000000        Call func1
    0xb80212   83f802            Cmp eax, 0x2
    0xb80215   e82a000000        Jz Exit
    0xb8021a   c605010002b802    Mov [0xb80201], 0x2
    0xb80222   c605f7910408cd    Mov [0xb80206], 0xcd
    0xb80229   c605070002b880    Mov [0xb80207], 0x80
    0xb80230   e9ec6dfdaf        Jmp 0xb80200

Listing 1: Assembly code that contains self modifying instructions.


4 THE FORMAL MODEL: SELF-MODIFYING DYNAMIC PUSHDOWN NETWORK

We recall in this section the Self-Modifying Dynamic

Pushdown Network model deﬁnition introduced in

(Messahel and Touili, 2024).

An SM-PDS (Touili and Ye, 2017) is a pushdown system that can modify its own set of rules during execution. A Self-Modifying Dynamic Pushdown Network (SM-DPN) is a network of SM-PDSs (self-modifying pushdown systems) that models a network of pushdown processes running in parallel, where each of these pushdown systems can change its current set of rules and create new processes during its execution.

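Formally, an SM-DPN (Messahel and Touili, 2024) is a tuple $\Re = (P, \Gamma, \Delta, \Delta_c)$, where $P$ is a finite set of control points, $\Gamma$ is a finite set of stack symbols, $\Delta \subseteq \big((P \times \Gamma) \times (P \times \Gamma^*)\big) \cup \big((P \times \Gamma) \times (P \times \Gamma^*) \times ((P \times 2^{\Delta \cup \Delta_c}) \times \Gamma^*)\big)$ is a finite set of transition rules, and $\Delta_c \subseteq P \times (\Delta \cup \Delta_c) \times (\Delta \cup \Delta_c) \times P$ is a finite set of modifying transition rules. A DPN is an SM-DPN with $\Delta_c = \emptyset$.

Each process in the network has a current set of transition rules $\theta$, called its phase, such that $\theta \subseteq \Delta \cup \Delta_c$. Rules in $\Delta_c$ can change the phase of a process. There are three different types of transition rules used by the SM-DPN: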

• $((p, \gamma), (p_1, w_1)) \in \Delta$ where $p, p_1 \in P$, $\gamma \in \Gamma$, $w_1 \in \Gamma^*$. This rule can also be written as $p\gamma \rightarrow p_1 w_1 \in \Delta$. It expresses that if a process of the network is in control point $p$ with $\gamma$ as the top element of its stack, then it can move to control point $p_1$, pop $\gamma$ and push $w_1$.

• $((p, \gamma), (p_1, w_1), ((p_2, \theta), w_2)) \in \Delta$ where $p, p_1, p_2 \in P$, $\gamma \in \Gamma$, $w_1, w_2 \in \Gamma^*$, $\theta \subseteq \Delta \cup \Delta_c$. This rule can also be written as $p\gamma \rightarrow p_1 w_1 \rhd (p_2, \theta) w_2 \in \Delta$. It expresses that if a process of the network is in control point $p$ with $\gamma$ as its topmost stack element, then it can move to control point $p_1$, pop $\gamma$, push $w_1$ and create a new process in the network having $p_2$ as its initial control point, $w_2$ as its initial stack content and $\theta$ as its initial current set of rules (phase).

• $(p, r_1, r_2, p_1) \in \Delta_c$ where $p, p_1 \in P$, $r_1, r_2 \in \Delta \cup \Delta_c$. This rule can also be written as $p \xrightarrow{(r_1, r_2)} p_1 \in \Delta_c$. It expresses that if a process of the network is in control point $p$ and $r_1$ is in its current set of rules, then it can move to control point $p_1$ and update its current set of transition rules by replacing the rule $r_1$ with the rule $r_2$.

A local configuration of a process of the network can be represented by $(p, \theta)w$, where $p \in P$ is the control point of the process, $\theta \subseteq \Delta \cup \Delta_c$ is its current set of rules (phase), and $w \in \Gamma^*$ is its stack content. As in (Messahel and Touili, 2024), to simplify the presentation, we will sometimes write $p^{\theta}$ instead of $(p, \theta)$.

A global SM-DPN configuration is a word of the form $p_1^{\theta_1} w_1\, p_2^{\theta_2} w_2 \cdots p_n^{\theta_n} w_n$ where $p_1, p_2, \dots, p_n \in P$, $w_1, w_2, \dots, w_n \in \Gamma^*$ and $\theta_1, \theta_2, \dots, \theta_n \subseteq \Delta \cup \Delta_c$. This word expresses that there are $n$ running processes in the network and that, for every $i$ such that $1 \le i \le n$, process $i$ is in control point $p_i$ with $w_i$ as its stack content and has $\theta_i$ as its current set of rules.

Let $Conf_{\Re}$ be the set of all global configurations of the SM-DPN $\Re$. We define the transition relation $\Rightarrow_{\Re}$ to be the smallest relation in $Conf_{\Re} \times Conf_{\Re}$ defined as follows:

• Let $c = u\, p^{\theta} w\, v$ and $c' = u\, {p'}^{\theta'} w\, v$ with $u, v \in Conf_{\Re}$, $w \in \Gamma^*$. If $r = p \xrightarrow{(r_1, r_2)} p' \in \Delta_c \cap \theta$, $r_1 \in \theta$ and $\theta' = (\theta \setminus \{r_1\}) \cup \{r_2\}$, then $c$ is a predecessor of $c'$ (also written as $c \Rightarrow_{\Re} c'$). The rule $r$ moves the process from the control point $p$ to the control point $p'$ and changes the current phase (current set of transition rules) by removing $r_1$ and replacing it with $r_2$, without altering the content of the stack.

• Let $c = u\, p^{\theta} \gamma w\, v$ and $c' = u\, {p'}^{\theta} w' w\, v$ with $\gamma \in \Gamma$, $w, w' \in \Gamma^*$, $u, v \in Conf_{\Re}$. If $r = p\gamma \rightarrow p' w' \in \Delta \cap \theta$, then $c \Rightarrow_{\Re} c'$. The rule $r$ moves the SM-DPN process from the control point $p$ to the control point $p'$, pops $\gamma$ from the stack and pushes $w'$ onto the stack. This rule leaves the current phase ($\theta$) untouched.

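• Let $c = u\, p^{\theta} \gamma w\, v$ and $c' = u\, p_2^{\theta_2} w_2\, {p'}^{\theta} w' w\, v$ with $\gamma \in \Gamma$, $w, w', w_2 \in \Gamma^*$, $u, v \in Conf_{\Re}$. If $r = p\gamma \rightarrow p' w' \rhd (p_2, \theta_2) w_2 \in \Delta \cap \theta$, then $c \Rightarrow_{\Re} c'$. Here the rule $r$ moves the SM-DPN process from the control point $p$ to the control point $p'$, pops $\gamma$ from the stack, pushes $w'$ onto the stack and creates a new process at the control point $p_2$, with $w_2$ as stack content and $\theta_2$ as its initial phase, while leaving the current phase ($\theta$) untouched.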
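The three cases of the transition relation can be sketched as a one-step successor function. The following Python code is only an illustration under stated assumptions: the class names `Step`, `Spawn` and `Modify`, the tuple encoding of processes, and the choice of placing a newly created process immediately to the left of its parent are conventions chosen here, not taken from (Messahel and Touili, 2024).

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

Stack = Tuple[str, ...]


@dataclass(frozen=True)
class Step:                      # p gamma -> p1 w1
    p: str
    gamma: str
    p1: str
    w1: Stack


@dataclass(frozen=True)
class Spawn:                     # p gamma -> p1 w1 |> (p2, phase2) w2
    p: str
    gamma: str
    p1: str
    w1: Stack
    p2: str
    phase2: FrozenSet
    w2: Stack


@dataclass(frozen=True)
class Modify:                    # p --(r1, r2)--> p1
    p: str
    r1: object
    r2: object
    p1: str


Process = Tuple[str, FrozenSet, Stack]   # (control point, phase, stack)
Config = Tuple[Process, ...]


def successors(config: Config) -> List[Config]:
    """Return every configuration reachable from `config` in one step."""
    result = []
    for i, (p, phase, stack) in enumerate(config):
        left, right = config[:i], config[i + 1:]
        for r in phase:          # only rules of the current phase may fire
            if r.p != p:
                continue
            if isinstance(r, Modify):
                if r.r1 in phase:
                    new_phase = (phase - {r.r1}) | {r.r2}
                    result.append(left + ((r.p1, new_phase, stack),) + right)
            elif stack and stack[0] == r.gamma:
                moved = (r.p1, phase, r.w1 + stack[1:])
                if isinstance(r, Spawn):
                    child = (r.p2, r.phase2, r.w2)   # assumed: child on the left
                    result.append(left + (child, moved) + right)
                else:
                    result.append(left + (moved,) + right)
    return result


# Tiny example: one process whose phase holds a stack rule and a modifying rule.
step = Step("p0", "a", "p1", ("b", "a"))
patched = Step("p0", "a", "p2", ())
modify = Modify("p0", step, patched, "p0")
start: Config = (("p0", frozenset({step, modify}), ("a",)),)
print(len(successors(start)))
```

In the example, the process can either fire `step` (moving to `p1` and rewriting the stack) or fire `modify` (staying in `p0` with `step` replaced by `patched` in its phase), so two successor configurations are produced.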

We define $\Rightarrow^*_{\Re}$ as the reflexive transitive closure of $\Rightarrow_{\Re}$. If a configuration $c'$ is reachable from $c$ in $i$ steps, i.e., by applying $\Rightarrow_{\Re}$ $i$ times, we write $c \Rightarrow^i_{\Re} c'$. We denote the set of immediate predecessors (resp. successors) of a configuration $c$ as $Pre_{\Re}(c) = \{c' \in Conf_{\Re} : c' \Rightarrow_{\Re} c\}$ (resp. $Post_{\Re}(c) = \{c' \in Conf_{\Re} : c \Rightarrow_{\Re} c'\}$). Let $Pre^*_{\Re}$ (resp. $Post^*_{\Re}$) denote the reflexive transitive closure of $Pre_{\Re}$ (resp. $Post_{\Re}$).


These notations can be generalized to sets of conﬁgu-

rations in the obvious ways. We omit the subscript ℜ

if clear from the context.
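As a simple illustration of the predecessor closure, the following Python sketch computes it on a finite, explicitly enumerated transition graph using a worklist. This is only a toy setting assumed here for illustration: the function name `pre_star` and the edge-list encoding are hypothetical, and the set of configurations of an SM-DPN is in general infinite, which is why the approach recalled in this paper works on automata instead.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Set, Tuple


def pre_star(targets: Iterable[Hashable],
             edges: Iterable[Tuple[Hashable, Hashable]]) -> Set[Hashable]:
    """Return all nodes from which some target is reachable (targets included)."""
    predecessors: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    for source, destination in edges:
        predecessors[destination].add(source)

    reached = set(targets)
    worklist = list(reached)
    while worklist:
        node = worklist.pop()
        for pred in predecessors[node]:
            if pred not in reached:      # each node is processed at most once
                reached.add(pred)
                worklist.append(pred)
    return reached


# c0 => c1 => c2 => c4 and c3 => c2
edges = [("c0", "c1"), ("c1", "c2"), ("c3", "c2"), ("c2", "c4")]
print(sorted(pre_star({"c2"}, edges)))
```

Here `c4` is not returned because it is a successor, not a predecessor, of `c2`.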

5 MODELING SELF MODIFYING CONCURRENT CODE WITH SM-DPN

To perform binary code analysis, we need a suitable intermediate representation that can be easily converted to an SM-DPN. For this purpose, we use CFGs (Control Flow Graphs) automatically extracted from the binary code. Extracting an SM-DPN from a binary program is then divided into three main steps:

1. Generating the Assembly Code: First, we disassemble the binary code with a suitable tool that translates the executable binary code back into assembly instructions. For example, 83 f8 00 will be converted to Cmp eax,0x0.

2. Converting the Extracted Assembly Code to a CFG: Second, the previous step provides us with a list of assembly instructions alongside their memory addresses, from which we can easily extract an abstracted version of the code in the form of a control flow graph.

3. Parsing the CFG to SM-DPN rules: Lastly, we convert the extracted CFG to an SM-DPN that models dynamic process creation and self-modifying instructions. We suppose we are given an oracle that computes the values (or their approximations) of the different registers at the different control points of the program. Let Γ be the set of all memory addresses and register values of the program. There are five possible cases depending on the CFG instruction:

• If the CFG instruction is of the form $n_1 \xrightarrow{\text{push } \beta} n_2$, it will be converted to SM-DPN rules of the form $n_1 \gamma \rightarrow n_2\, \beta\gamma$ for every $\gamma$ in $\Gamma$.

• If the CFG instruction is of the form $n_1 \xrightarrow{\text{pop } \beta} n_2$, it will be converted to SM-DPN rules of the form $n_1 \beta \rightarrow n_2$.

• If the CFG instruction is of the form $n_1 \xrightarrow{\text{call } f} n_2$, it will be converted to SM-DPN rules of the form $n_1 \gamma \rightarrow n_f\, n_2 \gamma$ for every $\gamma$ in $\Gamma$, where $n_f$ is the entry point of the procedure $f$.

• If the CFG instruction is of the form $n_1 \xrightarrow{\text{call CreateProcess}} n_2$, where CreateProcess is a function that creates a new process, then it will be converted to SM-DPN rules of the form $n_1 \gamma \rightarrow n_2 \gamma \rhd p_0 w_0$ for every $\gamma$ in $\Gamma$, where $p_0$ is the entry point of the created process and $w_0$ is its initial stack content.

• If the CFG instruction is of the form $n_1 \xrightarrow{\text{mov } m,l} n_2$, where $m$ is an executable code address, then this is a self-modifying code instruction. This instruction will be converted to SM-DPN rules of the form $n_1 \xrightarrow{(r_1, r_2)} n_2$, where $r_1$ is the rule that corresponds to the instruction being modified by $n_1 \xrightarrow{\text{mov } m,l} n_2$, and $r_2$ is the rule corresponding to the new instruction that replaces the first one.
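The case analysis above can be sketched as a small translation function. The following Python code is an illustrative sketch only: the tuple encoding of CFG edges and rules, the operation names (`"push"`, `"pop"`, `"call"`, `"spawn"`, `"selfmod"`) and the function name `translate` are hypothetical choices made here, and the encoding of `call` (pushing the return address on top of the stack) is an assumption of this sketch.

```python
from typing import List, Sequence, Tuple


def translate(edge: Tuple, stack_alphabet: Sequence[str]) -> List[Tuple]:
    """Translate one CFG edge (n1, instruction, n2) into a list of rules.

    Rules are encoded as plain tuples:
      ("step",   p, gamma, p', pushed_word)
      ("spawn",  p, gamma, p', pushed_word, child_entry, child_stack)
      ("modify", p, old_rule, new_rule, p')
    """
    n1, (op, *args), n2 = edge
    if op == "push":                       # n1 gamma -> n2 beta gamma
        beta = args[0]
        return [("step", n1, g, n2, (beta, g)) for g in stack_alphabet]
    if op == "pop":                        # n1 beta -> n2
        beta = args[0]
        return [("step", n1, beta, n2, ())]
    if op == "call":                       # assumed: return address n2 is pushed
        entry = args[0]
        return [("step", n1, g, entry, (n2, g)) for g in stack_alphabet]
    if op == "spawn":                      # n1 gamma -> n2 gamma |> p0 w0
        child_entry, child_stack = args
        return [("spawn", n1, g, n2, (g,), child_entry, child_stack)
                for g in stack_alphabet]
    if op == "selfmod":                    # n1 --(r1, r2)--> n2
        old_rule, new_rule = args
        return [("modify", n1, old_rule, new_rule, n2)]
    raise ValueError(f"unsupported CFG instruction: {op}")


# Example with a two-symbol stack alphabet.
rules = translate(("n0", ("push", "0x10"), "n1"), ["a", "b"])
for rule in rules:
    print(rule)
```

One stack rule is generated per symbol of the stack alphabet because the instruction must be applicable whatever the current top of the stack is.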

6 SM-DPN BACKWARD REACHABILITY ANALYSIS

It has been shown in (Messahel and Touili, 2024) that

if a regular set of global conﬁgurations can be rep-

resented by a special kind of ﬁnite automata, then it

is possible to effectively compute a ﬁnite automaton

that represents the set of all backward reachable con-

ﬁgurations. We recall in this section the results of

(Messahel and Touili, 2024) concerning the backward

reachability analysis of SM-DPNs.

Let $\Re = (P, \Gamma, \Delta, \Delta_c)$ be an SM-DPN. To finitely represent a regular infinite set of SM-DPN configurations, we use a special kind of automata: an $\Re$-automaton (Messahel and Touili, 2024) is a tuple $A = (S, \Omega, T, s_0, F)$ satisfying the following conditions:

1. $\Omega = (P \times 2^{\Delta \cup \Delta_c}) \cup \Gamma$ is the automaton alphabet.

2. $S$ is a finite set of states partitioned into two subsets $S_c$ and $S_s$, i.e., $S = S_c \cup S_s$ and $S_c \cap S_s = \emptyset$, such that for every $s \in S_c$ and every $p^{\theta}$ with $p \in P$, $\theta \subseteq \Delta \cup \Delta_c$, there is a unique state called $s_{p^{\theta}} \in S_s$.

3. There is a relation $T' \subseteq S_s \times \Gamma \times (S_s \setminus \{s_{p^{\theta}} : s \in S_c, p \in P, \theta \subseteq \Delta \cup \Delta_c\}) \cup S_s \times \{\varepsilon\} \times S_c$ such that $T = T' \cup \{(s, p^{\theta}, s_{p^{\theta}}) : s \in S_c, p \in P, \theta \subseteq \Delta \cup \Delta_c\}$.

4. $s_0 \in S_c$ is the initial state.

5. $F \subseteq S$ is the set of final states.

Note that condition (3) implies the following properties:

• For each $p \in P$, $\theta \subseteq \Delta \cup \Delta_c$, $s \in S_c$, $s$ is the only predecessor of $s_{p^{\theta}}$.

• States $s$ in $S_c$ do not have $\Gamma$-transitions.

• Only $\varepsilon$-moves from states in $S_s$ lead to states in $S_c$.

• States $s$ in $S_s$ do not have $p$-successors, for $p \in P$.

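For $\gamma \in \Gamma \cup \{\varepsilon\}$ and $s, s' \in S$, if $(s, \gamma, s') \in T$ we write $s \xrightarrow{\gamma}_T s'$. This notation can be extended in the obvious manner to sequences of symbols as follows: $\forall s \in S$, $s \xrightarrow{\varepsilon}_T s$, and $\forall s, s' \in S$, $\forall \gamma \in \Gamma \cup \{\varepsilon\}$, $\forall w \in \Gamma^*$, $s \xrightarrow{\gamma w}_T s'$ iff $\exists s'' \in S$ such that $s \xrightarrow{\gamma}_T s'' \xrightarrow{w}_T s'$. We will omit the subscript $T$ when it is understood from the context.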

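Intuitively, the conditions above make sure that every path in the $\Re$-automaton is the concatenation of paths of the form

$s_1 \xrightarrow{p_1^{\theta_1}} s_{p_1^{\theta_1}} \xrightarrow{w_1} q_1 \xrightarrow{\varepsilon} s_2 \xrightarrow{p_2^{\theta_2}} s_{p_2^{\theta_2}} \xrightarrow{w_2} q_2 \cdots \xrightarrow{\varepsilon} s_n \xrightarrow{p_n^{\theta_n}} s_{p_n^{\theta_n}} \xrightarrow{w_n} q_n$

such that $s_1, s_2, \dots, s_n \in S_c$, $s_{p_1^{\theta_1}}, \dots, s_{p_n^{\theta_n}} \in S_s$, $q_1, \dots, q_n \in S_s$, $p_1, \dots, p_n \in P$, $\theta_1, \theta_2, \dots, \theta_n \subseteq \Delta \cup \Delta_c$, and $w_1, \dots, w_n \in \Gamma^*$.

A configuration $p_1^{\theta_1} w_1 \cdots p_n^{\theta_n} w_n$ is accepted by an automaton $A$ if there exists a path of the form

$s_0 \xrightarrow{p_1^{\theta_1}} s_{p_1^{\theta_1}} \xrightarrow{w_1} q_1 \xrightarrow{\varepsilon} s_2 \xrightarrow{p_2^{\theta_2}} s_{p_2^{\theta_2}} \xrightarrow{w_2} q_2 \cdots \xrightarrow{\varepsilon} s_n \xrightarrow{p_n^{\theta_n}} s_{p_n^{\theta_n}} \xrightarrow{w_n} q_n$

such that $q_n \in F$. We denote by $L(A)$ the set of configurations accepted by $A$.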
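The acceptance condition can be illustrated with an ordinary word-acceptance check on a finite automaton with ε-moves. The following Python sketch is an illustration under assumptions made here: the dictionary encoding of the transition relation, the state names, and the representation of a symbol $p^{\theta}$ as a pair `("p", "theta")` (with `"theta"` standing for a named phase) are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

EPS = None  # marker used for epsilon moves

Delta = Dict[Tuple[str, Hashable], Set[str]]


def eps_closure(states: Iterable[str], delta: Delta) -> Set[str]:
    """All states reachable from `states` using only epsilon moves."""
    seen = set(states)
    work = list(seen)
    while work:
        state = work.pop()
        for target in delta.get((state, EPS), set()):
            if target not in seen:
                seen.add(target)
                work.append(target)
    return seen


def accepts(word: List[Hashable], delta: Delta, initial: str, finals: Set[str]) -> bool:
    """Check whether the automaton accepts the given sequence of symbols."""
    current = eps_closure({initial}, delta)
    for symbol in word:
        reached: Set[str] = set()
        for state in current:
            reached |= delta.get((state, symbol), set())
        current = eps_closure(reached, delta)
    return bool(current & finals)


# s0 plays the role of a state that reads a process symbol, s_p is the
# state reached after reading p^theta, and q reads stack symbols.
P_THETA = ("p", "theta")
delta: Delta = {
    ("s0", P_THETA): {"s_p"},   # s0 --p^theta--> s_p
    ("s_p", "a"): {"q"},        # stack symbols are read between s_p and q
    ("q", "a"): {"q"},
    ("q", EPS): {"s0"},         # epsilon move back to read the next process
}

print(accepts([P_THETA, "a", "a"], delta, "s0", {"q"}))
print(accepts([P_THETA, "a", P_THETA, "a"], delta, "s0", {"q"}))
print(accepts([P_THETA], delta, "s0", {"q"}))
```

The first word encodes a single process with stack content `aa`, the second encodes two processes each with stack content `a`, and the third is rejected because no stack symbol is read before the end of the word.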

It was shown in (Messahel and Touili, 2024) that:

Theorem 6.1. Let $\Re = (P, \Gamma, \Delta, \Delta_c)$ be an SM-DPN and $A = (S, \Omega, T, s_0, F)$ be an $\Re$-automaton. We can build a finite automaton $A_{pre^*} = (S, \Omega, T', s_0, F)$ such that $L(A_{pre^*}) = pre^*(L(A))$.

7 CASE STUDY: A REAL EXAMPLE

In order to observe the workflow of a binary analysis using SM-DPN modeling and reachability analysis, let us consider assembly code extracted from a binary file infected by a concurrent self-modifying variant of the BadRabbit ransomware. BadRabbit is a notorious malware that performs a drive-by attack to install itself through a fake Adobe Flash installer or update, encrypts all data and uses the EternalRomance exploit to spread within the corporate network (Perekalin, 2017). The binary is obtained from the MalwareCollection GitHub repository (Enderman, 2022). We explain step by step the process of reverse engineering the binary "BadRabbit.exe", then modeling it with an SM-DPN, and finally applying the $Pre^*$ algorithm of (Messahel and Touili, 2024) to perform a reachability analysis, where the malicious part starts at the address n_7.

7.1 Reverse Engineering

We use radare2 (Studer et al., 2023) to disassemble the binary and retrieve its assembly code, shown in Listing 2, where n_0, n_1, ..., n_i are memory addresses.

    Address   Assembly

    n_0       Push 0x10
    n_1       Mov ebp, 0x1ff
    n_2       Mov [n_12], 0xffd6       ; change instruction stored in n_12
    n_3       Nop
    n_4       Push eax
    n_5       Mov [var_1250h], 0
    n_6       jmp n_2
    n_7       LEA EDX, [0xfffff9e4]
    n_8       PUSH EDX
    n_9       MOV [local_12b0], 0x44
    n_10      CALL CreateProcessW      ; create new process

Listing 2: BadRabbit snippet ASM code.

7.2 Translating the Binary Code to an SM-DPN

To model the assembly code of the BadRabbit malware with an SM-DPN, we loop over the instructions and convert each instruction to SM-DPN rules as described below. In order to convert some instructions, we need to resolve the values of the registers they use. To this end, we set up three levels of precision:

1. Using the Radare2 (Studer et al., 2023) built-in analysis. This is the fastest method, but it can only resolve the instruction pointer (IP).

2. Using Angr (Shoshitaishvili et al., 2016) symbolic execution. This performs a symbolic execution without actually running the program and can track register values in most cases, but it is costly in terms of time and memory consumption because it is based on exploring all possible logical branches.

3. Running the program in an isolated environment with a debugger. This is the best method in terms of time and memory consumption, but it is tricky if the program depends on a variety of external inputs.

We follow the rules described in Section 5 for translating assembly code to an SM-DPN. Let $p_0, p_1, \dots, p_i$ denote the SM-DPN control points corresponding to the addresses n_0, n_1, ..., n_i, respectively, so that $p_0$ is the program control point at the address n_0. Let $\Gamma$ be the set of all memory addresses and register values of the program. The instructions of the program are translated to SM-DPN rules as follows:

• push 0x10: since this is a push instruction, we move from control point $p_0$ to control point $p_1$ and we push 0x10 onto the stack. Thus, this instruction can be modelled by the SM-DPN rules $p_0 \gamma \rightarrow p_1\, \mathtt{0x10}\, \gamma$ for every $\gamma \in \Gamma$.

• mov ebp, 0x1ff: a non self-modifying mov instruction does not change the content of the stack and moves the SM-DPN from control point $p_1$ to control point $p_2$, resulting in the rules $p_1 \gamma \rightarrow p_2 \gamma$, for every $\gamma \in \Gamma$.

• mov [n_12], 0xffd6: a self-modifying mov instruction does not change the content of the stack, moves the SM-DPN from control point $p_2$ to control point $p_3$ and replaces an instruction represented by a rule, say $r'$, by another instruction represented by a rule, say $r''$, resulting in the SM-DPN rule $p_2 \xrightarrow{(r', r'')} p_3$.

• Nop: this instruction moves the SM-DPN from control point $p_3$ to control point $p_4$, resulting in the rules $p_3 \gamma \rightarrow p_4 \gamma$, for every $\gamma \in \Gamma$.

• push eax: a push instruction moves from control point $p_4$ to control point $p_5$ and pushes the value of eax onto the stack. Thus, this instruction is translated to the SM-DPN rules $p_4 \gamma \rightarrow p_5\, \mathit{eax}\, \gamma$, for every $\gamma \in \Gamma$.

• mov [var_1250h], 0: a non self-modifying mov instruction does not change the content of the stack and moves the SM-DPN from control point $p_5$ to control point $p_6$, resulting in the SM-DPN rules $p_5 \gamma \rightarrow p_6 \gamma$, for every $\gamma \in \Gamma$.

• jmp n_2: the jmp instruction does not change the stack content and moves the SM-DPN from control point $p_6$ to control point $p_2$, resulting in the SM-DPN rules $p_6 \gamma \rightarrow p_2 \gamma$, for every $\gamma \in \Gamma$.

• push edx: a push instruction moves the SM-DPN from control point $p_8$ to control point $p_9$ and pushes the value of edx onto the stack. Thus, this instruction can be translated to the SM-DPN rules $p_8 \gamma \rightarrow p_9\, \mathit{edx}\, \gamma$, for every $\gamma \in \Gamma$.

• mov [local_12b0], 0x44: a non self-modifying mov instruction does not change the stack content and moves the SM-DPN from control point $p_9$ to control point $p_{10}$, resulting in the SM-DPN rules $p_9 \gamma \rightarrow p_{10} \gamma$, for every $\gamma \in \Gamma$.

• call CreateProcessW: a process creation instruction moves the SM-DPN from control point $p_{10}$ to control point $p_{11}$, while launching a new process. Suppose the function CreateProcessW creates a new process having $s_0$ as entry control point and $w_0$ as stack content. Then this instruction is translated to the SM-DPN rules $p_{10} \gamma \rightarrow p_{11} \gamma \rhd s_0 w_0$, for every $\gamma \in \Gamma$.

7.3 The BadRabbit Reachability Analysis

Once we have the above SM-DPN model, we can apply the backward reachability results of Theorem 6.1 to analyse the program. These techniques, which were implemented in a tool to analyze SM-DPNs, have found that the program entry point (n_0) is in the $pre^*$ of the malicious entry point (n_7). This means that the malicious code is reachable from the program entry point, i.e., the program is infected.

8 CONCLUSION

In this paper, we tackle the problem of analyzing multi-threaded parallel programs that contain self-modifying code, i.e., code that can modify itself during execution. Malware makes heavy use of this kind of self-modifying code for obfuscation, so that it cannot be detected by anti-virus software. To model such programs, we use a new model called Self Modifying Dynamic Pushdown Network (SM-DPN). An SM-DPN is a network of Self-Modifying Pushdown Systems, i.e., Pushdown Systems that can modify their instructions on the fly during execution. To analyse self-modifying concurrent programs, we perform reachability analysis of SM-DPNs. We successfully apply our approach to represent and analyse a multi-threaded self-modifying program infected with malware.


REFERENCES

Alglave, J., Kashyap, M., and Tofte, M. (2010). Compo-

sitional reasoning for shared-variable concurrent pro-

grams. ACM Transactions on Programming Lan-

guages and Systems (TOPLAS), 32(5):1–53.

Anckaert, B., Madou, M., and De Bosschere, K. (2007). A

model for self-modifying code. In Camenisch, J. L.,

Collberg, C. S., Johnson, N. F., and Sallee, P., editors,

Information Hiding, pages 232–248, Berlin, Heidel-

berg. Springer Berlin Heidelberg.

Arzt, S., Rasthofer, S., Fritz, C., Bodden, E., Bartel, A.,

Klein, J., Le Traon, Y., Octeau, D., and McDaniel, P.

(2014). Flowdroid: Precise context, ﬂow, ﬁeld, object-

sensitive and lifecycle-aware taint analysis for android

apps. ACM SIGPLAN Notices, 49(6):259–269.

Biondo, A., Conti, M., and Lain, D. (2018). Back to the

epilogue: Evading control ﬂow guard via unaligned

targets. In NDSS.

Blazy, S., Laporte, V., and Pichardie, D. (2016). Veriﬁed ab-

stract interpretation techniques for disassembling low-

level self-modifying code. Journal of Automated Rea-

soning, 56:283–308.

Bruschi, D., Martignoni, L., and Monga, M. (2006). Detect-

ing self-mutating malware using control-ﬂow graph

matching. In Detection of Intrusions and Malware

& Vulnerability Assessment: Third International Con-

ference, DIMVA 2006, Berlin, Germany, July 13-14,

2006. Proceedings 3, pages 129–143. Springer.

Chen, Y., Zhang, D., Wang, R., Qiao, R., Azab, A. M., Lu,

L., Vijayakumar, H., and Shen, W. (2017). Norax:

Enabling execute-only memory for cots binaries on

aarch64. In 2017 IEEE Symposium on Security and

Privacy (SP), pages 304–319. IEEE.

Dawei, S., Delong, L., and Zhibin, Y. (2018). Dynamic self-

modifying code detection based on backward analysis.

In Proceedings of the 2018 10th International Confer-

ence on Computer and Automation Engineering, IC-

CAE 2018, page 199–204, New York, NY, USA. As-

sociation for Computing Machinery.

Enderman (2022). Malwarecollection. https://github.com/

xcp3r/MalwareCollection.

Guizani, W., Marion, J.-Y., and Reynaud-Plantey, D.

(2009). Server-side dynamic code analysis. In 2009

4th International Conference on Malicious and Un-

wanted Software (MALWARE), pages 55–62.

Liu, Y., Xu, Z., Fan, M., Hao, Y., Chen, K., Chen, H., Cai,

Y., Yang, Z., and Liu, T. (2022). Concspectre: Be

aware of forthcoming malware hidden in concurrent

programs. IEEE Transactions on Reliability, 71:1–10.

Maisuradze, G., Petrenko, A. S., Bala, A., and Lie, D.

(2010). Threadsanitizer: ﬁnding data races in native

code. Proceedings of the ACM SIGPLAN Conference

on Programming Language Design and Implementa-

tion, pages 89–100.

Messahel, W. and Touili, T. (2024). Reachability analysis

of concurrent self-modifying code. In 28th Interna-

tional Conference on Engineering of Complex Com-

puter Systems (ICECCS).

Nethercote, N., Seward, J., and Seward, J. (2007). Valgrind:

A framework for heavyweight dynamic binary instru-

mentation. In Proceedings of the 2007 International

Symposium on Dynamic Languages, pages 89–100.

Perekalin, A. (2017). Bad rabbit: A new ransomware epi-

demic is on the rise. https://www.kaspersky.com/blog/

bad-rabbit-ransomware/19887/.

Schwartz, E. J., Cohen, C. F., Duggan, M., Gennari, J.,

Havrilla, J. S., and Hines, C. (2018). Using logic pro-

gramming to recover c++ classes and methods from

compiled executables. In Proceedings of the 2018

ACM SIGSAC Conference on Computer and Commu-

nications Security, pages 426–441.

Shoshitaishvili, Y., Wang, R., Salls, C., Stephens, N.,

Polino, M., Dutcher, A., Grosen, J., Feng, S., Hauser,

C., Kruegel, C., and Vigna, G. (2016). SoK: (State

of) The Art of War: Offensive Techniques in Binary

Analysis. In IEEE Symposium on Security and Pri-

vacy.

Studer, A., Abd El-MAwgood, A. M., and Akshay Krish-

nan, R. (2023). The ofﬁcial radare2 book. https:

//book.rada.re/credits/credits.html.

Touili, T. and Ye, X. (2017). Reachability analysis of

self modifying code. In 22nd International Confer-

ence on Engineering of Complex Computer Systems

(ICECCS).

Touili, T. and Ye, X. (2019). Ltl model checking of self

modifying code. In 2019 24th International Confer-

ence on Engineering of Complex Computer Systems

(ICECCS).

Ugarte-Pedrero, X., Balzarotti, D., Santos, I., and Bringas,

P. G. (2015). Sok: Deep packer inspection: A longitu-

dinal study of the complexity of run-time packers. In

2015 IEEE Symposium on Security and Privacy, pages

659–673.

Wang, R., Shoshitaishvili, Y., Bianchi, A., Machiry, A.,

Grosen, J., Grosen, P., Kruegel, C., and Vigna, G.

(2017). Ramblr: Making reassembly great again. In

NDSS.

Wu, W., Chen, Y., Xing, X., and Zou, W. (2019). Kepler:

Facilitating control-ﬂow hijacking primitive evalua-

tion for linux kernel vulnerabilities. In USENIX Se-

curity Symposium, pages 1187–1204.

Zhang, X., Zhang, Y., Mo, Q., Xia, H., Yang, Z., Yang, M.,

Wang, X., Lu, L., and Duan, H. (2018). An empiri-

cal study of web resource manipulation in real-world

mobile applications. In 27th USENIX Security Symposium (USENIX Security 18), pages 1183–1198.
