Analyzing a Concurrent Self-Modifying Program: Application to Malware Detection*
Walid Messahel¹ and Tayssir Touili²
¹ LIPN, CNRS and University Sorbonne Paris Nord, France
² IRIF, CNRS and University Paris Cité, France
Keywords: Model Checking, Self-Modifying Code, Pushdown Systems, Malware Detection.
Abstract: We tackle the analysis problem of multi-threaded parallel programs that contain self-modifying code, i.e., code that can rewrite itself at execution time. This kind of code is often used to hide malicious portions of code so that they cannot be detected by anti-virus software. In (Messahel and Touili, 2024), we introduced a new model called Self-Modifying Dynamic Pushdown Network (SM-DPN) to model such programs. An SM-DPN is a network of Self-Modifying Pushdown Systems, i.e., Pushdown Systems that can modify their instructions on the fly during execution. We proposed an algorithm to perform the backward reachability analysis of SM-DPNs. However, in (Messahel and Touili, 2024), no concrete example was provided. In this paper, we go one step further. We consider a case study and show concretely how this approach and this model can be applied to represent and analyse an example of multi-threaded self-modifying code infected with malware.
1 INTRODUCTION
Self-modifying code is a programming technique that allows computer programs to change their behavior at execution time, without any external intervention. This technique serves different purposes: some software engineers use it to protect their products from being reverse engineered (code obfuscation), while others use it to evade detection by anti-malware systems.
Concurrent programming, on the other hand, is a software engineering technique that performs multiple tasks simultaneously in order to make better use of hardware resources and to improve efficiency and performance. Analyzing concurrent programs can be challenging because of the complex interactions and inter-dependencies between concurrent threads and processes.
In this work, we consider the analysis problem of programs that present both sources of difficulty: (1) concurrency and thread creation, and (2) self-modifying code. Several techniques have been used to analyse this kind of program, such as:
1. Reverse engineering: This technique involves analyzing the binary code and manually understanding and anticipating its behavior using disassembling and decompiling tools. This approach can deliver results, but it is hard and time consuming.
2. Emulation techniques: These simulate the program's execution in a virtual environment using emulators. This can be used to observe the code's behaviour and detect malfunctioning.
* This work was partially funded by the ERGANEO grant MALWARE and the French ANR grant Defmal ANR-22-PECY-0007.
However, these techniques have serious limitations. Reverse engineering is not automatic and requires human interaction, which can be very tedious. Emulation techniques can only analyse the program over a limited time interval. To sidestep these difficulties, we introduced in (Messahel and Touili, 2024) a completely automatic and static approach to analyse concurrent self-modifying code, based on a new model called Self-Modifying Dynamic Pushdown Network (SM-DPN). An SM-DPN is a network of Self-Modifying Pushdown Systems, i.e., Pushdown Systems that can modify their instructions on the fly during execution. We proposed in (Messahel and Touili, 2024) an algorithm to perform the backward reachability analysis of SM-DPNs. Our algorithm is based on (1) representing infinite sets of configurations of SM-DPNs using finite state automata, and (2) applying a saturation procedure on these automata in order to apply, in a backward manner, the different transitions of the SM-DPN (these transitions correspond to the different instructions of the program).
Messahel, W. and Touili, T.
Analyzing a Concurrent Self-Modifying Program: Application to Malware Detection.
DOI: 10.5220/0013103900003899
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 11th International Conference on Information Systems Security and Privacy (ICISSP 2025) - Volume 1, pages 176-182
ISBN: 978-989-758-735-1; ISSN: 2184-4356
Proceedings Copyright © 2025 by SCITEPRESS Science and Technology Publications, Lda.
However, the work of (Messahel and Touili, 2024) is purely theoretical: no running example was given to explain how the SM-DPN model and the automata-based algorithm can be concretely applied to the analysis of real concurrent self-modifying code.
In this paper, we go one step further and consider a case study. We show concretely how the approach and the SM-DPN model of (Messahel and Touili, 2024) can be applied to represent and analyse an example of multi-threaded self-modifying code infected with malware.
2 RELATED WORK
Analyzing binary code has long been of interest to computer scientists, especially for security purposes, either to disclose vulnerabilities or to detect hidden malware. Part of the community has used static techniques to analyze binary code, usually pre-processing the binary code before conducting the analysis (Schwartz et al., 2018; Chen et al., 2017; Zhang et al., 2018; Arzt et al., 2014; Biondo et al., 2018; Wu et al., 2019; Wang et al., 2017).
Analyzing self-modifying code can be challenging due to the changing nature of the code. Some works use dynamic analysis approaches, like (Dawei et al., 2018; Ugarte-Pedrero et al., 2015; Guizani et al., 2009; Bruschi et al., 2006; Anckaert et al., 2007; Blazy et al., 2016; Touili and Ye, 2019). However, none of these techniques can handle concurrent self-modifying code in a completely automatic way.
On the other hand, the ability to analyze concurrent and parallel programs is essential for understanding and improving the performance of these concurrent systems. Several works have been proposed, such as (Nethercote et al., 2007; Maisuradze et al., 2010; Liu et al., 2022; Alglave et al., 2010). However, none of these techniques considers concurrent self-modifying code.
The only work that we are aware of that can handle self-modifying and concurrent programs is (Messahel and Touili, 2024). In this paper, we go one step further and show how the approach of (Messahel and Touili, 2024) can be applied in a concrete manner to the analysis of a self-modifying concurrent program infected with malware.
3 MOTIVATING EXAMPLE
Executable instructions are stored in memory as byte code. Thus, changing the byte code changes the instruction itself. There are several kinds of self-modifying code. In this work, we consider self-modifying code caused by self-modifying instructions, which are often mov instructions that can access and modify the byte code stored in memory locations.
Consider the assembly code fragment shown in Listing 1. The program contains several self-modifying instructions that change its execution flow. It also contains a thread creation instruction that is introduced by the self-modifying instructions. The mov instructions modify the instructions of the program through their ability to read and write memory. Let us explain step by step how this code contains thread creation and is self-modifying:
• Instruction mov [0xb80201],0x2 replaces the content at address 0xb80201 with 0x2. Thus, the instruction Mov eax,74e2h at address 0xb80200 is replaced by mov eax,0x2.
• Instruction mov [0xb80206],0xcd replaces the content at address 0xb80206 with 0xcd, and instruction mov [0xb80207],0x80 replaces the content at address 0xb80207 with 0x80. Together, these two instructions replace the instruction Mov eax,eax at address 0xb80206 with the instruction Int 0x80. This instruction calls the kernel with the parameter 0x2 in the eax register. This is a process creation function that creates a new child process, which begins its execution right after the instruction int 0x80.
    Address     Bytecode          Assembly

    0xb80200    b8e2740000        Mov eax, 74e2h
    0xb80206    89c0              Mov eax, eax
    0xb80208    83f800            Cmp eax, 0x0
    0xb8020b    7420              Jz child
    0xb8020d    e821000000        Call func1
    0xb80212    83f802            Cmp eax, 0x2
    0xb80215    e82a000000        Jz Exit
    0xb8021a    c605010002b802    Mov [0xb80201], 0x2
    0xb80222    c605f7910408cd    Mov [0xb80206], 0xcd
    0xb80229    c605070002b880    Mov [0xb80207], 0x80
    0xb80230    e9ec6dfdaf        Jmp 0xb80200
Listing 1: Assembly code that contains self-modifying instructions.
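The effect of the three self-modifying mov instructions can be illustrated at the byte level. The following is a minimal sketch, not part of the analysis tool: it assumes a memory modelled as a Python dictionary from addresses to bytes, and the helper names load and read are hypothetical. Only the first two instructions of Listing 1 are loaded, at the addresses given in the listing.

```python
def load(memory, address, hex_bytes):
    """Store the bytes of one instruction at consecutive addresses."""
    for offset, byte in enumerate(bytes.fromhex(hex_bytes)):
        memory[address + offset] = byte


def read(memory, address, length):
    """Read `length` consecutive bytes starting at `address`."""
    return bytes(memory[address + i] for i in range(length))


def run_patches():
    memory = {}
    load(memory, 0xB80200, "b8e2740000")  # Mov eax, 74e2h (as in Listing 1)
    load(memory, 0xB80206, "89c0")        # Mov eax, eax   (as in Listing 1)
    # Each self-modifying mov writes a single byte into the code region.
    patches = [(0xB80201, 0x02), (0xB80206, 0xCD), (0xB80207, 0x80)]
    for address, value in patches:
        memory[address] = value
    return memory


memory = run_patches()
# The immediate operand byte of the first instruction has been rewritten.
print(read(memory, 0xB80200, 2).hex())
# The two bytes at 0xb80206 are now cd 80, the encoding of int 0x80.
print(read(memory, 0xB80206, 2).hex())
```

A dictionary is used rather than a contiguous byte array so that each instruction can be placed at exactly the address shown in the listing.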
4 THE FORMAL MODEL: SELF-MODIFYING DYNAMIC PUSHDOWN NETWORK
We recall in this section the definition of the Self-Modifying Dynamic Pushdown Network model introduced in (Messahel and Touili, 2024).
An SM-PDS (Touili and Ye, 2017) is a pushdown system that can modify its own set of rules during execution. A Self-Modifying Dynamic Pushdown Network (SM-DPN) is a network of SM-PDSs (self-modifying pushdown systems) that models pushdown processes running in parallel, where each of these pushdown systems can change its current set of rules and create new processes during its execution.
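Formally, an SM-DPN (Messahel and Touili, 2024) is a tuple $\mathcal{M} = (P, \Gamma, \Delta, \Delta_c)$, where $P$ is a finite set of control points, $\Gamma$ is a finite set of stack symbols, $\Delta \subseteq (P \times \Gamma) \times (P \times \Gamma^*) \;\cup\; (P \times \Gamma) \times (P \times \Gamma^*) \times ((P \times 2^{\Delta \cup \Delta_c}) \times \Gamma^*)$ is a finite set of transition rules, and $\Delta_c \subseteq P \times (\Delta \cup \Delta_c) \times (\Delta \cup \Delta_c) \times P$ is a finite set of modifying transition rules. A DPN is an SM-DPN with $\Delta_c = \emptyset$.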
Each process in the network has a current set of transition rules $\theta$, called its phase, such that $\theta \subseteq \Delta \cup \Delta_c$; rules in $\Delta_c$ can change the phase of a process. There are three different types of transition rules used by the SM-DPN:
• $((p, \gamma), (p_0, w_0)) \in \Delta$ where $p, p_0 \in P$, $\gamma \in \Gamma$, $w_0 \in \Gamma^*$. This rule can also be written as $p\gamma \hookrightarrow p_0 w_0$. It expresses that if a process of the network is in control point $p$ with $\gamma$ as the top element of its stack, then it can move to control point $p_0$, pop $\gamma$ and push $w_0$.
• $((p, \gamma), (p_1, w_1), ((p_0, \theta), w_0)) \in \Delta$ where $p, p_0, p_1 \in P$, $\gamma \in \Gamma$, $w_0, w_1 \in \Gamma^*$, $\theta \subseteq \Delta \cup \Delta_c$. This rule can also be written as $p\gamma \hookrightarrow p_1 w_1 \rhd (p_0, \theta) w_0$. It expresses that if a process of the network is in control point $p$ with $\gamma$ as its topmost stack element, then it can move to control point $p_1$, pop $\gamma$, push $w_1$ and create a new process in the network having $p_0$ as its initial control point, $w_0$ as its initial stack content and $\theta$ as its initial set of rules (phase).
• $(p, r_1, r_2, p_0) \in \Delta_c$ where $p, p_0 \in P$, $r_1, r_2 \in \Delta \cup \Delta_c$. This rule can also be written as $p \overset{(r_1, r_2)}{\hookrightarrow} p_0 \in \Delta_c$. It expresses that if a process of the network is in control point $p$ and $r_1$ is in its current set of rules, then it can move to control point $p_0$ and update its current set of transition rules by replacing the rule $r_1$ with the rule $r_2$.
A local configuration of a process of the network is represented by $(p, \theta)w$, where $p \in P$ is the control point of the process, $\theta \subseteq \Delta \cup \Delta_c$ is its current set of rules (phase), and $w \in \Gamma^*$ is its stack content. As in (Messahel and Touili, 2024), to simplify the presentation, we will sometimes write $p^{\theta}$ instead of $(p, \theta)$.
A global SM-DPN configuration is a word of the form
$$p_0^{\theta_0} w_0 \; p_1^{\theta_1} w_1 \cdots p_n^{\theta_n} w_n$$
where $p_0, p_1, \ldots, p_n \in P$, $w_0, w_1, \ldots, w_n \in \Gamma^*$ and $\theta_0, \theta_1, \ldots, \theta_n \subseteq \Delta \cup \Delta_c$. This word expresses that there are $n + 1$ running processes in the network and that, for every $i$ such that $0 \le i \le n$, process $i$ is in control point $p_i$, with $w_i$ as its stack content and $\theta_i$ as its current set of rules.
Let $Conf_{\mathcal{M}}$ be the set of all global configurations of the SM-DPN $\mathcal{M}$. We define the transition relation $\Rightarrow_{\mathcal{M}}$ to be the smallest relation in $Conf_{\mathcal{M}} \times Conf_{\mathcal{M}}$ such that:
• Let $c = u\,p^{\theta} w\,v$ and $c' = u\,p'^{\theta'} w\,v$ with $u, v \in Conf_{\mathcal{M}}$, $w \in \Gamma^*$. If $r = p \overset{(r_1, r_2)}{\hookrightarrow} p' \in \Delta_c \cap \theta$, $r_1 \in \theta$ and $\theta' = (\theta \setminus \{r_1\}) \cup \{r_2\}$, then $c$ is a predecessor of $c'$ (also written $c \Rightarrow_{\mathcal{M}} c'$). The rule $r$ moves the process from control point $p$ to control point $p'$ and changes the current phase (current set of transition rules) by replacing $r_1$ with $r_2$, without altering the content of the stack.
• Let $c = u\,p^{\theta} \gamma w\,v$ and $c' = u\,p'^{\theta} w' w\,v$ with $\gamma \in \Gamma$, $w, w' \in \Gamma^*$, $u, v \in Conf_{\mathcal{M}}$. If $r = p\gamma \hookrightarrow p' w' \in \Delta \cap \theta$, then $c \Rightarrow_{\mathcal{M}} c'$. The rule $r$ moves the SM-DPN process from control point $p$ to control point $p'$, pops $\gamma$ from the stack and pushes $w'$ onto the stack. This rule leaves the current phase $\theta$ untouched.
• Let $c = u\,p^{\theta} \gamma w\,v$ and $c' = u\,p_1^{\theta_1} w_1\; p'^{\theta} w' w\,v$ with $\gamma \in \Gamma$, $w \in \Gamma^*$, $u, v \in Conf_{\mathcal{M}}$. If $r = p\gamma \hookrightarrow p' w' \rhd p_1^{\theta_1} w_1 \in \Delta \cap \theta$, then $c \Rightarrow_{\mathcal{M}} c'$. Here the rule $r$ moves the SM-DPN process from control point $p$ to control point $p'$, pops $\gamma$ from the stack, pushes $w'$ onto the stack and creates a new process at control point $p_1$, with $w_1$ as stack content and $\theta_1$ as initial phase, while leaving the current phase $\theta$ untouched.
We define $\Rightarrow^*_{\mathcal{M}}$ as the reflexive transitive closure of $\Rightarrow_{\mathcal{M}}$. If a configuration $c'$ is reachable from $c_0$ in $i$ steps by applying $\Rightarrow_{\mathcal{M}}$ $i$ times, we write $c_0 \Rightarrow^i_{\mathcal{M}} c'$.
We denote the set of immediate predecessors (resp. successors) of a configuration $c$ as $Pre_{\mathcal{M}}(c) = \{c_1 \in Conf_{\mathcal{M}} : c_1 \Rightarrow_{\mathcal{M}} c\}$ (resp. $Post_{\mathcal{M}}(c) = \{c_1 \in Conf_{\mathcal{M}} : c \Rightarrow_{\mathcal{M}} c_1\}$). Let $Pre^*_{\mathcal{M}}$ (resp. $Post^*_{\mathcal{M}}$) denote the reflexive transitive closure of $Pre_{\mathcal{M}}$ (resp. $Post_{\mathcal{M}}$). These notations generalize to sets of configurations in the obvious way. We omit the subscript $\mathcal{M}$ when it is clear from the context.
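To make the transition relation concrete, the following minimal sketch computes the immediate successors of one explicit configuration. It is an illustration only, not the automata-based algorithm of (Messahel and Touili, 2024). The encoding is an assumption made for this example: a process is a triple (control point, phase, stack) with the top of the stack first, a phase is a frozenset of rules, and the rule tags "step", "spawn" and "modify" are hypothetical names for the three rule types.

```python
def successors(config):
    """Return the set of configurations reachable from `config` in one step."""
    result = set()
    for i, (p, theta, stack) in enumerate(config):
        before, after = config[:i], config[i + 1:]
        for rule in theta:  # only rules of the current phase may fire
            kind = rule[0]
            if kind == "modify":
                _, src, r1, r2, dst = rule
                if src == p and r1 in theta:
                    # Replace r1 by r2 in the phase; the stack is unchanged.
                    new_theta = (theta - {r1}) | {r2}
                    result.add(before + ((dst, new_theta, stack),) + after)
            elif stack and rule[1] == p and rule[2] == stack[0]:
                rest = stack[1:]  # pop the top symbol
                if kind == "step":
                    _, _, _, dst, pushed = rule
                    result.add(before + ((dst, theta, pushed + rest),) + after)
                elif kind == "spawn":
                    _, _, _, dst, pushed, child, child_theta, child_stack = rule
                    # The new process is placed to the left of its creator.
                    new = ((child, child_theta, child_stack),
                           (dst, theta, pushed + rest))
                    result.add(before + new + after)
    return result


# Example: one process at control point "p" with stack "a".
r1 = ("step", "p", "a", "q", ("b",))
r2 = ("step", "p", "a", "q", ())
m = ("modify", "p", r1, r2, "p")
theta = frozenset({r1, m})
start = (("p", theta, ("a",)),)
for c in successors(start):
    print(c[0][0], c[0][2])
```

In the example, the process can either fire r1 directly or first replace r1 by r2 in its phase, which gives two distinct successor configurations. Explicit enumeration of this kind only works for finitely many configurations, which is why the approach recalled in Section 6 represents sets of configurations with automata instead.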
5 MODELING SELF-MODIFYING CONCURRENT CODE WITH SM-DPN
To perform binary code analysis, we need an intermediate representation that can be easily converted to an SM-DPN. For this purpose, we use CFGs (Control Flow Graphs) automatically extracted from the binary code. Extracting an SM-DPN from a binary program is divided into three main steps:
1. Generating the Assembly Code: First, we disassemble the binary code with a tool that translates the binary code back into assembly instructions. For example, 83 f8 00 is converted to Cmp eax,0x0.
2. Converting the Extracted Assembly Code to a CFG: The previous step provides a list of assembly instructions alongside their memory addresses. From this list, we can easily extract an abstracted version of the code in the form of a control flow graph.
3. Parsing the CFG to SM-DPN rules: Lastly, we convert the extracted CFG to an SM-DPN that models dynamic process creation and self-modifying instructions. We suppose we are given an oracle that computes the values (or their approximations) of the different registers at different control points of the program. Let $\Gamma$ be the set of all memory addresses and register values of the program. There are five possible cases depending on the CFG instructions:
   • If the CFG instruction is of the form $n_1 \xrightarrow{\text{push } \beta} n_2$, it is converted to SM-DPN rules of the form $n_1 \gamma \hookrightarrow n_2 \beta\gamma$ for every $\gamma$ in $\Gamma$.
   • If the CFG instruction is of the form $n_1 \xrightarrow{\text{pop } \beta} n_2$, it is converted to SM-DPN rules of the form $n_1 \beta \hookrightarrow n_2 \varepsilon$.
   • If the CFG instruction is of the form $n_1 \xrightarrow{\text{call } f} n_2$, it is converted to SM-DPN rules of the form $n_1 \gamma \hookrightarrow n_f n_2 \gamma$ for every $\gamma$ in $\Gamma$, where $n_f$ is the entry point of the procedure $f$.
   • If the CFG instruction is of the form $n_0 \xrightarrow{\text{call CreateProcess}} n_1$, where CreateProcess is a function that creates a new process, then it is converted to SM-DPN rules of the form $n_0 \gamma \hookrightarrow n_1 \gamma \rhd p_0 w_0$ for every $\gamma$ in $\Gamma$, where $p_0$ is the entry point of the created process and $w_0$ is its initial stack content.
   • If the CFG instruction is of the form $n_1 \xrightarrow{\text{mov } m,l} n_2$, where $m$ is an executable code address, then this is a self-modifying code instruction. It is converted to SM-DPN rules of the form $n_1 \overset{(r_1, r_2)}{\hookrightarrow} n_2$, where $r_1$ is the rule that corresponds to the instruction being modified by $n_1 \xrightarrow{\text{mov } m,l} n_2$, and $r_2$ is the rule corresponding to the new instruction that replaces it.
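The five cases above can be sketched as a small translation function. This is a hedged illustration under assumptions of our own: a CFG edge is a tuple (source, operation, argument, destination), the operation names and the rule tags "step", "spawn" and "modify" are hypothetical, and the oracle is assumed to have already supplied the procedure entry point, the created process data and the pair of rules for a self-modifying mov.

```python
def translate(edge, gamma):
    """Translate one CFG edge into a list of SM-DPN rules.

    `gamma` is the list of stack symbols. Stack words are tuples, top first.
    """
    src, op, arg, dst = edge
    if op == "push":            # n1 g -> n2 (arg g), for every g
        return [("step", src, g, dst, (arg, g)) for g in gamma]
    if op == "pop":             # n1 arg -> n2 (empty word)
        return [("step", src, arg, dst, ())]
    if op == "call":            # n1 g -> n_f (n2 g); arg is the entry point n_f
        return [("step", src, g, arg, (dst, g)) for g in gamma]
    if op == "create_process":  # n0 g -> n1 g, spawning p0 with stack w0
        p0, w0 = arg
        return [("spawn", src, g, dst, (g,), p0, w0) for g in gamma]
    if op == "mov_code":        # self-modifying mov: replace rule r1 by r2
        r1, r2 = arg
        return [("modify", src, r1, r2, dst)]
    raise ValueError(f"unsupported CFG operation: {op}")


gamma = ["a", "b"]
print(translate(("n1", "push", "x", "n2"), gamma))
print(translate(("n1", "call", "nf", "n2"), gamma))
```

The call case pushes the return address n2 below the entry point n_f, which is how the rule $n_1 \gamma \hookrightarrow n_f n_2 \gamma$ records where execution resumes after the procedure returns.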
6 SM-DPN BACKWARD REACHABILITY ANALYSIS
It has been shown in (Messahel and Touili, 2024) that if a regular set of global configurations is represented by a special kind of finite automaton, then it is possible to effectively compute a finite automaton that represents the set of all backward reachable configurations. We recall in this section the results of (Messahel and Touili, 2024) concerning the backward reachability analysis of SM-DPNs.
Let $\mathcal{M} = (P, \Gamma, \Delta, \Delta_c)$ be an SM-DPN. To finitely represent a regular infinite set of SM-DPN configurations, we use a special kind of automata: an $\mathcal{M}$-automaton (Messahel and Touili, 2024) is a tuple $A = (S, \Sigma, T, s_0, F)$ satisfying the following conditions:
1. $\Sigma = (P \times 2^{\Delta \cup \Delta_c}) \cup \Gamma$ is the automaton alphabet.
2. $S$ is a finite set of states partitioned into two subsets $S_c$ and $S_s$ such that $S = S_c \cup S_s$, $S_c \cap S_s = \emptyset$, and for every $s \in S_c$ and every $p^{\theta}$ such that $p \in P$, $\theta \subseteq \Delta \cup \Delta_c$, there is a unique state called $s_{p^{\theta}} \in S_s$.
3. There is a relation $T' \subseteq S_s \times \Gamma \times (S_s \setminus \{s_{p^{\theta}} : s \in S_c, p \in P, \theta \subseteq \Delta \cup \Delta_c\}) \;\cup\; S_s \times \{\varepsilon\} \times S_c$ such that $T = T' \cup \{(s, p^{\theta}, s_{p^{\theta}}) : s \in S_c, p \in P, \theta \subseteq \Delta \cup \Delta_c\}$.
4. $s_0 \in S_c$ is the initial state.
5. $F \subseteq S$ is the set of final states.
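Note that condition (3) implies the following properties:
• For each $p \in P$, $\theta \subseteq \Delta \cup \Delta_c$, $s \in S_c$, $s$ is the only predecessor of $s_{p^{\theta}}$.
• States $s$ in $S_c$ do not have $\Gamma$-transitions.
• Only $\varepsilon$-moves from states in $S_s$ lead to states in $S_c$.
• States $s$ in $S_s$ do not have $p$-successors, for $p \in P$.
For $\gamma \in \Gamma \cup \{\varepsilon\}$ and $s, s' \in S$, if $(s, \gamma, s') \in T$ we write $s \xrightarrow{\gamma}_T s'$. This notation extends in the obvious manner to sequences of symbols as follows: $\forall s \in S$, $s \xrightarrow{\varepsilon}_T s$, and $\forall s, s' \in S$, $\forall \gamma \in \Gamma \cup \{\varepsilon\}$, $\forall w \in \Gamma^*$, $s \xrightarrow{\gamma w}_T s'$ iff $\exists s'' \in S$ such that $s \xrightarrow{\gamma}_T s'' \xrightarrow{w}_T s'$. We omit the subscript $T$ when it is clear from the context.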
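Intuitively, the conditions above make sure that every path in the $\mathcal{M}$-automaton is a concatenation of paths of the form
$$s_0 \xrightarrow{p_0^{\theta_0}} {s_0}_{p_0^{\theta_0}} \xrightarrow{w_0} q_0 \xrightarrow{\varepsilon} s_1 \xrightarrow{p_1^{\theta_1}} {s_1}_{p_1^{\theta_1}} \xrightarrow{w_1} q_1 \cdots s_n \xrightarrow{p_n^{\theta_n}} {s_n}_{p_n^{\theta_n}} \xrightarrow{w_n} q_n$$
such that $s_0, s_1, \ldots, s_n \in S_c$; ${s_0}_{p_0^{\theta_0}}, {s_1}_{p_1^{\theta_1}}, \ldots, {s_n}_{p_n^{\theta_n}} \in S_s$; $q_0, q_1, \ldots, q_n \in S_s$; $p_0, p_1, \ldots, p_n \in P$; $\theta_0, \theta_1, \ldots, \theta_n \subseteq \Delta \cup \Delta_c$; and $w_0, w_1, \ldots, w_n \in \Gamma^*$.
A configuration $p_0^{\theta_0} w_0 \; p_1^{\theta_1} w_1 \cdots p_n^{\theta_n} w_n$ is accepted by an automaton $A$ if there exists a path of the form
$$s_0 \xrightarrow{p_0^{\theta_0}} {s_0}_{p_0^{\theta_0}} \xrightarrow{w_0} q_0 \xrightarrow{\varepsilon} s_1 \xrightarrow{p_1^{\theta_1}} {s_1}_{p_1^{\theta_1}} \xrightarrow{w_1} q_1 \cdots s_n \xrightarrow{p_n^{\theta_n}} {s_n}_{p_n^{\theta_n}} \xrightarrow{w_n} q_f$$
such that $q_f \in F$. We denote by $L(A)$ the set of configurations accepted by $A$.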
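The acceptance condition can be illustrated with a standard nondeterministic automaton simulation. The sketch below is a simplification made for illustration: it does not check the structural conditions (1) to (5), it assumes that stack symbols are strings, that a letter $p^{\theta}$ is encoded as the pair (p, theta), and that an $\varepsilon$-transition is labelled with None. All state names are hypothetical.

```python
def eps_closure(states, transitions):
    """All states reachable from `states` using only epsilon (None) moves."""
    seen = set(states)
    todo = list(states)
    while todo:
        s = todo.pop()
        for (src, label, dst) in transitions:
            if src == s and label is None and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return seen


def accepts(transitions, initial, finals, config):
    """Check whether the automaton accepts a global configuration."""
    # Flatten the configuration into a word: (p, theta) then the stack symbols.
    word = []
    for (p, theta, stack) in config:
        word.append((p, theta))
        word.extend(stack)
    current = eps_closure({initial}, transitions)
    for letter in word:
        step = {dst for (src, label, dst) in transitions
                if src in current and label == letter}
        current = eps_closure(step, transitions)
    return bool(current & finals)


theta = frozenset()
transitions = {
    ("s0", ("p", theta), "s0_p"), ("s0_p", "a", "q0"),
    ("q0", None, "s1"),
    ("s1", ("q", theta), "s1_q"), ("s1_q", "b", "q1"),
}
two_processes = (("p", theta, ("a",)), ("q", theta, ("b",)))
print(accepts(transitions, "s0", {"q1"}, two_processes))
```

The example automaton follows the path shape described above: a process letter, then the stack content, then an $\varepsilon$-move to the state where the next process starts.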
It was shown in (Messahel and Touili, 2024) that:
Theorem 6.1. Let $\mathcal{M} = (P, \Gamma, \Delta, \Delta_c)$ be an SM-DPN and $A = (S, \Sigma, T, s_0, F)$ be an $\mathcal{M}$-automaton. We can build a finite automaton $A_{pre^*} = (S, \Sigma, T', s_0, F)$ such that $L(A_{pre^*}) = pre^*_{\mathcal{M}}(L(A))$.
7 CASE STUDY: A REAL EXAMPLE
In order to observe the workflow of a binary analysis using SM-DPN modeling and reachability analysis, let us consider assembly code extracted from a binary file infected by a concurrent self-modifying variant of the BadRabbit ransomware. BadRabbit is a notorious malware that performs a drive-by attack to install itself through a fake Adobe Flash installer or update, encrypts all data, and uses the EternalRomance exploit to spread within the corporate network (Perekalin, 2017). The binary is obtained from the MalwareCollection GitHub repository (Enderman, 2022). We explain step by step the process of reverse engineering the binary "BadRabbit.exe", then modeling it with an SM-DPN, and finally applying the $Pre^*$ algorithm of (Messahel and Touili, 2024) to perform a reachability analysis where the malicious part starts at address $n_7$.
7.1 Reverse Engineering
We use radare2 (Studer et al., 2023) to disassemble the binary and retrieve its assembly code, shown in Listing 2, where $n_0, n_1, \ldots, n_i$ are memory addresses.
    Address   Assembly

    n_0       Push 0x10
    n_1       Mov ebp, 0x1ff
    n_2       Mov [n_12], 0xffd6       ; change instruction stored in n_12
    n_3       Nop
    n_4       Push eax
    n_5       Mov [var_1250h], 0
    n_6       jmp n_2
    n_7       LEA EDX, [0xfffff9e4]
    n_8       PUSH EDX
    n_9       MOV [local_12b0], 0x44
    n_10      CALL CreateProcessW      ; create new process
Listing 2: BadRabbit snippet ASM code.
7.2 Translating the Binary Code to an SM-DPN
To model the assembly code of the BadRabbit malware with an SM-DPN, we loop over the instructions and convert each instruction to SM-DPN rules as described below. To convert some instructions, we need to resolve the values of the registers they use. To this end, we set up three levels of precision:
1. Using the Radare2 (Studer et al., 2023) built-in analysis. This is the fastest method, but it can only resolve the IP register.
2. Using Angr (Shoshitaishvili et al., 2016) symbolic execution. This performs a symbolic execution without actually running the program and can track register values in most cases, but it is costly in terms of time and memory consumption because it is based on exploring all possible logical branches.
3. Running the program in an isolated environment with a debugger. This is the best method in terms of time and memory consumption, but it is tricky if the program depends on a variety of external inputs.
Following the rules that translate assembly code to an SM-DPN described in Section 5, we translate each instruction to SM-DPN rules as follows, where $p_0, p_1, \ldots, p_i$ represent the SM-DPN control points corresponding to the addresses $n_0, n_1, \ldots, n_i$, respectively, so that $p_0$ is the program control point at address $n_0$. Let $\Gamma$ be the set of all memory addresses and register values of the program. The instructions of the program are translated to SM-DPN rules as follows:
• push 0x10: since this is a push instruction, we move from control point $p_0$ to control point $p_1$ and push 0x10 onto the stack. Thus, this instruction is modelled by the SM-DPN rules $p_0 \gamma \hookrightarrow p_1\, \text{0x10}\, \gamma$, for every $\gamma \in \Gamma$.
• mov ebp, 0x1ff: a non self-modifying mov instruction does not change the content of the stack and moves the SM-DPN from control point $p_1$ to control point $p_2$, resulting in the rules $p_1 \gamma \hookrightarrow p_2 \gamma$, for every $\gamma \in \Gamma$.
• mov [n_12], 0xffd6: a self-modifying mov instruction does not change the content of the stack, moves the SM-DPN from control point $p_2$ to control point $p_3$ and replaces an instruction represented by a rule, say $r'$, by another instruction represented by a rule, say $r''$, resulting in the SM-DPN rule $p_2 \overset{(r', r'')}{\hookrightarrow} p_3$.
• Nop: this instruction moves the SM-DPN from control point $p_3$ to control point $p_4$, resulting in the rules $p_3 \gamma \hookrightarrow p_4 \gamma$, for every $\gamma \in \Gamma$.
• push eax: a push instruction moves from control point $p_4$ to control point $p_5$ and pushes the value of eax onto the stack. Thus, this instruction is translated to the SM-DPN rules $p_4 \gamma \hookrightarrow p_5\, \text{eax}\, \gamma$, for every $\gamma \in \Gamma$.
• mov [var_1250h], 0: a non self-modifying mov instruction does not change the content of the stack and moves the SM-DPN from control point $p_5$ to control point $p_6$, resulting in the SM-DPN rules $p_5 \gamma \hookrightarrow p_6 \gamma$, for every $\gamma \in \Gamma$.
• jmp n_2: the jmp instruction does not change the stack content and moves the SM-DPN from control point $p_6$ to control point $p_2$, resulting in the SM-DPN rules $p_6 \gamma \hookrightarrow p_2 \gamma$, for every $\gamma \in \Gamma$.
• push edx: a push instruction moves the SM-DPN from control point $p_8$ to control point $p_9$ and pushes edx onto the stack. Thus, this instruction is translated to the SM-DPN rules $p_8 \gamma \hookrightarrow p_9\, \text{edx}\, \gamma$, for every $\gamma \in \Gamma$.
• mov [local_12b0], 0x44: a non self-modifying mov instruction does not change the stack content and moves the SM-DPN from control point $p_9$ to control point $p_{10}$, resulting in the SM-DPN rules $p_9 \gamma \hookrightarrow p_{10} \gamma$, for every $\gamma \in \Gamma$.
• call CreateProcessW: a process creation instruction moves the SM-DPN from control point $p_{10}$ to control point $p_{11}$, while launching a new process. Suppose that the function CreateProcessW creates a new process having $s_0$ as entry control point and $w_0$ as stack content. Then this instruction is translated to the SM-DPN rules $p_{10} \gamma \hookrightarrow p_{11} \gamma \rhd s_0 w_0$, for every $\gamma \in \Gamma$.
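The rule set listed above can be written down programmatically. The following sketch is illustrative only and is not the tool used in the paper: the alphabet passed as gamma is a small hypothetical stand-in for $\Gamma$, the rule tags "step", "spawn" and "modify" are assumed names, and the strings "r'", "r''", "s0" and "w0" are placeholders for objects the text leaves abstract. It transcribes only the rules listed above.

```python
def build_rules(gamma):
    """Build the SM-DPN rules listed above for a given stack alphabet."""
    def keep(src, dst):
        # The stack is unchanged: src g -> dst g, for every g.
        return [("step", src, g, dst, (g,)) for g in gamma]

    def push(src, dst, value):
        # A value is pushed: src g -> dst (value g), for every g.
        return [("step", src, g, dst, (value, g)) for g in gamma]

    rules = []
    rules += push("p0", "p1", "0x10")                  # push 0x10
    rules += keep("p1", "p2")                          # mov ebp, 0x1ff
    rules.append(("modify", "p2", "r'", "r''", "p3"))  # mov [n_12], 0xffd6
    rules += keep("p3", "p4")                          # Nop
    rules += push("p4", "p5", "eax")                   # push eax
    rules += keep("p5", "p6")                          # mov [var_1250h], 0
    rules += keep("p6", "p2")                          # jmp n_2
    rules += push("p8", "p9", "edx")                   # push edx
    rules += keep("p9", "p10")                         # mov [local_12b0], 0x44
    # call CreateProcessW: continue at p11 and spawn a process at s0 with w0.
    rules += [("spawn", "p10", g, "p11", (g,), "s0", ("w0",)) for g in gamma]
    return rules


rules = build_rules(["g0", "g1"])
print(len(rules))
```

Every instruction except the self-modifying mov yields one rule per stack symbol, so the size of the rule set grows linearly with the size of $\Gamma$.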
7.3 The BadRabbit Reachability Analysis
Once we have the above SM-DPN model, we are ready to apply the backward reachability results of Theorem 6.1 to analyse the program. These techniques, which were implemented in a tool to analyze SM-DPNs, found that the program entry point ($n_0$) is in the $pre^*$ of the malicious entry point ($n_7$), which means that the program is infected.
8 CONCLUSION
In this paper, we tackle the analysis problem of multi-threaded parallel programs that contain self-modifying code, i.e., code that can modify itself at execution time. Malware heavily uses this kind of self-modifying code to obfuscate itself so that it cannot be detected by anti-virus software. To model such programs, we use a new model called Self-Modifying Dynamic Pushdown Network (SM-DPN). An SM-DPN is a network of Self-Modifying Pushdown Systems, i.e., Pushdown Systems that can modify their instructions on the fly during execution. To analyse self-modifying concurrent programs, we perform reachability analysis of SM-DPNs. We successfully apply our approach to represent and analyse multi-threaded self-modifying code infected with malware.
REFERENCES
Alglave, J., Kashyap, M., and Tofte, M. (2010). Compo-
sitional reasoning for shared-variable concurrent pro-
grams. ACM Transactions on Programming Lan-
guages and Systems (TOPLAS), 32(5):1–53.
Anckaert, B., Madou, M., and De Bosschere, K. (2007). A
model for self-modifying code. In Camenisch, J. L.,
Collberg, C. S., Johnson, N. F., and Sallee, P., editors,
Information Hiding, pages 232–248, Berlin, Heidel-
berg. Springer Berlin Heidelberg.
Arzt, S., Rasthofer, S., Fritz, C., Bodden, E., Bartel, A.,
Klein, J., Le Traon, Y., Octeau, D., and McDaniel, P.
(2014). Flowdroid: Precise context, flow, field, object-
sensitive and lifecycle-aware taint analysis for android
apps. ACM SIGPLAN Notices, 49(6):259–269.
Biondo, A., Conti, M., and Lain, D. (2018). Back to the
epilogue: Evading control flow guard via unaligned
targets. In NDSS.
Blazy, S., Laporte, V., and Pichardie, D. (2016). Verified ab-
stract interpretation techniques for disassembling low-
level self-modifying code. Journal of Automated Rea-
soning, 56:283–308.
Bruschi, D., Martignoni, L., and Monga, M. (2006). Detect-
ing self-mutating malware using control-flow graph
matching. In Detection of Intrusions and Malware
& Vulnerability Assessment: Third International Con-
ference, DIMVA 2006, Berlin, Germany, July 13-14,
2006. Proceedings 3, pages 129–143. Springer.
Chen, Y., Zhang, D., Wang, R., Qiao, R., Azab, A. M., Lu,
L., Vijayakumar, H., and Shen, W. (2017). Norax:
Enabling execute-only memory for cots binaries on
aarch64. In 2017 IEEE Symposium on Security and
Privacy (SP), pages 304–319. IEEE.
Dawei, S., Delong, L., and Zhibin, Y. (2018). Dynamic self-
modifying code detection based on backward analysis.
In Proceedings of the 2018 10th International Confer-
ence on Computer and Automation Engineering, IC-
CAE 2018, page 199–204, New York, NY, USA. As-
sociation for Computing Machinery.
Enderman (2022). Malwarecollection. https://github.com/
xcp3r/MalwareCollection.
Guizani, W., Marion, J.-Y., and Reynaud-Plantey, D.
(2009). Server-side dynamic code analysis. In 2009
4th International Conference on Malicious and Un-
wanted Software (MALWARE), pages 55–62.
Liu, Y., Xu, Z., Fan, M., Hao, Y., Chen, K., Chen, H., Cai,
Y., Yang, Z., and Liu, T. (2022). Concspectre: Be
aware of forthcoming malware hidden in concurrent
programs. IEEE Transactions on Reliability, 71:1–10.
Maisuradze, G., Petrenko, A. S., Bala, A., and Lie, D.
(2010). Threadsanitizer: finding data races in native
code. Proceedings of the ACM SIGPLAN Conference
on Programming Language Design and Implementa-
tion, pages 89–100.
Messahel, W. and Touili, T. (2024). Reachability analysis
of concurrent self-modifying code. In 28th Interna-
tional Conference on Engineering of Complex Com-
puter Systems (ICECCS).
Nethercote, N., Seward, J., and Seward, J. (2007). Valgrind:
A framework for heavyweight dynamic binary instru-
mentation. In Proceedings of the 2007 International
Symposium on Dynamic Languages, pages 89–100.
Perekalin, A. (2017). Bad rabbit: A new ransomware epi-
demic is on the rise. https://www.kaspersky.com/blog/
bad-rabbit-ransomware/19887/.
Schwartz, E. J., Cohen, C. F., Duggan, M., Gennari, J.,
Havrilla, J. S., and Hines, C. (2018). Using logic pro-
gramming to recover c++ classes and methods from
compiled executables. In Proceedings of the 2018
ACM SIGSAC Conference on Computer and Commu-
nications Security, pages 426–441.
Shoshitaishvili, Y., Wang, R., Salls, C., Stephens, N.,
Polino, M., Dutcher, A., Grosen, J., Feng, S., Hauser,
C., Kruegel, C., and Vigna, G. (2016). SoK: (State
of) The Art of War: Offensive Techniques in Binary
Analysis. In IEEE Symposium on Security and Pri-
vacy.
Studer, A., Abd El-MAwgood, A. M., and Akshay Krish-
nan, R. (2023). The official radare2 book. https:
//book.rada.re/credits/credits.html.
Touili, T. and Ye, X. (2017). Reachability analysis of
self modifying code. In 22nd International Confer-
ence on Engineering of Complex Computer Systems
(ICECCS).
Touili, T. and Ye, X. (2019). Ltl model checking of self
modifying code. In 2019 24th International Confer-
ence on Engineering of Complex Computer Systems
(ICECCS).
Ugarte-Pedrero, X., Balzarotti, D., Santos, I., and Bringas,
P. G. (2015). Sok: Deep packer inspection: A longitu-
dinal study of the complexity of run-time packers. In
2015 IEEE Symposium on Security and Privacy, pages
659–673.
Wang, R., Shoshitaishvili, Y., Bianchi, A., Machiry, A.,
Grosen, J., Grosen, P., Kruegel, C., and Vigna, G.
(2017). Ramblr: Making reassembly great again. In
NDSS.
Wu, W., Chen, Y., Xing, X., and Zou, W. (2019). Kepler:
Facilitating control-flow hijacking primitive evalua-
tion for linux kernel vulnerabilities. In USENIX Se-
curity Symposium, pages 1187–1204.
Zhang, X., Zhang, Y., Mo, Q., Xia, H., Yang, Z., Yang, M.,
Wang, X., Lu, L., and Duan, H. (2018). An empirical
study of web resource manipulation in real-world
mobile applications. In 27th USENIX Security Symposium
(USENIX Security 18), pages 1183–1198.