Precise Detection of Security Checks in Program Binaries

Koyel Pramanick and Prasad A. Kulkarni

Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas, U.S.A.

Keywords:

Program Slicing, Security Checks, Program Binary.

Abstract:

Security checks are added to protect vulnerable code constructs, including certain indirect jumps and mem-

ory references, from external attacks. Detecting the presence of security checks that guard vulnerable code

constructs provides an important means to evaluate the security properties of given binary software. Previous

research has attempted to ﬁnd such security checks guarding potential vulnerable codes in software binaries.

Unfortunately, these techniques do not attempt to separate the original program code from the security check

code, leading to many false positives. The security check patterns detected by such techniques are also inaccu-

rate as they may be interspersed with program instructions. In this work, we develop a novel program slicing

based technique to partition the original program code from any non-program instructions, including the added

security checks. We deﬁne program code as instructions in the binary software that are needed to compute

the original and expected program outputs. Our technique can more accurately identify the embedded security

checks in program binaries with fewer false positives. Our technique can also ﬁnd more precise security check

code patterns in the given binary. Overall, our work can enable tools and humans to more effectively perform

independent security evaluations of binary software.

1 INTRODUCTION

Software products, commercial or open-source, are

typically distributed without any acknowledgment

or comment about their safety and security prop-

erties. This unfortunate condition persists even as

the number and severity of reported software vulnerabilities

have been increasing for many

years (Database, 2021) and software vulnerabilities

have been found to cause many disastrous real-world

attacks (Cybersecurity and Agency, 2021; Wired,

2017). Software is also often distributed in its binary

format, which makes it even harder to independently

study its security properties. We believe that an ability

to independently conduct a thorough security assess-

ment of binary code is important for the proliferation,

deployment, and use of software that can ensure the

safety and privacy of user systems and data.

While currently there is no known approach or

framework to measure software security, researchers

have developed client-side techniques to detect weak-

nesses and vulnerabilities in binary code (Qasem

et al., 2021; Brooks, 2018). Vulnerabilities are pro-

gramming bugs that can be exploited to compro-

mise user software and systems. Vulnerability de-

tection can be done using static analysis based tech-

niques (Eschweiler et al., 2016; Gao et al., 2008),

symbolic execution (Cadar et al., 2008; Cha et al.,

2012) or by dynamic techniques, like fuzzing (Ucci

et al., 2019; Abijah Roseline and Geetha, 2021). Un-

fortunately, none of the existing vulnerability detec-

tion techniques can ensure the detection and elimina-

tion of all program vulnerabilities for binary code.

Researchers have also developed mechanisms to

extract intrinsic development-time properties of the

software from the distributed binary code. Such tech-

niques could be used to evaluate software security

with the hypothesis that well-written software may

be more resistant to attacks. Speciﬁcally, software

that is written using safe high-level languages, us-

ing secure coding standards and software engineering

principles (Howard and Lipner, 2006), and hardened

with appropriate build-time compiler ﬂags (OpenSSF,

2024) may be more protected against external attacks.

For instance, a machine learning based technique was

devised to determine the high-level source programming language used for coding any given binary software (Adhikari and Kulkarni, 2022). Such information is a useful metric to assess a software’s security properties, as the lack of memory safety in low-level languages, like C/C++, is known to cause numerous memory corruption errors (NIST, 2022), software vulnerabilities (Szekeres et al., 2013) and software attacks (Wheeler, 2014; CVE, 2019).

Pramanick, K. and Kulkarni, P. A.

Precise Detection of Security Checks in Program Binaries.

DOI: 10.5220/0013366000003899

In Proceedings of the 11th International Conference on Information Systems Security and Privacy (ICISSP 2025) - Volume 2, pages 397-408

ISBN: 978-989-758-735-1; ISSN: 2184-4356

Likewise, researchers recently proposed a new

technique to detect the presence of run-time security

checks in software binaries (Pramanick and Kulka-

rni, 2022). Such checks to guard vulnerable code

constructs may be inserted by developers during cod-

ing or by tools like compilers during code genera-

tion. Run-time security checks are especially impor-

tant as they can ensure software security even when

any existing vulnerabilities are exploited, and there-

fore present an attractive approach to evaluate intrin-

sic software security. Thus, accurately detecting the

checks guarding vulnerable code constructs can pro-

vide a crucial indicator of software security.

Unfortunately, we found certain limitations in

the implementation of this earlier technique that can

cause many false positives in certain situations. A

high false positive rate in this technique makes it hard

to ascertain if the targeted vulnerable code fragments

in a binary are adequately protected by run-time security checks, and may affect the usefulness of the technique. The earlier technique can be easily misled when compiler optimizations or obfuscation techniques intermix the otherwise uniform security check code with unrelated program instructions. In such cases, it is hard for this technique to correctly identify the actual security check code present in the program binary.

In this work, we propose a program slicing based

approach to resolve this limitation. Program slic-

ing is a common compiler technique that computes

the subset of program instructions needed to affect

the values at some point of interest, speciﬁed by the

user (Weiser, 1981). This program subset is called

the program slice. In this work, we use slicing to

partition the binary code into two sets, program and

non-program instructions. The program slice only in-

cludes instructions that are necessary to compute the

original program state, as written by the software de-

veloper. The added security checks will be part of the

non-program instructions.

To identify the program instructions in every func-

tion, we suppose that program instructions are those

that compute state that escapes from the function.

We further suppose that state or values generated in

a function can escape or leave the function through,

(a) the return value, (b) arguments leaving via call

instructions in the function, (c) writes to non-stack

(global and heap) memory, and (d) writes to argu-

ments passed by reference. We develop an approach

that performs slicing over this set of instructions to

identify all other program instructions that compute

and facilitate state to escape from the function.

The integration of this program slicing based ex-

tension into the original technique enables it to sig-

niﬁcantly reduce false positives, and prune and more

accurately identify the actual security check code that

may be added to the program binary. Thus, we make

the following contributions in this work.

• We illustrate the problems in the earlier tech-

nique used to identify and detect runtime security

checks in program binaries.

• We develop a novel program slicing based ap-

proach to partition the binary code into program

and non-program instructions, and use this parti-

tion to more precisely detect the inserted security

checks, and

• We implement our technique in a state-of-the-art

binary analyzer, and conduct a thorough evalua-

tion of its properties and performance.

In the remainder of the paper, we present rele-

vant details regarding the original technique to de-

tect security checks in program binaries in Section 2.

We present our novel slicing-based algorithm to partition

the program into program and non-program code

in Section 3. We present our experimental framework

in Section 4. We explain our results in Section 5. Fi-

nally, we present our conclusions in Section 6.

2 BACKGROUND

In this work we build on the security-check detection framework proposed by Pramanick and Kulkarni (2022). This earlier framework provides a novel

theory and robust implementation to detect security

checks in unknown program binaries. However, this

framework still suffers from a high false positive rate

in some cases, and is unable to detect the precise se-

curity check code inserted into the given binary. In

this section we explain the insights used in this earlier work and describe their basic technique to provide a foundation for understanding the extension that we propose and implement in this work.

2.1 Insights Used

The prior work posits that security checks in binary

code are positioned near the speciﬁc code constructs

they aim to safeguard, which aids in their detection.

It further observes that these security checks typically

follow a consistent pattern: code that inspects a spe-

ciﬁc aspect of the program is followed by a pass/fail

decision. A pass decision allows the program to con-

tinue executing the protected construct, while a fail

decision invokes an exception routine.

ICISSP 2025 - 11th International Conference on Information Systems Security and Privacy


The framework validates the extracted snippets by

conﬁrming that they perform operations on the mem-

ory address or code construct they aim to protect.

To generalize the detection across diverse implemen-

tations, it employs a longest common subsequence

algorithm to identify recurring patterns among vali-

dated security check instances. The authors ﬁnd that

binaries with compiler-inserted security checks often

exhibit a small and consistent set of patterns. How-

ever, the recognition of these patterns is hindered by

noise introduced by program instructions surrounding

the security checks, limiting the framework’s ability

to extract precise patterns.
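To make the pattern-extraction idea concrete, the following is a minimal sketch of a longest common subsequence computed over instruction mnemonics. The snippets, the function name `lcs`, and the pairwise reduction over several snippets are illustrative assumptions and are not taken from the original framework; folding a pairwise LCS across many sequences is only a heuristic for a true multi-sequence LCS.

```python
from functools import reduce


def lcs(a, b):
    """Return one longest common subsequence of the sequences a and b."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the top-left corner to recover the subsequence.
    out, i, j = [], 0, 0
    while i < rows and j < cols:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


# Hypothetical snippets: the same check pattern with program "noise" mixed in.
snippet_1 = ["MOV", "ADD", "MOV", "CMP", "JNZ", "RET"]
snippet_2 = ["MOV", "CMP", "LEA", "JNZ", "POP", "RET"]
snippet_3 = ["XOR", "MOV", "CMP", "JNZ", "RET"]

common = reduce(lcs, [snippet_1, snippet_2, snippet_3])
print(common)
```

In this toy input the shared pattern survives, but every surrounding program instruction shortens or perturbs the subsequence, which is the noise problem described above.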

2.2 Methodology

The prior work proposed a method to detect compiler-

inserted security checks in binary code, address-

ing the limitations of signature-based approaches.

Signature-based methods rely on manually identi-

fying speciﬁc instruction patterns, which is labor-

intensive and fails to generalize across compilers, pro-

gramming languages, and types of security mecha-

nisms. This section summarizes their methodology.

Figure 1 provides an overview of the methodol-

ogy, illustrating its key steps. The framework be-

gins with the static analysis of binary code using

Ghidra (National Security Agency, 2019). This anal-

ysis identiﬁes "interesting code snippets" associated

with potential security checks, such as stack canary

instructions for mitigating stack overﬂows, indirect

branches for Control-Flow Integrity (CFI) validation,

and memory references ﬂagged by AddressSanitizer

for detecting memory errors. These snippets are ex-

tracted as candidates for further analysis.

Next, the framework validates the snippets to en-

sure they correspond to security checks. This valida-

tion is guided by the hypothesis that security check

instructions operate on or verify values derived from

vulnerable memory addresses. By examining the re-

lationship between these instructions and the relevant

memory addresses, only those snippets meeting the

validation criteria are retained.
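The validation hypothesis can be sketched as a simple forward value-tracking pass: a snippet passes if some compare instruction reads a value derived from the protected operand. The `Instr` record, the operand strings, and the function `is_valid_snippet` below are hypothetical simplifications for illustration; the original framework's validation logic is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    mnemonic: str
    dest: str      # operand written by the instruction ("" if none)
    srcs: tuple    # operands read by the instruction


def is_valid_snippet(snippet, protected_operand):
    """True if a compare in the snippet reads a value derived from protected_operand."""
    derived = {protected_operand}
    for ins in snippet:
        reads_derived = any(src in derived for src in ins.srcs)
        if ins.mnemonic in ("CMP", "TEST") and reads_derived:
            return True
        if ins.dest:
            if reads_derived:
                derived.add(ins.dest)       # value flows into the destination
            else:
                derived.discard(ins.dest)   # destination overwritten by an unrelated value
    return False


# Hypothetical stack-canary style snippet: the saved value is reloaded and compared.
canary = [
    Instr("MOV", "RAX", ("[RSP+0x28]",)),
    Instr("CMP", "", ("RAX", "FS:[0x28]")),
    Instr("JNZ", "", ()),
]
# Hypothetical snippet whose compare is unrelated to the protected operand.
unrelated = [
    Instr("MOV", "RAX", ("[RBX]",)),
    Instr("CMP", "", ("RCX", "0x10")),
    Instr("JNZ", "", ()),
]

print(is_valid_snippet(canary, "[RSP+0x28]"), is_valid_snippet(unrelated, "[RSP+0x28]"))
```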

The validated snippets are then normalized to ad-

dress minor structural variations while preserving es-

sential contextual information. These normalized

snippets are analyzed to detect recurring instruction

patterns indicative of security checks. By grouping

these patterns into equivalence classes, the framework

identiﬁes common sequences across diverse imple-

mentations, even when compiler optimizations intro-

duce variations.
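As an illustration of normalization and grouping, the sketch below replaces register names and immediates with placeholders and counts identical normalized sequences. The regular expressions, the placeholder names `REG` and `IMM`, and the sample snippets are assumptions made for this example; the original framework preserves more contextual information than this simplification does.

```python
import re
from collections import Counter

# General-purpose x86-64 registers (a deliberately incomplete, illustrative set).
REGISTER = re.compile(
    r"\b(?:R|E)?(?:AX|BX|CX|DX|SI|DI|SP|BP)\b|\bR(?:[89]|1[0-5])[DWB]?\b"
)
# Hexadecimal or decimal immediates (input is upper-cased first, hence "0X").
IMMEDIATE = re.compile(r"\b0X[0-9A-F]+\b|\b\d+\b")


def normalize(instruction):
    """Replace concrete registers and immediates with placeholders."""
    text = instruction.upper()
    text = REGISTER.sub("REG", text)
    return IMMEDIATE.sub("IMM", text)


# Hypothetical validated snippets; the first two differ only in registers and offsets.
snippets = [
    ["mov rax, [rsp+0x28]", "cmp rax, fs:[0x28]", "jnz 0x401200"],
    ["mov rcx, [rsp+0x48]", "cmp rcx, fs:[0x28]", "jnz 0x401a10"],
    ["mov rdx, [rbx]", "add rdx, 1"],
]

# Snippets with the same normalized sequence fall into the same equivalence class.
classes = Counter(tuple(normalize(ins) for ins in snip) for snip in snippets)
pattern, count = classes.most_common(1)[0]
print(count, pattern)
```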

3 ELIMINATE PROGRAM INSTRUCTIONS USING SLICING

While the earlier methodology explained in the pre-

ceding section provides a robust approach, it is lim-

ited by the presence of noise from program instruc-

tions surrounding the security checks, which hinders

the recognition of precise patterns. To address this

limitation, we introduce a slicing step as an extension

to the framework. This slicing step systematically re-

moves instructions that contribute to the program’s

primary functionality, isolating the security check in-

structions. By reducing noise, the slicing approach

enhances the framework’s ability to identify and an-

alyze security check patterns. Details of the slicing

process, its implementation, and its impact are ex-

plained in this section.

Program slicing is a widely used technique in

software analysis that isolates portions of code rel-

evant to a speciﬁc computation or aspect of inter-

est. It is particularly useful in debugging, testing,

and program comprehension. Techniques such as

static slicing, which analyzes the program without ex-

ecuting it, and dynamic slicing, which considers spe-

ciﬁc program executions, are well-established in the

ﬁeld (De Lucia, 2001). Extensions to traditional slic-

ing methods, such as symbolic slicing for enhanced

efﬁciency (Zhang, 2019) or handling constructs like

unconditional jumps (Galindo et al., 2022), have ex-

panded its applicability. These methods typically rely

on data dependency and control ﬂow analysis to ex-

tract meaningful slices.

In this work we employ a comprehensive slicing

step to improve the detection of security checks in bi-

nary code. Our technique focuses on isolating secu-

rity check instructions from the surrounding program

logic in disassembled binary code. By systematically

removing instructions related to the program’s pri-

mary functionality, our slicing approach reduces noise

in the analysis, enabling more precise validation and

pattern recognition of security checks.

Even when a security check is present, the code

snippet extracted by the earlier technique may contain

additional program instructions. We deﬁne “program

instructions” as those essential for generating the ex-

pected program output or results. The inclusion of

program instructions in extracted snippets introduces

noise, making it difﬁcult to identify common patterns

for the security checks across multiple snippets in the

binary. Our goal is to remove program instructions

from the code snippet. Ideally, this process will leave

us with a snippet that contains only the security check

instructions when the check is present, and an empty

snippet when no security check is inserted.


Figure 1: Strategy for identifying security checks inserted by the compiler in binary code.

Figure 2: Code Snippet to illustrate slicing.

To identify program instructions in every function,

we suppose that these instructions compute state that

escapes from the function. We further suppose that

state or values generated in a function can escape or

leave the function through, (a) the return value, (b) ar-

guments leaving via call instructions in the function,

and (c) writes to non-stack (global and heap) memory

and to arguments passed by reference. Then, we de-

velop an approach that performs slicing over this set

of instructions to identify all other program instruc-

tions that compute and facilitate state to escape from

the function. A program slice represents a semanti-

cally meaningful subset of computations within a pro-

gram. As a decomposition technique, slicing isolates

the speciﬁc computation of interest by removing un-

related program components, thereby improving the

relevance and focus of the remaining code.

We explain our slicing based technique to identify

program instructions in this section. We extend the

Ghidra reverse engineering framework (National Se-

curity Agency ghidra, 2019) to implement our algo-

rithm. Algorithm 1 describes the steps of our slicing

algorithm. We use the code example in Figure 2 to

illustrate the steps of our slicing algorithm.

3.1 Slicing Algorithm: Step 1

In this initial step, we employ Ghidra’s slicing API to

perform slicing based on the three categories of instructions identified earlier. Thus, we slice on the arguments of the return instructions, the call instructions, and the store instructions that write to memory outside the current function’s stack. We develop additional heuristics to address certain challenging conditions. Firstly, even security checks use function calls in exceptional situations, such as to display an error message and exit the program when the check fails. Slicing based on these function calls may incorrectly classify security check instructions as program instructions. Therefore, we employ Ghidra’s API to detect such non-returning functions and avoid slicing on any calls to non-returning functions. Secondly, only store instructions that write to memory beyond the

current function’s stack space can lead to computa-

tion escaping the current function. We identify such

store instructions by assuming that all memory refer-

ences to the current stack are made using offsets from

the stack/frame pointer. The remaining store instruc-

tions serve as starting points for slicing. While these

heuristics are effective, they may introduce inaccura-

cies in categorizing sets of program and non-program

instructions. For the example program in Figure 2, the

assembly lines shown in red indicate the instructions

detected as program instructions by this step.
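The two heuristics of this step can be sketched as a filter that decides which instructions serve as starting points for slicing. The `Instr` record, the operand-string format, and the treatment of a `MOV` with a memory destination as a store are assumptions made for illustration; the `non_returning` set is a stand-in for the non-returning function information that the paper obtains from Ghidra.

```python
import re
from dataclasses import dataclass

# Memory operand addressed relative to the stack or frame pointer,
# e.g. [RSP+0x8] or [RBP-0x10].
STACK_OPERAND = re.compile(r"\[\s*R[SB]P\b[^\]]*\]", re.IGNORECASE)


@dataclass(frozen=True)
class Instr:
    mnemonic: str          # e.g. "CALL", "RET", "MOV"
    operands: tuple = ()   # operand strings; operands[0] is the destination
    target: str = ""       # callee name for CALL instructions


def is_store(ins):
    """Treat a MOV whose destination is a memory operand as a store."""
    return ins.mnemonic == "MOV" and bool(ins.operands) and ins.operands[0].startswith("[")


def is_slicing_seed(ins, non_returning):
    """Decide whether slicing should start from this instruction."""
    if ins.mnemonic == "RET":
        return True
    if ins.mnemonic == "CALL":
        # Calls to non-returning (exception) functions are skipped.
        return ins.target not in non_returning
    if is_store(ins):
        # Only stores outside the current stack frame let state escape.
        return STACK_OPERAND.search(ins.operands[0]) is None
    return False


code = [
    Instr("MOV", ("[RSP+0x8]", "RCX")),        # stack store: not a seed
    Instr("MOV", ("[RDI]", "RAX")),            # store through a pointer: seed
    Instr("CALL", (), "printf"),               # ordinary call: seed
    Instr("CALL", (), "__stack_chk_fail"),     # non-returning call: skipped
    Instr("RET"),                              # return: seed
]
seeds = [is_slicing_seed(ins, {"__stack_chk_fail"}) for ins in code]
print(seeds)
```

As the text notes, the stack/frame-pointer assumption is a heuristic: a stack address copied into another register would be misclassified by this filter as a non-stack store.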

3.2 Slicing Optimization 1: Step 2

Ghidra’s slicing algorithm is implemented in their

decompiler framework and works on a machine-

independent representation called P-Codes. The P-

Code representation of a binary in Ghidra is higher-

level than the disassembly representation, and elimi-

nates binary-level instructions that perform such tasks

as managing the calling conventions during function

entry/exit and calls. We consider such instructions

as part of the program instructions. In this step we identify, and thereby eliminate from the code snippets, instructions that are typically used by the calling convention to transfer function arguments into registers. Additionally, we remove instructions related to stack management, such as the PUSH and POP instructions at the start of the function, and the function epilogue just before the RETURN statement, which are responsible for allocating and releasing stack space. For the example program in Figure 2, the italicized assembly lines shown in pink indicate the additional instructions detected as program instructions by this step in the algorithm.

Footnote (non-returning functions): Although most exception functions employed by security checks are non-returning, certain exception functions utilized by the AddressSanitizer check do not exit directly. Instead, they invoke other functions that, in turn, contain the exit statement. These indirect non-returning functions pose a challenge for automatic detection by Ghidra. While we continue to refine our automatic detection approach for such functions, for the purposes of this study we manually classified them as non-returning.
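A rough sketch of this step is shown below. It marks stack-management instructions and the register writes that set up arguments immediately before a call. The tuple representation, the function name `step2_additions`, and the restriction to the System V AMD64 integer argument registers are assumptions for illustration; unlike the description above, this simplification marks every PUSH and POP rather than only those in the function prologue and epilogue.

```python
# System V AMD64 integer argument registers and their common 32-bit aliases.
ARG_REGS = {"RDI", "RSI", "RDX", "RCX", "R8", "R9",
            "EDI", "ESI", "EDX", "ECX", "R8D", "R9D"}


def step2_additions(instructions):
    """Return indices of instructions that look like calling-convention
    or stack-management code. Each instruction is (mnemonic, operands)."""
    found = set()
    for idx, (mnemonic, operands) in enumerate(instructions):
        dest = operands[0] if operands else ""
        if mnemonic in ("PUSH", "POP", "LEAVE"):
            found.add(idx)                       # save/restore registers
        elif mnemonic in ("SUB", "ADD") and dest == "RSP":
            found.add(idx)                       # allocate/release stack space
        elif mnemonic == "MOV" and set(operands) == {"RBP", "RSP"}:
            found.add(idx)                       # set up/tear down the frame pointer
        elif mnemonic == "CALL":
            # Walk back over consecutive writes to argument registers.
            back = idx - 1
            while (back >= 0
                   and instructions[back][0] in ("MOV", "LEA", "XOR")
                   and instructions[back][1][0] in ARG_REGS):
                found.add(back)
                back -= 1
    return found


func = [
    ("PUSH", ("RBP",)),              # 0
    ("MOV", ("RBP", "RSP")),         # 1
    ("SUB", ("RSP", "0x20")),        # 2
    ("MOV", ("RAX", "[RBP-0x8]")),   # 3
    ("MOV", ("RSI", "RAX")),         # 4
    ("LEA", ("RDI", "[0x402000]")),  # 5
    ("CALL", ("printf",)),           # 6
    ("ADD", ("RSP", "0x20")),        # 7
    ("POP", ("RBP",)),               # 8
    ("RET", ()),                     # 9
]
print(sorted(step2_additions(func)))
```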

3.3 Slicing Optimization 2: Step 3

Ghidra’s slicing algorithm, even after our Step-2,

leaves a substantial number of assembly instructions

in the program as unclassiﬁed. In this step, we im-

plement a simple iterative data-ﬂow algorithm to ﬁnd

such additional program instructions that have a de-

pendency on the instructions already in the current

program slice. Thus, within the slice list, we track

the source registers of each instruction and append to

the list the most recent instructions within the block

or its source block that update these registers. This

approach enables us to ﬁnd a larger portion of the

program’s instructions that were missed by the original slicing algorithm. We also handle some branch instructions in this step. If every successor basic block of a conditional or unconditional jump contains at least one instruction that is part of the slice, then we add the corresponding compare-branch instructions to the program slice. If any successor block contains no instructions that are part of the program slice, then those compare-branch instructions are not added to the slice.
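The register-tracking rule described above can be sketched as a fixed-point iteration over a straight-line instruction list. The `Instr` record with explicit `defs` and `uses` sets, the toy function, and the modeling of `RET` as a use of `RAX` are assumptions for illustration; the branch rule and the block/predecessor-block structure are omitted from this simplified version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    addr: int
    text: str
    defs: frozenset = frozenset()   # registers written
    uses: frozenset = frozenset()   # registers read


def extend_slice(instructions, slice_addrs):
    """Iteratively add the most recent definition of every register
    used by an instruction that is already in the slice."""
    in_slice = set(slice_addrs)
    changed = True
    while changed:
        changed = False
        for idx, ins in enumerate(instructions):
            if ins.addr not in in_slice:
                continue
            for reg in ins.uses:
                # Scan backwards for the most recent writer of reg.
                for prev in reversed(instructions[:idx]):
                    if reg in prev.defs:
                        if prev.addr not in in_slice:
                            in_slice.add(prev.addr)
                            changed = True
                        break
    return in_slice


R = frozenset
toy = [
    Instr(0x00, "MOV RCX, FS:[0x28]", R({"RCX"})),
    Instr(0x04, "MOV [RSP+0x8], RCX", R(), R({"RCX", "RSP"})),
    Instr(0x08, "MOV RDX, RDI", R({"RDX"}), R({"RDI"})),
    Instr(0x0C, "ADD RDX, RSI", R({"RDX"}), R({"RDX", "RSI"})),
    Instr(0x10, "MOV RAX, RDX", R({"RAX"}), R({"RDX"})),
    Instr(0x14, "MOV RCX, FS:[0x28]", R({"RCX"})),
    Instr(0x18, "CMP RCX, [RSP+0x8]", R(), R({"RCX", "RSP"})),
    Instr(0x1C, "JNZ fail"),
    Instr(0x20, "RET", R(), R({"RAX"})),   # return value modeled as a use of RAX
]

program = extend_slice(toy, {0x20})
remaining = [ins.text for ins in toy if ins.addr not in program]
print(remaining)
```

In this toy function, the instructions left after removing the slice are exactly the canary load, store, reload, compare, and branch, which is the ideal outcome described for a snippet that contains a security check.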

For the example program in Figure 2, the assem-

bly lines shown in blue indicate the additional instruc-

tions detected as program instructions by this step

in the algorithm. At the end of this step, we elimi-

nate the program instructions from each code snippet.

This step produces a smaller and more reﬁned code

snippet. Even after completing all the algorithmic

steps, certain program instructions persist due to inac-

curacies in the slicing process performed by Ghidra.

These remnants are essentially treated as false posi-

tives. The instructions that remain may also suggest

the existence of another security check added either

by the compiler or the developer.

Following the slicing process, the snippets are nor-

malized to address minor variations in their struc-

ture while preserving essential contextual informa-

tion. These reﬁned and normalized snippets are then

analyzed to identify recurring instruction patterns in-

dicative of security checks. By grouping these pat-

terns into equivalence classes, the framework cap-

tures dominant sequences, even when compiler op-

timizations introduce variations in instruction place-

ment. This ﬁnal step conﬁrms the presence of security

checks in the binary code.

Input: function, Function

instructionList ← getHighFunctionInstructions(Function);

Let STORE → Store instruction not on stack;
Let backslice → Set of backward slices from GHIDRA;
Let step1 → Set of program instructions part of the slice after GHIDRA’s slicing;
Let step2 → Set of program instructions part of the slice after adding instructions part of the calling convention;
Let step3 → Set of program instructions part of the slice after adding instructions following the control flow and data flow within the program;

foreach instruction ∈ instructionList do
    mnemonic ← getMnemonicString(instruction);
    if mnemonic is "CALL"/"RETURN"/STORE then
        argList ← getArguments(instruction);
        foreach arg ∈ argList do
            bws ← getBackwardSlice(arg);
            if bws is not NULL then
                backslice ← bws;
            end
        end
    end
end
step1 ← (instructionList − backslice);

foreach instruction ∈ step1 do
    mnemonic ← getMnemonicString(instruction);
    if mnemonic is "CALL" then
        step1 ← instructions updating registers used to set up function arguments;
    end
    if mnemonic is "RETURN" then
        step1 ← instructions updating the RSP and RAX registers;
    end
end
step2 ← (step1);

foreach instruction ∈ instructionList do
    mnemonic ← getMnemonicString(instruction);
    if mnemonic startswith "J" then
        srcblk ← getSourceBlock(instruction);
        predblk ← getPredecessorBlock(srcblk);
        if getInstructionsIn(predblk) in step2 then
            step2 ← instruction;
        end
    end
end

foreach instruction ∈ instructionList do
    oper ← getOperandsAt(instruction);
    srcblk ← getSourceBlock(instruction);
    predblk ← getPredecessorBlock(srcblk);
    if oper is set in predblk then
        step2 ← instruction "set"ing oper;
    else if oper is set in srcblk then
        step2 ← instruction "set"ing oper;
    end
end
step3 ← (step2);

return step3;

Algorithm 1: Elimination of program instructions by slicing.

Precise Detection of Security Checks in Program Binaries

401

4 EXPERIMENTAL FRAMEWORK

We designed a controlled experimental setup to eval-

uate and study the impact of our slicing-based algo-

rithm for more accurately detecting security checks

in software binaries. This setup employs two bench-

mark conﬁgurations:

(a) Set-A. All benchmarks in this conﬁguration

are compiled with one security check explicitly and

intentionally enabled.

(b) Set-B. Benchmarks in this conﬁguration are

compiled with all security check ﬂags disabled.

Programs in the Set-A conﬁguration are further

classiﬁed according to the security check that is en-

abled into three classes: Set-A-Stackguard, Set-A-

CFI, and Set-A-AddressSanitizer, where only the

compiler ﬂags to enable the Stackguard (Kuznetsov

et al., 2014), control-ﬂow-integrity (Tice et al., 2014)

and address-sanitizer (Serebryany et al., 2012) secu-

rity checks are turned ON, respectively. Note that

even when we disable the security ﬂags in the Set-B

conﬁguration, the compiler may still add some impor-

tant checks by default and the binary can still contain

security check instructions added by the programmer.

We design experiments to conﬁrm our hypotheses re-

garding compiler-inserted security checks, and study

if our approach can correctly and consistently identify

the security checks in Set-A, while not producing any

false positives for Set-B benchmarks.

Table 1 enumerates the conﬁgurations used for

Clang/LLVM tool-chain (Sarda and Pandey, 2015)

and the GCC compiler to generate our Set-A bina-

ries with different security checks. The ﬁrst column

in the table lists the three checks that we employ for

the evaluation of our approach in this work. The last

column displays the ﬂags used to enable or disable

the respective security checks. All benchmarks were

compiled with optimizations (-O2) enabled.

We utilize twelve C/C++ programs sourced from

the SPEC CPU2006 benchmark suite for our experiments (Henning, 2006). The binaries are produced

through the compilation of C/C++ programs using

designated Clang/LLVM and GCC compiler ﬂags tai-

lored for the x86-64-Linux platform. We believe that

the selected benchmarks, compilers and conﬁguration

options provide a sufﬁcient platform to validate and

analyze our hypotheses. Moreover, we expect that the

insights and observations from this study will apply

broadly to other programs, and compiler and hard-

ware conﬁgurations. We developed scripts in Python

to signiﬁcantly extend Ghidra’s (National Security

Agency ghidra, 2019) slicing functionality and to col-

lect information from the binaries.

Table 1: The configuration used for enabling/disabling the respective security check in CLANG and GCC.

| Security check | Compiler | Check state | Flags |
|---|---|---|---|
| Stackguard | CLANG | ON | -fstack-protector-all |
| Stackguard | CLANG | OFF | -fno-stack-protector |
| Stackguard | GCC | ON | -fstack-protector-all |
| Stackguard | GCC | OFF | -fno-stack-protector |
| Adsan | CLANG | ON | -fsanitize=address -fno-omit-frame-pointer |
| Adsan | CLANG | OFF | -fno-omit-frame-pointer |
| Adsan | GCC | ON | -fsanitize=address -fno-omit-frame-pointer |
| Adsan | GCC | OFF | -fno-omit-frame-pointer |
| CFI | CLANG | ON | -flto -fsanitize=cfi -fvisibility=default |
| CFI | CLANG | OFF | -flto |
| CFI | GCC | ON | -fcf-protection=full -fno-sanitize=all |
| CFI | GCC | OFF | -fno-sanitize=all |
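The configurations in Table 1 can be captured as a small lookup that assembles a compiler command line. This is a sketch only: the driver names `clang` and `gcc`, the file names, and the helper `compile_command` are assumptions for illustration (C++ benchmarks would need the corresponding C++ drivers), while the flags, written here with their leading hyphens, and the `-O2` level come from the text.

```python
# (check, compiler, enabled) -> flags, following Table 1.
FLAGS = {
    ("Stackguard", "clang", True): ["-fstack-protector-all"],
    ("Stackguard", "clang", False): ["-fno-stack-protector"],
    ("Stackguard", "gcc", True): ["-fstack-protector-all"],
    ("Stackguard", "gcc", False): ["-fno-stack-protector"],
    ("Adsan", "clang", True): ["-fsanitize=address", "-fno-omit-frame-pointer"],
    ("Adsan", "clang", False): ["-fno-omit-frame-pointer"],
    ("Adsan", "gcc", True): ["-fsanitize=address", "-fno-omit-frame-pointer"],
    ("Adsan", "gcc", False): ["-fno-omit-frame-pointer"],
    ("CFI", "clang", True): ["-flto", "-fsanitize=cfi", "-fvisibility=default"],
    ("CFI", "clang", False): ["-flto"],
    ("CFI", "gcc", True): ["-fcf-protection=full", "-fno-sanitize=all"],
    ("CFI", "gcc", False): ["-fno-sanitize=all"],
}


def compile_command(check, compiler, enabled, source, output):
    """Build an argument list for one benchmark configuration (hypothetical helper)."""
    return [compiler, "-O2", *FLAGS[(check, compiler, enabled)], source, "-o", output]


# Hypothetical file names, used only to show the shape of the command.
cmd = compile_command("CFI", "clang", True, "bench.c", "bench_cfi")
print(" ".join(cmd))
```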

5 RESULTS AND OBSERVATIONS

In this section, we present the results of our exper-

iments and explain the beneﬁts of our slicing-based

extension to more accurately detect security checks

in program binaries. First, we present results demon-

strating improvements from our technique to the num-

ber and quality of snippets that are generated for each

benchmark conﬁguration. We then show the bene-

ﬁts of our approach to detect any common instruction

patterns across the multiple code snippets for each

benchmark-conﬁguration and compiler.

5.1 Effect on Generated Code Snippets

In this section we present results about the number

of collected code snippets for each benchmark con-

ﬁguration described in Section 4. We also present

and discuss the effectiveness of our methods to prune

the number of false snippets and instructions from the

snippets. We only present the results obtained for bi-

naries compiled by the Clang compiler here. Results

with the GCC compiler reveal identical trends and are

left out due to space considerations.

Stackguard. Tables 2 and 3 present the results for the

Stackguard security check with the Clang compiler,

for two benchmark conﬁgurations with the Stack-

guard check turned ON and OFF, respectively. The

ﬁrst column lists the benchmark name, followed by

the number of code snippets (or indicators) in the next

column, representing the vulnerable code sites. The

following four pairs of columns present the number

of valid code snippets and the average number of in-

structions per snippet after validation and following

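the application of the slicing techniques described in Sections 3.1, 3.2, and 3.3, respectively.

Table 2: Results with the Set-A-Stackguard (Check ON) configuration (with CLANG). Each pair of columns gives the number of valid snippets and the average number of instructions per snippet.

| Benchmark | No. of indicators | No slicing: valid snips. | No slicing: avg. ins. | Step 1: valid snips. | Step 1: avg. ins. | Step 2: valid snips. | Step 2: avg. ins. | Step 3: valid snips. | Step 3: avg. ins. |
|---|---|---|---|---|---|---|---|---|---|
| Bzip | 61 | 61 | 8 | 61 | 4 | 61 | 4 | 61 | 4 |
| Gcc | 3377 | 3293 | 12 | 3221 | 4 | 3217 | 4 | 3217 | 4 |
| Gobmk | 2523 | 2496 | 9 | 2496 | 4 | 2496 | 4 | 2496 | 4 |
| H264ref | 538 | 528 | 10 | 527 | 4 | 527 | 4 | 524 | 4 |
| Hmmer | 494 | 483 | 7 | 483 | 4 | 483 | 4 | 483 | 4 |
| Lbm | 26 | 17 | 9 | 17 | 5 | 17 | 5 | 17 | 5 |
| Libquantum | 107 | 96 | 7 | 96 | 4 | 96 | 4 | 96 | 4 |
| Mcf | 35 | 26 | 5 | 26 | 3 | 26 | 3 | 26 | 3 |
| Namd | 100 | 91 | 8 | 91 | 4 | 91 | 4 | 91 | 4 |
| Omnetpp | 2000 | 1988 | 8 | 1988 | 4 | 1988 | 4 | 1988 | 4 |
| Povray | 1591 | 1573 | 11 | 1569 | 4 | 1569 | 4 | 1567 | 4 |
| Sjeng | 148 | 139 | 9 | 139 | 5 | 139 | 5 | 139 | 4 |
| AVG | 916.7 | 899 | 8.5 | 892.833 | 4.1 | 892.5 | 4.09 | 892.083 | 4 |

Table 3: Results for the Set-B configuration (with CLANG), assessed for Stackguard. Each pair of columns gives the number of valid snippets and the average number of instructions per snippet.

| Benchmark | No. of indicators | No slicing: valid snips. | No slicing: avg. ins. | Step 1: valid snips. | Step 1: avg. ins. | Step 2: valid snips. | Step 2: avg. ins. | Step 3: valid snips. | Step 3: avg. ins. |
|---|---|---|---|---|---|---|---|---|---|
| Bzip | 64 | 12 | 5 | 11 | 4 | 11 | 4 | 11 | 4 |
| Gcc | 3151 | 71 | 13 | 37 | 4 | 37 | 4 | 23 | 4 |
| Gobmk | 2286 | 44 | 13 | 24 | 5 | 23 | 5 | 7 | 5 |
| H264ref | 514 | 28 | 24 | 23 | 6 | 23 | 6 | 13 | 6 |
| Hmmer | 454 | 14 | 15 | 8 | 6 | 8 | 6 | 5 | 5 |
| Lbm | 21 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Libquantum | 101 | 4 | 14 | 2 | 4 | 3 | 4 | 0 | 0 |
| Mcf | 35 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Namd | 100 | 4 | 67 | 4 | 14 | 4 | 14 | 4 | 12 |
| Omnetpp | 1444 | 21 | 16 | 12 | 4 | 12 | 4 | 5 | 4 |
| Povray | 1424 | 43 | 37 | 24 | 6 | 23 | 6 | 10 | 9 |
| Sjeng | 154 | 3 | 28 | 3 | 12 | 3 | 12 | 3 | 10 |
| AVG | 812.3 | 20.3 | 19.3 | 12.333 | 5.4 | 12.25 | 5.4 | 6.75 | 4.92 |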

We can see that the number of indicators is similar

in both conﬁgurations, which is expected as there are

a similar number of return instructions in the corre-

sponding benchmarks in each case. However, we ﬁnd

that while most code snippets pass the validation algo-

rithm for the Set-A-Stackguard conﬁguration, a very

small portion of the snippets do so for the Set-B con-

ﬁguration. This result indicates the importance of the

validation step in the original algorithm to prune the

spurious code snippets (that do not contain the secu-

rity check in Set-B), while retaining the snippets with

potential security check instructions in Set-A.

In Table 2, there is little change in the number of validated code snippets at various slicing stages, showing a smaller impact of our slicing-based extension for this scenario. However, there is a substantial 53% reduction in the average number of instructions (from 8.5 to 4 per snippet), as the slicing algorithm removes many program instructions unrelated to the security check from the code snippets.
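The percentage reductions quoted for the Stackguard experiments can be reproduced from the AVG rows of Tables 2 and 3, comparing the "No Slicing" columns with the "Step 3" columns. The helper name `percent_reduction` is introduced here only for illustration.

```python
def percent_reduction(before, after):
    """Relative reduction from before to after, in percent."""
    return 100.0 * (before - after) / before


# AVG row of Table 2: average instructions per snippet, No Slicing vs. Step 3.
table2_avg_ins = percent_reduction(8.5, 4)
# AVG row of Table 3: valid snippets and average instructions, No Slicing vs. Step 3.
table3_valid = percent_reduction(20.3, 6.75)
table3_avg_ins = percent_reduction(19.3, 4.92)

print(round(table2_avg_ins), round(table3_valid), round(table3_avg_ins))
```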

By contrast, Table 3 shows a significant 67% reduction in validated code segments and a 75% reduction in the average number of instructions after applying all stages of the slicing algorithm. The slicing algorithm effectively removes most instructions, leading to a notable reduction in both these metrics. The remaining instructions and fragments are potentially false positives from our slicing algorithm. These false positives may consist of program instructions mistakenly identified as non-program instructions due to the conservative nature of our slicing algorithm implementation used to detect program instructions. In some instances, they may include security check instructions inserted by the compiler at specific vulnerable sites, even when we disable the security checks. The false positive instructions may also include other unrelated non-program instructions in the binary. We plan to conduct a more thorough analysis and resolution of these false positive instructions in future work.

Table 4: Results for Set-A-CFI (Check ON) configuration (with CLANG).

                        Total No. of validations
Benchmarks  No. of      No Slicing      STEP 1          STEP 2          STEP 3
            indicators  Valid     Avg.  Valid     Avg.  Valid     Avg.  Valid     Avg.
                        Snips.    ins.  Snips.    ins.  Snips.    ins.  Snips.    ins.
Bzip        24          20        7     20        4     20        4     20        4
Gcc         343         154       12    77        6     77        6     77        6
Gobmk       29          22        7     22        5     22        5     22        5
H264ref     349         346       14    346       4     346       4     346       4
Hmmer       14          11        6     10        3     10        3     10        3
Lbm         1           0         0     0         0     0         0     0         0
Libquantum  1           0         0     0         0     0         0     0         0
Mcf         1           0         0     0         0     0         0     0         0
Namd        5           2         7     2         3     2         3     2         3
Omnetpp     69          21        6     21        4     21        4     21        4
Povray      68          48        11    48        6     48        6     48        6
Sjeng       4           1         12    1         10    1         10    1         8
AVG         75.7        52.1      6.8   45.6      3.8   45.6      3.8   45.6      3.6

Table 5: Results for Set-B configuration (with CLANG), assessed for CFI.

                        Total No. of validations
Benchmarks  No. of      No Slicing      STEP 1          STEP 2          STEP 3
            indicators  Valid     Avg.  Valid     Avg.  Valid     Avg.  Valid     Avg.
                        Snips.    ins.  Snips.    ins.  Snips.    ins.  Snips.    ins.
Bzip        23          0         0     0         0     0         0     0         0
Gcc         383         4         9     3         3     3         3     2         4
Gobmk       14          0         0     0         0     0         0     0         0
H264ref     3           0         0     0         0     0         0     0         0
Hmmer       4           0         0     0         0     0         0     0         0
Lbm         1           0         0     0         0     0         0     0         0
Libquantum  1           0         0     0         0     0         0     0         0
Mcf         1           0         0     0         0     0         0     0         0
Namd        1           0         0     0         0     0         0     0         0
Omnetpp     149         0         0     0         0     0         0     0         0
Povray      27          0         0     0         0     0         0     0         0
Sjeng       3           0         0     0         0     0         0     0         0
AVG         50.75       4         9     3         3     3         3     2         4

CFI. Tables 4 and 5 present our findings when the CFI check in the CLANG compiler is enabled and disabled, respectively. Note that CFI finds no opportunity to insert security check instrumentation in several benchmarks that contain no indirect calls, particularly the smaller ones such as Lbm, Mcf, and Libquantum.

Surprisingly, we find that the number of indicators (indirect branches/calls, in this case) and the corresponding number of code snippets vary for a few benchmarks, such as Gobmk and H264ref, when compiled for the SET-A-CFI and SET-B-CFI configurations. We have not yet analyzed the reasons for this disparity in the code generated by the compiler in these two cases.

Similar to the SET-A-Stackguard case, we find that in the SET-A-CFI case, most code snippets are validated and potentially contain a security check. We again find that the validation algorithm effectively eliminates code snippets where the security check is not present (SET-B configuration). After validation, every row in Table 5 except Gcc contains only zeros.

We observe that our slicing algorithm effectively removes many program instructions within the security check snippets. This benefit is evident from the substantial 47% reduction in the average number of instructions per snippet, as shown in Table 4, when comparing the results before slicing to those after the final slicing step is applied.

Our slicing algorithm also shows notable improvements in the SET-B-CFI cases, as highlighted in Table 5. The reduction in the average number of instructions is even greater in the SET-B-CFI cases, reaching 55%. This is expected, as the SET-B configuration may lack true security checks, allowing our algorithm to significantly reduce false positives in the original algorithm and improve the accuracy of the results.

ICISSP 2025 - 11th International Conference on Information Systems Security and Privacy


Table 6: Results for Set-A-AddressSanitizer (Check ON) (with CLANG).

                        Total No. of validations
Benchmarks  No. of      No Slicing      STEP 1          STEP 2          STEP 3
            indicators  Valid     Avg.  Valid     Avg.  Valid     Avg.  Valid     Avg.
                        Snips.    ins.  Snips.    ins.  Snips.    ins.  Snips.    ins.
Bzip        7814        3993      9     2925      5     2926      5     2806      4
Gcc         163550      59844     9     57322     5     57257     5     56992     5
Gobmk       57630       25308     10    16092     5     16091     5     15999     5
H264ref     65890       40114     9     29762     5     29762     5     29608     5
Hmmer       33256       12322     9     11245     5     11242     5     11225     5
Lbm         915         36        9     32        5     32        5     30        5
Libquantum  2421        613       9     548       5     548       5     503       5
Mcf         1449        80        10    74        5     74        5     68        5
Namd        25125       5451      10    5372      5     5372      5     5321      5
Omnetpp     26097       6912      10    5019      5     5020      5     5007      5
Povray      96071       22688     10    20413     5     20408     5     20365     5
Sjeng       10274       4351      9     4156      5     4154      5     4122      5
AVG         40874.3     15142.7   9.3   12746.67  5     12740.5   5     12670.5   5

Table 7: Results for Set-B (with CLANG), assessed for Address Sanitizer.

                        Total No. of validations
Benchmarks  No. of      No Slicing      STEP 1          STEP 2          STEP 3
            indicators  Valid     Avg.  Valid     Avg.  Valid     Avg.  Valid     Avg.
                        Snips.    ins.  Snips.    ins.  Snips.    ins.  Snips.    ins.
Bzip        4541        167       19    92        4     92        4     24        7
Gcc         79292       3104      21    1469      4     1462      4     803       4
Gobmk       21137       436       14    282       5     282       5     171       4
H264ref     33467       708       32    434       8     427       7     281       8
Hmmer       14117       369       17    262       5     258       5     120       5
Lbm         390         7         69    7         4     7         4     4         2
Libquantum  747         29        23    21        6     21        5     17        5
Mcf         637         16        18    9         4     8         5     5         3
Namd        10278       163       12    135       4     135       4     128       3
Omnetpp     13458       318       29    173       4     165       4     76        3
Povray      41271       644       20    426       3     405       3     137       4
Sjeng       4974        108       17    65        4     62        4     45        3
AVG         18692.4     505.75    24.3  281.25    4.58  277       4.5   150.92    4.3

Address Sanitizer. Tables 6 and 7 present the results obtained with the SET-A-AddressSanitizer (check ON) configuration and SET-B configuration (check OFF) for the Clang compiler. For this work, we wrote a simple Ghidra-based binary analysis script to identify the indicators for the Address Sanitizer check. Our improved script builds upon the original work by incorporating type analysis to differentiate vector memory accesses from scalar accesses. Unlike the original approach, which treats all memory dereferences as potential areas for protection, our script identifies vector memory accesses separately, providing a more targeted and effective approach. Interestingly, we find that the improved validation technique and new slicing algorithms are highly effective at eliminating the numerous spurious instances.

In Table 6, we observe a 16% reduction in the number of code snippets after the final stage of slicing compared to the scenario with no slicing. This reduction highlights the effectiveness of the slicing process in eliminating spurious code snippets that were mistakenly identified as security check instructions. The initial slicing step (STEP 1) is particularly adept at identifying such code fragments, resulting in a relatively smaller decrease in the subsequent steps (STEP 2 and STEP 3). Furthermore, there is a significant 46% decrease in the average number of instructions within the remaining code snippets, showing effectiveness at eliminating program instructions from the extracted snippets.

In Table 7, a remarkable 70% reduction is observed in the number of validated code snippets after the final stage of slicing, compared to the scenario with no slicing. Notably, when the security check is turned off, a significant decrease in the number of validations is evident during the first and third stages of slicing, amounting to a 46% reduction, which underscores the accuracy of our algorithm in each step. Additionally, there is a substantial 82% decrease in the average number of instructions. This reduction can be attributed to the slicing algorithm's capability to isolate security check instructions by filtering out most program-related instructions.
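The percentages quoted for Table 7 follow from its per-benchmark rows. As a sanity check, the sketch below recomputes the column means and the resulting reductions. The helper names (`mean`, `reduction_pct`) are ours for illustration and are not part of the paper's tooling.

```python
# Per-benchmark values copied from Table 7 (SET-B, AddressSanitizer OFF), in row order:
# Bzip, Gcc, Gobmk, H264ref, Hmmer, Lbm, Libquantum, Mcf, Namd, Omnetpp, Povray, Sjeng
valid = {
    "no_slicing": [167, 3104, 436, 708, 369, 7, 29, 16, 163, 318, 644, 108],
    "step1": [92, 1469, 282, 434, 262, 7, 21, 9, 135, 173, 426, 65],
    "step2": [92, 1462, 282, 427, 258, 7, 21, 8, 135, 165, 405, 62],
    "step3": [24, 803, 171, 281, 120, 4, 17, 5, 128, 76, 137, 45],
}
avg_ins = {
    "no_slicing": [19, 21, 14, 32, 17, 69, 23, 18, 12, 29, 20, 17],
    "step3": [7, 4, 4, 8, 5, 2, 5, 3, 3, 3, 4, 3],
}


def mean(values):
    """Arithmetic mean over the twelve benchmarks."""
    return sum(values) / len(values)


def reduction_pct(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


valid_mean = {stage: mean(column) for stage, column in valid.items()}
ins_mean = {stage: mean(column) for stage, column in avg_ins.items()}

snippet_reduction = reduction_pct(valid_mean["no_slicing"], valid_mean["step3"])
ins_reduction = reduction_pct(ins_mean["no_slicing"], ins_mean["step3"])

print(f"mean validated snippets per stage: {valid_mean}")
print(f"snippet reduction, no slicing -> STEP 3: {snippet_reduction:.1f}%")
print(f"instruction reduction, no slicing -> STEP 3: {ins_reduction:.1f}%")
```

With these rows, the snippet reduction comes out at roughly 70% and the instruction reduction at roughly 82%, in line with the figures quoted above.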

5.2 Common Instruction Patterns

The final step in the original algorithm is to detect common instruction patterns among the extracted and validated snippets for each configuration. The Longest Common Subsequence algorithm is used for the pattern matching. For each benchmark, the pattern matching algorithm counts the number of hits for each unique instruction pattern (or equivalence class) among the code snippets. The instruction patterns are then sorted by their number of hits. The sorted fraction of hits, also called the match ratio, is plotted in the figures in this section.
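A minimal sketch of this counting step follows, assuming each snippet is already a list of normalized instruction strings. The grouping rule used here (a snippet joins a class when its longest common subsequence with the class representative covers at least a fraction `min_overlap` of the shorter sequence) and all function names are illustrative assumptions; the text does not spell out the exact criterion.

```python
from collections import Counter


def lcs(a, b):
    """Longest common subsequence of two instruction sequences (classic DP)."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back through the table to recover one longest subsequence.
    out, i, j = [], len(a), len(b)
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i, j = i - 1, j - 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def match_ratios(snippets, min_overlap=0.8):
    """Group snippets into pattern classes; return (pattern, ratio) sorted by hits.

    The first snippet seen in a class serves as its representative (assumption).
    """
    patterns = []
    hits = Counter()
    for snippet in snippets:
        for index, pattern in enumerate(patterns):
            common = lcs(pattern, snippet)
            if len(common) >= min_overlap * min(len(pattern), len(snippet)):
                hits[index] += 1
                break
        else:
            patterns.append(list(snippet))
            hits[len(patterns) - 1] += 1
    total = len(snippets)
    return [(tuple(patterns[i]), count / total) for i, count in hits.most_common()]


snippets = [
    ["CMP RXX,RXX", "J 0x0", "UD2"],
    ["CMP RXX,RXX", "J 0x0", "UD2"],
    ["MOV RXX,RXX", "CMP RXX,RXX", "J 0x0", "UD2"],
    ["ADD RXX,0x0", "RET"],
]
ratios = match_ratios(snippets)
for pattern, ratio in ratios:
    print(f"{ratio:.2f}  {' ; '.join(pattern)}")
```

In this toy input, three of the four snippets fall into the class represented by the first snippet, so that pattern receives a match ratio of 0.75.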

With a precise implementation of the proposed technique, the hypothesis predicts that the SET-A configurations (security check ON) deliver just one or a few high-frequency patterns corresponding to the actual compiler-inserted security check(s). In contrast, the hypothesis predicts that the SET-B configurations do not reveal any high-frequency instruction pattern, since the check is turned off and the validated snippets likely contain only false positive program instructions left behind by our conservative slicing algorithm.
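One way to turn this hypothesis into a decision rule is to test whether the top-ranked pattern accounts for a large share of all hits. The sketch below does so; the 0.5 dominance threshold, the minimum snippet count, and the function names are illustrative assumptions rather than values taken from this work.

```python
def cumulative_ratios(hit_counts):
    """Sorted cumulative match ratios for one benchmark's pattern hit counts."""
    total = sum(hit_counts)
    if total == 0:
        return []
    running, out = 0, []
    for hits in sorted(hit_counts, reverse=True):
        running += hits
        out.append(running / total)
    return out


def check_likely_present(hit_counts, dominance=0.5, min_snippets=10):
    """Heuristic: report a check only if one pattern dominates enough snippets.

    Both thresholds are hypothetical; very small samples are rejected because
    a handful of snippets can produce a misleadingly high ratio.
    """
    total = sum(hit_counts)
    if total < min_snippets:
        return False
    return max(hit_counts) / total >= dominance


set_a_hits = [90, 8, 2]        # one dominant pattern, as expected with the check ON
set_b_hits = [3, 2, 2, 2, 1]   # no dominant pattern, as expected with the check OFF

print(cumulative_ratios(set_a_hits), check_likely_present(set_a_hits))
print(cumulative_ratios(set_b_hits), check_likely_present(set_b_hits))
```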

We present our observations from this pattern recognition step in this section. Again, to conserve space, we only present results obtained from binaries compiled using the Clang compiler. Results with the GCC-compiled binaries are not included, but are comparable to those presented here.

5.2.1 Security Check Pattern for Stackguard

Figures 3(a) and 3(b) display the sorted cumulative match ratios for each benchmark for the Stackguard security check with CLANG for the SET-A-Stackguard and SET-B configurations, respectively. As expected, the algorithm finds that a single dominant instruction pattern accounts for a large majority of hits in the SET-A-Stackguard configuration (Figure 3(a)). In fact, just two instruction patterns (that are small variants of each other) are present in all the extracted and cleaned instruction patterns in each benchmark. Furthermore, the same high-frequency instruction patterns are found across all the benchmarks.

In contrast, we do not find one dominant instruction pattern among the code snippets for most benchmarks in the SET-B configuration for Stackguard (Figure 3(b)). Interestingly, our algorithm identifies a high-frequency pattern in the SET-B configuration with Stackguard (check OFF) for bzip. Upon manual inspection, we discovered that, despite turning off the security check, the compiler introduces the stackguard check in a few locations of the program binary. It is encouraging to note that the algorithm successfully detects the presence of this pattern in the binary.

Figure 3: Pattern recognition results for Stackguard. (a) Stack Protection ON (SET-A). (b) Stack Protection OFF (SET-B).

We also analyzed the high fraction of hits for some patterns in the other benchmarks in Figure 3(b). We attribute this seeming anomaly to the small number of validated code snippets that remained after the elimination steps and to the small size of the patterns.

5.2.2 Security Check Pattern for CFI

Figure 4 presents the pattern recognition results for just the SET-A-CFI configuration with the Clang compiler for only seven out of our twelve total benchmarks. The remaining five benchmarks (lbm, libquantum, mcf, namd, and sjeng) yield very few code snippets, typically fewer than five, and almost none of them pass the validation process, as seen from Table 4. Upon manual inspection, we found that the compiler did not apply the CFI check for some of these cases even with the flag turned ON. Again we find that our pattern recognition algorithm is able to find the high-frequency patterns for all benchmarks. It is also encouraging to find identical high-frequency patterns being detected across all benchmarks. We do not plot the graph for the SET-B configuration with CFI since our techniques to prune spurious snippets eliminate most of the instances, as seen from Table 5.

Figure 4: Pattern recognition results for SET-A-CFI.

Figure 5: Pattern recognition results for AddressSanitizer. (a) AddressSanitizer ON (SET-A). (b) AddressSanitizer OFF (SET-B).

5.2.3 Security Check Pattern for AddSan

Figure 5(a) illustrates the sorted cumulative match ratios over sliced and validated code snippets for the SET-A-AddressSanitizer (security check turned ON) benchmark configuration. Again we find that our technique is very effective at finding the common instruction patterns, corresponding to the Address Sanitizer security check, for the SET-A configuration. For the SET-B configuration with Address Sanitizer (Figure 5(b)), our algorithm does not find any high-frequency patterns for any benchmark, which supports the likelihood that the validated snippets in the check-OFF case are spurious instances that survive our conservative filtering algorithms.
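For context, the Address Sanitizer pattern reported in Table 8 (shift, byte load, test, mask, add, conditional jump) resembles the publicly documented shadow-memory test that AddressSanitizer emits before memory accesses narrower than eight bytes. The sketch below models that test in Python. The mapping of individual instructions to steps in the comments is our reading of the pattern, and the shadow offset shown is the usual default for x86-64 Linux; other targets use different values.

```python
SHADOW_SCALE = 3            # one shadow byte describes 8 application bytes
SHADOW_OFFSET = 0x7FFF8000  # typical x86-64 Linux default (assumption for other targets)


def shadow_address(addr):
    """Address of the shadow byte for `addr` (corresponds to the SHR plus offset load)."""
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET


def access_is_reported(addr, size, shadow_byte):
    """Model the inline check for an access of `size` bytes (size < 8) at `addr`.

    `shadow_byte` is interpreted as a signed value k:
      k == 0      all 8 bytes of the granule are addressable
      1 <= k <= 7 only the first k bytes are addressable
      k < 0       the whole granule is poisoned
    """
    if shadow_byte == 0:
        return False                    # TEST / J: fast path, nothing to report
    last_byte = (addr & 7) + size - 1   # AND / ADD: offset of the last byte accessed
    return last_byte >= shadow_byte     # final conditional jump to the error report


print(hex(shadow_address(0x602000000010)))
print(access_is_reported(0x1000, 4, 0))   # fully addressable granule
print(access_is_reported(0x1004, 4, 4))   # touches bytes 4..7, only 0..3 are valid
print(access_is_reported(0x1000, 4, -1))  # poisoned granule
```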

5.2.4 Common Patterns for Set-A Conﬁgs

Table 8 lists the most common instruction patterns seen across all the benchmarks in the SET-A configurations with the Stackguard, CFI, and Address Sanitizer security checks. We found a high-frequency instruction pattern (or equivalence class) consistently present across all benchmarks in the SET-A configurations. It is both encouraging and notable that, on manual inspection of the binaries in SET-A, we find that these instruction patterns actually correspond to the security check instructions inserted by the compiler for each respective check. Furthermore, with the SET-B configuration when the check was disabled, no consistent dominating instruction pattern was detected across all benchmarks.

Table 8: Security Check patterns.

Security Check     Compiler  Pattern Observed
STACKGUARD         CLANG     MOV RXX,qword ptr FS:[0x0]
                             CMP RXX,qword ptr [RSP]
                             J 0x0
                             CALL 0x0
CFI                CLANG     CMP RXX,RXX
                             J 0x0
                             UD2
Address Sanitizer  CLANG     SHR RXX,0x0
                             MOV RXX,byte ptr [RXX + 0x0]
                             TEST RXX,RXX
                             J 0x0
                             AND RXX,0x0
                             ADD RXX,0x0
                             J 0x0
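The patterns in Table 8 are written with placeholders: `RXX` for a general-purpose register, `0x0` for constants and branch targets, and `J` for a conditional jump. A minimal sketch of such a normalization follows, assuming Ghidra-style uppercase operand text as input. The register list, the regular expressions, and the `normalize` helper are our own illustration, and the exact rules used by the authors' tooling may differ. The stack pointer is left untouched because the Stackguard pattern in Table 8 keeps `[RSP]` literal.

```python
import re

# Build the set of general-purpose register names to abstract (RSP is excluded
# on purpose, see the note above; this list is an assumption).
_names = set()
for base in ("AX", "BX", "CX", "DX", "SI", "DI", "BP"):
    _names.update({"R" + base, "E" + base, base})
_names.update({"AL", "BL", "CL", "DL", "SIL", "DIL", "BPL"})
for n in range(8, 16):
    _names.update({f"R{n}", f"R{n}D", f"R{n}W", f"R{n}B"})

_REG = re.compile(r"\b(?:" + "|".join(sorted(_names, key=len, reverse=True)) + r")\b")
_IMM = re.compile(r"\b0x[0-9a-fA-F]+\b|\b\d+\b")


def normalize(instruction):
    """Map one disassembled instruction to its placeholder form."""
    mnemonic, _, operands = instruction.strip().partition(" ")
    mnemonic = mnemonic.upper()
    if mnemonic.startswith("J") and mnemonic != "JMP":
        mnemonic = "J"                      # collapse all conditional jumps
    operands = _REG.sub("RXX", operands)    # abstract register names
    operands = _IMM.sub("0x0", operands)    # abstract constants and targets
    return f"{mnemonic} {operands}".strip()


snippet = [
    "MOV RAX,qword ptr FS:[0x28]",
    "CMP RAX,qword ptr [RSP]",
    "JNZ 0x00401234",
    "CALL 0x00401000",
]
for line in snippet:
    print(normalize(line))
```

Applied to the hypothetical snippet above, the output reproduces the Stackguard pattern listed in Table 8.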

It is important to realize that the precision of these pattern matching results depends significantly on our slicing-based extension, which filters the program instructions out of the validated snippets. Thus, for the automated run-time security checks investigated in this study, our slicing-based extension enables this algorithm to effectively discern whether a security check is enabled or disabled in a given program binary.

6 CONCLUSION

Our primary objective in this work is to propose and evaluate the benefit of a novel program-slicing based extension to an earlier approach devised to detect the presence of run-time security checks in arbitrary software binaries. Our slicing-based extension identifies and separates the binary-level instructions into program and non-program categories. We employ and integrate our algorithm to remove program instructions from potential security-check snippets extracted by this earlier approach.

We explain and evaluate our technique for SPEC benchmarks compiled with two compilers for three different security checks. We found that our slicing-based approach is highly consequential in improving the ability and accuracy of this earlier technique to determine the presence of security checks in program binaries. We anticipate that our work will greatly enhance automated and independent security analysis of binary code, particularly for end-users who do not have access to the source code.

