A Framework for Incident Response in Industrial Control Systems

Roman Schlegel, Ana Hristova and Sebastian Obermeier

ABB Switzerland Ltd., Corporate Research, Baden-Dättwil, Switzerland

Keywords:

Industrial Control System Security, Forensics, Incident Response.

Abstract:

Industrial control systems are used to control and supervise plants and critical infrastructures. They are crucial

for operation of many industries and even society at large. However, despite efforts to secure such systems,

there are frequent reports of incidents that lead to problems because of human error (e.g., installing unautho-

rized software on a mission-critical machine) or even cyber attacks. While such incidents should be prevented

in the ﬁrst place, it is not feasible to achieve 100% security; therefore, operators should be prepared to deal

with incidents promptly and efﬁciently if they occur. In this paper, we present a general methodology and

framework for investigating incidents in industrial control systems. The methodology is supported by a tool

to automate an investigation, especially to efﬁciently determine the state of ﬁles on a device after an incident.

This enables faster recovery from incidents by being able to identify suspicious ﬁles and focus on the ﬁles that

have been modiﬁed compared to the initially installed ﬁles, or a previously taken baseline. An evaluation con-

ﬁrms the applicability of the methodology for an embedded industrial controller and for an industrial control

system.

1 INTRODUCTION

Industrial automation and control systems (IACS) are

used to monitor and control the behavior of physical

processes, for example in chemical plants, electricity generation and distribution, or water management.

The ﬁrst networked IACS were running within iso-

lated networks and did not include any speciﬁc cy-

ber security mechanisms. However, nowadays more

and more SCADA systems are communicating using

public IP networks (Rao Kalapatapu, 2004), which in-

troduces security threats that the systems are not pre-

pared for. As a result, vendors, regulators, and as-

set owners have started to address this problem by

means of security mechanisms, processes, standards,

and regulation (Brandle and Naedele, 2008).

In order to increase the overall cyber security

level, it is important to detect potential cyber se-

curity incidents at an early stage. Traditional digi-

tal forensics is aimed at an ofﬂine analysis to ﬁnd

court-proof evidence of criminal activities. How-

ever, IACS follow a different prioritization regard-

ing the relevance of security objectives, cf. (Dzung

et al., 2005; Naedele, 2007). For instance, availabil-

ity, authenticity, and integrity are paramount for an

IACS, while conﬁdentiality is usually less important.

There are also additional requirements for live foren-

sics (Ahmed et al., 2012).

1.1 Problem Statement

Cyber security incidents can go unnoticed for a significant period of time, and when they are discovered it is difficult to evaluate their extent and severity. Even when a security tool generates an event or alarm, the information about the event is often not specific enough to determine its consequences. As an example, an antivirus product might generate an event indicating that a virus has been detected on a machine. It will normally also give the name of the virus (e.g., “Exploit Exp/JAVA.Niabil.gen”) and the file in which the virus was found. However, it typically cannot tell the operator whether there are any other consequences, such as which files the virus modified on the system. If this happens on a machine with a SCADA application, the operator needs to know whether any other files have been affected by the virus, as this could have an impact on the control system and the process. Even if a virus has been detected and removed from the system, it remains uncertain whether all affected files have been found and removed. As industrial control systems are more deterministic than regular desktop computers and undergo fewer changes in software, new approaches to detecting incidents or anomalies become possible (Hadeli et al., 2009).

Schlegel R., Hristova A. and Obermeier S. A Framework for Incident Response in Industrial Control Systems. DOI: 10.5220/0005510001780185. In Proceedings of the 12th International Conference on Security and Cryptography (SECRYPT-2015), pages 178-185. ISBN: 978-989-758-117-5. © 2015 SCITEPRESS (Science and Technology Publications, Lda.)

We consider three different scenarios that a foren-

sic investigation tool for IACS should support to aid

in incident response. These are:

System Inventory: determine the list of software

packages installed on a device.

Baselining: compare a machine to a snapshot of itself

in an earlier state, highlighting the changes between

the snapshots.

Installation Veriﬁcation: verify the installation of

a software package, identifying genuine ﬁles of the

software package.

All three scenarios can be used in incident response to rapidly get an overview of a machine or device. This makes it possible to focus quickly on the root cause of an incident by removing from the investigation those files that are verified to be genuine or part of a baseline.

1.2 Contributions

In this paper, we present an approach that leverages

the deterministic character of industrial machines by

ﬁngerprinting all ﬁles on a system and comparing

them to a reference set. The main challenge is iden-

tifying changes that occur during the normal opera-

tion of a system, i.e., minimizing false positives, for

example in log ﬁles, database ﬁles and conﬁguration

ﬁles. Our approach aims at making the ﬁngerprint-

ing as reliable as possible using different techniques

such as regular hash comparison and fuzzy hash com-

parison, in order to minimize the work remaining to

investigate a system after an incident or during rou-

tine checks. The contributions of the paper are the

following:

• it identiﬁes the methodological differences be-

tween analyzing industrial control systems and

traditional ofﬁce IT systems, and develops a spe-

ciﬁc forensic methodology targeted to industrial

control systems;

• it presents a framework architecture comprising

an analysis tool that facilitates the forensic anal-

ysis of industrial control systems in the different

scenarios;

• it evaluates the methodology and the framework

on a real embedded controller and on a real indus-

trial control system.

In Section 2, we present related work while in Sec-

tion 3 we introduce a methodology for a forensic anal-

ysis of industrial control system devices by highlight-

ing the differences to traditional forensic analysis of

typical machines used in IT systems. Section 4 intro-

duces the framework developed to support a forensic

analysis of industrial control systems, followed by the

evaluation of the concept in Section 5. We discuss fu-

ture work in Section 6 and conclude in Section 7.

2 RELATED WORK

Mechanisms and approaches for incident response

have been extensively studied, especially over the past

decade. As the security of IACS came into the spot-

light with the appearance of Stuxnet (Langner, 2011),

the interest in detecting and responding to incidents

in the IACS domain increased as well. However, the

scope of the literature that exists at present is very

speciﬁc to a certain category of embedded devices

and does not address a general incident response and

forensics methodology taking into account the oper-

ational challenges and considerations for IACS. Of

the few papers that address forensics for embedded

systems, the scope is typically limited to cell phones,

navigation and personal entertainment devices or per-

forming forensic analysis at a network level.

A uniﬁed forensics methodology is described

in (Shaw and Atkins, 2010), in the context of em-

bedded devices such as data recorders in cars, mo-

bile phones, navigation devices, etc. The authors di-

vide the methodology into three phases, a preparation

phase, hardware phase and software phase and they

also present a comparative analysis of several related

methodologies against the general forensic analysis

methodology as described in (US DoJ, 2007). However, they neither cover the particularities of embedded devices in IACS nor give pointers on how such devices could be analyzed. Furthermore, the use of Snort

IDS and use of honeypots for aiding the process of

network forensics in a SCADA system is described

in (Valli, 2009). An architecture that supports the

forensic analysis of SCADA systems and networks is

described in (Kilpatrick et al., 2008), where forensic

agents are deployed at strategic locations to forward

relevant portions of network packets to a central loca-

tion for storage and analysis.

An interesting approach for forensic analysis us-

ing whitelisting is described in (Chawathe, 2009).

The paper describes the suitability of signature-based

methods in forensic analysis using MD5, SHA1 etc.,

to classify and prioritize ﬁles. The paper also rec-

ognizes the need for detecting near matches for effec-

tive analysis and describes potential methods to detect

near or approximate matches. Moreover, an overview

of the use of hashing in digital forensics is presented

in (Roussev, 2009), introducing ﬁngerprinting based

on random polynomials and similarity hashing based

on fuzzy hashes. The authors also describe optimizations such as Bloom filters for hashing large amounts of data and highlight how different forms of hashing (traditional and fuzzy) can be a valuable tool in forensic analysis, without, however, considering how to obtain or manage the data.

Our methodology and framework share a similar purpose with some of the works outlined above. However, the goal of our framework is neither to address incident response and forensics at a network level, nor to optimize hashing algorithms, nor to cover only general-purpose computers or a vast range of general-purpose embedded devices. Instead, we give insight into a unified solution for performing incident response and in-house preliminary forensic analysis at a system or product level for IACS, both in real time and offline. To the best of our knowledge, our framework is the first to directly address this challenge for IACS.

3 METHODOLOGY

In this section, we will give an overview of tradi-

tional forensics methodology and introduce a general

methodology for forensic analysis of IACS.

3.1 Traditional Computer Forensics

Methodology

The traditional forensics methodology is typically di-

vided into three phases. These are ﬁrst the acquisition

of evidence, followed by the analysis of the acquired

evidence and ﬁnally the synthesis of the ﬁndings into

a report. The acquisition phase consists of turning

a device off and copying data from the non-volatile

memory (e.g., hard disks) for analysis, as the analysis

is never done on the original media. Furthermore, the

copying has to be done without modifying the original

data, and it should be a veriﬁable, identical bit-by-bit

copy. If volatile data is also needed (e.g., the con-

tents of the RAM), the extraction of that data should

be done before powering off a device. The analysis

phase depends very much on the context of the in-

vestigation and can include listing of all ﬁles, recov-

ering log ﬁles or deleted ﬁles, looking for hidden or

encrypted ﬁles, eliminating ﬁles that are known to be

benign, etc. Finally, a report summarizing the ﬁnd-

ings of the analysis and describing the steps taken to

analyze the data is prepared. The same process is

applicable to embedded devices with two additional

considerations:

1. Most embedded devices do not contain hard disks,

but use ﬂash memory instead. In this case a hard-

ware device (a so-called write blocker) inserted

between the ﬂash card and the host computer can

be used to prevent inadvertently modifying the

contents of the ﬂash card.

2. Not all information retrievable from a computer

system can be mapped to adequate data artifacts

in an embedded device. The OS of an embedded

device has typically been stripped down and con-

tains less functionality than regular operating sys-

tems.
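The requirement that an acquired copy be a verifiable, identical bit-by-bit duplicate can be checked by hashing both the original and the copy. The following is a minimal sketch, assuming that both are accessible as regular files (for example, image files); the function names `file_digest` and `copy_is_identical` are illustrative and not part of any tool described in this paper.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large images need not fit in RAM."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_is_identical(original: Path, copy: Path) -> bool:
    """Return True if both files have the same size and the same digest."""
    if original.stat().st_size != copy.stat().st_size:
        return False
    return file_digest(original) == file_digest(copy)
```

Recording the digest at acquisition time also allows the copy to be re-verified later, before the analysis starts.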

3.2 General Forensics Methodology for

IACS

In the IACS domain, powering down a server or an

embedded device might impact the control of a pro-

cess, as continuous operation is typically required.

Therefore, the traditional forensic approach should be

applied with care, and should only be used when it

does not jeopardize the process that is controlled or

monitored by the system.

The data that can be extracted from embedded

devices also directly depends on the device ecosys-

tem, i.e., the tools that are already available for

management, maintenance and conﬁguration of a

device. There is no single universal approach that

can be used to extract relevant data artifacts from all

different types of embedded devices. To be able to

extract the most information from different embed-

ded devices, it is necessary to analyze each type and

create type-speciﬁc guidelines for the extraction of

relevant data artifacts that can be used for incident

response and forensic analysis.

Evidence Acquisition: There are different approaches for extracting data from an embedded device that can help in an incident response or live forensic investigation. For offline data acquisition, imaging tools can be used to obtain a disk image. If the device cannot be shut down to image its disks, other approaches need to be considered. One is a dedicated incident response and forensics agent running on the embedded device, which can acquire data from the device at runtime when required. Such an agent can retrieve certain files or the hashes of all files, check files for the existence of alternate data streams (ADS), etc. Another approach is to use the engineering and maintenance tools that operators and technical experts use for configuring a device, debugging and troubleshooting an installation, and performing diagnostics.

It is also possible to run data collecting servers on

embedded devices. For example, an FTPS (secure

FTP) server can be installed on an embedded device

to facilitate copying ﬁles between the device and a

SECRYPT2015-InternationalConferenceonSecurityandCryptography

180

PC connected through the network. If a web server is

running on the device, the device can be accessed via

HTTPS (secure HTTP) to download ﬁles and retrieve

event information. If access to the embedded device

is allowed over SSH, then this can be another method

to acquire data from the device, e.g., through shell

commands. In addition, Breeuwsma (Breeuwsma,

2006) describes a method that can be used to extract

data from an embedded device using a JTAG port, a

port that is normally used for testing integrated circuit

boards. Using this method the chance that data is al-

tered through the extraction process is minimized as

the memory chip can be addressed and read directly.

However, in production systems JTAG ports are typi-

cally neither enabled nor necessarily accessible.

Data Artefacts: During the acquisition phase the investigator should decide which data artefacts are relevant for further forensic analysis and incident response. When retrieving data from an embedded device, the order of volatility should be taken into account, as some data has a very short lifespan. More volatile data, i.e., data with a shorter lifespan, should be extracted first to prevent it from being lost before the extraction can be completed. Non-volatile data such as log files, configuration files, dump files, temporary files, authentication information and alternate data streams can be read after an incident, as it typically persists when a device is shut down or reset. Volatile data such as operating system time, logged-on users, network status, network information, network connections, process information, process-to-port mappings, process memory, service/daemon/driver information, open files, and mapped drives and shares will be lost immediately when power is interrupted and should therefore be extracted first.
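The order-of-volatility rule can be expressed as a simple sort over the artefacts to be collected. The sketch below is only illustrative: the `Artefact` type and in particular the `lifespan_rank` values are hypothetical assumptions, since the text distinguishes only volatile from non-volatile data and does not rank artefacts within each group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artefact:
    name: str
    survives_power_loss: bool
    # Hypothetical rank within a group: smaller means shorter expected lifespan.
    lifespan_rank: int


def acquisition_order(artefacts):
    """Sort so that volatile, short-lived artefacts are collected first."""
    return sorted(artefacts, key=lambda a: (a.survives_power_loss, a.lifespan_rank))


# Example plan; the artefact names come from the text, the ranks are assumed.
PLAN = [
    Artefact("configuration files", True, 2),
    Artefact("network connections", False, 1),
    Artefact("log files", True, 1),
    Artefact("process memory", False, 0),
    Artefact("logged-on users", False, 2),
]
```

Because `False` sorts before `True`, every artefact that is lost on power interruption is scheduled ahead of all persistent ones, regardless of the assumed ranks.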

Analysis: The exact analysis done in the analysis

phase depends on the context and the goal of the in-

vestigation. If a forensic investigation is for example

trying to ﬁnd evidence of an information leak, an in-

vestigator would analyze, e.g., local mail data, among

other things. In our methodology, where the goal is to

verify the system integrity after an incident, the analy-

sis is performed based on a comparison of ﬁle hashes

with a database of known ﬁles. This database contains

hash information of known ﬁles and other meta-data

such as matching product information, version num-

ber, manufacturer etc. Depending on the context of

the analysis, the comparison can be done against the

entire database or over a subset of ﬁles belonging to

speciﬁc products. The analysis is done in two steps:

Regular Hash Comparison. A regular hash of the

ﬁle such as SHA1, SHA-256, etc. is compared

to the list of hashes in the database. If the hash

is found, the ﬁle has been identiﬁed and the in-

formation associated in the database is listed. If

the hash of the ﬁle is not found in the database, a

second comparison step is performed using fuzzy

hashes.

Fuzzy Hash Comparison. If the comparison of a regular hash does not yield any results for a file, a fuzzy hash comparison can be made. Fuzzy hashes (Kornblum, 2006) match inputs that have homologies, i.e., they can give information about the similarity of one file to another. If a regular hash (which requires a file to be exactly the same) does not match, the fuzzy hash can reveal files that are at least similar to the file in question.

This two-step approach ensures that as many files as possible can be identified, or, if a file cannot be identified, that it can at least be detected to be similar to a known file. Using fuzzy hashes requires the investigator to define a threshold above which a file is classified as similar. For text files (e.g., log files), where changes are expected, a lower threshold can be chosen, e.g., 75 (on a scale from 0 to 100), while other file types (e.g., executables) should require more similarity, e.g., 90.
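The two-step comparison with per-type thresholds can be sketched as follows. Note the substitution: instead of a real fuzzy hash (Kornblum, 2006), the sketch uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure, which compares file contents directly rather than compact signatures. The `classify` function, the `known` mapping and the threshold table are assumptions made for illustration; only the example values 75 and 90 come from the text.

```python
import hashlib
from difflib import SequenceMatcher

# Thresholds on a 0-100 scale, following the example values in the text.
THRESHOLDS = {"text": 75, "executable": 90}


def similarity(a: bytes, b: bytes) -> int:
    """Stand-in similarity score (0-100); NOT a real fuzzy hash."""
    return round(100 * SequenceMatcher(None, a, b, autojunk=False).ratio())


def classify(content: bytes, kind: str, known: dict) -> tuple:
    """Return (outcome, matched_name, score) for one file.

    `known` maps a file name to its reference content.
    """
    digest = hashlib.sha256(content).hexdigest()
    by_hash = {hashlib.sha256(ref).hexdigest(): name for name, ref in known.items()}
    # Step 1: regular hash comparison.
    if digest in by_hash:
        return ("exact", by_hash[digest], 100)
    # Step 2: similarity comparison against every known file.
    best_name, best_score = None, -1
    for name, ref in known.items():
        score = similarity(content, ref)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= THRESHOLDS[kind]:
        return ("partial", best_name, best_score)
    return ("none", None, max(best_score, 0))
```

A log file that has gained one line since the reference was taken would pass the lower text threshold, whereas the same amount of change in an executable would be reported as unmatched under the stricter threshold.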

Report: The last step of a forensic investigation is to

create a report with the ﬁndings.

4 FRAMEWORK

Figure 1 illustrates the general architecture of the in-

cident response framework we developed to analyze

incidents in an IACS infrastructure.

The framework is able to accept hashes from var-

ious sources, which are fed into the Industrial Foren-

sics Analysis Tool (IFAT):

Direct Hashing. The IFAT can hash all ﬁles in a

given directory, or from a mounted disk image of

a device.

Hash Export. The IFAT can be run from a USB stick

on a device to hash all ﬁles on the device and ex-

port the results into a data ﬁle. This data ﬁle can

then be imported and analyzed with a central in-

stallation of the IFAT.

Remote/Online Hashing. The framework can also

make use of an extended version of GRR Rapid

Response (Cohen et al., 2011; Moser and Co-

hen, 2013), a forensic framework developed by

Google, to hash ﬁles on a device remotely and on-

line. Through an agent, a device can be instructed

to hash certain directories and send back the list

of hashes.
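The direct hashing and hash export sources described above can be illustrated with a short sketch. It assumes a directory (or mounted disk image) that is readable as a normal file tree, and it uses JSON as a hypothetical export format; the actual data file format of the IFAT is not specified in this paper.

```python
import hashlib
import json
from pathlib import Path


def hash_tree(root: Path, algorithm: str = "sha1") -> dict:
    """Map each regular file under `root` (relative POSIX path) to its hex digest."""
    hashes = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.is_symlink():
            digest = hashlib.new(algorithm)
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large files need not fit in memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            hashes[path.relative_to(root).as_posix()] = digest.hexdigest()
    return hashes


def export_hashes(hashes: dict, target: Path) -> None:
    """Write the hash list to a JSON data file (hypothetical export format)."""
    target.write_text(json.dumps(hashes, indent=2, sort_keys=True), encoding="utf-8")


def import_hashes(source: Path) -> dict:
    """Read a hash list previously written by export_hashes."""
    return json.loads(source.read_text(encoding="utf-8"))
```

Storing paths relative to the scanned root keeps the exported list comparable between a live device and a mounted image of the same device.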

AFrameworkforIncidentResponseinIndustrialControlSystems

181

Figure 1: Architecture of the Incident Response Framework.

Once the list of hashes has been acquired, the

framework makes use of a comprehensive hash

database to identify and determine the provenance

of individual ﬁles and generates a report. This hash

database contains hash information of general all-

purpose software, as well as hashes from software

speciﬁc to the IACS domain and it can also contain

baseline versions of complete devices. During the

analysis the investigator can then determine whether

the comparison should be made against the complete

database, only against certain software products or

against a speciﬁc baseline.
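A hash database that supports comparison against everything, against one product, or against one baseline could be organized as sketched below. The schema, the table and column names, and the convention of storing a baseline as a pseudo-product are assumptions for illustration only; they do not describe the database actually used by the framework.

```python
import sqlite3


def open_database() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical schema for known files."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE known_files ("
        " sha1 TEXT NOT NULL, file_name TEXT NOT NULL,"
        " product TEXT NOT NULL, version TEXT, manufacturer TEXT)"
    )
    # An index on the hash column keeps exact lookups fast.
    conn.execute("CREATE INDEX idx_known_sha1 ON known_files (sha1)")
    return conn


def lookup(conn, sha1, product=None):
    """Return matching rows, optionally restricted to one product or baseline."""
    query = (
        "SELECT file_name, product, version, manufacturer "
        "FROM known_files WHERE sha1 = ?"
    )
    params = [sha1]
    if product is not None:
        query += " AND product = ?"
        params.append(product)
    return conn.execute(query, params).fetchall()
```

Calling `lookup(conn, digest)` searches the complete database, while passing `product="baseline:controller-01"` (a hypothetical label) restricts the comparison to a single stored baseline.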

4.1 Industrial Forensics Analysis Tool

(IFAT)

Besides regular hash functions like SHA1, MD5, etc.

the IFAT also uses fuzzy hashes (Kornblum, 2006), to

ﬁnd similarity between ﬁles and to indicate whether

a particular ﬁle is similar to a known ﬁle, or even

to an earlier snapshot of the ﬁle itself (for example

in a baselining scenario). The IFAT can also de-

tect and hash the contents of Alternate Data Streams

(ADS) (Marlin, 2013) attached to regular ﬁles.

In order to identify a particular ﬁle, the IFAT con-

nects to a database that contains hashes of many dif-

ferent software products. The database is a combina-

tion of publicly available ﬁle hashes, such as the NIST

NSRL database (National Institute of Standards and

Technology (NIST), 2009), but also contains hashes

from proprietary IACS software. For each ﬁle, there

are the following three possible analysis outcomes:

Exact Match: The ﬁle exactly matches a ﬁle in the

database, indicated by an exact match of their re-

spective hashes.

Partial Match: A ﬁle does not exactly match any ﬁle

in the database, but a fuzzy hash indicates similar-

ity to a ﬁle in the database (e.g., a similarity score

of 0.9 on a scale from 0 to 1). Whether a partial

match has been found also depends on the simi-

larity threshold chosen by the investigator.

No Match: The file neither exactly matched any file in the database nor reached the similarity threshold chosen by the investigator for any file in the database.

Once an analysis has been completed and all ﬁle

hashes from a target device have been compared to the

hashes in the database, the results can be examined

using a virtual ﬁle system view.

The IFAT also provides additional functionality

that can be used to manage the database of hashes,

such as updating the NIST NSRL database (which is

itself being updated regularly), adding hashes of new

software products or device baselines and generally

managing the hashes in the database.

There is a negligible, if non-zero, probability that two

different ﬁles have the same hash. However, if hashes such

as SHA1 or SHA-256 or stronger are used, this probability

is so small as to be irrelevant for all practical purposes.
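To give a rough sense of the magnitude, the probability of an accidental collision can be bounded with the standard birthday (union) bound. The estimate assumes that hash outputs behave like uniformly random b-bit strings, and it covers accidental collisions only, not deliberately constructed ones. The value of n is taken from the database size of approximately 110 million files reported in the evaluation.

```latex
P_{\text{collision}} \;\le\; \binom{n}{2}\, 2^{-b} \;\approx\; \frac{n^{2}}{2^{\,b+1}},
\qquad
n = 1.1 \times 10^{8},\; b = 160
\;\Rightarrow\;
P_{\text{collision}} \lesssim \frac{1.21 \times 10^{16}}{2.92 \times 10^{48}} \approx 4 \times 10^{-33}.
```

For a 256-bit hash the bound is smaller still, by a further factor of 2^96.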

SECRYPT2015-InternationalConferenceonSecurityandCryptography

182

5 EVALUATION

The use of the Industrial Forensics Analysis Tool

(IFAT) and the framework has been evaluated in the

following scenarios:

• Regular, fresh install of Windows XP (system in-

ventory scenario). Windows XP has been taken as

an example operating system as it is still widely

used on workstations in IACS.

• Regular, fresh install of Windows XP baselined

and then infected with a virus (baselining sce-

nario).

• Embedded industrial automation controller

(baselining scenario).

• Industrial control system software package (in-

stallation veriﬁcation scenario).

Windows XP Installation. In this experiment we created a fresh, regular install of Windows XP, hashed all files on the drive and compared the hashes with the hash database. The comparison identified a majority of the files through their SHA1 hashes (approximately 78%), as shown in Table 1. Fuzzy hashing improved this only marginally, identifying a further 23 files (1.6% of the files still unknown after SHA1 hashing).

Table 1: File match rate for regular Windows XP installation using NIST NSRL database.

Windows XP         # of Files   Share
Total # of Files   6735
SHA1               5282         78.4% (of all files)
Fuzzy Hashing      23           1.6% (of files unknown after SHA1 hashing)
SHA1 & Fuzzy       5305         78.8% (of all files)
Unknown Files      1430         21.2% (of all files)

This shows that the publicly available hashes of the NIST NSRL database, if used without adding further hashes to the database, are not sufficient to eliminate the bulk of the files from an investigation: 1430 files (21.2%) remain unknown. Therefore, in the next scenario, we baselined the new system to achieve better accuracy.

Windows XP Installation with Virus. In this sce-

nario, we used a plain Windows XP installation and

baselined it, i.e., we hashed all ﬁles and inserted all

hashes as a baseline into the IFAT hash database. We

then infected the system with a virus that is hidden in

a script ﬁle for a popular IRC chat client. The virus it-

self consists of 14 ﬁles, including the chat client itself

(which is started through autorun in a “quiet mode”

conﬁguration).

Table 2 shows the results of examining the base-

lined Windows XP machine after the virus infection.

Compared to the baseline taken before the virus in-

stallation, the tool can identify 98.6% of all ﬁles

through their SHA1 hashes. Of the remaining 92 ﬁles,

fuzzy hashing can remove a further 28% (26 ﬁles).

The remaining 66 files cannot be identified and are the files that would have to be checked in more detail. However, their number could be reduced further through additional classification. For example, 17 of these 66 files are Windows prefetch (PRF) files, which could be verified to be genuine (through a semantic analysis, verifying that the files conform to the PRF format), reducing the number of unknown files to 49. The remaining 49 files could be reduced further, as they include, for example, registry hives and Windows desktop.ini files that could be verified to be genuine. In this case the virus actually consists of 14 files, meaning that 52 files changed through normal usage of Windows XP, but many of them can be removed through further classification as explained above.

This shows that baselining can very effectively re-

duce the number of ﬁles that need to be examined fur-

ther after an incident, reducing the total number of

ﬁles by 99% after the classiﬁcation with SHA1 and

fuzzy hashing in this experiment.

Table 2: File match rate for baselined Windows XP installation infected by a virus.

WinXP (Virus)      # of Files   Share
Total # of Files   6785
SHA1               6693         98.6% (of all files)
Fuzzy Hashing      26           28% (of files unknown after SHA1 hashing)
SHA1 & Fuzzy       6719         99.0% (of all files)
Unknown Files      66           1% (of all files)

Baselined Industrial Controller. One scenario of

our framework is to compare a controller of an indus-

trial control system with a baseline that was stored

earlier. This is similar to the use-case of baselining

regular machines, but for the industrial controllers,

this can be done regularly (e.g., once every week) and

automatically.

We performed some limited tests using cloned

drive contents of actual controllers. We manually

added two new, unrelated ﬁles and changed parts of

an existing ﬁle, and could verify that these changes to

the controller were detected correctly and accurately

by the IFAT.
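The baselining check performed in this test amounts to a comparison of two hash snapshots. The following minimal sketch assumes that each snapshot is a mapping from file path to hash value; the function name `compare_to_baseline` and the example paths are hypothetical and not taken from the IFAT.

```python
def compare_to_baseline(baseline: dict, current: dict) -> dict:
    """Compare two {path: hash} snapshots and list added, removed and modified paths."""
    added = sorted(set(current) - set(baseline))
    removed = sorted(set(baseline) - set(current))
    modified = sorted(
        path for path in set(baseline) & set(current) if baseline[path] != current[path]
    )
    return {"added": added, "removed": removed, "modified": modified}


# Mirrors the test described above: two new files and one changed file.
baseline = {"boot/firmware.bin": "aa11", "etc/ctrl.cfg": "bb22"}
current = {
    "boot/firmware.bin": "aa11",
    "etc/ctrl.cfg": "cc33",   # content changed
    "tmp/new1.dat": "dd44",   # newly added
    "tmp/new2.dat": "ee55",   # newly added
}
report = compare_to_baseline(baseline, current)
```

Files reported as modified are natural candidates for a subsequent similarity comparison, which can indicate how much of the file changed.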

Industrial Automation and Control System Soft-

ware. To test the scenario of efﬁciently verifying the

installation of a software package, we also tested the

AFrameworkforIncidentResponseinIndustrialControlSystems

183

IFAT on a large Industrial Automation and Control

System (IACS) software product. We ﬁrst imported

the hashes of the ﬁles on the installation media of

the software package into the database (extracting in-

staller data as far as feasible). We then used the IFAT

to hash a real, existing installation of the software, and

compared the installation directory of the software on

the machine with the data contained in the database

that was derived from the installation media.

The results (see Table 3) show that by adding the

installation media of a software package, the tool is

able to classify more than 90% of the ﬁles of the in-

stallation as belonging to that product. In this case,

adding the installation media to the database was done

manually, but if the process can be automated, includ-

ing recursive unpackaging of installer data, then the

match rate could be improved even further.

Table 3: File match rate for an Industrial Automation and Control System software installed on a machine.

IACS Software      # of Files   Share
Total # of Files   8995
SHA1               8045         89.4% (of all files)
Fuzzy Hashing      99           10.4% (of files unknown after SHA1 hashing)
SHA1 & Fuzzy       8144         90.5% (of all files)
Unknown Files      851          9.5% (of all files)
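The percentages in the three match rate tables follow directly from the file counts. As a worked check, the small helper below (a hypothetical function, not part of the IFAT) recomputes them from the total number of files, the number of SHA1 matches and the number of additional fuzzy matches.

```python
def match_rates(total: int, exact: int, fuzzy: int) -> dict:
    """Recompute the percentages reported in the file match rate tables."""
    unknown_after_exact = total - exact
    identified = exact + fuzzy
    return {
        "exact_pct": round(100 * exact / total, 1),
        "fuzzy_pct_of_remaining": round(100 * fuzzy / unknown_after_exact, 1),
        "identified_pct": round(100 * identified / total, 1),
        "unknown": total - identified,
        "unknown_pct": round(100 * (total - identified) / total, 1),
    }


windows_xp = match_rates(total=6735, exact=5282, fuzzy=23)
windows_xp_virus = match_rates(total=6785, exact=6693, fuzzy=26)
iacs_software = match_rates(total=8995, exact=8045, fuzzy=99)
```

Note that the fuzzy hashing percentage is relative to the files left unknown after SHA1 hashing, not to all files, which is why it can be large (28% in the baselined case) even when only a few files are involved.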

Other Performance Measures. We tested the IFAT

together with a database containing information about

approximately 110 million ﬁles. This includes both

the contents of the NIST NSRL database and cus-

tom hashes of software that we inserted. Also, the

NIST database only contains SHA1 hashes, while

for the ﬁles that we added we also included fuzzy

hashes. The database size was approximately 24GB

(including both data and indexes), and the (virtual-

ized) server had 5.5GB of RAM available and was as-

signed 3 Intel Xeon CPUs at 1.86GHz. With these

speciﬁcations the server was able to perform about

360 queries per second for SHA1 hashes (using only

one CPU). One direction for improving the perfor-

mance would be by adding more RAM to the server,

as in our tests the performance bottleneck was the

hard disk I/O. Performing fuzzy hash queries was slower by two orders of magnitude, because comparing fuzzy hashes requires computing an edit distance between the queried fuzzy hash and all fuzzy hashes stored in the database, and this operation could not be sped up through the use of indexes.
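Why fuzzy hash queries resist indexing can be seen from a sketch of the underlying operation. The code below computes a plain Levenshtein edit distance and scans all stored signatures for the nearest one. It is a simplified illustration only: real fuzzy hash comparison involves additional steps, and the function names and the signature strings used with them are assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming with two rows."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def closest_signature(query: str, stored: dict) -> tuple:
    """Scan every stored signature; return (name, distance) of the nearest one.

    Raises ValueError if `stored` is empty.
    """
    return min(
        ((name, edit_distance(query, signature)) for name, signature in stored.items()),
        key=lambda item: item[1],
    )
```

An exact hash lookup touches a single index entry, whereas this scan visits every stored signature for every query. As a back-of-the-envelope reading of the reported figures, about 360 exact queries per second means that looking up the 6735 files of the Windows XP experiment takes roughly 19 seconds, while a rate two orders of magnitude lower corresponds to only a few fuzzy queries per second.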

Conclusion. The evaluation of the IFAT prototype has shown that it can handle the scenarios that we outlined earlier, from system inventory of a machine to baselining and verifying installations of specific software packages.

Limitation. One limitation is that there exists kernel-

level malware that can falsify the analysis of ﬁles by

returning “clean” data or by hiding malicious ﬁles

when listing directories. This is a problem for ev-

ery live analysis tool (e.g., using the IFAT directly on

a machine, or hashing ﬁles through the GRR agent),

but can be solved for example through ofﬂine analy-

sis (i.e., analyzing an image extracted from the data

storage, e.g., hard disk, ﬂash drive, etc.). Another so-

lution to this problem would be to run the machine in

question in a virtualized environment, and have the hypervisor directly access the underlying data storage, without going through the kernel of the virtualized machine itself. Another limitation is that even with a recognition rate of 90%, the number of files still unknown after an analysis can run into the thousands, all of which would need further analysis. This further illustrates that the quality of the hash database is crucial.

6 FUTURE WORK

Currently, the hash database used in our approach

contains hashes of legitimate IACS software and gen-

eral purpose software. One direction for future work

would be to study the effectiveness of a heuristic anal-

ysis of unmatched ﬁles for more accurate results. This

could potentially be of help when dealing with a large

number of unmatched ﬁles and could be used for pre-

liminary prioritization. Another direction is to extend

the Incident Response Framework to be able to per-

form online analysis of other operating systems, par-

ticularly those used in the IACS domain such as real-

time operating systems. Finally, with the increase

in discovered vulnerabilities and the growth of digi-

tal threats in the IACS domain, incident response will

become even more important in the future, requiring

ever more elaborate methods to produce precise re-

sults.

7 CONCLUSION

We have presented a comprehensive methodology for

forensic analysis of IACS that outlines the necessary steps and gives an overview of possible ways of extracting file system data from embedded devices. However, our analysis has shown that

because of the diverse nature of such devices, each

device type would need to be studied individually and

SECRYPT2015-InternationalConferenceonSecurityandCryptography

184

the best methods available for data extraction deter-

mined for each device. This is because embedded de-

vices differ in their capabilities, their architecture, the

supported operating systems, available interfaces and

organization of the ﬁle system, which requires differ-

ent extraction methods for each particular type of de-

vice.

Furthermore, we have presented a framework that

allows for an initial analysis of the non-volatile stor-

age of IACS systems and devices as well as of

general-purpose computers. This analysis is foreseen

to be done in response to an incident, as an in-house

preliminary forensic analysis or as part of a periodic

routine analysis. The framework supports a variety of

operating systems and has been shown to be suitable

for examining entire ﬁle systems, speciﬁc directories

or single files. Altogether, the framework covers the use cases outlined in the introduction of this paper well.

In addition, we have performed an evaluation demonstrating the performance of the framework in different scenarios. The recognition rate

of matched ﬁles, as expected, is directly correlated

with the comprehensiveness and completeness of the

hash database. A more complete database that includes hashes of as many software products as possible will yield more accurate results. However, for readily available databases such as the NIST NSRL database, there is potentially still a large number of “unknown” files that need to be further investigated after running our analysis tool. The evaluation also

showed that a fuzzy hash comparison can improve the

recognition rate, although not substantially for every

scenario. The performance of the hash comparison

also directly depends on the performance of the server

where the database is stored and the resources allo-

cated to the database, and we have shown that reason-

able performance can be achieved using moderately

powerful hardware.

REFERENCES

Ahmed, I., Obermeier, S., Naedele, M., and Richard, G. G. (2012). SCADA systems: Challenges for forensic investigators. Computer, 45(12):44–51.

Brandle, M. and Naedele, M. (2008). Security for process control systems: An overview. IEEE Security & Privacy, 6(6):24–29.

Breeuwsma, I. M. (2006). Forensic imaging of embedded systems using JTAG (boundary-scan). Digital Investigation, 3(1):32–42.

Chawathe, S. (2009). Effective whitelisting for filesystem forensics. In Intelligence and Security Informatics, 2009. ISI ’09. IEEE International Conference on, pages 131–136.

Cohen, M., Bilby, D., and Caronni, G. (2011). Distributed forensics and incident response in the enterprise. Digital Investigation, 8, Supplement(0):101–110. The Proceedings of the 11th Annual Digital Forensic Research Workshop (DFRWS ’11).

Dzung, D., Naedele, M., von Hoff, T., and Crevatin, M. (2005). Security for industrial communication systems. Proceedings of the IEEE, 93(6):1152–1177.

Hadeli, H., Schierholz, R., Braendle, M., and Tuduce, C. (2009). Leveraging determinism in industrial control systems for advanced anomaly detection and reliable security configuration. In Proceedings of the 14th IEEE International Conference on Emerging Technologies & Factory Automation, ETFA’09, pages 1189–1196, Piscataway, NJ, USA. IEEE Press.

Kilpatrick, T., Gonzalez, J., Chandia, R., Papa, M., and Shenoi, S. (2008). Forensic analysis of SCADA systems and networks. Int. J. Secur. Netw., 3(2):95–102.

Kornblum, J. (2006). Identifying almost identical files using context triggered piecewise hashing. Digital Investigation, 3, Supplement(0):91–97. The Proceedings of the 6th Annual Digital Forensic Research Workshop (DFRWS ’06).

Langner, R. (2011). Stuxnet: Dissecting a cyberwarfare weapon. IEEE Security & Privacy, 9(3):49–51.

Marlin, J. (2013). Alternate Data Streams in NTFS. Online: http://blogs.technet.com/b/askcore/archive/2013/03/24/alternate-data-streams-in-ntfs.aspx.

Moser, A. and Cohen, M. I. (2013). Hunting in the enterprise: Forensic triage and incident response. Digital Investigation, 10(2):89–98. Triage in Digital Forensics.

Naedele, M. (2007). Addressing IT security for critical control systems. In HICSS, page 115.

National Institute of Standards and Technology (NIST) (2009). National Software Reference Library.

Rao Kalapatapu (2004). SCADA Protocols and Communication Trends. ISA EXPO.

Roussev, V. (2009). Hashing and data fingerprinting in digital forensics. IEEE Security & Privacy, 7(2):49–55.

Shaw, R. and Atkins, A. (2010). Unified forensic methodology for the analysis of embedded systems. Proceedings of 4th International Conference on Advanced Computing & Communication Technologies.

US DoJ (2007). Digital Forensic Analysis Methodology. Online: http://www.justice.gov/criminal/cybercrime/docs/forensics_chart.pdf. Cybercrime Lab in the Computer Crime and Intellectual Property Section.

Valli, C. (2009). SCADA Forensics with Snort IDS. In Proceedings of WORLDCOMP, Security and Management, pages 618–621, Las Vegas.

AFrameworkforIncidentResponseinIndustrialControlSystems

185