Bridging Knowledge Gaps in Security Analytics
Fabian B
¨
ohm
a
, Manfred Vielberth
b
and G
¨
unther Pernul
Chair of Information Systems, University of Regensburg, Germany
Keywords:
Security Analytics, Domain Knowledge, Visual Analytics, Security Awareness.
Abstract:
In a cyber-physical world, the number of links between corporate assets is growing and infrastructures are
becoming more complex. This and related developments significantly enlarge the attack surface of organiza-
tions. Additionally, more and more attacks do not exploit technical vulnerabilities directly but gain a foothold
through phishing or social engineering. Since traditional security systems prove to be no longer sufficient to
detect incidents effectively, humans and their specialized knowledge are becoming a critical security factor.
Therefore, it is vital to maintain an overview of the cybersecurity knowledge spread across the entire com-
pany. However, there is no uniform understanding of knowledge in the field of security analytics. We aim to
close this gap by formalizing knowledge and defining a conceptual knowledge model in the context of security
analytics. This allows existing research to be better classified and shows that individual areas offer much po-
tential for future research. In particular, the collaboration between domain experts but also between machines
and employees could enable the exploitation of previously unused but crucial knowledge. For example, this
knowledge is of great value for defining security rules in current security analytics systems. We introduce a
proof of concept implementation using visual programming to showcase how even security novices can easily
contribute their knowledge to security analytics.
1 INTRODUCTION
Humans are often considered the weakest link in
cybersecurity (Schneier, 2015). However, they are
also an invaluable asset as their domain knowledge
is essential for any modern Security Analytics
1
(SA)
method (Ben-Asher and Gonzalez, 2015; Zimmer-
mann and Renaud, 2019). We argue that these meth-
ods could benefit significantly from the knowledge of
non-security domains but mainly rely solely on secu-
rity experts’ knowledge to decide whether indicators
are malicious incidents or benign activities.
This issue becomes apparent when looking at
the Internet-of-Things (IoT) and ubiquitous Cyber-
Physical Systems (CPSs). They lead to an increasing
connectedness of organizations’ internal and external
assets. Besides an already skyrocketing number of cy-
ber attacks, this significantly increases the attack sur-
face for cyber-physical attacks. These explicitly ex-
ploit the linkage between cyber systems and physical
systems within CPSs to do physical harm to machines
a
https://orcid.org/0000-0002-0023-6051
b
https://orcid.org/0000-0002-1119-4715
1
As no widely accepted definition of Security Analytics
exists, we interpret this term as a set of methods to identify
attacks and threats through data analysis proactively.
or even humans (Loukas, 2015). Detecting and miti-
gating these attacks poses a challenge to existing se-
curity measures. They need to monitor assets directly
connected to cyberspace like firewalls and CPSs (full-
blown manufacturing lanes). A general problem with
CPSs is that current security means, e.g., Security In-
formation and Event Management Systems (SIEMs)
embedded within Security Operations Centers, lack
both knowledge and abilities to protect them effec-
tively (Dietz et al., 2020; Eckhart and Ekelhart, 2018).
Although security experts can make well-
informed decisions about incidents in cyberspace,
they lack the knowledge about the physical domain
to decide whether, e.g., a turbine is acting as ex-
pected (Schneier, 2018). However, engineers and the
respective staff have this knowledge to distinguish
normal turbine behavior from anomalous actions but
lack the knowledge to contribute to SA (Eckhart and
Ekelhart, 2018).
This mismatch reduces organizations’ ability to
implement cohesive SA methods that can reliably de-
tect both cyber and cyber-physical attacks with the
respective indicators. Therefore, the domain knowl-
edge of engineers and the alike need to be integrated
into security operations to allow effective incident de-
tection even for physical assets (Chen et al., 2011).
98
Böhm, F., Vielberth, M. and Pernul, G.
Bridging Knowledge Gaps in Security Analytics.
DOI: 10.5220/0010225400980108
In Proceedings of the 7th International Conference on Information Systems Security and Privacy (ICISSP 2021), pages 98-108
ISBN: 978-989-758-491-6
Copyright
c
2021 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
In this work, we make a two-fold contribution to
tackle this open issue. We first define the different
types of knowledge within cybersecurity and formal-
ize a model of knowledge-based SA. This is a new
contribution as no security-specific notions of knowl-
edge have been defined yet and it enables future re-
search to identify and address more open challenges
in this field. Our second contribution specifically ad-
dresses missing externalization of domain knowledge
as one issue within knowledge-based SA by intro-
ducing a research prototype allowing experts to make
their knowledge available.
Provideresearchprototypeasafirst
solutionapproach
Defineknowledge
types
Defineconversion
processes
Approach
knowledgegap
Defineknowledge-
basedSA
Formalize
knowledgeinSA
Section
2
Section
3
Section
4
Defineknowledge
types
Defineconversion
processes
IntegrateintoSA
process
Identifyknowledge
gaps
Figure 1: Schematic of this work’s structure.
The remainder of this work is structured accord-
ing to Figure 1. Before we can bridge knowledge
gaps within SA efficiently, we need to formally de-
scribe relevant notions of knowledge and conversion
processes for Security Analytics in Section 2. This
allows us and any future work to have a well-defined,
precise vocabulary. We then integrate this vocabu-
lary into the Incident Detection Process to form a co-
hesive picture of what we call knowledge-based SA
within Section 3. The resulting model reveals sev-
eral knowledge gaps in current SA approaches that
are not yet appropriately addressed. Thus, we present
a research prototype in Section 4 showcasing a possi-
ble approach of the integration of security novice’s
domain knowledge into an exemplary SA solution,
i.e., a signature-based incident detection component.
The prototype highlights that the implementation of
knowledge-based SA requires innovative approaches
but can drastically improve cybersecurity. Finally,
Section 5 concludes our work and points out possible
directions for future work.
2 KNOWLEDGE WITHIN
SECURITY ANALYTICS
In this section, we provide an in-depth view of knowl-
edge within modern SA. Therefore, we establish a
formal understanding of different types of knowl-
edge and knowledge conversions within this section.
Sallos et al. (Sallos et al., 2019) stress the impor-
tance of knowledge in cybersecurity on a high-level,
management-oriented view. We aim to add a more
formal view focusing on the implications of integrat-
ing knowledge into security measures.
2.1 Knowledge Types
Within the scientific literature, there are numerous
competing definitions of knowledge and different no-
tions of knowledge. At an abstract level, we fol-
low Davenport’s definition, describing knowledge
as a mix of experience, intuition, values, contex-
tual information, and expert insight (Davenport and
Prusak, 2000). In the context of the data-information-
knowledge-wisdom (DIKW) hierarchy, knowledge
describes the application of data and information to
provide answers to “How”-questions (Ackoff, 1989).
However, the concept of knowledge is not bound
to humans. Early research shows that within organi-
zations, knowledge gets transcribed into documents
or files (Nonaka and Takeuchi, 1995). Following this
path, it becomes apparent that ICT can hold a spe-
cific type of knowledge different from human knowl-
edge, especially when it comes to any type of auto-
mated data analysis (Fayyad et al., 1996; Sacha et al.,
2014). Therefore, it is an accepted definition within
ICT research to differentiate between two main types
of knowledge: explicit knowledge and tacit knowl-
edge. In the following, we will elaborate on those
notions and put them into the context of SA.
2.1.1 Explicit Knowledge
Explicit Knowledge K
e
can be defined as machine
knowledge that can be read, processed, and stored
by machines (Nonaka and Takeuchi, 1995). We
differentiate two different forms of explicit knowl-
edge within Security Analytics from the main inci-
dent detection methods. For anomaly-based detection
mechanisms, machine-readable knowledge is ma-
chine learning models and alike. However, signature-
based security analytics holds explicit knowledge in
the form of signatures and rules to detect indicators
of compromise. Another SA-specific example for K
e
is Cyber Threat Intelligence (CTI), as it describes at-
tributed incidents in a structured way that can be ex-
changed and utilized by machines.
2.1.2 Implicit Knowledge
In general, Tacit Knowledge can only be held by hu-
mans and is very specific to the individual (Polanyi,
2009). Although “tacit knowledge” is the more
widespread term, we use implicit knowledge K
i
throughout this work as this more clearly indicates the
difference between machine and human knowledge.
Bridging Knowledge Gaps in Security Analytics
99
Implicit knowledge is gained by linking new insights
and prior knowledge, which can be further subdivided
into the notions of domain and operational knowl-
edge (Chen et al., 2009). Domain knowledge de-
scribes what users know about a specific context (i.e.,
domain) like security or engineering (Ben-Asher and
Gonzalez, 2015; Eckhart and Ekelhart, 2018). Oper-
ational knowledge, in contrast, is the ability of a hu-
man to interact with a specific system. However, in
the context of SA, we consider this differentiation too
vague. To describe the problem at hand concisely, a
fine-granular and contextualized view on K
i
is neces-
sary.
One straight-forward contextualization of K
i
in
SA is the one of operational knowledge K
i
o
. Employ-
ees with operational knowledge can operate an orga-
nization’s security systems. This might range from
having the experience to create signatures for a SIEM
system or tuning models for anomaly- and behavior-
based SA mechanisms.
In Equation 1 we define the domain knowledge
relevant for SA as a combination of the two sub-types
K
i
d(sec)
and K
i
d(nonSec)
. The first one encapsulates any
security-related domain knowledge that is mostly held
by security domain experts. Elements of security-
related domain knowledge are safety and security as-
pects from a cybersecurity perspective, like firewall
rules, suspicious network connections, or unautho-
rized digital access to information. We perceive this
type of domain knowledge to be already integrated
to a good degree in SA means (Wagner et al., 2017).
However, K
i
d(nonSec)
as non-security domain knowl-
edge, like manufacturing or engineering knowledge,
is not yet sufficiently integrated into SA, although be-
ing relevant to detect incidents on cyber-physical sys-
tems (Dietz et al., 2020). As examples of non-security
domain knowledge serve security aspects from an en-
gineering view like expected RPMs of a power turbine
or temperature in a blast furnace.
In the domain of SA, we consider an addi-
tional, new type of implicit knowledge: situational
knowledge K
i
s
. This type encompasses the concept
of security awareness of an organization’s employ-
ees (Jaeger, 2018; Vasileiou and Furnell, 2019). Any
of them can perceive unusual events or suspicious
behavior in the real-world. This ranges from a re-
ceived email, which probably is a phishing attempt,
to a private USB drive plugged into a company com-
puter. Having situational security knowledge (e.g., af-
ter security awareness trainings (Ponsard and Grand-
claudon, 2020)), the employee is able to derive that
the email or the USB drive might pose a threat to
the company. No specific security domain knowledge
(K
i
d
(nonSec)) about the company’s SA is necessary
for these types of conclusions.
All of these different implicit knowledge types
are relevant for SA to detect both cyber and cyber-
physical incidents. This allows us to aggregate our
view on implicit knowledge within SA as a combina-
tion of the three previously introduced sub-types of K
i
(Eq. 2).
K
i
d
= K
i
d(sec)
K
i
d(nonSec)
(1)
K
i
= K
i
d
K
i
s
K
i
o
(2)
K
i
s
, as well as K
i
d
, need to be made available for
cohesive security operations. The operational knowl-
edge K
i
o
in the context of SA poses an entry barrier
for this integration. Having the necessary K
i
o
would
allow any employee, for example, to define a signa-
ture for a SIEM system. A user’s knowledge in this
context can be defined as instances of the different
types of K
i
as shown in Equation 3. Based on their
different knowledge sets we differentiate two types
of security-related users: security experts S
e
(Eq. 4)
and security novices S
n
(Eq. 5). Differences between
these employee groups are the different forms of do-
main knowledge and the lack of operational knowl-
edge for security novices. This sheds light on a di-
chotomy within SA as security experts have the nec-
essary operational knowledge K
i
o
but lack, e.g., en-
gineering domain knowledge K
i
d(nonSec)
, which is rel-
evant for SA to identify cyber-physical attacks. S
n
might have this engineering knowledge but lack K
i
o
analogously.
k
i
d(nonSec)
, k
i
d(sec)
K
i
d
, k
s
K
i
s
, k
i
o
K
i
o
(3)
S
e
= {k
i
d(sec)
, k
i
s
, k
i
o
} (4)
S
n
= {k
i
d(nonSec)
, k
i
s
} (5)
2.2 Knowledge Conversion
The transfer between explicit to implicit knowl-
edge and vice versa is formalized by Nonaka and
Takeuchi, who introduced four knowledge conversion
processes (Nonaka and Takeuchi, 1995). Several re-
search domains have been picking those up to formal-
ize the interaction between computers and humans.
Naturally, the domain of visualization and human-
computer interaction are lending heavily from these
concepts (Wang et al., 2009; Federico et al., 2017;
Wagner et al., 2017). However, we also think that the
area of SA can benefit from them as it is relying on
automated detection processes (explicit knowledge),
but also the expertise of different human experts (im-
plicit knowledge) is necessary for effective security
operations. Therefore, we define the following four
knowledge conversion processes for SA:
ICISSP 2021 - 7th International Conference on Information Systems Security and Privacy
100
1. Internalization (int). This process describes ex-
plicit knowledge being made available for humans
so that they can perceive it using their opera-
tional knowledge and transform it into security-
related domain knowledge (Eq. 6). The amount
and effectiveness of the knowledge conversion are
heavily dependent on the user’s k
i
o
. This depen-
dence on k
i
o
is indicated in the equation as the
operational knowledge is annotated as the cata-
lyst of the knowledge conversion. This notation
is adapted from formal descriptions of chemical
reactions. An example of effective internalization
of K
e
in the context of SA is any form of its visual
display.
2. Externalization (ext). Through externalization,
both K
i
d
and K
i
s
are formalized to be read, stored,
and processed by computers (Eq. 7). Again, this
process is enabled by and dependent on K
i
o
. Var-
ious ways help to realize this conversion in SA.
Examples are the direct tuning of model param-
eters (K
e
) by experts and the definition of rules
for signature-based analyses. Another example is
the direct access for experts to CTI and a way to
actively edit the accessed information. External-
ization is also important as the loss of K
i
trough.
For example, security analysts’ retirement poses
a risk to organizations as occurred incidents show
(Thalmann and Ilvonen, 2020).
3. Combination (comb). A knowledge conversion
process representing the combination of two or
more explicit knowledge bases (Eq. 8). The ex-
change of CTI information between different ac-
tors and its utilization in their respective SA oper-
ations can be characterized as a combination pro-
cess.
4. Collaboration (coll). Humans are working to-
gether and therefore combining their K
i
(Eq. 9).
Collaboration is a hard process to formally cap-
ture and define. However, in the context of SA, we
see this knowledge conversion as a more techno-
logically supported process. Employees, for ex-
ample, work together on the correlation of indica-
tors to figure out whether the indicator marks an
ongoing attack or not. Therefore, they might use
an analysis tool designed specifically for this pur-
pose, supporting the collaboration. During their
collaborative analysis, they are learning from each
other and convert knowledge in the sense of coll.
int : (K
e
K
i
o
7→ K
i
d(sec)
) (6)
ext : (K
i
d
K
i
s
K
i
o
7→ K
e
) (7)
comb : K
e
7→ K
e
(8)
coll : K
i
7→ K
i
(9)
3 KNOWLEDGE-BASED
SECURITY ANALYTICS
After having established a formal understanding of
knowledge types and conversions relevant for SA in
the previous section, we set out to integrate these no-
tions of knowledge into SAs core process. The detec-
tion of incidents, i.e., attacks on any asset of a com-
pany, is one of the main functionalities of SA (Mah-
mood and Afzal, 2013). A cohesive approach to this
process requires comprehensive data collection and
the integration of any available knowledge. In this
section, we elaborate on our model of knowledge-
based SA and the crucial role of knowledge in this
context. Finally, we identify process steps needing
more attention from research.
3.1 Knowledge Model
Menges and Pernul define a well-thought incident de-
tection process (Menges and Pernul, 2018). We inte-
grate the knowledge types and conversions from the
previous sections to show the relationships between
incident detection and knowledge of both humans and
machines in Figure 2. The gray boxes in the figure
mark the components of the original incident detec-
tion process.
The first phase of the incident detection process,
Data Collection, deals with collecting raw Data that
is generated by Situations. A situation is anything
“that happens” within a company, either in the phys-
ical world or in cyberspace. This data is normalized
(and sometimes standardized) and then forms Observ-
ables (Menges and Pernul, 2018). Note that observ-
ables only cover normalized facts of what happened
within the collected raw data but are not attributed yet,
so they do not answer why something happened or
who is responsible. Observables serve as input for the
second phase of the process, which is the actual Inci-
dent Detection. Within this phase, the observables are
analysed to detect Indicators, often related to as In-
dicators of Compromise (IoCs). These point out pos-
sibly malicious or at least suspicious activities within
the observables. However, IoCs can also just stem
from unusual but benign behavior recorded within the
Bridging Knowledge Gaps in Security Analytics
101
DataCollection
K
i
K
e
IncidentDetection
Data Observables Indicators IncidentsSituation
ext(K
i
s
)
K
i
s
K
i
o
K
i
d
coll
Signatures Models ...
int(K
e
)
ext(K
i
d
)
comb
Figure 2: Model of knowledge-based SA based on the Incident Detection Process (Menges and Pernul, 2018).
observables. Therefore, indicators need to be corre-
lated to identify actual Incidents.
Deriving and detecting indicators within the vast
amount of observables is a considerably challenging
task, which relies heavily on automated analysis pro-
cesses because of the sheer number of available ob-
servables (Mahmood and Afzal, 2013). These pro-
cesses use explicit knowledge K
e
as signatures and
rules for signature-based detection but also in the
form of models for behavior-based methods. There-
fore, K
e
plays a significant role in incident detection.
Nevertheless, there are major drawbacks with only in-
tegrating K
e
into automated detection. First of all,
K
e
in the form of Signatures only can detect already
known indicators. Additionally, behavior-based Mod-
els tend to produce many false positives. Human do-
main experts can help to overcome both shortcom-
ings. On the one hand, they can explore available ob-
servables to identify new indicators, and on the other
hand, they leverage their domain knowledge to decide
whether an indicator imposes a malicious action or
not. Therefore, the integration of K
i
d
in this step is
beneficial and a prerequisite for effective SA.
The next process step, identifying actual incidents
within Indicators, needs even more integration of K
i
.
This is mainly because K
e
in this step can only iden-
tify previously known attacks where attack patterns
are already defined. When new attacks occur, K
e
can
barely do more than detect indicators. It cannot iden-
tify the incident in its full scope or complexity. K
i
d
is needed for this task as experts can differentiate be-
tween malicious and benign indicators and correlate
them.
Besides this incorporation of K
e
and K
i
into
the Incident Detection phase we also indicate sev-
eral knowledge conversion processes within Figure 2.
Those are the conversions we identify being rele-
vant and necessary for holistic, effective SA covering
both cyber and cyber-physical incidents. We intro-
duced the indicated processes int(K
e
) (Eq. 6), comb
(Eq. 8), and coll (Eq. 9) introduced in the previous
Section 2.2. Thus, we will focus the two remaining
processes ext(K
i
s
) and ext(K
i
d
) hereinafter. They are
instances of ext (Eq. 7) but have not been defined in
more detail yet.
The externalization of situational knowledge
ext(K
i
s
) essentially allows employees to transform
events they have observed or experienced into observ-
ables. Thus, the semantic information encapsulated in
those can be used in the following stages of the inci-
dent detection process. Leveraging ext(K
i
s
) supple-
ments any traditional data collection because many of
the aspects of targeted attacks remain unseen within
automatically collected data, for example, social en-
gineering or physical access attempts. Therefore, this
knowledge conversion turns any employee (both S
e
and S
n
) into a valuable source when they, for exam-
ple, report a phone call trying to find out their ac-
cess credentials. This process does not rely on do-
main knowledge but on a more general knowledge,
which can be achieved through situational awareness.
ext(K
i
s
) is particularly relevant as it is the only way to
cover the full scope of possible attack vectors, which
could not be achieved by S
e
alone.
The process ext(K
i
d
) essentially comprises both
ext(K
i
d(sec)
) and ext(K
i
d(nonSec)
). It describes the in-
teraction of humans with actual analysis methods
backed by K
e
. The goal of this is to make K
i
d
avail-
able for automated analysis methods to improve in-
cident detection. In the age of cyber-physical sys-
tems, domain knowledge, especially ext(K
i
d(nonSec)
),
is widely scattered across organizations while at the
same time, all of this knowledge is necessary for co-
hesive SA. Therefore, ext(K
i
d
) is crucial as it allows
ICISSP 2021 - 7th International Conference on Information Systems Security and Privacy
102
«Component»
Backend
«Component»
Frontend
(Angular)
«Component»
PatternDebugger
«Component»
VisualPatternBuilder
(GoogleBlockly)
«Component»
API
(GraphQL)
Detection
Pattern
API
Event
Stream
API
«Component»
MessageBroker
(ApacheKafka)
«Component»
PatternDatabase
(MongoDB)
«Component»
PatternMatcher
(Esper)
«Component»
EventSource
Event
Stream
Event
Data
Detection
Patterns
Detection
Patterns
Event/Alert
Data
Figure 3: Component diagram of the architecture for visual collaborative pattern definition.
human knowledge to be transferred into new SIEM
correlation rules, attack signatures, or improved be-
havioral models for an organization’s assets.
3.2 Knowledge Gaps
Building on the formal definitions and the model from
the previous sections, we specify what we call SAs
dichotomy. As Equations 4 and 5 show, the main rea-
son for the dichotomy is the security novice’s lack of
operational knowledge. This leads to three knowl-
edge gaps limiting current SA operations within or-
ganizations and are not yet sufficiently addressed by
research. In this section, we provide a more detailed
view of these pressing issues to be solved in order to
allow the integration of relevant knowledge from all
employees into cohesive SA operations:
1. ext(K
i
s
): The first problem needing more atten-
tion from academia is the externalization of K
i
s
.
A key difficulty with this lies in creating means
to externalize the knowledge, i.e., how to enable
security novices S
n
to turn their situational knowl-
edge K
i
s
into Observables as indicated in Figure 2.
Initial research approaches addressing this prob-
lem consider employees as a kind of security sen-
sor, which allows them to provide observations
and perceptions from the real world as input for
SA (Vielberth et al., 2019; Heartfield and Loukas,
2018). Although the first concepts to establish
this knowledge conversion exist, this problem is
far from being completely solved.
2. ext(K
i
d(nonSec)
): The second problem stems from
the lack of K
i
o
of security novices S
n
like engi-
neers for externalizing their K
i
d
. Therefore, ways
to lower this entry barrier have to be researched.
For example, engineers could contribute heavily
in defining rules for a SIEM system as part of an
organization’s SA. The knowledge of engineers in
this context is relevant as they know better how,
e.g., an electric turbine should not behave. This
is needed for detecting indicators of attacks tar-
geting the turbine itself or its cyber-components,
i.e., the complete cyber-physical system. How-
ever, engineers are unlikely to be able to define
SIEM correlation rules in a possibly cumbersome
syntax. Therefore, ext(K
i
d(nonSec)
) must be simpli-
fied for S
n
. This process lacks attention from re-
search as there are no concepts available yet how
to achieve this in SA.
3. coll: Collaboration, especially between S
e
and S
n
,
is crucial to bring together the scattered domain
knowledge into a centralized knowledge base for
SA. Any solution for the previous two issues in
combination with means for internalization int
can support collaboration. Although collaboration
is a central part of modern organizations, it has
not yet been focused on in SA research in terms
of bringing together security experts and novices.
4 RESEARCH PROTOTYPE
As mentioned in Section 3.2 there is no existing work
that directly addresses the processes ext(K
i
d(nonSec)
)
and coll. In the following, we, therefore, present
a research prototype of signature-based incident de-
tection, which supports those two knowledge conver-
sions and is conceptualized along with the model of
knowledge-based SA (see Figure 2). To detect indica-
tors and incidents, we apply complex event process-
ing based on a pattern hierarchy. This hierarchical
approach allows observing indicators based on both
observables and other indicators. Additional patterns
can be used to identify actual incidents by correlating
IoCs. The patterns, i.e., signatures required for this
purpose, are interpreted as K
e
in the context of SA and
are accessible for humans using the prototype. For
the sake of clarity, we use the general term of event
whenever it is not necessary to distinguish between
observable, indicator, or incident.
Overall, our prototype aims at minimizing the re-
quired K
i
o
for ext(K
i
d(nonSec)
) by providing centralized,
Bridging Knowledge Gaps in Security Analytics
103
Figure 4: Screenshot of the front end’s landing page with a selected statement.
visual access to the K
e
. Additionally, it supports coll
among humans. Its basic idea is to simplify the cre-
ation and editing of patterns as it detaches them from
a complex, text-based syntax. This central concept
is enabled by visual programming, which has proven
its ability to lower entry barriers of complex systems,
especially in the context of education (Chao, 2016;
S
´
aez-L
´
opez et al., 2016). This makes complex, text-
based syntax easier to understand. Thus, the proto-
type aims to meet the following requirements:
Reduce the Needed K
i
o
for ext(K
i
d
). Although
K
i
o
is necessary for both int and ext, it serves no
other, security-specific purpose. Therefore, the re-
quired K
i
o
should be reduced, especially for S
n
.
They are not familiar with security solutions such
as SIEM systems and to harvest their knowledge
for security purposes, it is necessary to keep the
entry barriers (K
i
o
) as low as possible. Concerning
the notation of K
i
o
being the catalyst of knowledge
conversions, we aim to reduce the catalyst needed
to initiate the conversion.
Enable coll between S
e
and S
n
. Security experts
have the knowledge about possible attack vectors,
however, the knowledge needed to adapt the at-
tack vectors to a specific context often lies within
the domain of non-security experts. Thus, collab-
oration is necessary for identifying as many attack
vectors as possible.
4.1 System Architecture
As shown in Figure 3, the prototype’s architecture is
divided into front end and back end. The back end is
responsible for recognizing indicators and incidents
based on predefined patterns and the front end pro-
vides a user interface allowing to create, edit, and de-
bug these patterns. Please note that the architecture
is completely based on open source technologies and
the source code is available for everyone on GitHub
2
.
4.1.1 Back End
Since the back end is indispensable for the appli-
cation’s functionality, but the actual contribution of
the paper lies in the front end, we will only give a
quick and superficial view of it. In the back end,
indicators and incidents within an event stream are
detected with the help of complex event processing
based on Esper
3
. Events are made available by sev-
eral Data Sources and provided by the Event Broker
based on Apache Kafka
4
as an event stream. The
patterns formulated using the Esper Event Processing
Lanaguage
5
(EPL) are created in the front end and
persisted in the Pattern Storage (MongoDB
6
). Be-
tween the front end and the back end, an API Provider
2
https://github.com/Knowledge-based-Security-Analytics
3
http://www.espertech.com/esper/
4
https://kafka.apache.org/
5
https://esper.espertech.com/release-5.2.0/
6
https://www.mongodb.com/
ICISSP 2021 - 7th International Conference on Information Systems Security and Privacy
104
is implemented as a modern GraphQL
7
API provid-
ing access to the event stream and allowing to create,
update, and delete event patterns.
Figure 5: Screenshot of EPL statement built with Blockly.
4.1.2 Front End
Our front end essentially consists of three main views
embedded in an Angular
8
-based User Interface (UI).
The landing page is divided into two main com-
ponents highlighted with red boxes (A) and (B) in
Figure 3. On the left, component (A) provides an
overview of the defined patterns, including meta-data
as the name, time of the last modification, and the
deployment mode. The deployment mode indicates
whether a pattern is still being developed or has al-
ready been deployed into operations. Additionally,
the pattern overview allows to start the editing of
a pattern or to delete it. Clicking on the “Pencil”-
action (i.e. edit a pattern) brings up the Visual Pat-
tern Builder as described in Section 4.2 with the cur-
rent definition of the pattern. The Visual Pattern
Builder can also be started to define a new pattern
via the “New Statement”-button in the navigation bar.
Please note that the overview also has a tab indicat-
ing Schemes. These define the event types that can
be used to describe a pattern, but as they are very
straight-forward, we will focus on patterns.
The second component (B) of the landing page
contains additional information about a pattern. Be-
sides its ID and EPL statement as used in the Pattern
Matcher this component displays a Live Event Chart
for a quick overview of the pattern’s activities within
the last ten minutes. The bar chart feature on the
bottom of the Live Event Chart shows the full time
window and the number of events registered on the
pattern. The upper part of the event chart displays
an interactively selectable time frame within the last
ten minutes and both the events generated by the Pat-
tern Matcher after having identified a match on spe-
cific sets of source events and the source events them-
7
https://graphql.org/
8
https://angular.io/
selves. Circles of the same color correspond to the
same event type.
4.2 Visual Pattern Builder
This component is used to create new statements or
edit existing ones. The visual code editor Google
Blockly
9
is used for this purpose. Google Blockly
is primarily used in the education context to teach,
for example, the basic principles of programming. It
allows defining specific logical building blocks that
can be snapped together and parametrized by the user.
Internally, these blocks are compiled into working
source code. Blockly has proven its feasibility in low-
ering entry barriers for novice users. Therefore, it is
a good candidate for abstracting the rather complex
syntax and logical flow of the EPL used within the
Pattern Matcher.
In our prototype, we implemented blocks based
on Google Blockly that allow Esper EPL statements’
construction. The main parts of these statements
are event patterns (blue blocks), conditions (green
blocks), and actions (yellow blocks). Figure 5 shows
a simple statement built with Blockly. The displayed
statement instructs the Pattern Matcher if it sees two
instances of “Log Event” followed by each other and
containing the same “srcIp” and “targetIp” attribute,
it should produce an Alert Event” with the respec-
tive attributes. An example of a resulting Esper EPL
statement is displayed in the grey box in component
(B) in Figure 4.
Please refer to our implementation for the band-
width of different EPL structures that we already im-
plemented in Blockly. It includes, besides others,
logical combinations of event sequences (including
“and”, “or”, “not”), counted event sequences and log-
ical conditions. Although we are not yet capable of
representing all possible EPL structures, the concept
and implementation are very promising, and we plan
to come closer to full coverage of the Esper EPL
within future iterations of our work.
4.3 Pattern Debugger
Clicking on the small arrow on the bottom right side
of component (B) brings up the Pattern Debugger for
the respective pattern. It allows testing the created
patterns by providing a detailed view of the events’
data and their relationships. As stated before, one or
multiple observables can lead to indicators and one
or multiple indicators can lead to incidents. This hi-
erarchical order is also represented within the pattern
9
https://developers.google.com/blockly
Bridging Knowledge Gaps in Security Analytics
105
Figure 6: Screenshot of the Pattern Debugger.
debugger. The pattern to debug either detects an indi-
cator or an incident. Subsequently, this view displays
three columns, as shown in Figure 6. The first column
shows all observables, which are the source for any
pattern related to the selected one. The second col-
umn shows all indicators, which lead to the debugged
pattern and the third column shows the output (indi-
cators or incidents) of the pattern itself.
The respective observables, indicators, and inci-
dents are shown in a JSON tree structure. Initially,
each JSON tree is shown collapsed, where a user can
only see the name and a small amount of informa-
tion. If a more detailed investigation of certain events
is necessary, the tree can be expanded, revealing the
next layer of its structure. This enables a view, includ-
ing all details, if a user wants to investigate them.
A critical aim of the pattern debugger is to visu-
alize the hierarchical order between observables, in-
dicators, and incidents. To achieve this, for example,
each incident is highlighted when the user is hovering
over it with the cursor. Additionally, each underly-
ing indicator and preceding observables are also high-
lighted to show the dependencies between them. This
provides an overview of the hierarchical structure of
the underlying correlation engine.
4.4 Discussion of the prototype
As delimited before, the main requirements for the
research prototype are to reduce the needed K
i
o
and
enable the coll between S
e
and S
n
. In the following,
we discuss the degree to which it achieves to meet
these requirements.
The most evident step, the reduction of the needed
K
i
o
for ext(K
i
d
) is taken through the implementation
of visual programming approaches within the Visual
Pattern Builder. Thereby, users can create and ana-
lyze detection rules without being familiar with most
SIEM systems’ complex syntax. Thus, it is easier for
them to contribute their k
i
d
and externalize it into k
e
(Eq. 7). Especially S
n
are enabled to contribute their
k
i
d(nonSec)
to the definition of detection rules.
In addition to the rule creation, the complexity of
rule debugging and testing was simplified. To avoid
overburdening the user, only a very abstract view of
the recognized rules is displayed with a Live Event
Chart. If a user wants to have a more detailed view,
the Pattern Debugger can be used, which itself ini-
tially only displays the event names. If an even more
detailed view is desired, these can be expanded fur-
ther. Therefore, the prototype also simplifies int.
The points mentioned above already partly imply
the fulfillment of these requirements since a reduction
of the required knowledge enables S
n
to participate in
creating rules. This eases the exchange of knowledge
in terms of collaboration, as defined in Equation 9 .
Thus, for example, S
n
can adapt rules of S
e
with their
domain knowledge or vice versa. The prototype’s ar-
chitectural design also supports this knowledge con-
version as patterns are stored centralized and made
available through visual interfaces. Thus, the combi-
nation of int and ext are combined to enable collabo-
ration as any user can access other users’ externalized
knowledge.
Although our prototype points in a promising di-
rection for knowledge-based SA, its current state must
be improved and thoroughly evaluated. We are aware
of a few improvements to be made regarding a fur-
ther simplification of the rule creation. Although the
highly complex Esper EPL is more accessible with
the current version, we aim to make rule creation
even more straightforward. Additionally, the rule de-
bugging functionality needs to be improved to give a
more cohesive insight into the workings of the Pat-
tern Matcher. Additional visual representation could
be applied here. Also, the effects of reducing the nec-
essary K
i
o
have to be empirically measured to assess
the prototype’s full potential. An empirical user study
furthermore needs to examine the effects of collabo-
ration on detection rates and performance of the re-
spective SA methods.
5 CONCLUSION
This paper presents a formal representation of knowl-
edge in the field of Security Analytics. Building on
a formalization, we establish a model of knowledge-
based SA based on the Incident Detection Process.
Therefore, the paper provides a sound basis for fu-
ture research in the field of knowledge-based Secu-
rity Analytics, as it brings the previously non-uniform
and mostly verbal descriptions to a formalized and
consistent level. Several segments of this model are
identified to need more attention from academic re-
search. To provide a first possible approach enabling
externalization of domain knowledge and collabora-
tion between security experts and security novices,
we implement a research prototype system that uses
ICISSP 2021 - 7th International Conference on Information Systems Security and Privacy
106
visual programming to reduce the amount of opera-
tional knowledge needed and make it easier to create
incident signatures.
Although we present a prototype addressing some
of the open challenges in knowledge-based SA, there
is room for future research. First, it is necessary to
further technologically support the collaboration be-
tween security novices and security experts. This pa-
per presents the first approach to this, but the approach
has to be improved together with potential users and
finally evaluated throughout a user study. A second
research direction is to integrate situational knowl-
edge into SA better. Here, initial approaches already
exist in the area of human-as-a-security-sensor, but
these should be further developed.
REFERENCES
Ackoff, R. L. (1989). From data to wisdom. Journal of
Applied System Analysis, (16):3–9.
Ben-Asher, N. and Gonzalez, C. (2015). Effects of cyber
security knowledge on attack detection. Computers in
Human Behavior, 48:51–61.
Chao, P.-Y. (2016). Exploring students’ computational
practice, design and performance of problem-solving
through a visual programming environment. Comput-
ers & Education, 95:202–215.
Chen, M., Ebert, D., Hagen, H., Laramee, R. S., van Liere,
R., Ma, K.-L., Ribarsky, W., Scheuermann, G., and
Silver, D. (2009). Data, information, and knowledge
in visualization. IEEE Computer Graphics and Appli-
cations, 1(29):12–19.
Chen, T. M., Sanchez-Aarnoutse, J. C., and Buford, J.
(2011). Petri net modeling of cyber-physical attacks
on smart grid. IEEE Transactions on Smart Grid,
2(4):741–749.
Davenport, T. H. and Prusak, L. (2000). Working knowl-
edge: How organizations manage what they know.
Harvard Business School Press, Boston, Mass.
Dietz, M., Vielberth, M., and Pernul, G. (2020). Integrating
digital twin security simulations in the security opera-
tions center. In Proceedings of the 15th International
Conference on Availability, Reliability and Security
(ARES), pages 1–9, New York, NY, USA. ACM.
Eckhart, M. and Ekelhart, A. (2018). Towards security-
aware virtual environments for digital twins. In Pro-
ceedings of the 4th ACM Workshop on Cyber-Physical
System Security - CPSS ’18, pages 61–72. ACM Press.
Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P.
(1996). From data mining to knowledge discovery in
databases. AI Magazine, 17(3):37.
Federico, P., Wagner, M., Rind, A., Amor-Amor
´
os, A.,
Miksch, S., and Aigner, W. (2017). The role of ex-
plicit knowledge: A conceptual model of knowledge-
assisted visual analytics. In Proceedings of the IEEE
Conference on Visual Analytics Science and Technol-
ogy (VAST).
Heartfield, R. and Loukas, G. (2018). Detecting semantic
social engineering attacks with the weakest link: Im-
plementation and empirical evaluation of a human-as-
a-security-sensor framework. Computers & Security,
76:101–127.
Jaeger, L. (2018). Information security awareness: Liter-
ature review and integrative framework. In Proceed-
ings of the 51st Hawaii International Conference on
System Sciences. Hawaii International Conference on
System Sciences.
Loukas, G. (2015). Cyber-Physical Attacks. Butterworth-
Heinemann.
Mahmood, T. and Afzal, U. (2013). Security analytics: Big
data analytics for cybersecurity: A review of trends,
techniques and tools. In 2013 2nd National Confer-
ence on Information Assurance (NCIA), pages 129–
134. IEEE.
Menges, F. and Pernul, G. (2018). A comparative analysis
of incident reporting formats. Computers & Security,
73:87–101.
Nonaka, I. and Takeuchi, H. (1995). The Knowledge Creat-
ing Company. Oxford University Press.
Polanyi, M. (2009). The Tacit Dimension. University of
Chicago Press, Chicago.
Ponsard, C. and Grandclaudon, J. (2020). Guidelines and
tool support for building a cybersecurity awareness
program for smes. In Information Systems Secu-
rity and Privacy, volume 1221 of Communications in
Computer and Information Science, pages 335–357.
Springer, Cham.
Sacha, D., Stoffel, A., Stoffel, F., Kwon, B. C., Ellis, G., and
Keim, D. (2014). Knowledge generation model for
visual analytics. IEEE Transactions on Visualization
and Computer Graphics, 20(12):1604–1613.
S
´
aez-L
´
opez, J.-M., Rom
´
an-Gonz
´
alez, M., and V
´
azquez-
Cano, E. (2016). Visual programming languages in-
tegrated across the curriculum in elementary school.
Computers & Education, 97:129–141.
Sallos, M. P., Garcia-Perez, A., Bedford, D., and Orlando,
B. (2019). Strategy and organisational cybersecurity:
a knowledge-problem perspective. Journal of Intel-
lectual Capital, 20(4):581–597.
Schneier, B. (2015). Secrets and Lies: Digital Security in a
Networked World. John Wiley & Sons, 15. edition.
Schneier, B. (2018). Click here to kill everybody: Security
and survival in a hyper-connected world. W.W. Nor-
ton & Company, New York, 1. edition.
Thalmann, S. and Ilvonen, I. (2020). Why should we in-
vestigate knowledge risks incidents? - lessons from
four cases. In Bui, T., editor, Proceedings of the 53rd
Hawaii International Conference on System Sciences.
Vasileiou, I. and Furnell, S. (2019). Personalising secu-
rity education: Factors influencing individual aware-
ness and compliance. In Information Systems Secu-
rity and Privacy, volume 977 of Communications in
Computer and Information Science, pages 189–200.
Springer, Cham.
Vielberth, M., Menges, F., and Pernul, G. (2019). Human-
as-a-security-sensor for harvesting threat intelligence.
Cybersecurity, 2(1).
Bridging Knowledge Gaps in Security Analytics
107
Wagner, M., Rind, A., Th
¨
ur, N., and Aigner, W. (2017). A
knowledge-assisted visual malware analysis system:
Design, validation, and reflection of kamas. Comput-
ers & Security, 67:1–15.
Wang, X., Jeong, D. H., Dou, W., Lee, S.-W., Ribarsky, W.,
and Chang, R. (2009). Defining and applying knowl-
edge conversion processes to a visual analytics sys-
tem. Computers & Graphics, 33(5):616–623.
Zimmermann, V. and Renaud, K. (2019). Moving from
a “human-as-problem” to a “human-as-solution” cy-
bersecurity mindset. International Journal of Human-
Computer Studies, 131:169–187.
ICISSP 2021 - 7th International Conference on Information Systems Security and Privacy
108