Classifying Malicious Thread Behavior in PaaS Web Services
Cemile Diler Özdemir (1,2), Mehmet Tahir Sandıkkaya (2) and Yusuf Yaslan (2)
(1) Corporate Technology, Development Center, Siemens AS, Istanbul, Turkey
(2) Computer Engineering Department, Istanbul Technical University, Istanbul, Turkey
Keywords:
Cloud Security, PaaS, Malicious Behavior, Machine Learning.
Abstract:
The multitenant structure of the PaaS cloud delivery model allows customers to share
platform resources in the cloud. However, this structure requires a strong security
mechanism that isolates customer applications to prevent interference between them.
In this paper, a malicious thread behavior detection framework using machine learning
algorithms is proposed to classify whether user requests are malicious. The framework
uses resource usage metrics of worker threads and N-Gram frequencies of operations as
its features. The framework is tested on a real-life scenario using the Random Forest,
AdaBoost and Bagging ensemble learning algorithms, and the results are evaluated using
several accuracy metrics. The proposed system is found to detect 87.6% of the
malicious requests.
1 INTRODUCTION
Cloud computing is a popular way for companies to obtain on-demand network
access to outsourced resources, delivered as infrastructure, platform or
software, with minimum effort. Cisco reports that cloud data center workloads
will triple from 2015 to 2020 (Networking, 2017). Considering this growth,
cloud computing vendors focus on security research to keep pace with the rapid
development of this technology (Banerjee et al., 2013).
Most of the PaaS providers offer web application
platforms to their customers because PaaS develop-
ment is web oriented. This strategy is beneficial both
for the PaaS providers and for the PaaS customers.
Since scripting languages (Ruby and Python) and vir-
tualized platforms (Java and .NET) are commonly
used in web development in recent years, providers
build their servers on these popular technologies. In
addition, cloud customers quickly customize their
existing web applications and deliver them to the
providers to be served in the cloud. Thus, deploying
many customers' applications becomes an easy and
cheap process for PaaS providers. The common benefit
is therefore rapid adoption of the cloud.
These advantages, however, come with a major
flaw. Different cloud customers share PaaS platform
resources (hardware, software, services, configuration,
etc.), which requires isolation between customer
applications to prevent interference between them.
This interference can occur unintentionally or
maliciously. For instance, a faulty application can
consume most of the memory or CPU of a platform that
serves many customers. Other customers are then
affected, even though there is no deliberate attack
on the platform. In addition, maliciously acting
customers can execute code to attack other customers
or the platform. Availability, confidentiality and
integrity of PaaS are threatened for these reasons
(Modi et al., 2013). PaaS providers need a strong
security mechanism to protect and isolate their
customer applications and the platform.
Currently, PaaS customers are limited to web applications
by the leading PaaS providers Google [1], Heroku [2] and
Amazon [3]. The providers may limit customer applications'
access to trivial resources such as files or sockets via
carefully set up permissions. However, memory and CPU are
shared among multiple threads in web applications, and
per-request user behavior cannot be traced (Sandıkkaya et
al., 2014).
Several systems have already been proposed for
cloud security, such as host based intrusion detection
systems (Arshad et al., 2012), network based intrusion
detection systems (Hamad and Al-Hoby, 2012),
distributed intrusion detection systems (Sanjay Ram,
2012) and hypervisor based intrusion detection systems
(Garfinkel et al., 2003). However, these systems were
designed to run at the operating system, system virtual
machine or hypervisor level. They do not consider the
isolation of several different customer applications
hosted in the same process virtual machine. In addition,
only the application and data layers are manageable by
the customer in the PaaS service model. The base layers
are managed by the cloud provider. Figure 1 presents the
PaaS service model and its layers. The customers cannot
manage the underlying cloud infrastructure, including
network, operating systems or servers, in the PaaS model.
[1] https://cloud.google.com/appengine/
[2] https://www.heroku.com/
[3] https://aws.amazon.com/elasticbeanstalk/
Özdemir, C., Sandıkkaya, M. and Yaslan, Y.
Classifying Malicious Thread Behavior in PaaS Web Services.
DOI: 10.5220/0006688204180425
In Proceedings of the 8th International Conference on Cloud Computing and Services Science (CLOSER 2018), pages 418-425
ISBN: 978-989-758-295-0
Copyright © 2019 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
Hardware and software level isolation mechanisms
have also been utilized in PaaS. Software level
mechanisms isolate threads, processes or virtual
machines of different tenants (Bazm et al., 2017).
Heroku uses container-based isolation, which groups
operating system processes by kernel namespaces
and resource allocations to isolate them from other groups.
Docker [4] is one of the most popular open-source
container platforms and has been adopted by
many PaaS providers. In addition, Cloud Foundry [5]
isolates its tenants using a user-based isolation
mechanism. This is a traditional and widely used technique
in which each application runs as a different user on the
operating system. However, sharing the same process
virtual machine environment among multiple tenants needs
runtime-based isolation mechanisms (Zhang et al.,
2014).
In this paper a runtime-based security framework
is proposed for multitenant PaaS providers to detect
malicious behavior of threads using machine learning.
The main contributions of this paper are summarized
as follows:
Thread behavior detection framework: Thread
behavior is detected with the proposed framework,
which can be integrated into a cloud customer's web
application with minimum effort. The framework is
designed for PaaS providers that host many customer
web applications in the same application server. In this
deployment scenario, the customers' web applications
reside in the same operating system process.
Within this process, worker threads serve the web
applications. The proposed framework measures worker
threads' resource usage metrics and uses them to
classify and detect malicious behavior.
Well selected metrics: All necessary metrics are
measurable at the web application level. Measurement
of the features is independent of the operating
system, programming language and cloud provider.
Therefore, features can be collected without any
dependency. The features are also privacy friendly:
they do not contain any sensitive data of the cloud
providers or the cloud customers.
[4] https://www.docker.com/
[5] https://www.cloudfoundry.org/
Figure 1: PaaS deployment model: The physical computer
hosts a hypervisor that monitors the operating systems
through system virtual machines. This level of abstraction
is mostly known as IaaS. On top of the operating systems, many
process virtual machine instances can be run. Each of
these process virtual machines (e.g. JVM) may isolate an
application, but not the threads within the same application.
This level of abstraction is mostly known as PaaS. The
process virtual machine may be configured as a web application
server and the threads may belong to different web applications.
The main difference between the proposed framework
and intrusion detection systems is that it does not
monitor network activity or use any feature of the
network connection; it focuses directly on the running
thread. The framework's main difference from a virus
(malware) scanner is that it detects malicious threads
rather than files. Moreover, this detection is done by
considering access to the critical OS resources, without
inspecting the whole code sequence of the thread.
The rest of this paper is organized as follows:
Section 2 gives the background of the subject. In
Section 3, the proposed security mechanism is described.
Section 4 presents how the experimental setup is
organized. Section 5 describes the feature set and the
classification algorithms applied to the data. The results
of the experiments are presented in Section 6. Finally,
Section 7 concludes the paper and presents a discussion
of the results.
2 RELATED WORK
There is a vast amount of research in the literature
on malicious behavior detection systems that use machine
learning algorithms (Fan et al., 2016), (Modi
et al., 2013). Most of the existing works deal with
intrusion detection systems and try to classify
malicious communications to avoid external
attacks on the perimeterized computer networks of an
organization. These models generally extract the feature
vector from data packets, user input command
sequences, log files, low-level system information and
CPU/memory usage (Wu and Banzhaf, 2010). However,
these intrusion detection systems differ from the
proposed mechanism as they always assume a trusted
perimeter to be protected. As a result, they hardly
detect malicious activity that originates from
insiders. Note that, in PaaS deployments, the providers
willingly accept the customers as tenants. Malicious
activities may then originate from these tenants
or the tenants' users. Therefore, an intrusion detection
system cannot detect malicious user behavior once the
request has been accepted to access internal resources.
In practice, an intrusion detection system may exist
independently of the proposed mechanism and be
controlled by the PaaS provider to protect the whole PaaS
deployment against third party attackers, e.g.
denial-of-service attackers.
Application program interface (API) calls and machine
instructions are the most commonly used features of
intelligent malware detection systems (Bazrafshan et al.,
2013). Pirscoveanu et al. (2015) used the sequence,
frequency and count of Windows platform system calls
as the main features for classification with the
Random Forest algorithm. Several malware samples can be
dynamically classified in parallel using this approach.
Fan et al. (2016) also used API calls as features
of their Malicious Sequential Pattern Malware
Detection (MSPMD) framework. They applied a modified
Generalized Sequential Pattern algorithm for sequence
mining, together with an All-Nearest-Neighbor classifier,
to Windows Portable Executable (PE) samples. Uppal
et al. (2014) applied the N-Gram algorithm to extract
features from API sequences. Shabtai et al. (2012)
utilized machine instruction data and extracted features
from opcodes as N-Gram patterns. Several classification
algorithms were applied to the extracted feature vector
to detect unknown malicious code.
The N-Gram algorithm is also used in web prediction
models. Su et al. (2000) utilized an N-Gram
model to predict the future requests of users. This
probabilistic prediction model aims to make the best guess
of a user's next actions based on previous actions.
Although they do not focus on users' malicious
behavior, it is foreseen that the N-Gram prediction model
can be adapted to predict malicious behavior of users. In
addition, N-Gram has been applied to API call and machine
instruction data in previous security studies (Uppal
et al., 2014), (Shabtai et al., 2012). However, these
studies can provide only system level detection and
prevention mechanisms in the cloud. In addition, collecting
the mentioned data is a time and resource consuming
process. Moreover, one may argue that the privacy of the
PaaS user is not considered during data collection. The
proposed mechanism observes only thread behavior and
collects processor usage and requested operation
sequences at runtime. Finally, the framework classifies
malicious behavior of the PaaS user using the collected
feature vector.
Given the need for malicious thread execution detection
in the PaaS cloud and the favorable results of the
mentioned malware detection techniques, a malicious thread
behavior detection framework for the PaaS cloud is
proposed using machine learning algorithms.
3 PROPOSED MECHANISM
The proposed mechanism covers a wide range of scenarios
offered in the current PaaS ecosystem. A PaaS
system provides its computational power to its customers
through threads, and these customers can reside
in the same process virtual machine. In that
case, there is a risk of interference between different
customers, in addition to the risk to the PaaS provider's
resources. In such an adversarial model, the aim of the
proposed method is not to completely isolate cloud
customers' access to the platform resources. The aim is
to determine whether the cloud customer
is acting maliciously. This malicious act can occur
intentionally or unintentionally. The maliciously acting
threads can be stopped, or at least kept away from
accessing more resources, right after they are classified
as malicious by the proposed security framework.
It is assumed that a PaaS customer's cloud application
resides in a JVM and that the proposed framework is
deployed with it. The framework focuses on request-based
analysis of web application users. Fundamentally, the
mechanism analyses each worker thread separately
per request of each user, so it is even capable of
classifying a one-time unintentional malicious activity of
a trustworthy user.
The framework runs on the PaaS provider side,
together with the PaaS customers' cloud applications. It
monitors thread behavior, and the collected information
is analyzed offline to train a classifier that
is used for real-time decision making.
Java Management Extensions (JMX) is utilized to
measure the processor share, memory and average time
consumption of a user request. In addition, access to
each PaaS resource is wrapped in a separate method;
these methods are called checkpoints. Checkpoints are
defined at the entry and exit of resource consuming
system methods and are implemented using aspects
(Kiczales et al., 1997), which brings two advantages.
First, the time consumption of each resource access can
be collected per request without affecting the privacy of
the customers. The only leaked data is the sequence of
resource accesses and the time consumed in each
resource. Second, the checkpoints are programmed to
disrupt the execution of a thread after its behavior
is classified as malicious by the framework. This
is beneficial because threads cannot request any
more resources once they are classified as malicious.
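The checkpoint idea can be illustrated with a minimal sketch. The paper implements checkpoints with aspects inside a JVM; the Python context manager below is only an analogy, and the names `RequestTrace`, `MaliciousRequestError` and `checkpoint` are hypothetical, not part of the framework.

```python
import time
from contextlib import contextmanager


class RequestTrace:
    """Per-request record kept by the checkpoints (hypothetical structure)."""

    def __init__(self):
        self.operations = []  # access types in call order, e.g. ["Read", "Add"]
        self.durations = []   # seconds spent inside each checkpoint
        self.blocked = False  # set once the request is classified as malicious


class MaliciousRequestError(RuntimeError):
    """Raised when a flagged request tries to access another resource."""


@contextmanager
def checkpoint(trace, operation):
    # Entry: a request already classified as malicious gets no more resources.
    if trace.blocked:
        raise MaliciousRequestError(f"{operation} denied")
    start = time.perf_counter()
    try:
        yield
    finally:
        # Exit: only the access type and its duration are recorded,
        # never the payload of the operation.
        trace.operations.append(operation)
        trace.durations.append(time.perf_counter() - start)
```

A resource access would then be wrapped as `with checkpoint(trace, "Read"): ...`, so that the operation sequence and per-access durations accumulate on the trace of the current request.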
The proposed mechanism detects maliciously acting
threads on the cloud platform using machine learning
techniques. First, the classification features to be
measured by the proposed framework are selected. Instant
CPU usage and cumulative CPU usage per request are
two attributes of the feature vector and are measured by
JMX. The feature vector contains three more
attributes per request: resource access duration,
resource access type and resource access sequence.
The access type feature is mapped to the
CRUD (create, read, update, delete) functions and
contains information about the requested function. The
sequence feature holds the order of the requested
functions. The sequence of these operations is
informative because it yields both the frequency of each
operation and the transitions between operations.
The proposed mechanism processes the feature vector
using the N-Gram algorithm. An N-Gram is a contiguous
sequence of N items from a given list of items, and an
N-Gram model predicts the next item from the preceding
ones. In natural language processing these items can be
letters or words, in speech recognition they can be
phonemes, and in malware analysis they can be system
calls or machine instruction sequences. The operation
sequence is one of the features in the proposed mechanism
and the set of operations is represented as
O = {Add, Delete, Read, Update}. Table 1
presents sample tokenized operation sequences
and their types to illustrate the structure of the training data.
The sequence data is represented as a string, and its
tokens are converted into a set of new attributes using
the Weka NGramTokenizer API with the values $N_{min} = 1$
and $N_{max} = 5$. After this filtering process, the feature
vector has 132 new attributes, derived from the occurrence
frequency of the transitions between operations
in the sequence text data. Operation transitions
are visualized in Figure 3.
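The tokenization step can be sketched in a few lines. This is a plain-Python stand-in written for illustration, not the Weka NGramTokenizer itself; the function name `ngram_counts` and its parameters are assumptions. It splits an operation sequence on whitespace and counts every contiguous gram of length 1 to 5.

```python
from collections import Counter


def ngram_counts(sequence, n_min=1, n_max=5):
    """Count all contiguous word n-grams with n_min <= n <= n_max."""
    tokens = sequence.split()
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # A sequence shorter than n simply contributes no grams of size n.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Second sample row of Table 1.
features = ngram_counts("Read Read Add Update")
```

Each distinct gram observed over the whole training set would become one attribute of the feature vector, with its count as the value for a given request.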
Classification algorithms are applied to the training
data after the feature vector is measured and tokenized
with N-Gram. Different classification algorithms have
been evaluated for detecting malicious threads.
After comparing the test results, the Random Forest
classifier is integrated into the proposed framework as the
most accurate classifier.
The proposed framework uses runtime observation to
classify a request. Figure 2 shows the classification
process for runtime observations. The framework is
integrated into an experimental cloud web application,
which is explained in the next section. This behavior
based system does not require a signature database,
and the framework can be integrated into a cloud
application with minimum effort.
4 EXPERIMENTAL SETUP
The proposed mechanism is tested on a demo cloud
system that contains an event ticketing application
connected to a relational database. Conventional
paths of usage are recorded and repeated with Apache
JMeter to reproduce a set of regular requests. Real-
istic attack scenarios are considered and also added
to the query set of JMeter. Experimental ticketing
cloud application is composed of basic user opera-
tions that may be mapped to CRUD (create, read, up-
date, delete) operations on a database. These four
main operations are: add, delete, read, and update.
Depending on the payload of the request, an event, a
user or a ticket may be added to, read from, updated in
or deleted from the application. The application also
includes many media elements such as text, graphics,
audio and video. The training set contains add, delete,
read and update functions, each either as a regular or as
a malicious operation. Regular requests are defined as
common user behavior depending on the cloud application's
scope and goal. Malicious operations, on the other hand,
are selected from a wide set of possibilities: unexpected
content may be added to the database, a whole table
may be dropped, a large set of bogus data may be
inserted, etc. The attack scenarios are produced by
considering cross-site scripting, SQL injection, database
modification and file system access scenarios.
Figure 2: The overall architecture of the proposed framework for malicious thread behavior detection. The framework extracts
four features from any PaaS web service deployment in a cost-effective and privacy-friendly way; instant CPU share of a
thread in the web service, CPU share of a thread over a period of time, a thread’s access type (create, read, update, delete)
to critical resources (databases, files, sockets, etc.) and finally the duration of this access. These features are enriched with
N-Gram based on access sequence during training phase. During testing, a thread’s behavior is classified based on previous
knowledge.
The final data set consists of ten subsets, each
containing nearly 1% malicious requests, and the total
number of queries is 100 000. Each experiment is
conducted with 10 000 requests, of which nearly 100 are
malicious, and the experiment is repeated 10 times. In the
final set, there are 1000 malicious requests and 99 000
regular requests. The results are presented as the
average of the 10 independent experiments.
5 USED FEATURE SET AND CLASSIFICATION ALGORITHMS
In this paper, the N-Gram feature extraction algorithm is
applied to the feature set; a brief description is given
in Subsection 5.1. After feature extraction, the
ensemble learning algorithms Random Forest, Bagging
and AdaBoost are run on the data to evaluate their
accuracy in the proposed framework. These classification
algorithms are described in Subsections
5.2, 5.3 and 5.4, respectively.
5.1 N-Gram Features
An N-Gram models a sequence of n elements, which can
be letters, words or phonemes. In this paper, the
N-Gram probabilistic model predicts the type of the
next operation $X_i$ based on the previous operation
sequence $X_{i-(n-1)}, X_{i-(n-2)}, \ldots, X_{i-1}$. The
likelihood of the next element in the sequence is denoted
$P(X_i \mid X_{i-(n-1)}, X_{i-(n-2)}, \ldots, X_{i-1})$
and is based on an $(n-1)$-order Markov model.
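As an illustration of this conditional probability, the sketch below estimates $P(X_i \mid X_{i-(n-1)}, \ldots, X_{i-1})$ by relative frequency over a few operation sequences. The maximum-likelihood estimate and the function names are assumptions made for the example; the paper does not state how the probabilities are estimated.

```python
from collections import Counter, defaultdict


def fit_ngram_model(sequences, n):
    """Count how often each operation follows each (n-1)-operation context."""
    context_counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(n - 1, len(seq)):
            context = tuple(seq[i - (n - 1):i])
            context_counts[context][seq[i]] += 1
    return context_counts


def next_op_probability(model, context, op):
    """Relative-frequency estimate of P(op | context); 0.0 for unseen contexts."""
    counts = model.get(tuple(context))
    if not counts:
        return 0.0
    return counts[op] / sum(counts.values())


# Toy sequences over O = {Add, Delete, Read, Update}; n = 2 gives a
# first-order Markov model.
sequences = [
    ["Read", "Read", "Add", "Update"],
    ["Read", "Add"],
    ["Delete", "Delete", "Delete"],
]
model = fit_ngram_model(sequences, n=2)
```

With these toy sequences, "Read" is followed by "Add" in two of its three occurrences as a context, so the estimate of P(Add | Read) is 2/3.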
The proposed framework applies the Weka API
NGramTokenizer filter to the collected operation sequence
data. Example data is shown in Table 1. N-Gram splits
this sequence data into grams between the minimum and
maximum sizes. It calculates the frequencies and
transitions of each operation, as illustrated in Figure 3.
An N-Gram model can store more context with a larger N.
Storing more context provides better prediction but
requires more memory and time. Conversely, prediction
gets worse with a small N, while memory usage and time
consumption decrease. Consequently, N is taken from the
interval [1, 5], which provides the best trade-off between
prediction accuracy, time consumption and memory usage in
the proposed framework. After the calculation,
NGramTokenizer generates a new feature vector that
contains the frequency of each gram.
5.2 Random Forest Classifier
Random Forest is an ensemble learning method that
grows many random classification or regression trees.
The trees vote for the most popular class and the result is
the combination of these tree predictors. It runs
efficiently on large datasets. Moreover, the Random Forest
algorithm has randomness in tree construction, which
minimizes the correlation between trees. In addition, the
Random Forest algorithm does not overfit the data (Breiman, 2001).
Rapidly increasing cloud data traffic requires memory
and time efficient algorithms that can run on big data.
These characteristics make the Random Forest algorithm
applicable to cloud security. The pseudocode of the
Random Forest algorithm is given in Algorithm 1.

Table 1: Sample data for N-Gram classification. N-Gram classifies
regular or malicious threads only by the sequence of their access
types. It does not check which resource the threads access or for how
long. It also does not check the parameters of the operations. Such
information is fed into the classification model by other features.
Operation Sequence        Request Type
Read Read                 Regular
Read Read Add Update      Regular
Delete Delete Delete      Malicious
Update                    Regular

Figure 3: Transitions, $S_t \rightarrow S_{t+1}$, between states. States
represent resource access types and transitions can be evaluated
with this state machine.
5.3 Bagging
Bagging (Mamitsuka et al., 1998), also called Boot-
strap Aggregating, is an ensemble technique that uses
classifiers trained on instances generated by randomly
drawn examples, with replacement. Therefore, each
classifier in the ensemble is obtained with a different
random sampling of the training dataset. The final de-
cision is given by majority vote over individual clas-
sifiers’ outputs.
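A minimal sketch of bootstrap aggregating follows. The base learner here is a one-feature threshold stump chosen only to keep the example short (the experiments in this paper use a J48 decision tree base classifier), and the toy samples, feature meaning and function names are assumptions.

```python
import random
from collections import Counter


def train_stump(samples):
    """Pick the threshold on one numeric feature with the fewest training
    errors; the stump predicts 1 (malicious) for values above it."""
    best_threshold, best_errors = None, None
    for threshold, _ in samples:
        errors = sum((x > threshold) != bool(y) for x, y in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagging_fit(samples, n_estimators, rng):
    """Train each stump on a bootstrap sample drawn with replacement."""
    models = []
    for _ in range(n_estimators):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_stump(bootstrap))
    return models


def bagging_predict(models, x):
    """Majority vote over the individual stumps' outputs."""
    votes = Counter(int(x > threshold) for threshold in models)
    return votes.most_common(1)[0][0]


# Toy data: (hypothetical CPU share of a request, label), 1 = malicious.
samples = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0), (0.8, 1), (0.9, 1)]
models = bagging_fit(samples, n_estimators=11, rng=random.Random(0))
```

An odd number of estimators avoids ties in the two-class majority vote.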
Algorithm 1: Random Forest generates an ensemble
of trees using randomly selected instances and features.
procedure RANDOMFOREST
    f ← features
    T ← number of trees
    H ← ∅
    for i = 1 to T do
        n_i ← bootstrap samples from original data
        g_i ← GROWTREE(n_i, f)
        H ← H ∪ {g_i}
procedure GROWTREE(n, f)
    f_i ← subset of f
    tree ← best split among f_i
    return tree
5.4 AdaBoost
AdaBoost, short for Adaptive Boosting, is a successful
boosting algorithm that constructs a strong classifier
$H(x)$ as a linear combination of weak classifiers
$h_t(x)$, as shown in Equation 1. The class label $H(x)$
is predicted by calculating the weighted average of the
weak predictions $h_t(x)$. The weight $\alpha_t$
is based on the classifier's error rate, which is the
number of misclassified instances in the training set
divided by the training set size.
The AdaBoost algorithm was previously used in network
intrusion detection because of its low computational
complexity, high detection rate and low
false-alarm rate (Hu et al., 2008). Because of these
advantages, it is one of the classifiers whose accuracy
is measured in the proposed framework.
$$H(x) = \mathrm{sign}\left( \sum_{t=1}^{T} \alpha_t h_t(x) \right) \qquad (1)$$
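Equation 1 can be sketched as a weighted vote over weak classifiers that output +1 (malicious) or -1 (regular). The weight formula $\alpha_t = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}$ used below is the usual AdaBoost choice and is an assumption here; the text only states that $\alpha_t$ is based on the error rate $\epsilon_t$. The function names are hypothetical.

```python
import math


def classifier_weight(error_rate):
    """Weight of a weak classifier from its training error rate
    (standard AdaBoost formula, assumed; requires 0 < error_rate < 1)."""
    return 0.5 * math.log((1.0 - error_rate) / error_rate)


def strong_classify(weak_classifiers, alphas, x):
    """H(x) = sign(sum_t alpha_t * h_t(x)), with each h_t(x) in {-1, +1}.
    A zero score is mapped to +1 by convention in this sketch."""
    score = sum(a * h(x) for a, h in zip(alphas, weak_classifiers))
    return 1 if score >= 0 else -1
```

A weak classifier with an error rate of 0.5 is no better than guessing and receives weight 0, while lower error rates yield larger weights and more influence on the vote.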
6 EXPERIMENTAL RESULTS
The experimental results are obtained using ten-fold
cross validation. The results are evaluated using the
following metrics: percentage of incorrectly classified
instances, precision, recall and F-measure. These
measurements are computed from the True Positive (TP),
False Positive (FP), True Negative (TN) and False
Negative (FN) counts defined by the confusion matrix in
Table 2. The calculation of the incorrectly classified
percentage is given in Equation 2; precision, recall and
F-Measure are given in Equation 3, Equation 4 and
Equation 5, respectively.
Table 2: Confusion matrix legend.
                    Predicted Malicious   Predicted Regular
Actual Malicious    TP                    FN
Actual Regular      FP                    TN
Table 3 shows the different classifiers' results without
resource access sequence data; the operation sequence
feature is not used in this experiment. This feature
set contains only the processor usage, resource usage
duration and base operation type metrics. The Random
Forest classification results on this dataset have a large
number of incorrectly classified instances. In addition,
the Bagging and AdaBoost algorithms with a J48 decision
tree base classifier do not obtain better accuracy than
the Random Forest algorithm.
Table 4 shows the classification results with N-Gram
feature extraction. The operation sequence features are
filtered with the values $N_{min} = 1$ and $N_{max} = 5$;
the feature extraction process in the framework is shown in
Figure 2. After the extraction, the new feature vector has
the attributes processor usage, resource usage, base
operation type and the generated N-Gram features.
Table 4 has much more accurate results than Table 3
for all classifiers, although the same classification
algorithms are run. For Random Forest, the number of
misclassified instances drops from 888 to 162 once the
operation sequence feature is collected by the framework
and processed with the N-Gram module.
The Random Forest algorithm obtains the most accurate
and promising result in Table 4. Only 162 instances
out of 100 000 are labeled incorrectly. In addition,
only 38 regular instances out of 99 000 are classified
as malicious, which shows that the proposed framework has
a low false-alarm rate. 124 malicious instances out of
1000 are classified as regular requests. Table 5 shows
the confusion matrix of the Random Forest algorithm with
N-Gram feature extraction. The precision given in
Table 4 indicates that an instance predicted as malicious is
classified correctly with a probability of 0.9584 by the
proposed framework. In addition, the recall shows that
a malicious request is detected by the framework with a
probability of 0.876. Since the classes are unbalanced,
F-Measure is used as the success criterion of the
framework, with 1 being its best value. The
F-Measure of 0.9153 indicates that the proposed
framework is accurate on both the malicious and regular
classes.
$$\mathrm{Misclassified} = \frac{FP + FN}{FP + TP + FN + TN} \qquad (2)$$
$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (3)$$
Table 3: Classifiers' results without resource access sequence data.
                   Random Forest   Bagging   AdaBoost
Misclassified %    0.888           0.9111    0.986
Precision          0.6204          0.6164    0.5133
Recall             0.299           0.242     0.291
F-Measure          0.4011          0.3457    0.3690
Table 4: Classifiers' results enhanced with N-Gram feature extraction.
                   Random Forest   Bagging   AdaBoost
Misclassified %    0.162           0.192     0.2
Precision          0.9584          0.9646    0.9381
Recall             0.876           0.839     0.857
F-Measure          0.9153          0.8968    0.8954
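$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (4)$$
$$\mathrm{FMeasure} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (5)$$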
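As a worked check of Equations 2 to 5, the short sketch below applies them to the Random Forest confusion matrix reported in Table 5 (TP = 876, FN = 124, FP = 38, TN = 98962). The function name `evaluate` is an assumption made for the example.

```python
def evaluate(tp, fn, fp, tn):
    """Apply Equations 2-5 to the counts of a binary confusion matrix."""
    total = tp + fn + fp + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "misclassified": (fp + fn) / total,  # fraction; multiply by 100 for %
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# Counts taken from the Random Forest confusion matrix (Table 5).
metrics = evaluate(tp=876, fn=124, fp=38, tn=98962)
```

The resulting values agree with the Random Forest column of Table 4 to the reported precision: 0.162% misclassified, precision 0.9584, recall 0.876 and F-Measure 0.9153.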
7 DISCUSSION AND CONCLUSION
This paper proposes a malicious thread behavior
detection framework for use by PaaS providers.
The approach utilizes machine learning techniques
and is especially beneficial in the current multitenant
PaaS ecosystem, where cloud customers may share
resources within the same OS process. Multitenancy
requires strong isolation between customer
applications. The proposed framework supports such
isolation by monitoring thread behavior and aims
to satisfy cloud customers' need for security and
isolation. The framework is deployed at the
application level and can be integrated into any web
service application in the PaaS cloud; its only
requirement is a standard JVM.
Table 5: Confusion matrix of the proposed framework. The
proposed framework runs the Random Forest classifier with
N-Gram feature extraction added to the feature set.
                    Predicted Malicious   Predicted Regular
Actual Malicious    876                   124
Actual Regular      38                    98962
The proposed mechanism investigates several machine
learning techniques and combines them. First, the
N-Gram module filters the operation sequence. The
filtered sequence is combined with the resource usage
metrics. Then, the proposed framework classifies the
requests as regular or malicious using the combined
measured metrics. The classification module is built
on the training data. In the proposed framework, the
Random Forest classifier is able to detect a malicious
request with a probability of 0.876.
Better results may be obtainable with the proposed
framework in the future by using more precise
measurements and different metrics. Improved frameworks
would be applicable to malicious behavior detection
in the PaaS cloud domain in the near future. As future
work, the scenarios of the proposed framework will be
extended using different cloud applications and
different metrics.
REFERENCES
Arshad, J., Townend, P., and Xu, J. (2012). An abstract
model for integrated intrusion detection and severity
analysis for clouds. Cloud Computing Advancements
in Design, Implementation, and Technologies, 1.
Banerjee, C., Kundu, A., Basu, M., Deb, P., Nag, D., and
Dattagupta, R. (2013). A service based trust man-
agement classifier approach for cloud security. In Ad-
vanced Computing Technologies (ICACT), 2013 15th
International Conference on, pages 1–5. IEEE.
Bazm, M.-M., Lacoste, M., Südholt, M., and Menaud, J.-M.
(2017). Side Channels in the Cloud: Isolation Challenges,
Attacks, and Countermeasures. Working paper
or preprint.
Bazrafshan, Z., Hashemi, H., Fard, S. M. H., and Hamzeh,
A. (2013). A survey on heuristic malware detection
techniques. In Information and Knowledge Technol-
ogy (IKT), 2013 5th Conference on, pages 113–120.
IEEE.
Breiman, L. (2001). Random forests. Machine learning,
45(1):5–32.
Fan, Y., Ye, Y., and Chen, L. (2016). Malicious sequen-
tial pattern mining for automatic malware detection.
Expert Systems with Applications, 52:16–25.
Garfinkel, T., Rosenblum, M., et al. (2003). A virtual
machine introspection based architecture for intrusion
detection. In Ndss, volume 3, pages 191–206.
Hamad, H. and Al-Hoby, M. (2012). Managing intrusion
detection as a service in cloud networks. International
Journal of Computer Applications, 41(1).
Hu, W., Hu, W., and Maybank, S. (2008). Adaboost-
based algorithm for network intrusion detection. IEEE
Transactions on Systems, Man, and Cybernetics, Part
B (Cybernetics), 38(2):577–583.
Kiczales, G., Lamping, J., Mendhekar, A., Maeda, C.,
Lopes, C., Loingtier, J.-M., and Irwin, J. (1997).
Aspect-oriented programming. ECOOP'97 Object-Oriented
Programming, pages 220–242.
Mamitsuka, N. A. H. et al. (1998). Query learning strate-
gies using boosting and bagging. In Machine learn-
ing: proceedings of the fifteenth international confer-
ence (ICML98), volume 1.
Modi, C., Patel, D., Borisaniya, B., Patel, H., Patel, A., and
Rajarajan, M. (2013). A survey of intrusion detection
techniques in cloud. Journal of Network and Com-
puter Applications, 36(1):42–57.
Networking, C. V. (2017). Cisco global cloud index: forecast
and methodology, 2015-2020. White paper.
Pirscoveanu, R. S., Hansen, S. S., Larsen, T. M., Ste-
vanovic, M., Pedersen, J. M., and Czech, A. (2015).
Analysis of malware behavior: Type classification us-
ing machine learning. In Cyber Situational Aware-
ness, Data Analytics and Assessment (CyberSA), 2015
International Conference on, pages 1–7. IEEE.
Sandıkkaya, M. T., Ödevci, B., and Ovatman, T. (2014).
Practical runtime security mechanisms for an aPaaS
cloud. In Globecom Workshops (GC Wkshps), 2014,
pages 53–58. IEEE.
Sanjay Ram, M. (2012). Secure cloud computing based on
mutual intrusion detection system. International Jour-
nal of Computer application, 1(2):57–67.
Shabtai, A., Moskovitch, R., Feher, C., Dolev, S., and
Elovici, Y. (2012). Detecting unknown malicious code
by applying classification techniques on opcode pat-
terns. Security Informatics, 1(1):1.
Su, Z., Yang, Q., Lu, Y., and Zhang, H. (2000). Whatnext:
A prediction system for web requests using n-gram
sequence models. In Web Information Systems Engi-
neering, 2000. Proceedings of the First International
Conference on, volume 1, pages 214–221. IEEE.
Uppal, D., Sinha, R., Mehra, V., and Jain, V. (2014). Malware
detection and classification based on extraction
of API sequences. In Advances in Computing, Communications
and Informatics (ICACCI), 2014 International
Conference on, pages 2337–2342. IEEE.
Wu, S. X. and Banzhaf, W. (2010). The use of computa-
tional intelligence in intrusion detection systems: A
review. Applied Soft Computing, 10(1):1–35.
Zhang, Y., Juels, A., Reiter, M. K., and Ristenpart, T.
(2014). Cross-tenant side-channel attacks in paas
clouds. In Proceedings of the 2014 ACM SIGSAC
Conference on Computer and Communications Secu-
rity, pages 990–1003. ACM.