learning in classification problems has increased
thanks to three main factors: (1) the growth of
commercial feeds that make new malware available,
(2) cheaper computing power, and (3) the evolution of
machine learning as an independent computer science
discipline, in which big companies are investing and
thereby giving researchers the tools to innovate in
the field.
Results from machine learning approaches are
encouraging: they suggest that a high malware
detection rate can be achieved without any human
interaction. As a result, AV vendors and their
research and development teams have begun deploying
machine learning classifiers such as neural networks,
decision trees, and logistic regression
(Krizhevsky et al., 2012).
Malware analysis, as an independent discipline in
cybersecurity, treats malware detection as a binary
classification problem: every file is analyzed to
determine whether it is malware or not. If it is
malware, a classification mechanism then labels it
with its type and family based on its behavior. The
main purpose of this work is to introduce a different
approach to malware detection on Android, using the
visual characteristics of malware and deep learning
for pattern recognition.
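To make the idea of "visual characteristics" concrete, the following is a minimal sketch of one common convention for viewing a binary as an image: each byte becomes one grayscale pixel, and pixels are laid out in rows of a fixed width. The function name `bytes_to_gray_image`, the row width, and the zero padding of the last row are illustrative assumptions, not necessarily the procedure used later in this paper.

```python
import math


def bytes_to_gray_image(data: bytes, width: int = 256) -> list[list[int]]:
    """Map each byte (0-255) to one grayscale pixel, row by row.

    Hypothetical helper for illustration only; the width and the
    zero padding are assumptions, not the paper's settings.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Pad the final row with zero (black) pixels so every row is full.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Five sample bytes laid out in rows of two pixels each.
sample = bytes([0, 127, 255, 16, 32])
image = bytes_to_gray_image(sample, width=2)
print(image)
```

A two-dimensional grid of this kind is the sort of input a convolutional network can consume; in practice, the raw bytes would come from an application file rather than a hand-written sample.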
Contributions. In this work, we have:
• Combined and preprocessed a dataset containing
benign and malicious Android applications;
• Developed a machine learning model based on
CNN to detect and classify mobile application
samples as benign or malware;
• Evaluated the proposed model by comparing it
with other defined models.
Outline. The rest of the paper is organized as
follows. In Section II, different malware types are
introduced. Then, in Section III, we introduce the
different malware analysis methodologies, whereas in
Section IV we discuss the related work. After that, in
Section V, we present our methodology in detail. Then,
we explain how malware is processed as an image in
Section VI. We share our results in Section VII.
Finally, we present our conclusions and future work
in Section VIII.
2 MALWARE TYPES
Malware is a blend of two words: malicious and
software. It refers to software programs that are
designed and implemented to damage a system or to
execute malicious commands on it, which may lead to
unwanted consequences for the user such as gathering
sensitive information, disrupting normal computer
operations, gaining control over the computer system,
spying on the user’s daily activities by gaining
access to mobile sensors, and destroying the mobile
system. Malware is the general term used to describe
any malicious software. However, it can be technically
divided into the following categories depending on
its goal.
• Adware: It is a type of malware that automatically
displays advertisements. It is used to gather data
about users’ interests and to get revenue from the
displayed advertisements.
• Spyware: It is a type of malware that tracks the
daily activities of users without their knowledge.
It relies on the data that it gets from mobile
sensors and other running applications. Spyware
can collect sensitive data by logging keystrokes,
harvesting data, and monitoring activities.
• Virus: It is a type of malware that can copy itself
and spread on the mobile system. Viruses can be
transported on any medium, including email
attachments, social media messages, and malicious
links.
• Worm: It is a type of malware that can spread on
the network by exploiting operating system
vulnerabilities. The major difference between a worm
and a virus is that a virus depends on human action
to spread, while a worm can replicate itself and
spread without any human interaction.
• Trojan: It is a type of malware that makes itself
appear as a normal file or application to trick users
into downloading and installing it. A trojan can
give unauthorized remote access to the infected
mobile phone. It is usually designed in a
client-server architecture where the server is
installed on the attacker’s machine, and the client is
the trojan itself. It is used to steal private
information such as logins, financial data, and
cryptocurrency wallets. In addition, it is used to
enable devices on the victim’s mobile phone, such as
the front camera, and to spy on users’ activities
such as keystrokes and files.
• Rootkit: It is a type of malicious software
designed to access other mobile phones remotely
and control them without being detected.
• Backdoor: It is computer software that allows
access to compromised mobile phones. It gives
the attacker an entry point to the mobile
phone without the consent of the user.
• Ransomware: It is malicious software that
restricts users from accessing their files by
encrypting them. The decryption happens after