
API and other features (Xiang et al 2018). Mohaisen 
et al. extracted dynamic detection features of
malware during its execution, and used K-Nearest
Neighbor (KNN), Support Vector Machine (SVM),
decision tree and other classification algorithms to
build the final detection model (Mohaisen et al 2015).
Xin Wang et al. proposed a detection model that
combines a Recurrent Neural Network (RNN) with
autoencoders (Wang et al 2016). Mehadi Hassen
et al. used a supervised learning algorithm to learn
low-dimensional features, and then applied an
anomaly detection method to detect malware (Hassen
et al 2018). S. Pai et al. applied the expectation
maximization algorithm to the cluster analysis
of malware (Pai et al 2017).
Current malicious software detection relies primarily
on two methods: static analysis and dynamic analysis.
Most approaches use either a single machine learning
algorithm or a combination of several learning
algorithms to construct classification detection
models. Static analysis is efficient but may be less
accurate, while dynamic analysis is more accurate but
may be less efficient. Relying on static or dynamic
analysis alone may therefore fail to meet the dual
requirements of high efficiency and high accuracy.
Hence, this paper introduces a malicious software
detection technique based on deep learning
algorithms. The fusion of dynamic and static detection
techniques is intended to make the final detection
process both efficient and accurate.
2  TRADITIONAL MALWARE DETECTION METHODS
2.1  Static Analysis 
Static analysis is the initial phase of the malicious
software analysis process. It examines executable
files without running them and, at the basic level,
without inspecting individual instructions. Basic
static analysis can determine whether a file shows
malicious characteristics, offer insights into its
expected functionality, and sometimes yield simple
network signatures. However, static analysis has
limitations when dealing with complex malicious
software and may overlook critical malicious
behaviours.
2.1.1  Message Digest Algorithm 5 
Message Digest Algorithm 5 (MD5) is a commonly
used technique for identifying malicious software.
The malicious file is passed through the MD5 hash
function, which produces a fixed-length (128-bit) hash
value that serves as a fingerprint for that sample.
In the field of deep learning, feature hashing is a
commonly employed technique that maps data of
varying sizes into standardized fixed-size
representations.
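The two ideas in this subsection can be illustrated with a minimal sketch. It uses only Python's standard `hashlib` module. The function names `md5_fingerprint` and `hashed_features`, the bucket count, and the sample tokens are assumptions made for illustration and are not taken from the paper.

```python
import hashlib


def md5_fingerprint(data: bytes) -> str:
    """Return the 128-bit MD5 digest of a byte string as 32 hex characters."""
    return hashlib.md5(data).hexdigest()


def hashed_features(tokens, n_buckets=16):
    """Map a token list of any length to a fixed-size count vector.

    This is the hashing trick: each token is hashed to one of `n_buckets`
    positions, so the output size does not depend on the input size.
    """
    vector = [0] * n_buckets
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % n_buckets
        vector[bucket] += 1
    return vector


# Hypothetical inputs: stand-in file contents and imported API names.
sample = b"hypothetical file contents"
print(md5_fingerprint(sample))
print(hashed_features(["CreateFileA", "WriteFile", "CreateFileA"]))
```

In practice the fingerprint would be computed over the bytes of the file under analysis and compared against a list of known-malicious hashes. Distinct tokens may share a bucket in the hashed vector, which is the usual trade-off of a fixed-size representation.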
2.1.2  PEiD Detection 
PEiD is a common tool for detecting packed files, and
is often used to identify the packer or compiler that
generated a file. Because malware is often packed or
obfuscated, the resulting malicious files are more
difficult to detect, which can seriously hinder the
analysis of malware. PEiD also carries a security risk
in use, because some of its plug-ins run the malicious
executable automatically, so a safe environment must
be created for running and analysing malware.
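Packer identification of this kind is commonly described as matching byte patterns, with wildcards, near a file's entry point. The following is a minimal sketch of that idea only, not PEiD's actual implementation. The signature names and byte patterns in `SIGNATURES` are invented for illustration, and `entry_offset` is assumed to be already known.

```python
def parse_signature(text):
    """Turn 'AA ?? BB' into [0xAA, None, 0xBB]; None marks a wildcard byte."""
    return [None if tok == "??" else int(tok, 16) for tok in text.split()]


def matches_at(data: bytes, offset: int, signature) -> bool:
    """Check whether `signature` matches `data` starting at `offset`."""
    if offset < 0 or offset + len(signature) > len(data):
        return False
    return all(
        expected is None or data[offset + i] == expected
        for i, expected in enumerate(signature)
    )


# Hypothetical signature database: names and patterns are made up.
SIGNATURES = {
    "ExamplePacker 1.0": parse_signature("60 BE ?? ?? 40 00"),
    "ExampleCompiler": parse_signature("55 8B EC"),
}


def identify(data: bytes, entry_offset: int):
    """Return the names of all signatures that match at the entry offset."""
    return [
        name
        for name, signature in SIGNATURES.items()
        if matches_at(data, entry_offset, signature)
    ]


# Stand-in bytes for a file whose entry point is at offset 1.
print(identify(bytes.fromhex("0060BE1234400090"), 1))
```

Because the matcher only reads bytes and never executes the file, it avoids the risk noted above for plug-ins that run the sample.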
2.1.3  Executable File Format Analysis 
The Portable Executable (PE) file format is a data
structure, and almost all executable code loaded by
Windows systems is in the PE file format. A PE file
starts with a header that includes information about
the code, the application type, the required library
functions, and so on. The information in the header
of the file is valuable to malware analysts.
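As a minimal sketch of reading that header, the code below locates the PE signature through the DOS header and decodes the 20-byte COFF file header with the standard `struct` module. The field offsets follow the published PE layout. The function names, the returned dictionary keys, and the synthetic header built by `build_minimal_header` are assumptions for illustration.

```python
import struct

IMAGE_FILE_DLL = 0x2000  # COFF characteristics flag marking a DLL


def parse_pe_header(data: bytes) -> dict:
    """Extract a few COFF header fields from the start of a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file: missing PE signature")
    if len(data) < pe_offset + 24:
        raise ValueError("truncated COFF header")
    (machine, n_sections, timestamp, _symtab, _nsyms,
     opt_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": machine,
        "sections": n_sections,
        "timestamp": timestamp,
        "optional_header_size": opt_size,
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
    }


def build_minimal_header() -> bytes:
    """Build a synthetic header for the demo (not a loadable executable)."""
    buf = bytearray(0x40 + 24)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x40)
    buf[0x40:0x44] = b"PE\x00\x00"
    # Machine 0x014C (i386), 3 sections, characteristics 0x0102.
    struct.pack_into("<HHIIIHH", buf, 0x44, 0x014C, 3, 0, 0, 0, 0xE0, 0x0102)
    return bytes(buf)


print(parse_pe_header(build_minimal_header()))
```

For a real sample the bytes would come from reading the file in binary mode. The imported library functions mentioned above live in the import table, which this sketch does not parse.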
2.1.4  Interactive Disassembly Expert (IDA Pro)
As an advanced static analysis tool, IDA Pro is the
preferred disassembler for most malware analysts and
vulnerability analysts (Raff et al 2017). Strings are
a starting point for static analysis of malware: IDA
Pro's cross-reference feature shows exactly where and
how each string is used in the code. The disassembler
provides a snapshot of the program as it exists
before the first instruction is executed.
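The string-extraction step that precedes such cross-referencing can be sketched in a few lines. This is a minimal illustration of finding printable ASCII runs and their file offsets. It does not reproduce IDA Pro's cross-reference analysis, and the function name, the minimum length of 4, and the sample bytes are assumptions.

```python
import re


def extract_strings(data: bytes, min_length: int = 4):
    """Return (offset, text) pairs for runs of printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [
        (match.start(), match.group().decode("ascii"))
        for match in pattern.finditer(data)
    ]


# Stand-in bytes for a binary with embedded strings.
blob = b"\x00\x01http://example.test\x00ab\x00cmd.exe\x00"
for offset, text in extract_strings(blob):
    print(f"{offset:#06x}  {text}")
```

The recorded offsets are what a disassembler would then relate to the instructions that reference each string.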
2.2  Dynamic Analysis 
Dynamic analysis is the second phase in the process of
analysing malicious software. It is typically employed
when basic static analysis fails to yield definitive
results. Dynamic analysis involves monitoring the
behaviour of malicious software while it runs, or
examining changes to the system after the malicious
software has executed. Unlike static analysis, dynamic
analysis provides a deeper understanding of the actual
functionality and internal workings of malicious
software. It has proven to be an effective method
for identifying malicious software.
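The second form of dynamic analysis, examining system changes after execution, can be sketched as a before-and-after comparison of a directory tree. This is a minimal illustration restricted to file changes. The function names are assumptions, and the harmless file writes in the demo stand in for running a sample inside an isolated environment.

```python
import hashlib
import os
import tempfile


def snapshot(root: str) -> dict:
    """Map each file path under `root` (relative) to a SHA-256 of its contents."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            state[os.path.relpath(path, root)] = digest
    return state


def diff_snapshots(before: dict, after: dict) -> dict:
    """List files that were created, deleted, or modified between snapshots."""
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(
            path for path in set(before) & set(after)
            if before[path] != after[path]
        ),
    }


with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "config.ini"), "w") as handle:
        handle.write("mode=normal")
    before = snapshot(root)
    # Stand-in for executing the sample: modify one file, create another.
    with open(os.path.join(root, "config.ini"), "w") as handle:
        handle.write("mode=changed")
    with open(os.path.join(root, "dropped.bin"), "wb") as handle:
        handle.write(b"\x00\x01")
    print(diff_snapshots(before, snapshot(root)))
```

A fuller monitor would also cover registry, process, and network activity, which are outside the scope of this sketch.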
 
The Investigation of Malware Detection Model Construction Based on Deep Learning Algorithms