Authors:
Igor Santos; Yoseba K. Penya; Jaime Devesa and Pablo G. Bringas
Affiliation:
Deusto Technological Foundation, Spain
Keyword(s):
Security, Computer viruses, Data-mining, Malware detection, Machine learning.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Artificial Intelligence and Decision Support Systems; Biomedical Engineering; Business Analytics; Data Engineering; Data Mining; Databases and Information Systems Integration; Datamining; Enterprise Information Systems; Formal Methods; Health Information Systems; Industrial Applications of Artificial Intelligence; Information Systems Analysis and Specification; Methodologies and Technologies; Operational Research; Security; Sensor Networks; Signal Processing; Simulation and Modeling; Soft Computing
Abstract:
Malware is any malicious code that has the potential to harm a computer or network. The amount of malware grows faster every year and poses a serious security threat. Thus, malware detection is a critical topic in computer security.
Currently, signature-based detection is the most widespread method for detecting malware.
Although this method is still used in most popular commercial antivirus software, it can only detect a virus once it has already caused damage and been registered. Therefore, it fails to detect new malware. Applying a methodology proven successful in similar problem domains, we propose the use of n-grams (every substring of a larger string, of a fixed length n) as file signatures in order to detect unknown malware while keeping a low false-positive ratio. We show that n-gram signatures provide an effective way to detect unknown malware.
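
The n-gram definition above (every substring of a fixed length n) can be illustrated with a minimal Python sketch. The function names, the choice of byte-level n-grams, and the cosine-similarity comparison are assumptions made for illustration only; the abstract does not state how signatures are built from files or how they are compared.

```python
import math
from collections import Counter


def byte_ngrams(data: bytes, n: int) -> list:
    """Return every contiguous substring of length n, in order of appearance."""
    # If the input is shorter than n, the range is empty and no n-grams result.
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def ngram_signature(data: bytes, n: int) -> Counter:
    """Hypothetical signature: the frequency of each n-gram in the input."""
    return Counter(byte_ngrams(data, n))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Compare two signatures; this measure is an assumption, not the paper's."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy byte strings stand in for file contents.
    known = ngram_signature(b"\x4d\x5a\x90\x00\x4d\x5a\x90\x00", 2)
    unknown = ngram_signature(b"\x4d\x5a\x90\x00\x03\x00", 2)
    print(byte_ngrams(b"abcd", 2))
    print(cosine_similarity(known, unknown))
```

In a setting like the one described, a signature computed for an unseen file would be compared against signatures of known malicious and benign files, and the comparison threshold would be tuned to keep false positives low. That decision step is left out here because the abstract does not describe it.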