Malware detection is a crucial aspect of software security. Current malware detectors work by checking for signatures, which attempt to capture the syntactic characteristics of the machine-level byte sequence of the malware. This reliance on a syntactic approach makes current detectors vulnerable to code obfuscations, increasingly used by malware writers, that alter the syntactic properties of the malware byte sequence without significantly affecting their execution behavior. This paper takes the position that the key to malware identification lies in their semantics. It proposes a semantics-based framework for reasoning about malware detectors and proving properties such as soundness and completeness of these detectors. Our approach uses a trace semantics to characterize the behavior of malware as well as that of the program being checked for infection, and uses abstract interpretation to ``hide'' irrelevant aspects of these behaviors. As a concrete application of our approach, we show that (1) standard signature matching detection schemes are generally sound but not complete, (2) the semantics-aware malware detector proposed byChristodorescu et al. is complete with respect to a number of common obfuscations used by malware writers and (3) the malware detection scheme proposed by Kinder et al. and based on standard model-checking techniques is sound in general and complete on some, but not all, obfuscations handled by the semantics-aware malware detector.

A Semantics-Based Approach to Malware Detection

DALLA PREDA, Mila;
2008-01-01

Abstract

Malware detection is a crucial aspect of software security. Current malware detectors work by checking for signatures, which attempt to capture the syntactic characteristics of the machine-level byte sequence of the malware. This reliance on a syntactic approach makes current detectors vulnerable to code obfuscations, increasingly used by malware writers, that alter the syntactic properties of the malware byte sequence without significantly affecting their execution behavior. This paper takes the position that the key to malware identification lies in their semantics. It proposes a semantics-based framework for reasoning about malware detectors and proving properties such as soundness and completeness of these detectors. Our approach uses a trace semantics to characterize the behavior of malware as well as that of the program being checked for infection, and uses abstract interpretation to ``hide'' irrelevant aspects of these behaviors. As a concrete application of our approach, we show that (1) standard signature matching detection schemes are generally sound but not complete, (2) the semantics-aware malware detector proposed byChristodorescu et al. is complete with respect to a number of common obfuscations used by malware writers and (3) the malware detection scheme proposed by Kinder et al. and based on standard model-checking techniques is sound in general and complete on some, but not all, obfuscations handled by the semantics-aware malware detector.
2008
Malware Detection; Obfuscation; Trace Semantics; Abstract Inbterpretation
File in questo prodotto:
File Dimensione Formato  
DallaPreda-SemanticMalware.pdf

accesso aperto

Tipologia: Documento in Pre-print
Licenza: Accesso ristretto
Dimensione 496.1 kB
Formato Adobe PDF
496.1 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11562/312659
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 79
  • ???jsp.display-item.citation.isi??? 36
social impact