Malware detection is a crucial aspect of software security. Current malware detectors work by checking for signatures, which attempt to capture the syntactic characteristics of the machine-level byte sequence of the malware. This reliance on a syntactic approach makes current detectors vulnerable to code obfuscations, increasingly used by malware writers, that alter the syntactic properties of the malware byte sequence without significantly affecting their execution behavior. This paper takes the position that the key to malware identification lies in their semantics. It proposes a semantics-based framework for reasoning about malware detectors and proving properties such as soundness and completeness of these detectors. Our approach uses a trace semantics to characterize the behavior of malware as well as that of the program being checked for infection, and uses abstract interpretation to ``hide'' irrelevant aspects of these behaviors. As a concrete application of our approach, we show that (1) standard signature matching detection schemes are generally sound but not complete, (2) the semantics-aware malware detector proposed byChristodorescu et al. is complete with respect to a number of common obfuscations used by malware writers and (3) the malware detection scheme proposed by Kinder et al. and based on standard model-checking techniques is sound in general and complete on some, but not all, obfuscations handled by the semantics-aware malware detector.
A Semantics-Based Approach to Malware Detection
DALLA PREDA, Mila;
2008-01-01
Abstract
Malware detection is a crucial aspect of software security. Current malware detectors work by checking for signatures, which attempt to capture the syntactic characteristics of the machine-level byte sequence of the malware. This reliance on a syntactic approach makes current detectors vulnerable to code obfuscations, increasingly used by malware writers, that alter the syntactic properties of the malware byte sequence without significantly affecting their execution behavior. This paper takes the position that the key to malware identification lies in their semantics. It proposes a semantics-based framework for reasoning about malware detectors and proving properties such as soundness and completeness of these detectors. Our approach uses a trace semantics to characterize the behavior of malware as well as that of the program being checked for infection, and uses abstract interpretation to ``hide'' irrelevant aspects of these behaviors. As a concrete application of our approach, we show that (1) standard signature matching detection schemes are generally sound but not complete, (2) the semantics-aware malware detector proposed byChristodorescu et al. is complete with respect to a number of common obfuscations used by malware writers and (3) the malware detection scheme proposed by Kinder et al. and based on standard model-checking techniques is sound in general and complete on some, but not all, obfuscations handled by the semantics-aware malware detector.File | Dimensione | Formato | |
---|---|---|---|
DallaPreda-SemanticMalware.pdf
accesso aperto
Tipologia:
Documento in Pre-print
Licenza:
Accesso ristretto
Dimensione
496.1 kB
Formato
Adobe PDF
|
496.1 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.