How to confuse antimalware neural networks. Adversarial attacks and protection

Introduction

Nowadays, cybersecurity companies implement a variety of methods to discover new, previously unknown malware files. Machine learning (ML) is a powerful and widely used approach for this task. At Kaspersky we have a number of complex ML models based on different file features, including models for static and dynamic detection, for processing sandbox logs and system events, etc. We implement different machine learning techniques, including deep neural networks, one of the most promising technologies that make it possible to work with large amounts of data, incorporate different types of features, and boast a high accuracy rate. But can we rely entirely on machine learning approaches in the battle with the bad guys? Or could powerful AI itself be vulnerable? Let’s do some research.

In this article we attempt to attack the anti-malware neural network models used in our products and evaluate existing defense methods.

Background

An adversarial attack is a method of making small modifications to input objects in such a way that the machine learning model begins to misclassify them. Neural networks (NN) are known to be vulnerable to such attacks. Research into adversarial methods historically started in the sphere of image recognition. It has been shown that minor changes to pictures, such as the addition of barely perceptible noise, can cause remarkable changes in the predictions of the classifiers and even completely confuse ML models.
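As an illustration of how little it can take to flip a prediction, here is a minimal sketch of the classic fast gradient sign method (FGSM). It assumes a PyTorch image classifier `model`, a batched input tensor and its correct labels; the names are illustrative and not taken from the article.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, images, labels, eps=0.01):
    """Add a small signed-gradient perturbation that pushes the model away from `labels`."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # Step in the direction that increases the loss; keep pixels in a valid range.
    adversarial = images + eps * images.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```

Even with a very small `eps`, the perturbed images often receive a different label while looking unchanged to a human.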

Furthermore, the insertion of small patterns into an image can also force models to change their predictions in the wrong direction[ii].


After this susceptibility to small data changes was highlighted in neural networks for image recognition, similar techniques were demonstrated in other data domains. In particular, various types of attacks against malware detectors have been proposed, and many of them were successful.

In the paper “Functionality-preserving black-box optimization of adversarial Windows malware”[iii], the authors extracted data sequences from benign portable executable (PE) files and added them to malware files, either at the end of the file (padding) or within newly created sections (section injection). These changes affected the scores of the targeted classifier while preserving file functionality by design. A collection of malware files with inserted random benign file parts was formed in this way. Using genetic algorithms (including mutations, cross-over and other types of transformations) and the malware classifier for predicting scores, the authors iteratively modified this collection, making the files progressively harder for the model to classify correctly.
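A simplified sketch of the padding idea is shown below (our own illustration, not the authors’ code): chunks of bytes harvested from benign PE files are appended to the malware sample as overlay data, which is never executed, so the file’s behavior is unchanged while its byte statistics shift toward those of benign files.

```python
import random

def pad_with_benign_bytes(malware_bytes: bytes, benign_chunks: list[bytes],
                          max_injected: int = 4096) -> bytes:
    """Append randomly chosen benign byte chunks to the end of the file (overlay)."""
    payload = b""
    while len(payload) < max_injected:
        payload += random.choice(benign_chunks)  # assumes a non-empty chunk pool
    return malware_bytes + payload[:max_injected]
```

Section injection works on the same principle but requires rewriting the PE section table, which is omitted here.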

This was done via optimization of an objective function containing two conflicting terms: the classification output on the manipulated PE file, and a penalty function that evaluates the number of bytes injected into the input file. Although the proposed attack was effective, it did not use state-of-the-art ML adversarial techniques and relied on public pre-trained models. Also, the authors measured the average effectiveness of the attack against VirusTotal anti-malware engines, so we don’t know for sure how effective it is against the cybersecurity industry’s leading solutions. Moreover, since most security products still use traditional methods of detection, it’s unclear how effective the attack was against the ML component of anti-malware solutions, or against other types of detectors.
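In code form, the optimized quantity could look roughly like the following sketch, under the assumption that the classifier returns a maliciousness score and that `lambda_` weights the size penalty (both names are ours):

```python
def attack_objective(malware_score: float, injected_byte_count: int,
                     lambda_: float = 1e-5) -> float:
    """Lower is better: a low malware score achieved with as few injected bytes as possible."""
    return malware_score + lambda_ * injected_byte_count
```

The genetic search then carries the candidates with the lowest objective values from one generation to the next.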

Another study, “Optimization-guided binary diversification to mislead neural networks for malware detection”[iv], proposed a method of functionality-preserving changes to assembler operands within functions, and adversarial attacks built on it. The algorithm randomly selects a function and a transformation type and tries to apply the selected change. The attempted transformation is kept only if it makes the targeted NN classifier more likely to misclassify the binary file. Again, this attack lacks ML methods for adversarial modification, and it has not been tested against specific anti-malware products.
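The acceptance rule amounts to a greedy, hill-climbing loop; the sketch below uses invented helper names (`score_fn`, `transformations`) purely for illustration and is not the authors’ implementation.

```python
import random

def diversify(binary, functions, transformations, score_fn, steps=1000):
    """Keep a random, functionality-preserving rewrite only if it lowers the malware score."""
    best_score = score_fn(binary)
    for _ in range(steps):
        func = random.choice(functions)
        transform = random.choice(transformations)
        candidate = transform(binary, func)      # semantics-preserving code rewrite
        candidate_score = score_fn(candidate)
        if candidate_score < best_score:         # accept only if evasion improves
            binary, best_score = candidate, candidate_score
    return binary
```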

Some papers proposed gradient-driven adversarial methods that use knowledge of the model’s structure and features to modify malicious files[v]. This approach opens up more opportunities for file modification and results in greater effectiveness. Although the authors conducted experiments to measure the impact of such attacks on specific malware detectors (including public models), they did not work with production anti-malware classifiers.
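A feature-space version of such an attack can be sketched in a few lines of PyTorch, assuming a differentiable classifier that outputs one maliciousness score per sample; mapping the perturbed features back to a valid, working PE file is the hard part and is not shown.

```python
import torch

def gradient_feature_attack(model, features, steps=50, lr=0.1):
    """Nudge a continuous feature vector toward the benign side of the decision boundary."""
    x = features.clone().detach().requires_grad_(True)
    for _ in range(steps):
        score = model(x).sum()       # higher score = more likely malware
        score.backward()
        with torch.no_grad():
            x -= lr * x.grad.sign()  # descend the malware score
            x.grad.zero_()
    return x.detach()
```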

For a more detailed overview of the various adversarial attacks on malware classifiers, see our whitepaper and “A survey on practical adversarial examples for malware classifiers”.

Our goal

Since Kaspersky anti-malware solutions, among other techniques, rely on machine learning models, we’re extremely interested in investigating how vulnerable our ML models are to adversarial attacks. Three attack scenarios can be considered:

White-box attack. In this scenario, all information about the model is available. Armed with this information, attackers try to convert malware files (detected by the model) into adversarial samples with identical functionality that are misclassified as benign. In real life this attack is possible when the ML detector ships as part of the client application and can be retrieved through reverse engineering. In particular, researchers at Skylight reported such a scenario for the Cylance antivirus product.

Gray-box attack. Complex ML models usually require a significant amount of both computational and memory resources. The ML classifiers may therefore be cloud-based and deployed on the security company’s servers, with the client applications merely computing file features and sending them to those servers. The cloud-based malware classifier responds with predictions for the given features. The attackers have no access to the model itself, but they do know how the features are constructed, and they can obtain a label for any file by scanning it with the security product.

Black-box attack. In this case, both feature computation and model prediction are performed on the cybersecurity company’s side. The client applications just send raw files, or the security company collects the files in some other way, so no information about feature processing is available to the attacker. Strict legal restrictions on sending data from the user’s machine, as well as traffic limitations, mean that detection in this mode usually can’t be performed for all user files on the fly. As a result, an attack on a black-box system remains the most difficult.

Consequently, we will focus on the first two attack scenarios and investigate their effectiveness against our product model.
...