Posts: 15,569
Threads: 10,027
Thanks Received: 9,253 in 7,404 posts
Thanks Given: 10,105
Joined: 12 September 18
07 November 19, 11:46
Quote:
Pipelining VT Intelligence searches and sandbox report lookups via APIv3 to automatically generate indicators of compromise
TL;DR: VirusTotal APIv3 includes an endpoint to retrieve all the dynamic analysis reports for a given file. This article showcases programmatic retrieval of sandbox behaviour reports in order to produce indicators of compromise that you can use to power-up your network perimeter/endpoint defenses. We are also releasing a set of python scripts alongside this blog post to illustrate this use case.
We recently rolled out a new Windows dynamic analysis system called VirusTotal Jujubox. This new sandbox represents a major revamp of VirusTotal’s in-house behaviour analysis capabilities as well as a key addition to the multi-sandbox project, which already aggregates behaviour reports from more than 10 partners and the most popular operating systems.
Behaviour reports are often perceived as a mechanism to understand what an individual sample does when executed, a quick overview before diving into disassembly and debugging. However, when you have a massive dynamic analysis setup processing hundreds of thousands of files per day, the microscopic dissection capability is far from being the most attractive use case.
When you generate reports at scale, and more importantly, when you index them in an elasticsearch index and expose it via API, the generated data can be used for advanced hunting, especially when this data can be combined with other static, binary and in-the-wild properties.
The basic workflow would be as follows:
1. Periodically identify new malware variants pertaining to a family that you are tracking making use of the VT Intelligence search API. Use family variant commonalities (for instance a section name, the compilation timestamp or a document’s author metadata property) to retrieve a stream of malware.
2. Focus on recent matches since the previous execution (query: fs:2019-11-01+).
3. For each match, retrieve the generated behaviour reports for the pertinent file. You can also focus specifically on network communications with the contacted_ips, contacted_domains and contacted_urls relationships.
4. For each automatically extracted network observable, check popularity ranks in order to filter out noise and FPs.
5. All the newly yielded network artefacts (CnCs) can then be fed into SIEMs or transformed into IDS rules to power up network perimeter defenses.
...
Continue Reading