Reframe README around the concept of differential analysis #663

Merged
merged 2 commits into from
Nov 26, 2024
75 changes: 46 additions & 29 deletions README.md
@@ -13,32 +13,66 @@
subtle malware discovery tool
```

malcontent detects supply-chain compromises and other malicious software. It has 3 modes of operation:
malcontent discovers supply-chain compromises through the magic of context, differential analysis, and 14,000+ YARA rules.

* ✨`diff`: show the risk-weighted capability drift between two versions of a program
* ☝️ **Our bread & butter: malcontent does this better than anyone else**

```
________ ________ ________ ________
| | | | | | | |
| v1.0.0 | => | v1.0.1 | => | v1.0.2 | => | v1.0.3 |
|________| |________| |________| |________|

unchanged HIGH-RISK decreased
risk increase risk

```

malcontent has 3 modes of operation:

* ✨ `diff`: risk-weighted differential analysis between two programs
* 🕵️‍♀️ `analyze`: deep analysis of a program's capabilities
* 🔍 `scan`: find malicious content across a broad set of file formats
* 🔍 `scan`: basic scan of malicious content

malcontent is a bit paranoid and prone to false positives. It is currently focused on finding threats that impact Linux and macOS platforms, but malcontent can also detect threats that impact other platforms.
malcontent is at its best analyzing programs that run on Linux. It also performs well on programs built for other UNIX platforms such as macOS and, to a lesser extent, on Windows programs.

## Features

* 14,500+ [YARA](YARA) detection rules
* Including third-party rules from companies such as Avast, Elastic, FireEye, Mandiant, Nextron, ReversingLabs, and more!
* Analyzes binaries from nearly any operating system (Linux, macOS, FreeBSD, Windows, etc.)
* Analyzes scripts (Python, shell, JavaScript, TypeScript, PHP, Perl, AppleScript)
* Analyzes container images
* Transparent archive support (apk, tar, zip, etc.)
* Analyzes binary files in most common formats (ELF, Mach-O, a.out, PE)
* Analyzes code from most common languages (AppleScript, C, Go, JavaScript, PHP, Perl, Ruby, Shell, TypeScript)
* Transparent support for archives (apk, tar, zip, etc.) & container images
* Multiple output formats (JSON, YAML, Markdown, Terminal)
* Designed to work as part of a CI/CD pipeline
* Supports air-gapped networks

## Modes

### Diff

malcontent's most powerful method for discovering malware is through differential analysis against CI/CD artifacts. When used within a build system, malcontent has two significant contextual advantages over a traditional malware scanner:

* Baseline of expected behavior (previous release)
* Semantic versioning that describes how large a change to expect
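
To make the second advantage concrete, here is a minimal Go sketch of how a version bump could set expectations for a risk change. It is not malcontent's implementation: `parse`, `bumpKind`, `suspicious`, the integer risk levels, and the tolerance table are all hypothetical names and values chosen for illustration, and pre-release or build suffixes are not handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parse splits "v1.2.3" or "1.2.3" into its three numeric components.
// Hypothetical helper: pre-release and build metadata are not supported.
func parse(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("bad component %q in %q: %w", p, v, err)
		}
		out[i] = n
	}
	return out, nil
}

// bumpKind reports which version component changed first:
// "major", "minor", "patch", or "none".
func bumpKind(prev, next string) (string, error) {
	a, err := parse(prev)
	if err != nil {
		return "", err
	}
	b, err := parse(next)
	if err != nil {
		return "", err
	}
	switch {
	case a[0] != b[0]:
		return "major", nil
	case a[1] != b[1]:
		return "minor", nil
	case a[2] != b[2]:
		return "patch", nil
	}
	return "none", nil
}

// suspicious reports whether a risk increase (in hypothetical integer
// risk levels) is larger than the bump kind would lead you to expect.
// The tolerance values are assumptions, not malcontent defaults.
func suspicious(bump string, riskIncrease int) bool {
	tolerance := map[string]int{"none": 0, "patch": 0, "minor": 1, "major": 2}
	return riskIncrease > tolerance[bump]
}

func main() {
	bump, err := bumpKind("v1.0.1", "v1.0.2")
	if err != nil {
		panic(err)
	}
	fmt.Println(bump, suspicious(bump, 2))
}
```

The point of the sketch is the shape of the check: a patch release that gains high-risk capabilities deserves more scrutiny than a major release that does the same.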


Using the [3CX Compromise](https://www.fortinet.com/blog/threat-research/3cx-desktop-app-compromised) as an example, malcontent trivially surfaces unexpectedly high-risk changes to libffmpeg:

![diff screenshot](./images/diff.png)

Each line that begins with a "++" represents a newly added capability. Each capability has a risk score based on how unique it is to malware.

Like the diff(1) command it's based on, malcontent can diff between two binaries or directories. It can also diff two archive files or even two OCI images. Here are some helpful flags:

* `--format=markdown`: output in markdown for use in GitHub Actions
* `--min-file-risk=critical`: only show diffs for critical-level changes
* `--quantity-increases-risk=false`: disable heuristics that increase file criticality due to result frequency
* `--file-risk-change`: only show diffs for modified files when the source and destination files are of different risks
* `--file-risk-increase`: only show diffs for modified files when the destination file is of a higher risk than the source file
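
The idea behind the "++" lines and the `--file-risk-increase` filter can be sketched in Go as a set difference over capabilities. This is an illustration under stated assumptions, not malcontent's code: the `Risk`, `Capabilities`, and `Change` types, the capability names, and the "--" prefix for removed capabilities are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Risk is a hypothetical ordered risk level.
type Risk int

const (
	Low Risk = iota + 1
	Medium
	High
	Critical
)

// Capabilities maps a capability name to its risk for one file.
type Capabilities map[string]Risk

// Change is one added ("++") or removed ("--") capability.
// The "--" prefix is an assumption made for this sketch.
type Change struct {
	Prefix string
	Name   string
	Risk   Risk
}

// diffCaps returns capabilities present in only one of the two versions,
// ordered by descending risk and then by name.
func diffCaps(src, dst Capabilities) []Change {
	var changes []Change
	for name, r := range dst {
		if _, ok := src[name]; !ok {
			changes = append(changes, Change{"++", name, r})
		}
	}
	for name, r := range src {
		if _, ok := dst[name]; !ok {
			changes = append(changes, Change{"--", name, r})
		}
	}
	sort.Slice(changes, func(i, j int) bool {
		if changes[i].Risk != changes[j].Risk {
			return changes[i].Risk > changes[j].Risk
		}
		return changes[i].Name < changes[j].Name
	})
	return changes
}

// fileRisk takes a file's risk to be its highest-risk capability.
func fileRisk(c Capabilities) Risk {
	highest := Risk(0)
	for _, r := range c {
		if r > highest {
			highest = r
		}
	}
	return highest
}

// riskIncreased mirrors the idea of only reporting files whose
// destination risk is higher than their source risk.
func riskIncreased(src, dst Capabilities) bool {
	return fileRisk(dst) > fileRisk(src)
}

func main() {
	// Capability names below are made up for the example.
	v1 := Capabilities{"fs/read": Low, "net/http": Medium}
	v2 := Capabilities{"fs/read": Low, "net/http": Medium, "exec/shell": High, "net/download": Critical}
	for _, c := range diffCaps(v1, v2) {
		fmt.Printf("%s %s [%d]\n", c.Prefix, c.Name, c.Risk)
	}
	fmt.Println("risk increased:", riskIncreased(v1, v2))
}
```

Taking the maximum capability risk as the file risk is a simplification; the `--quantity-increases-risk` flag suggests the real tool also weighs how many findings a file has.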

### Scan

Scan directories for possible malware. This is our simplest feature and not a particularly novel one. malcontent is pretty paranoid in this mode, so expect some false positives:
malcontent's most basic feature scans directories for possible malware. malcontent is pretty paranoid in this mode, so expect some false positives:

![scan screenshot](./images/scan.png)

@@ -51,7 +85,7 @@ Useful flags:

### Analyze

To analyze the capabilities of a program, use `mal analyze`. For example:
To enumerate the capabilities of a program, use `mal analyze`. For example:

![analyze screenshot](./images/analyze.png)

@@ -62,23 +96,6 @@ The analyze mode emits a list of capabilities often seen in malware, categorized
* `--format=json`: output to JSON for data parsing
* `--min-risk=high`: only show high or critical risk findings

### Diff

To detect unexpected capability changes, try `diff` mode. This allows you to find far more subtle attacks than a general scan, as you generally have both a baseline "known good" version and the context to understand what capabilities a program needs to operate.

Using the [3CX Compromise](https://www.fortinet.com/blog/threat-research/3cx-desktop-app-compromised) as an example, we're able to use malcontent to detect malicious code inserted in an otherwise harmless library:

![diff screenshot](./images/diff.png)

Each line that begins with a "++" represents a newly added capability. You can use it to diff entire directories recursively, even if they contain programs written in a variety of languages.

For use in CI/CD pipelines, you may find the following flags helpful:

* `--format=markdown`: output in markdown for use in GitHub Actions
* `--min-file-risk=critical`: only show diffs for critical-level changes
* `--quantity-increases-risk=false`: disable heuristics that increase file criticality due to result frequency
* `--file-risk-change`: only show diffs for modified files when the source and destination files are of different risks
* `--file-risk-increase`: only show diffs for modified files when the destination file is of a higher risk than the source file

## Installation

@@ -109,4 +126,4 @@ go install github.com/chainguard-dev/malcontent/cmd/mal@latest

## Help Wanted

malcontent is an honest-to-goodness open-source project. If you are interested in contributing, check out [DEVELOPMENT.md](DEVELOPMENT.md). Send us a pull request, and we'll help you with the rest!
malcontent is open source! If you are interested in contributing, check out [our development guide](DEVELOPMENT.md). Send us a pull request, and we'll help you with the rest!