v7.0.0
This is the v7.0.0 release of capa, which was developed mainly during Google Summer of Code (GSoC) 2023. A huge
shoutout to our GSoC contributors @colton-gabertan and @yelhamer for their amazing work. See our blog posts for more details.
Also, a big thanks to the other contributors: @aaronatp, @Aayush-Goel-04, @bkojusner, @doomedraven, @ruppde, @larchchen, @JCoonradt, and @xusheng6.
New Features
- add Ghidra backend #1770 #1767 @colton-gabertan @mike-hunhoff
- add Ghidra UI integration #1734 @colton-gabertan @mike-hunhoff
- add dynamic analysis via CAPE sandbox reports #48 #1535 @yelhamer
- binja: add support for forwarded exports #1646 @xusheng6
- binja: add support for symtab names #1504 @xusheng6
- add com class/interface features #322 @Aayush-Goel-04
- dotnet: emit enclosing class information for nested classes #1780 #1913 @bkojusner @mike-hunhoff
Breaking Changes
- remove the SCOPE_* constants in favor of the Scope enum #1764 @williballenthin
- protobuf: deprecate RuleMetadata.scope in favor of RuleMetadata.scopes @williballenthin
- protobuf: deprecate Metadata.analysis in favor of Metadata.analysis2 that is dynamic analysis aware @williballenthin
- update freeze format to v3, adding support for dynamic analysis @williballenthin
- extractor: ignore DLL name for api features #1815 @mr-tz
- main: introduce wrapping routines within main for working with CLI args #1813 @williballenthin
- move functions from capa.main to new capa.loader namespace #1821 @williballenthin
- proto: add package declaration #1960 @larchchen
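For code that referenced the removed SCOPE_* constants, the migration is roughly a switch from module-level strings to enum members. The sketch below is self-contained and does not import capa: the Scope class is a stand-in whose member names and values are assumptions, and describe() is a hypothetical helper, so check capa's own definition before relying on any of it.

```python
from enum import Enum


# Stand-in for capa's Scope enum; member names and values are assumptions.
class Scope(str, Enum):
    FILE = "file"
    FUNCTION = "function"
    BASIC_BLOCK = "basic block"
    INSTRUCTION = "instruction"


def describe(scope: Scope) -> str:
    """Hypothetical helper showing enum-based comparisons."""
    # Before: comparisons against string constants such as SCOPE_FILE.
    # After: comparisons against enum members.
    if scope is Scope.FILE:
        return "matches across the whole file"
    return f"matches within a single {scope.value}"


# Because this stand-in mixes in str, plain strings still convert and compare.
assert Scope("function") is Scope.FUNCTION
assert Scope.FUNCTION == "function"
```

An enum gives a closed set of valid values, so a typo in a scope name fails at lookup time instead of silently never matching.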
New Rules (41)
- nursery/get-ntoskrnl-base-address @mr-tz
- host-interaction/network/connectivity/set-tcp-connection-state @johnk3r
- nursery/capture-process-snapshot-data @mr-tz
- collection/network/capture-packets-using-sharppcap
- nursery/communicate-with-kernel-module-via-netlink-socket-on-linux
- nursery/get-current-pid-on-linux
- nursery/get-file-system-information-on-linux
- nursery/get-password-database-entry-on-linux
- nursery/mark-thread-detached-on-linux
- nursery/persist-via-gnome-autostart-on-linux
- nursery/set-thread-name-on-linux
- load-code/dotnet/load-windows-common-language-runtime
- nursery/log-keystrokes-via-input-method-manager @mr-tz
- nursery/encrypt-data-using-rc4-via-systemfunction032
- nursery/add-value-to-global-atom-table @mr-tz
- nursery/enumerate-processes-that-use-resource @Ana06
- host-interaction/process/inject/allocate-or-change-rwx-memory @mr-tz
- lib/allocate-or-change-rw-memory @mr-tz
- lib/change-memory-protection @mr-tz
- anti-analysis/anti-av/patch-antimalware-scan-interface-function
- executable/dotnet-singlefile/bundled-with-dotnet-single-file-deployment
- internal/limitation/file/internal-dotnet-single-file-deployment-limitation
- data-manipulation/encoding/encode-data-using-add-xor-sub-operations
- nursery/access-camera-in-dotnet-on-android
- nursery/capture-microphone-audio-in-dotnet-on-android
- nursery/capture-screenshot-in-dotnet-on-android
- nursery/check-for-incoming-call-in-dotnet-on-android
- nursery/check-for-outgoing-call-in-dotnet-on-android
- nursery/compiled-with-xamarin
- nursery/get-os-version-in-dotnet-on-android
- data-manipulation/compression/create-cabinet-on-windows
- data-manipulation/compression/extract-cabinet-on-windows
- lib/create-file-decompression-interface-context-on-windows
- nursery/enumerate-files-in-dotnet
- nursery/get-mac-address-in-dotnet
- nursery/get-current-process-command-line
- nursery/get-current-process-file-path
- nursery/hook-routines-via-dlsym-rtld_next
- nursery/linked-against-hp-socket
- host-interaction/process/inject/process-ghostly-hollowing
Bug Fixes
- ghidra: fix ints_to_bytes performance #1761 @mike-hunhoff
- binja: improve function call site detection @xusheng6
- binja: use binaryninja.load to open files @xusheng6
- binja: bump binja version to 3.5 #1789 @xusheng6
- elf: better detect ELF OS via GCC .ident directives #1928 @williballenthin
- elf: better detect ELF OS via Android dependencies #1947 @williballenthin
- fix setuptools package discovery #1886 @gmacon @mr-tz
- remove unnecessary scripts/vivisect-py2-vs-py3.sh file #1949 @JCoonradt
capa explorer IDA Pro plugin
- various integration updates and minor bug fixes
Development
Developer Notes
With this new release, many classes and concepts have been split up into static (mostly identical to the
prior implementations) and dynamic ones. For example, the legacy FeatureExtractor class has been renamed to
StaticFeatureExtractor and the DynamicFeatureExtractor has been added.
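As a rough illustration of the static/dynamic split, the sketch below shows how calling code might branch on the extractor kind. It does not import capa: the two base classes are stand-ins that only borrow the names mentioned above, and the methods (get_functions, get_processes), the toy subclasses, and list_scopes are assumptions made for this example.

```python
from abc import ABC, abstractmethod
from typing import Iterator, List, Union


class StaticFeatureExtractor(ABC):
    """Stand-in: walks code structure (functions) recovered from a file."""

    @abstractmethod
    def get_functions(self) -> Iterator[str]: ...


class DynamicFeatureExtractor(ABC):
    """Stand-in: walks runtime behavior (processes) recorded by a sandbox."""

    @abstractmethod
    def get_processes(self) -> Iterator[str]: ...


# Code that accepts "any extractor" now has to handle both kinds.
FeatureExtractor = Union[StaticFeatureExtractor, DynamicFeatureExtractor]


class ToyStaticExtractor(StaticFeatureExtractor):
    def get_functions(self) -> Iterator[str]:
        yield from ["sub_401000", "sub_401020"]


class ToyDynamicExtractor(DynamicFeatureExtractor):
    def get_processes(self) -> Iterator[str]:
        yield from ["sample.exe (pid 1234)"]


def list_scopes(extractor: FeatureExtractor) -> List[str]:
    """Hypothetical helper: pick the traversal that fits the extractor kind."""
    if isinstance(extractor, StaticFeatureExtractor):
        return [f"function: {name}" for name in extractor.get_functions()]
    if isinstance(extractor, DynamicFeatureExtractor):
        return [f"process: {name}" for name in extractor.get_processes()]
    raise TypeError(f"unexpected extractor type: {type(extractor).__name__}")
```

Code written against the legacy single class would typically need only the static branch, while new integrations can add the dynamic one.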
Starting with version 7.0, we have moved the component responsible for feature extraction from main to a new
capabilities module. Users who want to use capa's feature extraction abilities should now use that module instead
of importing the relevant logic from main.
For sandbox-based feature extractors, we are using Pydantic models. Contributions of more models for other sandboxes
are very welcome!
With this release, we've reorganized the logic found in main() to keep related logic together, improve readability,
and make changes and integrations easier. The new "main routines" are expected to be used only within main functions,
either capa's main or related scripts. These functions should not be invoked from library code.
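One way to picture why such routines belong only in main functions: suppose they report fatal CLI problems by raising an exception that main() turns into an exit status. The sketch below illustrates that pattern under this assumption. It does not import capa, and ShouldExitError, ensure_input_exists, and the toy main() are illustrative names and signatures, not a description of capa's actual API.

```python
import argparse
import sys
from pathlib import Path
from typing import List, Optional


class ShouldExitError(Exception):
    """Hypothetical: raised by a main routine to request program exit."""

    def __init__(self, status_code: int):
        super().__init__(f"exit with status {status_code}")
        self.status_code = status_code


def ensure_input_exists(args: argparse.Namespace) -> Path:
    """Hypothetical main routine: validate a CLI argument or ask main() to exit."""
    path = Path(args.input_file)
    if not path.exists():
        print(f"error: {path} does not exist", file=sys.stderr)
        raise ShouldExitError(status_code=2)
    return path


def main(argv: Optional[List[str]] = None) -> int:
    parser = argparse.ArgumentParser(description="toy capa-like script")
    parser.add_argument("input_file", help="path to the file to analyze")
    args = parser.parse_args(argv)

    try:
        path = ensure_input_exists(args)
    except ShouldExitError as e:
        # Only main() translates the request into a process exit status.
        return e.status_code

    print(f"would analyze {path}")
    return 0
```

A script would finish with sys.exit(main()). Library code has no such top-level handler, so calling a routine like this from a library could surface an unexpected exit request to the caller.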
Beyond moving code around, we've refined the handling of the input file, format, and backend. The logic for picking the
format and backend is more consistent. We've also documented that the input file is not necessarily the sample itself:
CAPE reports, freeze files, and similar inputs are not actually the sample.