YARA Refactor and Option to Output Hexadecimal Offset Matching #391

Merged
merged 13 commits into from
Aug 20, 2023

Conversation

phutelmyer
Contributor

@phutelmyer phutelmyer commented Aug 13, 2023

Describe the change
The previous implementation of YARA scanning in Strelka loaded rules and configuration redundantly, which hurt performance. This change refactors that loading and adds an option to output the hexadecimal offset of matches, which gives a more detailed view of rule matches and can assist in further analysis.

Hexadecimal Offset Matching
Users can now output the hexadecimal offset at which a YARA match occurred. To reduce the impact on the cluster, this processing is not performed for every rule. Only rules that carry the proper meta tag are processed (see offset_meta_key below for details). A rule opts in with a meta section like this:

rule MiddlePatternMatch
{
    meta:
        StrelkaHexDump = true

    strings:
        $my_pattern = "PatternToMatch"

    condition:
        $my_pattern
}

The output will look like this:

    "hex": [
        {
          "dump": [
            "00000060  42 42 42 42 42 0a 43 43 43 43 43 43 43 43 43 43   BBBBB.CCCCCCCCCC",
            "00000070  50 61 74 74 65 72 6e 54 6f 4d 61 74 63 68 43 43   PatternToMatchCC",
            "00000080  43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43   CCCCCCCCCCCCCCCC"
          ],
          "rule": "MiddlePatternMatch"
        }
    ],
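
As a rough illustration of how a dump in this format could be produced, here is a minimal Python sketch. It is not Strelka's implementation: the `hex_dump` function, its parameters, and the 16-byte row alignment are assumptions inferred from the sample output, and the buffer is made up for the example.

```python
from typing import List


def hex_dump(data: bytes, offset: int, length: int,
             padding: int = 32, width: int = 16) -> List[str]:
    """Render the bytes around a match as offset / hex / ASCII rows.

    Hypothetical helper: `offset` and `length` describe the matched
    string, `padding` is the number of context bytes on each side.
    """
    # Start `padding` bytes before the match, aligned down to a full row.
    start = max(0, offset - padding)
    start -= start % width
    # Stop `padding` bytes after the match, without running past the data.
    end = min(len(data), offset + length + padding)

    lines = []
    for row in range(start, end, width):
        chunk = data[row:row + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk).ljust(width * 3 - 1)
        # Non-printable bytes are shown as "." in the ASCII column.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{row:08x}  {hex_part}   {ascii_part}")
    return lines


# Made-up buffer with the pattern placed at offset 0x70.
pattern = b"PatternToMatch"
data = b"A" * 0x60 + b"BBBBB\n" + b"C" * 10 + pattern + b"C" * 18
match_offset = data.find(pattern)

for line in hex_dump(data, match_offset, len(pattern), padding=16):
    print(line)
```

With a padding of 16, this made-up buffer yields three rows in the same layout as the sample above. Because rows are aligned to 16 bytes in this sketch, a dump can include slightly more context than the padding value alone suggests.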

Additional Configuration Options
With this change, the following configuration options were added to the ScanYara config in the backend.yml:

store_offset: Determines whether to store the offset of YARA matches. Defaults to False.
offset_meta_key: Name of the rule metadata key that triggers offset logging. Defaults to StrelkaHexDump.
offset_padding: Defines the number of bytes to include as padding around the matched string in the hex dump. Defaults to 32.
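
One way to read how these options interact is sketched below in Python. This is an assumption about the gating logic, not code from the PR: `DEFAULT_OPTIONS` and `should_dump` are hypothetical names, and the rule metadata is modeled as a plain dict.

```python
from typing import Any, Dict

# Defaults as listed above; the dict itself is a hypothetical stand-in
# for the ScanYara options in backend.yml.
DEFAULT_OPTIONS: Dict[str, Any] = {
    "store_offset": False,
    "offset_meta_key": "StrelkaHexDump",
    "offset_padding": 32,
}


def should_dump(rule_meta: Dict[str, Any], options: Dict[str, Any]) -> bool:
    """Return True when a matching rule should get a hex dump.

    Assumed logic: offset storage must be enabled in the scanner
    options AND the rule must set the configured meta key to a
    truthy value.
    """
    merged = {**DEFAULT_OPTIONS, **options}
    if not merged["store_offset"]:
        return False
    return bool(rule_meta.get(merged["offset_meta_key"], False))


# A rule that opts in, scanned with offset storage enabled.
print(should_dump({"StrelkaHexDump": True}, {"store_offset": True}))
# The same rule with default options (store_offset is False).
print(should_dump({"StrelkaHexDump": True}, {}))
```

Under this reading, a rule is only dumped when both the scanner option and the rule's own meta tag agree, which keeps the extra work limited to rules that explicitly ask for it.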

Describe testing procedures
Developed and successfully ran Pytest tests with a relevant fixture.

Sample output

{
    "matches": ["MiddlePatternMatch"],
    "tags": ["tag1", "tag2"],
    "meta": [{"rule": "MiddlePatternMatch", "identifier": "author", "value": "John Doe"}],
    "hex": [
        {
          "dump": [
            "00000060  42 42 42 42 42 0a 43 43 43 43 43 43 43 43 43 43   BBBBB.CCCCCCCCCC",
            "00000070  50 61 74 74 65 72 6e 54 6f 4d 61 74 63 68 43 43   PatternToMatchCC",
            "00000080  43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43   CCCCCCCCCCCCCCCC"
          ],
          "rule": "MiddlePatternMatch"
        }
    ]
}
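
A consumer of this event might want to recover the absolute offset of a pattern from the dump. The following Python sketch assumes the row layout shown above (8-digit hex offset, two spaces, hex bytes, three spaces, ASCII) and contiguous rows; `find_in_dump` is a hypothetical helper, not part of Strelka.

```python
from typing import List, Optional


def find_in_dump(dump: List[str], pattern: bytes) -> Optional[int]:
    """Return the absolute offset of `pattern` within the dumped bytes."""
    if not dump:
        return None
    # The first column of the first row is the offset of the first byte.
    base = int(dump[0].split("  ", 1)[0], 16)

    data = bytearray()
    for line in dump:
        _, rest = line.split("  ", 1)
        # The hex column ends at the first run of three spaces.
        hex_text = rest.split("   ", 1)[0]
        data += bytes.fromhex(hex_text)

    index = data.find(pattern)
    return None if index < 0 else base + index


dump = [
    "00000060  42 42 42 42 42 0a 43 43 43 43 43 43 43 43 43 43   BBBBB.CCCCCCCCCC",
    "00000070  50 61 74 74 65 72 6e 54 6f 4d 61 74 63 68 43 43   PatternToMatchCC",
    "00000080  43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43   CCCCCCCCCCCCCCCC",
]
found = find_in_dump(dump, b"PatternToMatch")
print(hex(found))  # 0x70
```

Reading the hex column rather than the ASCII column avoids ambiguity, since non-printable bytes all appear as "." in the ASCII rendering.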

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of and tested my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings

@phutelmyer phutelmyer marked this pull request as ready for review August 20, 2023 14:52
@phutelmyer phutelmyer merged commit 6129dc1 into master Aug 20, 2023