Specification
The rules consist of a few required sections and several optional ones.
title
id [optional]
related [optional]
   - type {type-identifier}
     id {rule-id}
status [optional]
description [optional]
author [optional]
references [optional]
logsource
   category [optional]
   product [optional]
   service [optional]
   definition [optional]
   ...
detection
   {search-identifier} [optional]
      {string-list} [optional]
      {field: value} [optional]
   ...
   timeframe [optional]
   condition
fields [optional]
falsepositives [optional]
level [optional]
tags [optional]
...
[arbitrary custom fields]
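As a minimal sketch of this structure, the following rule uses the required sections (title, logsource, detection with a condition) plus a few optional ones. The title, UUID, author and field values are made up for illustration and are not taken from a real rule set.

```yaml
# Illustrative rule; every name and value here is a placeholder.
title: Example Service Installation
id: 11111111-2222-4333-8444-555555555555   # hypothetical UUID
status: experimental
description: Detects the installation of a service with a suspicious name
author: Example Author
logsource:
    product: windows
    service: system
detection:
    selection:
        EventID: 7045
        ServiceName: EvilService
    condition: selection
level: medium
```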
type: //rec
required:
    title:
        type: //str
        length:
            min: 1
            max: 256
    logsource:
        type: //rec
        optional:
            category: //str
            product: //str
            service: //str
            definition: //str
    detection:
        type: //rec
        required:
            condition:
                type: //any
                of:
                    - type: //str
                    - type: //arr
                      contents: //str
                      length:
                          min: 2
        optional:
            timeframe: //str
        rest:
            type: //any
            of:
                - type: //arr
                  contents: //str
                - type: //map
                  values:
                      type: //any
                      of:
                          - type: //str
                          - type: //arr
                            contents: //str
                            length:
                                min: 2
optional:
    id:
        type: //any
        of:
            - type: //str
              length:
                  min: 1
                  max: 64
    related:
        type: //arr
        contents:
            type: //rec
            required:
                type:
                    type: //any
                    of:
                        - type: //str
                          value: derived
                        - type: //str
                          value: obsoletes
                        - type: //str
                          value: merged
                        - type: //str
                          value: renamed
                id:
                    type: //any
                    of:
                        - type: //str
                          length:
                              min: 1
                              max: 64
                        - type: //arr
                          contents: //str
                          length:
                              min: 1
                              max: 64
    status:
        type: //any
        of:
            - type: //str
              value: stable
            - type: //str
              value: testing
            - type: //str
              value: experimental
    description: //str
    author: //str
    license: //str
    references:
        type: //arr
        contents: //str
    fields:
        type: //arr
        contents: //str
    falsepositives:
        type: //any
        of:
            - type: //str
            - type: //arr
              contents: //str
              length:
                  min: 2
    level:
        type: //any
        of:
            - type: //str
              value: low
            - type: //str
              value: medium
            - type: //str
              value: high
            - type: //str
              value: critical
    tags:
        type: //arr
        contents: //str
rest: //any
Attribute: title
A brief title for the rule that should state what the rule is supposed to detect (max. 256 characters)
Attributes: id, related
Sigma rules should be identified by a globally unique identifier in the id attribute. For this purpose randomly generated UUIDs (version 4) are recommended but not mandatory. An example:
title: Test rule
id: 929a690e-bef0-4204-a928-ef5e620d6fcc
Rule identifiers can and should change for the following reasons:
- Major changes in the rule. E.g. a different rule logic.
- Derivation of a new rule from an existing one, or refinement of a rule in a way that both are kept active.
- Merge of rules.
To be able to keep track of relationships between detections, Sigma rules may also contain references to related rule identifiers in the related attribute. This allows common relationships between detections to be defined as follows:
related:
    - id: 08fbc97d-0a2f-491c-ae21-8ffcfd3174e9
      type: derived
    - id: 929a690e-bef0-4204-a928-ef5e620d6fcc
      type: obsoletes
Currently the following types are defined:
- derived: Rule was derived from the referred rule or rules, which may remain active.
- obsoletes: Rule obsoletes the referred rule or rules, which aren't used anymore.
- merged: Rule was merged from the referred rules. The rules may still exist and be in use.
- renamed: The rule previously had the referred identifier or identifiers but was renamed for some other reason, e.g. from a private naming scheme to UUIDs, to resolve collisions etc. A rule with this id is not expected to exist anymore.
Attribute: status
Declares the status of the rule:
- stable: the rule is considered as stable and may be used in production systems or dashboards.
- test: an almost stable rule that possibly could require some fine tuning.
- experimental: an experimental rule that could lead to false results or be noisy, but could also identify interesting events.
- deprecated: the rule has been replaced or is covered by another one. The link is made by the related field.
- unsupported: the rule cannot be used in its current state (special correlation log, home-made fields).
Attribute: description
A short description of the rule and the malicious activity that can be detected (max. 65,535 characters)
Attribute: license
License of the rule according to the SPDX ID specification.
Attribute: author
Creator of the rule.
Attribute: references
References to the source that the rule was derived from. These could be blog articles, technical papers, presentations or even tweets.
Attribute: logsource
This section describes the log data to which the detection is meant to be applied. It describes the log source, the platform, the application and the type that is required in detection.
It consists of three attributes that are evaluated automatically by the converters and an arbitrary number of optional elements. We recommend using a "definition" value in cases in which further explication is necessary.
- category - examples: firewall, web, antivirus
- product - examples: windows, apache, check point fw1
- service - examples: sshd, applocker
The "category" value is used to select all log files written by a certain group of products, like firewalls or web server logs. The automatic conversion will use the keyword as a selector for multiple indices.
The "product" value is used to select all log outputs of a certain product, e.g. all Windows Eventlog types including "Security", "System", "Application" and the new log types like "AppLocker" and "Windows Defender".
Use the "service" value to select only a subset of a product's logs, like the "sshd" on Linux or the "Security" Eventlog on Windows systems.
The "definition" can be used to describe the log source, including some information on the log verbosity level or configurations that have to be applied. It is not automatically evaluated by the converters but gives useful advice to readers on how to configure the source to provide the necessary events used in the detection.
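A logsource section that combines the automatically evaluated attributes with a free-text definition might look like the following sketch. The wording of the definition is only an example of the kind of configuration hint a rule author could give.

```yaml
logsource:
    product: windows
    service: security
    # Free-text hint for readers; converters do not evaluate this value.
    definition: 'Requires process creation auditing with command line logging enabled (example wording)'
```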
You can use the values of 'category', 'product' and 'service' to point the converters to a certain index. You could define in the configuration files that the category 'firewall' converts to ( index=fw1* OR index=asa* ) during Splunk search conversion or that the product 'windows' converts to "_index":"logstash-windows*" in Elasticsearch queries.
Instead of referring to particular services, generic log sources may be used, e.g.:
category: process_creation
product: windows
A generic log source like this replaces the definition of multiple rules for Sysmon, Windows Security Auditing and possible product-specific rules.
Attribute: detection
A set of search-identifiers that represent searches on log data
A definition that can consist of two different data structures - lists and maps.
- All values are treated as case-insensitive strings
- You can use the wildcard characters * and ? in strings (see also escaping section below)
- Regular expressions are case-sensitive by default
- You don't have to escape characters except the string quotation marks '
The backslash character \ is used for escaping of the wildcards * and ? as well as the backslash character itself. Escaping of the backslash is necessary if it is followed by a wildcard, depending on the desired result.
In summary, there are the following possibilities:
- A plain backslash not followed by a wildcard can be expressed as a single \ or a double backslash \\. For simplicity, the single notation is recommended.
- A wildcard has to be escaped to handle it as a plain character: \*
- The backslash before a wildcard has to be escaped to handle the value as a backslash followed by a wildcard: \\*
- Three backslashes are necessary to escape both the backslash and the wildcard and handle them as plain values: \\\*
- Three or four backslashes are handled as a double backslash. Four are recommended for consistency reasons: \\\\ results in the plain value \\.
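The escaping cases summarized above can be sketched in a detection section as follows. Field names and values are illustrative assumptions. The values are written as YAML single-quoted strings, in which backslashes are taken literally, so Sigma sees exactly the characters shown.

```yaml
detection:
    # Plain backslashes not followed by a wildcard: single notation.
    plain_path:
        Image: 'C:\Windows\System32\cmd.exe'
    # \* is a literal asterisk, not a wildcard.
    literal_asterisk:
        CommandLine: 'del \*.tmp'
    # \\* is a literal backslash followed by a wildcard.
    backslash_then_wildcard:
        TargetFilename: 'C:\Users\Public\\*'
    # \\\* is a literal backslash followed by a literal asterisk.
    backslash_and_asterisk:
        CommandLine: 'dir C:\Temp\\\*'
    condition: 1 of them
```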
The lists contain strings that are applied to the full log message and are linked with a logical 'OR'.
Example: Matches on 'EvilService' or 'svchost.exe -n evil'
detection:
    keywords:
        - EVILSERVICE
        - svchost.exe -n evil
Maps (or dictionaries) consist of key/value pairs, in which the key is a field in the log data and the value a string or integer value. Lists of maps are joined with a logical 'OR'. All elements of a map are joined with a logical 'AND'.
Examples:
Matches on Eventlog 'Security' and ( Event ID 517 or Event ID 1102 )
detection:
    selection:
        - EventLog: Security
          EventID:
              - 517
              - 1102
    condition: selection
Matches on Eventlog 'Security' and Event ID 4769 and TicketOptions 0x40810000 and TicketEncryption 0x17
detection:
    selection:
        - EventLog: Security
          EventID: 4769
          TicketOptions: '0x40810000'
          TicketEncryption: '0x17'
    condition: selection
There are special field values that can be used.
- An empty value is defined with ''
- A null value is defined with null
OBSOLETE: An arbitrary value except null or empty can no longer be defined with not null.
The application of these values depends on the target SIEM system.
To express not null, create another selection and negate it in the condition.
Example:
detection:
    selection:
        EventID: 4738
    filter:
        PasswordLastSet: null
    condition:
        selection and not filter
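The same negation pattern can cover both special values at once. The sketch below excludes events whose CommandLine is either empty or null; the event ID and field name are illustrative, and how each value is translated still depends on the target SIEM system.

```yaml
detection:
    selection:
        EventID: 4688
    filter_empty:
        CommandLine: ''      # empty value
    filter_null:
        CommandLine: null    # null value
    condition: selection and not filter_empty and not filter_null
```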
The values contained in Sigma rules can be modified by value modifiers. Value modifiers are appended after the field name with a pipe character | as separator and can also be chained, e.g. fieldname|mod1|mod2: value. The value modifiers are applied in the given order to the value.
There are two types of value modifiers:
- Transformation modifiers transform values into different values, like the two Base64 modifiers base64 and base64offset. Furthermore, this type of modifier is also able to change the logical operation between values. Transformation modifiers are generally backend-agnostic, which means you can use them with any backend.
- Type modifiers change the type of a value. The value itself might also be changed by such a modifier, but the main purpose is to tell the backend that a value should be handled differently, e.g. it should be treated as a regular expression when the re modifier is used. Type modifiers must be supported by the backend.
Generally, value modifiers work on single values and value lists. A value might also expand into multiple values.
Transformation modifiers:
- contains: puts * wildcards around the values, such that the value is matched anywhere in the field.
- all: Normally, lists of values are linked with OR in the generated query. This modifier changes this to AND. This is useful if you want to express a command line invocation with different parameters where the order may vary, and it removes the need for some cumbersome workarounds.
- base64: The value is encoded with Base64.
- base64offset: If a value might appear somewhere in a Base64-encoded value, its representation might change depending on its position in the overall value. There are three variants for shifts by zero to two bytes and, except for the first and last byte, the encoded values have a static part in the middle that can be recognized.
- endswith: The value is expected at the end of the field's content (replaces e.g. '*\cmd.exe').
- startswith: The value is expected at the beginning of the field's content (replaces e.g. 'adm*').
- utf16le: transforms the value to UTF16-LE encoding, e.g. cmd > 63 00 6d 00 64 00 (only used in combination with base64 modifiers)
- utf16be: transforms the value to UTF16-BE encoding, e.g. cmd > 00 63 00 6d 00 64 (only used in combination with base64 modifiers)
- wide: alias for the utf16le modifier
- utf16: prepends a byte order mark and encodes UTF16, e.g. cmd > FF FE 63 00 6d 00 64 00 (only used in combination with base64 modifiers)
Type modifiers:
- re: the value is handled as a regular expression by backends. Currently, this is only supported by the Elasticsearch query string backend (es-qs). Further backends (like Splunk) are planned or have to be implemented by contributors with access to the target systems.
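As a sketch of how chained modifiers read in practice, the following detection section combines several of the transformation modifiers. The field names and search strings are illustrative assumptions, not values from an actual rule.

```yaml
detection:
    # endswith: value must appear at the end of the field.
    selection_image:
        Image|endswith: '\powershell.exe'
    # contains + all: every listed string must appear somewhere in the field.
    selection_flags:
        CommandLine|contains|all:
            - '-nop'
            - '-w hidden'
    # Applied left to right: encode as UTF16-LE, build the three Base64
    # offset variants, then wrap each variant in wildcards.
    selection_encoded:
        CommandLine|utf16le|base64offset|contains: 'DownloadString'
    condition: selection_image and (selection_flags or selection_encoded)
```

Because the modifiers are applied in the given order, swapping contains and base64offset in the last selection would encode the wildcards along with the value, which is usually not what is wanted.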
A relative time frame is defined using the typical abbreviations for day, hour, minute and second; as the examples show, M denotes months.
Examples:
15s (15 seconds)
30m (30 minutes)
12h (12 hours)
7d (7 days)
3M (3 months)
The time frame is defined in the timeframe attribute of the detection section.
Note: The time frame is often a manual setting that has to be defined within the SIEM system and is not part of the generated query.
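To show where timeframe sits, the following sketch places it next to the condition inside the detection section. It relies on the deprecated aggregation syntax from the condition attribute, because a time frame only becomes meaningful together with an aggregation or near expression. The event ID and field names are illustrative.

```yaml
detection:
    selection:
        EventID: 4625
    # Relative time frame applied to the aggregation below.
    timeframe: 1h
    # Deprecated pipe/aggregation syntax, used here only to give timeframe a context.
    condition: selection | count(TargetUserName) by WorkstationName > 3
```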
Attribute: condition
The condition is the most complex part of the specification and will be subject to change over time and arising requirements. In the first release it will support the following expressions.
- Logical AND/OR
  keywords1 or keywords2
- 1/all of search-identifier
  Same as just 'keywords' if keywords are defined in a list. X (the quantifier before 'of') may be:
  - 1 (logical or across alternatives)
  - all (logical and across alternatives)
  Example: all of keywords means that all items of the list keywords must appear, instead of the default behaviour of any of the listed items.
- 1/all of them
  Logical OR (1 of them) or AND (all of them) across all defined search identifiers. The search identifiers themselves are logically linked with their default behaviour for maps (AND) and lists (OR).
  The usage of all of them is discouraged, as it prevents downstream users of a rule from generically filtering unwanted matches. See all of {search-identifier-pattern} in the next section as the preferred method.
  Example: 1 of them means that one of the defined search identifiers must appear.
- 1/all of search-identifier-pattern
  Same as 1/all of them, but restricted to matching search identifiers. Matching is done with * wildcards (any number of characters) at arbitrary positions in the pattern.
  Examples:
    all of selection*
    1 of selection* and keywords
    any of selection* and not filters
- Negation with 'not'
  keywords and not filters
- Brackets
  selection1 and (keywords1 or keywords2)
- Pipe (deprecated)
  search_expression | aggregation_expression
  A pipe indicates that the result of search_expression is aggregated by aggregation_expression and possibly compared with a value.
  The first expression must be a search expression that is followed by an aggregation expression with a condition.
  Aggregations in the condition are deprecated and will be replaced with Sigma correlations.
- Aggregation expression (deprecated)
  agg-function(agg-field) [ by group-field ] comparison-op value
  agg-function may be:
  - count
  - min
  - max
  - avg
  - sum
  All aggregation functions except count require a field name as parameter. The count aggregation counts all matching events if no field name is given. With a field name it counts the distinct values in this field.
  Example: count(UserName) by SourceWorkstation > 3
  This comparison counts distinct user names grouped by SourceWorkstations.
- Near aggregation expression
  near search-id-1 [ [ and search-id-2 | and not search-id-3 ] ... ]
  This expression generates (if supported by the target system and backend) a query that recognizes search_expression (primary event) if the given conditions are or are not in the temporal context of the primary event within the given time frame.
Operator Precedence (least to most binding)
- |
- or
- and
- not
- x of search-identifier
- ( expression )
If multiple conditions are given, they are logically linked with OR.
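The following sketch combines a search-identifier pattern, brackets and negation in one condition. All search identifiers, field names and values are hypothetical. Following the precedence list above, all of selection_* is evaluated first, then not, and the brackets group the or before the final and.

```yaml
detection:
    selection_image:
        Image: '*\rundll32.exe'
    selection_parent:
        ParentImage: '*\winword.exe'
    keywords:
        - 'suspicious string one'
        - 'suspicious string two'
    filter_legit:
        CommandLine: '*shell32.dll*'
    # (both selection_* maps match, or any keyword matches) and the filter does not match
    condition: (all of selection_* or keywords) and not filter_legit
```

Keeping the exclusion in its own search identifier, rather than using all of them, leaves room for downstream users to add further filters.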
Attribute: fields
A list of log fields that could be interesting in further analysis of the event and should be displayed to the analyst.
Attribute: falsepositives
A list of known false positives that may occur.
Attribute: level
The level field contains one of five string values. It describes the criticality of a triggered rule. While low and medium level events have an informative character, events with high and critical level should lead to immediate reviews by security analysts.
- informational: Rule is intended for enrichment of events, e.g. by tagging them. No case or alerting should be triggered by such rules because it is expected that a huge amount of events will match these rules.
- low: Notable event but rarely an incident. Low rated events can be relevant in high numbers or combination with others. Immediate reaction shouldn't be necessary, but a regular review is recommended.
- medium: Relevant event that should be reviewed manually on a more frequent basis.
- high: Relevant event that should trigger an internal alert and requires a prompt review.
- critical: Highly relevant event that indicates an incident. Critical events should be reviewed immediately.
Attribute: tags
A Sigma rule can be categorised with tags. Tags should generally follow this syntax:
- Character set: lower-case letters, digits, underscores and hyphens
- no spaces
- Tags are namespaced, the dot is used as separator. e.g. attack.t1234 refers to technique 1234 in the namespace attack; Namespaces may also be nested
- Keep tags short, e.g. numeric identifiers instead of long sentences
- If applicable, use predefined tags. Feel free to send pull requests or issues with proposals for new tags
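A tags attribute that follows these conventions could look like the sketch below. The first tag is the example from the list above; the second is a hypothetical nested namespace invented for illustration.

```yaml
tags:
    - attack.t1234             # technique 1234 in the namespace "attack"
    - example.team-a.web_app   # hypothetical nested namespace
```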
Placeholders can be used to select a set of elements that can be expanded during conversion. Placeholders map an identifier to a user-defined value that can be set in config files for an automatic replacement during conversion runs. Placeholders are meaningful identifiers that users can easily expand themselves.
- %Administrators% - Administrative user accounts
- %JumpServers% - Server systems used as jump servers
Some SIEM systems allow using so-called "tags" or "search macros" in queries and can integrate Sigma rules with placeholders directly. Others expand the placeholder values to wildcard strings or regular expressions.
Splunk
- AccountName: %Administrators% converts to tag=Administrators
Elasticsearch
- SourceWorkstation: %JumpServers% converts to "SourceWorkstation": SRV110[12]
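Inside a rule, the two placeholders could be used as in the following sketch, which looks for administrative logons that do not originate from a jump server. The event ID and field names are illustrative assumptions. The placeholder values are quoted because a plain (unquoted) YAML scalar cannot start with the % character.

```yaml
detection:
    selection:
        EventID: 4624
        # Expanded or mapped to a tag/macro during conversion.
        AccountName: '%Administrators%'
    filter:
        SourceWorkstation: '%JumpServers%'
    condition: selection and not filter
```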
A file may contain multiple YAML documents. These can be complete Sigma rules or action documents. A YAML document is handled as an action document if the action attribute on the top level is set to:
- global: Defines YAML content that is merged into all following YAML rule documents in this file. Multiple global action documents are accumulated.
  Use case: define metadata and rule parts that are common across all Sigma rules of a collection.
- reset: Resets global YAML content defined by global action documents.
- repeat: Repeats generation of the previous rule document with merged data from this YAML document.
  Use case: small modifications of a previously generated rule.
A common use case is the definition of multiple Sigma rules for similar events like Windows Security EventID 4688 and Sysmon EventID 1. Both are created for process execution events. A Sigma rule collection for this scenario could contain three documents:
- A global action document that defines common metadata and detection indicators
- A rule that defines Windows Security log source and EventID 4688
- A rule that defines Windows Sysmon log source and EventID 1
An alternative solution could be:
1. A global action document that defines common metadata.
2. The Security/4688 rule with all event details.
3. A repeat action document that replaces the logsource and EventID from the rule defined in 2.
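The first variant, with a global action document followed by two rule documents, might be sketched as below. The title and the command line pattern are hypothetical, and the sketch assumes that nested maps such as detection are merged key by key, so that each rule ends up with both CommandLine and its own EventID in selection.

```yaml
# Global action document: shared metadata and detection indicators.
action: global
title: Suspicious Process Creation   # hypothetical title
status: experimental
detection:
    selection:
        CommandLine: '*evil.exe*'
    condition: selection
---
# Windows Security variant
logsource:
    product: windows
    service: security
detection:
    selection:
        EventID: 4688
---
# Sysmon variant
logsource:
    product: windows
    service: sysmon
detection:
    selection:
        EventID: 1
```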