0032: Definitions of ECS Compliance

  • Stage: 2 (candidate)
  • Date: 2022-05-16

Events described as "ECS-compliant" follow the ECS guidelines and best practices. While the guidelines provide an overview, more detailed, established guidance aids both mapping to and understanding of ECS events. This proposal standardizes what is and is not expected of an ECS-compliant event.

This document's usage of the terms must, must not, should, should not, required, and may is in accordance with IETF RFC 2119.

Fields

The following sections describe the required, suggested, and optional practices. Later sections include more detailed examples of the requirements and guidelines.

Minimum requirements

An ECS-compliant event MUST:

  • populate the @timestamp field with the date/time the event originated.
  • set ecs.version to the ECS version this event conforms to.
  • index all ECS fields using the data type defined in the schema. A type from the same type family may be substituted (e.g., keyword for wildcard).
  • use nested fields over dotted. ECS events should use nested objects, { "log": { "level": "debug" } }, over dotted field names, { "log.level": "debug" }.
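The structural checks above can be sketched in a few lines of Python. This is an illustrative sketch, not an official validator: the function name `minimum_compliance_problems` is hypothetical, it assumes `@timestamp` arrives as an ISO 8601 string, and it does not check data types, which would require the ECS schema itself.

```python
from datetime import datetime


def minimum_compliance_problems(event: dict) -> list:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []

    # Requirement: @timestamp is populated (assumed here to be an ISO 8601 string).
    timestamp = event.get("@timestamp")
    if not isinstance(timestamp, str):
        problems.append("@timestamp is missing")
    else:
        try:
            # Older Python versions reject a trailing "Z", so normalize it first.
            datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        except ValueError:
            problems.append("@timestamp is not an ISO 8601 date/time")

    # Requirement: ecs.version is set, as a nested object.
    ecs = event.get("ecs")
    if not (isinstance(ecs, dict) and isinstance(ecs.get("version"), str)):
        problems.append("ecs.version is missing")

    # Requirement: nested objects rather than dotted field names.
    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                full = f"{path}.{key}" if path else key
                if "." in key:
                    problems.append(f"dotted field name: {full}")
                walk(value, full)
        elif isinstance(node, list):
            for item in node:
                walk(item, path)

    walk(event, "")
    return problems
```

An event such as `{"ecs": {"version": "8.1.0"}, "log.level": "debug"}` would be reported both for the missing `@timestamp` and for the dotted `log.level` key.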

Recommended guidelines

ECS-compliant events SHOULD:

  • map the contents of the original event to as many ECS fields as possible.
  • populate the top-level message field.
  • use an array whenever a field expects one, even if the array contains a single value (for example, [ "10.42.42.42" ]).
  • lowercase the value if the field's description calls for it.
  • set the event categorization fields using the allowed values.
  • populate source.* and destination.* as a pair, when possible.
  • populate source.*/destination.* if client.*/server.* are populated.
  • copy all relevant values into the related.* fields.
  • use "breakdown" fields. Breakdown fields take an original value and deconstruct it. Examples include user_agent.* or .domain, .sub_domain, .registered_domain, etc.
  • duplicate the .address field value into either .ip or .domain. Do not populate the .ip and .domain fields directly.
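The `.address` guideline can be illustrated with a small helper. This is a sketch under assumptions: the name `with_address` is hypothetical, and treating every non-IP value as a domain is a simplification (an address could also be, for example, a Unix socket path).

```python
import ipaddress


def with_address(endpoint: dict, address: str) -> dict:
    """Return a copy of `endpoint` with `address` set and duplicated to ip or domain."""
    result = dict(endpoint)
    result["address"] = address
    try:
        # ip_address() accepts both IPv4 and IPv6 literals.
        ipaddress.ip_address(address)
    except ValueError:
        # Assumption: anything that is not an IP literal is treated as a domain.
        result["domain"] = address
    else:
        result["ip"] = address
    return result
```

For instance, `with_address({}, "192.168.64.1")` fills both `address` and `ip`, while `with_address({}, "example.com")` fills `address` and `domain`.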

Optional

ECS-compliant events MAY:

  • store the entire raw, original event in event.original. Disable indexing and doc_values on event.original to reduce storage.
  • add multi-fields not defined by ECS. For example, a text multi-field with a custom analyzer.
  • remove unused ECS fields or entire field sets from an index mapping.
  • use custom fields alongside ECS fields in an event, following these conventions:
    • ECS field names avoid using proper nouns. Nest custom fields in a namespace using a proper noun: a tool name, project, or company (e.g., nginx, acme_corp).
    • ECS field names are always lowercase. Use capitalized key names to avoid future field conflicts.
    • Place custom fields inside a dedicated namespace and not at the top-level of an event.
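As a rough illustration of the event.original option, the mapping fragment below disables both indexing and doc_values. It is written as a Python dictionary so it can be serialized to JSON; the names `EVENT_ORIGINAL_MAPPING` and `mapping_body` are hypothetical, and the fragment is a sketch rather than a complete index template.

```python
import json

# Sketch of an index mapping fragment for event.original.
# `index` and `doc_values` are Elasticsearch mapping parameters; with both set
# to False the value stays in _source but is neither searchable nor aggregatable.
EVENT_ORIGINAL_MAPPING = {
    "mappings": {
        "properties": {
            "event": {
                "properties": {
                    "original": {
                        "type": "keyword",
                        "index": False,
                        "doc_values": False,
                    }
                }
            }
        }
    }
}


def mapping_body() -> str:
    """Serialize the fragment as the JSON body of a create-index request."""
    return json.dumps(EVENT_ORIGINAL_MAPPING, indent=2)
```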

Usage

An ECS-compliant event will have @timestamp, ecs.version, and at least one other field. Dots in an ECS field name represent a nested object structure. The following minimal event, paired with the mapping after it, is indexed into Elasticsearch with the correct data types:

{
  "@timestamp": "2022-03-31T18:48:35.000Z",
  "ecs": {
    "version": "8.1.0"
  },
  "message": "example.com 10.0.0.2, 10.0.0.1, 127.0.0.1 - - [07/Dec/2016:11:05:07 +0100] \"GET /ocelot HTTP/1.1\" 200 571 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:49.0) Gecko/20100101 Firefox/49.0\""
}
{
  "mappings" : {
    "properties" : {
      "@timestamp" : {
        "type" : "date"
      },
      "ecs" : {
        "properties" : {
          "version" : {
            "type" : "keyword",
            "ignore_above" : 1024
          }
        }
      },
      "message" : {
        "type" : "match_only_text"
      }
    }
  }
}
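Since dots in an ECS field name stand for nesting, a producer that starts from dotted keys can expand them before sending the event. The following is a minimal sketch: `expand_dotted` is a hypothetical helper, it assumes no field is used as both a scalar and an object, and it does not descend into lists of objects.

```python
def _merge(target: dict, source: dict) -> None:
    """Recursively merge `source` into `target` in place."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            _merge(target[key], value)
        else:
            target[key] = value


def expand_dotted(event: dict) -> dict:
    """Return a copy of `event` with dotted keys expanded into nested objects."""
    nested: dict = {}
    for key, value in event.items():
        if isinstance(value, dict):
            value = expand_dotted(value)
        # "log.level" -> parents ["log"], leaf "level"
        *parents, leaf = key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        if isinstance(value, dict) and isinstance(node.get(leaf), dict):
            # The object already exists (e.g. from an earlier dotted key): merge.
            _merge(node[leaf], value)
        else:
            node[leaf] = value
    return nested
```

With this sketch, `{"log.level": "debug"}` becomes `{"log": {"level": "debug"}}`, the nested form shown in the minimum requirements.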

The following compliant event builds on the minimum requirements and incorporates many of the recommended guidelines. The numbered notes after it describe how.

{
  "agent": {
    "name": "test",
    "id": "a0e86cd2-d38b-4801-8d54-db5f2fb7f7e1",
    "ephemeral_id": "8568c102-6c2d-495d-800b-bc5b89cde1b6",
    "type": "filebeat",
    "version": "8.1.2"
  },
  "log": {
    "file": {
      "path": "/var/log/nginx/access.log"
    },
    "offset": 2716
  },
  "source": {
    "address": "192.168.64.1",
    "ip": "192.168.64.1"
  },
  "destination": {
    "address": "192.168.64.2",
    "ip": "192.168.64.2"
  },
  "url": {
    "path": "/",
    "original": "/"
  },
  "tags": [
    "nginx-access"
  ],
  "@timestamp": "2022-03-31T18:48:35.000Z",
  "ecs": {
    "version": "8.0.0"
  },
  "related": {
    "ip": [
      "192.168.64.1",
      "192.168.64.2",
      "fe80::9c5f:77ff:fe74:604"
    ]
  },
  "nginx": {
    "access": {
      "remote_ip_list": [
        "10.42.42.42"
      ]
    }
  },
  "host": {
    "hostname": "test",
    "os": {
      "kernel": "5.4.0-105-generic",
      "codename": "focal",
      "name": "Ubuntu",
      "type": "linux",
      "family": "debian",
      "version": "20.04.4 LTS (Focal Fossa)",
      "platform": "ubuntu"
    },
    "ip": [
      "192.168.64.2",
      "fe80::9c5f:77ff:fe74:604"
    ],
    "name": "test",
    "id": "39c062dece654ac393c9f62fc2be2b11",
    "mac": [
      "9e:5f:77:74:06:04"
    ],
    "architecture": "x86_64"
  },
  "http": {
    "request": {
      "method": "GET"
    },
    "response": {
      "status_code": 304,
      "body": {
        "bytes": 0
      }
    },
    "version": "1.1"
  },
  "event": {
    "agent_id_status": "verified",
    "ingested": "2022-03-31T18:48:38Z",
    "timezone": "-05:00",
    "created": "2022-03-31T18:48:37.472Z",
    "kind": "event",
    "category": [
      "web"
    ],
    "type": [
      "access"
    ],
    "dataset": "nginx.access",
    "outcome": "success"
  },
  "user_agent": {
    "original": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36",
    "os": {
      "name": "Mac OS X",
      "version": "10.15.7",
      "full": "Mac OS X 10.15.7"
    },
    "name": "Chrome",
    "device": {
      "name": "Mac"
    },
    "version": "99.0.4844.84"
  }
}
  1. Maps as many fields as possible based on what was available in the event. An Elasticsearch ingest pipeline also populates more fields.
  2. Fields expecting arrays, like event.category, use arrays even for a single value.
  3. Adds the event categorization fields (event.kind, event.category, event.type, and event.outcome).
  4. Populates source.* and destination.* fields as a pair.
  5. Copies all IP addresses into the related.ip field.
  6. Copies the source.address and destination.address values to their sibling .ip fields.
  7. The original user-agent value populates user_agent.original. Other fields hold the broken down values: user_agent.os.*, user_agent.name, user_agent.version, etc.
  8. A custom namespace, nginx.*, holds any custom fields. Nginx is a proper noun and will never conflict with any future ECS field names.
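Note 5 can be sketched as a small collection step. The helper name `collect_related_ips` and the list of field sets it inspects are assumptions for illustration; a real pipeline would decide which values are relevant for its own data.

```python
def collect_related_ips(event: dict) -> list:
    """Gather unique IP values from common field sets, preserving first-seen order."""
    ips = []

    def add(value):
        # Accept either a single value or an array of values.
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, str) and item not in ips:
                ips.append(item)

    # Assumption: these field sets are the ones that carry IP addresses here.
    for field_set in ("source", "destination", "client", "server", "host"):
        add(event.get(field_set, {}).get("ip"))
    return ips
```

Applied to the source.ip, destination.ip, and host.ip values of the example event, this yields the same three addresses that appear in its related.ip array.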

Source data

The ECS-compliant guidance applies to all ECS data sources.

Scope of impact

This proposal is informational and includes no changes to ECS. Beats, Elastic Agent, and APM users already gain the benefits of ECS-compliant data. Custom sources will benefit from normalizing their data into ECS-compliant events.

Concerns

Arrays in ECS

Any field can contain one or more values of the same type. There is no dedicated array type in Elasticsearch. Why distinguish array vs. non-array fields in ECS?

While Elasticsearch is permissive, other programming languages and configuration formats distinguish arrays from single values. Components adopting ECS can rely on knowing which fields do and do not use arrays.
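A producer written in a language that does distinguish arrays from single values might normalize array fields before emitting an event. The sketch below is illustrative only: `ARRAY_FIELDS` is a hand-picked subset drawn from the example event in this document, not a list generated from the ECS schema, and `ensure_arrays` is a hypothetical helper.

```python
import copy

# Assumption: a small, hand-maintained subset of fields that expect arrays.
ARRAY_FIELDS = ("event.category", "event.type", "host.ip", "host.mac", "related.ip", "tags")


def ensure_arrays(event: dict, array_fields=ARRAY_FIELDS) -> dict:
    """Return a copy of `event` where listed fields hold arrays, even for one value."""
    result = copy.deepcopy(event)
    for field in array_fields:
        *parents, leaf = field.split(".")
        node = result
        for part in parents:
            # Stop descending as soon as the path does not exist.
            node = node.get(part) if isinstance(node, dict) else None
        if isinstance(node, dict) and leaf in node and not isinstance(node[leaf], list):
            node[leaf] = [node[leaf]]
    return result
```

Consumers can then iterate over fields such as event.category without first checking whether the value is a string or a list.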

People

The following are the people who consulted on the contents of this RFC.

  • @ebeahan | author, sponsor

References

RFC Pull Requests