The Elastic Common Schema (ECS) provides a common data model for
data ingested into Elasticsearch. Having a common schema allows you to correlate
data from sources like logs and metrics, or from IT operations
analytics and security analytics.
ECS is still under development and backward compatibility is not guaranteed. Any
feedback on the general structure, missing fields, or existing fields is appreciated.
For contributions please read the Contributing Guide.
The base set contains all fields which are on the top level without a namespace.
These are fields which are common across all types of events.
Field
Description
Type
Multi Field
Example
@timestamp
Timestamp when the event was created. For log events this is expected to be when the event was generated and not when it was read. Timestamp is a required field and must exist in all events.
date
2016-05-23T08:05:34.853Z
tags
Tags is a list of keywords which are used to tag each event.
keyword
["production", "env2"]
labels
Labels is an object which contains key/value pairs. Labels can be used to add additional meta information to events. Labels should not contain nested objects, and all values are stored as keyword. An example usage is the docker and k8s labels.
object
{key1: value1, key2: value2}
message
For log events the message field contains the log message. In other use cases the message field can be used to concatenate different values, which are then freely searchable. If multiple messages exist, they can be combined here into one message.
text
Hello World
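As a rough illustration of the base fields above, the following sketch builds an event as a plain Python dict. The helper name make_base_event and its parameters are hypothetical and not part of ECS; the sample values are taken from the Example column.

```python
import json
from datetime import datetime, timezone


def make_base_event(message, tags=None, labels=None, created_at=None):
    """Build a dict holding only the ECS base fields (hypothetical helper)."""
    created_at = created_at or datetime.now(timezone.utc)
    # Millisecond precision with a trailing "Z", matching the example format.
    timestamp = created_at.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    timestamp = timestamp.replace("+00:00", "Z")
    event = {"@timestamp": timestamp, "message": message}
    if tags:
        event["tags"] = list(tags)
    if labels:
        # Labels must stay flat, and every value is stored as keyword (string).
        for key, value in labels.items():
            if isinstance(value, (dict, list)):
                raise ValueError(f"label {key!r} must not be nested")
        event["labels"] = {key: str(value) for key, value in labels.items()}
    return event


example = make_base_event(
    "Hello World",
    tags=["production", "env2"],
    labels={"key1": "value1", "key2": "value2"},
    created_at=datetime(2016, 5, 23, 8, 5, 34, 853000, tzinfo=timezone.utc),
)
print(json.dumps(example, indent=2))
```

Rejecting nested label values up front is one way to honour the "no nested objects" rule; an implementation could also flatten them instead.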
Agent fields
The agent fields contain the data about the agent/client/shipper that created the event.
For example, in the case of Beats for logs, the agent.name is filebeat. In the case of APM it is the agent running in the app / service. The agent information does not change if data is sent through a queuing system like Kafka or Redis, or through processing systems like Logstash or APM Server.
Field
Description
Type
Multi Field
Example
agent.version
Agent version.
keyword
6.0.0-rc2
agent.name
Name of the agent.
keyword
filebeat
agent.id
Unique identifier of this agent if one exists. In the case of Beats this would be beat.id.
keyword
8a4f500d
agent.ephemeral_id
Ephemeral identifier of this agent if one exists. Unlike agent.id, this id normally changes across restarts.
keyword
8a4f500f
Cloud fields
All fields related to the cloud or infrastructure the events are coming from.
If Metricbeat is running on an EC2 host and fetches data from that host, the cloud info is expected to contain the data about this machine. If Metricbeat runs outside the cloud on a remote machine and fetches data from a service running in the cloud, the cloud info is expected to contain the data about the machine the service is running on.
Field
Description
Type
Multi Field
Example
cloud.provider
Name of the cloud provider. Example values are ec2, gce, or digitalocean.
keyword
ec2
cloud.availability_zone
Availability zone in which this host is running.
keyword
us-east-1c
cloud.region
Region in which this host is running.
keyword
us-east-1
cloud.instance.id
Instance ID of the host machine.
keyword
i-1234567890abcdef0
cloud.instance.name
Instance name of the host machine.
keyword
cloud.machine.type
Machine type of the host machine.
keyword
t2.medium
Container fields
Container fields are used for meta information about the specific container the information is coming from. This should help to correlate data based on containers from any runtime.
Field
Description
Type
Multi Field
Example
container.runtime
Runtime managing this container.
keyword
docker
container.id
Unique container id.
keyword
container.image.name
Name of the image the container was built on.
keyword
container.image.tag
Container image tag.
keyword
container.name
Container name.
keyword
container.labels
Image labels.
object
Destination fields
Destination fields describe details about the destination of a packet/event.
Field
Description
Type
Multi Field
Example
destination.ip
IP address of the destination. This can be one or multiple IPv4 or IPv6 addresses.
ip
destination.hostname
Hostname of the destination.
keyword
destination.port
Port of the destination.
long
destination.mac
MAC address of the destination.
keyword
destination.domain
Destination domain.
keyword
destination.subdomain
Destination subdomain.
keyword
Device fields
Device fields are used to give additional information about the device that the information is coming from.
This could be a firewall, network device, etc.
Field
Description
Type
Multi Field
Example
device.mac
MAC address of the device.
keyword
device.ip
IP address of the device.
ip
device.hostname
Hostname of the device.
keyword
device.vendor
Device vendor information.
text
device.version
Device version.
keyword
device.serial_number
Device serial number.
keyword
device.timezone.offset.sec
Timezone offset of the host in seconds. Number of seconds relative to UTC. In case the offset is -01:30 the value will be -5400.
long
-5400
device.type
The type of the device the data is coming from. There is no predefined list of device types. Some examples are endpoint, firewall, ids, ips, proxy.
keyword
firewall
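The arithmetic behind device.timezone.offset.sec (and host.timezone.offset.sec, which uses the same definition) can be sketched as follows. The function name offset_seconds and the "+HH:MM" / "-HH:MM" input format are assumptions made for this example.

```python
def offset_seconds(offset):
    """Convert a "+HH:MM" or "-HH:MM" UTC offset string to signed seconds."""
    sign = -1 if offset.startswith("-") else 1
    hours, minutes = offset.lstrip("+-").split(":")
    # 1 hour = 3600 seconds, 1 minute = 60 seconds.
    return sign * (int(hours) * 3600 + int(minutes) * 60)


print(offset_seconds("-01:30"))  # -(1 * 3600 + 30 * 60) = -5400
```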
Error fields
Error namespace
This can be used to represent all kinds of errors. It can be for errors that happen while fetching events or if the event itself contains an error.
Field
Description
Type
Multi Field
Example
error.id
Unique identifier for the error.
keyword
error.message
Error message.
text
error.code
Error code describing the error.
keyword
Event fields
The event fields are used for context information about the data itself.
Field
Description
Type
Multi Field
Example
event.id
Unique ID to describe the event.
keyword
8a4f500d
event.category
Event category. This can be a user defined category.
keyword
metrics
event.type
A type given to this kind of event which can be used for grouping. This is normally defined by the user.
keyword
nginx-stats-metrics
event.module
Name of the module this data is coming from. This information is coming from the modules used in Beats or Logstash.
keyword
mysql
event.dataset
Name of the dataset. The concept of a dataset (fileset / metricset) is used in Beats as a subset of modules. It contains the information which is currently stored in metricset.name and metricset.module or fileset.name.
keyword
stats
event.severity
Severity describes the severity of the event. What the different severity values mean can vary between use cases. It's up to the implementer to make sure severities are consistent across events.
long
7
event.raw
Raw text message of entire event to be used to demonstrate log integrity.
Hash (perhaps logstash fingerprint) of raw field to be able to demonstrate log integrity.
keyword
123456789012345678901234567890ABCD
event.version
The ECS version this event adheres to. Providing it with each event makes it possible to detect which ECS version an event belongs to. event.version is a required field and must exist in all events. The current version is 0.1.0.
keyword
0.1.0
event.duration
Duration of the event in nanoseconds.
long
event.created
event.created contains the date when the event was created. This timestamp is distinct from @timestamp in that @timestamp contains the processed timestamp. For logs these two timestamps can differ, because the timestamp in the log line and the time the event is read, for example by Filebeat, are not identical. @timestamp must contain the timestamp extracted from the log line, and event.created the time when the log line is read. The same could apply to packet capturing, where @timestamp contains the timestamp extracted from the network packet and event.created the time when the event was created. In case the two timestamps are identical, @timestamp should be used.
date
event.risk_score
Risk score value of the event.
float
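A minimal sketch of how event.duration and event.created might be filled in, assuming @timestamp carries the time parsed from the log line and event.created the time the line was read, as the event.created description requires. The helper names and the sample times are hypothetical.

```python
from datetime import datetime, timezone


def duration_ns(start, end):
    """Return the elapsed time between two datetimes in whole nanoseconds."""
    delta = end - start
    # Work in integer microseconds to avoid float rounding, then scale to ns.
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    return micros * 1_000


def build_event(log_time, read_time, started, finished):
    """Assemble @timestamp, event.duration and, if needed, event.created."""
    event = {
        "@timestamp": log_time.isoformat(),
        "event": {"duration": duration_ns(started, finished)},
    }
    # When both timestamps are identical, only @timestamp is used.
    if read_time != log_time:
        event["event"]["created"] = read_time.isoformat()
    return event


log_time = datetime(2016, 5, 23, 8, 5, 34, 853000, tzinfo=timezone.utc)
read_time = datetime(2016, 5, 23, 8, 5, 36, tzinfo=timezone.utc)
started = log_time
finished = datetime(2016, 5, 23, 8, 5, 35, 103000, tzinfo=timezone.utc)

print(build_event(log_time, read_time, started, finished))
```

Python's datetime only resolves microseconds, so the nanosecond value here is always a multiple of 1000.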
File fields
File attributes.
Field
Description
Type
Multi Field
Example
file.path
The path to the file.
text
file.path.raw
The path to the file. This is a non-analyzed field that is useful for aggregations.
keyword
1
file.target_path
The target path for symlinks.
text
file.target_path.raw
The target path for symlinks. This is a non-analyzed field that is useful for aggregations.
keyword
1
file.extension
The file extension. This should allow easy filtering by file extensions.
keyword
png
file.type
The file type (file, dir, or symlink).
keyword
file.device
The device.
keyword
file.inode
The inode representing the file in the filesystem.
keyword
file.uid
The user ID (UID) or security identifier (SID) of the file owner.
keyword
file.owner
The file owner's username.
keyword
file.gid
The primary group ID (GID) of the file.
keyword
file.group
The primary group name of the file.
keyword
file.mode
The mode of the file in octal representation.
keyword
416
file.size
The file size in bytes (field is only added when type is file).
long
file.mtime
The last modified time of the file (time when content was modified).
date
file.ctime
The last change time of the file (time when metadata was changed).
date
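As an illustration, several of the file.* fields can be derived from a stat call. This is only a sketch: the function name file_fields is hypothetical, the four-digit octal string used for file.mode is an assumed rendering, and anything that is neither a directory nor a symlink is simply reported as "file".

```python
import os
import stat
import tempfile
from datetime import datetime, timezone


def file_fields(path):
    """Collect a subset of the file.* fields without following symlinks."""
    info = os.lstat(path)
    if stat.S_ISLNK(info.st_mode):
        file_type = "symlink"
    elif stat.S_ISDIR(info.st_mode):
        file_type = "dir"
    else:
        file_type = "file"  # simplification: sockets, devices etc. land here too
    fields = {
        "path": path,
        "type": file_type,
        "inode": str(info.st_ino),  # keyword fields are stored as strings
        "uid": str(info.st_uid),
        "gid": str(info.st_gid),
        # Permission bits only, rendered in octal (for example "0640").
        "mode": format(stat.S_IMODE(info.st_mode), "04o"),
        "mtime": datetime.fromtimestamp(info.st_mtime, timezone.utc).isoformat(),
        # Note: on Windows st_ctime is the creation time, not the change time.
        "ctime": datetime.fromtimestamp(info.st_ctime, timezone.utc).isoformat(),
    }
    extension = os.path.splitext(path)[1].lstrip(".")
    if extension:
        fields["extension"] = extension
    if file_type == "file":
        fields["size"] = info.st_size  # only added when type is file
    if file_type == "symlink":
        fields["target_path"] = os.readlink(path)
    return fields


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "logo.png")
    with open(demo_path, "wb") as handle:
        handle.write(b"hello")
    demo = file_fields(demo_path)
print(demo)
```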
Geoip fields
Geoip fields are used for geo information for an IP address.
The conversion to geoip information can be done by the Elasticsearch geoip plugin.
Field
Description
Type
Multi Field
Example
geoip.continent_name
The name of the continent.
keyword
geoip.country_iso_code
Country ISO code.
keyword
geoip.location
The longitude and latitude.
geo_point
geoip.region_name
The region name.
keyword
geoip.city_name
The city name.
keyword
Host fields
All fields related to a host. A host can be a physical machine, a virtual machine, and also a Docker container.
Normally the host information is related to the machine on which the event was generated / collected but also can be used differently if needed.
Field
Description
Type
Multi Field
Example
host.timezone.offset.sec
Timezone offset of the host in seconds. Number of seconds relative to UTC. In case the offset is -01:30 the value will be -5400.
long
-5400
host.name
host.name is the hostname of the host. It can contain what hostname returns on Unix systems, the fully qualified domain name or also a name specified by the user. It is up to the sender to decide which value to use.
keyword
host.id
Unique host id. As hostname is not always unique, this often can be configured by the user. An example here is the current usage of beat.name.
keyword
host.ip
Host ip address.
ip
host.mac
Host mac address.
keyword
host.type
This is the type of the host. For cloud providers this can be the machine type, like t2.medium. It can also be vm or container, for example, or something user defined.
keyword
host.os.platform
Operating system platform (e.g. centos, ubuntu, windows).
keyword
darwin
host.os.name
Operating system name.
keyword
Mac OS X
host.os.family
OS family (e.g. redhat, debian, freebsd, windows).
keyword
debian
host.os.version
Operating system version.
keyword
10.12.6
host.architecture
Operating system architecture.
keyword
x86_64
Kubernetes fields
Kubernetes fields are used for meta information about k8s. This should help to correlate data coming out of k8s setups.
Field
Description
Type
Multi Field
Example
kubernetes.pod.name
Kubernetes pod name
keyword
kubernetes.namespace
Kubernetes namespace
keyword
kubernetes.labels
Kubernetes labels map
object
kubernetes.annotations
Kubernetes annotations map
object
kubernetes.container.name
Kubernetes container name. This name is unique within the pod only. It is different from the underlying container name (container.name in ECS).
keyword
Log fields
Fields which are specific to log events.
Field
Description
Type
Multi Field
Example
log.level
Log level of the log event. Some examples are WARN, ERR, INFO.
keyword
ERR
log.line
Line number the log event was collected from.
long
18
log.offset
Offset of the beginning of the log event.
long
12
Network fields
All fields related to network data.
Field
Description
Type
Multi Field
Example
network.protocol
Network protocol name.
keyword
http
network.direction
Direction of the network traffic. The recommended values are inbound, outbound, and unknown.
keyword
inbound
network.forwarded_ip
forwarded_ip indicates the host IP address when the source IP address is the proxy.
ip
192.1.1.2
network.inbound.bytes
Network inbound bytes.
long
184
network.inbound.packets
Network inbound packets.
long
12
network.outbound.bytes
Network outbound bytes.
long
184
network.outbound.packets
Network outbound packets.
long
12
Organization fields
The organization namespace can be used to enrich data with information about the organization the data belongs to.
This can be useful if data stored in the same index should sometimes be filtered or organized by one or multiple organizations.
Field
Description
Type
Multi Field
Example
organization.name
Organization name.
text
organization.id
Unique identifier for the organization.
keyword
Process fields
These fields contain information about a process.
If metrics information is collected for a process and a process id / name shows up in a log message, these fields should help to correlate the two. It is expected that the process.pid will often also stay in the metric itself and only be copied to the global field for correlation.
Field
Description
Type
Multi Field
Example
process.args
Process arguments. May be filtered to protect sensitive information.
keyword
['-l', 'user', '10.0.0.16']
process.name
Process name. This is sometimes also known as program name or similar.
keyword
ssh
process.pid
Process id.
long
process.ppid
Process parent id.
long
process.title
Process title. The proctitle, often the same as process name.
keyword
Service fields
The service fields describe the service for / from which the data was collected.
If logs or metrics are collected from Redis, service.name would be redis. This makes it possible to find and correlate logs for a specific service, and even for a specific version with service.version.
Field
Description
Type
Multi Field
Example
service.id
Unique identifier of the running service. This id should uniquely identify this service. This makes it possible to correlate logs and metrics for one specific service. For example in case of issues with one redis instance, it's possible to filter on the id to see metrics and logs for this single instance.
keyword
d37e5ebfe0ae6c4972dbe9f0174a1637bb8247f6
service.name
Name of the service data is collected from. The name can be used to group logs and metrics together from one service and correlate them.
keyword
elasticsearch
service.type
Service type.
keyword
service.state
Current state of the service.
keyword
service.version
Version of the service the data was collected from. This makes it possible to look at a data set only for a specific version of a service.
keyword
3.2.4
service.ephemeral_id
Ephemeral identifier of this service if one exists. Unlike service.id, this id normally changes across restarts.
keyword
8a4f500f
Source fields
Source fields describe details about the source the event is coming from.
Field
Description
Type
Multi Field
Example
source.ip
IP address of the source. This can be one or multiple IPv4 or IPv6 addresses.
ip
source.hostname
Hostname of the source.
keyword
source.port
Port of the source.
long
source.mac
MAC address of the source.
keyword
source.domain
Source domain.
keyword
source.subdomain
Source subdomain.
keyword
URL fields
A complete URL, with scheme, host, and path.
The URL object can be reused in other prefixes, like host.url.* for example. It is important that the same structure is used wherever URL is used.
url.href is a multi field, which means the data is stored as keyword in url.href and as text in url.href.analyzed. The advantage is that a query against only a part of the URL still works without having to split the URL into all its parts at ingest time.
href contains the full URL. The field is stored as keyword. Because href is a multi field, the analyzed version can be accessed through href.analyzed in queries.
keyword
https://elastic.co:443/search?q=elasticsearch#top
url.href.analyzed
text
1
url.protocol
The protocol of the request, e.g. "https:".
keyword
url.hostname
The hostname of the request, e.g. "example.com". For correlation, this field can be copied into the host.name field.
keyword
url.port
The port of the request, e.g. 443.
keyword
url.pathname
The path of the request, e.g. "/search".
text
url.pathname.raw
The url path. This is a non-analyzed field that is useful for aggregations.
keyword
1
url.search
The search describes the query string of the request, e.g. "q=elasticsearch".
text
url.search.raw
The url search part. This is a non-analyzed field that is useful for aggregations.
keyword
1
url.hash
The hash of the request URL, e.g. "top".
keyword
url.username
The username of the request.
keyword
url.password
The password of the request.
keyword
url.extension
The url extension field contains the extension of the file associated with the url. A simple example is http://localhost/logo.png where the extension would be png. There can also be more complex cases like http://localhost/content?asset=logo.png&token=XYZ where the extension could also be png but depends on the implementation. The extension field should be left out if the extension is not defined.
keyword
png
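One possible way to split a URL into a subset of the url.* fields above is Python's standard urllib.parse module. This is a sketch under assumptions: the function name url_fields is hypothetical, the extension is taken only from the path (the simple case described for url.extension), and username and password are left out.

```python
import posixpath
from urllib.parse import urlsplit


def url_fields(href):
    """Split a URL into a subset of the url.* fields (hypothetical helper)."""
    parts = urlsplit(href)
    fields = {"href": href}
    if parts.scheme:
        fields["protocol"] = parts.scheme + ":"  # the field example is "https:"
    if parts.hostname:
        fields["hostname"] = parts.hostname
    if parts.port is not None:
        fields["port"] = str(parts.port)  # url.port is typed as keyword
    if parts.path:
        fields["pathname"] = parts.path
        extension = posixpath.splitext(parts.path)[1].lstrip(".")
        if extension:
            # Left out entirely when the path has no extension.
            fields["extension"] = extension
    if parts.query:
        fields["search"] = parts.query
    if parts.fragment:
        fields["hash"] = parts.fragment
    return fields


print(url_fields("https://elastic.co:443/search?q=elasticsearch#top"))
print(url_fields("http://localhost/logo.png"))
```

For the more complex case, where the file name sits in a query parameter, the extension logic depends on the implementation and is not covered here.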
User fields
The user fields are used to describe user information as part of the event.
All fields in user can have one or multiple entries. If a user has more than one id, an array with the ids must be provided.
Field
Description
Type
Multi Field
Example
user.id
One or multiple unique identifiers of the user.
keyword
user.name
Name of the user. As the field is a keyword, the field will not be tokenized.
keyword
user.email
User email address.
keyword
user.hash
Unique user hash to correlate information for a user in anonymized form. This is useful in case user.id or user.name cannot be used because it contains confidential information.
keyword
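ECS does not prescribe how user.hash is computed. As one possible approach, the sketch below derives it with a keyed HMAC-SHA256 so that the same user always maps to the same value without exposing the id. The function name user_hash, the choice of algorithm, and the secret key are all assumptions made for illustration.

```python
import hashlib
import hmac


def user_hash(user_id, secret):
    """Return a stable, anonymized hex digest for a user id (assumed scheme)."""
    return hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


secret = b"replace-with-a-private-key"  # placeholder value for the example
event = {"user": {"hash": user_hash("alice", secret)}}
print(event)
```

A keyed hash is used instead of a plain digest because short or guessable ids can otherwise be recovered by hashing candidate values.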
User agent fields
The user_agent fields normally come from a browser request.
They commonly show up in web service logs, where they come from the parsed user agent string.
Field
Description
Type
Multi Field
Example
user_agent.raw
Unparsed version of the user_agent.
text
user_agent.device
The name of the physical device.
keyword
user_agent.version
Version of the physical device.
keyword
user_agent.major
The major version of the user agent.
long
user_agent.minor
The minor version of the user agent.
long
user_agent.patch
The patch version of the user agent.
keyword
user_agent.name
The name of the user agent.
keyword
Chrome
user_agent.os.name
The name of the operating system.
keyword
user_agent.os.version
Version of the operating system.
keyword
user_agent.os.major
The major version of the operating system.
long
user_agent.os.minor
The minor version of the operating system.
long
Use cases
Below are some examples that demonstrate how ECS fields can be applied to
specific use cases.
The following rules apply if an event is to adhere to ECS:
The document MUST have the @timestamp field.
The data type defined for an ECS field MUST be used.
It SHOULD have the field event.version to define which version of ECS it uses.
To make the most out of ECS as many fields as possible should be mapped to ECS.
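The adherence rules above could be checked with a small validator along these lines. It is a sketch only: check_event, get_path and the EXPECTED_TYPES table are hypothetical names, and the table covers just a handful of fields with Python types standing in for the Elasticsearch types.

```python
# Hand-picked subset of ECS fields mapped to the Python type standing in
# for the ECS data type (date/keyword/text -> str, long -> int).
EXPECTED_TYPES = {
    "@timestamp": str,
    "message": str,
    "event.version": str,
    "event.duration": int,
    "source.port": int,
}


def get_path(document, dotted):
    """Look up a dotted field name in a nested dict; None if absent."""
    if dotted in document:  # flat key such as "@timestamp"
        return document[dotted]
    node = document
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def check_event(document):
    """Return a list of rule violations; an empty list means no findings."""
    problems = []
    if "@timestamp" not in document:
        problems.append("MUST: missing @timestamp")
    for field, expected in EXPECTED_TYPES.items():
        value = get_path(document, field)
        if value is not None and not isinstance(value, expected):
            problems.append(f"MUST: {field} should be {expected.__name__}")
    if get_path(document, "event.version") is None:
        problems.append("SHOULD: missing event.version")
    return problems


print(check_event({"@timestamp": "2016-05-23T08:05:34.853Z",
                   "event": {"version": "0.1.0"}}))
print(check_event({"message": "x", "source": {"port": "80"}}))
```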
Rules
ECS uses the following writing and naming rules for the fields. The goal of
these rules is to make the fields easy to remember and to provide a guide when new
fields are added.
Often events will contain additional fields besides ECS. These can follow the
same naming and writing rules but don't have to.
Writing
All fields must be lower case
No special characters except _
Words are combined through underscore
Naming
Use present tense unless field describes historical information.
Use singular and plural names properly to reflect the field content. For example, use requests_per_sec rather than request_per_sec.
Organise the prefixes from general to specific to allow grouping fields into objects with a prefix like host.*.
Avoid stuttering of words. If part of the field name is already in the prefix, do not repeat it. Example: host.host_ip should be host.ip.
Fields must be prefixed except for the base fields. For example all host fields are prefixed with host.. See dot notation in FAQ for more details.
Do not use abbreviations (few exceptions like ip exist)
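Some of the writing and naming rules above lend themselves to an automated check. The sketch below covers lower case, the underscore rule, prefixing, and stuttering; the tense, plural, and abbreviation rules need human judgement and are not checked. The name check_field_name is hypothetical, and allowing digits inside a word is an interpretation of the rules, not something they state.

```python
import re

BASE_FIELDS = {"@timestamp", "tags", "labels", "message"}
# One or more lower-case words joined by single underscores.
SEGMENT = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")


def check_field_name(name):
    """Return a list of rule violations for a dotted field name."""
    if name in BASE_FIELDS:
        return []
    problems = []
    segments = name.split(".")
    if len(segments) < 2:
        problems.append("fields must be prefixed unless they are base fields")
    for index, segment in enumerate(segments):
        if not SEGMENT.match(segment):
            problems.append(f"{segment!r}: use lower case words joined by '_'")
        # Stuttering: a word of this segment repeats an earlier prefix.
        for prefix in segments[:index]:
            if prefix in segment.split("_"):
                problems.append(f"{segment!r} repeats the prefix {prefix!r}")
    return problems


for candidate in ["host.ip", "host.host_ip", "Host.IP", "requests_per_sec"]:
    print(candidate, check_field_name(candidate))
```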
About ECS
Scope
The Elastic Common Schema defines a common set of document fields (and their respective field names) to be used in event messages stored in Elasticsearch as part of any logging or metrics use case of the Elastic Stack, including IT operations analytics and security analytics.
Goals
The ECS has the following goals:
Correlate data between metrics, logs and APM
Correlate data coming from the same machines / hosts
Correlate data coming from the same service
Priority on which fields are added is based on these goals.
Benefits
The benefits to a user adopting these fields and names in their clusters are:
Ability to simply correlate data from different data sources
Improved ability to remember commonly used field names (since there is only a single set, not a set per data source)
Improved ability to deduce unremembered field names (since the field naming follows a small number of rules with few exceptions)
Ability to re-use analysis content (searches, visualizations, dashboards, alerts, reports, and ML jobs) across multiple data sources
Ability to use any future Elastic-provided analysis content in their environment without modifications
FAQ
Why is ECS using a dot notation instead of an underline notation?
There are two common formats for keys when ingesting data into Elasticsearch: dot notation (keys such as user.name) and underline notation (keys such as user_name).
With the dot notation, user is internally represented in Elasticsearch as an object datatype. With the underline notation, each key is just a separate string datatype.
NOTE: ECS does not use the nested datatype, which is an array of objects.
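To make the difference between the two notations concrete, the following sketch expands dotted keys into the nested objects they represent, while keys written with underscores stay flat. The function name expand_dots and the sample keys are chosen for this example only.

```python
def expand_dots(flat):
    """Turn {"user.name": ...} into {"user": {"name": ...}}."""
    nested = {}
    for key, value in flat.items():
        node = nested
        *parents, leaf = key.split(".")
        for part in parents:
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                raise ValueError(f"{part!r} already holds a non-object value")
            node = child
        if isinstance(node.get(leaf), dict):
            raise ValueError(f"{leaf!r} already holds an object")
        node[leaf] = value
    return nested


# Dot notation: "user" becomes an object holding two keys.
print(expand_dots({"user.name": "nicolas ruflin", "user.id": "42"}))
# Underline notation: two unrelated top-level keys.
print(expand_dots({"user_name": "nicolas ruflin", "user_id": "42"}))
```

The ValueError branches correspond to the case where the same key would need to be both an object and a plain value.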
Advantages of dot notation
The advantage of the dot notation is that on the Elasticsearch side each prefix is an object. Each object can have parameters that control how the fields inside it are treated, for example whether they should be indexed or whether mappings should be extended. In the context of ECS this makes it possible, for example, to disable dynamic property creation for certain prefixes.
On the ingest side of Elasticsearch it makes it simpler, for example, to drop a complete object with the remove processor instead of selecting each key inside it. It does not require prior knowledge of which keys will end up in the object.
On the event producing side, like in Beats, it simplifies the creation of events because in code each object can be treated as an object (or a struct in Golang, for example), which makes constructing and modifying each part of the final event easier.
Disadvantage of dot notation
In Elasticsearch each key can only have one type. So if user is an object, the same index cannot also contain user as type keyword, as in {"user": "nicolas ruflin"}. This can be an issue in certain datasets.
For the ECS data itself this is not an issue as all fields are predefined.
What if I already use the underline notation?
It's not a problem to mix the underline notation with the ECS dot notation. They can coexist in the same document as long as there are no conflicts.
What if I have fields that conflict with ECS?
Assuming you already have a field user but ECS uses user as an object, you can use the rename processor at ingest time to rename your field to the matching ECS field, or to user.value if your field does not match ECS.
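The effect of such a rename on a single document can be illustrated in plain Python. This is not the Elasticsearch rename processor itself, only a sketch of the resulting document shape; the function name move_conflicting_field and its defaults are hypothetical.

```python
def move_conflicting_field(document, field="user", target="value"):
    """Move a scalar `field` to `field.target` so `field` becomes an object."""
    result = dict(document)  # shallow copy, the input stays untouched
    if field in result and not isinstance(result[field], dict):
        result[field] = {target: result[field]}
    return result


print(move_conflicting_field({"user": "nicolas ruflin"}))
# Documents where user is already an object pass through unchanged.
print(move_conflicting_field({"user": {"name": "nicolas ruflin"}}))
```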