Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] 0017 Remove log.original stage 1 #1314

Merged
merged 6 commits into from
Apr 7, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
45 changes: 19 additions & 26 deletions rfcs/text/0017-remove-log-original.md
Original file line number Diff line number Diff line change
@@ -1,31 +1,28 @@
# 0017: Remove log.original

- Stage: **0 (strawperson)** <!-- Update to reflect target stage. See https://elastic.github.io/ecs/stages.html -->
- Stage: **1 (draft)** <!-- Update to reflect target stage. See https://elastic.github.io/ecs/stages.html -->
- Date: **2021-03-11** <!-- The ECS team sets this date at merge time. This is the date of the latest stage advancement. -->

This RFC supersedes issue [#841](https://github.com/elastic/ecs/issues/841) which implies breaking changes therefore the RFC Process is indicated.

The request is to consolidate `log.original` and `event.original` by removing `log.original`, since these are almost equivalent in nature. (One) justification for preserving `event.original` is that not all events are logs. Once `log.original` is removed, `event.original` will be the sole field intended to capture the original untouched event.

The removal of `log.original` will be considered a breaking change since the field is being removed from the schema. Possible migration/mitigations for users impacted may include:

- requirement to alias `log.original` for current users of this field
- TBD if there exist current users of fields with distinct content/meaning in a common index mapping

## Fields

| Field Set | Field(s) |
| --------- | -------- |
| [`log`](0000/log.yml) | `log.original` |
| [`event`](0000/event.yml) | `event.original` |
| [`log`](0017/log.yml) | `log.original` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Including field definition files when adding or changing fields has proven useful in a couple of ways:

  • The fields can be used either with the --include flag to test out the field changes
  • Sometimes, the proposed field definitions get a little long, especially for brand new fieldsets. Separating the field definitions from the markdown keeps everything a bit tidier.

The way that custom files are merged with the current ECS fields, even if you remove a current ECS field it'll still be present in all the outputs. So it's not a problem to include these field definitions here, but unfortunately, we can't use them with the --include flag and get the expected result.

I do think we could consider having a consistent way to mark a field as deprecated in the ECS docs until we're ready to entirely remove the field. Looking at the Elastic docs repo, there is an asciidoc admonition available for deprecated that we might be able to leverage somehow: https://github.com/elastic/docs#old-section.

We have something similar for the beta attribute: https://github.com/elastic/ecs/blob/master/scripts/templates/field_details.j2#L13-L18

| [`event`](0017/event.yml) | `event.original` |

<!--
Stage 1: Describe at a high level how this change affects fields. Include new or updated yml field definitions for all of the essential fields in this draft. While not exhaustive, the fields documented here should be comprehensive enough to deeply evaluate the technical considerations of this change. The goal here is to validate the technical details for all essential fields and to provide a basis for adding experimental field definitions to the schema. Use GitHub code blocks with yml syntax formatting.
-->
- The internal description of the field `log.original` in [`log`](0017/log.yml) should be amended by addition of a notice of deprecation and subsequently removal if/when Deprecation progresses to Removal

<!--
Memo: see https://github.com/elastic/ecs/pull/1298#discussion_r591870463
-->
- The internal description of the field `event.original` in [`event`](0017/event.yml) should be updated to reflect the revised scope
ebeahan marked this conversation as resolved.
Show resolved Hide resolved

- The [`Beats default fields inclusion list`](../../scripts/generators/beats_default_fields_allowlist.yml) list should be updated by removing `log.original` if/when Deprecation progresses to Removal

- The extended description of `log.original` in the [`Log Fields documentation`](../../docs/field-details.asciidoc#field-log-original) should be amended by addition of a notice of deprecation and subsequently removal if/when Deprecation progresses to Removal

- The extended description of `event.original` in the [`Event Fields documentation`](../../docs/field-details.asciidoc#field-event-original) should be amended to clarify the absorption of `log.original`

<!--
Stage 2: Add or update all remaining field definitions. The list should now be exhaustive. The goal here is to validate the technical details of all remaining fields and to provide a basis for releasing these field definitions as beta in the schema. Use GitHub code blocks with yml syntax formatting.
Expand Down Expand Up @@ -53,13 +50,12 @@ Stage 3: Add more real world example source documents so we have at least 2 tota

## Scope of impact

<!--
Stage 2: Identifies scope of impact of changes. Are breaking changes required? Should deprecation strategies be adopted? Will significant refactoring be involved? Break the impact down into:
* Ingestion mechanisms (e.g. beats/logstash)
* Usage mechanisms (e.g. Kibana applications, detections)
* ECS project (e.g. docs, tooling)
The goal here is to research and understand the impact of these changes on users in the community and development teams across Elastic. 2-5 sentences each.
-->
Beats modules and Agent integration packages would be required to migrate if this change is adopted as proposed.

The removal of `log.original` will be considered a breaking change since the field is being removed from the schema. Possible migration/mitigations for users impacted may include:

- requirement to alias `log.original` for current users of this field
- TBD if there exist current users of fields with distinct content/meaning in a common index mapping

## Concerns

Expand All @@ -80,6 +76,7 @@ Stage 3: Document resolutions for all existing concerns. Any new concerns should
The following are the people that consulted on the contents of this RFC.

* @djptek | author
* @ebeahan | sponsor

<!--
Who will be or has been consulted on the contents of this RFC? Identify authorship and sponsorship, and optionally identify the nature of involvement of others. Link to GitHub aliases where possible. This list will likely change or grow stage after stage.
Expand All @@ -104,8 +101,4 @@ e.g.:
<!-- An RFC should link to the PRs for each of it stage advancements. -->

* Stage 0: https://github.com/elastic/ecs/pull/1298

<!--
* Stage 1: https://github.com/elastic/ecs/pull/NNN
...
-->
* Stage 1: https://github.com/elastic/ecs/pull/1314
13 changes: 13 additions & 0 deletions rfcs/text/0017/event.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
---
- name: event
- name: original
description: >
Raw text message of entire event. Used to demonstrate log integrity
or where the full log message (before splitting it up in multiple
parts) may be required, e.g. for reindex.

This field is not indexed and doc_values are disabled. It cannot be
searched, but it can be retrieved from `_source`. If users wish to
override this and index this field, please see `Field data types`
in the `Elasticsearch Reference`.

19 changes: 19 additions & 0 deletions rfcs/text/0017/log.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
---
- name: log
- name: original
short: Deprecated Original log message with light interpretation only (encoding, newlines).
description: >
Deprecated for Removal in next major version release. This field is superceeded by
`event.original`

This is the original log message and contains the full log message
before splitting it up in multiple parts.

In contrast to the `message` field which can contain an extracted part
of the log message, this field contains the original, full log message.
It can have already some modifications applied like encoding or new
lines removed to clean up the log message.

This field is not indexed and doc_values are disabled so it can't be
queried but the value can be retrieved from `_source`.