Change deprecation logs data stream name #68737

pugnascotia · 2021-02-09T11:41:44Z

Add .default to the end of the deprecation logs data stream name,
making it .logs-deprecation-elasticsearch.default after advice from
the ECS folks.

Also broaden the corresponding component template pattern to
.logs-deprecation-*, so that any stack project can use it.

cc @ruflin FYI
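To illustrate what broadening the pattern to `.logs-deprecation-*` means in practice, here is a minimal sketch of trailing-wildcard matching. `TrailingWildcard` is a hypothetical helper written for this example, not Elasticsearch's own matcher, and it only handles a single `*` at the end of the pattern. The Kibana stream name is likewise made up for illustration.

```java
// Hypothetical helper for illustration only; supports a single trailing '*'.
final class TrailingWildcard {
    private TrailingWildcard() {}

    // Returns true when name matches pattern, where a trailing '*' means "any suffix".
    static boolean matches(String pattern, String name) {
        if (pattern.endsWith("*")) {
            String prefix = pattern.substring(0, pattern.length() - 1);
            return name.startsWith(prefix);
        }
        return pattern.equals(name);
    }

    public static void main(String[] args) {
        String pattern = ".logs-deprecation-*";
        // The name proposed in this description.
        System.out.println(matches(pattern, ".logs-deprecation-elasticsearch.default")); // prints true
        // A made-up name for another stack component.
        System.out.println(matches(pattern, ".logs-deprecation-kibana.default"));        // prints true
        // Missing the leading dot, so the prefix does not match.
        System.out.println(matches(pattern, "logs-deprecation-elasticsearch.default"));  // prints false
    }
}
```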


elasticmachine · 2021-02-09T11:41:47Z

Pinging @elastic/es-core-infra (Team:Core/Infra)

pugnascotia · 2021-02-09T13:27:26Z

@elasticmachine update branch

ruflin · 2021-02-15T09:27:05Z

...s/transport-netty4/src/javaRestTest/java/org/elasticsearch/rest/Netty4HeadBodyIsEmptyIT.java

@@ -116,7 +116,7 @@ public void testTemplateExists() throws IOException {
                request.setOptions(expectWarnings(
                    "legacy template [template] has index patterns [*] matching patterns from existing composable templates " +
                    "[.deprecation-indexing-template,.slm-history,.triggered_watches,.watch-history-14,.watches,ilm-history,logs," +
-                    "metrics,synthetics] with patterns (.deprecation-indexing-template => [.logs-deprecation-elasticsearch]," +
+                    "metrics,synthetics] with patterns (.deprecation-indexing-template => [.logs-deprecation-elasticsearch.default]," +


Looking at the naming scheme (https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme), we are using elasticsearch.default as the namespace here, and I'm not sure whether that is on purpose. Can you point me to the discussion with the ECS folks so I can better understand the structure? Is it expected that all stack components ship data to a single data stream, or will each service have its own data stream with its own data structure?

To be honest, we've arrived where we are currently in a rather organic (haphazard) fashion. We haven't yet had an explicit discussion around whether to use a single data stream for all components, or a stream per component.

Since we're on the subject, however, I think it would make sense to have a single data stream, since we're trying to move to a place where we can give a user a complete view of the deprecation state of their Elastic Stack deployment. I don't know what benefit we'd get from multiple data streams, so long as it's possible to differentiate where each deprecation message originated, for aggregation and filtering purposes. What do you think? If we go down that road, would we end up with a data stream like, e.g., .logs-deprecation.elastic-default?

I'll ignore for now the type and namespace part and only focus on the dataset bit which above is deprecation.elastic as I think it is the one that matters here.

I could see a few scenarios for naming. In general, the logic I have in mind goes from generic on the left to more specific on the right. For example, nginx.access is the nginx access logs.

1. deprecation.*: This sounds like a general place for deprecation logs, not only the Elastic Stack.

2. elastic.deprecation.*: This sounds like deprecation logs specific to the Elastic Stack.

3. kibana.deprecation: Kibana has several logs; these are the deprecation logs for Kibana.

I don't think we are building a "general place" for deprecation logs, so I would throw out deprecation.*. elastic.deprecation would allow us to ship everything to a single data stream. Will all deprecation logs have the same format and the same retention? Or will we clean the deprecation logs for Kibana but not for Elasticsearch? I remember a discussion around this in which someone wanted to "clean" all alerts. If we do, this would indicate that we land on elastic.deprecation.kibana or option 3.

To decide between Option 2 and 3, the question for me is whether this is a temporary, one-off thing or whether deprecation logs should always be sent there, not only for a migration effort. Today we collect Kibana logs into kibana.* and Elasticsearch logs into elasticsearch.*. So adding the deprecation logs here as elasticsearch.deprecation feels very natural, but it is now a public index.

Part of the point of this work is to make upgrading the Stack an easier process for users, so I feel that we need to take a Stack-centric view wherever possible.

I'm afraid I get a bit lost with the discussion of namespaces, types, datasets, etc etc. Would you mind restating what values I'd need to put and where?

From my perspective, the namespace is just default. I think where we need to have the discussion is on dataset and I hope a few chime in with their opinions and pros / cons.

@pugnascotia What is your take on

To decide between Option 2 and 3, the question for me is whether this is a temporary, one-off thing or whether deprecation logs should always be sent there, not only for a migration effort. Today we collect Kibana logs into kibana.* and Elasticsearch logs into elasticsearch.*. So adding the deprecation logs here as elasticsearch.deprecation feels very natural, but it is now a public index.

data_stream.dataset

If I'm understanding you correctly, elasticsearch.deprecation would be a good option (and what this PR does). So each Stack component would write a document where the data_stream.dataset would be <component>.deprecation?

That said, I like the hierarchical structure of elastic.deprecation.*. Would that option translate to elastic.deprecation.elasticsearch in the documents written by ES, or is that the literal dataset value?

There is a small but important difference from what this PR does: .logs vs logs. I'm not sure yet which path we should follow.

Today the Elasticsearch Filebeat module already ships deprecation logs: https://github.com/elastic/beats/tree/master/filebeat/module/elasticsearch/deprecation This will go to logs-elasticsearch.deprecation-default as soon as it is part of Elastic Agent.

The advantage of making it private in our case is that, during a migration, a user may want to wipe all the deprecation logs from the UI without necessarily meaning to wipe what is shipped by Filebeat. What happens if a user enables both? Will the data be duplicated? Or is this actually different data?

@pugnascotia The data_stream.dataset value in the document must always match the {dataset} part of the data stream, so I think the answer is yes to your second question.
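A minimal sketch of that constraint, for illustration: extract the `{dataset}` part from a `{type}-{dataset}-{namespace}` name and compare it with the document's `data_stream.dataset`. `DatasetConsistency` is a hypothetical class, not Elasticsearch code. It assumes that neither the dataset nor the namespace contains a `-`, and it models the document as a flat map with dotted keys purely to keep the example short.

```java
import java.util.Map;

// Hypothetical check for illustration only; not part of Elasticsearch.
final class DatasetConsistency {
    private DatasetConsistency() {}

    // Extracts {dataset} from {type}-{dataset}-{namespace}.
    // Assumption: dataset and namespace never contain '-', so the first and
    // last '-' in the name delimit the dataset.
    static String datasetOf(String dataStreamName) {
        int first = dataStreamName.indexOf('-');
        int last = dataStreamName.lastIndexOf('-');
        if (first < 0 || first == last) {
            throw new IllegalArgumentException(
                "not of the form {type}-{dataset}-{namespace}: " + dataStreamName);
        }
        return dataStreamName.substring(first + 1, last);
    }

    // True when the document's data_stream.dataset equals the {dataset} part of the stream name.
    static boolean matches(String dataStreamName, Map<String, String> document) {
        return datasetOf(dataStreamName).equals(document.get("data_stream.dataset"));
    }

    public static void main(String[] args) {
        // The Filebeat data stream mentioned above.
        String stream = "logs-elasticsearch.deprecation-default";
        System.out.println(datasetOf(stream)); // prints elasticsearch.deprecation
        System.out.println(matches(stream, Map.of("data_stream.dataset", "elasticsearch.deprecation"))); // prints true
        System.out.println(matches(stream, Map.of("data_stream.dataset", "deprecation.elasticsearch"))); // prints false
    }
}
```

Under the same assumption, parsing `.logs-deprecation-elasticsearch.default` yields the dataset `deprecation` and the namespace `elasticsearch.default`, which is the concern raised at the start of this thread.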

@ruflin I feel like there's a difference between data that a user has chosen to ingest, and data that the Stack has collected for its own purposes. I'd prefer to keep the data that we rely on separate.

This PR has been hanging around for a while now and I'd like to push it over the finish line. We need to decide:

What is the data stream name for ES logs? This PR changes it to .logs-deprecation-elasticsearch.default, but it was previously .logs-deprecation-elasticsearch

What should the value of data_stream.dataset be? It's currently elasticsearch.deprecation (via DeprecatedMessage.java), but we could change it to elastic.deprecation.elasticsearch

Nicolas and I synced, and decided on the following.

Given the index naming pattern {type}-{dataset}-{namespace}, we'll use the following values:

type = .logs

dataset = deprecation.{product} (e.g. `elasticsearch`)

namespace = default

I'll update the PR accordingly.
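As a rough sketch of how those three values combine, the following composes a name from the `{type}-{dataset}-{namespace}` pattern. `DataStreamNames` and its methods are hypothetical names for this example, not Elasticsearch APIs, and the check that dataset and namespace contain no `-` is an assumption made here so that the name stays unambiguous to split.

```java
import java.util.Objects;

// Hypothetical helper for illustration only; not part of Elasticsearch.
final class DataStreamNames {
    private DataStreamNames() {}

    // Composes {type}-{dataset}-{namespace}.
    static String compose(String type, String dataset, String namespace) {
        Objects.requireNonNull(type, "type");
        requireNoHyphen("dataset", dataset);
        requireNoHyphen("namespace", namespace);
        return type + "-" + dataset + "-" + namespace;
    }

    // Applies the values agreed above: type=.logs, dataset=deprecation.{product}, namespace=default.
    static String deprecationStream(String product) {
        return compose(".logs", "deprecation." + product, "default");
    }

    // Assumption: a '-' inside dataset or namespace would make the name ambiguous.
    private static void requireNoHyphen(String field, String value) {
        Objects.requireNonNull(value, field);
        if (value.isEmpty() || value.contains("-")) {
            throw new IllegalArgumentException(
                field + " must be non-empty and must not contain '-': " + value);
        }
    }

    public static void main(String[] args) {
        System.out.println(deprecationStream("elasticsearch")); // prints .logs-deprecation.elasticsearch-default
    }
}
```

With `product` set to `elasticsearch`, this gives `.logs-deprecation.elasticsearch-default`.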


pugnascotia · 2021-04-01T14:43:32Z

I've now updated event.dataset to match data_stream.dataset, and both have the value deprecation.elasticsearch.

It's a little unfortunate that event.dataset for other log4j appenders follows the elasticsearch.* pattern. Nicolas and I discussed elasticsearch.deprecation as an option for data_stream.dataset, but discounted it due to the similarity with the existing logs-elasticsearch.deprecation-default index that Beats can write when ingesting ES logs. In this case, event.dataset is largely an implementation detail and not of particular interest to users.

ruflin · 2021-04-06T06:26:59Z

@pugnascotia Can a user still log deprecation logs to file or will this now only go internally?

pugnascotia · 2021-04-06T08:47:30Z

Elasticsearch will continue to log deprecation warnings to file (unless a user has changed the log4j configuration).

ruflin · 2021-04-06T09:00:41Z

Great, so the existing deprecation logging will not change, neither will the content.

pugnascotia · 2021-04-06T09:11:54Z

That's correct 👍


pugnascotia added :Core/Infra/Logging Log management and logging utilities >refactoring v8.0.0 v7.12.0 labels Feb 9, 2021

pugnascotia requested review from ruflin and rjernst February 9, 2021 11:41

elasticmachine added the Team:Core/Infra Meta label for core/infra team label Feb 9, 2021

elasticmachine and others added 2 commits February 9, 2021 08:27

Merge branch 'master' into deprecation-indexing-updates

455639f

Formatting

d5c031f

ruflin reviewed Feb 15, 2021


williamrandolph added v7.13.0 and removed v7.12.0 labels Feb 18, 2021

pugnascotia added 2 commits March 31, 2021 13:16

Merge remote-tracking branch 'upstream/master' into deprecation-index…

1839b01

…ing-updates

Settle on .logs-deprecation.elasticsearch-default

edb632a

ruflin mentioned this pull request Apr 1, 2021

type==logs also for hidden data streams elastic/package-spec#156

Open

Sync event.dataset with data_stream.dataset

1c28ec2

pugnascotia requested a review from ruflin April 1, 2021 14:43

pugnascotia merged commit fb1921c into elastic:master Apr 13, 2021

pugnascotia deleted the deprecation-indexing-updates branch April 13, 2021 14:11

pugnascotia added a commit to pugnascotia/elasticsearch that referenced this pull request Apr 13, 2021

Change deprecation logs data stream name (elastic#68737)

65c5ad8

More fixes to deprecation log indexing so that the data stream name and document contents are more ECS-compatible.

pugnascotia mentioned this pull request Apr 13, 2021

Change deprecation logs data stream name #71642

Merged

pugnascotia added a commit that referenced this pull request Apr 14, 2021

Change deprecation logs data stream name (#71642)

c1385fb

Backport of #68737. More fixes to deprecation log indexing so that the data stream name and document contents are more ECS-compatible.

axw mentioned this pull request Apr 30, 2021

Deprecation logs for APM Server (+Agents?) elastic/apm-server#4284

Open

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

ruflin mentioned this pull request Jan 28, 2022

[Filebeat] Update handling of elasticsearch server logs elastic/beats#30018

Merged

6 tasks

This was referenced Jun 13, 2022

Inconsistent value of event.dataset in ES deprecation logs #83251

Closed

Deprecation dataset value changed to elasticsearch.deprecation #83254

Merged


ruflin Feb 15, 2021

pugnascotia Feb 16, 2021

ruflin Feb 19, 2021

pugnascotia Mar 3, 2021

ruflin Mar 4, 2021

ruflin Mar 8, 2021

pugnascotia Mar 8, 2021

ruflin Mar 8, 2021

pugnascotia Mar 29, 2021

pugnascotia Mar 31, 2021
