Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable deprecation log indexing by default #76292

Closed
alisonelizabeth opened this issue Aug 10, 2021 · 8 comments · Fixed by #78991
Closed

Enable deprecation log indexing by default #76292

alisonelizabeth opened this issue Aug 10, 2021 · 8 comments · Fixed by #78991
Labels
:Core/Infra/Logging Log management and logging utilities >enhancement Team:Core/Infra Meta label for core/infra team

Comments

@alisonelizabeth
Copy link
Contributor

alisonelizabeth commented Aug 10, 2021

In #61484, Elasticsearch introduced a new setting called cluster.deprecation_indexing.enabled to write deprecation logs to a dedicated data stream.

This is currently disabled by default. I'd like to propose enabling it by default.

This setting allows users to view and analyze their deprecation logs through the Logs UI or Discover via the Upgrade Assistant in Kibana. If this setting is enabled by default, it will likely be easier for users to spot deprecation issues that need to be addressed immediately rather than having to first enable it, wait a certain period of time to trigger any potential deprecations, and then analyze.

/cc @jakelandis @elasticdog @beiske

@alisonelizabeth alisonelizabeth added >enhancement needs:triage Requires assignment of a team area label labels Aug 10, 2021
@jakelandis jakelandis added the :Core/Infra/Logging Log management and logging utilities label Aug 10, 2021
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Aug 10, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@rdnm
Copy link

rdnm commented Aug 16, 2021

Thanks @alisonelizabeth - we discussed this one in the core/infra sync and we had a a few concerns with enabling this by default:

  • If we enable this by default, we increase the system resources required by default as this index will continue to grow. I believe there is an ILM policy in place where we could set it to more aggressively delete older data, but then we lose the benefit of this being on as we may lose deprecation logs we want (especially if it's a rare workflow the user follows e.g. once a month). We had specific concerns for the smaller single node clusters which are deliberately sized small.
  • Enabling it by default at this stage in the 7.x series makes sense, but would having it enabled by default in 8.1 still make sense? I think this will only make sense to be on towards the end of a major release cycle.

Would it be better for this to be enabled/disabled on demand? Could we make this easier and allow the user to turn on deprecation log indexing directly from the Upgrade Assistant so at the point they are starting to consider the upgrade they can start indexing the deprecation logs?

@pgomulka
Copy link
Contributor

Deprecation logs are throttled by x-opaque-id and key only.
For instance:

   deprecationLogger.deprecate(DeprecationCategory.SETTINGS, "ingest_useragent_ecs_settings",
                    "setting [ecs] is deprecated as ECS format is the default and only option");

when the same user (or when no x-opaque-id is set) is executing the same code multiple times it will only be emitted once. It does not depend on the time when this is executed (could be executed next day and will be throttled as well)

However, the throttling is based on RateLimitingFilter which is using LRU cache of size 128.
When ES is restarted, or the cache size is breached, then the deprecation logs could be duplicated.

return lruKeyCache.add(xOpaqueId + key) ? Result.ACCEPT : Result.DENY;

The same throttling logic is applied to both JSON logs and indexed logs.
We could workaround the problem of too big index by ILM based on max_size? We could then just delete the oldest data the same as we do for JSON logs.

appender.deprecation_rolling.policies.size.type = SizeBasedTriggeringPolicy

We could also fix the problem of empty cache after restart by prepopulating it with keys from the index.

Ideally I think we should aim to have the size of the index limited by number of deprecations we have in our code base multiplied by numer of x-opaque-id used (however here I guess some throttling should be still needed as number of users is hard to predict)

I also like the idea of having deprecation logs indexed by default, as this would allow us to collect all the deprecations triggered by a user and making him aware of all the changes that he has to fix before upgrading. Enabling this on demand, makes it possible that we would miss some of them.
I also understand the argument that in some small clusters where space is critical this might be a problem. But shouldn't those cluster also have to tweak the log rotation policies which is I guess as cumbersome as disabling the deprecation logs indexing

cc: @pugnascotia

@alisonelizabeth
Copy link
Contributor Author

Thanks @rdnm for looking into this!

Enabling it by default at this stage in the 7.x series makes sense, but would having it enabled by default in 8.1 still make sense? I think this will only make sense to be on towards the end of a major release cycle.

That is a valid question, and not something I had considered. As the upgrade flow currently exists, I would tend to agree that this would likely be more relevant now (7.15/7.16) than 8.1. I'm wondering though if the make-it-minor effort changes this at all?

Would it be better for this to be enabled/disabled on demand? Could we make this easier and allow the user to turn on deprecation log indexing directly from the Upgrade Assistant so at the point they are starting to consider the upgrade they can start indexing the deprecation logs?

That is currently how it is functioning - users can enable the setting via the Upgrade Assistant. The issue, as @pgomulka pointed out, is that it's a little difficult to guide users on how long they should wait after turning the setting on to feel comfortable that any deprecated features they are using have been discovered.

I also like the idea of having deprecation logs indexed by default, as this would allow us to collect all the deprecations triggered by a user and making him aware of all the changes that he has to fix before upgrading. Enabling this on demand, makes it possible that we would miss some of them.

++

@elasticdog
Copy link
Contributor

I believe that with the Make It Minor effort, having these logs available by default will become even more important as we won't have the normal end-of-major cycle and will be expecting that users will be able to upgrade more often. I don't have much insight into the additional resources this would require (especially for smaller clusters), but like the thinking surrounding how we can minimize the noise, still keep these messages useful, and allow them to be enabled out of the box.

@DJRickyB DJRickyB removed the needs:triage Requires assignment of a team area label label Aug 18, 2021
@pgomulka
Copy link
Contributor

we discussed this on core infra syncs on 11th and 18th August
We concluded that we can enable this by default, but we would need to coordinate with cloud team (and document this) so that for small deployments we would not use x-opaque-id as part of a key in a cache.
Excluding x-opaque-id from a key makes sure that we won't have too many duplicated deprecations.
For other deployments, we would still use x-opaque-id.

If necessary in the future, we can increase the size of a cache (currently 128) prepopulate it with the content the deprecation index.

@pgomulka
Copy link
Contributor

@rdnm do you remember if we agreed to do this in 7.x or in 8.0 only?

@rdnm
Copy link

rdnm commented Aug 24, 2021

I believe the request would be to do this in 7.x too, so we have this information ahead of the 8.0 upgrade. Would that classify as a breaking change?

pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Oct 11, 2021
Introduces a setting
cluster.deprecation_indexing.x_opaque_id_used.enabled to disable use of
x-opaque-id in RateLimitingFilter. This will be used for deprecation
logs indexing and will not affect logging to files (it uses different
instance of RateLimitingFilter with this flag enabled by default)

Changes the indices backing a deprecation log data stream to be hidden.

Refactors DeprecationHttpIT to be more reliable

relates elastic#76292
closes elastic#77936
pgomulka added a commit that referenced this issue Oct 12, 2021
Introduces a setting cluster.deprecation_indexing.x_opaque_id_used.enabled to disable use of
x-opaque-id in RateLimitingFilter. This will be used for deprecation
logs indexing and will not affect logging to files (it uses different
instance of RateLimitingFilter with this flag enabled by default)

Changes the indices backing a deprecation log data stream to be hidden.

Refactors DeprecationHttpIT to be more reliable

relates #76292
closes #77936
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Oct 12, 2021
)

Introduces a setting cluster.deprecation_indexing.x_opaque_id_used.enabled to disable use of
x-opaque-id in RateLimitingFilter. This will be used for deprecation
logs indexing and will not affect logging to files (it uses different
instance of RateLimitingFilter with this flag enabled by default)

Changes the indices backing a deprecation log data stream to be hidden.

Refactors DeprecationHttpIT to be more reliable

relates elastic#76292
closes elastic#77936
pgomulka added a commit that referenced this issue Oct 12, 2021
…78982)

Introduces a setting cluster.deprecation_indexing.x_opaque_id_used.enabled to disable use of
x-opaque-id in RateLimitingFilter. This will be used for deprecation
logs indexing and will not affect logging to files (it uses different
instance of RateLimitingFilter with this flag enabled by default)

Changes the indices backing a deprecation log data stream to be hidden.

Refactors DeprecationHttpIT to be more reliable

relates #76292
closes #77936

backport #78911
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Oct 12, 2021
Changing the default for deprecation log indexing to be true.
This commit also overrides this defualt to tests where a deprecation
data stream would intefere - because it uses index template, it would
not be possible to delete with _index_template/*.
The overrides should be removed when elastic#78850 is done.

closes elastic#76292
pgomulka added a commit that referenced this issue Oct 13, 2021
Changing the default for deprecation log indexing to be true.
This commit also overrides this default to tests where a deprecation
data stream would interfere - because it uses index template, it would
not be possible to delete with _index_template/*.
The overrides should be removed when #78850 is done.

closes #76292
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Oct 13, 2021
Changing the default for deprecation log indexing to be true.
This commit also overrides this default to tests where a deprecation
data stream would interfere - because it uses index template, it would
not be possible to delete with _index_template/*.
The overrides should be removed when elastic#78850 is done.

closes elastic#76292
pgomulka added a commit that referenced this issue Oct 18, 2021
…9035)

Changing the default for deprecation log indexing to be true.
This commit also overrides this default to tests where a deprecation
data stream would interfere - because it uses index template, it would
not be possible to delete with _index_template/*.
The overrides should be removed when #78850 is done.

closes #76292
backport #78991
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Logging Log management and logging utilities >enhancement Team:Core/Infra Meta label for core/infra team
Projects
None yet
7 participants