Surfacing deprecations with rich context from ES warning header #120044
Conversation
Pinging @elastic/kibana-core (Team:Core)
@@ -65,7 +65,13 @@ export const config = {
       defaultValue: new Map<string, AppenderConfigType>(),
     }),
     loggers: schema.arrayOf(loggerSchema, {
-      defaultValue: [],
+      defaultValue: [
Realised that configuring any loggers would remove this default and thereby switch the elasticsearch.deprecation logger level to info...
So I need to have logic that's a bit cleverer.
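A minimal sketch of one way to preserve the default, assuming a hypothetical mergeWithDefaultLoggers helper rather than the logic that actually landed: keep the elasticsearch.deprecation entry unless the user explicitly configures that context.

```ts
// Illustrative local type; the real config schema lives in core's logging config.
interface LoggerConfig {
  name: string;
  level: string;
  appenders: string[];
}

const defaultLoggers: LoggerConfig[] = [
  // Keep the ES deprecation context disabled unless explicitly enabled.
  { name: 'elasticsearch.deprecation', level: 'off', appenders: [] },
];

// Hypothetical helper: user-supplied loggers win, defaults fill the gaps.
function mergeWithDefaultLoggers(userLoggers: LoggerConfig[]): LoggerConfig[] {
  const overridden = new Set(userLoggers.map((l) => l.name));
  return [...defaultLoggers.filter((l) => !overridden.has(l.name)), ...userLoggers];
}
```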
Unless we decide to enable this logging context by default, might be useful information to see how many logs get triggered on Cloud
Yeah, what you are running into here is somewhat related to this enhancement issue we have for default appender configs: #92082
What would be useful here is to be able to do something similar for loggers.
Unless we decide to enable this logging context by default, might be useful information to see how many logs get triggered on Cloud
FWIW I would be in support of this, but if we went down this path I would vote that we clean up the structure of the log messages just a bit (see my other comment)
You chose the hard path by not just using the debug log level 😅
Even if I think this approach of using more normal log levels such as warning and error, and trying to disable elasticsearch.deprecation by default, is nice, what is the main reason not to use debug instead?
- It requires a lot fewer changes
- It's more consistent with the other 'disabled by default' loggers we currently have, such as elasticsearch.query or http.server.response
      queryLogger.debug(queryMsg, meta);

      if (event.warnings && event.warnings.length > 0) {
Are we sure that the warnings header contains only deprecations? Maybe the deprecation warning should have a structure to allow Kibana to distinguish deprecations from other warnings?
It's possible that an intermediate proxy injects warning headers here, but it won't be the end of the world if we log them. All warnings include Elasticsearch-${semver}, so we could use that to filter out noise, but I think that's better done in the Elasticsearch-js client: https://github.com/elastic/elastic-transport-js/blob/main/src/Transport.ts#L333
ok, I added a filter to Kibana, I will create an upstream issue so that we can remove this from Kibana long term
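A minimal sketch of such a filter, assuming the Elasticsearch-${semver} warn-agent mentioned above is the distinguishing marker; the helper name and regex are illustrative, not necessarily the code merged here:

```ts
// Warning header values look roughly like:
// '299 Elasticsearch-8.1.0-SNAPSHOT-abc123 "this request accesses system indices ..."'
// Keep only warnings whose warn-agent identifies Elasticsearch itself, dropping anything
// injected by intermediate proxies.
export function getElasticsearchWarnings(warnings: string[] | null): string[] {
  if (warnings == null) {
    return [];
  }
  return warnings.filter((warning) => /\d{3} Elasticsearch-/.test(warning));
}
```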
(event.meta.request.params.headers != null &&
  (event.meta.request.params.headers[
    'x-elastic-product-origin'
  ] as unknown as string)) === 'kibana'
In what cases does it not contain kibana? Why is it always user otherwise? Can it have a deprecation originating from a logstash request, for example?
The ES team decided that each product should have its own origin header, like {'x-elastic-product-origin': 'kibana'}. ES does validation on this header for known Elastic products. They will include this header in the deprecation document created in the deprecations index. Upgrade Assistant will then filter out any deprecation documents with an 'x-elastic-product-origin' header and not show these to users, because if it's our products causing them, users can't fix it.
When a request comes from a user (e.g. when they use the console app) then we don't include this header, so that users can see that they are using deprecated APIs. We haven't gotten that far yet... but once #120201 is done, we can ask teams to go through all their APIs; if there's an API that contains 100% user-controlled content, then they should mark that API call as coming from the user by not sending the x-elastic-product-origin header.
So from Kibana's perspective, if we get a deprecation warning, it's either because us developers wrote a deprecated query or because users wrote it.
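A condensed sketch of the origin check being discussed, mirroring the header access in the diff above; the local event type and helper name are illustrative:

```ts
// Minimal shape of the client diagnostic event used here (illustrative, not the real type).
interface ResponseEvent {
  meta: { request: { params: { headers?: Record<string, unknown> } } };
}

// A request is treated as Kibana-originated when it carries the
// x-elastic-product-origin: kibana header; everything else is attributed to a user.
function getRequestOrigin(event: ResponseEvent): 'kibana' | 'user' {
  const headers = event.meta.request.params.headers;
  return headers != null && headers['x-elastic-product-origin'] === 'kibana'
    ? 'kibana'
    : 'user';
}
```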
const deprecationMsg = `Elasticsearch deprecation: ${event.warnings}\nOrigin:${requestOrigin}\nStack trace:\n${stackTrace}\nQuery:\n${queryMsg}`;
if (requestOrigin === 'kibana') {
  deprecationLogger.warn(deprecationMsg);
Can an end-user fix the problem? If not, maybe it shouldn't have the warning level? End users are worried about warnings and errors for a good reason - they want their deployment to be healthy. If they can't fix a problem, they will contact us. #35004 is a reminder that we should pick a log level carefully.
That's a good point, we want these logs to surface by default on Cloud so we can monitor and fix any deprecations we caused... but info would be fine.
but info would be fine
Maybe we can use debug and configure Cloud to always produce deprecation logs?
logging.loggers:
  - name: elasticsearch.deprecation
    level: all
but info would be fine.
info is logged by default in the on-prem version, so it would confuse end users.
Maybe we can use debug and configure Cloud to always produce deprecation logs?
++ Hadn't considered this, but if the main reason for configuring a higher log level was to ensure they are surfaced on Cloud, then I agree this would simplify things greatly.
We wouldn't want to surface deprecations caused by users on Cloud by default, since this could add a lot of volume. So we would need two logger contexts, elasticsearch.deprecation.kibana and elasticsearch.deprecation.user, so that Cloud could enable debug logs only for elasticsearch.deprecation.kibana.
In terms of priorities, it's critical that we backport this to 7.16 and give teams a heads-up if they need to action anything. So I'll create a follow-up issue that can be done later to lower the level and change the configuration on Cloud.
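A sketch of the origin-dependent levels discussed in this thread; the levels moved around during the PR (warn, then info, for Kibana-originated deprecations, and debug for user-originated ones), so the choice below is illustrative rather than the final merged behaviour:

```ts
import type { Logger } from '@kbn/logging';

function logEsDeprecation(
  deprecationLogger: Logger,
  requestOrigin: 'kibana' | 'user',
  deprecationMsg: string
) {
  if (requestOrigin === 'kibana') {
    // Deprecations we caused ourselves should surface by default so teams can fix them.
    deprecationLogger.info(deprecationMsg);
  } else {
    // User-triggered deprecations stay on debug to avoid noisy logs unless explicitly enabled.
    deprecationLogger.debug(deprecationMsg);
  }
}
```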
  ? 'kibana'
  : 'user';

// Strip the first 5 stack trace lines as these are irrelevant to finding the call site
Does it really show the place where the request was initiated? I suspect https://github.com/elastic/kibana/pull/120044/files#diff-b5f83498b6f6c9bb148d4ca6705a2b2a3757d0f8c177ae6a56e8236817eb4702R88 doesn't keep the async stack trace.
On main/8.0 yes:
Stack trace:
at Xpack.info (/Users/rudolf/dev/kibana/node_modules/@elastic/elasticsearch/src/api/api/xpack.ts:66:12)
at LicensingPlugin.fetchLicense (/Users/rudolf/dev/kibana/x-pack/plugins/licensing/server/plugin.ts:180:34)
But on 7.16, where it's more critical to trace deprecations, it doesn't :(
IIRC it does not always work on 8.0 either. But let's keep it for now as is. What are you going to do for 7.16?
I'll remove stack traces for 7.16 since they're never useful there, but having the full query is usually enough to track down where the request was made.
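For reference, a minimal sketch of the capture-and-trim approach the "strip the first 5 lines" comment refers to; the helper name and the exact number of frames to drop are illustrative:

```ts
// Capture a stack trace at the point the query is issued and drop the leading frames,
// which belong to the client/logging plumbing rather than the interesting call site.
function getTrimmedStackTrace(framesToSkip = 5): string {
  const stack = new Error().stack ?? '';
  return stack
    .split('\n')
    .slice(framesToSkip + 1) // +1 to also drop the 'Error' line itself
    .join('\n');
}
```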
I'd lower the log level for the deprecation logs, LGTM otherwise
'--logging.appenders.deprecation.type=console',
'--logging.appenders.deprecation.layout.type=json',
'--logging.loggers[0].name=elasticsearch.deprecation',
'--logging.loggers[0].appenders[0]=deprecation',
Deprecation messages contain newlines, and tooling_log_text_writer prints only the first line, ignoring the subsequent lines with the query and stack trace. Having this in JSON isn't ideal but it's the best I can think of 🤷
💚 Build Succeeded
Surfacing deprecations with rich context from ES warning header (elastic#120044)
* First stab at surfacing deprecations from warning header
* Log deprecations with error level but disable logger context by default
* Don't filter out error logs from ProcRunner
* Another try at not having messages ignored on CI
* Log deprecation logs with warn not info
* Tests
* Let write() do it's writing
* Commit pre-built @kbn/pm package
* Second try to commit pre-built @kbn/pm package
* Enable deprecation logger for jest_integration even though logs aren't interleaved
* Apply suggestions from code review (Co-authored-by: Luke Elmers <[email protected]>)
* deprecations logger: warn for kibana and debug for users
* Refactor split query and deprecation logger out of configure_client
* Unit test for tooling_log_text_writer
* Fix TS
* Use event.meta.request.params.headers to include Client constructor headers
* Fix tests
* Ignore deprecation warnings not from Elasticsearch
* Log on info level
* Log in JSON so that entire deprecation message is on one line
* commit built kbn-pm package
Co-authored-by: Luke Elmers <[email protected]>
Co-authored-by: Kibana Machine <[email protected]>
💔 Backport failed
Successful backport PRs will be merged automatically after passing CI.
Summary
Related #120043
To provide maximum context to developers about the origin of a deprecation warning, this instruments the core elasticsearch client to log deprecation warnings together with a stack trace and the query that caused the deprecation.
This will create a log entry containing the deprecation warning, the request origin, a stack trace, and the query that triggered it.
Deprecations caused by user requests are logged on the debug level; to see these, use the following config:
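Presumably a logging.loggers entry along the lines of the one quoted earlier in the review thread, e.g.:

```yaml
logging.loggers:
  - name: elasticsearch.deprecation
    level: all
```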
On CI (7.16 branch) it produces deprecation warnings inline with the test results, making it easy to track down the source of the deprecation.