release-21.2: tenant: add endpoint with instant metrics #71052
Merged
Backport 1/1 commits from #70750.
/cc @cockroachdb/release
Previously the tenant process served its metrics only on
`/_status/vars`. This endpoint exposes all the available metrics, which
are updated every 10 sec. Many of the metrics show a rate that is
calculated over the 10 sec interval. Some of the metrics are used by the
cockroach operator to monitor the CPU workload of the tenant process and
to drive automatic scaling. The 10 sec interval, however, is too long and
causes slow scale-up: reporting high CPU utilization can take up to
20 sec, because two updates are needed to compute a delta. To resolve
this, the PR adds a new endpoint, `/_status/load`, that provides an
instant reading of a very small subset of the normal metrics (user and
system CPU time for now). Because these readings are instant, the client
can retrieve consecutive snapshots in quick succession and compute a
precise CPU utilization. It also allows the client to control the
interval between the two pulls (as opposed to having it hard coded to
10 sec).
Release note: None
Release justification: Necessary change for addressing the slow pod scale-up
for the upcoming Serverless MVP release. This functionality
is only used in multi-tenant scenarios and should have no impact on
our dedicated customers.