release-21.1: Backport changefeed observability PRs. #68106
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
pkg/settings/setting.go
@@ -20,7 +20,7 @@ import (

 // MaxSettings is the maximum number of settings that the system supports.
 // Exported for tests.
-const MaxSettings = 512
+const MaxSettings = 288
Guessing we don't want to decrease this :D
is this persisted somewhere?
Not persisted, but if we register more settings than the max a panic is thrown. My guess is a few PRs bumped it when we hit the old limit, so we should probably stick with the higher value.
Fixed. Thanks.
Removed one of the commits since the setting was increased multiple times.
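The thread above notes that the limit is not persisted, but that registering more settings than the maximum panics. A minimal sketch of that behavior follows, assuming a simple slice-backed registry; `maxSettings`, `registry`, `register`, and `tryRegister` are hypothetical names for illustration and are not the `pkg/settings` API.

```go
package main

import "fmt"

// maxSettings is a hypothetical fixed bound, kept tiny so the overflow is
// easy to trigger. It stands in for the MaxSettings constant in the diff.
const maxSettings = 3

// registry is a hypothetical stand-in for a settings registry.
type registry struct {
	names []string
}

// register adds a setting name and panics once the fixed bound is reached.
// This is why lowering the constant in a backport is risky: a branch that
// already registers more settings than the new bound would panic.
func (r *registry) register(name string) {
	if len(r.names) >= maxSettings {
		panic(fmt.Sprintf("too many settings; increase maxSettings (currently %d)", maxSettings))
	}
	r.names = append(r.names, name)
}

// tryRegister wraps register and reports failure instead of panicking,
// which makes the overflow easy to observe in a test.
func tryRegister(r *registry, name string) (ok bool) {
	defer func() {
		if recover() != nil {
			ok = false
		}
	}()
	r.register(name)
	return true
}
```

With a bound of 3, the first three calls to `tryRegister` succeed and the fourth reports failure, which is the situation a lowered constant would create on a branch that already registers more settings.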
8cc30ed to 9bf7575
This adds 3 new histogram metrics to give some insight into whether we are seeing occasionally slow sinks or checkpoints during various nightly roachtests. While this duplicates some data from the existing flush_nanos, flushes, emitted_messages, and emit_nanos metrics, I think the histograms will still be useful for seeing more easily whether a changefeed experienced a small number of slow flushes or checkpoints. The naming of these metrics is a bit unfortunate, but since flush_nanos and emit_nanos already existed and I didn't want to remove them, I've included 'hist' in the names of the new metrics.

Release note: None
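To illustrate why a histogram helps where running totals do not, here is a minimal sketch of a fixed-bucket latency histogram. It is not the metric implementation used in this PR; `latencyHistogram`, its fields, and the bucket bounds are hypothetical and chosen only for illustration.

```go
package main

import "time"

// latencyHistogram is a hypothetical fixed-bucket histogram. It also keeps
// the running total and count that a pair of counters such as flush_nanos
// and flushes would provide.
type latencyHistogram struct {
	bounds []time.Duration // ascending upper bounds, inclusive
	counts []int64         // len(bounds)+1; the last slot is the overflow bucket
	total  time.Duration   // sum of all recorded durations
	n      int64           // number of observations
}

// newLatencyHistogram builds a histogram with the given ascending bounds.
func newLatencyHistogram(bounds ...time.Duration) *latencyHistogram {
	return &latencyHistogram{bounds: bounds, counts: make([]int64, len(bounds)+1)}
}

// record places d in the first bucket whose upper bound is >= d, or in the
// overflow bucket if d exceeds every bound.
func (h *latencyHistogram) record(d time.Duration) {
	i := 0
	for i < len(h.bounds) && d > h.bounds[i] {
		i++
	}
	h.counts[i]++
	h.total += d
	h.n++
}

// mean is all that the total/count pair can tell us.
func (h *latencyHistogram) mean() time.Duration {
	if h.n == 0 {
		return 0
	}
	return h.total / time.Duration(h.n)
}
```

With bounds of 10ms, 100ms, and 1s, recording 99 flushes of 5ms and one of 5s gives a mean of about 55ms, which looks unremarkable, while the overflow bucket holds exactly one observation and makes the single slow flush visible.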
This adds a new cluster setting, changefeed.slow_span_log_threshold, that allows us to control the threshold for logging slow spans. This is useful for cases where the auto-calculated threshold is much higher than we would like.

Release note (sql change): A new cluster setting `changefeed.slow_span_log_threshold` allows setting a cluster-wide default for slow span logging.
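One way such an override could work is sketched below. This assumes that a zero value means "not set" and falls back to the auto-calculated threshold; that convention, and the names `slowSpanThreshold` and `isSlowSpan`, are assumptions made for illustration rather than the code in this PR.

```go
package main

import "time"

// slowSpanThreshold returns the lag beyond which a span is logged as slow.
// Assumption: a zero or negative override means the setting is unset, in
// which case the auto-calculated threshold applies.
func slowSpanThreshold(override, autoCalculated time.Duration) time.Duration {
	if override > 0 {
		return override
	}
	return autoCalculated
}

// isSlowSpan reports whether a span lagging by lag should be logged.
func isSlowSpan(lag, override, autoCalculated time.Duration) bool {
	return lag > slowSpanThreshold(override, autoCalculated)
}
```

For example, with an auto-calculated threshold of 10 minutes, a span that is 2 minutes behind is not logged unless the override is set to something lower, such as 1 minute.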
I occasionally find it useful to know how many observations a given histogram is based on. The Prometheus output already returns this, but it is nice to have it in the SQL and JSON output as well.

Release note (ops change): Histogram metrics now store the total number of observations over time.
Stop relying on ExportRequestLimit to determine the number of concurrent export requests, and introduce a dedicated ScanRequestLimit setting. If the setting is specified, use it; otherwise, compute the default as 3 * (number of nodes in the cluster), which is the old behavior, but cap this number so that concurrency does not get out of hand when running in a very large cluster.

Fixes cockroachdb#67190

Release Notes: Provide better configurability of scan request concurrency. Scan requests are issued by changefeeds during the backfill.
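The selection logic described above can be sketched as follows. The commit message does not state the cap, so `maxConcurrentScans` is a hypothetical value, and treating a non-positive setting as "not specified" is likewise an assumption.

```go
package main

// maxConcurrentScans is a hypothetical cap; the actual value is not given
// in the commit message.
const maxConcurrentScans = 100

// scanConcurrency returns the number of concurrent scan requests to allow.
// Assumption: configured <= 0 means the setting was not specified.
func scanConcurrency(configured, numNodes int) int {
	if configured > 0 {
		// An explicit setting always wins.
		return configured
	}
	// Old default: 3 requests per node in the cluster.
	limit := 3 * numNodes
	// New behavior: cap the default so very large clusters stay bounded.
	if limit > maxConcurrentScans {
		limit = maxConcurrentScans
	}
	return limit
}
```

Under these assumptions, a 5-node cluster with no setting gets 15 concurrent scans, a 1000-node cluster is capped at 100, and an explicit setting of 8 is used as-is regardless of cluster size.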
Add a metric to keep track of the number of frontier updates in the changefeed. Release Notes: None
9bf7575 to a8bea98
The metrics changes have been baking for a while and are pretty low risk. The changes to the export request limit seem pretty straightforward and high value for large clusters.
Backport:
Please see individual PRs for details.
/cc @cockroachdb/release
Release Justification: A low-danger, high-impact observability change. The added metrics allow us to troubleshoot changefeed performance when running in large-scale clusters.