-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: information returned by SpanStats
should be authoritative
#103957
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Comments
zachlite
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-cluster-observability
labels
May 26, 2023
zachlite
pushed a commit
to zachlite/cockroach
that referenced
this issue
Jun 6, 2023
This commit guarantees that SpanStats uses up-to-date range descriptors on all nodes. The dependency on DistSender is removed and is replaced with a dependency on ScanMetaKVs. ScanMetaKVs is used to: 1) Locate nodes that house replicas of a span to avoid cluster-wide fan-outs. 2) Find Range Descriptors that touch a span, to build a SpanStatsResponse. This commit also fixes cockroachdb#103809, where a SpanStatsResponse incorrectly returned the replica_count for a span, instead of the range_count. Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Issue: cockroachdb#103957 Release note (bug fix): SpanStats is no longer subject to stale information, and should be considered authoritative.
zachlite
pushed a commit
to zachlite/cockroach
that referenced
this issue
Jun 9, 2023
This commit guarantees that SpanStats uses up-to-date range descriptors on all nodes. The dependency on DistSender is removed and is replaced with a dependency on rangedesc.Scanner. rangedesc.Scanner is used to: 1) Locate nodes that house replicas of a span to avoid cluster-wide fan-outs. 2) Find Range Descriptors that touch a span, to build a SpanStatsResponse. This commit also fixes cockroachdb#103809, where a SpanStatsResponse incorrectly returned the replica_count for a span, instead of the range_count. Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Issue: cockroachdb#103957 Release note (bug fix): SpanStats is no longer subject to stale information, and should be considered authoritative.
zachlite
pushed a commit
to zachlite/cockroach
that referenced
this issue
Jun 15, 2023
This commit guarantees that SpanStats uses up-to-date range descriptors on all nodes. The dependency on DistSender is removed and is replaced with a dependency on rangedesc.Scanner. rangedesc.Scanner is used to: 1) Locate nodes that house replicas of a span to avoid cluster-wide fan-outs. 2) Find Range Descriptors that touch a span, to build a SpanStatsResponse. This commit also fixes cockroachdb#103809, where a SpanStatsResponse incorrectly returned the replica_count for a span, instead of the range_count. Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Issue: cockroachdb#103957 Release note (bug fix): SpanStats is no longer subject to stale information, and should be considered authoritative.
craig bot
pushed a commit
that referenced
this issue
Jun 15, 2023
103945: jobs: add a Profiler tab to the job details page r=maryliag a=adityamaru This change adds a Profiler tab to the job details page. This change also adds a row that allows collection of a cluster-wide CPU profile for 5 seconds, of all the samples corresponding to the job's execution. Fixes: #102735 Release note (ui change): job details page now has a profiler tab for more advanced observability into a job's execution. Currently, we support collecting a cluster-wide CPU profile of the job. 104423: server: make SpanStats authoritative r=zachlite a=zachlite This commit guarantees that SpanStats uses up-to-date range descriptors on all nodes. The dependency on DistSender is removed and is replaced with a dependency on rangedesc.Scanner. rangedesc.Scanner is used to: 1) Locate nodes that house replicas of a span to avoid cluster-wide fan-outs. 2) Find Range Descriptors that touch a span, to build a SpanStatsResponse. This commit also fixes #103809, where a SpanStatsResponse incorrectly returned the replica_count for a span, instead of the range_count. Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Issue: #103957 Release note (bug fix): SpanStats is no longer subject to stale information, and should be considered authoritative. 104933: roachtest: fix regressions introduced by metamorphic arm64 and fips c… r=yuzefovich,renatolabs a=srosenberg …hanges There were two regressions introduced in [1]. This PR resolves them in a quick way. Subsequent PRs will address cpu architecture validation for attaching to pre-existing clusters, as well as enable benchmarks to run on arm64. When using "local mode", don't fall back to amd64 if a test happens to be a benchmark. When attaching to a pre-existing cluster, pretend that its (cpu) architecture is the same as the one that was chosen for the given test. [1] #103710 Epic: none Fixes: #104678 Release note: None 104982: go.mod: bump Pebble to e9a280333ef7 r=jbowens a=jbowens ``` e9a28033 db: tighten missized tombstone metric count 17e0ab3c sharedcache: use a pool of workers for writes 3726f551 db: don't use internal types in ScanInternal 2fed2e71 sharedcache: refactor block math ``` Epic: none Release note: None 104986: changefeedccl: add version gate to `SHOW CHANGEFEED JOBS` r=Xiang-Gu a=jayshrivastava On 23.1 binaries, `SHOW CHANGEFEED JOBS` queries `crdb_internal.system_jobs`, which was added in 23.1. In 22.2-23.1 mixed version clusters, this query fails. This change adds a version gate such that before an upgrade to 23.1 is finalized, `SHOW CHANGEFEED JOBS` will query `system.jobs` instead. This change also adds regression testing in the `changefeed` logic test, ensuring that it runs in a mixed version environment. Release note (bug fix): `SHOW CHANGEFEED JOBS` no longer fails on 22.2, 23.1 mixed version clusters. Epic: None Co-authored-by: adityamaru <[email protected]> Co-authored-by: Zach Lite <[email protected]> Co-authored-by: Stan Rosenberg <[email protected]> Co-authored-by: Jackson Owens <[email protected]> Co-authored-by: Jayant Shrivastava <[email protected]>
blathers-crl bot
pushed a commit
that referenced
this issue
Jun 15, 2023
This commit guarantees that SpanStats uses up-to-date range descriptors on all nodes. The dependency on DistSender is removed and is replaced with a dependency on rangedesc.Scanner. rangedesc.Scanner is used to: 1) Locate nodes that house replicas of a span to avoid cluster-wide fan-outs. 2) Find Range Descriptors that touch a span, to build a SpanStatsResponse. This commit also fixes #103809, where a SpanStatsResponse incorrectly returned the replica_count for a span, instead of the range_count. Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Issue: #103957 Release note (bug fix): SpanStats is no longer subject to stale information, and should be considered authoritative.
zachlite
pushed a commit
to zachlite/cockroach
that referenced
this issue
Jun 20, 2023
This commit guarantees that SpanStats uses up-to-date range descriptors on all nodes. The dependency on DistSender is removed and is replaced with a dependency on rangedesc.Scanner. rangedesc.Scanner is used to: 1) Locate nodes that house replicas of a span to avoid cluster-wide fan-outs. 2) Find Range Descriptors that touch a span, to build a SpanStatsResponse. This commit also fixes cockroachdb#103809, where a SpanStatsResponse incorrectly returned the replica_count for a span, instead of the range_count. Epic: https://cockroachlabs.atlassian.net/browse/CRDB-24928 Issue: cockroachdb#103957 Release note (bug fix): SpanStats is no longer subject to stale information, and should be considered authoritative.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
The work done to improve
server.SpanStats
in #96223 and #101378 uses instances ofkvcoord.RangeIterator
so that each node may collect range descriptors relevant to the request span(s).The problem is that
kvcoord.RangeIterator
useskvcoord.DistSender
's caching infrastructure to find these descriptors, and the state of the cache on each node may be different at any given time. Although the caches will eventually be consistent, the risk of inconsistency is not acceptable for this endpoint. This has come to light in #103128 whereserver.SpanStats
has become a dependency ofSHOW RANGES
, which must be authoritative.Some additional points from @knz:
This issue revisits CRDB-22711 and CRDB-17463
Jira issue: CRDB-28270
Epic: CRDB-24928
The text was updated successfully, but these errors were encountered: