typedesc: copy composite type elements for hydration #119866
Labels
branch-master
Failures and bugs on the master branch.
C-test-failure
Broken test (automatically or manually discovered).
O-robot
Originated from a bot.
T-sql-queries
SQL Queries Team
cockroach-teamcity added the branch-master, C-test-failure, O-robot, release-blocker, and T-sql-queries labels on Mar 4, 2024. (release-blocker: Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.)
DrewKimball removed the release-blocker label on Mar 4, 2024.
Looks like (hopefully) one last fix for the type-hydration data races - the elements of a composite type descriptor need to be copied in
Same as #119780?
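To illustrate the kind of sharing that a shallow copy leaves behind, here is a minimal Go sketch. It is not the CockroachDB code: `Type`, `shallowCopy`, `deepCopy`, and `hydrate` are hypothetical stand-ins, and the only assumption taken from this thread is that hydration mutates a composite type's elements in place.

```go
package main

import "fmt"

// Type is a hypothetical stand-in for a SQL type. It is NOT CockroachDB's types.T.
type Type struct {
	Name     string
	Hydrated bool
	Elements []*Type // element types of a composite type
}

// shallowCopy copies only the top-level struct, so the copy's Elements
// slice still points at the same element values as the original.
func shallowCopy(t *Type) *Type {
	c := *t
	return &c
}

// deepCopy also copies every element recursively, so the result shares
// no mutable state with the original.
func deepCopy(t *Type) *Type {
	c := *t
	if t.Elements != nil {
		c.Elements = make([]*Type, len(t.Elements))
		for i, e := range t.Elements {
			c.Elements[i] = deepCopy(e)
		}
	}
	return &c
}

// hydrate mutates the type and all of its elements in place.
func hydrate(t *Type) {
	t.Hydrated = true
	for _, e := range t.Elements {
		hydrate(e)
	}
}

func newPair() *Type {
	return &Type{Name: "pair", Elements: []*Type{{Name: "a"}, {Name: "b"}}}
}

func main() {
	shared := newPair()
	hydrate(shallowCopy(shared))
	fmt.Println("shallow copy mutated original element:", shared.Elements[0].Hydrated) // true

	isolated := newPair()
	hydrate(deepCopy(isolated))
	fmt.Println("deep copy mutated original element:", isolated.Elements[0].Hydrated) // false
}
```

With the shallow copy, two goroutines hydrating their own "copies" would still write to the same element values, which is the shape of race the Go race detector reports. Copying the elements removes that shared state.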
DrewKimball changed the title from "pkg/sql/logictest/tests/fakedist-disk/fakedist-disk_test: TestLogic_composite_types failed" to "typedesc: copy composite type elements for hydration" on Mar 4, 2024.
Yes, I'll close that one in favor of this.
craig bot pushed a commit that referenced this issue on Mar 4, 2024:
118943: kvcoord: add DistSender circuit breakers r=nvanbenschoten a=erikgrinaker

This patch adds an initial implementation of DistSender replica circuit breakers. Their primary purpose is to prevent the DistSender getting stuck on non-functional replicas. In particular, the DistSender relies on receiving a NLHE from the replica to update its range cache and try other replicas, otherwise it will keep sending requests to the same broken replica which will continue to get stuck, giving the appearance of an unavailable range. This can happen if:

- The replica stalls, e.g. with a disk stall or mutex deadlock.
- Clients time out before the replica lease acquisition attempt times out, e.g. if the replica is partitioned away from the leader.

If a replica has returned only errors in the past few seconds, or hasn't returned any responses at all, the circuit breaker will probe the replica by sending a `LeaseInfo` request. This must either return success or a NLHE pointing to a leaseholder. Otherwise, the circuit breaker trips, and the DistSender will skip it for future requests, optionally also cancelling in-flight requests.

Currently, only replica-level circuit breakers are implemented. If a range is unavailable, the DistSender will continue to retry replicas as today. Range-level circuit breakers can be added later if needed, but are considered out of scope here.

The circuit breakers are disabled by default for now. Some follow-up work is likely needed before they can be enabled by default:

* Improve probe scalability. Currently, a goroutine is spawned per replica probe, which is likely too expensive at large scales. We should consider batching probes to nodes/stores, and using a bounded worker pool.
* Consider follower read handling, e.g. by tracking the replica's closed timestamp and allowing requests that may still be served by it even if it's partitioned away from the leaseholder.
* Improve observability, with metrics, tracing, and logging.
* Comprehensive testing and benchmarking.

This will be addressed separately.

Resolves #105168.
Resolves #104262.
Resolves #81100.
Resolves #80713.

Epic: none

Release note (general change): gateways will now detect faulty or stalled replicas and use other replicas instead, which can prevent them getting stuck in certain cases (e.g. with disk stalls). This behavior can be disabled via the cluster setting `kv.dist_sender.circuit_breaker.enabled`.

119880: typedesc: copy composite type elements in `AsTypesT` r=DrewKimball a=DrewKimball

This commit adds copying for the elements of a composite type in the `TypeDescriptor.AsTypesT` method. This avoids data races during type hydration.

Fixes #119866

Release note: None

Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Drew Kimball <[email protected]>
pkg/sql/logictest/tests/fakedist-disk/fakedist-disk_test.TestLogic_composite_types failed with artifacts on master @ 51bbfff84c26be8a2b40e25b4bce3d59ea63dc59:
Parameters:
TAGS=bazel,gss
stress=true
Help
See also: How To Investigate a Go Test Failure (internal)
This test on roachdash | Improve this report!
Jira issue: CRDB-36362