roachtest: scaledata/distributed-semaphore/nodes=6 failed [unsafe tscache update] #60580
@nvanbenschoten, perhaps this might be of interest to you.
(roachtest).scaledata/distributed-semaphore/nodes=6 failed on master@3c223f5f5162103110a790743b687ef2bf952489:

Artifacts: /scaledata/distributed-semaphore/nodes=6 (see this test on roachdash)

(roachtest).scaledata/distributed-semaphore/nodes=6 failed on master@64c4aef909f4382523cd9248341ca9f4448d841a:

Artifacts: /scaledata/distributed-semaphore/nodes=6 (see this test on roachdash)

(roachtest).scaledata/distributed-semaphore/nodes=6 failed on master@bf9744bad5a416a4b06907f0f3dd42896f7342f3:

Artifacts: /scaledata/distributed-semaphore/nodes=6 (see this test on roachdash)

(roachtest).scaledata/distributed-semaphore/nodes=6 failed on master@5cfd7e5553a3072a1490d392390dddf968844215:

Artifacts: /scaledata/distributed-semaphore/nodes=6 (see this test on roachdash)

(roachtest).scaledata/distributed-semaphore/nodes=6 failed on master@ec011620c7cf299fdbb898db692b36454defc4a2:

Artifacts: /scaledata/distributed-semaphore/nodes=6 (see this test on roachdash)

(roachtest).scaledata/distributed-semaphore/nodes=6 failed on master@c7e088826bc079620dfd3b5ae75d1c15cd9cd16d:

Artifacts: /scaledata/distributed-semaphore/nodes=6 (see this test on roachdash)
61113: ui: show replica type on the range report page r=aayushshah15 a=aayushshah15

Resolves #59677

Release justification: observability improvement
Release note (ui change): the range report page on the admin ui will now also show each of the replica's types

61128: jobs: introduce jobspb.JobID r=lucy-zhang a=lucy-zhang

This commit introduces a `jobspb.JobID` int64 type and uses it in most places where we were previously using an int64 (see the sketch after this comment). Closes #61121.

Release justification: Low-risk change to existing functionality.
Release note: None

61129: geo/wkt: update parsing of dimensions for empty geometrycollections r=otan,rafiss a=andyyang890

Previously, the data structure used for storing geometry collections was unable to store a layout, which made it impossible to distinguish empty geometry collections of different layouts. That issue has since been fixed, and this patch updates the parser accordingly.

Resolves #61035. Refs: #53091

Release justification: bug fix for new functionality
Release note: None

61130: kv: disable timestamp cache + current clock assertion r=nvanbenschoten a=nvanbenschoten

Closes #60580. Closes #60736. Closes #60779. Closes #61060.

This was added in 218a5a3. The check was more of a sanity check that we have, and always have had, an understanding of the timestamps that can enter the timestamp cache. The fact that it's failing is a clear indication that there were issues in past releases, because a lease transfer used to be safe only if the outgoing leaseholder's clock was above the time of any read in its timestamp cache. We now ship a snapshot of the timestamp cache on lease transfers, so that invariant is less important. I'd still like to get to the bottom of this, but I'll do so on my own branch, off of master where it's causing disruption.

Release justification: avoid assertion failures

61155: jobs: make sure we finish spans if canceled before starting job r=ajwerner a=ajwerner

Was seeing:

```
testcluster.go:135: condition failed to evaluate within 45s: unexpectedly found active spans:
     0.000ms      0.000ms    === operation:job _unfinished:1 intExec:create-stats
goroutine 84 [running]:
runtime/debug.Stack(0xc0086b1890, 0x792e940, 0xc009ac37e0)
  /usr/local/go/src/runtime/debug/stack.go:24 +0xab
```

in roachprod stressrace with a big cluster. This seemed to fix it.

Release justification: bug fixes and low-risk updates to new functionality.
Release note: None

Co-authored-by: Aayush Shah <[email protected]>
Co-authored-by: Lucy Zhang <[email protected]>
Co-authored-by: Andy Yang <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
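A minimal Go sketch of the defined-type pattern described in #61128. The `JobID` name matches the PR description, but the `loadJob` helper, the `main` wiring, and the example ID below are illustrative assumptions, not CockroachDB's actual jobs code:

```go
package main

import (
	"fmt"
	"strconv"
)

// JobID is a typed wrapper around the raw int64 job identifier, so job IDs
// can't be silently mixed up with other int64 values at compile time.
type JobID int64

// String implements fmt.Stringer for convenient logging.
func (j JobID) String() string {
	return strconv.FormatInt(int64(j), 10)
}

// loadJob would previously have taken a bare int64; with the defined type,
// passing an unrelated int64 no longer compiles without an explicit conversion.
func loadJob(id JobID) {
	fmt.Println("loading job", id)
}

func main() {
	var rawID int64 = 123456789 // hypothetical ID decoded from the wire
	loadJob(JobID(rawID))       // the conversion is explicit at the boundary
}
```

The point of the pattern is that the compiler now flags every call site where a raw int64 crosses into job-ID territory, which is what makes the migration described in the PR low-risk.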
60835: kv: bump timestamp cache to Pushee.MinTimestamp on PUSH_ABORT r=nvanbenschoten a=nvanbenschoten

Fixes #60779. Fixes #60580.

We were only checking that the batch header timestamp was equal to or greater than this pushee's min timestamp, so this is as far as we can bump the timestamp cache.

62832: geo: minor performance improvement for looping over edges r=otan a=andyyang890

This patch slightly improves the performance of many spatial builtins by storing the number of edges used in the loop conditions of for loops in a variable (see the sketch after this comment). We discovered this was taking a lot of time when profiling the point-in-polygon optimization.

Release note: None

62838: kvserver: purge gc-able, unmigrated replicas during migrations r=irfansharif a=irfansharif

Fixes #58378. Fixes #62267.

Previously it was possible for us to have replicas in-memory, with pre-migrated state, even after a migration was finalized. This led to the kind of badness we were observing in #62267, where it appeared that a replica was not using the applied state key despite us having migrated into it (see TruncatedAndRangeAppliedState, introduced in #58088).

To see how, consider the following set of events:
- Say r42 starts off on n1, n2, and n3.
- n3 flaps, and so we place a replica for r42 on n4.
- n3's replica, r42/3, is now GC-able, but still un-GC-ed.
- We run the applied state migration, first migrating all ranges into it and then purging outdated replicas.
- We should want to purge r42/3, because it's un-migrated and evaluating anything on it (say a lease request) is unsound: we've bumped version gates that tell the kvserver to always expect post-migration state.
- What happens when we try to purge r42/3? Previous to this PR, if it didn't have a replica version, we'd skip over it (!).
- Was it possible for r42/3 to not have a replica version? Shouldn't it have been accounted for when we migrated all ranges? No, and that's precisely why the migration infrastructure purges outdated replicas. The migrate request only returns once it's applied on all followers; in our example that wouldn't include r42/3, since it was no longer one.
- The stop-gap in #60429 made it so that we didn't GC r42/3, when we should've been doing the opposite. When iterating over a store's replicas for purging purposes, an empty replica version is fine and expected; we should interpret that as a signal that we're dealing with a replica that was obviously never migrated (to even start using replica versions in the first place). Because it didn't have a valid replica version installed, we can infer that it's soon to be GC-ed (else we wouldn't have been able to finalize the applied state + replica version migration).
- The conditions above made it possible for us to evaluate requests on replicas with migration state out-of-date relative to the store's version.
- Boom.

Release note: None

62839: zonepb: make subzone DiffWithZone more accurate r=ajstorm a=otan

* Subzones may be defined in a different order. We did not take this into account, which can cause bugs when e.g. ADD REGION adds a subzone at the end rather than in the old "expected" location in the subzones array. This has been fixed by comparing subzones using an unordered map.
* The ApplyZoneConfig we previously did overwrote subzone fields on the original subzone array element, meaning that if there was a mismatch it would not be reported through validation. This is now fixed by applying the expected zone config to *zonepb.NewZoneConfig() instead.
* Added logic to only check whether the zone config matches subzones from active subzone IDs.
* Improved the error messaging when a subzone config is mismatched: added index and partitioning information, and differentiated between missing fields and missing / extraneous zone configs.

Resolves #62790

Release note (bug fix): Fixed validation bugs during ALTER TABLE ... SET LOCALITY / crdb_internal.validate_multi_region_zone_config where validation errors could occur when the database of a REGIONAL BY ROW table has a new region added. Also fixed a validation bug where partition zone config mismatches were not caught.

62872: build: use -json for RandomSyntax test r=otan a=rafiss

I'm hoping this will help out with an issue where the test failures seem to be missing helpful logs.

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Andy Yang <[email protected]>
Co-authored-by: irfan sharif <[email protected]>
Co-authored-by: Oliver Tan <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
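A minimal Go sketch of the loop-condition hoisting described in #62832. The `ring`, `edge`, `NumEdges`, and `Edge` names here are illustrative stand-ins for an accessor that is not free to call, not the actual geo package API:

```go
package main

import "fmt"

// edge is a stand-in for a polygon edge (illustrative only).
type edge struct{ x0, y0, x1, y1 float64 }

// ring is a stand-in for a geometry whose NumEdges accessor is not free.
type ring struct{ edges []edge }

// NumEdges mimics an accessor that does nontrivial work on every call.
func (r *ring) NumEdges() int { return len(r.edges) }

// Edge returns the i-th edge.
func (r *ring) Edge(i int) edge { return r.edges[i] }

// visitEdgesSlow re-evaluates NumEdges() in the loop condition, so the
// accessor runs once per iteration.
func visitEdgesSlow(r *ring) int {
	visited := 0
	for i := 0; i < r.NumEdges(); i++ {
		_ = r.Edge(i)
		visited++
	}
	return visited
}

// visitEdgesFast hoists the edge count into a variable before the loop, as
// described in the PR, so the accessor runs exactly once.
func visitEdgesFast(r *ring) int {
	visited := 0
	numEdges := r.NumEdges() // evaluated once, outside the loop condition
	for i := 0; i < numEdges; i++ {
		_ = r.Edge(i)
		visited++
	}
	return visited
}

func main() {
	r := &ring{edges: make([]edge, 4)}
	fmt.Println(visitEdgesSlow(r), visitEdgesFast(r))
}
```

Both functions visit the same edges; the only difference is how many times the count accessor is evaluated, which is exactly the kind of per-iteration cost that shows up when profiling tight spatial loops.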
(roachtest).scaledata/distributed-semaphore/nodes=6 failed on master@5971ecb9dd1a25c81cd6012d6be1ff922802eae5:

Artifacts: /scaledata/distributed-semaphore/nodes=6 (see this test on roachdash)
powered by pkg/cmd/internal/issues