roachtest: tpccbench/nodes=3/cpu=4 failed #53282
Same as #53285 (comment).
(roachtest).tpccbench/nodes=3/cpu=4 failed on master@ef55609797c92e46eb9efd08facc9db49558291d:
Artifacts: /tpccbench/nodes=3/cpu=4
See this test on roachdash
@rohany were you in the area recently? (Only the last failure.)
I just bors'd something to fix this: http://github.com/cockroachdb/cockroach/pull/53450
Related to #53285.
53589: jobs: improve job adoption r=ajwerner a=ajwerner

#### jobs: don't hold mutex during adoption, launch in parallel

#### jobs: break up new stages of job lifecycle movement

In the PR which adopted the sqlliveness sessions, we shoved all of the stages of adopting jobs into the same stage and we invoked that stage on each adoption interval and on each sent to the adoption channel. These stages are:

* Cancel jobs
* Serve pause and cancel requests
* Delete claims due to dead sessions
* Claim jobs
* Process claimed jobs

This is problematic for tests which send on the adoption channel at a high rate. One important thing to note is that all jobs which are sent on the adoption channel are already claimed.

After this PR we move the first three steps above into the cancellation loop we were already running. We also increase the default interval for that loop as it was exceedingly frequent at 1s for no obvious reason. Much of the testing changes are due to this cancelation loop duration change. The tests in this package now run 3x faster (10s vs 30s).

Then, upon sends on the adoption channel, we just process claimed jobs. When the adoption interval rolls around, then we attempt to both claim and process jobs.

Release justification: bug fixes and low-risk updates to new functionality

Release note: None

53697: kv: attach txn to error from detectIntentMissingDueToIntentResolution r=nvanbenschoten a=nvanbenschoten

Fixes #53189.
Fixes #53282.
Fixes #53285.
Fixes #53469.

3dcb6f1 improved the detection of missing intents during parallel commit attempts to distinguish between certain classes of ambiguous errors and transaction aborted errors. This was a nice improvement, as it dramatically reduced the number of situations where we returned ambiguous errors during normal operation (see #52566).

However, in introducing a new location where transaction retry errors could be generated, it accidentally violated the invariant that all transaction retry errors have transaction protos attached to them. This was causing panics in TPC-C roachtests.

This commit fixes this issue by properly attaching transaction protos to these new errors, along with any others returned from `detectIntentMissingDueToIntentResolution`.

Release justification: bug fix

53708: builtins: implement ST_MemCollect and ST_MemUnion r=otan a=erikgrinaker

These are implemented as aliases for ST_Collect and ST_Union. In PostGIS these are memory-optimized versions, but for now it should be sufficient to simply make them functional.

Release justification: low risk, high benefit changes to existing functionality

Release note (sql change): Implement the geometry aggregate builtins `ST_MemCollect` and `ST_MemUnion`.

Closes #48984.
Closes #48986.

Co-authored-by: Andrew Werner <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
(roachtest).tpccbench/nodes=3/cpu=4 failed on master@7425e857e62fe4280f614f9076f310322cc78649:
Artifacts: /tpccbench/nodes=3/cpu=4
See this test on roachdash
powered by pkg/cmd/internal/issues