testutils/testcluster: TestRestart failed #111674
Comments
Unfortunately, I was not able to reproduce this locally under stress. Since the error message and stack trace are coming from Pebble, I'm pinging @cockroachdb/storage on this one -- can you take a look, please?
Looking at the Pebble logs for that run, this is interesting: 000544 (assuming the crash is on n3,s3) is created by an excise:
Then there’s this line when we start up again (unclear what node/store this is though):
Maybe the code to check for obsolete ssts on Open isn't accounting for virtual sstables? But it's entirely possible it's something else, as I'm stringing together the above theory based on multiple assumptions about the crashing node/store.
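To make the theory above concrete, here is a minimal toy sketch in Go. It is not Pebble's code: the `FileNum` and `Table` types, the function names, and the file number 000540 are all hypothetical, chosen only to illustrate how an obsolete-file scan that ignores virtual sstables could flag a backing file that is still in use.

```go
package main

import "fmt"

// FileNum identifies a file on disk. Hypothetical type for this sketch.
type FileNum uint64

// Table is a toy model of one table in the current version.
// A virtual table has no file of its own; it reads from Backing.
type Table struct {
	Num     FileNum
	Virtual bool
	Backing FileNum // only meaningful when Virtual is true
}

// liveFilesNaive records each table's own number and ignores backing files.
// This models the suspected bug, not any real Pebble function.
func liveFilesNaive(tables []Table) map[FileNum]bool {
	live := make(map[FileNum]bool)
	for _, t := range tables {
		live[t.Num] = true
	}
	return live
}

// liveFiles records the file that actually exists on disk for each table:
// the backing file for a virtual table, the table's own file otherwise.
func liveFiles(tables []Table) map[FileNum]bool {
	live := make(map[FileNum]bool)
	for _, t := range tables {
		if t.Virtual {
			live[t.Backing] = true
		} else {
			live[t.Num] = true
		}
	}
	return live
}

// obsolete returns the on-disk files that are not in the live set.
func obsolete(onDisk []FileNum, live map[FileNum]bool) []FileNum {
	var out []FileNum
	for _, n := range onDisk {
		if !live[n] {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	// Assumed scenario: an excise split file 540 into virtual tables 544 and 545.
	tables := []Table{
		{Num: 544, Virtual: true, Backing: 540},
		{Num: 545, Virtual: true, Backing: 540},
		{Num: 541},
	}
	onDisk := []FileNum{540, 541}

	fmt.Println("naive scan marks obsolete:", obsolete(onDisk, liveFilesNaive(tables)))
	fmt.Println("backing-aware scan marks obsolete:", obsolete(onDisk, liveFiles(tables)))
}
```

In this toy model the naive scan flags the backing file as obsolete even though two virtual tables still read from it, while the backing-aware scan flags nothing.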
This is actually cockroachdb/pebble#2947. A vendor bump of Pebble will fix this.
110930: gcjob_test: add more logging to TestGCJobRetry r=rafiss a=andyyang890

This patch adds more logging to `TestGCJobRetry` to help debug occasional flakes.

Informs: #110447
Release note: None

111510: api: increase timeout to request execution details r=maryliag a=adityamaru

In large clusters requesting execution details can definitely take longer than 5 seconds. This is because it involves collecting cluster wide traces, goroutines and contacting the coordinator node of the job to dump its execution details. Since this is a debug only tool we bump the timeout to 300s to give it all the time it needs.

Epic: none
Release note: None

111701: sql: add tests for privileges for statements in udfs r=rharding6373 a=rharding6373

This PR adds test coverage for privileges in UDFs, e.g., SELECT and INSERT privileges.

Epic: CRDB-25388
Informs: #87289
Release note: None

111704: Roachtest azure nightly r=healthy-pod,herkolategan a=smg260

These are a series of commits to enable roachtests to run on Azure in TeamCity.
1. Add the relevant teamcity invoke script
2. Update authentication to look in CLI or environment for dev and TC respectively
3. Look for a default subscription in the environment, with fallback to existing "pick first" implementation
4. Add a security rule to allow roachtest host machine to connect to a vm via kafka admin
5. Update azure default location to one with more quota and `apt-get update` before installing go for a cdc test (failed on azure)

A follow up PR will enable an initial set of roachtests to run.

Epic: CC-25185
Release note: none

111776: go.mod: bump Pebble to b013ca78e9dc r=RaduBerinde a=RaduBerinde

b013ca78 db: keep track of virtual sstable size sum
62251e69 db: make checkpoint test even more deterministic
c7c47d6b db: turn testingAlwaysWaitForCleanup into an option
a05b0192 db: keep track of virtual sstable count in metrics
3c778710 db: add test for virtual sstable checkpointing
cb4dab66 db: add metrics for num backing sstables and size
8317cf38 db: incrementally keep tracking of backing table size
0f80e184 Update index.html
aa077af6 go.mod: specify Go 1.20
ccb9a7dc manifest: add invariant check for duplicate file backings
699fc0e8 db: only create one CreatedBackingTables entry per sstable
b2da10c6 db: remove trailing whitespace from compacting log line
1d696c79 db: cleanup btree obsoletion logic

Fixes: #111674
Release note: none

Co-authored-by: Andy Yang <[email protected]>
Co-authored-by: adityamaru <[email protected]>
Co-authored-by: rharding6373 <[email protected]>
Co-authored-by: Miral Gadani <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
testutils/testcluster.TestRestart failed with artifacts on master @ 765bda989b6b438b1d552b7d93526ffca5a31923:
Parameters:
TAGS=bazel,gss,deadlock, stress=true
Help
See also: How To Investigate a Go Test Failure (internal)
Jira issue: CRDB-32012