roachtest: mvcc_gc failed #90020

Closed
cockroach-teamcity opened this issue Oct 15, 2022 · 8 comments · Fixed by #90603
Labels
A-testing Testing tools and infrastructure branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot.

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Oct 15, 2022

roachtest.mvcc_gc failed with artifacts on master @ 7be0b20edbc336200c1510a9c6f1d76ae2f92c3a:

test artifacts and logs in: /artifacts/mvcc_gc/run_1
(mvcc_gc.go:187).assertRangesWithGCRetry: assertion still failing after 5m0s: table ranges contain range tombstones contains_estimates:0 last_update_nanos:1665817313005902506 intent_age:0 gc_bytes_age:69427857169 live_bytes:41619847 live_count:20000 key_bytes:4356887 key_count:20000 val_bytes:701959520 val_count:341420 intent_bytes:0 intent_count:0 separated_intent_count:0 range_key_count:2 range_key_bytes:66 range_val_count:2 range_val_bytes:0 sys_bytes:2912 sys_count:46 abort_span_bytes:0
(monitor.go:127).Wait: monitor failure: monitor task failed: t.Fatal() was called

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=4 , ROACHTEST_encrypted=false , ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

/cc @cockroachdb/kv-triage

This test on roachdash | Improve this report!

Jira issue: CRDB-20548

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Oct 15, 2022
@cockroach-teamcity cockroach-teamcity added this to the 22.2 milestone Oct 15, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Oct 15, 2022
@erikgrinaker erikgrinaker added T-kv-replication and removed T-kv KV Team labels Oct 15, 2022
@blathers-crl

blathers-crl bot commented Oct 15, 2022

cc @cockroachdb/replication

@erikgrinaker
Contributor

@aliher1911 Can you have a look on Monday, and determine whether this is a legit failure or a test artifact?

@aliher1911
Contributor

This is a flake in a newly added test. The test verifies GC by enqueueing replicas in a retry loop while checking stats.
It looks like it always only just makes it within the timeout when running in CI, while it has some headroom on a local cluster or when run from a dev machine.
It doesn't look like a problem in GC, so I'll remove the blocker tag and address the flakiness in due time.

@aliher1911 aliher1911 removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Oct 17, 2022
@cockroach-teamcity
Member Author

roachtest.mvcc_gc failed with artifacts on master @ 762c1b86fe1c3a70338cee4a91c9e2e4c5e0fcfe:

test artifacts and logs in: /artifacts/mvcc_gc/run_1
(mvcc_gc.go:187).assertRangesWithGCRetry: assertion still failing after 5m0s: table ranges contain range tombstones contains_estimates:0 last_update_nanos:1666076934831958621 intent_age:0 gc_bytes_age:62440539948 live_bytes:41619847 live_count:20000 key_bytes:4273139 key_count:20000 val_bytes:687610696 val_count:334441 intent_bytes:0 intent_count:0 separated_intent_count:0 range_key_count:2 range_key_bytes:52 range_val_count:2 range_val_bytes:0 sys_bytes:4095 sys_count:53 abort_span_bytes:0
(monitor.go:127).Wait: monitor failure: monitor task failed: t.Fatal() was called

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=4 , ROACHTEST_encrypted=false , ROACHTEST_ssd=0


@cockroach-teamcity
Member Author

roachtest.mvcc_gc failed with artifacts on master @ 16b020292dbcdf9699da531764929ebdc8e48c43:

test artifacts and logs in: /artifacts/mvcc_gc/run_1
(mvcc_gc.go:187).assertRangesWithGCRetry: assertion still failing after 5m0s: table ranges contain range tombstones contains_estimates:0 last_update_nanos:1666249233209163788 intent_age:0 gc_bytes_age:64203745236 live_bytes:41619847 live_count:20000 key_bytes:4343519 key_count:20000 val_bytes:699669136 val_count:340306 intent_bytes:0 intent_count:0 separated_intent_count:0 range_key_count:2 range_key_bytes:84 range_val_count:4 range_val_bytes:0 sys_bytes:7731 sys_count:86 abort_span_bytes:0
(monitor.go:127).Wait: monitor failure: monitor task failed: t.Fatal() was called

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=4 , ROACHTEST_encrypted=false , ROACHTEST_ssd=0


@erikgrinaker
Contributor

Mind bumping the timeout here?

@cockroach-teamcity
Member Author

roachtest.mvcc_gc failed with artifacts on master @ 11e299bb1bec2f5666658393f14c12ebe11cc4a4:

test artifacts and logs in: /artifacts/mvcc_gc/run_1
(mvcc_gc.go:187).assertRangesWithGCRetry: assertion still failing after 5m0s: table ranges contain range tombstones contains_estimates:0 last_update_nanos:1666594801935897222 intent_age:0 gc_bytes_age:65053613185 live_bytes:41619847 live_count:20000 key_bytes:4340651 key_count:20000 val_bytes:699177752 val_count:340067 intent_bytes:0 intent_count:0 separated_intent_count:0 range_key_count:3 range_key_bytes:180 range_val_count:12 range_val_bytes:0 sys_bytes:10576 sys_count:124 abort_span_bytes:0
(monitor.go:127).Wait: monitor failure: monitor task failed: t.Fatal() was called

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=4 , ROACHTEST_encrypted=false , ROACHTEST_ssd=0


@cockroach-teamcity
Member Author

roachtest.mvcc_gc failed with artifacts on master @ 1b1c8da55be48c174b7b370b305f42622546209f:

test artifacts and logs in: /artifacts/mvcc_gc/run_1
(test_impl.go:297).Fatalf: assertion still failing after 5m0s: table ranges contain range tombstones contains_estimates:0 last_update_nanos:1666767927189741590 intent_age:0 gc_bytes_age:64737673140 live_bytes:41619847 live_count:20000 key_bytes:4386695 key_count:20000 val_bytes:707066624 val_count:343904 intent_bytes:0 intent_count:0 separated_intent_count:0 range_key_count:2 range_key_bytes:102 range_val_count:6 range_val_bytes:0 sys_bytes:4962 sys_count:57 abort_span_bytes:0
(test_impl.go:291).Fatal: monitor failure: monitor task failed: t.Fatal() was called

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=4 , ROACHTEST_encrypted=false , ROACHTEST_ssd=0


craig bot pushed a commit that referenced this issue Oct 26, 2022
90603: roachtest: mvcc_gc increase GC waiting timeouts r=erikgrinaker a=aliher1911

The test waits for the MVCC GC queue to collect old data. Replicas are enqueued for GC asynchronously, and it can take a long time for garbage to get collected, which causes the test to fail. This commit bumps the retry timeout to remove false negatives.

Release note: None

Fixes: #90020

Co-authored-by: Oleg Afanasyev <[email protected]>
@craig craig bot closed this as completed in 52e1859 Oct 26, 2022
@lunevalex lunevalex added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-testing Testing tools and infrastructure labels Dec 9, 2022
4 participants