kvserver: TestStoreResolveMetrics is very flaky #98404

Closed
renatolabs opened this issue Mar 10, 2023 · 0 comments · Fixed by #101936
Labels: C-bug (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.), skipped-test

Comments

renatolabs commented Mar 10, 2023

TestStoreResolveMetrics has started flaking very frequently (about 1 in every 5 runs locally). A bisection points to #98044.

=== RUN   TestStoreResolveMetrics
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/09357cecfdbbab5926b4b055936a9b62/logTestStoreResolveMetrics3287525348
    test_log_scope.go:79: use -show-logs to present logs inline
    client_metrics_test.go:251: expected around 200 intent commits, saw 255
    client_metrics_test.go:259: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/09357cecfdbbab5926b4b055936a9b62/logTestStoreResolveMetrics3287525348
--- FAIL: TestStoreResolveMetrics (0.85s)

https://teamcity.cockroachdb.com/project.html?projectId=Cockroach_Ci_TestsGcpLinuxX8664BigVm&testNameId=-2099906381107111862&tab=testDetails

cc @nvanbenschoten

Jira issue: CRDB-25248

renatolabs added the C-bug and T-kv (KV Team) labels on Mar 10, 2023
msbutler added a commit to msbutler/cockroach that referenced this issue Mar 10, 2023
craig bot pushed a commit that referenced this issue Mar 10, 2023
98405: kv: skip TestStoreResolveMetrics TestReplicaRemovalClosesProposalQuota r=999 a=msbutler

Informs #98404

Epic: none
Release note: none

Co-authored-by: Michael Butler
smg260 removed the T-kv (KV Team) label on Mar 14, 2023
nvanbenschoten self-assigned this on Apr 20, 2023
craig bot pushed a commit that referenced this issue Apr 20, 2023
100731: c2c: clean up ReplicationFeed error handling r=lidorcarmel a=msbutler

Previously, the replicationFeed test helper had methods that would swallow errors, making it impossible to debug certain test failures. This patch cleans up the internals of this test helper and prevents error swallowing.

Fixes #100414

Release note: None

101860: util/parquet: add support for arrays r=miretskiy a=jayshrivastava

This change extends and refactors the util/parquet library to be able to read and write arrays.

Release note: None

Informs: #99028
Epic: https://cockroachlabs.atlassian.net/browse/CRDB-15071

101936: kv: deflake and unskip TestStoreResolveMetrics r=arulajmani a=nvanbenschoten

Fixes #98404.

The test began flaking after #98044 because we now perform more async intent resolution when starting a cluster. Specifically, the additional async intent resolution operations are in service of job updates. These updates run SELECT FOR UPDATE queries over the new `system.job_info` table, but then perform a 1-phase commit.

To deflake the test, we clear the intent resolution metrics after server startup.
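
As a rough illustration of the idea (not the actual patch), the sketch below uses a hypothetical `Counter` type and hypothetical helpers `simulateStartup`, `runWorkload`, and `withinTolerance`; none of these are kvserver APIs. The background count of 55 and the tolerance of 10 are assumed values, picked only so that the unreset case reproduces the "expected around 200 intent commits, saw 255" shape of the failure.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Counter is a hypothetical stand-in for a store-level metric counter.
// It is not the real kvserver metric type.
type Counter struct{ n atomic.Int64 }

func (c *Counter) Inc(delta int64) { c.n.Add(delta) }
func (c *Counter) Count() int64    { return c.n.Load() }
func (c *Counter) Clear()          { c.n.Store(0) }

// withinTolerance reports whether got is within tol of want.
func withinTolerance(got, want, tol int64) bool {
	diff := got - want
	if diff < 0 {
		diff = -diff
	}
	return diff <= tol
}

// simulateStartup mimics background work during server startup that
// bumps the counter before the test's own workload runs.
func simulateStartup(c *Counter, background int64) { c.Inc(background) }

// runWorkload mimics the intent commits issued by the test itself.
func runWorkload(c *Counter, commits int64) { c.Inc(commits) }

func main() {
	const want, tol = 200, 10 // tol is an assumed value for illustration

	// Without a reset, startup noise leaks into the assertion.
	flaky := &Counter{}
	simulateStartup(flaky, 55)
	runWorkload(flaky, want)
	fmt.Println("without reset:", flaky.Count(),
		withinTolerance(flaky.Count(), want, tol))

	// Clearing the counter after startup isolates the test's workload.
	stable := &Counter{}
	simulateStartup(stable, 55)
	stable.Clear()
	runWorkload(stable, want)
	fmt.Println("with reset:", stable.Count(),
		withinTolerance(stable.Count(), want, tol))
}
```

In this toy model the first counter ends at 255 and fails the tolerance check, while the second ends at 200 and passes. Resetting after startup keeps the assertion tied to the test's own workload rather than to however much background work the cluster happens to do while booting.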

Release note: None

Co-authored-by: Michael Butler
Co-authored-by: Jayant Shrivastava
Co-authored-by: Nathan VanBenschoten
craig bot closed this as completed in c286cd1 on Apr 20, 2023