Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multitenant: deflake TestConsumption #105552

Merged
merged 1 commit into from
Jul 28, 2023

Conversation

DrewKimball
Copy link
Collaborator

@DrewKimball DrewKimball commented Jun 26, 2023

This patch adds retries to the TestConsumption test for retrieving the total tenant RU consumption in order to ensure that the delta includes the bucket flush that was triggered by the test query, instead of background activity.

Fixes #94286
Fixes #106572

Release note: None

@DrewKimball DrewKimball requested review from a team and rharding6373 June 26, 2023 15:44
@DrewKimball DrewKimball requested a review from a team as a code owner June 26, 2023 15:44
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original failure was on the 22.2 branch. I think you should add backport labels for 23.1 and 23.2.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball)


pkg/ccl/multitenantccl/tenantcostclient/tenant_side_test.go line 910 at r1 (raw file):

		// Wait out the target period duration.
		time.Sleep(targetPeriod * 5)

Sleep can still be flaky, so it would be nice if we could find a way to not rely on it. Why not instead loop on waitForConsumption a few times if the checks on batches, requests, etc. haven't been met?

Copy link
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball)


pkg/ccl/multitenantccl/tenantcostclient/tenant_side_test.go line 910 at r1 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

Sleep can still be flaky, so it would be nice if we could find a way to not rely on it. Why not instead loop on waitForConsumption a few times if the checks on batches, requests, etc. haven't been met?

This suggestion also has the potential to be flaky, but I think the sleep is worse, since when sleeping for a long time background tasks are also more likely to cause the checks to pass.

I might be missing some subtleties about how consumption works, though.

Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rharding6373)


pkg/ccl/multitenantccl/tenantcostclient/tenant_side_test.go line 910 at r1 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

This suggestion also has the potential to be flaky, but I think the sleep is worse, since when sleeping for a long time background tasks are also more likely to cause the checks to pass.

I might be missing some subtleties about how consumption works, though.

I changed to retry in a loop.

@DrewKimball
Copy link
Collaborator Author

@rharding6373 friendly ping :)

@DrewKimball DrewKimball added backport-23.1.x Flags PRs that need to be backported to 23.1 backport-22.2.x labels Jul 11, 2023
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rharding6373)


pkg/ccl/multitenantccl/tenantcostclient/tenant_side_test.go line 910 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

I changed to retry in a loop.

Instead of an explicit retry loop, why not use SucceedsSoon?

@rharding6373 rharding6373 self-requested a review July 18, 2023 22:21
This patch adds retries to the `TestConsumption` test for retrieving the
total tenant RU consumption in order to ensure that the delta includes the
bucket flush that was triggered by the test query, instead of background
activity.

Fixes cockroachdb#94286
Fixes cockroachdb#106572

Release note: None
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rharding6373)


pkg/ccl/multitenantccl/tenantcostclient/tenant_side_test.go line 910 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Instead of an explicit retry loop, why not use SucceedsSoon?

I added SucceedsSoon, hope you don't mind Drew. I also stressed it on gceworker with no failures:

3324 runs so far, 0 failures, over 49m30s

Let's merge this to fix 2 issues :) @DrewKimball @rharding6373

@DrewKimball
Copy link
Collaborator Author

Let's merge this to fix 2 issues :) @DrewKimball @rharding6373

SGTM, sorry I haven't gotten to this in a while.

@yuzefovich
Copy link
Member

No worries!

bors r+

@craig
Copy link
Contributor

craig bot commented Jul 28, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Jul 28, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Jul 28, 2023

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1
Projects
None yet
4 participants