-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: test namespace entry for multi-region enum is drained properly #60466
Conversation
As promised on #60437 |
fb5a77e
to
c903c36
Compare
pkg/sql/type_change.go
Outdated
@@ -162,6 +162,9 @@ type TypeSchemaChangerTestingKnobs struct { | |||
// RunBeforeEnumMemberPromotion runs before enum members are promoted from | |||
// readable to all permissions in the typeSchemaChanger. | |||
RunBeforeEnumMemberPromotion func() | |||
// RunAfterOnFailOrCancel runs after OnFailOrCancel completes, if | |||
// OnFailOrCancel is triggered. | |||
RunAfterAllFailOrCancel func() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mean RunAfterOnFailOrCancel
?
pkg/sql/multiregion_test.go
Outdated
`) | ||
require.NoError(t, err) | ||
|
||
_, err = sqlDB.Exec(`ALTER DATABASE db DROP REGION "us-east-1" `) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extra space
defer mu.Unlock() | ||
numCleanups++ | ||
if numCleanups == 2 { | ||
close(cleanupFinished) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if OnFailOrCancel
fails, does it retry, or does the entry get stuck?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)
pkg/sql/multiregion_test.go, line 164 at r1 (raw file):
mu.Lock() defer mu.Unlock() numCleanups++
Nit: why not make this a bool, and set it after the check?
haveWePerformedCleanup := false
...
if !haveWePerformedCleanup {
haveWePerformedCleanup = true
} else {
close(cleanupFinished)
}
c903c36
to
bf9a6a4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm and @otan)
pkg/sql/multiregion_test.go, line 164 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
Nit: why not make this a bool, and set it after the check?
haveWePerformedCleanup := false ... if !haveWePerformedCleanup { haveWePerformedCleanup = true } else { close(cleanupFinished) }
I think thats slightly confusing because when this block is hit, cleanup has already been performed, as the knob runs after cleanup has completed.
I slightly switched this logic to be explicit about the expectation here, lemme know if that improves the readability.
pkg/sql/multiregion_test.go, line 166 at r1 (raw file):
Previously, otan (Oliver Tan) wrote…
if
OnFailOrCancel
fails, does it retry, or does the entry get stuck?
This is exactly the bug I was talking about when we spoke yesterday. #60489. This isn't to say things can never be in a bad state -- they can be, if the error is non-retryable, but you can't account for everything :)
I'll send out a patch once this thing goes in, should be easy enough to test once this testing knob is a thing. I can also update this test to account for that.
pkg/sql/type_change.go, line 167 at r1 (raw file):
Previously, otan (Oliver Tan) wrote…
do you mean
RunAfterOnFailOrCancel
?
I'm not even sure how I made that typo tbh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @otan)
pkg/sql/multiregion_test.go, line 164 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
I think thats slightly confusing because when this block is hit, cleanup has already been performed, as the knob runs after cleanup has completed.
I slightly switched this logic to be explicit about the expectation here, lemme know if that improves the readability.
I like the changes. You could still go with the bool if you change name to haveWePerformedFirstRoundOfCleanup
, but your call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @otan)
pkg/sql/type_change.go, line 167 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
I'm not even sure how I made that typo tbh
you need sleep
bf9a6a4
to
2347a01
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @otan)
pkg/sql/multiregion_test.go, line 164 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
I like the changes. You could still go with the bool if you change name to
haveWePerformedFirstRoundOfCleanup
, but your call.
Done!
🕐 Waiting for PR status (Github check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set. |
Testing only patch. When the primary region is removed from a multi-region database we queue up an async job to reclaim the namespace entry for the multi-region type descriptor (and its array alias). This patch adds a test to make sure that the namespace entry is drained properly even if the job runs into an error. Release note: None
2347a01
to
df14643
Compare
bors r+ |
This PR was included in a batch that was canceled, it will be automatically retried |
Build failed (retrying...): |
Build succeeded: |
Testing only patch. When the primary region is removed from a
multi-region database we queue up an async job to reclaim the namespace
entry for the multi-region type descriptor (and its array alias). This
patch adds a test to make sure that the namespace entry is drained
properly even if the job runs into an error.
This patch adds a new testing knob to the type schema changer,
RunAfterAllFailOrCancel, to synch on the
OnFailOrCancel
jobcompleting.
Release note: None