sql: retry sql stats compaction job #107306
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Please add a test.
c2dc039 to c5ee7c2
If the sql stats compaction job fails, the error is returned to the job system, where an hour must elapse (by default) before compaction can try again.
This commit introduces a compaction job level retry, where compaction can retry up to 10 times before the error is returned to the jobs system. Retries are idempotent.
The error semantics are preserved. If all 10 attempts fail, the most recent error is returned.
Resolves cockroachdb#107108
Epic: none
Release note: None
c5ee7c2 to 5ad1747
a4a3a80 to 99002f7
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @j82w and @zachlite)
-- commits
line 28 at r3:
nit: this setting isn't currently marked as "public" (i.e. it's missing the WithPublic() suffix), yet we mention it in the release note. Release notes should contain things that are intended to be exposed to users, and in almost all cases we'd expect the setting to be public if we wanted users to know about it. In short, there is a mismatch: we should either not mention the cluster setting in the release note (which is what I would do) or mark the setting as public so that it's properly documented.
pkg/sql/compact_sql_stats.go
line 86 at r3 (raw file):
maxRetryAttempts := int(persistedsqlstats.CompactionNumRetryAttempts.Get(&r.st.SV))
if p.ExecCfg().SQLStatsTestingKnobs != nil && p.ExecCfg().SQLStatsTestingKnobs.OverrideCompactionRetryOptions != nil {
nit: p.ExecCfg().SQLStatsTestingKnobs could be extracted into a local variable.
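A rough sketch of what this nit is asking for: bind the knobs pointer to a local variable once instead of repeating the accessor chain in the condition. The types below are hypothetical stand-ins (not the real CockroachDB types), and the body of the `if` is an assumption, since the quoted snippet only shows the condition.

```go
package main

import "fmt"

// Hypothetical stand-ins for the real types; the field names mirror the
// quoted snippet, everything else is assumed for illustration.
type retryOptions struct{ MaxRetries int }

type sqlStatsTestingKnobs struct {
	OverrideCompactionRetryOptions *retryOptions
}

type execCfg struct {
	SQLStatsTestingKnobs *sqlStatsTestingKnobs
}

// maxAttempts returns the cluster-setting value unless a testing knob
// overrides it. The knobs pointer is read into a local variable once,
// rather than repeating cfg.SQLStatsTestingKnobs in each check.
func maxAttempts(cfg *execCfg, settingValue int) int {
	attempts := settingValue
	if knobs := cfg.SQLStatsTestingKnobs; knobs != nil && knobs.OverrideCompactionRetryOptions != nil {
		attempts = knobs.OverrideCompactionRetryOptions.MaxRetries
	}
	return attempts
}

func main() {
	// No knobs configured: the setting value is used.
	fmt.Println(maxAttempts(&execCfg{}, 10))

	// A knob override is present: the override is used instead.
	withKnob := &execCfg{SQLStatsTestingKnobs: &sqlStatsTestingKnobs{
		OverrideCompactionRetryOptions: &retryOptions{MaxRetries: 2},
	}}
	fmt.Println(maxAttempts(withKnob, 10))
}
```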
99002f7 to 79c53d5
Added
retry logic
This change adds a test for the compaction retry logic as well as a cluster setting `sql.stats.cleanup.num_retries` to configure the number of retry attempts before the compaction job fails.
Release note (sql change): Added a cluster setting `sql.stats.cleanup.num_retries` to configure the number of retry attempts for the sql stats compaction job before it fails.
79c53d5 to 5ea7958
Closing after contention fix: #107549