release-23.1: sql: fix sql compaction job full scan #108987

Merged
merged 1 commit into from
Aug 18, 2023


Contributor

@j82w j82w commented Aug 18, 2023

Backport 1/1 commits from #108947 on behalf of @j82w.

/cc @cockroachdb/release


The sql-stats-compaction job is failing with TransactionRetryError. The cause is that the internal executor uses zero-values for the settings rather than the cluster defaults. This has the same effect as SET reorder_joins_limit = 0; which in turn causes the sql-stats-compaction delete statement to do a full scan. The full scan makes the query take a long time, so other queries conflict with it.

Error:
TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE - failed preemptive refresh due to a conflict: committed value on key

The fix to use the correct default value instead of 0 was made in #101486.

The solution is to change the query to avoid the join, and thus the full scan.
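
To illustrate the idea only, here is a minimal sketch of the two query shapes, using Python's built-in sqlite3 module rather than CockroachDB. The `stats` table, its columns, and both helper functions are hypothetical and are not the actual compaction query from this change. SQLite also has no `reorder_joins_limit`, so the sketch shows only that a delete which matches rows back through a subquery can be rewritten as a direct predicate on the key columns while removing the same rows.

```python
import sqlite3


def make_db():
    # Hypothetical stand-in for a stats table keyed by shard and timestamp.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stats ("
        " shard INTEGER, aggregated_ts INTEGER, fingerprint INTEGER,"
        " PRIMARY KEY (shard, aggregated_ts, fingerprint))"
    )
    rows = [(s, ts, fp) for s in range(2) for ts in range(5) for fp in range(3)]
    conn.executemany("INSERT INTO stats VALUES (?, ?, ?)", rows)
    return conn


def delete_with_join(conn, shard, cutoff_ts):
    # Join-shaped form: a subquery picks the rows, and the outer DELETE
    # matches them back against the table.
    cur = conn.execute(
        "DELETE FROM stats WHERE rowid IN ("
        " SELECT rowid FROM stats WHERE shard = ? AND aggregated_ts < ?)",
        (shard, cutoff_ts),
    )
    return cur.rowcount


def delete_direct(conn, shard, cutoff_ts):
    # Direct form: the predicate sits on the key columns, so no join is needed.
    cur = conn.execute(
        "DELETE FROM stats WHERE shard = ? AND aggregated_ts < ?",
        (shard, cutoff_ts),
    )
    return cur.rowcount
```

Both helpers return the number of deleted rows, so a caller can check that the rewrite is behavior-preserving before comparing query plans on the real system.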

Fixes: #108936

Release note (sql change): Optimized the sql-stats-compaction job delete query to avoid a full scan. This helps avoid the TransactionRetryError, which can cause the job to fail.


Release justification: bug fix

@j82w added the blathers-backport label (This is a backport that Blathers created automatically.) Aug 18, 2023
blathers-crl bot commented Aug 18, 2023

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues or test-only changes.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@cockroach-teamcity
Member

This change is Reviewable

@j82w j82w marked this pull request as ready for review August 18, 2023 13:07
@j82w j82w requested a review from a team August 18, 2023 13:07
@j82w j82w requested a review from a team as a code owner August 18, 2023 13:07
@j82w j82w requested review from msirek and removed request for a team August 18, 2023 13:07
Contributor

@maryliag maryliag left a comment

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @msirek)

Labels
blathers-backport This is a backport that Blathers created automatically.
3 participants