Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-21.2: stats: fix autostats panic with minStaleRows == 0 #89706

Merged
merged 1 commit into from
Oct 11, 2022

Conversation

msirek
Copy link
Contributor

@msirek msirek commented Oct 10, 2022

Fixes #89323

This fixes a panic in autostats code when cluster setting
sql.stats.automatic_collection.min_stale_rows is 0.

Release note (bug fix): This patch fixes errors that may
occur in autostats collection when cluster setting
sql.stats.automatic_collection.min_stale_rows is set to 0.

Release justification: high-severity bug in existing functionality;
autostats collection should never panic.

Fixes cockroachdb#89323

This fixes a panic in autostats code when cluster setting
`sql.stats.automatic_collection.min_stale_rows` is 0.

Release note (bug fix): This patch fixes errors that may occur
in autostats collection when cluster setting
sql.stats.automatic_collection.min_stale_rows is set to 0.
@msirek msirek requested review from michae2, cucaroach and a team October 10, 2022 23:57
@msirek msirek requested a review from a team as a code owner October 10, 2022 23:57
@blathers-crl
Copy link

blathers-crl bot commented Oct 10, 2022

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues or test-only changes.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@msirek
Copy link
Contributor Author

msirek commented Oct 10, 2022

This is backporting some code from #78110.

@cucaroach
Copy link
Contributor

Does is make sense to allow setting minStaleRows to zero? What does that mean? Recreate stats on any update/insert/delete? Wouldn't 1 be the same? Maybe it should be an error to set it to 0? Or maybe it should mean never recreate stats (ie turn off auto stats)?

Where's the PR for this to be fixed on master? Did I miss it or does this bug not exist on master?

The release justification is just a restatement of the description, it should be something like one of the 4 categories here: https://cockroachlabs.atlassian.net/wiki/spaces/ENG/pages/168656965/Guidelines+for+Release+Stability+Period

Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't questioned the valid range of min_stale_rows before. It's had a min value of zero for a couple years. The description says:

Target minimum number of stale rows per table that will trigger a statistics refresh

And there's an example in docs:

Suppose the clusters settings sql.stats.automatic_collection.fraction_stale_rows and sql.stats.automatic_collection.min_stale_rows have the default values .2 and 500 as shown in the preceding table.

If a table has 100 rows and 20 became stale, a re-collection would not be triggered because, even though 20% of the rows are stale, they do not meet the 500 row minimum.

So, I think a min_stale_rows setting of zero would mean basically the thing as a setting of 1. The number of mutations must be greater than or equal to this value to trigger auto stats. In the case that rowCount starts off low, like a value of 1, a min_stale_rows setting of zero would trigger stats refresh on the next mutation (if fraction_stale_rows is less than 1). This seems like OK behavior. Once the rowCount start growing, rowCount * fraction_stale_rows would be above zero and we'd start randomly triggering autostats (instead of on every mutation).

Where's the PR for this to be fixed on master? Did I miss it or does this bug not exist on master?

This code was part of #78110, which I'm backporting to fix this issue.

I've updated the release justification to category 3.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach and @michae2)

Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose users could get creative and set both fraction_stale_rows and min_stale_rows to zero to force autostats collection on every mutation. That could be by design, but I'm not sure. Copying @rytaft

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach and @michae2)

Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach)

Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach)

@msirek msirek merged commit 3e274db into cockroachdb:release-21.2 Oct 11, 2022
@msirek msirek deleted the 89323 branch October 11, 2022 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants