sql/stats: support rowCountEq = 0 in histogram.adjustCounts #82474
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @msirek)
LGTM
func (h *histogram) adjustCounts(evalCtx *eval.Context, rowCountTotal, distinctCountTotal float64) {
	// Empty table cases.
	if rowCountTotal <= 0 || distinctCountTotal <= 0 {
I'm guessing negative counts are possible with forecasting too? Maybe the commit message should mention that. And you could add test cases with negative counts if you'd like.
Good point, I added a test case.
The current PR normalizes negative counts to zero before calling adjustCounts,
but you're absolutely right, this was my intention with <=.
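To make the exchange above concrete, here is a minimal sketch of the two ideas discussed: clamping negative forecast counts to zero before the call, and treating zero or negative totals as the empty-table case. The helper names `clampCount` and `isEmptyTable` are hypothetical and do not come from the PR; only the `<=` condition is taken from the quoted code.

```go
package main

import "fmt"

// clampCount is a hypothetical helper: a forecast can extrapolate a count
// below zero, so negative values are normalized to zero.
func clampCount(c float64) float64 {
	if c < 0 {
		return 0
	}
	return c
}

// isEmptyTable mirrors the condition in the quoted guard: a zero or negative
// row count or distinct count is treated as the empty-table case.
func isEmptyTable(rowCountTotal, distinctCountTotal float64) bool {
	return rowCountTotal <= 0 || distinctCountTotal <= 0
}

func main() {
	// Assumed sample totals, including a negative forecast value.
	for _, rows := range []float64{-12.5, 0, 40} {
		clamped := clampCount(rows)
		fmt.Println(rows, clamped, isEmptyTable(clamped, 5))
	}
}
```

Using `<=` rather than `== 0` means the guard still holds if a caller forgets to clamp first.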
Reviewed all commit messages.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @michae2)
The predicted histograms in statistics forecasts will often have buckets with NumEq = 0, and some predicted histograms will have _all_ buckets with NumEq = 0. This wasn't possible before forecasting, because the histograms produced by `EquiDepthHistogram` never have any buckets with NumEq = 0.

If `adjustCounts` is called on such a histogram, `rowCountEq` and `distinctCountEq` will be zero. `adjustCounts` should still be able to fix such a histogram to have sum(NumRange) = rowCountTotal and sum(DistinctRange) = distinctCountTotal. This patch teaches `adjustCounts` to handle these histograms.

(Similarly, predicted histograms could have all buckets with NumRange = 0, but this is already possible for histograms produced by `EquiDepthHistogram`, so `adjustCounts` already handles these.)

Also, add a few more comments to `adjustCounts`.

Assists: cockroachdb#79872

Release note: None
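As a rough illustration of the invariant described in the commit message, the sketch below rescales a histogram whose buckets all have NumEq = 0 so that sum(NumRange) = rowCountTotal and sum(DistinctRange) = distinctCountTotal. It is a simplified proportional rescale written for this example, not the actual `adjustCounts` implementation; the `bucket` type and the `rescale` function are hypothetical.

```go
package main

import "fmt"

// bucket is a hypothetical, simplified stand-in for a histogram bucket.
type bucket struct {
	NumEq         float64 // rows equal to the bucket's upper bound
	NumRange      float64 // rows strictly inside the bucket's range
	DistinctRange float64 // distinct values strictly inside the range
}

// rescale proportionally scales row counts to sum to rowCountTotal, and
// scales DistinctRange to cover the distinct values not accounted for by
// non-empty upper bounds. Illustrative only; not CockroachDB's adjustCounts.
func rescale(buckets []bucket, rowCountTotal, distinctCountTotal float64) {
	var rowCountEq, rowCountRange, distinctCountEq, distinctCountRange float64
	for _, b := range buckets {
		rowCountEq += b.NumEq
		rowCountRange += b.NumRange
		distinctCountRange += b.DistinctRange
		if b.NumEq > 0 {
			distinctCountEq++ // each non-empty upper bound is one distinct value
		}
	}

	// Guard the denominators: when every NumEq is zero, rowCountEq and
	// distinctCountEq are both zero and only the range counts carry weight.
	rowFactor := 0.0
	if rowCountEq+rowCountRange > 0 {
		rowFactor = rowCountTotal / (rowCountEq + rowCountRange)
	}
	distinctFactor := 0.0
	if distinctCountRange > 0 && distinctCountTotal > distinctCountEq {
		distinctFactor = (distinctCountTotal - distinctCountEq) / distinctCountRange
	}

	for i := range buckets {
		buckets[i].NumEq *= rowFactor
		buckets[i].NumRange *= rowFactor
		buckets[i].DistinctRange *= distinctFactor
	}
}

func main() {
	// A predicted histogram in which every bucket has NumEq = 0.
	h := []bucket{
		{NumEq: 0, NumRange: 0, DistinctRange: 0},
		{NumEq: 0, NumRange: 30, DistinctRange: 3},
		{NumEq: 0, NumRange: 10, DistinctRange: 2},
	}
	rescale(h, 100, 10)
	for _, b := range h {
		fmt.Printf("%+v\n", b)
	}
}
```

The sketch does not cover the case where both the equality and range counts are zero while the totals are positive; it simply leaves those buckets at zero.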
TFTRs!
bors r=rytaft,mgartner,msirek
Build succeeded:
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 3f2da1a to blathers/backport-release-22.1-82474: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 22.1.x failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.