sql/stats: support rowCountEq = 0 in histogram.adjustCounts #82474

Merged
merged 1 commit into from
Jun 7, 2022

Collaborator

@michae2 michae2 commented Jun 6, 2022

The predicted histograms in statistics forecasts will often have buckets
with NumEq = 0, and some predicted histograms will have all buckets
with NumEq = 0. This wasn't possible before forecasting, because the
histograms produced by EquiDepthHistogram never have any buckets with
NumEq = 0.

If adjustCounts is called on such a histogram, rowCountEq and
distinctCountEq will be zero. adjustCounts should still be able to
fix such a histogram to have sum(NumRange) = rowCountTotal and
sum(DistinctRange) = distinctCountTotal. This patch teaches
adjustCounts to handle these histograms.

(Similarly, predicted histograms could have all buckets with
NumRange = 0, but this is already possible for histograms produced by
EquiDepthHistogram, so adjustCounts already handles these.)
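
For illustration, here is a minimal sketch of the invariant described above. It is not the code in this patch: the `bucket` type and `adjustSketch` function are hypothetical, simplified stand-ins, and the rule that each bucket with NumEq > 0 contributes one distinct value is an assumption of the sketch. The point is that scaling by the combined row count, rather than dividing by rowCountEq alone, still works when every bucket has NumEq = 0.

```go
package main

import "fmt"

// bucket is a simplified, hypothetical stand-in for a histogram bucket.
type bucket struct {
	NumEq         float64 // rows equal to the bucket's upper bound
	NumRange      float64 // rows strictly inside the bucket's range
	DistinctRange float64 // distinct values strictly inside the range
}

// adjustSketch is a hypothetical illustration, not the real adjustCounts.
// It rescales the buckets so that
//   sum(NumEq) + sum(NumRange) = rowCountTotal
//   distinctCountEq + sum(DistinctRange) = distinctCountTotal
// and stays well defined when rowCountEq is zero.
func adjustSketch(buckets []bucket, rowCountTotal, distinctCountTotal float64) {
	// Empty table cases: this sketch simply leaves the buckets untouched.
	if rowCountTotal <= 0 || distinctCountTotal <= 0 {
		return
	}

	var rowCountEq, rowCountRange float64
	var distinctCountEq, distinctCountRange float64
	for _, b := range buckets {
		rowCountEq += b.NumEq
		rowCountRange += b.NumRange
		if b.NumEq > 0 {
			// Assumption: an upper bound with NumEq > 0 is one distinct value.
			distinctCountEq++
		}
		distinctCountRange += b.DistinctRange
	}

	// Scale by the combined count so rowCountEq = 0 causes no division by zero.
	if current := rowCountEq + rowCountRange; current > 0 {
		factor := rowCountTotal / current
		for i := range buckets {
			buckets[i].NumEq *= factor
			buckets[i].NumRange *= factor
		}
	}

	// Distribute the distinct values not covered by upper bounds over the ranges.
	if distinctCountRange > 0 && distinctCountTotal > distinctCountEq {
		factor := (distinctCountTotal - distinctCountEq) / distinctCountRange
		for i := range buckets {
			buckets[i].DistinctRange *= factor
		}
	}
}

func main() {
	// A predicted histogram in which every bucket has NumEq = 0.
	buckets := []bucket{
		{NumEq: 0, NumRange: 10, DistinctRange: 5},
		{NumEq: 0, NumRange: 30, DistinctRange: 15},
	}
	adjustSketch(buckets, 80, 40)

	var sumRange, sumDistinct float64
	for _, b := range buckets {
		sumRange += b.NumRange
		sumDistinct += b.DistinctRange
	}
	fmt.Println(sumRange, sumDistinct)
}
```

With rowCountEq = 0 and distinctCountEq = 0, the two scaling steps reduce to sum(NumRange) = rowCountTotal and sum(DistinctRange) = distinctCountTotal, which is the condition stated above.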

Also, add a few more comments to adjustCounts.

Assists: #79872

Release note: None

@michae2 michae2 requested review from rytaft, msirek and a team June 6, 2022 17:03
@cockroach-teamcity
Member

This change is Reviewable

Collaborator

@rytaft rytaft left a comment

:lgtm:

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @msirek)

Collaborator

@mgartner mgartner left a comment

LGTM

func (h *histogram) adjustCounts(evalCtx *eval.Context, rowCountTotal, distinctCountTotal float64) {
	// Empty table cases.
	if rowCountTotal <= 0 || distinctCountTotal <= 0 {
Collaborator

I'm guessing negative counts are possible with forecasting too? Maybe the commit message should mention that. And you could add test cases with negative counts if you'd like.

Collaborator Author

Good point, I added a test case.

The current PR normalizes negative counts to zero before calling adjustCounts, but you're absolutely right: handling negative counts was my intention with <=.
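
A rough sketch of the two layers of protection discussed in this thread, assuming hypothetical helper names (`clampCount` and `isEmptyTable` are not functions from the patch): the caller clamps a forecasted count to zero, and the `<=` guard would still treat a negative count as the empty-table case if one slipped through.

```go
package main

import "fmt"

// clampCount is a hypothetical helper that normalizes a forecasted count,
// which may be negative, to zero before it is passed along.
func clampCount(c float64) float64 {
	if c < 0 {
		return 0
	}
	return c
}

// isEmptyTable mirrors the guard quoted in the review: using <= rather than
// == means negative totals are also treated as the empty-table case.
func isEmptyTable(rowCountTotal, distinctCountTotal float64) bool {
	return rowCountTotal <= 0 || distinctCountTotal <= 0
}

func main() {
	forecastRows, forecastDistinct := -12.5, 4.0

	// Even without clamping, the <= guard catches the negative total.
	fmt.Println(isEmptyTable(forecastRows, forecastDistinct))

	// With clamping, the negative forecast becomes an ordinary zero.
	fmt.Println(clampCount(forecastRows), clampCount(forecastDistinct))
}
```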

Contributor

@msirek msirek left a comment

:lgtm:

Reviewed all commit messages.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @michae2)

@michae2
Collaborator Author

michae2 commented Jun 7, 2022

TFTRs!

bors r=rytaft,mgartner,msirek

@craig
Contributor

craig bot commented Jun 7, 2022

Build succeeded:

@craig craig bot merged commit 8344e69 into cockroachdb:master Jun 7, 2022
@blathers-crl

blathers-crl bot commented Jun 7, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 3f2da1a to blathers/backport-release-22.1-82474: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
