Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

colexec: fix external aggregator fallback and bool agg functions reset #59055

Merged
merged 1 commit into from
Jan 19, 2021

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Jan 15, 2021

Previously, reset method of the ordered aggregator would always set
the flag to reset the internal batch to true. However, that batch is
only allocated when Next is called at least once with a non-zero batch
coming from the input, which is not the case when the fallback to the
disk-backed strategy occurs in the external aggregator (there, we call
reset before we use the operator every time). This would lead to
a nil pointer exception, and it is now fixed.

Our unit tests didn't catch it because we forgot to set the forced
number of repartitions which is now also fixed.

This also revealed a bug with resetting of bool_and and bool_or
aggregates - we forgot to reset whether they have seen a non-null value
or not.

Fixes: #59043.

Release note: None (no stable release with these bugs)

Previously, `reset` method of the ordered aggregator would always set
the flag to reset the internal batch to `true`. However, that batch is
only allocated when `Next` is called at least once with a non-zero batch
coming from the input, which is not the case when the fallback to the
disk-backed strategy occurs in the external aggregator (there, we call
`reset` before we used the operator every time). This would lead to
a nil pointer exception, and it is now fixed.

Our unit tests didn't catch it because we forgot to set the forced
number of repartitions which is now also fixed.

This also revealed a bug with resetting of `bool_and` and `bool_or`
aggregates - we forgot to reset whether they have seen a non-null value
or not.

Release note: None (no stable release with these bugs)
@yuzefovich yuzefovich requested review from asubiotto and a team January 15, 2021 17:50
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, could you add a small regression test for the bool_{and,or} bug?

Reviewed 8 of 8 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@yuzefovich
Copy link
Member Author

Note the bool_* bug was found by already existing unit tests (once maxForcedRepartitions is set). Or do you mean adding a small unit test that verifies specifically the contents of the aggregates after reset? That seems like an overkill.

@asubiotto
Copy link
Contributor

But those tests test mainly correctness in the presence of disk spilling, right? What I mean is maybe a logic test or something simple that would trigger the bool_ bug specifically so we don't have to rely on the side effects of the disk spilling tests to inform us of a regression.

@yuzefovich
Copy link
Member Author

As discussed in standup, AggregateFunc.Reset method is currently only used by the ordered aggregator when it is a part of the fallback strategy in the external aggregator, so the existing coverage seems reasonable.

@yuzefovich
Copy link
Member Author

I'll probably violate some rule by not waiting for an LGTM on this PR, but I had a verbal approval from Alfonso during the standup, and this PR fixes the last known release blocker from our team, so I'll go ahead and merge as is. If anything comes up, we can fix it up later.

TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Jan 19, 2021

Build succeeded:

@craig craig bot merged commit 47eb9f3 into cockroachdb:master Jan 19, 2021
@yuzefovich yuzefovich deleted the fix-agg-fallback branch January 19, 2021 22:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

roachtest: tpchvec/disk failed
3 participants