-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
colbuilder: clean up type schema handling #107324
Conversation
e4f9b87
to
d25e123
Compare
Release note: None
786a973
to
a4da044
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking this one on! It seems sound. If one of the places where the types are passed (like MakeHashJoinerSpec
) modified the slice, we'd see corruption, but I think we consider the types slice immutable anyway.
Reviewed 1 of 1 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @yuzefovich)
-- commits
line 45 at r2:
Do you think we should add a release note? Customers have observed this behavior once or twice.
This commit refactors how we're keeping track of the current type schema of the operators in `NewColOperator`. Previously, we would create a new type slice for each operator due to "type schema corruption" bugs we observed (cockroachdb#47889). We fixed that bug by being extremely conservative, and this commit applies a different more reasonable fix. In particular, it is safe to append to the current type slice we have in scope, and we only need to be careful when we're trying to create a "projection" (i.e. when we need to change the order of types or modify one type in-place). Thus, this commit switches to making a copy only in those scenarios which should happen at most once per processor spec (previously, it could happen thousands of times for elaborate render expressions). Furthermore, this commit reuses the same type slice from `InputSyncSpec` since creation of the operators occurs _after_ the spec has been communicated across the wire (or locally), so we're free to use it as we please. ``` name old time/op new time/op delta NestedAndPlanning/renders=16-24 627µs ± 1% 624µs ± 2% ~ (p=0.143 n=10+10) NestedAndPlanning/renders=256-24 3.54ms ± 0% 3.04ms ± 1% -14.14% (p=0.000 n=9+10) NestedAndPlanning/renders=4096-24 211ms ± 4% 68ms ± 1% -67.61% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NestedAndPlanning/renders=16-24 74.0kB ±20% 68.9kB ±10% ~ (p=0.053 n=10+9) NestedAndPlanning/renders=256-24 1.71MB ± 0% 0.60MB ± 0% -65.07% (p=0.000 n=8+8) NestedAndPlanning/renders=4096-24 303MB ± 0% 13MB ± 1% -95.58% (p=0.000 n=8+8) name old allocs/op new allocs/op delta NestedAndPlanning/renders=16-24 754 ±18% 733 ±18% ~ (p=0.105 n=9+9) NestedAndPlanning/renders=256-24 6.44k ± 0% 5.93k ± 0% -7.88% (p=0.000 n=8+8) NestedAndPlanning/renders=4096-24 146k ± 6% 136k ± 0% -7.02% (p=0.000 n=8+8) ``` Release note (bug fix): Previously, CockroachDB when planning expressions containing many sub-expressions (e.g. deeply-nested AND / OR structures) would use memory quadratical in the number of sub-expressions, and in the worst cases (thousands of sub-expressions) this could lead to OOMs. The bug has been present since at least 22.1 and has now been fixed.
a4da044
to
85fd4fb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I too believe this should be safe since the callers have no reason to modify the passed-in slices. The type management seems to be centralized in execplan.go
, and we only have a handful of operators that reorder / recreate the output type schema completely, and in those cases we do make a copy. I think it should be safe to backport too, after some baking period on master.
TFTR!
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @mgartner)
Previously, DrewKimball (Drew Kimball) wrote…
Do you think we should add a release note? Customers have observed this behavior once or twice.
Done.
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 85fd4fb to blathers/backport-release-22.2-107324: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Nice improvement! |
This commit refactors how we're keeping track of the current type schema of the operators in
NewColOperator
. Previously, we would create a new type slice for each operator due to "type schema corruption" bugs we observed (#47889). We fixed that bug by being extremely conservative, and this commit applies a different more reasonable fix.In particular, it is safe to append to the current type slice we have in scope, and we only need to be careful when we're trying to create a "projection" (i.e. when we need to change the order of types or modify one type in-place). Thus, this commit switches to making a copy only in those scenarios which should happen at most once per processor spec (previously, it could happen thousands of times for elaborate render expressions).
Furthermore, this commit reuses the same type slice from
InputSyncSpec
since creation of the operators occurs after the spec has been communicated across the wire (or locally), so we're free to use it as we please.Fixes: #104996.
Release note (bug fix): Previously, CockroachDB when planning expressions containing many sub-expressions (e.g. deeply-nested AND / OR structures) would use memory quadratical in the number of sub-expressions, and in the worst cases (thousands of sub-expressions) this could lead to OOMs. The bug has been present since at least 22.1 and has now been fixed.