sql: audit all processors to make their closure bullet-proof #91969
Nice.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach and @yuzefovich)
pkg/sql/execinfra/processorsbase.go
line 361 at r1 (raw file):
// NOTE: if StartInternal() hasn't been called, this will be nil, so
// consider using EnsureCtx() instead.
Ctx context.Context
could we unexport this to enforce calling EnsureCtx?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach and @rytaft)
pkg/sql/execinfra/processorsbase.go
line 361 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
could we unexport this to enforce calling EnsureCtx?
We could, and it would fix this nil pointer error for good, but for some reason I'm slightly hesitant about making such a change - like it "feels" wrong to me, not sure why :) Probably because it would "pollute" the code a bit when accessing the context when we know that it is non-nil - which is pretty much in all places except for "closing", and the problem in the "closing" scenario is only present when the row-by-row processors are wrapped into the vectorized flows (due to different interfaces used in both engines).
I don't have a strong opinion though, so if you think it's worth it, I'd be fine with making such a change.
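Previously, yuzefovich (Yahor Yuzefovich) wrote…
I also don't feel too strongly, but if you make this change it might feel less polluting if you change the function name from `EnsureCtx()` to just `Ctx()`, since it will be the only way to access the `ctx`. But I defer to you to decide if you think that makes sense.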
Force-pushed from 3360813 to 3877149.
Is it still a long term goal to not store the context at all? Just curious, is the reason we haven't done that because of the code churn required or something else?
Reviewed 22 of 22 files at r1, 4 of 28 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rytaft and @yuzefovich)
Force-pushed from 3877149 to 165a023.
No, I don't think that we want to remove the context from the processors or from the operators (there is more context in this comment). Where I think we ought to remove the context from is `eval.Context`, because that object doesn't have a clear lifetime and is being passed around many different layers - processors and operators don't have such issues.
Could someone take another look? I replaced all `Ctx` accesses with `Ctx()` calls. Most changes were mechanical; the only interesting ones were in `columnarizer.go` and `processors_test.go`.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach and @rytaft)
pkg/sql/execinfra/processorsbase.go
line 361 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
I also don't feel too strongly, but if you make this change it might feel less polluting if you change the function name from `EnsureCtx()` to just `Ctx()`, since it will be the only way to access the `ctx`. But I defer to you to decide if you think that makes sense.
Ok, I decided to implement this suggestion.
Reviewed 1 of 22 files at r1, 28 of 28 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)
TFTRs!
bors r+
Build failed (retrying...):
Needs a rebase.
bors r-
Canceled.
Force-pushed from 165a023 to 19f3386.
bors r+
Build succeeded:
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 19f3386 to blathers/backport-release-22.1-91969: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 22.1.x failed. See errors above.
error creating merge commit from 19f3386 to blathers/backport-release-22.2-91969: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 22.2.x failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
This commit replaces all usages of `ProcessorBaseNoHelper.Ctx` field with a call to the newly-introduced `Ctx()` method which returns a background context if the processor hasn't been started. This change makes it so that all processors now respect the contract of `colexecop.Closer` interface which says that `Close` must be safe to call even if `Init` hasn't been performed (in the context of processors this means that `Columnarizer.Init` wasn't called meaning that `Processor.Start` wasn't either).
Initially, I attempted to fix this in #91446 by putting the protection into the columnarizer, but that led to broken assumptions since we wouldn't close all closers that we expected to (in particular, the materializer that is the input to the wrapped row-by-row processor wouldn't be closed). This commit takes a different approach and should fix the issue for good without introducing any flakiness.
As a result, this commit fixes a rarely hit issue when the aggregator and the zigzag joiner attempt to log when they are closed if they haven't been started (that we see occasionally from sentry). The issue is quite rare though, so no release note seems appropriate.
Fixes: #84902.
Fixes: #91845.
Release note: None
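The closing contract described in this PR can be illustrated with a small sketch. It assumes a reduced, hypothetical `closer` interface and made-up `inputCloser` and `wrappedProcessor` types standing in for the materializer and the wrapped row-by-row processor; the real interfaces and types in the codebase differ. The point is only that `Close` stays safe without `Start` and still closes the input, rather than skipping it.

```go
package main

import (
	"context"
	"fmt"
)

// closer is a hypothetical, reduced form of the closing contract: Close must
// be safe to call even if the component was never initialized.
type closer interface {
	Close(ctx context.Context) error
}

// inputCloser stands in for the materializer that feeds the wrapped processor.
type inputCloser struct{ closed bool }

func (c *inputCloser) Close(context.Context) error {
	c.closed = true
	return nil
}

// wrappedProcessor stands in for a row-by-row processor wrapped into a
// vectorized flow.
type wrappedProcessor struct {
	ctx    context.Context // nil until Start is called
	input  closer
	closed bool
}

func (p *wrappedProcessor) Start(ctx context.Context) { p.ctx = ctx }

// Ctx returns a background context if the processor was never started.
func (p *wrappedProcessor) Ctx() context.Context {
	if p.ctx == nil {
		return context.Background()
	}
	return p.ctx
}

// Close is idempotent and never skips the input, whether or not Start ran.
func (p *wrappedProcessor) Close() error {
	if p.closed {
		return nil
	}
	p.closed = true
	return p.input.Close(p.Ctx())
}

func main() {
	in := &inputCloser{}
	proc := &wrappedProcessor{input: in}
	// Start is intentionally not called before Close.
	err := proc.Close()
	fmt.Println(err == nil, in.closed)
}
```

The design choice sketched here is to make the context access itself safe instead of guarding the `Close` call, so that no closer in the chain is left open.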