-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
colexec: clean up meta components handling #62221
Conversation
b8b1c89
to
6dd24eb
Compare
6dd24eb
to
549bfc4
Compare
ab83dec
to
ac784f3
Compare
I think this is RFAL. A more detailed motivation behind this change is here, but the gist of the change is introducing the following struct
to tightly couple stats collectors and metadata sources which simplifies the tracking of which components belong to which input tree. Closers are added because why not. The only component that doesn't use this new struct is an outbox because it requires special handling of stats collection in order to emit the flow-level stats on the last outbox on a remote node. |
8eca6e2
to
c8c20c2
Compare
c8c20c2
to
237f8b8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but there is a lot here I'm not familiar with. Maybe @jordanlewis can take a look as well.
Reviewed 17 of 41 files at r2, 22 of 29 files at r3, 24 of 25 files at r4, 2 of 2 files at r5.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @rytaft and @yuzefovich)
pkg/sql/colexec/columnarizer.go, line 48 at r3 (raw file):
// chunk into a coldata.Batch column by column. type Columnarizer struct { execinfra.ProcessorBase
Unrelated, but I find it strange that these are embedding ProcessorBase (given that they're not processors). It may pay off to think about what functionality exactly do they need and if we can extract that out into something smaller.
pkg/sql/colexec/columnarizer.go, line 159 at r3 (raw file):
c.input.Start(c.ctx) c.initStatus = colexecop.OperatorInitialized if pb, ok := c.input.(execinfra.ExecStatsForTraceHijacker); ok {
[nit] I know that pb
here is from "processor base", but it's kind of cryptic
pkg/sql/colexec/materializer.go, line 123 at r1 (raw file):
if len(d.statsCollectors) > 0 { // If statsCollectors is non-nil, then the drainHelper is responsible // for attaching the execution statistics to the span, yet we don't get
I have trouble understanding this comment. What does "we don't get the recording from the span" mean?
pkg/sql/colflow/stats.go, line 243 at r5 (raw file):
} func maybeAddStatsInvariantChecker(op *colexecargs.OpWithMetaInfo) {
[nit] could use a short comment
pkg/sql/rowexec/aggregator.go, line 157 at r3 (raw file):
// interface. func (ag *aggregatorBase) HijackExecStatsForTrace() func() *execinfrapb.ComponentStats { return ag.ProcessorBase.HijackExecStatsForTrace()
Why do we need these wrappers, don't we inherit the function from ProcessorBase anyway?
56376ae
to
2aa4dca
Compare
406d02e
to
6ddba03
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review! @jordanlewis could you take a look from higher level perspective?
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde and @rytaft)
pkg/sql/colexec/columnarizer.go, line 48 at r3 (raw file):
Previously, RaduBerinde wrote…
Unrelated, but I find it strange that these are embedding ProcessorBase (given that they're not processors). It may pay off to think about what functionality exactly do they need and if we can extract that out into something smaller.
Fair point, left a TODO to think about it.
pkg/sql/colexec/materializer.go, line 123 at r1 (raw file):
Previously, RaduBerinde wrote…
I have trouble understanding this comment. What does "we don't get the recording from the span" mean?
Updated the comment, let me know if it makes sense.
pkg/sql/rowexec/aggregator.go, line 157 at r3 (raw file):
Previously, RaduBerinde wrote…
Why do we need these wrappers, don't we inherit the function from ProcessorBase anyway?
Hm, when I was writing this code originally, I ran into some issues (namely, a processor (hidden behind RowSource
interface) couldn't be converted to ExecStatsForTraceHijacker
- I assumed it was because all processors embed ProcessorBase
by value, by the method implementation must have a pointer receiver so that the ProcessorBase is updated accordingly when execStats func is hijacked. As a result, I was running in incomplete EXPLAIN ANALYZE diagrams.
However, now I removed these wrappers, and things seem to be fine. Not sure what changed, but I've removed them.
6ddba03
to
5842b70
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde, @rytaft, and @yuzefovich)
pkg/sql/rowexec/aggregator.go, line 157 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Hm, when I was writing this code originally, I ran into some issues (namely, a processor (hidden behind
RowSource
interface) couldn't be converted toExecStatsForTraceHijacker
- I assumed it was because all processors embedProcessorBase
by value, by the method implementation must have a pointer receiver so that the ProcessorBase is updated accordingly when execStats func is hijacked. As a result, I was running in incomplete EXPLAIN ANALYZE diagrams.However, now I removed these wrappers, and things seem to be fine. Not sure what changed, but I've removed them.
If A embeds Base, *A implements any interface that *Base implements (and A does not) https://play.golang.org/p/l8THFK7DW-y
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jordanlewis please give this at least a high level review.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde and @rytaft)
pkg/sql/rowexec/aggregator.go, line 157 at r3 (raw file):
Previously, RaduBerinde wrote…
If A embeds Base, *A implements any interface that *Base implements (and A does not) https://play.golang.org/p/l8THFK7DW-y
Makes sense, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 29 files at r3, 64 of 66 files at r6, 41 of 41 files at r7, 29 of 30 files at r8, 40 of 40 files at r9, 17 of 17 files at r10.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde, @rytaft, and @yuzefovich)
pkg/sql/colexecop/operator.go, line 347 at r8 (raw file):
} // VectorizedStatsCollector is the common interface implemented by collectors.
nit: it is not clear at least to me what "Collectors" are from this context, a better comment would help. Also perhaps some commentary on GetStats - is it the stats just for the operator, or for its children, etc?
pkg/sql/colflow/stats.go, line 247 at r10 (raw file):
// specified. See comment on statsInvariantChecker for the kind of invariants // checked. func maybeAddStatsInvariantChecker(op *colexecargs.OpWithMetaInfo) {
Nice, I like this idea
This commit is mostly a mechanical change which replaces `getStats` functions with using the stats collectors directly for the root materializer and the hash routers. The outbox still expects a callback because of the need to append flow-level stats on the last outbox. Additionally, the drain helper of the materializer has been updated so that the stats are retrieved before the metadata sources are drained. This will be an invariant enforced by the checkers in the follow-up commits. Release note: None
This commit introduces a utility struct `OpWithMetaInfo` that has a root operator as well as the metadata sources and closers present in the tree handling of which (draining and closing, respectively) hasn't been claimed yet. The struct will be the same (and will replace) `opDAGWithMetaSources` and `SynchronizerInput` by the follow-up commit. The motivation behind this change is that we want to attribute stats collectors, metadata sources, and - why not - closers to the tree of operators in which they are present. Release note: None
Previously, if we had wrapped row-execution processors in the vectorized flow, during the stats collection two `ComponentStats` objects would get emitted for each wrapped processor - one by the processor itself and another by the vectorized stats collector that is wrapping the columnarizer. Both stats objects would have the same ComponentID, so they would get Unioned. This commit cleans up the situation by making the columnarizer "hijack" the `ExecStatsForTrace` function from the wrapped processor. This required a bit of plumbing, namely, we need to: - track the "root" columnarizer separately in `NewColOperator` call so that later the vectorized stats collectors could augment the ComponentStats object created by the wrapped processor with additional information - provide an interface for the columnarizer to hijack the stats function. Release note: None
This commit plumbs the stats collectors into `OpWithMetaInfo` object which allows us to get rid of separate `opDAGWithMetaSources` and `SynchronizerInput` structs as well as more clealy define the ownership of the stats collectors. Now all components can drain their input trees are also responsible for collecting the stats, if applicable, before draining. This invariant will be enforced by the follow-up commit. Notable changes are to the synchronizers - now all of them (ordered, serial unordered, parallel unordered) will be collecting stats. This required creating a tracing span on the first call to `Next` so that we had a way to propagate stats. That a-bit-hacky logic will be removed once the `Operator` interface is updated to take in the context object in `Init`. Release note: None
This commit introduces a dummy vectorized stats collector that is also a metadata source that simply checks that `GetStats` has been called before `DrainMeta` and returns an error as a metadata if not. This stats invariants checker is now planned whenever `util.CrdbTestBuild` is true (which is when `TAGS=crdb_test` was specified). Release note: None
5842b70
to
ec2c445
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @jordanlewis, @RaduBerinde, and @rytaft)
pkg/sql/colexecop/operator.go, line 347 at r8 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
nit: it is not clear at least to me what "Collectors" are from this context, a better comment would help. Also perhaps some commentary on GetStats - is it the stats just for the operator, or for its children, etc?
Done.
pkg/sql/colflow/stats.go, line 247 at r10 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Nice, I like this idea
:)
Build succeeded: |
This commit simplifies the way we track all of the `colexecop.Closer`s we need to close. Previously, we would track them using `OpWithMetaInfo` and then many operators would be responsible for closing the components in their input tree. This commit makes it so that we close them during the flow cleanup. This is ok because we know that all concurrent goroutines will have exited by the time `Cleanup` is called since we do so only after `Flow.Wait` returns (which blocks). The decision to put closers into `OpWithMetaInfo` was made in cockroachdb#62221 with justification "why not" that I provided. Now I have the answer to why we should not do it. Unlike the stats collection and metadata draining (which - at least currently - must happen in the "owner" goroutines), for closers we have a convenient place to close them at the flow level. The contract is such that the closure must occur eventually, but it doesn't matter much in which goroutine it's done (as long as there is no race) and whether it is a bit delayed. (The only downside I see is that the tracing spans are finished with a delay in comparison to when the relevant operations are actually done, but this has already been pretty bad, and this commit makes things only slightly worse. This "delayed" finish shows up as "over-extended" span when viewing traces via jaeger.) As a result of this refactor, the assertion that all closers are closed seems redundant - we'd effectively be asserting only that a single method is called, thus the assertion is removed. This commit also allowed to remove some of the complexity around `ExplainVec` implementation (we no longer need to tie the cleanup to closing of the corresponding planNode). Release note: None
This commit simplifies the way we track all of the `colexecop.Closer`s we need to close. Previously, we would track them using `OpWithMetaInfo` and then many operators would be responsible for closing the components in their input tree. This commit makes it so that we close them during the flow cleanup. This is ok because we know that all concurrent goroutines will have exited by the time `Cleanup` is called since we do so only after `Flow.Wait` returns (which blocks). The decision to put closers into `OpWithMetaInfo` was made in cockroachdb#62221 with justification "why not" that I provided. Now I have the answer to why we should not do it. Unlike the stats collection and metadata draining (which - at least currently - must happen in the "owner" goroutines), for closers we have a convenient place to close them at the flow level. The contract is such that the closure must occur eventually, but it doesn't matter much in which goroutine it's done (as long as there is no race) and whether it is a bit delayed. (The only downside I see is that the tracing spans are finished with a delay in comparison to when the relevant operations are actually done, but this has already been pretty bad, and this commit makes things only slightly worse. This "delayed" finish shows up as "over-extended" span when viewing traces via jaeger.) As a result of this refactor, the assertion that all closers are closed seems redundant - we'd effectively be asserting only that a single method is called, thus the assertion is removed. This commit also allowed to remove some of the complexity around `ExplainVec` implementation (we no longer need to tie the cleanup to closing of the corresponding planNode). Release note: None
91971: colflow: track closers by the vectorized flow r=yuzefovich a=yuzefovich This commit simplifies the way we track all of the `colexecop.Closer`s we need to close. Previously, we would track them using `OpWithMetaInfo` and then many operators would be responsible for closing the components in their input tree. This commit makes it so that we close them during the flow cleanup. This is ok because we know that all concurrent goroutines will have exited by the time `Cleanup` is called since we do so only after `Flow.Wait` returns (which blocks). The decision to put closers into `OpWithMetaInfo` was made in #62221 with justification "why not" that I provided. Now I have the answer to why we should not do it. Unlike the stats collection and metadata draining (which - at least currently - must happen in the "owner" goroutines), for closers we have a convenient place to close them at the flow level. The contract is such that the closure must occur eventually, but it doesn't matter much in which goroutine it's done (as long as there is no race) and whether it is a bit delayed. (The only downside I see is that the tracing spans are finished with a delay in comparison to when the relevant operations are actually done, but this has already been pretty bad, and this commit makes things only slightly worse. This "delayed" finish shows up as "over-extended" span when viewing traces via jaeger.) As a result of this refactor, the assertion that all closers are closed seems redundant - we'd effectively be asserting only that a single method is called, thus the assertion is removed. This commit also allowed to remove some of the complexity around `ExplainVec` implementation (we no longer need to tie the cleanup to closing of the corresponding planNode). Epic: None Release note: None 93077: opt: fix optbuilder partial index comment r=mgartner a=mgartner Epic: None Release note: None 93175: kv: use correct sequence number when scanning for conflicting intents r=nvanbenschoten a=arulajmani A read only request scans the lock table before it can proceed with dropping latches. It can only evaluate if no conflicting intents are found. While doing so, it also determines if the MVCC scan evaluation needs to consult intent history (by using the interleaved iterator). The MVCC scan evaluation needs to consult intent history if we discover an intent by the transaction performing the read operation at a higher sequence number or a higher timestamp. The correct sequence numbers to compare here are those on the `BatchRequest`, and not on the transaction. Before this patch, we were using the sequence number on the transaction, which could lead us to wrongly conclude that the use of an intent interleaving iterator wasn't required. Specifically, if the batch of the following construction was retried on the server: ``` b.Scan(a, e) b.Put(b, "value") ``` The scan would end up (erroneously) reading "value" at key b. As part of this patch, I've also renamed `ScanConflictingIntents` to `ScanConflictingIntentsForDroppingLatchesEarly` -- the function isn't as generalized as the old name would suggest. Closes #92217 Closes #92189 Release note: None 93225: multitenant: check query rows result in admin function tests r=knz a=ecwall Informs CRDB-16746 Some admin functions return error rows in the response instead of an error itself. Update related tests to check these rows. Release note: None 93263: docs issue generation: Update TC configuration to use new parameters r=nickvigilante a=nickvigilante Part of #72910 Release note: None Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Arul Ajmani <[email protected]> Co-authored-by: Evan Wall <[email protected]> Co-authored-by: Nick Vigilante <[email protected]>
colexec: plumb statsCollectors directly in most cases
This commit is mostly a mechanical change which replaces
getStats
functions with using the stats collectors directly for the root
materializer and the hash routers. The outbox still expects a callback
because of the need to append flow-level stats on the last outbox.
Additionally, the drain helper of the materializer has been updated so
that the stats are retrieved before the metadata sources are drained.
This will be an invariant enforced by the checkers in the follow-up
commits.
Release note: None
colexecargs: tightly couple root op with metadata sources and closers
This commit introduces a utility struct
OpWithMetaInfo
that has a rootoperator as well as the metadata sources and closers present in the tree
handling of which (draining and closing, respectively) hasn't been
claimed yet. The struct will be the same (and will replace)
opDAGWithMetaSources
andSynchronizerInput
by the follow-up commit.The motivation behind this change is that we want to attribute stats
collectors, metadata sources, and - why not - closers to the tree of
operators in which they are present.
Release note: None
colexec: clean up stats collection from wrapped processors
Previously, if we had wrapped row-execution processors in the vectorized
flow, during the stats collection two
ComponentStats
objects would getemitted for each wrapped processor - one by the processor itself and
another by the vectorized stats collector that is wrapping the
columnarizer. Both stats objects would have the same ComponentID, so
they would get Unioned. This commit cleans up the situation by making
the columnarizer "hijack" the
ExecStatsForTrace
function from thewrapped processor.
This required a bit of plumbing, namely, we need to:
NewColOperator
call sothat later the vectorized stats collectors could augment the
ComponentStats object created by the wrapped processor with additional
information
function.
Release note: None
colexec: finish stats collection refactor
This commit plumbs the stats collectors into
OpWithMetaInfo
objectwhich allows us to get rid of separate
opDAGWithMetaSources
andSynchronizerInput
structs as well as more clealy define the ownershipof the stats collectors. Now all components can drain their input trees
are also responsible for collecting the stats, if applicable, before
draining. This invariant will be enforced by the follow-up commit.
Notable changes are to the synchronizers - now all of them (ordered,
serial unordered, parallel unordered) will be collecting stats. This
required creating a tracing span on the first call to
Next
so that wehad a way to propagate stats. That a-bit-hacky logic will be removed
once the
Operator
interface is updated to take in the context objectin
Init
.Release note: None
colflow: enforce stats collection before drain meta
This commit introduces a dummy vectorized stats collector that is also
a metadata source that simply checks that
GetStats
has been calledbefore
DrainMeta
and returns an error as a metadata if not. This statsinvariants checker is now planned whenever
util.CrdbTestBuild
is true(which is when
TAGS=crdb_test
was specified).Release note: None