-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: add nodes for each EXPLAIN ANALYZE operator #60550
Conversation
CC @awoods187 |
e82b712
to
6cdaeda
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note on the implementation: I started by trying to make execution set
ComponentID.NodeID in all cases, but I got stuck in ProcessorBase
where we only have a SQLIDContainer available. I don't fully
understand the new abstraction and whether the distsql components and
flows should really use SQLIDs instead of NodeIDs.
SQLIDContainer is just the node ID and you can treat it as such (by casting). The end goal is to remove all notion of NodeIDs and just have sql instance IDs (#49596).
Reviewed 10 of 10 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained
I see. I tried working on it some more but the change is getting pretty big, we need to plumb the node ID in other places (e.g. diagram data). I think I'd rather merge this as-is for now. |
This looks great and I'll take it in explain analyze for now. We can add to explain in the future as you said. |
LGTM |
6cdaeda
to
e0fa95c
Compare
Actually, I think I figured out the NodeID stuff, PTAL. |
e0fa95c
to
1a74e20
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 14 of 14 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)
pkg/sql/execinfra/flow_context.go, line 120 at r2 (raw file):
// The stream must originate from the node associated with this FlowCtx. func (ctx *FlowCtx) StreamComponentID(streamID execinfrapb.StreamID) execinfrapb.ComponentID { // TODO(radu): the component stats should store SQLInstanceID instead.
nit: Could we do the inverse and create a sql instance ID from places where we have a nodeID?
pkg/sql/flowinfra/outbox_test.go, line 570 at r2 (raw file):
NodeDialer: nodedialer.New(clientRPC, staticAddressResolver(addr)), }, NodeID: base.NewSQLIDContainer(1, nil /* nodeID */),
nit: maybe make a TestingSQLIDContainer
function with no args that returns this to avoid repeating these calls
Show the cluster nodes involved in the execution of each operator. Note that this info does not show up in the non-analyze EXPLAIN. It is technically much more challenging to do that because of the indirect way we do distsql planning. Once we have the new DistSQL exec factory, we will be able to add it. To implement this, we make execution set `ComponentID.NodeID` in all cases. Unfortunately, there is not much we can do to test this currently (other than manual testing). I will investigate making the "deterministic" flag more fine-grained, so that we can hide truly non-deterministic values (like timings) separately from those that just vary with the configuration (query distribution). Example: ``` movr> EXPLAIN ANALYZE SELECT * FROM rides JOIN vehicles ON rides.vehicle_id = vehicles.id; info -------------------------------------------- planning time: 158µs execution time: 7ms distribution: full vectorized: true hash join │ cluster nodes: n1, n2, n3 │ actual row count: 500 │ equality: (vehicle_id) = (id) │ ├── scan │ cluster nodes: n1, n2, n3 │ actual row count: 500 │ KV rows read: 500 │ KV bytes read: 86 KiB │ missing stats │ table: rides@primary │ spans: FULL SCAN │ └── scan cluster nodes: n1, n2, n3 actual row count: 15 KV rows read: 15 KV bytes read: 2.3 KiB missing stats table: vehicles@primary spans: FULL SCAN ``` Release note (sql change): EXPLAIN ANALYZE now includes the nodes which were involved in the execution of each operator in the tree.
1a74e20
to
a24fa86
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto)
pkg/sql/execinfra/flow_context.go, line 120 at r2 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
nit: Could we do the inverse and create a sql instance ID from places where we have a nodeID?
Sorry, I'm not sure I follow.. The TODO is saying that ComponentID should store SQLInstanceID, so that we would need to obtain or pass that instead of NodeID; this does mean that in some places, at least until all the SQLInstanceID plumbing is done, we'd need to create the sql instance ID from the node ID.
pkg/sql/flowinfra/outbox_test.go, line 570 at r2 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
nit: maybe make a
TestingSQLIDContainer
function with no args that returns this to avoid repeating these calls
Done, apparently there was already one defined :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto and @RaduBerinde)
pkg/sql/execinfra/flow_context.go, line 120 at r2 (raw file):
Previously, RaduBerinde wrote…
Sorry, I'm not sure I follow.. The TODO is saying that ComponentID should store SQLInstanceID, so that we would need to obtain or pass that instead of NodeID; this does mean that in some places, at least until all the SQLInstanceID plumbing is done, we'd need to create the sql instance ID from the node ID.
Sorry the thought wasn't complete. What I meant is that we could already make ComponentID
store SQLInstanceID
and instead of converting instance ids to node ids like we do here, convert node ids to instance ids in the other sites. Just a thought to make the future plumbing easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto)
pkg/sql/execinfra/flow_context.go, line 120 at r2 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Sorry the thought wasn't complete. What I meant is that we could already make
ComponentID
storeSQLInstanceID
and instead of converting instance ids to node ids like we do here, convert node ids to instance ids in the other sites. Just a thought to make the future plumbing easier.
Oh, that's exactly what the TODO said. Done (as a separate commit).
a39695d
to
607a6ef
Compare
Switch to using SQLInstanceID instead of NodeID in ComponentStats. Eventually, the same change should happen in the exec protos (e.g. StreamEndpointSpec). Release note: None
607a6ef
to
878bf45
Compare
bors r+ |
Build failed (retrying...): |
Build succeeded: |
Show the cluster nodes involved in the execution of each operator.
Note that this info does not show up in the non-analyze EXPLAIN. It is
technically much more challenging to do that because of the indirect
way we do distsql planning. Once we have the new DistSQL exec factory,
we will be able to add it.
Note on the implementation: I started by trying to make execution set
ComponentID.NodeID
in all cases, but I got stuck inProcessorBase
where we only have a
SQLIDContainer
available. I don't fullyunderstand the new abstraction and whether the distsql components and
flows should really use SQLIDs instead of NodeIDs.
Unfortunately, there is not much we can do to test this currently
(other than manual testing). I will investigate making the
"deterministic" flag more fine-grained, so that we can hide truly
non-deterministic values (like timings) separately from those that
just vary with the configuration (query distribution).
Example:
Release note (sql change): EXPLAIN ANALYZE now includes the nodes
which were involved in the execution of each operator in the tree.
Informs #59860.