Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: close main query after all postqueries are executed #38866

Merged
merged 1 commit into from
Jul 17, 2019

Conversation

yuzefovich
Copy link
Member

Previously, we were closing the main query plan nodes right after
it was executed. However, some postqueries require access to the
objects belonging to the main query tree (for example, scan buffer
nodes in postqueries will be reading from the buffer nodes in the
main tree). Now this is fixed.

Release note: None

@yuzefovich yuzefovich requested review from jordanlewis, RaduBerinde and a team July 14, 2019 20:46
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @RaduBerinde, and @yuzefovich)


pkg/sql/distsql_running.go, line 303 at r1 (raw file):

	}

	return func() {

The caller could in principle change planCtx after this function but this function binds it as a pointer. If we want to restrict that it should be documented. Otherwise we should pull the if outside of the func and copy a pointer to curPlan


pkg/sql/distsql_running.go, line 307 at r1 (raw file):

		// flow.Cleanup, which closes memory accounts that expect to be emptied.
		if planCtx.planner != nil && !planCtx.ignoreClose {
			planCtx.planner.curPlan.execErr = recv.resultWriter.Err()

Won't the main query and postquery all write execErr (in whatever order we happen to clean up)?

It kind of feels like this is happening in the wrong place, maybe what this if does should happen once, not once per query/postquery


pkg/sql/distsql_running.go, line 308 at r1 (raw file):

		if planCtx.planner != nil && !planCtx.ignoreClose {
			planCtx.planner.curPlan.execErr = recv.resultWriter.Err()
			planCtx.planner.curPlan.close(ctx)

Should planTop.close() close the postqueries as well?

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @RaduBerinde)


pkg/sql/distsql_running.go, line 303 at r1 (raw file):

Previously, RaduBerinde wrote…

The caller could in principle change planCtx after this function but this function binds it as a pointer. If we want to restrict that it should be documented. Otherwise we should pull the if outside of the func and copy a pointer to curPlan

Hm, I didn't consider this, fixed, thanks!


pkg/sql/distsql_running.go, line 307 at r1 (raw file):

Previously, RaduBerinde wrote…

Won't the main query and postquery all write execErr (in whatever order we happen to clean up)?

It kind of feels like this is happening in the wrong place, maybe what this if does should happen once, not once per query/postquery

This if will only be true for the main query (because ignoreClose is set to true for sub- and postqueries).

(Also, I looked into where this execErr is used, and it is only in one place when retrieving a query statement by fingerprint, and it is only checked whether it is non-nil, so - at least at the moment - it wouldn't actually matter if we overrode the error. But I agree, that this could have been problematic some time in the future if we were to use execErr somewhere else, so thanks for pointing this out!)


pkg/sql/distsql_running.go, line 308 at r1 (raw file):

Previously, RaduBerinde wrote…

Should planTop.close() close the postqueries as well?

Yeah, I think so, thanks.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @RaduBerinde, and @yuzefovich)


pkg/sql/distsql_running.go, line 307 at r1 (raw file):

Previously, yuzefovich wrote…

This if will only be true for the main query (because ignoreClose is set to true for sub- and postqueries).

(Also, I looked into where this execErr is used, and it is only in one place when retrieving a query statement by fingerprint, and it is only checked whether it is non-nil, so - at least at the moment - it wouldn't actually matter if we overrode the error. But I agree, that this could have been problematic some time in the future if we were to use execErr somewhere else, so thanks for pointing this out!)

Definitely feels like this would be more clean if we just moved it after the call to PlanAndRun (and got rid of ignoreClose). The curPlan is a higher-level object than what this function is logically dealing with.

In fact, I think a defer planCtx.planner.curPlan.close() belongs somewhere earlier in execWithDistSQLEngine (or even higher up in its caller).

Let's see what @jordanlewis thinks.


pkg/sql/distsql_running.go, line 311 at r2 (raw file):

			// before flow.Cleanup, which closes memory accounts that expect to be
			// emptied.
			plannerCopy.curPlan.execErr = recv.resultWriter.Err()

This does nothing now, it just modifies the copy.. I think we should do curPlan := &planCtx.planner.curPlan

@maddyblue
Copy link
Contributor

Did you consider guaranteeing that the returned func is never nil (i.e., returning func() {})? This would make life much easier for the callers since they can just blindly execute it.

@yuzefovich yuzefovich requested a review from a team July 16, 2019 20:54
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this suggestion, thanks, done.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @RaduBerinde)


pkg/sql/distsql_running.go, line 307 at r1 (raw file):

Previously, RaduBerinde wrote…

Definitely feels like this would be more clean if we just moved it after the call to PlanAndRun (and got rid of ignoreClose). The curPlan is a higher-level object than what this function is logically dealing with.

In fact, I think a defer planCtx.planner.curPlan.close() belongs somewhere earlier in execWithDistSQLEngine (or even higher up in its caller).

Let's see what @jordanlewis thinks.

I agree with you here, and I actually tried doing this briefly during revision 2, but it was a little complicated, so I cooked up this quick patch without changes to the logic. I think it should be ok to merge this as is so that it unblocks work on FKs, and I (or maybe Matt as part of his work on other things) will address it shortly.


pkg/sql/distsql_running.go, line 311 at r2 (raw file):

Previously, RaduBerinde wrote…

This does nothing now, it just modifies the copy.. I think we should do curPlan := &planCtx.planner.curPlan

Oops, thanks.

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it stylistically ok to call this cleanup function right away, i.e. as do a single line dsp.Run(planCtx, noTxn, &p, recv, &evalCtxCopy, finishedSetupFn)() instead of a double liner if we don't want to delay the clean up?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @RaduBerinde)

@@ -188,7 +188,8 @@ func distChangefeedFlow(

// Copy the evalCtx, as dsp.Run() might change it.
evalCtxCopy := *evalCtx
dsp.Run(planCtx, noTxn, &p, recv, &evalCtxCopy, finishedSetupFn)
cleanup := dsp.Run(planCtx, noTxn, &p, recv, &evalCtxCopy, finishedSetupFn)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dsp.Run(planCtx, noTxn, &p, recv, &evalCtxCopy, finishedSetupFn)() is even simpler

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wasn't sure if this is "stylistically acceptable" as I put it in the comment above (because it kind of hides that a function is returned and executed right away, at least to me it seems more complicated to read such code), but it does seem nicer overall. And since you pointed it out as well, let's go with calling the function right away.

Previously, we were closing the main query plan nodes right after
it was executed. However, some postqueries require access to the
objects belonging to the main query tree (for example, scan buffer
nodes in postqueries will be reading from the buffer nodes in the
main tree). Now this is fixed.

Release note: None
Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis, @mjibson, and @RaduBerinde)

@yuzefovich
Copy link
Member Author

Thanks for the reviews!

bors r+

craig bot pushed a commit that referenced this pull request Jul 17, 2019
38866: sql: close main query after all postqueries are executed r=yuzefovich a=yuzefovich

Previously, we were closing the main query plan nodes right after
it was executed. However, some postqueries require access to the
objects belonging to the main query tree (for example, scan buffer
nodes in postqueries will be reading from the buffer nodes in the
main tree). Now this is fixed.

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
@craig
Copy link
Contributor

craig bot commented Jul 17, 2019

Build succeeded

@craig craig bot merged commit 7e2c074 into cockroachdb:master Jul 17, 2019
@yuzefovich yuzefovich deleted the fix-pq branch July 19, 2019 03:56
Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @mjibson, and @RaduBerinde)


pkg/sql/distsql_running.go, line 307 at r1 (raw file):

Previously, yuzefovich wrote…

I agree with you here, and I actually tried doing this briefly during revision 2, but it was a little complicated, so I cooked up this quick patch without changes to the logic. I think it should be ok to merge this as is so that it unblocks work on FKs, and I (or maybe Matt as part of his work on other things) will address it shortly.

Can I kindly ask you to clean up what this thread is discussing? I'm confused about what the code around here, and I'm generally bothered by the cleanup return value from this function. I realize there were reasons why you've deferred things, but what greater pleasure is there in life than disentangling those reason?
Ideally a higher level would handle the closing of the plan nodes, and this method would handle the flow.Cleanup().
The time to separate memory accounts using planning and execution might be upon us.

As a nit, does the final returned cleanup in this method (the one that just does flow.Cleanup()) need to be deferred, or can it just be called here?

I got here because I'm adding yet more cleanup to a flow and it's fairly unclear if that needs to be deferred or not (although I have a pretty good guess). The fact that the cleanup return value is not usefully documented doesn't help, and I think it's not documented because it's hard to document because, in turn, it doesn't make a whole lotta sense.

@yuzefovich
Copy link
Member Author

Can I kindly ask you to clean up what this thread is discussing? I'm confused about what the code around here, and I'm generally bothered by the cleanup return value from this function. I realize there were reasons why you've deferred things, but what greater pleasure is there in life than disentangling those reason?
Ideally a higher level would handle the closing of the plan nodes, and this method would handle the flow.Cleanup().
The time to separate memory accounts using planning and execution might be upon us.

As a nit, does the final returned cleanup in this method (the one that just does flow.Cleanup()) need to be deferred, or can it just be called here?

I got here because I'm adding yet more cleanup to a flow and it's fairly unclear if that needs to be deferred or not (although I have a pretty good guess). The fact that the cleanup return value is not usefully documented doesn't help, and I think it's not documented because it's hard to document because, in turn, it doesn't make a whole lotta sense.

@andreimatei I'm taking a look into cleaning up this cleanup right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants