sql: support spilling to disk for bufferNode #63900

yuzefovich · 2021-04-20T01:21:10Z

This commit refactors several planNodes that need to buffer rows to
use a disk-backed row container instead of a pure in-memory one. To
achieve that, a couple of light wrappers are introduced on top of the
corresponding container and an iterator over it.
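
To illustrate the idea of a wrapper over a container that starts in memory and falls back to disk, here is a minimal sketch. It is not the code from this PR: `spillingBuffer`, `row`, `memLimit`, and `bufferAndSum` are hypothetical names, the limit is counted in rows rather than bytes, and `encoding/gob` with a temp file stands in for the real temporary disk storage.

```go
package main

import (
	"encoding/gob"
	"fmt"
	"io"
	"os"
)

// row is a stand-in for tree.Datums (assumption: the real type is richer).
type row []int64

// spillingBuffer is a hypothetical wrapper: rows stay in memory until
// memLimit rows are buffered, then everything moves to a temp file.
type spillingBuffer struct {
	memLimit int
	mem      []row
	file     *os.File
	enc      *gob.Encoder
}

// add appends a row, spilling the in-memory rows to disk on overflow.
func (b *spillingBuffer) add(r row) error {
	if b.file == nil && len(b.mem) < b.memLimit {
		b.mem = append(b.mem, r)
		return nil
	}
	if b.file == nil {
		f, err := os.CreateTemp("", "rows-*.gob")
		if err != nil {
			return err
		}
		b.file, b.enc = f, gob.NewEncoder(f)
		for _, m := range b.mem {
			if err := b.enc.Encode(m); err != nil {
				return err
			}
		}
		b.mem = nil
	}
	return b.enc.Encode(r)
}

func (b *spillingBuffer) spilled() bool { return b.file != nil }

// each calls fn for every buffered row in insertion order, hiding from the
// caller whether the rows live in memory or on disk.
func (b *spillingBuffer) each(fn func(row)) error {
	if b.file == nil {
		for _, r := range b.mem {
			fn(r)
		}
		return nil
	}
	if _, err := b.file.Seek(0, io.SeekStart); err != nil {
		return err
	}
	dec := gob.NewDecoder(b.file)
	for {
		var r row
		if err := dec.Decode(&r); err == io.EOF {
			return nil
		} else if err != nil {
			return err
		}
		fn(r)
	}
}

// close removes the temp file, if one was created.
func (b *spillingBuffer) close() error {
	if b.file == nil {
		return nil
	}
	name := b.file.Name()
	if err := b.file.Close(); err != nil {
		return err
	}
	return os.Remove(name)
}

// bufferAndSum buffers rows {1}..{n} and sums their first column.
func bufferAndSum(memLimit, n int) (int64, bool, error) {
	b := &spillingBuffer{memLimit: memLimit}
	defer b.close()
	for i := 1; i <= n; i++ {
		if err := b.add(row{int64(i)}); err != nil {
			return 0, false, err
		}
	}
	var sum int64
	err := b.each(func(r row) { sum += r[0] })
	return sum, b.spilled(), err
}

func main() {
	sum, spilled, err := bufferAndSum(2, 5)
	fmt.Println(sum, spilled, err)
}
```

The point of the wrapper is that callers only see `add` and `each`; whether the rows were spilled is an internal detail.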

Still, one - probably important - case is not fixed properly: namely, if
a subquery is executed in AllRows or AllRowsNormalized mode, then we
first buffer the rows into the disk-backed container only to materialize
it later into a single tuple. Addressing this is left as a TODO.

Fixes: #62301.
Fixes: #62674.

Release note (sql change): CockroachDB should now be more stable when
executing queries with subqueries producing many rows (previously we
could OOM crash; now we use temporary disk storage).

cockroach-teamcity · 2021-04-20T01:21:16Z

This change is

yuzefovich · 2021-04-20T15:34:04Z

Kicked off 20 runs of tpcdsvec (re #62301), and they finished successfully with this commit.

rytaft

Reviewed 11 of 11 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)

pkg/sql/apply_join.go, line 124 at r1 (raw file):

			for {
				var rrow tree.Datums
				if len(a.rightTypes) != 0 {

how will we know when the rows are exhausted if there are no columns?

pkg/sql/create_stats.go, line 522 at r1 (raw file):

		planCtx := dsp.NewPlanningCtx(ctx, evalCtx, nil /* planner */, txn, true /* distribute */)
		if err := dsp.planAndRunCreateStats(
			ctx, evalCtx, planCtx, txn, r.job, NewRowResultWriter(nil /* rowContainer */),

how does this change without a row container? Shouldn't we pass a rowContainerHelper?

yuzefovich

TFTR!

bors r+

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft)

pkg/sql/apply_join.go, line 124 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

how will we know when the rows are exhausted if there are no columns?

Nice catch! Previously, in this scenario we would have entered an infinite loop. I added a test and fixed the problem (this also required updating the mem row container code slightly).
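
For readers following along, the zero-column hazard can be sketched as follows. This is an illustration only, not the actual row container code: `rowStore`, `rowIter`, and `countRows` are hypothetical names. The assumption is a container that flattens rows into one slice, so with zero columns the data length is always 0 and exhaustion has to be detected from an explicit row count.

```go
package main

import "fmt"

type datums []int64

// rowStore is a hypothetical container that flattens all rows into one slice.
type rowStore struct {
	numCols int
	flat    []int64
	numRows int // tracked explicitly so zero-column rows are still counted
}

func (s *rowStore) add(r datums) {
	s.flat = append(s.flat, r...)
	s.numRows++
}

type rowIter struct {
	s   *rowStore
	idx int
}

// next returns the next row and false once all rows have been returned.
// Exhaustion is decided by numRows, not by len(flat), which is always 0
// when numCols == 0.
func (it *rowIter) next() (datums, bool) {
	if it.idx >= it.s.numRows {
		return nil, false
	}
	start := it.idx * it.s.numCols
	it.idx++
	return datums(it.s.flat[start : start+it.s.numCols]), true
}

// countRows adds n rows with numCols columns and counts them via the iterator.
func countRows(numCols, n int) int {
	s := &rowStore{numCols: numCols}
	for i := 0; i < n; i++ {
		s.add(make(datums, numCols))
	}
	it := &rowIter{s: s}
	count := 0
	for {
		if _, ok := it.next(); !ok {
			return count
		}
		count++
	}
}

func main() {
	fmt.Println(countRows(0, 3), countRows(2, 3))
}
```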

pkg/sql/create_stats.go, line 522 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

how does this change without a row container? Shouldn't we pass a rowContainerHelper?

sampleAggregator can only produce metadata objects, so we don't need an actual row container here. Added a comment.

craig · 2021-04-21T23:32:51Z

Build failed (retrying...):

GitHub CI (Cockroach)

craig · 2021-04-22T01:08:59Z

Build succeeded:

GitHub CI (Cockroach)

This commit fixes a bug introduced in cockroachdb#63900 which causes execution of semi and anti apply-joins to panic.

For each row on the left side of the apply-join, rows are fetched for the right side of the join and added to an iterator. For semi and anti apply-joins, the right rows are only consumed until a match is found. These right rows were not being cleared for the next successive left row. This caused a panic when the apply-join predicate would be applied on a `nil` left row during the next call to `applyJoinNode.Next`; the next left row is only fetched if the right row iterator has been cleared.

Fixes cockroachdb#65040

Release note (bug fix): A bug that caused panics while executing semi and anti apply-joins has been fixed. This bug was present since version 21.2.0-alpha.

This commit fixes a bug introduced in cockroachdb#63900 which causes execution of semi and anti apply-joins to panic.

For each row on the left side of the apply-join, rows are fetched for the right side of the join and added to an iterator. For semi and anti apply-joins, the right rows are only consumed until a match is found. These right rows were not being cleared for the next successive left row. This caused a panic when the apply-join predicate would be applied on a `nil` left row during the next call to `applyJoinNode.Next`; the next left row is only fetched if the right row iterator has been cleared.

Fixes cockroachdb#65040

There is no release note because this bug is only present in 21.2.0-alpha, which has not yet been released.

Release note: None

65375: sql: add support for expression-based indexes with CREATE INDEX r=mgartner a=mgartner

#### sql: add experimental_enable_expression_based_indexes session setting

This commit adds a session setting that will eventually enable users to create expression-based indexes. The setting will be removed when expression-based indexes are fully supported.

Release note: None

#### sql: add support for expression-based indexes with CREATE INDEX

This commit adds basic support for creating expression-based indexes with a `CREATE INDEX` statement. An expression-based index is syntactic sugar for an index on a virtual computed column. Creating an expression-based index will automatically create a hidden virtual column with the given expression. If a virtual column with the given expression already exists, that column is used rather than creating a new one.

Future work includes supporting expression-based indexes in `CREATE TABLE` and making error messages related to these indexes more user-friendly.

There is no release note because expression-based indexes are not enabled by default. They require the `experimental_enable_expression_based_indexes` session setting until they are fully supported.

Release note: None

65524: sql: clear right rows correctly during apply-join r=mgartner a=mgartner

This commit fixes a bug introduced in #63900 which causes execution of semi and anti apply-joins to panic.

For each row on the left side of the apply-join, rows are fetched for the right side of the join and added to an iterator. For semi and anti apply-joins, the right rows are only consumed until a match is found. These right rows were not being cleared for the next successive left row. This caused a panic when the apply-join predicate would be applied on a `nil` left row during the next call to `applyJoinNode.Next`; the next left row is only fetched if the right row iterator has been cleared.

Fixes #65040

There is no release note because this bug is only present in 21.2.0-alpha, which has not yet been released.

Release note: None

Co-authored-by: Marcus Gartner
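
The apply-join fix described above can be pictured with a simplified model. This is a sketch under assumptions, not the real `applyJoinNode`: `semiApplyJoin`, `rightFor`, `clearRight`, and `runSemi` are hypothetical names, rows are plain ints, and only the semi-join case is modeled. The key step is that the pending right rows are dropped as soon as a match is emitted, so the next call starts from a fresh left row.

```go
package main

import "fmt"

// semiApplyJoin is a hypothetical, simplified model of a semi apply-join.
type semiApplyJoin struct {
	left     []int
	rightFor func(l int) []int // right side, recomputed for each left row
	pred     func(l, r int) bool

	leftIdx   int
	curLeft   int
	rightRows []int // right rows still pending for curLeft
	rightSet  bool  // false means a new left row must be fetched
}

// clearRight drops any unread right rows so the next call fetches a new
// left row instead of resuming with stale state.
func (j *semiApplyJoin) clearRight() { j.rightRows, j.rightSet = nil, false }

// next returns the next left row that has at least one matching right row.
func (j *semiApplyJoin) next() (int, bool) {
	for {
		if !j.rightSet {
			if j.leftIdx >= len(j.left) {
				return 0, false
			}
			j.curLeft = j.left[j.leftIdx]
			j.leftIdx++
			j.rightRows, j.rightSet = j.rightFor(j.curLeft), true
		}
		for len(j.rightRows) > 0 {
			r := j.rightRows[0]
			j.rightRows = j.rightRows[1:]
			if j.pred(j.curLeft, r) {
				// One match suffices for a semi-join; the remaining right
				// rows must be cleared before returning.
				j.clearRight()
				return j.curLeft, true
			}
		}
		j.clearRight() // right side exhausted without a match
	}
}

// runSemi joins each left value l against the right rows {l, l, 10}.
func runSemi(left []int) []int {
	j := &semiApplyJoin{
		left:     left,
		rightFor: func(l int) []int { return []int{l, l, 10} },
		pred:     func(l, r int) bool { return l == r },
	}
	var out []int
	for {
		v, ok := j.next()
		if !ok {
			return out
		}
		out = append(out, v)
	}
}

func main() {
	fmt.Println(runSemi([]int{1, 2, 3}))
}
```

In this model, omitting the `clearRight` call on a match would make the next call resume with leftover right rows for the previous left row, which is the stale-state pattern the commit message describes.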

yuzefovich added the do-not-merge label Apr 20, 2021

yuzefovich force-pushed the spill-buffer-node branch 2 times, most recently from 0369ee0 to 3e5081c April 20, 2021 02:56

yuzefovich removed the do-not-merge label Apr 20, 2021

yuzefovich force-pushed the spill-buffer-node branch from 3e5081c to ae7992f April 20, 2021 03:37

yuzefovich marked this pull request as ready for review April 20, 2021 03:38

yuzefovich requested review from rytaft and a team April 20, 2021 03:38

rytaft approved these changes Apr 21, 2021

yuzefovich force-pushed the spill-buffer-node branch from ae7992f to 0a8baa9 April 21, 2021 17:59

yuzefovich commented Apr 21, 2021

yuzefovich mentioned this pull request Apr 22, 2021

sql: add 'distsql_workmem' session variable #63959

Merged

craig bot merged commit bb86f86 into cockroachdb:master Apr 22, 2021

yuzefovich deleted the spill-buffer-node branch April 22, 2021 01:15

yuzefovich mentioned this pull request Apr 30, 2021

roachtest: tpcdsvec failed #64464

Closed

mgartner mentioned this pull request May 20, 2021

sql: clear right rows correctly during apply-join #65524

Merged

jseldess mentioned this pull request Sep 8, 2021

sql: support spilling to disk for bufferNode cockroachdb/docs#11310

Closed
