sql: parallelize FK and UNIQUE constraints #96123

yuzefovich · 2023-01-27T20:40:14Z

rowcontainer: support concurrent reads and some cleanup

This commit refactors the row container infrastructure in order to make
it possible to read from a single container concurrently (via separate
iterators). It required moving some of the scratch space from the
container directly into the iterators (so that the iterators don't share
anything between each other) as well as making a similar adjustment to
the pebbleMapIterator. As a result, it is now possible to safely
concurrently iterate over all types of row containers; however, the
iterators need to be created and closed under mutex protection (which
seems like an acceptable limitation). This commit also makes a minor
change to make the container a named parameter inside of the iterator
(which makes it easier to see when an iterator is touching the
container).

The ability to read from the same row container (after all rows have
been added to it) concurrently will be utilized by the follow-up commit
in order to support validation of multiple FK and UNIQUE checks in
parallel.

Additionally, this commit improves RowIterator interface a bit by
renaming Row to EncRow (since the method returns EncDatumRows) and
introducing a new Row implementation which returns tree.Datums. The
new method is now utilized in the sql.rowContainerIterator which
removes some of the unnecessary conversions between different row types.

Release note: None

sql: miscellaneous cleanup and improvement

This commit performs a bunch of minor cleanups and improvements in
preparation for the follow-up commit, in which we parallelize the
execution of post-query checks. The main rationale for changes in this
commit is making it easier to reason about concurrency-safety of
DistSQL infra.

The following changes are made:

remove unused receiver from FinalizePlan and
finalizePlanWithRowCount methods
remove an unused parameter from getPlanDistribution
remove the reference to planner from getPlanDistribution in favor
of passing a single boolean
refactor resetEvalCtx to make it concurrency-safe by extracting out
the modification of extraTxnState into resetPlanner
refactor how topLevelQueryStats are aggregated across all parts of
the stmt (i.e. across the main query, subqueries, and postqueries) to
make it easier to introduce concurrency-safety to it in a follow-up
set planFlagVectorized only on the main query setup and explicitly
propagate whether the flow is vectorized into saveFlows
fix a couple of typos in comments.

Release note: None

sql: parallelize FK and UNIQUE constraints

This commit makes it so that we parallelize the execution of multiple
"check plans" (which include FOREIGN KEY and UNIQUE constraint checks
that run as "postqueries"). The main idea is that we use the LeafTxns
(derived from the RootTxn that was used for the main mutation query) and
execute each check in a separate task. As a result, the checks should be
completed faster, especially so in multi-region environments.

This required introduction of mutex-protection for several
planning-infra-related objects:

saveFlows function that populates miscellaneous info, needed for the
stmt bundle
associateNodeWithComponents function that creates a mapping from
planNodes to DistSQL processors, needed for different EXPLAIN variants
topLevelQueryStats.add function that aggregates some execution stats
across all "queries" of a stmt.

Additionally, this commit also needed to make scanBufferNodes safe for
concurrent use. The way this works is that in the main query we have
bufferNode which has already accumulated some rows into a row
container. That row container can be read from concurrently as long as
the separate iterators are created (the creation and the closure must be
mutex-protected though).

All of these things are synchronized via a single "global" mutex. We
could have introduced a separate mutex for each of these things, but
a single one seems acceptable given that these operations should be
pretty quick and occur at different points throughout the checks'
execution.

Fixes: #95184.

Release note (performance improvement): The execution of multiple
FOREIGN KEY and UNIQUE constraint checks has been parallelized in some
cases. As a result, these checks should be completed faster, especially
so in multi-region environments where the checks require cross-region
reads.

cockroach-teamcity · 2023-01-27T20:40:25Z

This change is

msirek

On my M1 laptop, the test takes about 8.2s with the new version of the fix, and 7.9s on a commit before the fix. Something must be different between your system and mine.

Reviewed 20 of 20 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @cucaroach and @rharding6373)

yuzefovich · 2023-02-08T01:48:14Z

I added a temporary second commit here, but I think I'll extract it into a separate PR for ease of reviewing. Hold off on reviewing changes here.

yuzefovich · 2023-02-09T01:49:58Z

Alright, I cleaned up a change to the row containers to support concurrent reads and extracted it into a separate PR (#96836) for review (I'll still merge it as part of this PR eventually). With that change applied, in Mark's example I see the expected improvement locally when using demo (from 4.2s w/o parallelization to 2.4s w/ (w/ disk spilling) and from 2.5s w/o parallelization to 2.0s w/ (w/o disk spilling)).

This commit refactors the row container infrastructure in order to make it possible to read from a single container concurrently (via separate iterators). It required moving some of the scratch space from the container directly into the iterators (so that the iterators don't share anything between each other) as well as making a similar adjustment to the `pebbleMapIterator`. As a result, it is now possible to safely concurrently iterate over all types of row containers; however, the iterators need to be created and closed under mutex protection (which seems like an acceptable limitation). This commit also makes a minor change to make the container a named parameter inside of the iterator (which makes it easier to see when an iterator is touching the container). The ability to read from the same row container (after all rows have been added to it) concurrently will be utilized by the follow-up commit in order to support validation of multiple FK and UNIQUE checks in parallel. Additionally, this commit improves `RowIterator` interface a bit by renaming `Row` to `EncRow` (since the method returns `EncDatumRow`s) and introducing a new `Row` implementation which returns `tree.Datums`. The new method is now utilized in the `sql.rowContainerIterator` which removes some of the unnecessary conversions between different row types. Release note: None

This commit performs a bunch of minor cleanups and improvements in preparation for the follow-up commit, in which we parallelize the execution of post-query checks. The main rationale for changes in this commit is making it easier to reason about concurrency-safety of DistSQL infra. The following changes are made: - remove unused receiver from `FinalizePlan` and `finalizePlanWithRowCount` methods - remove an unused parameter from `getPlanDistribution` - remove the reference to `planner` from `getPlanDistribution` in favor of passing a single boolean - refactor `resetEvalCtx` to make it concurrency-safe by extracting out the modification of `extraTxnState` into `resetPlanner` - refactor how `topLevelQueryStats` are aggregated across all parts of the stmt (i.e. across the main query, subqueries, and postqueries) to make it easier to introduce concurrency-safety to it in a follow-up - set `planFlagVectorized` only on the main query setup and explicitly propagate whether the flow is vectorized into `saveFlows` - fix a couple of typos in comments. Release note: None

This commit makes it so that we parallelize the execution of multiple "check plans" (which include FOREIGN KEY and UNIQUE constraint checks that run as "postqueries"). The main idea is that we use the LeafTxns (derived from the RootTxn that was used for the main mutation query) and execute each check in a separate task. As a result, the checks should be completed faster, especially so in multi-region environments. This required introduction of mutex-protection for several planning-infra-related objects: - `saveFlows` function that populates miscellaneous info, needed for the stmt bundle - `associateNodeWithComponents` function that creates a mapping from `planNode`s to DistSQL processors, needed for different EXPLAIN variants - `topLevelQueryStats.add` function that aggregates some execution stats across all "queries" of a stmt. Additionally, this commit also needed to make `scanBufferNode`s safe for concurrent use. The way this works is that in the main query we have `bufferNode` which has already accumulated some rows into a row container. That row container can be read from concurrently as long as the separate iterators are created (the creation and the closure must be mutex-protected though). All of these things are synchronized via a single "global" mutex. We could have introduced a separate mutex for each of these things, but a single one seems acceptable given that these operations should be pretty quick and occur at different points throughout the checks' execution. Release note (performance improvement): The execution of multiple FOREIGN KEY and UNIQUE constraint checks has been parallelized in some cases. As a result, these checks should be completed faster, especially so in multi-region environments where the checks require cross-region reads.

yuzefovich · 2023-02-14T00:06:34Z

The first commit in this PR was approved in #96836 while the second and the third are effectively unchanged since I got the approval, so I'm going ahead and merging this.

Thanks for the reviews everyone!

bors r+

craig · 2023-02-14T01:30:03Z

Build failed (retrying...):

Bazel Essential CI (Cockroach)

craig · 2023-02-14T02:28:20Z

Build succeeded:

Bazel Essential CI (Cockroach)

blathers-crl · 2023-02-14T02:28:30Z

Encountered an error creating backports. Some common things that can go wrong:

The backport branch might have already existed.
There was a merge conflict.
The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.

error creating merge commit from 8b4d676 to blathers/backport-release-22.2-96123: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.}

yuzefovich force-pushed the parallelize-checks branch 11 times, most recently from 59f839b to 9d9b0b3 Compare February 3, 2023 00:33

yuzefovich changed the title ~~WIP on parallelizing post-query checks~~ sql: parallelize FK and UNIQUE constraints Feb 3, 2023

yuzefovich force-pushed the parallelize-checks branch from 9d9b0b3 to 875497d Compare February 3, 2023 01:07

cockroachdb deleted a comment from blathers-crl bot Feb 3, 2023

yuzefovich force-pushed the parallelize-checks branch from 875497d to d643d61 Compare February 3, 2023 02:36

yuzefovich marked this pull request as ready for review February 3, 2023 02:37

yuzefovich requested review from a team as code owners February 3, 2023 02:37

yuzefovich requested review from a team February 3, 2023 02:37

yuzefovich requested a review from a team as a code owner February 3, 2023 02:37

yuzefovich requested review from benbardin, samiskin and rytaft and removed request for a team and benbardin February 3, 2023 02:37

msirek reviewed Feb 7, 2023

View reviewed changes

yuzefovich force-pushed the parallelize-checks branch from a3fb556 to 30fa6c5 Compare February 8, 2023 01:47

yuzefovich force-pushed the parallelize-checks branch from 30fa6c5 to a795c40 Compare February 9, 2023 01:41

yuzefovich requested a review from a team as a code owner February 9, 2023 01:41

yuzefovich requested review from bananabrick and removed request for a team and bananabrick February 9, 2023 01:41

yuzefovich marked this pull request as draft February 9, 2023 01:42

yuzefovich mentioned this pull request Feb 13, 2023

rowcontainer: support concurrent reads and some cleanup #96836

Closed

yuzefovich force-pushed the parallelize-checks branch from a795c40 to 02b87f5 Compare February 13, 2023 23:03

yuzefovich marked this pull request as ready for review February 13, 2023 23:56

yuzefovich force-pushed the parallelize-checks branch from 02b87f5 to 0f4b431 Compare February 14, 2023 00:05

craig bot merged commit ac53d8c into cockroachdb:master Feb 14, 2023

yuzefovich deleted the parallelize-checks branch February 14, 2023 02:55

yuzefovich mentioned this pull request Feb 14, 2023

release-22.2: sql: parallelize FK and UNIQUE constraints #97091

Closed

cockroach-teamcity mentioned this pull request Feb 14, 2023

PR #96123 - sql: parallelize FK and UNIQUE constraints cockroachdb/docs#16254

Open

erikgrinaker mentioned this pull request Feb 15, 2023

roachtest: tpcc/headroom failed during release qualification #97141

Closed

yuzefovich mentioned this pull request Mar 11, 2023

opt, sql: parallelize uniqueness checks when possible #58942

Closed

arulajmani mentioned this pull request Apr 3, 2023

sql,kv: bubble up retry errors when creating leaf transactions #98713

Merged

yuzefovich mentioned this pull request Apr 3, 2023

release-22.2: sql: parallelize FK and UNIQUE constraints #100520

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: parallelize FK and UNIQUE constraints #96123

sql: parallelize FK and UNIQUE constraints #96123

yuzefovich commented Jan 27, 2023 •

edited

Loading

cockroach-teamcity commented Jan 27, 2023

msirek left a comment

yuzefovich commented Feb 8, 2023

yuzefovich commented Feb 9, 2023

yuzefovich commented Feb 14, 2023

craig bot commented Feb 14, 2023

craig bot commented Feb 14, 2023

blathers-crl bot commented Feb 14, 2023

sql: parallelize FK and UNIQUE constraints #96123

sql: parallelize FK and UNIQUE constraints #96123

Conversation

yuzefovich commented Jan 27, 2023 • edited Loading

cockroach-teamcity commented Jan 27, 2023

msirek left a comment

Choose a reason for hiding this comment

yuzefovich commented Feb 8, 2023

yuzefovich commented Feb 9, 2023

yuzefovich commented Feb 14, 2023

craig bot commented Feb 14, 2023

craig bot commented Feb 14, 2023

blathers-crl bot commented Feb 14, 2023

yuzefovich commented Jan 27, 2023 •

edited

Loading