release-22.2: sql: parallelize FK and UNIQUE constraints #97091
Conversation
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
The backport was dirty on each of the commits, but the conflicts were minor. I changed the default value for the cluster setting to `false`.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @msirek)
This commit addresses an old TODO about figuring out why we cannot reuse
the same buffer in the `diskRowIterator.Row` method. The contract of that
method was previously confusing - it says that the call to `Row`
invalidates the row returned on the previous call; however, the important
piece was missing - that the datums themselves cannot be mutated (this is
what we assume elsewhere and perform only the "shallow" copies). This
commit clarifies the contract of the method and explicitly explains why
we need to allocate fresh byte slices (which is done via `ByteAllocator`
to reduce the number of allocations).

Additional context can be found in cockroachdb#43145 which added this
copy in the first place. Here is a relevant quote from Alfonso:

```
I think what's going on here is that this type of contract (you may only
reuse the row until the next call) is a bit unclear. `CopyRow`s of
`EncDatum`s are only implemented as shallow copies, so the reference to
this reuse only applies to the `EncDatumRow`, but not to the `encoded`
field, which is what is rewritten in this case. The `encoded` field will
not be copied, so I think the tests are probably failing due to that.
This is unfortunate and there doesn't seem to be a good reason for it.
To implement deep copies, we will have to implement deep copies for
`Datum`s as well.
```

I think we were under the impression that we'd implement the "deep copy"
in `CopyRow`, but I highly doubt we'll do so given that we're mostly
investing in the columnar infrastructure nowadays, and the current way of
performing shallow copying has worked well long enough.

Epic: None

Release note: None
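As a side note for readers of this thread: the aliasing hazard described above can be shown with a small, self-contained Go sketch. The types below are made up for illustration (they are not CockroachDB's `EncDatum` or `ByteAllocator`); the point is only that shallow row copies share the encoded bytes, so reusing the iterator's buffer rewrites rows that were handed out earlier, while handing out freshly allocated slices does not.

```go
package main

import "fmt"

// encDatum is a simplified stand-in (not CockroachDB's EncDatum): a datum
// that keeps a reference to its encoded bytes.
type encDatum struct {
	encoded []byte
}

// shallowCopyRow copies only the slice headers; the underlying encoded
// bytes stay shared with the source row.
func shallowCopyRow(dst, src []encDatum) {
	copy(dst, src)
}

func main() {
	// The iterator's reusable scratch buffer.
	scratch := []byte("row-1")
	row := []encDatum{{encoded: scratch}}

	// A caller keeps the row via a shallow copy (as shallow CopyRow-style
	// copies do).
	kept := make([]encDatum, 1)
	shallowCopyRow(kept, row)

	// The iterator reuses its scratch buffer for the next row...
	copy(scratch, []byte("row-2"))

	// ...and the previously kept row is silently corrupted.
	fmt.Println(string(kept[0].encoded)) // prints "row-2", not "row-1"

	// Handing out a freshly allocated slice per row (the allocations a
	// byte allocator helps amortize) avoids the aliasing entirely.
	fresh := append([]byte(nil), "row-3"...)
	row[0].encoded = fresh
	fmt.Println(string(kept[0].encoded)) // the kept copy stays "row-2"
}
```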
Force-pushed from bed65b3 to f28d8db.
I just included the backport of a minor change from #92859 in order to preserve the history of commits.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale)
Holding off on merging this pending investigation in #97141.
Backport 1/1 commits from #92859.
Backport 3/3 commits from #96123.
/cc @cockroachdb/release
rowcontainer: support concurrent reads and some cleanup
This commit refactors the row container infrastructure in order to make
it possible to read from a single container concurrently (via separate
iterators). It required moving some of the scratch space from the
container directly into the iterators (so that the iterators don't share
anything between each other) as well as making a similar adjustment to
the `pebbleMapIterator`. As a result, it is now possible to safely
concurrently iterate over all types of row containers; however, the
iterators need to be created and closed under mutex protection (which
seems like an acceptable limitation). This commit also makes a minor
change to make the container a named parameter inside of the iterator
(which makes it easier to see when an iterator is touching the
container).

The ability to read from the same row container (after all rows have
been added to it) concurrently will be utilized by the follow-up commit
in order to support validation of multiple FK and UNIQUE checks in
parallel.

Additionally, this commit improves the `RowIterator` interface a bit by
renaming `Row` to `EncRow` (since the method returns `EncDatumRow`s) and
introducing a new `Row` implementation which returns `tree.Datums`. The
new method is now utilized in the `sql.rowContainerIterator`, which
removes some of the unnecessary conversions between different row types.

Release note: None
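To make the shape of the refactor a bit more concrete, here is a hedged Go sketch with hypothetical names (this is not the actual `rowcontainer` API): the container holds only the already-written rows plus a mutex, while each iterator owns its scratch buffer, so concurrent reads share no mutable state and only iterator creation (and closure) needs the lock.

```go
package rowbuf

import "sync"

type row = []byte

// container holds rows that are fully written before any concurrent
// reads begin; the mutex only guards iterator creation and closure.
type container struct {
	mu   sync.Mutex
	rows []row
}

// iterator owns its scratch space, so two iterators over the same
// container never share mutable state.
type iterator struct {
	c       *container
	idx     int
	scratch []byte
}

// newIterator creates an iterator under the container's mutex; closing
// would be symmetric in a fuller version of this sketch.
func (c *container) newIterator() *iterator {
	c.mu.Lock()
	defer c.mu.Unlock()
	return &iterator{c: c, idx: -1}
}

// next advances to the following row and reports whether one exists.
func (it *iterator) next() bool {
	it.idx++
	return it.idx < len(it.c.rows)
}

// row copies the current row into the iterator-local scratch buffer, so
// parallel readers never clobber each other's results.
func (it *iterator) row() row {
	it.scratch = append(it.scratch[:0], it.c.rows[it.idx]...)
	return it.scratch
}
```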
sql: miscellaneous cleanup and improvement
This commit performs a bunch of minor cleanups and improvements in
preparation for the follow-up commit, in which we parallelize the
execution of post-query checks. The main rationale for changes in this
commit is making it easier to reason about concurrency-safety of
DistSQL infra.

The following changes are made:
- remove unused receiver from `FinalizePlan` and
  `finalizePlanWithRowCount` methods
- remove an unused parameter from `getPlanDistribution`
- remove the reference to `planner` from `getPlanDistribution` in favor
  of passing a single boolean
- refactor `resetEvalCtx` to make it concurrency-safe by extracting out
  the modification of `extraTxnState` into `resetPlanner`
- refactor how `topLevelQueryStats` are aggregated across all parts of
  the stmt (i.e. across the main query, subqueries, and postqueries) to
  make it easier to introduce concurrency-safety to it in a follow-up
- set `planFlagVectorized` only on the main query setup and explicitly
  propagate whether the flow is vectorized into `saveFlows`
- fix a couple of typos in comments.

Release note: None
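The `topLevelQueryStats` refactor in the list above is what later allows several parts of a statement to report stats concurrently. A minimal sketch of that pattern, with invented field names rather than the real ones, could look like this:

```go
package stats

import "sync"

// topLevelQueryStats aggregates execution stats across the main query,
// subqueries, and postqueries of a statement. The fields here are made
// up for illustration.
type topLevelQueryStats struct {
	bytesRead int64
	rowsRead  int64
}

// protectedStats wraps the aggregate with a mutex so that concurrently
// executing parts of the statement can report their stats safely.
type protectedStats struct {
	mu    sync.Mutex
	stats topLevelQueryStats
}

// add folds in the stats from one part of the statement; callers may run
// in parallel.
func (p *protectedStats) add(other *topLevelQueryStats) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.stats.bytesRead += other.bytesRead
	p.stats.rowsRead += other.rowsRead
}
```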
sql: parallelize FK and UNIQUE constraints
This commit makes it so that we parallelize the execution of multiple
"check plans" (which include FOREIGN KEY and UNIQUE constraint checks
that run as "postqueries"). The main idea is that we use LeafTxns
(derived from the RootTxn that was used for the main mutation query) and
execute each check in a separate task. As a result, the checks should be
completed faster, especially so in multi-region environments.

This required introducing mutex protection for several
planning-infra-related objects:
- the `saveFlows` function that populates miscellaneous info needed for
  the stmt bundle
- the `associateNodeWithComponents` function that creates a mapping from
  `planNode`s to DistSQL processors, needed for different EXPLAIN
  variants
- the `topLevelQueryStats.add` function that aggregates some execution
  stats across all "queries" of a stmt.

Additionally, this commit also needed to make `scanBufferNode`s safe for
concurrent use. The way this works is that in the main query we have a
`bufferNode` which has already accumulated some rows into a row
container. That row container can be read from concurrently as long as
separate iterators are created (the creation and the closure must be
mutex-protected, though).

All of these things are synchronized via a single "global" mutex. We
could have introduced a separate mutex for each of these things, but a
single one seems acceptable given that these operations should be pretty
quick and occur at different points throughout the checks' execution.

Fixes: #95184.
Release note (performance improvement): The execution of multiple
FOREIGN KEY and UNIQUE constraint checks can be parallelized in some
cases. As a result, these checks can be completed faster, especially so
in multi-region environments where the checks require cross-region
reads. The feature is enabled by default in the 23.1 release; to enable
it on 22.2, one must change the private
`sql.distsql.parallelize_checks.enabled` cluster setting to `true`.

Release justification: low-risk, high-impact performance improvement
disabled by default.
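For illustration only, here is a simplified Go sketch of the overall execution pattern described in this commit message. All names are invented, and `errgroup` is used as a stand-in for CockroachDB's own task-running utilities; each check runs in its own task while a single mutex guards the shared planning state.

```go
package checks

import (
	"context"
	"sync"

	"golang.org/x/sync/errgroup"
)

// checkPlan stands in for a planned FK or UNIQUE "check plan".
type checkPlan struct{ name string }

// sharedPlanningState stands in for the planner-owned state several
// checks may touch (saved flows, node-to-component mappings, query
// stats); one mutex guards all of it, as in the commit message above.
type sharedPlanningState struct {
	mu        sync.Mutex
	flowsByID map[string]string
}

func (s *sharedPlanningState) saveFlow(id, diagram string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.flowsByID[id] = diagram
}

// runCheck would execute a single check against a LeafTxn derived from
// the mutation's RootTxn; it is stubbed out here.
func runCheck(ctx context.Context, p checkPlan, shared *sharedPlanningState) error {
	shared.saveFlow(p.name, "flow diagram placeholder")
	return nil
}

// runChecksInParallel launches every check in its own task and waits for
// all of them; any error fails the statement.
func runChecksInParallel(ctx context.Context, plans []checkPlan) error {
	shared := &sharedPlanningState{flowsByID: make(map[string]string)}
	g, gCtx := errgroup.WithContext(ctx)
	for i := range plans {
		plan := plans[i]
		g.Go(func() error { return runCheck(gCtx, plan, shared) })
	}
	return g.Wait()
}
```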