Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add new BACKFILLING mutation state #72281

Merged
merged 1 commit into from
Jan 26, 2022

Conversation

stevendanna
Copy link
Collaborator

@stevendanna stevendanna commented Nov 1, 2021

This adds a BACKFILLING mutation state. When an index is in this
state, it is ignored for the purposes of INSERT, UPDATE, and
DELETE. Currently, this state is unused in the actual code.

This new state will eventually be used in the new index backfiller. In
the new backfiller, the bulk-backfill using AddSSTable will occur when
the index is in this state, with concurrent updates being captured by
a temporary index. See
docs/RFCS/20211004_incremental_index_backfiller.md for more details.

Indexes in this state are returned by (TableDescriptor).AllIndexes,
(TableDescriptor).NonDropIndexes, and (TableDescriptor).Partial.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@stevendanna stevendanna marked this pull request as ready for review November 1, 2021 16:59
@stevendanna stevendanna requested review from a team, rhu713 and dt and removed request for a team November 1, 2021 16:59
Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine on the face of it.

I say on the face of it because there might be code that relies on the assumption that the only possible values are PUBLIC, DELETE_ONLY and DELETE_AND_WRITE_ONLY. However, because the new BACKFILL value is not set anywhere except in that unit test, this assumption is never really challenged.

I'm all for smaller PRs but in this case it would be nice to see this commit in a multi-commit PR followed by other commits which actually exercise this new value. If that's not practical then I'm fine with merging this. Do we have an idea of when we're going to exercise this new value? Generally speaking I'm a bit uneasy with changes "for future use" if it's not clear that this use is in a very-near future.

@@ -1062,7 +1062,7 @@ func (ot *optTable) WritableIndexCount() int {
// DeletableIndexCount is part of the cat.Table interface.
func (ot *optTable) DeletableIndexCount() int {
// Primary index is always present, so count is always >= 1.
return len(ot.desc.AllIndexes())
return len(ot.desc.DeletableNonPrimaryIndexes()) + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? This is the same thing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is no longer the same thing. AllIndexes() returns all of the indexes, including backfilling indexes. But backfilling indexes are not deletable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what you're saying and your change here makes sense in that light. That being said I just pulled master I'm not seeing what you describe in the current implementation of AllIndexes(). Extending the indexCache struct to support backfilling indexes seems like it would lie within the scope of this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment above is incorrect, sorry. You did what was required as far as I'm concerned.

@stevendanna
Copy link
Collaborator Author

@postamar

This seems fine on the face of it.
..
I'm all for smaller PRs but in this case it would be nice to see this commit in a multi-commit PR followed by other commits which actually exercise this new value. If that's not practical then I'm fine with merging this.

Thanks for taking a look! Happy to just fold this into a larger PR. There are a few things that need to come together to make this end-to-end usable.

@stevendanna stevendanna force-pushed the ssd/backfilling-state branch from 02d5d80 to 904e612 Compare January 19, 2022 17:46
@stevendanna
Copy link
Collaborator Author

@rhu713 Mind giving this a second look. I pulled in some of the changes from the integration branch, but otherwise this part of the change hasn't changed in a while and I think it is worth merging this if it still looks good to you.

@stevendanna
Copy link
Collaborator Author

bors r=postamar

@craig
Copy link
Contributor

craig bot commented Jan 25, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Jan 25, 2022

Build failed:

@stevendanna stevendanna force-pushed the ssd/backfilling-state branch from 904e612 to f999876 Compare January 25, 2022 22:13
@stevendanna
Copy link
Collaborator Author

bors retry

@craig
Copy link
Contributor

craig bot commented Jan 26, 2022

Build failed:

@stevendanna
Copy link
Collaborator Author

The bors error we are getting her doesn't seem to easily reproduce locally for me on a rebased branch and also doesn't happen in pre-merge CI from the looks of it...

This adds a BACKFILLING mutation state. When an index is in this
state, it is ignored for the purposes of INSERT, UPDATE, and
DELETE. Currently, this state is unused in the actual code.

This new state will eventually be used in the new index backfiller. In
the new backfiller, the bulk-backfill using AddSSTable will occur when
the index is in this state, with concurrent updates being captured by
a temporary index. See
docs/RFCS/20211004_incremental_index_backfiller.md for more details.

Indexes in this state are returned by `(TableDescriptor).AllIndexes`
and `(TableDescriptor).NonDropIndexes`.

Release note: None
@stevendanna stevendanna force-pushed the ssd/backfilling-state branch from f999876 to f81a7e4 Compare January 26, 2022 17:13
@stevendanna
Copy link
Collaborator Author

bors retry

@craig
Copy link
Contributor

craig bot commented Jan 26, 2022

Build succeeded:

@craig craig bot merged commit e1ceeda into cockroachdb:master Jan 26, 2022
stevendanna added a commit to stevendanna/cockroach that referenced this pull request Feb 16, 2022
Previously, the index backfilling process depended upon non-MVCC
compliant AddSSTable calls which potentially rewrote previously read
historical values.

To support an MVCC-compliant AddSSTable that writes at the _current_
timestamp, this change implements a new backfilling process described
in the following RFC:

https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20211004_incremental_index_backfiller.md

In summary, the new index backfilling process depends on backfilling
the new index when it is in a BACKFILLING state (added in cockroachdb#72281). In
this state it receives no writes or deletes. Writes that occur during
the backfilling process are captured by a "temporary index."  This
temporary index uses the DeletePreservingEncoding to ensure it
captures deletes as well as writes.

After the of bulk backfill using the MVCC-compliant AddSSTable, the
index is moved into a MERGING state
(added in cockroachdb#75663) in which it receives writes and deletes. Writes
previously captured by the temporary index are then transactionally
merged into the newly added index.

This feature is currently behind a new boolean cluster setting which
default to true. Schema changes that contains both old and new-style
backfills are rejected.

Reverting the default to false will require updating various tests
since many tests depend on the exact index IDs of newly added indexes.

Release note: None

Co-authored-by: Rui Hu <[email protected]>
craig bot pushed a commit that referenced this pull request Feb 16, 2022
73878: sql: introduce MVCC-compliant index backfiller r=ajwerner a=stevendanna

Previously, the index backfilling process depended upon non-MVCC
compliant AddSSTable calls which potentially rewrote previously read
historical values.

To support an MVCC-compliant AddSSTable that writes at the _current_
timestamp, this change implements a new backfilling process described
in the following RFC:

https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20211004_incremental_index_backfiller.md

In summary, the new index backfilling process depends on backfilling
the new index when it is in a BACKFILLING state (added in #72281). In
this state it receives no writes or deletes. Writes that occur during
the backfilling process are captured by a "temporary index."  This
temporary index uses the DeletePreservingEncoding to ensure it
captures deletes as well as writes.

After the of bulk backfill using the MVCC-compliant AddSSTable, the
index is moved into a MERGING state
(added in #75663) in which it receives writes and deletes. Writes
previously captured by the temporary index are then transactionally
merged into the newly added index.

This feature is currently behind a new boolean cluster setting which
default to true. Schema changes that contains both old and new-style
backfills are rejected.

Reverting the default to false will require updating various tests
since many tests depend on the exact index IDs of newly added indexes.

Release note: None

76252: kvserver: account for AddSSTableRequests in QPS r=kvoli a=kvoli

Previously, Queries-Per-Second (QPS) was calculated uniformly per
`BatchRequest` as 1. This patch introduces variable QPS calculation
for `AddSSTableRequest`, which use an order of magnitude more
resources than other request types.

This patch introduces the
`kv.replica_stats.addsst_request_size_factor` cluster setting. This
setting is used to attribute QPS to `AddSSTableRequest` sizes. The
calculation is done as QPS = 1 + size(AddSSTableRequest) / factor.
When `kv.replica_stats.addsst_request_size_factor` is less than 1, or
no `AddSSTableRequest` exists within a `BatchRequest`, then QPS = 1;
the current behavior today.

resolves #73731

Release note (performance improvement):
Introduced `kv.replica_stats.addsst_request_size_factor` cluster
setting. This setting is used to tune Queries-Per-Second (QPS)
sensitivity to large imports. By default, this setting is disabled.
When enabled, the size of any AddSSTableRequest will contribute to
QPS in inverse relation to this settings magnitude. By default this
setting configured to a conservative 50,000; every 50 kilobytes will
be accounted for as an additional 1 QPS.

76617: bazel: make `CrdbTestBuild` `const` r=RaduBerinde a=rickystewart

This partially reverts #72838. That change was made to avoid thrashing
the cache when swapping between test- and non-test configurations.
Today we have `dev cache` which can retain build artifacts across the
different configurations, so this doesn't really serve a purpose any
more. Indeed, you can now swap between `bazel build
pkg/cmd/cockroach-short` and `bazel build pkg/cmd/cockroach-short
--config=test` very quickly with Bazel downloading old versions of the
built libraries if the built-in cache gets wiped.

We still filter out the `go:build` lines in `crdb_test_{off,on}.go` so
we don't have to set `gotags` for test and non-test, which still saves a
lot of time and unnecessary recompilation. We have a check for this in
CI so no one should be able to add a build constraint without us
knowing.

Release note: None

76627: util/tracing: improve span recording interface r=andreimatei a=andreimatei

Before this patch, the Span's recording interface was left over from a
time when there were only one recording mode: verbose. We now have two
modes: verbose recording and structured recording. They can be enabled
at span creation time through the WithRecording(<type>) option. This
patch changes the Span's SetVerbose() method to expose the two options.

Release note: None

76667: roachtest/test: fix ORM testing due to removed cluster setting r=rafiss a=otan

`sql.catalog.unsafe_skip_system_config_trigger.enabled` got removed
recently and was part of an alpha. Let's clean it up in ORMs too.

Resolves #76654
Resolves #76655
Resolves #76656
Resolves #76657
Resolves #76658
Resolves #76659
Resolves #76660
Resolves #76661
Resolves #76662
Resolves #76663
Resolves #76664
Resolves #76665
Resolves #76666

Release note: None

Co-authored-by: Steven Danna <[email protected]>
Co-authored-by: Rui Hu <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Oliver Tan <[email protected]>
RajivTS pushed a commit to RajivTS/cockroach that referenced this pull request Mar 6, 2022
Previously, the index backfilling process depended upon non-MVCC
compliant AddSSTable calls which potentially rewrote previously read
historical values.

To support an MVCC-compliant AddSSTable that writes at the _current_
timestamp, this change implements a new backfilling process described
in the following RFC:

https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20211004_incremental_index_backfiller.md

In summary, the new index backfilling process depends on backfilling
the new index when it is in a BACKFILLING state (added in cockroachdb#72281). In
this state it receives no writes or deletes. Writes that occur during
the backfilling process are captured by a "temporary index."  This
temporary index uses the DeletePreservingEncoding to ensure it
captures deletes as well as writes.

After the of bulk backfill using the MVCC-compliant AddSSTable, the
index is moved into a MERGING state
(added in cockroachdb#75663) in which it receives writes and deletes. Writes
previously captured by the temporary index are then transactionally
merged into the newly added index.

This feature is currently behind a new boolean cluster setting which
default to true. Schema changes that contains both old and new-style
backfills are rejected.

Reverting the default to false will require updating various tests
since many tests depend on the exact index IDs of newly added indexes.

Release note: None

Co-authored-by: Rui Hu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants