Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backport-2.1: one week of merge PRs #29082

Conversation

benesch
Copy link
Contributor

@benesch benesch commented Aug 27, 2018

Backport:

Please see individual PRs for details.

/cc @cockroachdb/release

dt and others added 19 commits August 26, 2018 23:20
previously we'd include the pre-rendered expression, which is particularly useless with placeholders.

Fixes cockroachdb#28992.

Release note: none.
SET CLUSTER SETTING does a transactional write to the system table.
However settings values are _read_ from an in-memory cache which is
not updated until the system table write is gossiped and that gossip
update is processed by the settings worker.

Release note (sql change): SET CLUSTER SETTING cannot be used inside a transaction.
Settings values are cached in memory and that cache is updated via gossip after a SET commits.
That delay -- for the update to arrive via gosssip and be processed and the cache updated --
can mean that a statement executed immediately following a SET CLUSTER SETTING may execute
before the new setting takes effect, and knowing when it is safe to execute a statement that
uses the new setting is tricky.

Instead, we can make SET CLUSTER SETTING _wait_ until it sees the value it set returned when
reading the settings cache (or return an error if it does not appear after a timeout of 30s).
This should make it easier to set a setting and then use that value in subsequent statements.

Release note (sql change): SET CLUSTER SETTING attempts to wait until change has been gossiped.
This TODO suggested removing maybeWatchForMerge by teaching the LHS
replica to reach into the RHS replica after a merge committed to unblock
requests. It failed to consider that we'd need to do the same if the
merge aborted. We don't presently have an abort trigger, so this is
excessively difficult. Simply discard the TODO; the status quo is fine.

Release note: None
Store.SplitRange was still using the old term "range" to refer to
replicas. Also use "left" and "right" instead of "orig" and "new" for
consistency with Store.MergeRange.

Release note: None
It is important that the in-memory copy of a replica's range descriptor
exactly match the on-disk copy. Previously, during splits/merges, we
would reconstruct the updates to the in-memory descriptor piecemeal.
This is dangerous, especially in mixed version clusters, where nodes can
disagree about what updates to the descriptor to perform. Notably,
splits only update the generation counter in v2.1.

Instead of trying to reconstruct the updates piecemeal during a split
or merge, simply adopt the updated descriptor from the split/merge
trigger wholesale.

We'd ideally adjust replica changes to operate the same way.
Unfortunately the updated descriptor is not currently included in the
ChangeReplicasTrigger, so the migration is rather involved.

Release note: None
Merges will explode if used in a mixed-version cluster. Gate them behind
a cluster setting.

For extra safety, also gate increments of the new generation counter in
the range descriptor behind the same cluster setting. It's not clear
exactly what would go wrong, if anything, if mixed-version clusters
incremented the generation counter, but better to be extra cautious.

Release note: None
When applying a merge, teach the leaseholder of the LHS range to update
its timestamp cache for the keyspace previously owned by the RHS range
appropriately.

Release note: None
TestStoreRangeMergeDuringShutdown shuts down the multiTestContext when
it applies a lease for the RHS. In rare cases, the lease application can
get replayed, which previously caused the multiTestContext to get
shutdown twice, which panics. Add additional state to prevent this case.

Fix cockroachdb#28894.

Release note: None
GetSnapshotForMerge no longer fetches a snapshot of the RHS of a merge.
Rename the method Subsume to better reflect its purpose: to freeze a
range before it is subsumed by its left-hand neighbor.

Release note: None
The rules for generating the settings HTML table got broken at some
point. Fix them by using target-specific variables properly: they only
apply to prerequisites of the declaring target, and are only resolved
within recipes.

Release note: None
TestStoreRangeMergeTimestampCacheCausality could time out if the merge
transaction retried, which occured about one out of every two hundred
runs, because it would try to send multiple values over a channel whose
buffer had room for only one value. Deflake the test by remembering in a
variable the value from the last merge transaction to execute rather
than using channels.

Release note: None
During a merge, it is possible for the RHS lease to change hands, e.g.,
when the original leaseholder dies and another member of the range
acquires the lease. In this case, the new leaseholder is responsible for
checking for a deletion intent on its local range descriptor; if it
discovers such an intent, a merge is in progress and the leaseholder is
required to block all traffic unless it can prove that the merge
aborted.

The previous implementation of this scheme had a small window in which
the new leaseholder had installed a valid lease but had not yet
installed a mergeComplete channel to block all traffic. This race was
never seen in practice, but it could, in theory, lead to a
serializability violation.

Reorder the flow post-lease acquisition so that checking for an
in-progress merge occurs before the new lease is installed.

Release note: None
Now that merges do not include a snapshot of the RHS data in the merge
trigger, we no longer need a setting limiting the size of the RHS of a
merge.

Release note: None
Merges are relatively expensive. Set the merge queue interval to one
second so we avoid processing too many merges at once. Introduce a
cluster setting to allow users/tests to adjust the merge queue interval
if they so choose.

Fix cockroachdb#27769.

Release note: None
The retry loop in AdminSplit can span many seconds. In that time, the
replica may lose its lease, or the range might be merged away entirely.
In either of those cases, the split can never succeed, and so the retry
loop needs to give up.

The loop was properly exiting if it noticed it lost its lease, but a
range can get merged away without losing its lease. The final lease on
that range remains valid until the liveness epoch it is tied to expires.
Teach the loop to notice that condition too by checking
Replica.IsDestroyed on every turn of the loop.

Release note: None
Teach TestSystemZoneConfigs to install zone configs via SQL, rather than
the hacky testing override system, which interacts poorly with the
forthcoming on-by-default merge queue.

Release note: None
Turn off the merge queue in all tests that need it. The actual default
will be changed in a separate PR so that this commit can be safely
backported to release-2.1.

Release note: None
Splitting while the merge queue is enabled is almost certainly a user
mistake. Add a best-effort check to prevent users from splitting while
the merge queue is enabled. Users can override the check and request a
split anyway by twiddling a new session variable,
experimental_force_split_at.

We have plans to eventually make the splits created by SPLIT AT
"sticky", so that the merge queue does not immediately merge them away,
but not in time for 2.1.

Release note (sql change): ALTER TABLE ... SPLIT AT now produces an
error if executed while the merge queue is enabled, as the merge queue
is likely to immediately discard any splits created by the statement.
@benesch benesch requested review from tbg and a team August 27, 2018 03:25
@benesch benesch requested a review from a team as a code owner August 27, 2018 03:25
@benesch benesch requested review from a team August 27, 2018 03:25
@benesch benesch requested a review from a team as a code owner August 27, 2018 03:25
@benesch benesch requested review from a team August 27, 2018 03:25
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@knz
Copy link
Contributor

knz commented Aug 27, 2018

LGTM insofar that the picked commits are identical to the linked PRs. I will let the other reviewer(s) judge whether it is safe to cherrypick.

@tbg
Copy link
Member

tbg commented Aug 27, 2018

LGTM, let's not let this beast grow any larger.

@benesch
Copy link
Contributor Author

benesch commented Aug 27, 2018

bors r=knz,tschottdorf

craig bot pushed a commit that referenced this pull request Aug 27, 2018
29033: backport-2.1: storage: improve consistency checker diff, add debug decode-key r=a-robinson a=tschottdorf

Backport 2/2 commits from #29012.

/cc @cockroachdb/release

---

Avoid spilling non-printable characters and generally garbage into the
logs.

Release note: None


29082: backport-2.1: one week of merge PRs r=knz,tschottdorf a=benesch

Backport:
  * 1/1 commits from "sql: put rendered setting value in event log entry" (#29014)
  * 2/2 commits from " sql,settings: make SET CLUSTER SETTING wait for value to update, disallow in txn" (#29063)
  * 1/1 commits from "storage: discard a unworthwhile merge TODO" (#28885)
  * 3/3 commits from "storage: gate merges behind a cluster setting " (#28865)
  * 1/1 commits from "storage: update the tscache appropriately after a merge" (#28793)
  * 1/1 commits from "storage: avoid stopping twice in merge test" (#28902)
  * 1/1 commits from "roachpb,storage: rename GetSnapshotForMerge to Subsume" (#28910)
  * 1/1 commits from "build: fix generation rules for settings.html" (#28884)
  * 1/1 commits from "storage: deflake TestStoreRangeMergeTimestampCacheCausality" (#28928)
  * 1/1 commits from "storage: check for in-progress merge before installing new lease" (#28894)
  * 6/7 commits from " storage: enable merge queue by default" (#28961)

Please see individual PRs for details.

/cc @cockroachdb/release


Co-authored-by: Tobias Schottdorf <[email protected]>
Co-authored-by: David Taylor <[email protected]>
Co-authored-by: Nikhil Benesch <[email protected]>
@craig
Copy link
Contributor

craig bot commented Aug 27, 2018

Build succeeded

@craig craig bot merged commit 5e8bd23 into cockroachdb:release-2.1 Aug 27, 2018
@knz
Copy link
Contributor

knz commented Aug 27, 2018

🎉

@benesch benesch deleted the backport2.1-29014-29063-28885-28865-28793-28902-28910-28884-28928-28894-28961 branch August 27, 2018 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants