Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ttljob: fix range decoding bug #78163

Merged
merged 1 commit into from
Mar 23, 2022
Merged

Conversation

otan
Copy link
Contributor

@otan otan commented Mar 21, 2022

Previously, when a range was split by a prefix of a PRIMARY KEY, the TTL
job would fail. This PR rectifies that.

Resolves #78162
Release note: None

Previously, when a range was split by a prefix of a PRIMARY KEY, the TTL
job would fail. This PR rectifies that.

Release note: None
@otan otan requested review from a team and rafiss March 21, 2022 04:14
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @otan)

@otan
Copy link
Contributor Author

otan commented Mar 22, 2022

thanks

bors r=rafiss

@craig
Copy link
Contributor

craig bot commented Mar 22, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 23, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 23, 2022

Build failed:

@otan
Copy link
Contributor Author

otan commented Mar 23, 2022

bors r=rafiss

craig bot pushed a commit that referenced this pull request Mar 23, 2022
77588: kv/bulk: split/scatter after range says it is full r=dt a=dt

Previously we would split-and-scatter after adding "filling" a range so
that as we continued to fill, we would do so to a new range that was
first scattered to a new node. We did this once we had written 48MB to
one range, as determined by not hitting a split while sending that many
bytes during a single flush of the above buffer. 48MB was an OK default
when the default range size was 64MB, but has long since been far below
what we now consider full. This was a minor annoyance (causing merges)
but calling split-and-scatter so much more than is needed has recently
become a more severe issue now that ~every scatter call actually moves
a range and blocks on that snapshot.

Even when 48MB was a reasonable fraction of the 64MB default, this was
just a rough approximation for full that failed to account for the
actual range's configured size and only counted data flushed since the
last explicit Flush().

Instead, this change replaces this "split when i've sent enough" logic
with a new approach that lets the range itself reply to each SST with
its remaining capacity and end key. The caller can then use this to
determine if there is room to send another file to that range or if it
should consider it full, and split if/when another key that would be
added to that ranges span is added to the batch.

This significantly reduces how often ingestion stops to wait on scatter
of a split span with the recently updated scatter behavior, more than
doubling observed speeds of bulk ingestion.

Release justification: high impact fix to new/modified functionality.

Release note (performance improvement): Ranges are split and rebalanced during bulk ingestion only when they become full, reducing excessive splits and subsequent merging and associated rebalancing-related performance impact.


78022: sql/schemachanger: set non-cancelable property on jobs correctly r=ajwerner a=ajwerner

This PR sets the non-cancelability of jobs at the correct moment so that
there is not a phase where the job is cancelable but the state implies that
the job should not be reverted. It also propagates this state correctly to
descriptors so that it can be correctly synthesized during a restore.

Backport will address #76441.

Release note: None

78163: ttljob: fix range decoding bug r=rafiss a=otan

Previously, when a range was split by a prefix of a PRIMARY KEY, the TTL
job would fail. This PR rectifies that.

Resolves #78162
Release note: None

Co-authored-by: David Taylor <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
Co-authored-by: Oliver Tan <[email protected]>
@craig
Copy link
Contributor

craig bot commented Mar 23, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 23, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 23, 2022

Build succeeded:

@craig craig bot merged commit 83898e0 into cockroachdb:master Mar 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ttljob: TTL fails on ranges split by a subset of the PRIMARY KEY
3 participants