Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backupccl: avoid splitting if the split point might be unsafe #109378

Merged

Conversation

lidorcarmel
Copy link
Contributor

@lidorcarmel lidorcarmel commented Aug 23, 2023

Restore may use unsafe keys as split points, which may cause unsafe splits
between column families, which may cause SQL to fail when reading the row, or
worse, return wrong resutls.

This commit avoids splitting on keys that might be unsafe.

See the issue for more info.

Epic: none
Informs: #109483

Release note: None.

@blathers-crl
Copy link

blathers-crl bot commented Aug 23, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@lidorcarmel lidorcarmel force-pushed the lidor_ensure_safe_split_key_issue branch 7 times, most recently from 5bac114 to e53df5b Compare August 30, 2023 17:28
@lidorcarmel lidorcarmel requested a review from dt August 30, 2023 17:28
@lidorcarmel lidorcarmel marked this pull request as ready for review August 30, 2023 17:29
@lidorcarmel lidorcarmel requested review from a team as code owners August 30, 2023 17:29
} else if len(splitAt) != 0 {
newSplitKey = splitAt
}
log.VEventf(ctx, 1, "presplitting new key %+v", newSplitKey)
if err := s.db.AdminSplit(ctx, newSplitKey, expirationTime); err != nil {
log.Infof(ctx, "presplitting on safe key %s", newSplitKey)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you intend to switch this to info? also, I'm not sure if "safe" if meaningful to reader of the logs here, so I might just revert this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you intend to switch this to info?

yes, I was thinking: 10K splits on a node, 200ish bytes per message, gives us 2MB of spam, not great but really nice to see the actual split that restore asked for. Would make debugging the next unsafe split issue easier. WDYT?

I'm not sure if "safe" if meaningful to reader of the logs here

yep, good point, this was "new key" which I also thought doesn't mean much? changed now to just "key".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little wary of adding net new behavior to a backport, including 10k log lines during a restore; seems like more than is strictly required to fix the bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes, you did say that, oops.
done. thanks!

@lidorcarmel lidorcarmel force-pushed the lidor_ensure_safe_split_key_issue branch from e53df5b to 037ca8a Compare August 30, 2023 18:19
@lidorcarmel lidorcarmel added backport-22.2.x backport-23.1.x Flags PRs that need to be backported to 23.1 labels Aug 30, 2023
@lidorcarmel lidorcarmel force-pushed the lidor_ensure_safe_split_key_issue branch from 037ca8a to bfe187b Compare August 30, 2023 20:15
@lidorcarmel
Copy link
Contributor Author

TFTR!
bors r+

@craig
Copy link
Contributor

craig bot commented Aug 30, 2023

🕐 Waiting for PR status (GitHub check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set.

@lidorcarmel
Copy link
Contributor Author

bors cancel

I'll merge after Rui's change is in.

@lidorcarmel
Copy link
Contributor Author

bors r-

@lidorcarmel lidorcarmel force-pushed the lidor_ensure_safe_split_key_issue branch from bfe187b to b30c7b0 Compare August 30, 2023 20:32
@craig
Copy link
Contributor

craig bot commented Aug 30, 2023

Canceled.

// The key might be corrupt, and therefore we cannot guarantee it is a valid
// split key. The restore can still continue without this split. This error
// is not expected after #109483 is fixed.
log.Errorf(ctx, "failed splitting at key %s err: %v", newSplitKey, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but given your changes to EnsureSafeSplitKey, this will be logging the key twice since err also contains it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thank you.

@lidorcarmel
Copy link
Contributor Author

bors r+

@lidorcarmel
Copy link
Contributor Author

bors r-
missed your comment @stevendanna !! fixing.

@craig
Copy link
Contributor

craig bot commented Aug 31, 2023

Canceled.

@lidorcarmel lidorcarmel force-pushed the lidor_ensure_safe_split_key_issue branch from b30c7b0 to dd2fb6e Compare August 31, 2023 14:58
Restore may use unsafe keys as split points, which may cause unsafe splits
between column families, which may cause SQL to fail when reading the row, or
worse, return wrong resutls.

This commit avoids splitting on keys that might be unsafe.

See the issue for more info.

Epic: none
Informs: cockroachdb#109483

Release note: None.
@lidorcarmel lidorcarmel force-pushed the lidor_ensure_safe_split_key_issue branch from dd2fb6e to 069f4d4 Compare August 31, 2023 14:59
@lidorcarmel
Copy link
Contributor Author

bors r+
(I'm also trying to figure out the code coverage issue, hopefully it's not blocking.)

@craig craig bot merged commit 8fb29b9 into cockroachdb:master Aug 31, 2023
@craig
Copy link
Contributor

craig bot commented Aug 31, 2023

Build succeeded:

@blathers-crl
Copy link

blathers-crl bot commented Aug 31, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 069f4d4 to blathers/backport-release-22.2-109378: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants