skip restoring global state when one is absent in a snapshot #82075
Currently it is possible to restore a snapshot that was created without global state with `include_global_state: true`. The flag is silently ignored in that case. This change logs a warning when this happens.
Pinging @elastic/es-distributed (Team:Distributed)
100% on board with the change, I'd like to make it a little simpler mechanically. I could see how that would mess with our testing, but maybe we could instead just add an integration test that verifies that the global state file isn't even read? (happy to help with that on another channel)
@@ -301,6 +301,9 @@ private void startRestore(
// Make sure that we can restore from this snapshot
validateSnapshotRestorable(repository.getMetadata(), snapshotInfo);

// Make sure that we can restore cluster state from this snapshot
validateGlobalStateRestorable(request, snapshot, snapshotInfo);
This is a little confusing in terms of method name and mechanics (IMO).
Could we maybe just move the check into the conditional below
if (request.includeGlobalState()) {
and not mutate the request object instead?
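A minimal sketch of the non-mutating alternative suggested here: compute the effective decision once and pass it along, rather than rewriting the request. The class and method names are hypothetical, and plain `boolean`/`Boolean` parameters stand in for `RestoreSnapshotRequest` and `SnapshotInfo`, so this is not the actual Elasticsearch code.

```java
// Hypothetical helper; not part of the Elasticsearch codebase.
final class GlobalStateRestoreCheck {
    private GlobalStateRestoreCheck() {}

    /**
     * Returns true only when the request asks for global state AND the
     * snapshot actually contains it.
     *
     * @param requested              stand-in for request.includeGlobalState()
     * @param snapshotHasGlobalState stand-in for snapshotInfo.includeGlobalState();
     *                               assumed to be nullable
     */
    static boolean shouldRestoreGlobalState(boolean requested, Boolean snapshotHasGlobalState) {
        // Boolean.TRUE.equals(...) is null-safe and compares by value,
        // unlike a reference comparison against Boolean.TRUE.
        return requested && Boolean.TRUE.equals(snapshotHasGlobalState);
    }
}
```

Every call site that currently reads the flag from the request would need to use this computed value instead, which is the trade-off raised in the reply.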
I was thinking about this, but that conditional is not the only usage: there is also one in getFeatureStatesToRestore, and there could be more in the future.
static void validateGlobalStateRestorable(RestoreSnapshotRequest request, Snapshot snapshot, SnapshotInfo snapshotInfo) {
    if (request.includeGlobalState() && snapshotInfo.includeGlobalState() != Boolean.TRUE) {
        request.includeGlobalState(false);
        logger.warn("[{}] the snapshot was created without global state, skipping restoring global sate", snapshot);
Maybe word this explicitly and say "[{}] was created without global state but restore request [{}] asks for global state restore explicitly, skipping global state restore"
LGTM, if CI is happy I'm happy.
Maybe relabel this to >non-issue though, it's not really a user-facing bug :)
9b62e0f this scares me a little now, are we changing behavior after all?
Yes, this affects
I think we should fail rather than keep silently ignoring?
static void maybeFixRestoreGlobalStateFlag(RestoreSnapshotRequest request, Snapshot snapshot, SnapshotInfo snapshotInfo) {
    if (request.includeGlobalState() && snapshotInfo.includeGlobalState() != Boolean.TRUE) {
        request.includeGlobalState(false);
        logger.warn(
Should we rather fail out here? Seems like an inadvertent use of a snapshot?
@henningandersen yes, @idegtiarenko and I were on the same page on this as well, but figured we'd have to ask you here since it's technically an API-breaking change? :) If you're good with that kind of change, ++ I think we should just straight opt for failing.
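For comparison, a rough sketch of the "fail rather than silently ignore" variant discussed above. The class name, method name, message text, and the use of `IllegalArgumentException` are all assumptions made for illustration; the real change may use a different exception type and wording.

```java
// Hypothetical helper; not part of the Elasticsearch codebase.
final class StrictGlobalStateCheck {
    private StrictGlobalStateCheck() {}

    /**
     * Rejects a restore that asks for global state when the snapshot has none.
     *
     * @param snapshotName           used only for the error message
     * @param requested              stand-in for request.includeGlobalState()
     * @param snapshotHasGlobalState stand-in for snapshotInfo.includeGlobalState();
     *                               assumed to be nullable
     */
    static void ensureGlobalStateRestorable(String snapshotName, boolean requested, Boolean snapshotHasGlobalState) {
        if (requested && !Boolean.TRUE.equals(snapshotHasGlobalState)) {
            // IllegalArgumentException is a placeholder exception type.
            throw new IllegalArgumentException(
                "[" + snapshotName + "] cannot restore global state: the snapshot was created without global state"
            );
        }
    }
}
```

Compared with the warning-only approach, this surfaces the mismatch to the caller immediately, at the cost of breaking requests that previously succeeded.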
Closing in favor of: #82037
Currently it is possible to restore with include_global_state: true from a snapshot that does not have a global state.
This behavior will nullify the cluster state once #81373 is merged.
This PR skips the global state restore and logs a warning when this happens.
Related to #82019