Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

skip restoring global state when one is absent in a snapshot #82075

Closed

Conversation

idegtiarenko
Copy link
Contributor

Currently it is possible to restore with include_global_state: true from a snapshot that does not have a global state.
This behavior will nullify cluster state once #81373 is merged.
This pr adds skips global state restore and logs a warning when this happens.

related to #82019

Currently it is possible to restore snapshot that was created
without global state with `include_global_state: true`. This
is silently ignored now. This change warns when this is happening.
@idegtiarenko idegtiarenko added >bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. auto-backport-and-merge v8.1.0 labels Dec 24, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Copy link
Member

@original-brownbear original-brownbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% on board with the change, I'd like to make it a little simpler mechanically. I could see how that would mess with our testing, but maybe we could instead just add an integration test that verifies that the global state file isn't even read? (happy to help with that on another channel)

@@ -301,6 +301,9 @@ private void startRestore(
// Make sure that we can restore from this snapshot
validateSnapshotRestorable(repository.getMetadata(), snapshotInfo);

// Make sure that we can restore cluster state from this snapshot
validateGlobalStateRestorable(request, snapshot, snapshotInfo);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little confusing in terms of method name and mechanics (IMO).

Could we maybe just move the check into the conditional below

if (request.includeGlobalState()) {

and not mutate the request object instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about this, however that is not the only usage. there is also one in getFeatureStatesToRestore and could be more in future.

static void validateGlobalStateRestorable(RestoreSnapshotRequest request, Snapshot snapshot, SnapshotInfo snapshotInfo) {
if (request.includeGlobalState() && snapshotInfo.includeGlobalState() != Boolean.TRUE) {
request.includeGlobalState(false);
logger.warn("[{}] the snapshot was created without global state, skipping restoring global sate", snapshot);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe word this explicitly and say "[{}] was created without global state but restore request [{}] asks for global state restore explicitly, skipping global state restore"

Copy link
Member

@original-brownbear original-brownbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, if CI is happy I'm happy.

Maybe relabel this to >non-issue though, it's not really a user facing bug :)

@original-brownbear
Copy link
Member

9b62e0f this scares me a little now, are we changing behavior after all?

@idegtiarenko
Copy link
Contributor Author

Yes, this affects org.elasticsearch.snapshots.RestoreService#getFeatureStatesToRestore as includeGlobalState is used there

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should fail rather than keep silently ignoring?

static void maybeFixRestoreGlobalStateFlag(RestoreSnapshotRequest request, Snapshot snapshot, SnapshotInfo snapshotInfo) {
if (request.includeGlobalState() && snapshotInfo.includeGlobalState() != Boolean.TRUE) {
request.includeGlobalState(false);
logger.warn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rather fail out here? Seems like an inadvertent use of a snapshot?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@henningandersen yes me and @idegtiarenko were one the same page on this as well but figured we'd have to ask you here since it's technically an API breaking change? :) If you're good with that kind of change, ++ I think we should just straight opt for failing.

@idegtiarenko
Copy link
Contributor Author

Closing in favor of: #82037

@idegtiarenko idegtiarenko deleted the 82019_skip_global_restore branch January 17, 2022 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >non-issue Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.0.0 v8.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants