Snapshot Stability Fixes #39502

Merged

original-brownbear
Member

@original-brownbear original-brownbear added the >bug, :Distributed Coordination/Snapshot/Restore (anything directly related to the `_snapshot/*` APIs), and backport labels Feb 28, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@@ -437,14 +455,21 @@ public SnapshotsInProgress(StreamInput in) throws IOException {
if (in.getVersion().onOrAfter(REPOSITORY_ID_INTRODUCED_VERSION)) {
repositoryStateId = in.readLong();
}
final String failure;
if (in.getVersion().onOrAfter(Version.V_6_7_0)) {
Member Author

We will have to mute the BwC tests before merging this, then adjust this version to 6.7 in master afterwards. Master currently has 7.0.0 here.

@@ -476,6 +501,9 @@ public void writeTo(StreamOutput out) throws IOException {
if (out.getVersion().onOrAfter(REPOSITORY_ID_INTRODUCED_VERSION)) {
out.writeLong(entry.repositoryStateId);
}
if (out.getVersion().onOrAfter(Version.V_6_7_0)) {
Member Author

We will have to mute the BwC tests before merging this, then adjust this version to 6.7 in master afterwards. Master currently has 7.0.0 here.
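
For readers less familiar with the pattern in the two hunks above, here is a minimal sketch of version-gated serialization. It is not Elasticsearch code: `VersionGatedWire`, the numeric version ids, and the presence-flag encoding of the `failure` field are assumptions made for illustration, with `java.io` data streams standing in for `StreamInput`/`StreamOutput`. The point is that the reader and the writer must gate on the same version, which is presumably why the constant in master has to be adjusted once this lands in 6.7.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a version-gated wire format; not Elasticsearch code.
final class VersionGatedWire {
    // Assumed numeric version ids, for illustration only.
    static final int V_6_6_0 = 60600;
    static final int V_6_7_0 = 60700;

    // Serializes the entry in the format a peer on peerVersion expects to read.
    static byte[] write(long repositoryStateId, String failure, int peerVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(repositoryStateId);
            if (peerVersion >= V_6_7_0) {
                // New optional field: a presence flag, then the value if present.
                out.writeBoolean(failure != null);
                if (failure != null) {
                    out.writeUTF(failure);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Deserializes an entry written for peerVersion and returns the failure, or null.
    static String readFailure(byte[] data, int peerVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readLong(); // repositoryStateId, always on the wire in this sketch
            if (peerVersion >= V_6_7_0) {
                return in.readBoolean() ? in.readUTF() : null;
            }
            return null; // older peers never send the field
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, a peer below the gate version neither sends nor receives the extra field, so both sides stay in sync only as long as they agree on the gate.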

UpdateSnapshotStatusAction(TransportService transportService, ClusterService clusterService,
ThreadPool threadPool, ActionFilters actionFilters, IndexNameExpressionResolver indexNameExpressionResolver) {
super(
settings, SnapshotShardsService.UPDATE_SNAPSHOT_STATUS_ACTION_NAME, transportService, clusterService, threadPool,
Member Author

Passing `settings` here is the only difference between this PR and master in this file; master no longer has that argument.


// Set of snapshots that are currently being ended by this node
private final Set<Snapshot> endingSnapshots = Collections.synchronizedSet(new HashSet<>());

@Inject
public SnapshotsService(Settings settings, ClusterService clusterService, IndexNameExpressionResolver indexNameExpressionResolver,
Member Author

This class should now be exactly the same as in master.
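
As a side note on the `endingSnapshots` field in the hunk above: a synchronized set like this is commonly used as a claim-once guard. The sketch below assumes that usage; it is not taken from `SnapshotsService`. `EndingGuard`, `tryBeginEnding`, and `doneEnding` are hypothetical names, and `String` stands in for the `Snapshot` type.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a claim-once guard; not the SnapshotsService implementation.
final class EndingGuard {
    // Same construction as in the hunk; String stands in for the Snapshot type.
    private final Set<String> endingSnapshots = Collections.synchronizedSet(new HashSet<>());

    // Returns true only for the caller that claims the snapshot first.
    boolean tryBeginEnding(String snapshot) {
        // add() on a synchronized set is atomic and returns false if the element is already present.
        return endingSnapshots.add(snapshot);
    }

    // Releases the claim once ending the snapshot has finished (successfully or not).
    void doneEnding(String snapshot) {
        endingSnapshots.remove(snapshot);
    }
}
```

Under that assumption, a second attempt to end the same snapshot on this node is rejected until the first attempt has released its claim.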

@original-brownbear original-brownbear removed the request for review from ywelsch February 28, 2019 11:08
@original-brownbear
Member Author

Seems I missed a BwC spot; looking into it now.

* Backport of various snapshot stability fixes from `master` to `6.7`
* Includes elastic#38368, elastic#38025 and elastic#37612
@original-brownbear original-brownbear force-pushed the backport-snapshot-fixes-6.7 branch from 00266ba to 56b0f19 Compare February 28, 2019 11:13
Member Author

@original-brownbear original-brownbear left a comment

@ywelsch I added some notes and fixed BwC here; this should be good to review now :)

- final ShardId shardId,
- final ShardSnapshotStatus status,
- final DiscoveryNode masterNode) {
+ void sendSnapshotShardUpdate(Snapshot snapshot, ShardId shardId, ShardSnapshotStatus status, DiscoveryNode masterNode) {
try {
if (masterNode.getVersion().onOrAfter(Version.V_6_1_0)) {
Member Author

I simply made this a hard condition between the old and the new path; the old path doesn't use the de-duplicator.
I figured the optimization isn't really important in the rolling-upgrade case, and this makes future backports a lot easier than adding conditionals for it to the old code, right?

Contributor

++
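
The trade-off discussed in this thread can be sketched roughly as follows. This is a simplified illustration, not the actual de-duplicator from the backported changes: `StatusUpdateSender`, its string request keys, the numeric version id, and the `sent` list standing in for the transport layer are all assumptions. In particular, this sketch drops a duplicate request outright, whereas a real de-duplicator may instead attach the caller to the in-flight request.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of a hard version branch around a request de-duplicator.
final class StatusUpdateSender {
    static final int V_6_1_0 = 60100; // assumed numeric version id

    // Requests currently in flight on the new path, keyed by request identity.
    private final ConcurrentMap<String, Boolean> inFlight = new ConcurrentHashMap<>();
    // Records what was "sent"; stands in for the transport layer.
    final List<String> sent = new CopyOnWriteArrayList<>();

    void sendUpdate(String requestKey, int masterVersion) {
        if (masterVersion >= V_6_1_0) {
            // New path: skip the request if an identical one is still in flight.
            if (inFlight.putIfAbsent(requestKey, Boolean.TRUE) == null) {
                sent.add("new:" + requestKey);
            }
        } else {
            // Old path: no de-duplication, every call goes out.
            sent.add("old:" + requestKey);
        }
    }

    // Called once the master has answered; the same request may then be sent again.
    void onResponse(String requestKey) {
        inFlight.remove(requestKey);
    }
}
```

Keeping the de-duplication entirely inside the new branch means the old branch can stay byte-for-byte what it was, at the cost of some redundant requests while the master is still on an old version.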

@@ -452,14 +424,13 @@ private void syncShardStatsOnNewMaster(ClusterChangedEvent event) {
// but we think the shard is done - we need to make new master know that the shard is done
logger.debug("[{}] new master thinks the shard [{}] is not completed but the shard is done locally, " +
"updating status on the master", snapshot.snapshot(), shardId);
- notifySuccessfulSnapshotShard(snapshot.snapshot(), shardId, localNodeId, masterNode);
+ notifySuccessfulSnapshotShard(snapshot.snapshot(), shardId, masterNode);
Member Author

All this passing down of `masterNode` differs from the 7.x/master version, since we need the BwC path when sending the status update below.

Contributor

@ywelsch ywelsch left a comment

LGTM

@ywelsch ywelsch added the v6.7.0 label Mar 1, 2019
@original-brownbear
Member Author

@ywelsch thanks!

@original-brownbear original-brownbear merged commit 4b725e0 into elastic:6.7 Mar 1, 2019
@original-brownbear original-brownbear deleted the backport-snapshot-fixes-6.7 branch March 1, 2019 09:11
original-brownbear added 3 commits that referenced this pull request Mar 1, 2019
Labels
backport, >bug, :Distributed Coordination/Snapshot/Restore (anything directly related to the `_snapshot/*` APIs), v6.7.0
3 participants