Add known issue docs for #75598 #79221

Merged

Conversation

DaveCTurner
Contributor

Adds a description of #75598, and the mitigation, to the release notes
of versions 7.13.2 through 7.14.0.

@DaveCTurner added labels on Oct 15, 2021: >docs [General docs changes]; :Distributed Coordination/Snapshot/Restore [Anything directly related to the `_snapshot/*` APIs]; v7.13.5; v7.16.0; v7.14.3; v7.15.2
@DaveCTurner force-pushed the 2021-10-15-75598-known-issue-docs branch from e690314 to 4ca78e9 on October 15, 2021 08:56
@DaveCTurner marked this pull request as ready for review on October 15, 2021 09:07
@elasticmachine added labels on Oct 15, 2021: Team:Docs [Meta label for docs team]; Team:Distributed (Obsolete) [Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.]
@elasticmachine
Collaborator

Pinging @elastic/es-docs (Team:Docs)

@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner requested a review from jrodewig on October 15, 2021 09:08
Contributor

@jrodewig left a comment

LGTM. Aside from some minor wording nits, I think we should include a snippet for the setting update. Thanks @DaveCTurner!

causing future restore operations to fail. To mitigate this problem, prevent
concurrent snapshot operations by setting
`snapshot.max_concurrent_operations: 1`.
+
Contributor

Since the remediation step is a single API call, I'd include it here. If you'd rather not do that, I'd at least state you can update snapshot.max_concurrent_operations using the update cluster settings API (with a link).

Suggested change
+
+
[source,console]
----
PUT _cluster/settings
{
  "persistent" : {
    "snapshot.max_concurrent_operations" : 1
  }
}
----
+

Contributor Author

👍 good idea.
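
For anyone applying this mitigation from a script rather than from the console, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not part of the PR: the base URL `http://localhost:9200`, the absence of authentication, and the helper names `build_settings_request` and `apply_mitigation` are all assumptions made for illustration.

```python
import json
import urllib.request

# Assumption: an unsecured cluster reachable at this address; adjust as needed.
DEFAULT_URL = "http://localhost:9200"


def build_settings_request(base_url=DEFAULT_URL, max_ops=1):
    """Build (but do not send) the PUT _cluster/settings request."""
    # Same body as the console snippet suggested in the review.
    body = {"persistent": {"snapshot.max_concurrent_operations": max_ops}}
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def apply_mitigation(base_url=DEFAULT_URL):
    """Send the request and return the parsed JSON response (hypothetical helper)."""
    with urllib.request.urlopen(build_settings_request(base_url)) as resp:
        return json.load(resp)
```

Building the request separately from sending it keeps the payload easy to inspect before anything touches a live cluster. A secured cluster would additionally need credentials and TLS configuration, which this sketch leaves out.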

Comment on lines 150 to 155
* Snapshot and restore: If a running snapshot is cancelled while a
previously-started snapshot is still ongoing and a later snapshot is enqueued
then there is a risk that some shard data may be lost from the repository,
causing future restore operations to fail. To mitigate this problem, prevent
concurrent snapshot operations by setting
`snapshot.max_concurrent_operations: 1`.
Contributor

Minor edits to reword some passive voice. There is still some passive voice in here, but I think this reads better. Feel free to ignore if wanted tho.

Suggested change
* Snapshot and restore: If you cancel a running snapshot while a
previously-started snapshot is still ongoing and a later snapshot is enqueued,
the repository may lose some shard data. This can cause future restore
operations to fail. To mitigate this problem, set
`snapshot.max_concurrent_operations` to `1` to prevent concurrent snapshot
operations.

Contributor Author

I've left the first bit of passive voice in there ("if a running snapshot is cancelled" etc) since users will typically hit this when snapshots are being run by other components (SLM or ILM for instance) rather than when running snapshots themselves.
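
Because snapshots are often started by other components, it can be useful to confirm that the mitigation discussed in this thread is actually in effect. The sketch below is an illustration only: it assumes the input is the parsed body of `GET _cluster/settings?flat_settings=true`, and the function name `mitigation_in_place` is hypothetical.

```python
SETTING = "snapshot.max_concurrent_operations"


def mitigation_in_place(settings):
    """Return True if the effective value of the setting is 1.

    `settings` is assumed to be the parsed body of
    GET _cluster/settings?flat_settings=true, which has "persistent"
    and "transient" sections keyed by flat setting names.
    """
    # Transient settings take precedence over persistent ones.
    for scope in ("transient", "persistent"):
        value = settings.get(scope, {}).get(SETTING)
        if value is not None:
            # Values may come back as strings, so normalise before comparing.
            return int(value) == 1
    # Not set in either scope: the default applies, which allows concurrency.
    return False
```

For example, `mitigation_in_place({"persistent": {SETTING: "1"}, "transient": {}})` evaluates to `True`, while a response with no explicit value evaluates to `False`. The sketch does not look at values set in `elasticsearch.yml`, which this API call does not report by default.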

@DaveCTurner force-pushed the 2021-10-15-75598-known-issue-docs branch from e673f36 to b211069 on October 15, 2021 14:26
@DaveCTurner
Contributor Author

Sorry, I messed up a merge and brought in some commits from a different branch. Force-pushed to fix it, but didn't change any reviewed commits.

Contributor

@jrodewig left a comment

No worries at all. Still looks good. Thanks!

@DaveCTurner merged commit afc3814 into elastic:7.x on Oct 15, 2021
@DaveCTurner deleted the 2021-10-15-75598-known-issue-docs branch on October 15, 2021 14:40
DaveCTurner added a commit that referenced this pull request Oct 15, 2021
Adds a description of #75598, and the mitigation, to the release notes
of versions 7.13.2 through 7.14.0.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Nov 11, 2021
The known-issue docs give the impression that an upgrade will restore
the lost data in the repository. This isn't the case, so this commit
clarifies this in the docs.

Relates elastic#73456
Relates elastic#75598
Relates elastic#79221
DaveCTurner added a commit that referenced this pull request Nov 15, 2021
The known-issue docs give the impression that an upgrade will restore
the lost data in the repository. This isn't the case, so this commit
clarifies this in the docs.

Relates #73456
Relates #75598
Relates #79221
Labels
backport pending; :Distributed Coordination/Snapshot/Restore [Anything directly related to the `_snapshot/*` APIs]; >docs [General docs changes]; Team:Distributed (Obsolete) [Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.]; Team:Docs [Meta label for docs team]; v7.13.5; v7.14.3; v7.15.2; v7.16.0
3 participants