Add snapshot stress tests #78596

DaveCTurner
Contributor

Adds SnapshotStressTestsIT, a test that performs a wide variety of
concurrent snapshot-related operations to explore the corners of the
snapshot state machine in a randomized fashion:

  • indexing docs, deleting and re-creating the indices
  • restarting nodes
  • removing and adding repositories
  • taking snapshots (sometimes partial), cloning them, and deleting them

A system of shared/exclusive locks ensures that each operation is only
attempted when it should succeed. None of the operations block: if the
necessary locks aren't all available then the operation releases the
ones it has acquired and tries again later. The test finishes after
completing a certain number of snapshots or after a certain time has
elapsed.
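
For readers unfamiliar with the pattern, the non-blocking "acquire everything or release and retry later" idea can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the class `LockSketch`, the helper `tryAcquireAll`, and the choice of a `Semaphore` with `Integer.MAX_VALUE` permits to model a shared/exclusive lock are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical sketch of the locking pattern described above; not the test's actual code.
final class LockSketch {

    /** Shared/exclusive lock that never blocks: each try succeeds at once or returns null. */
    static final class SharedExclusiveLock {
        // Assumption: a shared hold takes one permit, an exclusive hold takes all of them.
        private final Semaphore permits = new Semaphore(Integer.MAX_VALUE);

        /** Returns a releaser on success, or null if an exclusive holder exists. */
        Runnable tryShared() {
            return permits.tryAcquire(1) ? () -> permits.release(1) : null;
        }

        /** Returns a releaser on success, or null if anyone else holds the lock. */
        Runnable tryExclusive() {
            return permits.tryAcquire(Integer.MAX_VALUE) ? () -> permits.release(Integer.MAX_VALUE) : null;
        }
    }

    /**
     * Tries every lock in order. If one is unavailable, releases the ones already
     * acquired and returns null so that the caller can try again later.
     */
    static Runnable tryAcquireAll(List<Supplier<Runnable>> attempts) {
        final List<Runnable> held = new ArrayList<>();
        for (Supplier<Runnable> attempt : attempts) {
            final Runnable release = attempt.get();
            if (release == null) {
                held.forEach(Runnable::run); // back out of the partial acquisition
                return null;
            }
            held.add(release);
        }
        return () -> held.forEach(Runnable::run);
    }

    public static void main(String[] args) {
        SharedExclusiveLock repository = new SharedExclusiveLock();
        SharedExclusiveLock index = new SharedExclusiveLock();
        SharedExclusiveLock node = new SharedExclusiveLock();

        // Illustrative lock sets only; the real test chooses its own.
        List<Supplier<Runnable>> snapshotLocks = List.of(repository::tryShared, index::tryShared);
        List<Supplier<Runnable>> removeRepoLocks = List.of(node::tryExclusive, repository::tryExclusive);

        Runnable snapshot = tryAcquireAll(snapshotLocks);
        System.out.println("snapshot got its locks: " + (snapshot != null));

        // The repository is in use, so this attempt backs out, releasing the node lock too.
        Runnable removeRepo = tryAcquireAll(removeRepoLocks);
        System.out.println("remove-repository got its locks: " + (removeRepo != null));

        snapshot.run(); // the snapshot finishes and releases its locks
        removeRepo = tryAcquireAll(removeRepoLocks);
        System.out.println("remove-repository retry got its locks: " + (removeRepo != null));
    }
}
```

Returning a releaser instead of blocking means an operation that cannot proceed holds nothing while it waits to be rescheduled, which is one way to avoid deadlock between operations that need overlapping sets of locks.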

@DaveCTurner DaveCTurner added >test Issues or PRs that are addressing/adding tests :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.16.0 labels Oct 4, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Oct 4, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Member

@original-brownbear original-brownbear left a comment

As a longtime user of this, LGTM :)

+ Again, thanks so much for this David!

Member

@tlrx tlrx left a comment

LGTM, those tests are great! I left minor comments, feel free to address them.

@DaveCTurner DaveCTurner added auto-backport-and-merge auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) labels Oct 4, 2021
@elasticsearchmachine elasticsearchmachine merged commit bfb5218 into elastic:master Oct 4, 2021
@DaveCTurner DaveCTurner deleted the 2021-10-04-snapshot-stress-tests branch October 4, 2021 12:06
@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
7.x

DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Oct 4, 2021
DaveCTurner added a commit that referenced this pull request Oct 18, 2021

Backport of #78596
Labels
  • auto-merge-without-approval: Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!)
  • backport pending
  • :Distributed Coordination/Snapshot/Restore: Anything directly related to the `_snapshot/*` APIs
  • Team:Distributed (Obsolete): Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
  • >test: Issues or PRs that are addressing/adding tests
  • v7.16.0
  • v8.0.0-beta1
6 participants