Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Eventually Consistent Mock Repository for SnapshotResiliencyTests #40893

Conversation

original-brownbear
Copy link
Member

  • Implement simple simulation of an eventually consistent blob repository to be able to reproduce issues resulting from S3 Snapshot Repository Erroneously Assumes Consistent List Operation #38941. See new Javadoc added here for the exact behavior simulated.
    • Added a todo to make the behavior include inconsistency over a longer period of time and specifically over multiple back to back snapshots and deletes that include node failures with concurrent indexing (currently we test this, but without changing the data in the indices snapshotted so we're not affected by the eventual consistent list operation on shard directories that would break things).
  • Fortunately, as a result of Name Snapshot Data Blobs by UUID #40652 we can now merge this in as the current tests are able to deal with S3 semantics since they don't rely on consistent list operations in what they test (they don't do any updates to a shard's segment files, so it all seems to work out fine (I ran 5k iterations of each test to confirm)).

@original-brownbear original-brownbear added >test Issues or PRs that are addressing/adding tests :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.2.0 labels Apr 5, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed


@Override
public InputStream readBlob(String name) throws IOException {
ensureReadAfterWrite(name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, how we're simulating eventual consistency for non-first write operation here.
If there is pendingWrite, it seems that we're just executing it and perform read of the latest pendingWrite.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair point, thanks for spotting this! Practically it doesn't change anything (fortunately) since the index.latest blob is the only blob we ever overwrite (and we don't use that blob in normal blob repository operation) but I'll fix this shortly :)

@original-brownbear
Copy link
Member Author

@andrershov fixed the incorrect consistency in e320ebc (I added a todo to make the simulation even less consistent, but I'm actually not sure that it's needed).

I also and (more importantly) fixed an issue with the extra0 files that Lucene leaves around that started breaking this test now that #40857 is merged. Technically that should/could be a separate PR but since it's an issue that's so much more pronounced for the async repository I put it in here since even without the extra0file we could run into empty trees since we have no guarantees that the delete for the directory runs after the deletes for the individual files.
This isn't a problem in a real (S3) repository because there isn't even any folders in S3 anyway.

@original-brownbear
Copy link
Member Author

Lets get #41332 in first before we fix the remaining things here :)

Copy link
Contributor

@andrershov andrershov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@original-brownbear If we truly want to emulate eventual consistency, we can just remember all writes done to some blob path as a list of byte[] in memory and each time read is executed, we can return random write.
So I would rather have EventualConsistentInMemoryBlobContainer implementing BlobContainer interface and use it in SnapshotResiliencyTests.
The implementation that you're proposing is hard to follow, contains bugs (as we figured out yesterday) and does not fully resemble AWS consistency model.
Also, I don't think it's a good idea to emulate only those AWS consistency phenomena that we currently support. I think we should implement proper eventual consistent blob container and plug it in, once we think all bugs (that assume stronger consistency) in our code are addressed.

@jpountz jpountz added v7.4.0 and removed v7.3.0 labels Jul 3, 2019
Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've asked for more tests

if (writes.isEmpty()) {
throw new NoSuchFileException(blobPath);
}
if (readBeforeWrite == false && writes.size() == 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does not take deletions into account.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shows that the class here needs unit tests that validates its behavior.

@Override
public void delete() {
if (closed.get()) {
throw new AssertionError("Blobstore is closed already");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps factor this out into an ensureNotClosed() method

@original-brownbear
Copy link
Member Author

@ywelsch I pushed d1f1e79

This adds tests and fixes the behavior on deletes. Also I removed the logic giving special case handling to incompatible-snapshots and index.latest since both aren't used by the snapshot logic (except for the URL repository but that's not relevant here) so there's no point having leniency for them now.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some more comments.

public void deleteBlob(String blobName) {
ensureNotClosed();
synchronized (context.actions) {
context.actions.add(new BlobStoreAction(Operation.DELETE, path.buildAsString() + blobName));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path.add(blobName).buildAsString()?

public void writeBlob(String blobName, InputStream inputStream, long blobSize, boolean failIfAlreadyExists)
throws IOException {
ensureNotClosed();
// TODO: Throw if we try to overwrite any blob other than incompatible_snapshots or index.latest with different content
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about this TODO? Can we enable this already for some files?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea we can for all but snap- (at the root level) and index-N at the root level I think. Will do :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I way beefed this up now. We were only running into this for snap- with the current implementation (since we don't have list inconsistencies simulated yet index-N isn't an issue).
For the snap- I actually made it check for the case when whatever is written only differs in timstamps (on a master failover this can happen) and let it pass.
Shard level overwrites I made fail 100% of the time (see comment I added for that, it's not expected).
I think this is pretty neat now. This hasn't failed over 10k+ runs of SnapshotResiliencyTests and shows that we're handling master fail-overs reasonably safely on S3 (obviously not testing list inconsistencies yet ... but still :D).

@original-brownbear
Copy link
Member Author

@ywelsch thanks for taking a look. I fixed the dirty parts you pointed out and way enhanced the overwrite check + added test coverage for that now. Let me know what you think.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@original-brownbear
Copy link
Member Author

thanks @ywelsch !

@original-brownbear original-brownbear merged commit 78e5cbb into elastic:master Jul 18, 2019
@original-brownbear original-brownbear deleted the eventually-consistent-mock-repo branch July 18, 2019 13:28
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Jul 18, 2019
…yTests (elastic#40893)

* Add eventually consistent mock repository for reproducing and testing AWS S3 blob store behavior
* Relates elastic#38941
original-brownbear added a commit that referenced this pull request Jul 18, 2019
…yTests (#40893) (#44570)

* Add eventually consistent mock repository for reproducing and testing AWS S3 blob store behavior
* Relates #38941
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >test Issues or PRs that are addressing/adding tests v7.4.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants