
Use SHA for BLOB update instead of modification time #3697

Merged
merged 7 commits into from
Oct 4, 2024

Conversation

paxadax
Contributor

@paxadax paxadax commented Oct 1, 2024

What is the purpose of the change

Issue[4077]

When deploying Nimbus or changing leadership within a high-availability Nimbus cluster, we've verified that topology workers are killed due to differing blob modification times.
Because the modTime is used as the blob version, we have found that, when using the LocalFsBlobStoreFile, the following occurs every time the Nimbus leader goes down:

  1. Nimbus (1), the leader, goes down and a new Nimbus (2) picks up the leadership.
  2. If the blobs in Nimbus (2) have a different modTime, the workers are restarted (even though the blob contents may be identical).
  3. Nimbus (1) comes back up, syncs the blobs on startup, and updates their modTime, as it downloads the blobs again.
  4. If Nimbus (2) then goes down, all the workers are restarted again, because Nimbus (1) has new modTimes.
  5. This can repeat endlessly, as the modTime will always differ on each Nimbus leader.

In this PR, we've introduced a feature to use the file's SHA hash as the version instead of the modification time. With this feature, when a Nimbus loses leadership the workers keep running: an unchanged blob yields an unchanged SHA, and therefore an unchanged version.
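
The idea can be sketched as follows; this is a minimal illustration, not Storm's actual implementation, and the class and method names (`BlobVersionSketch`, `contentVersion`) are hypothetical:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class BlobVersionSketch {
    // Derives a stable long version from the blob's content: identical bytes
    // yield identical versions on every Nimbus replica, so a leadership
    // change no longer looks like a blob update.
    static long contentVersion(byte[] blobBytes) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(blobBytes);
        // Use the leading 8 bytes of the digest as a long version number.
        return ByteBuffer.wrap(digest).getLong();
    }

    public static void main(String[] args) throws Exception {
        byte[] copyOnNimbus1 = "topology.jar contents".getBytes(StandardCharsets.UTF_8);
        byte[] copyOnNimbus2 = "topology.jar contents".getBytes(StandardCharsets.UTF_8);
        // Same bytes -> same version, regardless of when each replica downloaded them.
        System.out.println(contentVersion(copyOnNimbus1) == contentVersion(copyOnNimbus2)); // prints true
    }
}
```

With modTime, the two copies above would compare unequal whenever they were downloaded at different times; with a content-derived version they always compare equal.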

How was the change tested

Unit tests.
Tested locally with a specific jar on my local Storm: forced Nimbus to change leadership, and the workers on the topologies continued to run properly.

@paxadax paxadax marked this pull request as draft October 1, 2024 16:25
@paxadax paxadax marked this pull request as ready for review October 1, 2024 16:40
@Override
public long getVersion() throws IOException {
    try (FileInputStream fis = new FileInputStream(path)) {
        byte[] bytes = DigestUtils.sha1(fis);
        // derive a long version from the leading bytes of the digest
        return ByteBuffer.wrap(bytes).getLong();
    }
}
Contributor

What do you think about using something such as SHA-256 or SHA-512 to avoid (unlikely) collisions?

Contributor Author

Thank you for the suggestion, I've updated it to SHA-256.

Contributor

Do we have a feeling for how often getVersion(...) is called? Computing a SHA hash is rather expensive compared to reading the modification date (just think about whether we need caching after the first call, or the like).

Contributor

If it is called often, perhaps we can use something such as MurmurHash, which is used elsewhere in the code for the sharding of tuples.

This is run by the AsyncLocalizer at the interval defined by supervisor.localizer.update.blob.interval.secs, so the impact falls on the supervisor, not on the workers. We wouldn't need to cache it, but we can add a cache nevertheless.
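
For context, the setting mentioned above is supervisor-side configuration; a storm.yaml fragment might look like this (the value shown is illustrative, not a recommendation):

```yaml
# Supervisor setting: how often the AsyncLocalizer re-checks blob versions.
# 30 is an illustrative value, not necessarily the default.
supervisor.localizer.update.blob.interval.secs: 30
```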

Contributor Author

@paxadax paxadax Oct 2, 2024

Since this runs continuously, we can opt for a Murmur hash, which prioritises fast hashing (suggested by @reiabreu), and that way we can skip caching.

Contributor

Works for me

Contributor Author

Hey, after a brief discussion we've decided to go with a checksum instead of Murmur, since checksum computation is faster. I've pushed a commit with the changes.
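
The checksum approach the thread settles on can be sketched like this; a minimal illustration assuming java.util.zip.CRC32, where `ChecksumVersionSketch` and `checksumVersion` are hypothetical names, not Storm's actual code:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

public class ChecksumVersionSketch {
    // Streams the file through a CRC32 checksum. This is much cheaper than a
    // cryptographic hash, which matters when the localizer polls frequently,
    // and it is still deterministic for identical blob contents.
    static long checksumVersion(Path path) throws IOException {
        CRC32 crc = new CRC32();
        try (InputStream in = new CheckedInputStream(Files.newInputStream(path), crc)) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // reading the stream drives the checksum update
            }
        }
        return crc.getValue();
    }

    public static void main(String[] args) throws Exception {
        Path blob = Files.createTempFile("blob", ".jar");
        Files.write(blob, "same bytes".getBytes());
        // Two reads of the same content yield the same version.
        System.out.println(checksumVersion(blob) == checksumVersion(blob)); // prints true
        Files.delete(blob);
    }
}
```

A CRC32 is not collision-resistant the way SHA-256 is, which is the trade-off the thread weighs: version stability across leaders is preserved, while per-poll cost stays low.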

Contributor

Makes sense to use a checksum.
I've approved the changes.

Contributor Author

@paxadax paxadax left a comment

Don't merge it yet, as I'm still running some tests in which some flakiness has surfaced.

Contributor Author

@paxadax paxadax left a comment

Everything is tested; we can proceed with the merge.

@reiabreu
Contributor

reiabreu commented Oct 4, 2024

@rzo1 do you want to re-examine the PR?

@rzo1
Contributor

rzo1 commented Oct 4, 2024

lgtm. Thanks for the PR.

@reiabreu reiabreu merged commit 1e8eee6 into apache:master Oct 4, 2024
18 checks passed
5 participants