-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix test failure of file role store auto-reload #56398
Fix test failure of file role store auto-reload #56398
Conversation
Pinging @elastic/es-security (:Security/Authorization) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am on the fence about the proposed fix, but it does fix the test failures and is minimally invasive, hence LGTM.
The expectation from tests is that the file write with truncation operation is atomic. This is wrong because even ordinary file write is not atomic.
To get around it, the test is not using write with truncate anymore, but instead uses file replace.
I believe we are testing a slightly different thing in this case. But because, from the core code's perspective, this difference is not relevant, I believe the proposed test is valid.
Ideally, I would like us to test the scenario were the file under observation is edited, not that it is replaced, but I don't have a good suggestion about how to go about it.
@albertzaharovits Your analysis is accurate. I don't see an easy way to have atomic file modification behaviour unless resorting to some in-memory FileSystem, which feels overkill for this purpose. Alternatively, we could fix the syptom only. Let me elaborate: The failure occurs in the following assertion: Lines 436 to 437 in 410b079
This failure is due to two reasons: 1) file modification is not atomic; 2) the changed role reported by the FileWatcher is Currently the PR tries to fix item 1. But we could also fix it with item 2. Given the original file content is: role5:
cluster: ...
- 'MONITOR' We could modify it to be role5x:
cluster: ...
- 'ALL' Note that we change the role name from store = new FileRolesStore(settings, env, watcherService, roleSet -> {
modifiedFileRolesModified.addAll(roleSet);
if (roleSet.contains("role5x")) {
modifyLatch.countDown();
}
}, new XPackLicenseState(Settings.EMPTY), xContentRegistry());
...
modifyLatch.await(1, TimeUnit.SECONDS);
assertEquals(2, modifiedFileRolesModified.size());
assertTrue(modifiedFileRolesModified.contains("role5x"));
descriptors = store.roleDescriptors(Collections.singleton("role5x"));
assertThat(descriptors, notNullValue()); This version is still slightly different from the current one, but is less so compared to the file move fix. Would you be more comfortable with this approach? |
I like the idea to count down the latch based on the role name @ywangd ! As a further improvement, I would suggest we always append dummy marker role at the end of the role file and count down the latch when we spot it, even in the append cases. This is different from what you're suggesting because the dummy marker role has the sole purpose of counting down the latch, and it is not used to verify the parsing. Let me know what you think about this. |
@albertzaharovits I updated the PR with the countdown latch change as discussed. I didn't add dummy marker roles since I don't think they are necessary. I understand the intention of the marker role is to ensure anything comes before it is already parsed when the marker role appears in the changset. However, if we can guarantee that each role name is only reported once by the FileWatcher, we can already be sure when a name appears in the changeset, it is fully parsed and ready to be asserted. The original issue was because the same name was sometimes reported twice in both truncation and modification. Also we cannot differentiate them because other than the name, there is no context attached to it.
I hope this makes sense. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thanks Yang!
After some more thoughts, I decided to use dummy marker roles for the two tests involves truncation. Even though they are not technically necessary, they help to maintain the semantics better. The two tests had the intention to ensure an existing role is preseved if it is not part of the trunction or is updated if it is modified. So they do imply that the role names are the same before and after file truncation/update. Hence the marker roles are helpful for keeping use the same role name, i.e. |
modifiedFileRolesModified.addAll(roleSet); | ||
modifyLatch.countDown(); | ||
if (roleSet.contains("dummy2")) { | ||
modifyLatch.countDown(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal but you can maintain the modifiedFileRolesModified.addAll(roleSet);
from before and assert it contains role5
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added back.
Good point! |
LGTM Thank you Yang! |
@elasticmachine update branch |
Ensure assertion is only performed when we can be sure that the desired changes are picked up by the file watcher.
Ensure assertion is only performed when we can be sure that the desired changes are picked up by the file watcher.
Ensure assertion is only performed when we can be sure that the desired changes are picked up by the file watcher.
Ensure file content is replaced atomically to prevent file watcher from reading imcomplete/empty file.
Resolves: #52955