Fix test failure of file role store auto-reload #56398

ywangd · 2020-05-08T00:35:51Z

Ensure file content is replaced atomically to prevent the file watcher from reading an incomplete or empty file.

Resolves: #52955

elasticmachine · 2020-05-08T00:35:53Z

Pinging @elastic/es-security (:Security/Authorization)

albertzaharovits

I am on the fence about the proposed fix, but it does fix the test failures and is minimally invasive, hence LGTM.

The tests expect a file write with truncation to be atomic. This is wrong because even an ordinary file write is not atomic.

To get around it, the test is not using write with truncate anymore, but instead uses file replace.
I believe we are testing a slightly different thing in this case. But because, from the core code's perspective, this difference is not relevant, I believe the proposed test is valid.
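
For readers unfamiliar with the file-replace approach, here is a minimal sketch of the general technique, not the exact code in this PR. The class name, method name, and YAML content are hypothetical. The new content goes to a temporary file in the same directory, which is then moved over the target, so a watcher sees either the old file or the new one. Whether `ATOMIC_MOVE` replaces an existing target is implementation specific according to the `Files.move` documentation; it does on common POSIX file systems.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicFileReplace {

    /**
     * Replaces the content of {@code target} by writing a sibling temp file
     * and then moving it over the target in a single step.
     * (Hypothetical helper for illustration only.)
     */
    static void replaceAtomically(Path target, String content) throws IOException {
        // The temp file lives in the same directory so the move stays on one file system.
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "roles", ".tmp");
        try {
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            // Throws AtomicMoveNotSupportedException if the file system cannot do this atomically.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up if the write or move failed.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("roles-demo");
        Path roles = dir.resolve("roles.yml");
        // Illustrative content only; the real role definitions are elided in this thread.
        Files.write(roles, "role5:\n  cluster:\n    - 'MONITOR'\n".getBytes(StandardCharsets.UTF_8));
        replaceAtomically(roles, "role5:\n  cluster:\n    - 'ALL'\n");
        System.out.println(new String(Files.readAllBytes(roles), StandardCharsets.UTF_8));
    }
}
```

By contrast, a write with truncation first empties the file and then fills it, and a watcher can fire between those two steps.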

Ideally, I would like us to test the scenario where the file under observation is edited, not replaced, but I don't have a good suggestion for how to go about it.

ywangd · 2020-05-12T00:12:08Z

> To get around it, the test is not using write with truncate anymore, but instead uses file replace.

@albertzaharovits Your analysis is accurate. I don't see an easy way to have atomic file modification behaviour unless resorting to some in-memory FileSystem, which feels overkill for this purpose.

Alternatively, we could fix only the symptom. Let me elaborate: the failure occurs in the following assertion:

elasticsearch/x-pack/plugin/security/src/test/java/org/elasticsearch/xpack/security/authz/store/FileRolesStoreTests.java

Lines 436 to 437 in 410b079

    
    descriptors = store.roleDescriptors(Collections.singleton("role5"));
    assertThat(descriptors, notNullValue());

This failure is due to two reasons: 1) file modification is not atomic; 2) the changed role reported by the FileWatcher is role5 for both the file truncation and subsequent write.

Currently the PR tries to fix item 1, but we could address item 2 instead. Given the original file content is:

role5:
  cluster: ...
    - 'MONITOR'

We could modify it to be

role5x:
  cluster: ...
    - 'ALL'

Note that we change the role name from role5 to role5x so that the FileWatcher reports the file truncation with role5 and the subsequent write with role5x. The code would then change to something like the following:

store = new FileRolesStore(settings, env, watcherService, roleSet -> {
    modifiedFileRolesModified.addAll(roleSet);
    if (roleSet.contains("role5x")) {
        modifyLatch.countDown();
    }
}, new XPackLicenseState(Settings.EMPTY), xContentRegistry());
...
modifyLatch.await(1, TimeUnit.SECONDS);
assertEquals(2, modifiedFileRolesModified.size());
assertTrue(modifiedFileRolesModified.contains("role5x"));
descriptors = store.roleDescriptors(Collections.singleton("role5x"));
assertThat(descriptors, notNullValue());

This version is still slightly different from the current one, but is less so compared to the file move fix. Would you be more comfortable with this approach?
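
The latch idea can be tried in isolation. The following is a self-contained sketch with no Elasticsearch classes, in which a plain thread stands in for the file watcher. `NamedChangeLatch` and `listener` are hypothetical names, and the two simulated events are an assumption about what the watcher reports for a truncation followed by a write.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class NamedChangeLatch {

    /**
     * Builds a listener that records every reported name but releases the latch
     * only when {@code expectedName} shows up in a change notification.
     */
    static Consumer<Set<String>> listener(Set<String> seen, String expectedName, CountDownLatch latch) {
        return changed -> {
            seen.addAll(changed);
            if (changed.contains(expectedName)) {
                latch.countDown();
            }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch latch = new CountDownLatch(1);
        Consumer<Set<String>> onChange = listener(seen, "role5x", latch);

        // Simulated watcher: one event for the truncation, one for the completed write.
        Thread watcher = new Thread(() -> {
            onChange.accept(Collections.singleton("role5"));
            onChange.accept(Collections.singleton("role5x"));
        });
        watcher.start();

        // await returns false on timeout, so the result is worth checking.
        boolean released = latch.await(1, TimeUnit.SECONDS);
        watcher.join();
        System.out.println("released=" + released + ", sawRole5x=" + seen.contains("role5x"));
    }
}
```

With the original unconditional `countDown()`, the first event would already release the latch, and the assertion could run against the truncated file.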

albertzaharovits · 2020-05-12T10:35:51Z

I like the idea to count down the latch based on the role name @ywangd !

As a further improvement, I would suggest we always append a dummy marker role at the end of the role file and count down the latch when we spot it, even in the append cases.
This relies on the fact that roles are parsed in the order they are defined in the file, so that we can be assured that the roles preceding the dummy role have all been parsed completely (there is currently the theoretical risk that a role is not read completely, even in the append without truncate case).

This is different from what you're suggesting because the dummy marker role has the sole purpose of counting down the latch, and it is not used to verify the parsing.
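
To illustrate why a trailing marker works, here is a rough sketch with a deliberately simplified stand-in for the roles parser. It only collects top-level keys in file order and is not the real FileRolesStore parsing. The marker name `dummy_marker` and all method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class MarkerRoleDemo {

    // Hypothetical marker name; any name not used by a real role would do.
    static final String MARKER = "dummy_marker";

    /** Simplified stand-in for a roles parser: collects top-level keys in file order. */
    static Set<String> topLevelNames(String content) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : content.split("\n")) {
            boolean topLevel = !line.isEmpty() && !Character.isWhitespace(line.charAt(0));
            if (topLevel && line.endsWith(":")) {
                names.add(line.substring(0, line.length() - 1));
            }
        }
        return names;
    }

    /** The marker is last in the file, so seeing it implies everything before it was read. */
    static boolean fullyWritten(String content) {
        return topLevelNames(content).contains(MARKER);
    }

    public static void main(String[] args) {
        String full = "role5:\n  cluster:\n    - 'ALL'\n" + MARKER + ":\n  cluster:\n    - 'ALL'\n";
        // A reader that fires mid-write sees only a prefix of the file.
        String partial = full.substring(0, full.indexOf(MARKER));
        System.out.println("partial complete? " + fullyWritten(partial));
        System.out.println("full complete?    " + fullyWritten(full));
    }
}
```

The marker carries no assertions of its own. Its presence in a change notification is the only signal the test waits for.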

Let me know what you think about this.

ywangd · 2020-05-12T11:57:49Z

@albertzaharovits I updated the PR with the countdown latch change as discussed. I didn't add dummy marker roles since I don't think they are necessary.

I understand the intention of the marker role is to ensure that anything that comes before it is already parsed when the marker role appears in the changeset. However, if we can guarantee that each role name is only reported once by the FileWatcher, we can already be sure that when a name appears in the changeset, it is fully parsed and ready to be asserted.

The original issue arose because the same name was sometimes reported twice, once for the truncation and once for the modification. We also cannot tell the two reports apart because, other than the name, no context is attached to them.

For tests that only append, each appended role is unique, so when it is in the changeset, it is ready to be asserted. If the FileWatcher is triggered before the newly appended role is fully written, the parsing fails and nothing is reported. The FileWatcher will be triggered again when the write has fully completed, and the appended role will then be reported.
Similarly, for tests that truncate and then add new role names, when the new role name is reported, it must have been fully written and recognised by the FileWatcher and parser. We can just assert on it without needing a marker role.
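
The "reported once" argument can be made concrete with a small model. This sketch assumes the changeset is the set of names whose definition differs between two snapshots of the file. That matches the behaviour described in this thread but is not taken from the FileRolesStore source. The class name, method name, and the single-string role definitions are hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

final class ChangeSetDemo {

    /** Names whose definition was added, removed, or altered between two snapshots. */
    static Set<String> changedNames(Map<String, String> before, Map<String, String> after) {
        Set<String> allNames = new HashSet<>(before.keySet());
        allNames.addAll(after.keySet());
        Set<String> changed = new HashSet<>();
        for (String name : allNames) {
            if (!Objects.equals(before.get(name), after.get(name))) {
                changed.add(name);
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, String> original = Collections.singletonMap("role5", "MONITOR");
        Map<String, String> truncated = Collections.emptyMap();
        Map<String, String> sameName = Collections.singletonMap("role5", "ALL");
        Map<String, String> newName = Collections.singletonMap("role5x", "ALL");

        // Truncation and the following write both report role5, so they are indistinguishable.
        System.out.println(changedNames(original, truncated)); // [role5]
        System.out.println(changedNames(truncated, sameName)); // [role5]
        // With a fresh name, only the completed write can report role5x.
        System.out.println(changedNames(truncated, newName));  // [role5x]
    }
}
```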

I hope this makes sense. Thanks!

albertzaharovits

LGTM Thanks Yang!

ywangd · 2020-05-12T12:58:15Z

After some more thought, I decided to use dummy marker roles for the two tests that involve truncation. Even though they are not technically necessary, they help preserve the tests' semantics. The two tests were intended to ensure that an existing role is preserved if it is not part of the truncation, or is updated if it is modified. So they do imply that the role names are the same before and after the file truncation/update. Hence the marker roles are helpful for continuing to use the same role name, i.e. role5. I also dropped the code asserting the exact content of the change set, because it can differ due to the non-atomic file operation described above. Sorry for the back and forth. I appreciate all the discussion and I believe this should be the final version.

albertzaharovits · 2020-05-12T20:49:13Z

...security/src/test/java/org/elasticsearch/xpack/security/authz/store/FileRolesStoreTests.java

-                modifiedFileRolesModified.addAll(roleSet);
-                modifyLatch.countDown();
+                if (roleSet.contains("dummy2")) {
+                    modifyLatch.countDown();


Not a big deal, but you can keep the modifiedFileRolesModified.addAll(roleSet); call from before and assert that it contains role5.

Added back.

albertzaharovits · 2020-05-12T20:51:59Z

> The two tests had the intention to ensure an existing role is preserved if it is not part of the truncation or is updated if it is modified

Good point!

albertzaharovits · 2020-05-12T20:52:49Z

LGTM Thank you Yang!

ywangd · 2020-05-13T02:56:00Z

@elasticmachine update branch

Ensure assertion is only performed when we can be sure that the desired changes are picked up by the file watcher.

ywangd added 2 commits May 8, 2020 10:26

Fix file role store reload test failure

f3c8abb

Merge remote-tracking branch 'origin/master' into es-25955-file-role-store-test-failure

86e6262

ywangd added >test Issues or PRs that are addressing/adding tests :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC v8.0.0 v7.8.1 v7.9.0 labels May 8, 2020

ywangd requested a review from albertzaharovits May 8, 2020 00:35

elasticmachine added the Team:Security Meta label for security team label May 8, 2020

ywangd added 2 commits May 8, 2020 10:37

checkstyle

b34e45a

Remove forbidden API usage

7445bec

albertzaharovits approved these changes May 11, 2020

ywangd added 2 commits May 12, 2020 21:37

address feedback

86ff9f5

Merge remote-tracking branch 'origin/master' into es-25955-file-role-store-test-failure

b757813

albertzaharovits approved these changes May 12, 2020

Use marker role for better semantics

ff8aa94

albertzaharovits reviewed May 12, 2020

Address feedback

4f20dc0

Merge branch 'master' into es-25955-file-role-store-test-failure

ab37392

ywangd merged commit 23095d4 into elastic:master May 13, 2020

ywangd added the backport pending label May 13, 2020

ywangd added a commit to ywangd/elasticsearch that referenced this pull request May 15, 2020

Fix test failure of file role store auto-reload (elastic#56398)

f6d30f3

Ensure assertion is only performed when we can be sure that the desired changes are picked up by the file watcher.

ywangd added a commit to ywangd/elasticsearch that referenced this pull request May 15, 2020

Fix test failure of file role store auto-reload (elastic#56398)

cab2ead

Ensure assertion is only performed when we can be sure that the desired changes are picked up by the file watcher.

ywangd added a commit that referenced this pull request May 15, 2020

Fix test failure of file role store auto-reload (#56398) (#56802)

c66e7ec

Ensure assertion is only performed when we can be sure that the desired changes are picked up by the file watcher.

ywangd added a commit that referenced this pull request May 15, 2020

Fix test failure of file role store auto-reload (#56398) (#56803)

d5863c6

Ensure assertion is only performed when we can be sure that the desired changes are picked up by the file watcher.

ywangd added a commit to ywangd/elasticsearch that referenced this pull request Jun 10, 2020

Fix test failure of file role store auto-reload (elastic#56398)

0fcdc5a

Ensure assertion is only performed when we can be sure that the desired changes are picked up by the file watcher.

ywangd added a commit that referenced this pull request Jun 11, 2020

Fix test failure of file role store auto-reload (#56398) (#57961)

1b0d246

Ensure assertion is only performed when we can be sure that the desired changes are picked up by the file watcher.

ywangd removed the backport pending label Apr 18, 2021

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
