Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FileRolesStoreTests#testReload fails #52955

Closed
javanna opened this issue Feb 28, 2020 · 6 comments · Fixed by #56398
Closed

FileRolesStoreTests#testReload fails #52955

javanna opened this issue Feb 28, 2020 · 6 comments · Fixed by #56398
Assignees
Labels
:Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC Team:Security Meta label for security team >test-failure Triaged test failures from CI

Comments

@javanna
Copy link
Member

javanna commented Feb 28, 2020

FileRolesStoreTests#testReload failed three times in the last 10 days: one failure on 7.x , one on 7.6 and one on a PR branch . The failure does not reproduce with the same seed.

REPRODUCE WITH: ./gradlew ':x-pack:plugin:security:test' --tests "org.elasticsearch.xpack.security.authz.store.FileRolesStoreTests.testAutoReload" -Dtests.seed=E66E5F527A74FF40 -Dtests.security.manager=true -Dtests.locale=en -Dtests.timezone=Pacific/Kiritimati -Dcompiler.java=13
java.lang.AssertionError: expected:<9> but was:<10>
at __randomizedtesting.SeedInfo.seed([E66E5F527A74FF40:A0D240CB3F2E9F56]:0)
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at org.elasticsearch.xpack.security.authz.store.FileRolesStoreTests.testAutoReload(FileRolesStoreTests.java:412)

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.x+EAR/458

@javanna javanna added >test-failure Triaged test failures from CI :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC labels Feb 28, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security (:Security/Authorization)

@javanna
Copy link
Member Author

javanna commented Feb 28, 2020

Given that it does not repro and it sporadically fails, shall we keep the test around to potentially collect more info or shall I mute it already?

@iverase
Copy link
Contributor

iverase commented Apr 21, 2020

Another failure in 7.x: https://gradle-enterprise.elastic.co/s/rssnr4evzsaww

@tvernum
Copy link
Contributor

tvernum commented Apr 30, 2020

We suspect this might have only been broken for a few weeks and the issue from 9 days ago is something different, but @ywangd will followup and see where we are.

@rjernst rjernst added the Team:Security Meta label for security team label May 4, 2020
@jkakavas
Copy link
Member

jkakavas commented May 6, 2020

Happened again in master (testAutoReload) :https://gradle-enterprise.elastic.co/s/uftgjwlhgiunm

@ywangd
Copy link
Member

ywangd commented May 7, 2020

This is a genuine issue with the test code. I can reliably reproduce it locally by inserting Thread.sleep(200); before the first writer.append call:

try (BufferedWriter writer = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8, StandardOpenOption.TRUNCATE_EXISTING)) {
writer.append("role5:").append(System.lineSeparator());
writer.append(" cluster:").append(System.lineSeparator());
writer.append(" - 'ALL'");
}

The main reason of the failure is that the BufferedWriter does not write file atomically. If there is any delay after the file is opened (with trunction), it is picked up by the FileWatcher and the content is parsed for roles. Since the file is empty, we get no role back.

Now the test code does check whether the expected role is in the change-set (reported by FileWatcher) before asserting the existence of the Role (where the test failed). However, we do not differentiate the change-set between "deleted", "inserted" or "update". The test code really wanted an "updated" change-set, but it accepted a "deleted" one because role name is the same and it cannot tell the type of the change-set (modifiedFileRolesModified is the change-set in the following code):

I will fix this with a PR to basically perform file move instead of write to avoid the race condition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC Team:Security Meta label for security team >test-failure Triaged test failures from CI
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants