Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

E2E: TestNodeDrainEphermeralMigrate should work with client ACLs #12314

Closed
tgross opened this issue Mar 17, 2022 · 2 comments · Fixed by #16826
Closed

E2E: TestNodeDrainEphermeralMigrate should work with client ACLs #12314

tgross opened this issue Mar 17, 2022 · 2 comments · Fixed by #16826

Comments

@tgross
Copy link
Member

tgross commented Mar 17, 2022

Our E2E environment doesn't have acls { enabled = true } on the clients, just the servers. If this setting is enabled, the TestNodeDrainEphermeralMigrate fails because the client-to-client ACL check fails and the data isn't migrated. We need to figure out the right client ACL policy to support this. This predates #12267 but I noticed it while working on it.

Note that failure happens silently. The old allocation is simply stopped and the data isn't migrated and the user isn't warned of this, because ephemeral disk migration is best effort. #12267 includes a commit to make the test failure more legible in this case.

@tgross
Copy link
Member Author

tgross commented Mar 22, 2023

Noting that as of #16530 we do have client ACLs enabled, so this test is failing. I'll get this fixed as part of a set of work I'm doing over the next couple weeks about drains.

@tgross tgross self-assigned this Apr 5, 2023
@tgross
Copy link
Member Author

tgross commented Apr 6, 2023

Example error on the client:

2023-04-06T15:12:28.346Z [WARN] client.alloc_runner.runner_hook.migrate_disk: error migrating data from previous alloc: alloc_id=c2c411f6-942f-0977-62fb-7911d96e4cf3 error="error getting snapshot from previous alloc "b36b3f11-ebfa-a568-27c6-3444c391baa2": Unexpected response code: 403 (Permission denied)"

It turns out the problem with our test is that we don't have sticky=true set, but this means we've been migrating when migrate=true with sticky=false whenever client ACLs are disabled for years. So looking into whether we want to just extend that behavior so that migrate=true implies sticky=true.

tgross added a commit that referenced this issue Apr 7, 2023
While working on several open drain issues, I'm fixing up the E2E tests. This
subset of tests being refactored are existing ones that already work. I'm
shipping these as their own PR to keep review sizes manageable when I push up
PRs in the next few days for #9902, #12314, and #12915.
tgross added a commit that referenced this issue Apr 7, 2023
While working on several open drain issues, I'm fixing up the E2E tests. This
subset of tests being refactored are existing ones that already work. I'm
shipping these as their own PR to keep review sizes manageable when I push up
PRs in the next few days for #9902, #12314, and #12915.
@tgross tgross added this to the 1.5.x milestone Apr 7, 2023
tgross added a commit that referenced this issue Apr 12, 2023
While working on several open drain issues, I'm fixing up the E2E tests. This
subset of tests being refactored are existing ones that already work. I'm
shipping these as their own PR to keep review sizes manageable when I push up
PRs in the next few days for #9902, #12314, and #12915.
tgross added a commit that referenced this issue Apr 12, 2023
While working on several open drain issues, I'm fixing up the E2E tests. This
subset of tests being refactored are existing ones that already work. I'm
shipping these as their own PR to keep review sizes manageable when I push up
PRs in the next few days for #9902, #12314, and #12915.
tgross added a commit that referenced this issue Apr 12, 2023
While working on several open drain issues, I'm fixing up the E2E tests. This
subset of tests being refactored are existing ones that already work. I'm
shipping these as their own PR to keep review sizes manageable when I push up
PRs in the next few days for #9902, #12314, and #12915.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant