Add e2e tests to capture connectivity issues between Pods running on Control Plane nodes and the API server #1259
This test is added to showcase primary network not working as expected on Kind control plane nodes with thick plugin installed. Signed-off-by: Vasilis Remmas <[email protected]>
Hi, thank you for the PR. I understand that your CI test fails, as shown in your GitHub Actions log, and that you want to show this is flakiness in the CI testing. So the PR is good for showing the CI failure. However, if you intend this PR to be merged into the repository: I understand you want to move the DRA PR forward and that you attribute the DRA CI failure to the reason above, but this PR (which clarifies the issue) does not address our request (provide a DRA CI test in GitHub), because it only clarifies the error.
At the last maintainers' meeting, you mentioned that the flaky error only happens with the thick plugin, so I'm wondering whether this test passes in thin plugin mode. Did you check whether the CI in this PR passes in thin mode?
@s1061123 thanks for your reply.
The DRA PR is adjusted to bypass this error and is no longer flaky while we test both thin and thick. See #1078 (comment).
My intention with this PR is to expose a flakiness that already exists and is not introduced by the DRA PR. I'm happy to close this PR after the DRA PR is merged if you think it's not useful. I could open an issue instead, but I thought that proving with a PR that this error exists in CI (whether the cause is kind or multus) would be more valuable.
I agree we can take steps to merge this PR, if we believe that this test is valuable for the multus project, by doing either of these:
However, as I mentioned above, the intention of this PR is to get the DRA PR merged. There were concerns that the DRA implementation introduces bugs in the multus codebase, and I added this PR to prove that the flakiness in the e2e test is not coming from the DRA PR.
In all of the tests I've done in my setup, I haven't seen this test fail with the thin plugin, only with the thick plugin. Therefore, I'm not sure this is solely a kind issue. The best I can do is expose that flakiness to you so that I can build confidence for the other PR. I haven't dug deep into the setup to understand where this is coming from, and unfortunately I won't have time to dig into that.
This pull request is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This test is added to showcase that the primary network does not always work as expected on Kind control plane nodes with the thick plugin installed. I'm not sure if this is a broader issue. This was discovered as part of #1078.
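To make the kind of check concrete, here is a minimal sketch of a probe that a pod could run to see whether it can reach the API server over the primary network. It is not the test added in this PR. The function names `api_server_address` and `can_reach` are hypothetical, and the sketch assumes the probe runs inside a pod, where Kubernetes sets the `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` environment variables. It only checks that a TCP connection can be opened, not that an authenticated API request succeeds.

```python
import os
import socket


def api_server_address(env=os.environ):
    """Return (host, port) of the in-cluster API server Service.

    Assumes the standard KUBERNETES_SERVICE_HOST / KUBERNETES_SERVICE_PORT
    variables are present, as they are inside a pod.
    """
    return env["KUBERNETES_SERVICE_HOST"], int(env["KUBERNETES_SERVICE_PORT"])


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable network, and timeouts.
        return False
```

Inside a pod scheduled on the control plane node, calling `can_reach(*api_server_address())` would return `False` when the symptom described here occurs. Because the failure is intermittent, a single successful probe does not rule it out.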
The failure may not reproduce very easily, but this is what it looks like on a fresh kind cluster.
For the DRA PR #1078, the relevant controller pod, when running on the control plane node, fails with this (indicating it's not specifically a DNS issue):