-
Notifications
You must be signed in to change notification settings - Fork 835
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change service key to allow container services to always match correctly #4043
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
/test integration |
/test notebooks |
/test integration |
/test notebooks |
/test integration |
/test notebooks |
It seems all integration tests passed (not sure why it's marked as failed as here it says pass) and only 1 notebook test failed (which is often flaky) Edit: ok it seems the parallel tests are failing for the integration tests |
/test integration |
/test notebooks |
Ok it seems like the operator upgrade tests are still failing and the tracing test is still failing, so it may be an issue - rerunning to confirm |
/test integration |
/test notebooks |
/test integration |
/test notebooks |
1 similar comment
/test notebooks |
@cliveseldon I've been testing this locally, it seems like all works well as the svcorch model does work in 1.14.0-dev. I am finding some strange behaviour, but it's not clear it's form this PR nor whether it's only me - namely I am running some tests and I'm finding some strange behaviour, I'm currently testing in one of Clive's branches, but when i run the helm upgrade it doesn't trigger a model container bounce for the upgrade from |
/test notebooks |
/test integration |
1 similar comment
/test integration |
Seems only flaky test is the rolling upgrade from 1.14.0, from discussion it's expected the rolling updates to potentially fail so we should be good to merge, re-running to validate flakiness on this specific test /test integration |
@cliveseldon: The following test failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository. I understand the commands that are listed here. |
It seems it was indeed flaky, as now the failed one was the |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: axsaucedo The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it:
Services created for each node in the inference graph were using the same label key which means only 1 per pod would be active. This is ok for when the service orchestrator is in same pod as it would not use the services but use localhost directly.
The change is to create a label per node name so services are always correctly finding their pods.
Which issue(s) this PR fixes:
Fixes #4036
Fixes #4302
Special notes for your reviewer: