-
Notifications
You must be signed in to change notification settings - Fork 262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support kubeflow.org/xgboostjob #1114
Support kubeflow.org/xgboostjob #1114
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: tenzen-y The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
✅ Deploy Preview for kubernetes-sigs-kueue ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
/hold for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, have you tested this on a running cluster?
/assign @alculquicondor
test/integration/controller/jobs/xgboostjob/xgboostjob_controller_test.go
Outdated
Show resolved
Hide resolved
return len(createdWorkload.Status.Conditions) == 2 | ||
}, util.ConsistentDuration, util.Interval).Should(gomega.BeTrue()) | ||
|
||
ginkgo.By("checking the job gets suspended when parallelism changes and the added node selectors are removed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non-blocking nit: we could make the integration test shorter as the implementation is pretty well commonized. I'm thinking of this scenario as a potential overkill, but I don't have a strong view as some integration tests are needed as a scenarios like this is probably <1s anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point.
we could make the integration test shorter as the implementation is pretty well commonized. I'm thinking of this scenario as a potential overkill
Which does that indicate all integrations or all kubeflow jobs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have pretty well commonized code at both levels - between integrations, and kubeflow integrations in particular. As a result each kubeflow integration implements only around 10 functions, and all are one-liners.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point.
we could make the integration test shorter as the implementation is pretty well commonized. I'm thinking of this scenario as a potential overkill
Which does that indicate all integrations or all kubeflow jobs?
I would focus only on kubeflow jobs here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense. I think moving this case to a unit test might be better since this scenario might be useful to check expected behavior although this is overkill as integration tests.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm, still I would kieep some basic integration tests per integratoin, can be follow up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep tracking by creating an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm, would you like to create one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I apologize that I forgot to push Submit new Issue
button 😞
Created: #1119
Yes, I verified this feature on a real cluster using the following XGBoostJob:
I will create a PR for the document once #1109 (comment) is resolved. |
Signed-off-by: Yuki Iwai <[email protected]>
/remove-kind documentation |
Signed-off-by: Yuki Iwai <[email protected]>
556de27
to
e4fceeb
Compare
/retest |
#1090 happened again :( /test pull-kueue-test-e2e-main-1-24 |
LGTM label has been added. Git tree hash: 915ed626e7bc079fe2a4b79283648146bf2fa7ce
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Let's merge this. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Support kubeflow.org/xgboostjob.
Which issue(s) this PR fixes:
Part-of #297
Special notes for your reviewer:
Does this PR introduce a user-facing change?