Kubeflow TensorFlow Training Operator Add Evaluator #1870
For the databricks agent
[pull] master from flyteorg:master
Signed-off-by: Future Outlier <[email protected]>
Codecov Report
Attention:
Additional details and impacted files

@@             Coverage Diff              @@
##           master    #1870       +/-   ##
===========================================
- Coverage   94.95%   62.83%   -32.12%
===========================================
  Files         136      307     +171
  Lines        6165    22998   +16833
  Branches        0     3490    +3490
===========================================
+ Hits         5854    14451    +8597
- Misses        311     8125    +7814
- Partials        0      422     +422

☔ View full report in Codecov by Sentry.
Can we write more about what evaluators are used for? Have you found any examples of their usage?
No problem, I will do it.
We need to merge this pull request first; then the tests will pass.
I asked LinkedIn software engineer @yubofredwang about the PR, and he said it looks great!
LGTM, will merge this after FlyteIDL is released.
--------- Signed-off-by: Future Outlier <[email protected]> Co-authored-by: Future Outlier <[email protected]>
--------- Signed-off-by: Future Outlier <[email protected]> Co-authored-by: Future Outlier <[email protected]> Signed-off-by: Rafael Raposo <[email protected]>
TL;DR
Enable running a data service in the Kubeflow TensorFlow Training Operator by using the evaluator section of TF_CONFIG.
Describe your changes
Enable running a data service by using the evaluator section of TF_CONFIG to configure data service worker information, as discussed in this Slack conversation.
Existing use cases do not include an evaluator section, so the new field needs a default value so that those cases keep working.
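As a rough illustration of what a pod might see, here is a minimal sketch that reads the task role and any evaluator addresses from a TF_CONFIG string. The host names, ports, and the `read_task` helper are hypothetical and not taken from this PR; the sketch only assumes TF_CONFIG is a JSON document with `cluster` and `task` keys.

```python
import json
import os
from typing import List, Tuple

# Hypothetical TF_CONFIG value; host names and ports are made up for illustration.
SAMPLE_TF_CONFIG = json.dumps(
    {
        "cluster": {
            "chief": ["tfjob-chief-0:2222"],
            "worker": ["tfjob-worker-0:2222", "tfjob-worker-1:2222"],
            "evaluator": ["tfjob-evaluator-0:2222"],
        },
        "task": {"type": "evaluator", "index": 0},
    }
)


def read_task(tf_config: str) -> Tuple[str, int, List[str]]:
    """Return (task type, task index, evaluator addresses) from a TF_CONFIG string."""
    config = json.loads(tf_config)
    task = config.get("task", {})
    # The evaluator section may be absent, so fall back to an empty list.
    evaluators = config.get("cluster", {}).get("evaluator", [])
    return task.get("type", ""), int(task.get("index", 0)), evaluators


if __name__ == "__main__":
    # Use the real environment variable when present, else the sample above.
    raw = os.environ.get("TF_CONFIG", SAMPLE_TF_CONFIG)
    task_type, task_index, evaluators = read_task(raw)
    print(task_type, task_index, evaluators)
```

A pod could branch on the returned task type to decide whether to start the data service or join training; how the addresses are actually consumed is up to the user code.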
Setup Process
I tested it in two ways: by specifying a Dockerfile and by using ImageSpec.
Dockerfile
Use the code below
Run it on flyte-console with this command
ImageSpec
Screenshot
Dockerfile
ImageSpec
Kubeflow Training Operator Pods
Type
Are all requirements met?
Complete description
The TFJob task config doesn't contain an element for evaluators, which are part of the TFJob spec.
Let's make it optional!
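One way to picture an optional evaluator element is a config whose evaluator field defaults to zero replicas, so existing configs that never mention it behave as before. The sketch below is an assumption for illustration only: `ReplicaSpec`, `TfJobConfig`, and `replica_counts` are hypothetical names, not the flytekit plugin's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ReplicaSpec:
    """Hypothetical per-role settings; zero replicas means the role is unused."""

    replicas: int = 0
    image: Optional[str] = None


@dataclass
class TfJobConfig:
    """Hypothetical task config with an optional evaluator element."""

    worker: ReplicaSpec = field(default_factory=lambda: ReplicaSpec(replicas=1))
    # Defaulting to an empty spec keeps configs written before this field existed valid.
    evaluator: ReplicaSpec = field(default_factory=ReplicaSpec)


def replica_counts(cfg: TfJobConfig) -> Dict[str, int]:
    """Collect replica counts, leaving the evaluator out when it is not requested."""
    counts = {"worker": cfg.worker.replicas}
    if cfg.evaluator.replicas > 0:
        counts["evaluator"] = cfg.evaluator.replicas
    return counts


if __name__ == "__main__":
    print(replica_counts(TfJobConfig()))
    print(replica_counts(TfJobConfig(evaluator=ReplicaSpec(replicas=1))))
```

Using a default factory instead of `None` means downstream code can read `cfg.evaluator.replicas` without a null check.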
Tracking Issue
flyteorg/flyte#4167
flyteorg/flyte#4168