-
Notifications
You must be signed in to change notification settings - Fork 443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GSoC] Add New Parameter in tune
#2369
[GSoC] Add New Parameter in tune
#2369
Conversation
ref issue: #2340 |
PTAL👀 @andreyvelich @johnugeorge |
@andreyvelich I add my code here: katib/pkg/webhook/v1beta1/pod/inject_webhook.go Lines 140 to 142 in db17214
// Pass env variable KATIB_TRIAL_NAME to training containers using fieldPath.
for idx := range mutatedPod.Spec.Containers {
if mutatedPod.Spec.Containers[idx].Env == nil {
mutatedPod.Spec.Containers[idx].Env = []v1.EnvVar{}
}
mutatedPod.Spec.Containers[idx].Env = append(
mutatedPod.Spec.Containers[idx].Env,
v1.EnvVar{
Name: consts.EnvTrialName,
ValueFrom: &v1.EnvVarSource{
FieldRef: &v1.ObjectFieldSelector{
FieldPath: fmt.Sprintf("metadata.labels['%s']", consts.LabelTrialName),
},
},
},
)
} PTAL👀. Thank you for your review! |
/area gsoc |
I don't think, we should mutate |
Please rebase your PR as well @Electronic-Waste |
@Electronic-Waste Additionally, I think we forgot to add this line to the post-gen script of Katib SDK: # Import Katib report metrics functions
from kubeflow.katib.api.report_metrics import report_metrics Similar to how we include other imports to the @Electronic-Waste Please update this script, otherwise after we re-generate the SDK, your import will be deleted. |
Signed-off-by: Electronic-Waste <[email protected]>
Signed-off-by: Electronic-Waste <[email protected]>
Signed-off-by: Electronic-Waste <[email protected]>
Signed-off-by: Electronic-Waste <[email protected]>
Signed-off-by: Electronic-Waste <[email protected]>
Signed-off-by: Electronic-Waste <[email protected]>
Signed-off-by: Electronic-Waste <[email protected]>
Signed-off-by: Electronic-Waste <[email protected]>
3811459
to
e73826f
Compare
@andreyvelich Done! I've rebased the branch, modified the code to pass env variable only to the primary container and added the importing code to the post-gen script of Katib SDK. |
@andreyvelich Thank you for your detailed review! |
Signed-off-by: Electronic-Waste <[email protected]>
Signed-off-by: Electronic-Waste <[email protected]>
@@ -140,15 +141,37 @@ func (s *SidecarInjector) Mutate(pod *v1.Pod, namespace string) (*v1.Pod, error) | |||
// Add Katib Trial labels to the Pod metadata. | |||
mutatePodMetadata(mutatedPod, trial) | |||
|
|||
// Pass env variable KATIB_TRIAL_NAME to the primary container using fieldPath. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Electronic-Waste Additionally, since we always pass the Trial name to the training container, we should not allow to pass Trial name and namespace via template generator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's not friendly to users. Since junior users are unfamiliar with that, they may be puzzled about why they can't pass Trial's namespace and name to the training container. It's more straightforward for them to simply define trialParameter trialName
and use it in the pod env as an example suggests.
Also, it's more backward-compatible to just allow users passing Trial name and namespace via triallParameter and pod env. WDYT👀 @andreyvelich @tenzen-y @johnugeorge .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since junior users are unfamiliar with that, they may be puzzled about why they can't pass Trial's namespace and name to the training container.
Since we always pass KATIB_TRIAL_NAME to the Trial's Pods, users will always be able to read this environment in their training code, isn't ?
However, this can be useful for other part of the Pod spec: What if user wants to generate different volume which contains Trial name.
@Electronic-Waste Please can you an create an issue to track this discussion ? I think, we should discuss the future of Trial metadata parameters.
Signed-off-by: Electronic-Waste <[email protected]>
Signed-off-by: Electronic-Waste <[email protected]>
@andreyvelich I've wrapped the code into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @Electronic-Waste!
just a small comment from me.
Signed-off-by: Electronic-Waste <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, we should be ready to merge it.
Thanks to your contribution @tenzen-y 🎉
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andreyvelich The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
* chore: add metrics_collector_config in tune function. Signed-off-by: Electronic-Waste <[email protected]> * rebase: rebase feat/new-param-tune to master. Signed-off-by: Electronic-Waste <[email protected]> * chore: add metrics collector kind list in comment. Signed-off-by: Electronic-Waste <[email protected]> * fix: always pass Trial name to the training container. Signed-off-by: Electronic-Waste <[email protected]> * fix: delete passing env variable logics in katib_client.py Signed-off-by: Electronic-Waste <[email protected]> * fix: passing env variable KATIB_TRIAL_NAME in the webhook of pod. Signed-off-by: Electronic-Waste <[email protected]> * fix: pass env variable KATIB_TRIAL_NAME only to the primary container. Signed-off-by: Electronic-Waste <[email protected]> * chore: add report_metrics in post_gen.py. Signed-off-by: Electronic-Waste <[email protected]> * fix: change nil error to allErrs(deleted by accident). Signed-off-by: Electronic-Waste <[email protected]> * fix: fix lint error in inject_webhook.go. Signed-off-by: Electronic-Waste <[email protected]> * fix: wrap env variables passing logics into mutatePodEnv. Signed-off-by: Electronic-Waste <[email protected]> * chore: add unit tests for mutatePodEnv. Signed-off-by: Electronic-Waste <[email protected]> * fix: delete protocmp. Signed-off-by: Electronic-Waste <[email protected]> --------- Signed-off-by: Electronic-Waste <[email protected]>
What this PR does / why we need it:
I add a new parameter
metrics_collector_config
totune
function. Design details: https://github.com/kubeflow/katib/blob/cc95ef03cb25df3d86a1a1c20c9c69bad17fce92/docs/proposals/push-based-metrics-collection.md#add-new-parameter-in-tuneWhich issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #
Checklist: