Unable to parse NaN value in Trial CR #1227
Comments
In addition, I think we should fail the trial if the job succeeds without any ObservationLog, which is usually caused by an incorrect metrics collector configuration. Currently, the trial status stays stuck at pending in this situation.
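The rule proposed above can be sketched roughly as follows. This is only an illustration: `TrialCondition`, its constants, and `nextCondition` are hypothetical names, not the Katib API, and the real controller tracks more states than this.

```go
package main

import "fmt"

// TrialCondition is a hypothetical, simplified trial status.
// It is not the actual Katib API type.
type TrialCondition string

const (
	TrialRunning   TrialCondition = "Running"
	TrialSucceeded TrialCondition = "Succeeded"
	TrialFailed    TrialCondition = "Failed"
)

// nextCondition decides the trial condition from the job state and the
// number of observation entries that were reported for the trial.
func nextCondition(jobSucceeded bool, observations int) TrialCondition {
	switch {
	case !jobSucceeded:
		// The job has not finished successfully yet; keep waiting.
		return TrialRunning
	case observations == 0:
		// The job finished but nothing was collected: fail the trial
		// instead of leaving it pending forever.
		return TrialFailed
	default:
		return TrialSucceeded
	}
}

func main() {
	fmt.Println(nextCondition(false, 0))
	fmt.Println(nextCondition(true, 0))
	fmt.Println(nextCondition(true, 3))
}
```

Failing early like this would surface a misconfigured metrics collector to the user rather than hiding it behind a trial that never completes.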
@sperlingxx As you can see here: https://github.com/kubeflow/katib/blob/master/pkg/controller.v1beta1/trial/trial_controller_util.go#L126-L139, we move a Trial to the succeeded state only if at least one observation was reported. My suggestion for this issue is to convert all metric parameters in the API to "string".
What do you think, @sperlingxx @gaocegege @johnugeorge?
SGTM.
SGTM
Will make this change.
/kind bug
I tried to run an Experiment with additional metric names that were not collected from the Trial pod.
Because of that, the Trial controller tries to write a math.NaN() value to the observation, and the Trial can't be updated. I got the error here, while comparing Trial statuses.
I am not sure that Kubernetes objects support math.NaN(), since they must be valid JSON. We should think about how to handle empty metric results.
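The JSON concern can be reproduced outside Kubernetes with Go's standard `encoding/json` package, which refuses to encode NaN. In the sketch below, `Metric`, `encodeMetric`, and `missingValue` are hypothetical stand-ins and not the actual Katib types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// missingValue models what a float64-based observation holds for a metric
// that was never collected.
var missingValue = math.NaN()

// Metric is a hypothetical stand-in for a metric entry in a Trial observation.
type Metric struct {
	Name  string  `json:"name"`
	Value float64 `json:"value"`
}

// encodeMetric marshals a metric to JSON and returns any encoding error.
func encodeMetric(m Metric) ([]byte, error) {
	return json.Marshal(m)
}

func main() {
	// A finite value encodes without problems.
	out, err := encodeMetric(Metric{Name: "accuracy", Value: 0.93})
	fmt.Println(string(out), err)

	// NaN has no JSON representation, so Marshal returns an error.
	_, err = encodeMetric(Metric{Name: "loss", Value: missingValue})
	fmt.Println("encoding NaN:", err)
}
```

Any object that has to be serialized to JSON will hit the same limitation, which is why an empty metric result needs a different representation than NaN.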
This issue can be related to #889.
/cc @sperlingxx @gaocegege @johnugeorge