Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable to run Experiment without Goal #1065

Merged

Conversation

andreyvelich
Copy link
Member

@andreyvelich andreyvelich commented Feb 21, 2020

Fixes: #1060.

After this PR users should be able to run Experiment without specifying Goal.

/assign @johnugeorge
/cc @borremosch


This change is Reviewable

@k8s-ci-robot
Copy link

@andreyvelich: GitHub didn't allow me to request PR reviews from the following users: borremosch.

Note that only kubeflow members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

Fixes: #1060.

After this PR users should be able to run Experiment without specifying Goal.

/assign @johnugeorge
/cc @borremosch

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@johnugeorge
Copy link
Member

johnugeorge commented Feb 21, 2020

Inorder to verify in CI, shall we remove goal in one of the example yaml ?
May be from tfjob example https://github.com/kubeflow/katib/blob/master/examples/v1alpha3/tfjob-example.yaml

@andreyvelich
Copy link
Member Author

@johnugeorge Sure

@andreyvelich
Copy link
Member Author

/retest

Copy link
Member

@gaocegege gaocegege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change or add a test case to test the feature?

Thanks

@andreyvelich
Copy link
Member Author

Can we change or add a test case to test the feature?

Thanks

I added e2e test with this feature for TFJob Experiment.
Do you want to test it in the experiment_controller_test.go also?

Copy link
Member

@gaocegege gaocegege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

No need, I think, thanks.

@johnugeorge
Copy link
Member

I see a problem in merging this now. If examples are updated, people trying katib release of 1.0 will see issues as code fix will not be included. We have to hold this for some time till images are added to manifests

@johnugeorge
Copy link
Member

/hold

@andreyvelich
Copy link
Member Author

@johnugeorge Sounds good to me.

@hougangliu
Copy link
Member

/lgtm

Copy link
Member

@gaocegege gaocegege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@gaocegege
Copy link
Member

/hold

@k8s-ci-robot k8s-ci-robot removed the lgtm label Mar 23, 2020
@andreyvelich
Copy link
Member Author

In order to not confuse user with new example of TFJob, I removed changes in the example.
As @johnugeorge mentioned above, if user uses Katib images from current 1.0 manifests release and new example without the goal, user will get the error.

We can update examples in the future versions and merge these changes now.

/cc @johnugeorge @gaocegege @hougangliu

@johnugeorge
Copy link
Member

/lgtm

@johnugeorge
Copy link
Member

/approve

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gaocegege, johnugeorge

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [gaocegege,johnugeorge]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@johnugeorge
Copy link
Member

/hold cancel

@k8s-ci-robot k8s-ci-robot merged commit d622f4a into kubeflow:master Mar 23, 2020
sperlingxx pushed a commit to sperlingxx/katib that referenced this pull request Jul 9, 2020
* Enable to run experiment without goal

* Remove goal from tfjob example

* Fix e2e test

* fix test

* Add goal to tfjob example
@andreyvelich andreyvelich deleted the issue-1060-no-goal-controller-crash branch October 6, 2021 00:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Katib controller goes into CrashLoopBackoff when no goal is specified
5 participants