Study controller - Running HP Jobs without writing code #87

Closed
YujiOshima opened this issue May 16, 2018 · 6 comments

Comments

@YujiOshima
Contributor

I opened PR #86, which adds a study controller.
Currently, users have to write some code to use Katib in every case.
I want users to be able to run a typical use case without writing any code.
The study controller implements the logic for calling the services, running workers, and saving models.
This is a POC of the study controller: https://github.com/YujiOshima/hp-tuning/blob/51d456da58f2a77648290c175491e0692e0f3d4c/pkg/manager/studycontroller/defaultcontroller.go

In this PR, the default study controller requests all suggestions up front.
That is not suitable for Bayesian optimization or Hyperband, since they need to call GetSuggestions several times.
I implemented the study controller as a Go process, but it could be split out as a separate service, like the suggestion and early stopping services.

WDYT? @gaocegege @ddysher @libbyandhelen

@YujiOshima
Contributor Author

@gaocegege @ddysher @libbyandhelen
I drew an overview of the Katib API and resources.
https://docs.google.com/presentation/d/1KEA5vTmpDCDUl_zvnlYq2Fc0yIshzwfW6Tb8t4Zmrt0/edit?usp=sharing
Feel free to comment or edit it!

@gaocegege
Member

Sorry for the late reply. These days I am working on the tf-operator 0.2 release, so I have had no time to sync with Katib. I will take a look in two hours.

@jlewi changed the title from "Study controller" to "Study controller - Running HP Jobs without writing code" on Jul 7, 2018
@jlewi
Contributor

jlewi commented Aug 24, 2018

@YujiOshima can you provide an update please? Do you think that by mid September people will be able to submit HP jobs without having to write any code?

I see that:

#141 containing the initial CRD was merged

#152 is open to do some additional refactoring

@jlewi
Contributor

jlewi commented Sep 3, 2018

@YujiOshima How are we looking for 0.3.0? Where will we be in two weeks? We would like to cut 0.3.0 then.

@jlewi
Contributor

jlewi commented Sep 17, 2018

@YujiOshima What is the status of this issue?

@YujiOshima
Contributor Author

@jlewi Sorry for the late reply.
I had several issues with StudyjobController (#165, #166).
I fixed them in PR #170.
I think we can close this since StudyjobController is stable now.

TF-Job and other frameworks will be supported in the next release.
