-
Notifications
You must be signed in to change notification settings - Fork 710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dashboard V1 #125
Dashboard V1 #125
Conversation
Hi @wbuchwalter. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this.
I don't know much javascript/FE. So I mostly looked at the none JS code.
It might be useful if you could expand the README to include key points about the design of the FE.
pkg/util/k8sutil/tpr_util.go
Outdated
@@ -128,7 +128,7 @@ func (c *TfJobRestClient) Update(ns string, j *spec.TfJob) (*spec.TfJob, error) | |||
} | |||
|
|||
func (c *TfJobRestClient) Delete(ns, name string) error { | |||
_, err := c.restcli.Delete().Resource(spec.CRDKindPlural).Namespace(ns).DoRaw() | |||
_, err := c.restcli.Delete().Resource(spec.CRDKindPlural).Namespace(ns).Name(name).DoRaw() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this change need to be rebased? I thought you already submitted the fix to the Delete method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I need to rebase this, duplicated fix.
images/dashboard/build_and_push.py
Outdated
@@ -0,0 +1,146 @@ | |||
#!/usr/bin/python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you reuse py/build_and_push_image.py rather than duplicating a lot of that code?
You can have a shim script if needed to called build_and_push with appropriate arguments.
(I'm trying to get rid of that code duplication and have a set of reusable python functions).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't want to lose time on this before the point mentioned above (one or two docker images) is settled.
If we go with 2, I will clean this.
images/dashboard/Dockerfile
Outdated
@@ -0,0 +1,12 @@ | |||
# TODO(jlewi): How can we make it work with golang:1.8.2-alpine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want a separate Docker image for the dashboard?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion on this. having a single image would make the release scripts easier to maintain. I don't see any obvious drawbacks.
examples/tf_job.yaml
Outdated
@@ -1,7 +1,7 @@ | |||
apiVersion: "mlkube.io/v1beta1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you sync with master please? mlkube.io has been replaced by tensorflow.org?
} | ||
|
||
buildTensorBoardSpec() { | ||
// "tensorboard": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we delete this commented out code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be finished before merging. Missing Volumes
and VolumeMounts
.
dashboard/deploy.yaml
Outdated
@@ -0,0 +1,31 @@ | |||
apiVersion: v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use a helm package to make it easy to configure? Should it be part of the existing helm package or have its own helm package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to suggest integrating it as part of the current helm chart, with options to enable/disable the dashboard as well as wether it should be exposed on an internal or external IP.
For branding, I'd suggest tensorflow/k8s. kubeflow doesn't exist yet . What does the frontend need to mount volumes? |
I looked at the current demo. Regarding logging, is there a way we can take advantage of different logging backends? For example GKE uses stackdriver logging which offers a nice UI which supports a fairly rich filter syntax. So it might be nice if on GKE clicking on logs redirected to the stackdriver UI with appropriate log filter. I assume you could do something similar on other clouds? Also, it looks like when you click on logs, the logs are displayed in a dialog box. Instead of using a dialog box, could you just use a text box? I.e expand the box for master/worker/ps etc... and show a text box with the logs? |
👍
This is for the creation of new TfJobs. You should be able to describe a volume backed by Azure Files, GlusterFS etc. |
} | ||
|
||
type TfJobList struct { | ||
TfJobs []spec.TfJob `json:"tfjobs"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tfJobs
?
per above
import './App.css'; | ||
import Home from './Home.js' | ||
|
||
let headerStyle = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use const
); | ||
} | ||
return ( | ||
<div> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in react 16 no need for wrapper
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we get rid of this per @jimexist's comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @jimexist is mistaken here.
JSX still needs a wrapper element, see airbnb/javascript#1575 for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can wrap them in an array - to get rid of one level of DOM element. but it's minor issue I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or if you use react 16.2 you can use <>
and </>
: https://reactjs.org/blog/2017/11/28/react-v16.2.0-fragment-support.html
}; | ||
|
||
deleteTfJob(ns, name) | ||
.catch(console.log); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use console.error
const JobList = ({jobs}) => { | ||
let jobSummaries = [] | ||
for (let i = 0; i < jobs.length; i++) { | ||
jobSummaries.push(<JobSummary key={i} job={jobs[i]} />) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just use jobs.map
would do, but it's a matter of personal taste
const ReplicaSpec = ({ spec, status, pods }) => { | ||
// let podComponents = pods.map(p => <Pod pod={p} />); | ||
return ( | ||
<div> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for <div/>
?
images/dashboard/Dockerfile
Outdated
# TODO(jlewi): How can we make it work with golang:1.8.2-alpine | ||
FROM golang:1.8.2 | ||
|
||
RUN mkdir -p /opt/dashboard |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
combining multiple runs would be better
dashboard/frontend/package.json
Outdated
"private": true, | ||
"dependencies": { | ||
"material-ui": "^0.19.4", | ||
"react": "^16.0.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add prop-types if you are feeling good :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and license?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
personal suggestion: prettier.js
and standard.js
@jimexist A proposal would be great. |
@jlewi Sorry for being slow on this, tangled in other commitments... I'll try to finish this today. |
Great thanks! |
@jlewi So the airflow job still fails currently as it is still using the release script from master, and consequently not building the dashboard's backend. Trying to understand what's wrong. |
So, |
/ok-to-test |
So I submitted a fix to the issue with not using the checked out version. Right now it looks like the go program is failing to build
Travis is passing though. The E2E test is running glide install. Maybe we should stop doing that since dependencies should now be checked in. |
.gitignore
Outdated
@@ -2,6 +2,10 @@ | |||
# only so we exclude them. | |||
bin/ | |||
|
|||
vendor/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove this? Now that we check in vendor we don't want in gitignoe.
@jlewi not exactly sure why the last airflow run was marked as failed as I can see that all steps were successful in airflow |
Must be an issue with the test-infrastructure. Possibly a timeout? |
/test all |
On a different test I noticed the error
|
I'll try adding retries as part of #241 |
/test ok |
/test all |
/test all |
Addresses #57
To test, create k8s/dashboard/dashboard/deploy.yaml in your cluster.
Left To Do for v1:
This change is