Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support multi namespaces #39

Merged
merged 1 commit into from
Sep 19, 2017
Merged

support multi namespaces #39

merged 1 commit into from
Sep 19, 2017

Conversation

loadwiki
Copy link
Contributor

If tfjob controller runs as a deployment in k8s default namespace, we can't not create tfjob crd in another namespace.This PR fix the issue.There are three major modification:

1)tf job controller should watch crd event in all namespaces.

2)tfjob controller creates/deletes training job/tensorboard in specific namespace instead of its own ns.

3)tfjob controller maintain a map from crd name to crd object.The map key should add crd namespace as prefix.

Copy link
Contributor

@jlewi jlewi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great thanks for contributing.
Sorry for the slow reply. One simple question please see comment.

@@ -75,8 +75,8 @@ func (c *TfJobRestClient) Client() *http.Client {
}

func (c *TfJobRestClient) Watch(host, ns string, httpClient *http.Client, resourceVersion string) (*http.Response, error) {
return c.restcli.Client.Get(fmt.Sprintf("%s/apis/%s/%s/namespaces/%s/%s?watch=true&resourceVersion=%s",
host, spec.CRDGroup, spec.CRDVersion, ns, spec.CRDKindPlural, resourceVersion))
return c.restcli.Client.Get(fmt.Sprintf("%s/apis/%s/%s/%s?watch=true&resourceVersion=%s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we need namespace anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In k8s read api(such as watch,list), http request URL with no namespace implies that the request will look up the target object in all namespaces.
See kubernetes/client-go#159.
You can also try a test using commander below:
kubectl --alsologtostderr -v=8 get job -w --all-namespace

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation.

@jlewi
Copy link
Contributor

jlewi commented Sep 19, 2017

LGTM.

@jlewi jlewi merged commit d2d9c8c into kubeflow:master Sep 19, 2017
@jlewi
Copy link
Contributor

jlewi commented Sep 19, 2017

@loadwiki
I published a new image with the changes:
gcr.io/tf-on-k8s-dogfood/tf_operator:8d53edf

Thanks again for the fix.

@loadwiki
Copy link
Contributor Author

You are welcome.
I found some problem in k8s garbage collection just now. The current GC process only collects pod,svc,deployment in the tf-controller namespace. I will fix the issue later.

@jlewi
Copy link
Contributor

jlewi commented Sep 19, 2017

@loadwiki Thank you that's greatly appreciated. When you have a chance please create an issue for this problem.

oksanabaza pushed a commit to oksanabaza/training-operator that referenced this pull request Jan 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants