Add watch function for TFJob python Client API #1122

Merged
merged 1 commit into from
Jan 2, 2020

2 changes: 1 addition & 1 deletion pkg/apis/tensorflow/v1/openapi_generated.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/apis/tensorflow/v1/types.go
@@ -1,4 +1,4 @@
-// Copyright 2019 The Kubeflow Authors
+// Copyright 2020 The Kubeflow Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
2 changes: 1 addition & 1 deletion pkg/apis/tensorflow/v1/zz_generated.deepcopy.go

2 changes: 1 addition & 1 deletion pkg/apis/tensorflow/v1/zz_generated.defaults.go

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/clientset.go

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/doc.go

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/fake/clientset_generated.go

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/fake/doc.go

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/fake/register.go

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/scheme/doc.go

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/scheme/register.go

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/typed/tensorflow/v1/doc.go

2 changes: 1 addition & 1 deletion pkg/client/informers/externalversions/factory.go

2 changes: 1 addition & 1 deletion pkg/client/informers/externalversions/generic.go

2 changes: 1 addition & 1 deletion pkg/client/listers/tensorflow/v1/expansion_generated.go

2 changes: 1 addition & 1 deletion pkg/client/listers/tensorflow/v1/tfjob.go

15 changes: 13 additions & 2 deletions sdk/python/docs/TFJobClient.md
@@ -96,7 +96,7 @@ namespace | str | Namespace for tfjob deploying to. If the `namespace` is not de
object

## get
-> get(name=None, namespace=None)
+> get(name=None, namespace=None, watch=False, timeout_seconds=600)

Get the created tfjob in the specified namespace

@@ -114,7 +114,8 @@ Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
name | str | The TFJob name. If the `name` is not specified, it will get all tfjobs in the namespace.| Optional. |
namespace | str | The tfjob's namespace. Defaults to current or default namespace.| Optional |
-
+watch | bool | If `True`, watch the created TFJob; otherwise return the created TFJob object. Watching stops when the optional `timeout_seconds` is reached or once the TFJob status is `Succeeded` or `Failed`. | Optional |
+timeout_seconds | int | Timeout in seconds for watching. Defaults to 600. | Optional |

### Return type
object
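
The stop conditions described for `watch` (a terminal `Succeeded` or `Failed` status, or the `timeout_seconds` window elapsing) can be sketched independently of the SDK. This is a minimal illustration under stated assumptions, not the TFJobClient implementation: `watch_until_done`, the `(name, state)` event tuples, and the injectable `clock` are hypothetical names, and the plain list stands in for a real Kubernetes watch stream.

```python
import time

TERMINAL_STATES = {"Succeeded", "Failed"}


def watch_until_done(events, timeout_seconds=600, clock=time.monotonic):
    """Consume (name, state) events until a terminal state or the timeout.

    `events` is any iterable of (name, state) tuples (hypothetical shape);
    a real client would read them from a watch stream instead.
    """
    deadline = clock() + timeout_seconds
    seen = []
    for name, state in events:
        seen.append((name, state))
        if state in TERMINAL_STATES:
            break  # the job finished, so stop watching
        if clock() >= deadline:
            break  # the watch window elapsed before the job finished
    return seen


# Hypothetical event stream; the trailing event is never consumed because
# watching stops at the first terminal state.
stream = [("mnist", "Created"), ("mnist", "Running"),
          ("mnist", "Succeeded"), ("mnist", "Running")]
history = watch_until_done(stream)
print(history[-1])
```

Passing the clock as a parameter is only a convenience for exercising the timeout path without waiting; it is not something the documented API exposes.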
@@ -180,6 +181,7 @@ object
> namespace=None,
> timeout_seconds=600,
> polling_interval=30,
+> watch=False,
> status_callback=None):

Wait for the specified job to finish.
@@ -191,6 +193,14 @@ from kubeflow.tfjob import TFJobClient

tfjob_client = TFJobClient()
tfjob_client.wait_for_job('mnist', namespace='kubeflow')
+
+# The API also supports watching the TFJob status until it is Succeeded or Failed.
+tfjob_client.wait_for_job('mnist', namespace=namespace, watch=True)
+NAME STATE TIME
+mnist Created 2019-12-31T09:20:07Z
+mnist Running 2019-12-31T09:20:19Z
+mnist Running 2019-12-31T09:20:19Z
+mnist Succeeded 2019-12-31T09:22:04Z
```
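
Each row of the watch output above pairs the job name with a state and a timestamp. As a rough sketch, assuming the state and time are taken from the last entry of `status.conditions` (an assumption; the SDK's actual formatting code is not part of this excerpt), a row could be built as follows. `format_row` and the column widths are hypothetical, and the dict is a trimmed sample shaped like the TFJob objects in the notebook output.

```python
ROW_FORMAT = "{:<10} {:<12} {}"  # hypothetical column widths


def format_row(tfjob):
    """Build one NAME/STATE/TIME row from the job's most recent condition."""
    latest = tfjob["status"]["conditions"][-1]  # assumes newest condition is last
    return ROW_FORMAT.format(
        tfjob["metadata"]["name"],
        latest["type"],
        latest["lastTransitionTime"],
    )


# Trimmed sample object; only the fields used by format_row are included.
job = {
    "metadata": {"name": "mnist"},
    "status": {"conditions": [
        {"type": "Created", "lastTransitionTime": "2019-12-31T09:20:07Z"},
        {"type": "Running", "lastTransitionTime": "2019-12-31T09:20:19Z"},
    ]},
}

header = ROW_FORMAT.format("NAME", "STATE", "TIME")
row = format_row(job)
print(header)
print(row)
```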

### Parameters
@@ -201,6 +211,7 @@ namespace | str | The tfjob's namespace. Defaults to current or default namespac
timeout_seconds | int | How long to wait for the job, in seconds. Defaults to 600. | Optional|
polling_interval | int | How often to poll for the status of the job.| Optional|
status_callback | callable | If supplied, this callable is invoked after each poll of the job. It takes a single argument, the tfjob.| Optional|
+watch | bool | If `True`, watch the TFJob. Watching stops when the optional `timeout_seconds` is reached or once the TFJob status is `Succeeded` or `Failed`. | Optional |

### Return type
object
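
The non-watch path described by `timeout_seconds`, `polling_interval`, and `status_callback` amounts to a polling loop. The following is a minimal sketch of that pattern, not the SDK's code: `wait_for_state`, the `get_job` callable, the simplified `{"state": ...}` job dict, and the injectable `sleep` are all hypothetical, and raising `TimeoutError` on expiry is an assumption about how a timeout might be reported.

```python
import time


def wait_for_state(get_job, timeout_seconds=600, polling_interval=30,
                   status_callback=None, sleep=time.sleep):
    """Poll `get_job()` until its state is Succeeded or Failed, or time out."""
    waited = 0
    while True:
        job = get_job()
        if status_callback is not None:
            status_callback(job)  # invoked after every poll, with the job
        if job["state"] in ("Succeeded", "Failed"):
            return job
        if waited >= timeout_seconds:
            raise TimeoutError("timed out waiting for the job to finish")
        sleep(polling_interval)
        waited += polling_interval


# Hypothetical job source that advances one state per poll.
states = iter(["Created", "Running", "Succeeded"])
polled = []
result = wait_for_state(
    lambda: {"state": next(states)},
    status_callback=polled.append,
    sleep=lambda seconds: None,  # skip real sleeping in this demonstration
)
print(result["state"], len(polled))
```

Compared with the watch mode added in this PR, a loop like this only notices a state change at the next poll, so a shorter `polling_interval` trades more API calls for lower latency.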
84 changes: 24 additions & 60 deletions sdk/python/examples/kubeflow-tfjob-sdk.ipynb
@@ -120,13 +120,13 @@
"text/plain": [
"{'apiVersion': 'kubeflow.org/v1',\n",
" 'kind': 'TFJob',\n",
-" 'metadata': {'creationTimestamp': '2019-12-17T05:40:26Z',\n",
+" 'metadata': {'creationTimestamp': '2019-12-31T09:20:07Z',\n",
" 'generation': 1,\n",
" 'name': 'mnist',\n",
" 'namespace': 'default',\n",
-" 'resourceVersion': '13585452',\n",
+" 'resourceVersion': '20125141',\n",
" 'selfLink': '/apis/kubeflow.org/v1/namespaces/default/tfjobs/mnist',\n",
-" 'uid': 'b9faefd7-208f-11ea-9e34-00000a1001ee'},\n",
+" 'uid': 'bcb3b867-2bae-11ea-8c04-00000a1001ee'},\n",
" 'spec': {'cleanPodPolicy': 'None',\n",
" 'tfReplicaSpecs': {'Worker': {'replicas': 1,\n",
" 'restartPolicy': 'Never',\n",
@@ -166,13 +166,13 @@
"text/plain": [
"{'apiVersion': 'kubeflow.org/v1',\n",
" 'kind': 'TFJob',\n",
-" 'metadata': {'creationTimestamp': '2019-12-17T05:40:26Z',\n",
+" 'metadata': {'creationTimestamp': '2019-12-31T09:20:07Z',\n",
" 'generation': 1,\n",
" 'name': 'mnist',\n",
" 'namespace': 'default',\n",
-" 'resourceVersion': '13585464',\n",
+" 'resourceVersion': '20125155',\n",
" 'selfLink': '/apis/kubeflow.org/v1/namespaces/default/tfjobs/mnist',\n",
-" 'uid': 'b9faefd7-208f-11ea-9e34-00000a1001ee'},\n",
+" 'uid': 'bcb3b867-2bae-11ea-8c04-00000a1001ee'},\n",
" 'spec': {'cleanPodPolicy': 'None',\n",
" 'tfReplicaSpecs': {'Worker': {'replicas': 1,\n",
" 'restartPolicy': 'Never',\n",
@@ -183,14 +183,14 @@
" '--batch_size=150'],\n",
" 'image': 'gcr.io/kubeflow-ci/tf-mnist-with-summaries:1.0',\n",
" 'name': 'tensorflow'}]}}}}},\n",
-" 'status': {'conditions': [{'lastTransitionTime': '2019-12-17T05:40:26Z',\n",
-" 'lastUpdateTime': '2019-12-17T05:40:26Z',\n",
+" 'status': {'conditions': [{'lastTransitionTime': '2019-12-31T09:20:07Z',\n",
+" 'lastUpdateTime': '2019-12-31T09:20:07Z',\n",
" 'message': 'TFJob mnist is created.',\n",
" 'reason': 'TFJobCreated',\n",
" 'status': 'True',\n",
" 'type': 'Created'}],\n",
" 'replicaStatuses': {'Worker': {}},\n",
-" 'startTime': '2019-12-17T05:40:26Z'}}"
+" 'startTime': '2019-12-31T09:20:09Z'}}"
]
},
"execution_count": 5,
@@ -217,7 +217,7 @@
{
"data": {
"text/plain": [
-"'Running'"
+"'Created'"
]
},
"execution_count": 6,
@@ -242,57 +242,19 @@
"metadata": {},
"outputs": [
{
-"data": {
-"text/plain": [
-"{'apiVersion': 'kubeflow.org/v1',\n",
-" 'kind': 'TFJob',\n",
-" 'metadata': {'creationTimestamp': '2019-12-17T05:40:26Z',\n",
-" 'generation': 1,\n",
-" 'name': 'mnist',\n",
-" 'namespace': 'default',\n",
-" 'resourceVersion': '13586024',\n",
-" 'selfLink': '/apis/kubeflow.org/v1/namespaces/default/tfjobs/mnist',\n",
-" 'uid': 'b9faefd7-208f-11ea-9e34-00000a1001ee'},\n",
-" 'spec': {'cleanPodPolicy': 'None',\n",
-" 'tfReplicaSpecs': {'Worker': {'replicas': 1,\n",
-" 'restartPolicy': 'Never',\n",
-" 'template': {'spec': {'containers': [{'command': ['python',\n",
-" '/var/tf_mnist/mnist_with_summaries.py',\n",
-" '--log_dir=/train/logs',\n",
-" '--learning_rate=0.01',\n",
-" '--batch_size=150'],\n",
-" 'image': 'gcr.io/kubeflow-ci/tf-mnist-with-summaries:1.0',\n",
-" 'name': 'tensorflow'}]}}}}},\n",
-" 'status': {'completionTime': '2019-12-17T05:42:19Z',\n",
-" 'conditions': [{'lastTransitionTime': '2019-12-17T05:40:26Z',\n",
-" 'lastUpdateTime': '2019-12-17T05:40:26Z',\n",
-" 'message': 'TFJob mnist is created.',\n",
-" 'reason': 'TFJobCreated',\n",
-" 'status': 'True',\n",
-" 'type': 'Created'},\n",
-" {'lastTransitionTime': '2019-12-17T05:40:36Z',\n",
-" 'lastUpdateTime': '2019-12-17T05:40:36Z',\n",
-" 'message': 'TFJob mnist is running.',\n",
-" 'reason': 'TFJobRunning',\n",
-" 'status': 'False',\n",
-" 'type': 'Running'},\n",
-" {'lastTransitionTime': '2019-12-17T05:42:19Z',\n",
-" 'lastUpdateTime': '2019-12-17T05:42:19Z',\n",
-" 'message': 'TFJob mnist successfully completed.',\n",
-" 'reason': 'TFJobSucceeded',\n",
-" 'status': 'True',\n",
-" 'type': 'Succeeded'}],\n",
-" 'replicaStatuses': {'Worker': {'succeeded': 1}},\n",
-" 'startTime': '2019-12-17T05:40:26Z'}}"
-]
-},
-"execution_count": 7,
-"metadata": {},
-"output_type": "execute_result"
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"NAME STATE TIME \n",
+"mnist Created 2019-12-31T09:20:07Z \n",
+"mnist Running 2019-12-31T09:20:19Z \n",
+"mnist Running 2019-12-31T09:20:19Z \n",
+"mnist Succeeded 2019-12-31T09:22:04Z \n"
+]
}
],
"source": [
-"tfjob_client.wait_for_job('mnist', namespace=namespace)"
+"tfjob_client.wait_for_job('mnist', namespace=namespace, watch=True)"
]
},
{
@@ -305,7 +267,9 @@
{
"cell_type": "code",
"execution_count": 8,
-"metadata": {},
+"metadata": {
+"scrolled": true
+},
"outputs": [
{
"data": {
@@ -344,7 +308,7 @@
" 'details': {'name': 'mnist',\n",
" 'group': 'kubeflow.org',\n",
" 'kind': 'tfjobs',\n",
-" 'uid': 'b9faefd7-208f-11ea-9e34-00000a1001ee'}}"
+" 'uid': 'bcb3b867-2bae-11ea-8c04-00000a1001ee'}}"
]
},
"execution_count": 9,