feat: Do not set TF_CONFIG for local training #1080

Merged: 2 commits merged on Sep 12, 2019

@gaocegege (Member) commented Sep 11, 2019

Close #1078

/assign @johnugeorge @richardsliu

Signed-off-by: Ce Gao [email protected]
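
The title says TF_CONFIG should not be set for local training. A minimal sketch of that decision follows, assuming "local" means the job has exactly one replica in total. The type `replicaCounts` and the functions `needsTFConfig` and `envFor` are hypothetical names for illustration and are not taken from the operator's code.

```go
package main

import "fmt"

// replicaCounts maps a replica type name (for example "Worker" or "PS")
// to its replica count. Hypothetical stand-in for the TFJob replica specs.
type replicaCounts map[string]int

// totalReplicas sums the replica counts across all replica types.
func totalReplicas(r replicaCounts) int {
	total := 0
	for _, n := range r {
		total += n
	}
	return total
}

// needsTFConfig reports whether the job looks distributed.
// Assumption: a job with at most one replica in total is local training.
func needsTFConfig(r replicaCounts) bool {
	return totalReplicas(r) > 1
}

// envFor returns the extra environment entries for a pod.
// For a local job it returns nil, so TF_CONFIG is left unset.
func envFor(r replicaCounts, tfConfigJSON string) []string {
	if !needsTFConfig(r) {
		return nil
	}
	return []string{"TF_CONFIG=" + tfConfigJSON}
}

func main() {
	local := replicaCounts{"Worker": 1}
	distributed := replicaCounts{"Worker": 4, "PS": 2}
	fmt.Println(len(envFor(local, "{}")), len(envFor(distributed, "{}")))
}
```

The distributed case here reuses the replica counts that appear in the test log (4 workers, 2 parameter servers). How the real controller counts replicas may differ.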



@coveralls commented Sep 11, 2019

Coverage remained the same at 96.512% when pulling 0216797 on gaocegege:feat into 5c0a06b on kubeflow:master.

@TravisBuddy

Travis tests have failed

Hey @gaocegege,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

1st Build

gometalinter --config=linter_config.json --vendor ./...
pkg/controller.v1/tensorflow/pod.go:272:2:warning: should use 'return <expr>' instead of 'if <expr> { return <bool> }; return <bool>' (S1008) (staticcheck)
pkg/controller.v1/tensorflow/pod_test.go:1::warning: file is not goimported (goimports)
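
The S1008 warning above asks for a boolean condition to be returned directly instead of being wrapped in an `if` that returns literal `true` or `false`. A minimal sketch of the flagged pattern and its simplification follows. The predicate `isEmpty` is a hypothetical example and is not the code at `pod.go:272`.

```go
package main

import "fmt"

// isEmptyVerbose shows the shape that staticcheck S1008 flags:
// an if statement whose only job is to return a boolean literal.
func isEmptyVerbose(s string) bool {
	if len(s) == 0 {
		return true
	}
	return false
}

// isEmpty is the simplified form: return the expression itself.
func isEmpty(s string) bool {
	return len(s) == 0
}

func main() {
	for _, s := range []string{"", "worker"} {
		fmt.Printf("%q verbose=%t direct=%t\n", s, isEmptyVerbose(s), isEmpty(s))
	}
}
```

Both functions return the same result for every input, so the change is purely stylistic. The second warning (`goimports`) is usually cleared by running `goimports -w` on the reported file.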
goveralls -service=travis-ci -v -package ./pkg/... -ignore "pkg/client/*/*.go,pkg/client/*/*/*.go,pkg/client/*/*/*/*.go,pkg/client/*/*/*/*/*.go,pkg/client/*/*/*/*/*/*.go,pkg/client/*/*/*/*/*/*/*.go,pkg/util/testutil/*.go,pkg/apis/tensorflow/*/zz_generated.*.go,pkg/apis/tensorflow/*/*_generated.go,pkg/apis/common/*/zz_generated.*.go,pkg/apis/common/*/*_generated.go"
?   	github.com/kubeflow/tf-operator/pkg/apis/common/v1	[no test files]
=== RUN   TestSetTypeNames
--- PASS: TestSetTypeNames (0.00s)
=== RUN   TestSetDefaultTFJob
--- PASS: TestSetDefaultTFJob (0.00s)
=== RUN   TestIsChieforMaster
--- PASS: TestIsChieforMaster (0.00s)
PASS
coverage: 20.9% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
ok  	github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1	0.030s	coverage: 20.9% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
=== RUN   TestValidateV1TFJobSpec
time="2019-09-11T09:01:39Z" level=error msg="TFJobSpec is not valid: Image is undefined in the container of Worker"
time="2019-09-11T09:01:39Z" level=error msg="TFJobSpec is not valid: There is no container named tensorflow in Worker"
--- PASS: TestValidateV1TFJobSpec (0.00s)
PASS
coverage: 14.2% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
ok  	github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation	0.032s	coverage: 14.2% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
?   	github.com/kubeflow/tf-operator/pkg/client/clientset/versioned	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/informers/externalversions	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1	[no test files]
=== RUN   TestGenGeneralName
--- PASS: TestGenGeneralName (0.00s)
PASS
coverage: 0.5% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
ok  	github.com/kubeflow/tf-operator/pkg/common/jobcontroller	0.022s	coverage: 0.5% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
?   	github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured	[no test files]
=== RUN   TestCreatePods
--- PASS: TestCreatePods (0.01s)
=== RUN   TestCreateService
time="2019-09-11T09:01:54Z" level=info msg="Controller test-tfjob created service empty_service"
--- PASS: TestCreateService (0.00s)
=== RUN   TestCreateServicesWithControllerRef
time="2019-09-11T09:01:54Z" level=info msg="Controller test-tfjob created service empty_service"
--- PASS: TestCreateServicesWithControllerRef (0.00s)
=== RUN   TestClaimServices
--- PASS: TestClaimServices (0.00s)
PASS
coverage: 41.1% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
ok  	github.com/kubeflow/tf-operator/pkg/control	0.057s	coverage: 41.1% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
=== RUN   TestNormalPath
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (2.623934ms)" job=default.test-tfjob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: worker-2" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: worker-2" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (867.113µs)" job=default.test-tfjob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-2 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=3, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (860.366µs)" job=default.test-tfjob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=4, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (595.707µs)" job=default.test-tfjob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=1, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (1.946475ms)" job=default.test-tfjob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-2 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-3 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/ps-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/ps-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=0, running=0, failed=0" job=default.test-tfjob uid=
E0911 09:02:01.289582    9202 event.go:259] Could not construct reference to: '&v1.TFJob{TypeMeta:v1.TypeMeta{Kind:"TFJob", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"test-tfjob", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.TFJobSpec{ActiveDeadlineSeconds:(*int64)(nil), BackoffLimit:(*int32)(nil), CleanPodPolicy:(*v1.CleanPodPolicy)(0xc420703010), TTLSecondsAfterFinished:(*int32)(nil), TFReplicaSpecs:map[v1.TFReplicaType]*v1.ReplicaSpec{"PS":(*v1.ReplicaSpec)(0xc42047edc0), "Worker":(*v1.ReplicaSpec)(0xc42047f080)}}, Status:v1.JobStatus{Conditions:[]v1.JobCondition(nil), ReplicaStatuses:map[v1.ReplicaType]*v1.ReplicaStatus{"Worker":(*v1.ReplicaStatus)(0xc4206da6e0), "PS":(*v1.ReplicaStatus)(0xc4206da6b0)}, StartTime:(*v1.Time)(0xc42070d140), CompletionTime:(*v1.Time)(nil), LastReconcileTime:(*v1.Time)(nil)}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'TFJobSucceeded' 'TFJob test-tfjob successfully completed.'
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (5.280328ms)" job=default.test-tfjob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (341.825µs)" job=default.test-tfjob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: ps-0" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: ps-0" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: worker-1" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: worker-2" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: worker-1" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: worker-2" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (978.677µs)" job=default.test-tfjob
--- PASS: TestNormalPath (0.03s)
=== RUN   TestRun
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Starting TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Waiting for informer caches to sync"
time="2019-09-11T09:02:01Z" level=info msg="Starting 1 workers"
time="2019-09-11T09:02:01Z" level=info msg="Started workers"
time="2019-09-11T09:02:01Z" level=info msg="Shutting down workers"
--- PASS: TestRun (0.50s)
=== RUN   TestAddTFJob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="TFJob test-tfjob is created." job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Starting TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Waiting for informer caches to sync"
time="2019-09-11T09:02:01Z" level=info msg="Starting 1 workers"
time="2019-09-11T09:02:01Z" level=info msg="Started workers"
--- PASS: TestAddTFJob (0.10s)
=== RUN   TestCopyLabelsAndAnnotation
time="2019-09-11T09:02:01Z" level=info msg="Shutting down workers"
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Need to create new pod: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="need to create new service: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (410.252µs)" job=default.test-tfjob
--- PASS: TestCopyLabelsAndAnnotation (0.00s)
=== RUN   TestDeletePodsAndServices
time="2019-09-11T09:02:01Z" level=info msg="Starting TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Waiting for informer caches to sync"
time="2019-09-11T09:02:01Z" level=info msg="Starting 1 workers"
time="2019-09-11T09:02:01Z" level=info msg="Started workers"
time="2019-09-11T09:02:01Z" level=info msg="Shutting down workers"
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (320.512µs)" job=default.test-tfjob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (364.549µs)" job=default.test-tfjob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-2 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-3 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/ps-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/ps-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (365.33µs)" job=default.test-tfjob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-2 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/worker-3 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/ps-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Ignoring inactive pod default/ps-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (639.256µs)" job=default.test-tfjob
--- PASS: TestDeletePodsAndServices (0.01s)
=== RUN   TestCleanupTFJob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (286.276µs)" job=default.test-tfjob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:01Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:01Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (295.29µs)" job=default.test-tfjob
time="2019-09-11T09:02:01Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:01Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:03Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:03Z" level=info msg="Ignoring inactive pod default/worker-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:03Z" level=info msg="Ignoring inactive pod default/worker-2 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:03Z" level=info msg="Ignoring inactive pod default/worker-3 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:03Z" level=info msg="Ignoring inactive pod default/ps-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:03Z" level=info msg="Ignoring inactive pod default/ps-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:03Z" level=info msg="Ignoring inactive pod default/worker-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:02:03Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (672.819µs)" job=default.test-tfjob
--- PASS: TestCleanupTFJob (2.01s)
=== RUN   TestActiveDeadlineSeconds
time="2019-09-11T09:02:03Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:03Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:03Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:03Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:03Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=4, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:03Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:03Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:05Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
--- PASS: TestActiveDeadlineSeconds (2.01s)
=== RUN   TestBackoffForOnFailure
time="2019-09-11T09:02:05Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:05Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:05Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:05Z" level=warning msg="The restart policy of replica PS of the job test-tfjob is not OnFailure or Always. Not counted in backoff limit." job=default.test-tfjob uid=
time="2019-09-11T09:02:05Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (530.413µs)" job=default.test-tfjob
--- PASS: TestBackoffForOnFailure (0.00s)
=== RUN   TestAddPod
time="2019-09-11T09:02:05Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:05Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:05Z" level=info msg="Starting TFJob controller"
time="2019-09-11T09:02:05Z" level=info msg="Waiting for informer caches to sync"
time="2019-09-11T09:02:06Z" level=info msg="Starting 1 workers"
time="2019-09-11T09:02:06Z" level=info msg="Started workers"
--- PASS: TestAddPod (0.10s)
=== RUN   TestClusterSpec
time="2019-09-11T09:02:06Z" level=info msg="Shutting down workers"
--- PASS: TestClusterSpec (0.00s)
=== RUN   TestIsDistributed
--- PASS: TestIsDistributed (0.00s)
=== RUN   TestRestartPolicy
--- PASS: TestRestartPolicy (0.00s)
=== RUN   TestExitCode
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Starting TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Waiting for informer caches to sync"
time="2019-09-11T09:02:06Z" level=info msg="Ignoring inactive pod default/worker-0 in state Failed, deletion time <nil>"
time="2019-09-11T09:02:06Z" level=info msg="Pod: default.worker-0 exited with code 130" job=default.test-tfjob replica-type=worker uid=
E0911 09:02:06.032091    9202 event.go:259] Could not construct reference to: '&v1.TFJob{TypeMeta:v1.TypeMeta{Kind:"TFJob", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"test-tfjob", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.TFJobSpec{ActiveDeadlineSeconds:(*int64)(nil), BackoffLimit:(*int32)(nil), CleanPodPolicy:(*v1.CleanPodPolicy)(0xc4203e40d0), TTLSecondsAfterFinished:(*int32)(nil), TFReplicaSpecs:map[v1.TFReplicaType]*v1.ReplicaSpec{"Worker":(*v1.ReplicaSpec)(0xc4205d22c0)}}, Status:v1.JobStatus{Conditions:[]v1.JobCondition(nil), ReplicaStatuses:map[v1.ReplicaType]*v1.ReplicaStatus{"Worker":(*v1.ReplicaStatus)(0xc42059ff10)}, StartTime:(*v1.Time)(nil), CompletionTime:(*v1.Time)(nil), LastReconcileTime:(*v1.Time)(nil)}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'ExitedWithCode' 'Pod: default.worker-0 exited with code 130'
time="2019-09-11T09:02:06Z" level=info msg="Need to restart the pod: default.worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=0, failed=1" job=default.test-tfjob uid=
E0911 09:02:06.032649    9202 event.go:259] Could not construct reference to: '&v1.TFJob{TypeMeta:v1.TypeMeta{Kind:"TFJob", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"test-tfjob", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.TFJobSpec{ActiveDeadlineSeconds:(*int64)(nil), BackoffLimit:(*int32)(nil), CleanPodPolicy:(*v1.CleanPodPolicy)(0xc4203e40d0), TTLSecondsAfterFinished:(*int32)(nil), TFReplicaSpecs:map[v1.TFReplicaType]*v1.ReplicaSpec{"Worker":(*v1.ReplicaSpec)(0xc4205d22c0)}}, Status:v1.JobStatus{Conditions:[]v1.JobCondition(nil), ReplicaStatuses:map[v1.ReplicaType]*v1.ReplicaStatus{"Worker":(*v1.ReplicaStatus)(0xc42059ff10)}, StartTime:(*v1.Time)(0xc4204223c0), CompletionTime:(*v1.Time)(nil), LastReconcileTime:(*v1.Time)(nil)}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Warning' 'TFJobRestarting' 'TFJob test-tfjob is restarting because 1 Worker replica(s) failed.'
time="2019-09-11T09:02:06Z" level=info msg="need to create new service: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:02:06Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (1.895473ms)" job=default.test-tfjob
--- PASS: TestExitCode (0.00s)
time="2019-09-11T09:02:06Z" level=info msg="Starting 1 workers"
time="2019-09-11T09:02:06Z" level=info msg="Started workers"
time="2019-09-11T09:02:06Z" level=info msg="Shutting down workers"
=== RUN   TestAddService
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="Starting TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Waiting for informer caches to sync"
time="2019-09-11T09:02:06Z" level=info msg="Starting 1 workers"
time="2019-09-11T09:02:06Z" level=info msg="Started workers"
time="2019-09-11T09:02:06Z" level=info msg="Shutting down workers"
--- PASS: TestAddService (0.10s)
=== RUN   TestFailed
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=3, running=0, failed=1" job=default.test-tfjob uid=
E0911 09:02:06.137642    9202 event.go:259] Could not construct reference to: '&v1.TFJob{TypeMeta:v1.TypeMeta{Kind:"TFJob", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"test-tfjob", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.TFJobSpec{ActiveDeadlineSeconds:(*int64)(nil), BackoffLimit:(*int32)(nil), CleanPodPolicy:(*v1.CleanPodPolicy)(0xc420327a20), TTLSecondsAfterFinished:(*int32)(nil), TFReplicaSpecs:map[v1.TFReplicaType]*v1.ReplicaSpec{"Worker":(*v1.ReplicaSpec)(0xc4205d2dc0)}}, Status:v1.JobStatus{Conditions:[]v1.JobCondition(nil), ReplicaStatuses:map[v1.ReplicaType]*v1.ReplicaStatus{"Worker":(*v1.ReplicaStatus)(0xc420375f60)}, StartTime:(*v1.Time)(0xc4204eb220), CompletionTime:(*v1.Time)(nil), LastReconcileTime:(*v1.Time)(nil)}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'TFJobFailed' 'TFJob test-tfjob has failed because 1 Worker replica(s) failed.'
--- PASS: TestFailed (0.00s)
=== RUN   TestStatus
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=1, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=0, failed=1" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=0, failed=1" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=1, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=2, failed=2" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=2, running=0, failed=2" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=3, running=3, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=1, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=0, failed=4" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=1, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=1, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=1, failed=1" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=0, failed=1" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=0, failed=4" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:02:06Z" level=info msg="Creating Job controller"
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=0, failed=1" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=0, failed=4" job=default.test-tfjob uid=
time="2019-09-11T09:02:06Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
--- PASS: TestStatus (0.02s)
=== RUN   TestGenOwnerReference
--- PASS: TestGenOwnerReference (0.00s)
=== RUN   TestGenLabels
--- PASS: TestGenLabels (0.00s)
=== RUN   TestConvertTFJobToUnstructured
--- PASS: TestConvertTFJobToUnstructured (0.00s)
PASS
coverage: 51.1% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
ok  	github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow	4.935s	coverage: 51.1% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
?   	github.com/kubeflow/tf-operator/pkg/logger	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/util	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/util/k8sutil	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/util/signals	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/util/train	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/version	[no test files]
ignoring pkg/apis/common/v1/openapi_generated.go
ignoring pkg/apis/common/v1/zz_generated.deepcopy.go
ignoring pkg/apis/common/v1/zz_generated.defaults.go
ignoring pkg/apis/tensorflow/v1/openapi_generated.go
ignoring pkg/apis/tensorflow/v1/zz_generated.deepcopy.go
ignoring pkg/apis/tensorflow/v1/zz_generated.defaults.go
Job #2578.1
https://coveralls.io/jobs/53109921
TravisBuddy Request Identifier: d6cb78d0-d472-11e9-847a-07722ef8bbdd

@TravisBuddy

Travis tests have failed

Hey @gaocegege,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

1st Build

View build log

gometalinter --config=linter_config.json --vendor ./...
pkg/controller.v1/tensorflow/pod_test.go:1::warning: file is not goimported (goimports)
goveralls -service=travis-ci -v -package ./pkg/... -ignore "pkg/client/*/*.go,pkg/client/*/*/*.go,pkg/client/*/*/*/*.go,pkg/client/*/*/*/*/*.go,pkg/client/*/*/*/*/*/*.go,pkg/client/*/*/*/*/*/*/*.go,pkg/util/testutil/*.go,pkg/apis/tensorflow/*/zz_generated.*.go,pkg/apis/tensorflow/*/*_generated.go,pkg/apis/common/*/zz_generated.*.go,pkg/apis/common/*/*_generated.go"
?   	github.com/kubeflow/tf-operator/pkg/apis/common/v1	[no test files]
=== RUN   TestSetTypeNames
--- PASS: TestSetTypeNames (0.00s)
=== RUN   TestSetDefaultTFJob
--- PASS: TestSetDefaultTFJob (0.00s)
=== RUN   TestIsChieforMaster
--- PASS: TestIsChieforMaster (0.00s)
PASS
coverage: 20.9% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
ok  	github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1	0.035s	coverage: 20.9% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
=== RUN   TestValidateV1TFJobSpec
time="2019-09-11T09:21:18Z" level=error msg="TFJobSpec is not valid: Image is undefined in the container of Worker"
time="2019-09-11T09:21:18Z" level=error msg="TFJobSpec is not valid: There is no container named tensorflow in Worker"
--- PASS: TestValidateV1TFJobSpec (0.00s)
PASS
coverage: 14.2% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
ok  	github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation	0.030s	coverage: 14.2% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
?   	github.com/kubeflow/tf-operator/pkg/client/clientset/versioned	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/informers/externalversions	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1	[no test files]
=== RUN   TestGenGeneralName
--- PASS: TestGenGeneralName (0.00s)
PASS
coverage: 0.5% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
ok  	github.com/kubeflow/tf-operator/pkg/common/jobcontroller	0.022s	coverage: 0.5% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
?   	github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured	[no test files]
=== RUN   TestCreatePods
--- PASS: TestCreatePods (0.01s)
=== RUN   TestCreateService
time="2019-09-11T09:21:33Z" level=info msg="Controller test-tfjob created service empty_service"
--- PASS: TestCreateService (0.00s)
=== RUN   TestCreateServicesWithControllerRef
time="2019-09-11T09:21:33Z" level=info msg="Controller test-tfjob created service empty_service"
--- PASS: TestCreateServicesWithControllerRef (0.00s)
=== RUN   TestClaimServices
--- PASS: TestClaimServices (0.00s)
PASS
coverage: 41.1% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
ok  	github.com/kubeflow/tf-operator/pkg/control	0.062s	coverage: 41.1% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
=== RUN   TestNormalPath
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-2 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=3, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (2.782385ms)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=4, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (545.33µs)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=1, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (700.849µs)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-3 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/ps-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/ps-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-2 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=0, running=0, failed=0" job=default.test-tfjob uid=
E0911 09:21:40.237201    9236 event.go:259] Could not construct reference to: '&v1.TFJob{TypeMeta:v1.TypeMeta{Kind:"TFJob", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"test-tfjob", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.TFJobSpec{ActiveDeadlineSeconds:(*int64)(nil), BackoffLimit:(*int32)(nil), CleanPodPolicy:(*v1.CleanPodPolicy)(0xc4204ed700), TTLSecondsAfterFinished:(*int32)(nil), TFReplicaSpecs:map[v1.TFReplicaType]*v1.ReplicaSpec{"PS":(*v1.ReplicaSpec)(0xc42064f600), "Worker":(*v1.ReplicaSpec)(0xc42064f8c0)}}, Status:v1.JobStatus{Conditions:[]v1.JobCondition(nil), ReplicaStatuses:map[v1.ReplicaType]*v1.ReplicaStatus{"Worker":(*v1.ReplicaStatus)(0xc4205e2360)}, StartTime:(*v1.Time)(0xc4204efd00), CompletionTime:(*v1.Time)(nil), LastReconcileTime:(*v1.Time)(nil)}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'TFJobSucceeded' 'TFJob test-tfjob successfully completed.'
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (2.272972ms)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (698.886µs)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: worker-1" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: worker-2" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: worker-1" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: worker-2" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: ps-0" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: ps-0" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (874.462µs)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (373.536µs)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: ps-1" job=default.test-tfjob replica-type=ps uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: worker-2" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: worker-2" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: worker-3" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (1.238246ms)" job=default.test-tfjob
--- PASS: TestNormalPath (0.03s)
=== RUN   TestRun
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Starting TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Waiting for informer caches to sync"
time="2019-09-11T09:21:40Z" level=info msg="Starting 1 workers"
time="2019-09-11T09:21:40Z" level=info msg="Started workers"
time="2019-09-11T09:21:40Z" level=info msg="Shutting down workers"
--- PASS: TestRun (0.50s)
=== RUN   TestAddTFJob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="TFJob test-tfjob is created." job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Starting TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Waiting for informer caches to sync"
time="2019-09-11T09:21:40Z" level=info msg="Starting 1 workers"
time="2019-09-11T09:21:40Z" level=info msg="Started workers"
--- PASS: TestAddTFJob (0.10s)
=== RUN   TestCopyLabelsAndAnnotation
time="2019-09-11T09:21:40Z" level=info msg="Shutting down workers"
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Starting TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Waiting for informer caches to sync"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Need to create new pod: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="need to create new service: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (345.803µs)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Starting 1 workers"
time="2019-09-11T09:21:40Z" level=info msg="Started workers"
time="2019-09-11T09:21:40Z" level=info msg="Shutting down workers"
--- PASS: TestCopyLabelsAndAnnotation (0.00s)
=== RUN   TestDeletePodsAndServices
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (351.522µs)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (262.298µs)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/ps-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-2 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-3 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/ps-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (333.531µs)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/ps-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-2 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/worker-3 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Ignoring inactive pod default/ps-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (320.518µs)" job=default.test-tfjob
--- PASS: TestDeletePodsAndServices (0.00s)
=== RUN   TestCleanupTFJob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (239.326µs)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:40Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:40Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (224.306µs)" job=default.test-tfjob
time="2019-09-11T09:21:40Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:40Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:42Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:42Z" level=info msg="Ignoring inactive pod default/worker-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:42Z" level=info msg="Ignoring inactive pod default/worker-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:42Z" level=info msg="Ignoring inactive pod default/worker-2 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:42Z" level=info msg="Ignoring inactive pod default/worker-3 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:42Z" level=info msg="Ignoring inactive pod default/ps-0 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:42Z" level=info msg="Ignoring inactive pod default/ps-1 in state Succeeded, deletion time <nil>"
time="2019-09-11T09:21:42Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (667.423µs)" job=default.test-tfjob
--- PASS: TestCleanupTFJob (2.01s)
=== RUN   TestActiveDeadlineSeconds
time="2019-09-11T09:21:42Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:42Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:42Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:42Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:42Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=4, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:42Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:42Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:44Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
--- PASS: TestActiveDeadlineSeconds (2.00s)
=== RUN   TestBackoffForOnFailure
time="2019-09-11T09:21:44Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:44Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:44Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:44Z" level=warning msg="The restart policy of replica PS of the job test-tfjob is not OnFailure or Always. Not counted in backoff limit." job=default.test-tfjob uid=
time="2019-09-11T09:21:44Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (384.741µs)" job=default.test-tfjob
--- PASS: TestBackoffForOnFailure (0.00s)
=== RUN   TestAddPod
time="2019-09-11T09:21:44Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:44Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:44Z" level=info msg="Starting TFJob controller"
time="2019-09-11T09:21:44Z" level=info msg="Waiting for informer caches to sync"
time="2019-09-11T09:21:44Z" level=info msg="Starting 1 workers"
time="2019-09-11T09:21:44Z" level=info msg="Started workers"
--- PASS: TestAddPod (0.10s)
time="2019-09-11T09:21:44Z" level=info msg="Shutting down workers"
=== RUN   TestClusterSpec
--- PASS: TestClusterSpec (0.00s)
=== RUN   TestIsDistributed
--- PASS: TestIsDistributed (0.00s)
=== RUN   TestRestartPolicy
--- PASS: TestRestartPolicy (0.00s)
=== RUN   TestExitCode
time="2019-09-11T09:21:44Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:44Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:44Z" level=info msg="Starting TFJob controller"
time="2019-09-11T09:21:44Z" level=info msg="Waiting for informer caches to sync"
time="2019-09-11T09:21:44Z" level=info msg="Reconcile TFJobs test-tfjob" job=default.test-tfjob uid=
time="2019-09-11T09:21:44Z" level=info msg="Ignoring inactive pod default/worker-0 in state Failed, deletion time <nil>"
time="2019-09-11T09:21:44Z" level=info msg="Pod: default.worker-0 exited with code 130" job=default.test-tfjob replica-type=worker uid=
E0911 09:21:44.977159    9236 event.go:259] Could not construct reference to: '&v1.TFJob{TypeMeta:v1.TypeMeta{Kind:"TFJob", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"test-tfjob", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.TFJobSpec{ActiveDeadlineSeconds:(*int64)(nil), BackoffLimit:(*int32)(nil), CleanPodPolicy:(*v1.CleanPodPolicy)(0xc4204edc80), TTLSecondsAfterFinished:(*int32)(nil), TFReplicaSpecs:map[v1.TFReplicaType]*v1.ReplicaSpec{"Worker":(*v1.ReplicaSpec)(0xc4205d82c0)}}, Status:v1.JobStatus{Conditions:[]v1.JobCondition(nil), ReplicaStatuses:map[v1.ReplicaType]*v1.ReplicaStatus{"Worker":(*v1.ReplicaStatus)(0xc42001b480)}, StartTime:(*v1.Time)(nil), CompletionTime:(*v1.Time)(nil), LastReconcileTime:(*v1.Time)(nil)}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'ExitedWithCode' 'Pod: default.worker-0 exited with code 130'
time="2019-09-11T09:21:44Z" level=info msg="Need to restart the pod: default.worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:44Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=0, failed=1" job=default.test-tfjob uid=
E0911 09:21:44.977523    9236 event.go:259] Could not construct reference to: '&v1.TFJob{TypeMeta:v1.TypeMeta{Kind:"TFJob", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"test-tfjob", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.TFJobSpec{ActiveDeadlineSeconds:(*int64)(nil), BackoffLimit:(*int32)(nil), CleanPodPolicy:(*v1.CleanPodPolicy)(0xc4204edc80), TTLSecondsAfterFinished:(*int32)(nil), TFReplicaSpecs:map[v1.TFReplicaType]*v1.ReplicaSpec{"Worker":(*v1.ReplicaSpec)(0xc4205d82c0)}}, Status:v1.JobStatus{Conditions:[]v1.JobCondition(nil), ReplicaStatuses:map[v1.ReplicaType]*v1.ReplicaStatus{"Worker":(*v1.ReplicaStatus)(0xc42001b480)}, StartTime:(*v1.Time)(0xc4200d88c0), CompletionTime:(*v1.Time)(nil), LastReconcileTime:(*v1.Time)(nil)}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Warning' 'TFJobRestarting' 'TFJob test-tfjob is restarting because 1 Worker replica(s) failed.'
time="2019-09-11T09:21:44Z" level=info msg="need to create new service: worker-0" job=default.test-tfjob replica-type=worker uid=
time="2019-09-11T09:21:44Z" level=info msg="Finished syncing tfjob \"default/test-tfjob\" (1.37874ms)" job=default.test-tfjob
time="2019-09-11T09:21:44Z" level=info msg="Starting 1 workers"
time="2019-09-11T09:21:44Z" level=info msg="Started workers"
time="2019-09-11T09:21:44Z" level=info msg="Shutting down workers"
--- PASS: TestExitCode (0.00s)
=== RUN   TestAddService
time="2019-09-11T09:21:44Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:44Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:44Z" level=info msg="Starting TFJob controller"
time="2019-09-11T09:21:44Z" level=info msg="Waiting for informer caches to sync"
time="2019-09-11T09:21:45Z" level=info msg="Starting 1 workers"
time="2019-09-11T09:21:45Z" level=info msg="Started workers"
--- PASS: TestAddService (0.10s)
=== RUN   TestFailed
time="2019-09-11T09:21:45Z" level=info msg="Shutting down workers"
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=3, running=0, failed=1" job=default.test-tfjob uid=
E0911 09:21:45.080964    9236 event.go:259] Could not construct reference to: '&v1.TFJob{TypeMeta:v1.TypeMeta{Kind:"TFJob", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"test-tfjob", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.TFJobSpec{ActiveDeadlineSeconds:(*int64)(nil), BackoffLimit:(*int32)(nil), CleanPodPolicy:(*v1.CleanPodPolicy)(0xc4206b54f0), TTLSecondsAfterFinished:(*int32)(nil), TFReplicaSpecs:map[v1.TFReplicaType]*v1.ReplicaSpec{"Worker":(*v1.ReplicaSpec)(0xc420902840)}}, Status:v1.JobStatus{Conditions:[]v1.JobCondition(nil), ReplicaStatuses:map[v1.ReplicaType]*v1.ReplicaStatus{"Worker":(*v1.ReplicaStatus)(0xc420044b50)}, StartTime:(*v1.Time)(0xc42083e560), CompletionTime:(*v1.Time)(nil), LastReconcileTime:(*v1.Time)(nil)}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'TFJobFailed' 'TFJob test-tfjob has failed because 1 Worker replica(s) failed.'
--- PASS: TestFailed (0.00s)
=== RUN   TestStatus
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=1, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=0, failed=1" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=0, failed=1" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=1, running=1, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=2, failed=2" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=2, running=0, failed=2" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=3, running=3, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=1, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=0, failed=4" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=1, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=1, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=1, failed=1" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=0, failed=1" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=0, running=0, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=0, failed=4" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="Creating TFJob controller"
time="2019-09-11T09:21:45Z" level=info msg="Creating Job controller"
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Chief expected=1, running=0, failed=1" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=Worker expected=4, running=0, failed=4" job=default.test-tfjob uid=
time="2019-09-11T09:21:45Z" level=info msg="TFJob=test-tfjob, ReplicaType=PS expected=2, running=2, failed=0" job=default.test-tfjob uid=
--- PASS: TestStatus (0.02s)
=== RUN   TestGenOwnerReference
--- PASS: TestGenOwnerReference (0.00s)
=== RUN   TestGenLabels
--- PASS: TestGenLabels (0.00s)
=== RUN   TestConvertTFJobToUnstructured
--- PASS: TestConvertTFJobToUnstructured (0.00s)
PASS
coverage: 51.0% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
ok  	github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow	4.926s	coverage: 51.0% of statements in github.com/kubeflow/tf-operator/pkg/apis/common/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/apis/tensorflow/validation, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/fake, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/scheme, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/clientset/versioned/typed/tensorflow/v1/fake, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/internalinterfaces, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow, github.com/kubeflow/tf-operator/pkg/client/informers/externalversions/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/client/listers/tensorflow/v1, github.com/kubeflow/tf-operator/pkg/common/jobcontroller, github.com/kubeflow/tf-operator/pkg/common/util/v1/testutil, github.com/kubeflow/tf-operator/pkg/common/util/v1/unstructured, github.com/kubeflow/tf-operator/pkg/control, github.com/kubeflow/tf-operator/pkg/controller.v1/tensorflow, github.com/kubeflow/tf-operator/pkg/logger, github.com/kubeflow/tf-operator/pkg/util, github.com/kubeflow/tf-operator/pkg/util/k8sutil, github.com/kubeflow/tf-operator/pkg/util/signals, github.com/kubeflow/tf-operator/pkg/util/train, github.com/kubeflow/tf-operator/pkg/version
?   	github.com/kubeflow/tf-operator/pkg/logger	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/util	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/util/k8sutil	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/util/signals	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/util/train	[no test files]
?   	github.com/kubeflow/tf-operator/pkg/version	[no test files]
ignoring pkg/apis/common/v1/openapi_generated.go
ignoring pkg/apis/common/v1/zz_generated.deepcopy.go
ignoring pkg/apis/common/v1/zz_generated.defaults.go
ignoring pkg/apis/tensorflow/v1/openapi_generated.go
ignoring pkg/apis/tensorflow/v1/zz_generated.deepcopy.go
ignoring pkg/apis/tensorflow/v1/zz_generated.defaults.go
Job #2579.1
https://coveralls.io/jobs/53110579
TravisBuddy Request Identifier: 95265410-d475-11e9-847a-07722ef8bbdd

@gaocegege
Member Author

/assign @johnugeorge

Please have a look

@@ -226,19 +231,47 @@ func setClusterSpec(podTemplateSpec *v1.PodTemplateSpec, tfjob *tfv1.TFJob, rt,
 	if tfConfigStr == "" {
 		return nil
 	}
-	// Add TF_CONFIG environment variable.
+	// Add TF_CONFIG environment variable to tensorflow container in the pod.
Member

This patch might also be needed in PyTorch.

Member Author

I remember that you said it does not have a side effect in PyTorch. Do we need it?
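
For readers following this thread, here is a minimal sketch of the kind of check the PR title describes: decide whether a job is distributed before setting TF_CONFIG, and skip it for local training (for example 1 worker and 0 PS, as in the linked issue). This is not the code merged in this PR. `ReplicaType`, `ReplicaSpec`, and `isDistributed` are simplified, hypothetical stand-ins for illustration, and the real API types may differ.

```go
package main

import "fmt"

// ReplicaType and ReplicaSpec are hypothetical, simplified stand-ins for the
// operator's API types; they are assumptions, not the real definitions.
type ReplicaType string

const (
	Chief  ReplicaType = "Chief"
	Worker ReplicaType = "Worker"
	PS     ReplicaType = "PS"
)

type ReplicaSpec struct {
	Replicas int32
}

// isDistributed reports whether the job runs more than one replica in total.
// A job with a single replica is treated as local training, so a caller
// could skip setting TF_CONFIG for it.
func isDistributed(specs map[ReplicaType]*ReplicaSpec) bool {
	var total int32
	for _, spec := range specs {
		if spec != nil {
			total += spec.Replicas
		}
	}
	// Return the expression directly rather than branching on it.
	return total > 1
}

func main() {
	local := map[ReplicaType]*ReplicaSpec{Worker: {Replicas: 1}}
	dist := map[ReplicaType]*ReplicaSpec{
		Worker: {Replicas: 2},
		PS:     {Replicas: 1},
	}
	fmt.Println(isDistributed(local)) // false
	fmt.Println(isDistributed(dist))  // true
}
```

Returning the comparison directly also avoids the `if <expr> { return <bool> }; return <bool>` pattern that staticcheck flags as S1008.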

@johnugeorge
Member

/lgtm

@johnugeorge
Member

/approve

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johnugeorge

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Successfully merging this pull request may close these issues.

[bug] Cannot initialize the training job with TF Estimator when the user uses 1 worker and 0 PS
6 participants