Speedup E2E test by running build and setup cluster in parallel #659
jlewi added a commit to jlewi/k8s that referenced this issue Jun 15, 2018
…n parallel * To do this we split the setup step into two steps 1. setting up the cluster and 2. setting up Kubeflow. Fix kubeflow#659
jlewi added a commit to jlewi/k8s that referenced this issue Jun 15, 2018
* Fix kubeflow#634
* Speedup the E2E test by running the build and setup cluster steps in parallel
* To do this we split the setup step into two steps 1. setting up the cluster and 2. setting up Kubeflow. Fix kubeflow#659
* Shorten the name of the workflow for v1alpha2
* Otherwise the label for the workflow pod becomes too long and argo can't run it.
* Pin the test worker image so that we don't get broken when someone updates the latest image
* Make it a parameter in the prow_config.yaml
* Use a file lock to ensure only one instance of test_runner is modifying the ksonnet app at a time; this should help with various test flakes.
yph152 pushed a commit to yph152/tf-operator that referenced this issue Jun 18, 2018
jetmuffin pushed a commit to jetmuffin/tf-operator that referenced this issue Jul 9, 2018
We should be able to speed up the E2E test by running the setup cluster step and the build step in parallel.
The setup cluster step waits for the TFJob operator deployment to start, which will be blocked until the build step completes. This should be fine as long as the timeout is long enough.
However, it would be better to make the wait for deployment a separate step that depends on both the cluster step and the build step.
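
The dependency structure proposed above can be sketched in plain Python. This is only an illustration of the ordering, not the project's actual workflow definition: the function names `build`, `setup_cluster`, and `wait_for_deployment` are hypothetical stand-ins, and the sleeps simulate work. The build and cluster steps run concurrently, and the wait step starts only after both have finished.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def build(log):
    """Hypothetical stand-in for the build step."""
    time.sleep(0.1)  # simulate build time
    log.append("build")
    return "image"


def setup_cluster(log):
    """Hypothetical stand-in for the setup cluster step."""
    time.sleep(0.1)  # simulate cluster creation time
    log.append("setup_cluster")
    return "cluster"


def wait_for_deployment(image, cluster, log):
    """Hypothetical separate step that needs both earlier results."""
    log.append("wait_for_deployment")
    return f"{image} deployed on {cluster}"


def run_pipeline():
    log = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Submit both independent steps so they run in parallel.
        build_future = pool.submit(build, log)
        cluster_future = pool.submit(setup_cluster, log)
        # result() blocks, so both steps are done before we continue.
        image = build_future.result()
        cluster = cluster_future.result()
    # The dependent step runs only after both prerequisites completed.
    result = wait_for_deployment(image, cluster, log)
    return result, log
```

Calling `run_pipeline()` returns the final result along with the order in which the steps finished. With this shape, no timeout in the cluster step has to cover the build time, because the wait is no longer part of cluster setup.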