
Allow concurrent runs of kubeadm e2e job. #2265

Merged (1 commit, Mar 15, 2017)

Conversation

pipejakob
Contributor

Since resources are named using this cluster name, include the BUILD_NUMBER so that the name differs between multiple concurrent prow runs in the same project.

Here's an example of one such failure, complaining about resources already existing: https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-kubeadm-gce/503?log#log

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 15, 2017
@fejta
Contributor

fejta commented Mar 15, 2017

This will need to disable checks for leaked resources -- otherwise you're likely to encounter flakiness around extra resources existing when testing ends (created by other runs of the job)

@pipejakob
Contributor Author

@krzyzacy Yeah, it might be time to migrate this to a scenario. Does kubernetes_e2e.py automatically disable leaked resource checks like @fejta suggested? Or would that still be a concern if this were a scenario?

@krzyzacy
Member

add something like

FAIL_ON_GCP_RESOURCE_LEAK=false
to the env

@krzyzacy
Member

hummmm, and now E2E_OPT depends on E2E_NAME; we need to figure out how to make this less messy

@pipejakob
Contributor Author

@fejta As it turns out, resource leaks are already disabled for this (since the KUBERNETES_PROVIDER is set to kubernetes-anywhere, and the --check-leaked-resources flag only gets set for gce/gke), but I can add the environment variable to make the intent clearer, even if it's not doing anything:

https://github.com/kubernetes/test-infra/blob/master/jobs/ci-kubernetes-e2e-kubeadm-gce.sh#L28
https://github.com/kubernetes/test-infra/blob/master/jenkins/e2e-image/e2e-runner.sh#L119
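The gating described above can be sketched in shell. This is a hedged illustration of the behavior, not the literal e2e-runner.sh code: the leak-check flag is only appended for the gce/gke providers, and FAIL_ON_GCP_RESOURCE_LEAK can disable it even there, so a kubernetes-anywhere job never gets the flag.

```shell
#!/bin/bash
# Illustrative sketch (variable names E2E_TEST_ARGS etc. are assumptions,
# not the upstream script's actual names).
KUBERNETES_PROVIDER="${KUBERNETES_PROVIDER:-kubernetes-anywhere}"
E2E_TEST_ARGS=""
case "${KUBERNETES_PROVIDER}" in
  gce|gke)
    # Leak detection defaults on for GCP providers, but can be disabled.
    if [[ "${FAIL_ON_GCP_RESOURCE_LEAK:-true}" == "true" ]]; then
      E2E_TEST_ARGS="${E2E_TEST_ARGS} --check-leaked-resources"
    fi
    ;;
esac
echo "provider=${KUBERNETES_PROVIDER} extra_args='${E2E_TEST_ARGS}'"
```

With KUBERNETES_PROVIDER=kubernetes-anywhere the case arm never fires, which is why the job was already running without leak checks regardless of the new env var.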

@fejta
Contributor

fejta commented Mar 15, 2017

Thanks for the clarification!

/assign @krzyzacy

@@ -33,13 +33,18 @@ export KUBERNETES_PROVIDER=kubernetes-anywhere
 # succeeded.
 export SCM_VERSION=$(./hack/print-workspace-status.sh | grep ^STABLE_BUILD_SCM_REVISION | cut -d' ' -f2)

-export E2E_NAME="e2e-kubeadm-gce"
+export E2E_NAME="e2e-kubeadm-${BUILD_NUMBER:=0}"
Member

${BUILD_NUMBER:-0}?

Contributor Author

Fixed.
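The review suggestion matters because := and :- behave differently in shell parameter expansion: ${VAR:=default} assigns the default back to VAR as a side effect, while ${VAR:-default} only substitutes it in place and leaves VAR untouched. A minimal sketch:

```shell
#!/bin/bash
# ${BUILD_NUMBER:=0} expands to 0 AND sets BUILD_NUMBER=0 as a side effect.
unset BUILD_NUMBER
name_assign="e2e-kubeadm-${BUILD_NUMBER:=0}"
echo "after := : name=${name_assign}, BUILD_NUMBER='${BUILD_NUMBER}'"

# ${BUILD_NUMBER:-0} expands to 0 but leaves BUILD_NUMBER unset.
unset BUILD_NUMBER
name_sub="e2e-kubeadm-${BUILD_NUMBER:-0}"
echo "after :- : name=${name_sub}, BUILD_NUMBER is ${BUILD_NUMBER-unset}"
```

Both forms produce the same cluster name here, so the fix is about not mutating BUILD_NUMBER in the job's environment as a hidden side effect.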

@pipejakob pipejakob force-pushed the kubeadm-concurrent-runs branch from 2eab8b3 to 5a1a37b Compare March 15, 2017 21:46
@krzyzacy
Member

krzyzacy commented Mar 15, 2017

self.assertNotIn('source ', script, job) LOL! @fejta

The assertion trips on the new line "+# Resource leak detection is disabled because prow runs multiple instances of", since "Resource " contains "source ".

@pipejakob
Contributor Author

I've updated the commit to set FAIL_ON_GCP_RESOURCE_LEAK=false with a comment explaining why. I didn't realize that leak detection would get confused by resources created by other jobs. I'll have to look at its implementation and see if it could be made to work better for this case, since the identifiers used for resources should never be the same between two different runs.

@pipejakob
Contributor Author

@krzyzacy Yikes at that assertion. I think in this case, converting to a regex and matching '\Wsource ' should preserve its intent and let this commit through. I'll get on that.

@pipejakob
Contributor Author

Since resources are named using this cluster name, include the
BUILD_NUMBER so it should differ between multiple concurrent runs in the
same project.

Also, adjust bootstrap_test.py to allow the word "resource" to be used.
@pipejakob pipejakob force-pushed the kubeadm-concurrent-runs branch from 5a1a37b to 0866d08 Compare March 15, 2017 21:59
@pipejakob
Contributor Author

Okay, I patched the bootstrap_test assertion, and ran it with a few changes to my job to make sure the regex was sane:

"resource "          -> pass
"source /blah"       -> fail
"    source /blah"   -> fail
"sources /blah"      -> pass
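That pass/fail table can be reproduced with a small Python check. The script snippets below are hypothetical stand-ins for job file contents; "pass" means the forbidden-source assertion would not trip:

```python
import re

# The patched bootstrap_test pattern.
PATTERN = re.compile(r'\Wsource ')

# "resource " passes: the character before "source " is the word
# character 'e', which \W does not match.
assert PATTERN.search('export FOO=1\nresource leak check\n') is None

# A real "source /blah" line is preceded by a newline, which \W
# matches, so the check still catches it.
assert PATTERN.search('export FOO=1\nsource /blah\n') is not None

# An indented "source /blah" is preceded by a space; also caught.
assert PATTERN.search('if x; then\n    source /blah\nfi\n') is not None

# "sources /blah" passes: there is no space right after "source".
assert PATTERN.search('export FOO=1\nsources /blah\n') is None
```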

@krzyzacy
Member

/lgtm

I'll try to rebase #2141; I wonder if there are other ways to get SCM_VERSION

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 15, 2017
@pipejakob
Contributor Author

Thanks, @krzyzacy!

@pipejakob pipejakob merged commit ba7f565 into kubernetes:master Mar 15, 2017
@@ -1814,7 +1814,7 @@ def testJobsDoNotSourceShell(self):
                 continue  # No clean way to determine version
             with open(job_path) as fp:
                 script = fp.read()
-            self.assertNotIn('source ', script, job)
+            self.assertFalse(re.search(r'\Wsource ', script), job)
Contributor

This no longer works:

re.search(r'\Wsource', 'source foo.sh')

Contributor

Oh I see, there's going to be a '\n' before, nice!
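The edge case in this exchange: \W requires a preceding character, so a "source" at the very start of a string slips through the regex. In practice job scripts begin with a shebang line, so any real "source" line is preceded by a newline. A quick sketch (file contents are illustrative):

```python
import re

# At the start of a string there is no character for \W to match,
# so a leading "source" would not be caught:
assert re.search(r'\Wsource ', 'source foo.sh\n') is None

# But real job scripts start with a shebang, so every "source" line
# is preceded by a '\n' and the check still fires:
assert re.search(r'\Wsource ', '#!/bin/bash\nsource foo.sh\n') is not None
```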

@krzyzacy
Member

error: Error loading config file "/workspace/kubernetes-anywhere/phase1/gce/.tmp/kubeconfig.json": yaml: line 3: mapping values are not allowed in this context

seems like it's still troubled; who generates kubeconfig.json?

@pipejakob
Contributor Author

That's a separate race condition I have a PR out to fix: kubernetes-retired/kubernetes-anywhere#357

4 participants