Make critical jobs Guaranteed Pod QOS: pull-kubernetes-e2e-kind #18591

Closed
spiffxp opened this issue Aug 1, 2020 · 7 comments
Labels: area/jobs, area/release-eng, kind/cleanup, sig/release, sig/testing

Comments

@spiffxp
Member

spiffxp commented Aug 1, 2020

What should be cleaned up or changed:

This is part of #18530

The following jobs should be Guaranteed Pod QOS, meaning they should have CPU and memory resource limits, and matching resource requests:

  • pull-kubernetes-e2e-kind

These jobs run on (google.com only) k8s-prow-build, so @spiffxp has provided the following guess:

  • suggest 4 cpu, slightly above 8 Gi (9?) mem
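
As a rough illustration of what "Guaranteed" means here, the sketch below classifies a pod from its containers' `resources` blocks using the documented Kubernetes QoS rules: every container needs CPU and memory limits, with requests equal to those limits (an omitted request defaults to the limit). The helper names `parse_quantity` and `qos_class` are hypothetical, the quantity parser covers only a small subset of suffixes, and the 4 cpu / 9Gi values are the guess from this issue, not necessarily what was merged.

```python
from fractions import Fraction

# Subset of Kubernetes quantity suffixes; enough for values like "4", "4000m", "9Gi".
_MULTIPLIERS = {
    "m": Fraction(1, 1000),
    "k": Fraction(10**3), "M": Fraction(10**6), "G": Fraction(10**9),
    "Ki": Fraction(2**10), "Mi": Fraction(2**20), "Gi": Fraction(2**30),
}


def parse_quantity(text):
    """Parse a quantity string into an exact number (hypothetical helper)."""
    # Try two-letter suffixes first so "Gi" is not mistaken for another suffix.
    for suffix in sorted(_MULTIPLIERS, key=len, reverse=True):
        if text.endswith(suffix):
            return Fraction(text[: -len(suffix)]) * _MULTIPLIERS[suffix]
    return Fraction(text)


def qos_class(containers):
    """Return the QoS class for a list of container `resources` dicts."""
    any_set = False
    guaranteed = True
    for resources in containers:
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        if requests or limits:
            any_set = True
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # An omitted request defaults to the limit.
            request = requests.get(name, limit)
            if limit is None or parse_quantity(request) != parse_quantity(limit):
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# Values taken from the guess above (assumption: 9Gi for "slightly above 8 Gi").
suggested = {
    "requests": {"cpu": "4", "memory": "9Gi"},
    "limits": {"cpu": "4", "memory": "9Gi"},
}
print(qos_class([suggested]))
```

If the requests were lowered below the limits (say 2 cpu requested against a 4 cpu limit), the same function would report Burstable, which is the situation this cleanup is meant to avoid for critical jobs.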

General steps to follow:

/sig testing
/sig release
/area jobs
/area release-eng

spiffxp added the kind/cleanup label Aug 1, 2020
k8s-ci-robot added the sig/testing, sig/release, area/jobs, and area/release-eng labels Aug 1, 2020
spiffxp changed the title from "Make critical jobs Guaranteed Pod QOS: pull-kubernetes-kind" to "Make critical jobs Guaranteed Pod QOS: pull-kubernetes-e2e-kind" Aug 1, 2020
spiffxp added the help wanted label Aug 1, 2020
@ameukam
Member

ameukam commented Aug 4, 2020

/assign

@spiffxp
Member Author

spiffxp commented Aug 4, 2020

/remove-help
since @ameukam has it

k8s-ci-robot removed the help wanted label Aug 4, 2020
ameukam added a commit to ameukam/test-infra that referenced this issue Aug 5, 2020
Set resources limits and requests as suggested.

Ref: kubernetes#18591
Part of: kubernetes#18530

Signed-off-by: Arnaud Meukam <[email protected]>
ameukam added a commit to ameukam/test-infra that referenced this issue Aug 6, 2020
Set resources limits and requests as suggested.

Ref: kubernetes#18591
Part of: kubernetes#18530

Signed-off-by: Arnaud Meukam <[email protected]>
@spiffxp
Member Author

spiffxp commented Aug 12, 2020

https://prow.k8s.io/?job=pull-kubernetes-e2e-kind&state=error

Seeing jobs in error state this morning. Most timing out after ~5min (which I'm assuming means they couldn't get the resources they wanted).

Two timing out at 15-17min

@ameukam
Member

ameukam commented Aug 13, 2020

With #18817 merged, things we can watch:

@ameukam
Member

ameukam commented Aug 13, 2020

I'm not sure if it's accurate (or relevant), but we got a better success rate over 48h:

Taken at ~12:20 CEST, August 14:
image

Taken at ~7:30 CEST, August 12:
image

@spiffxp
Member Author

spiffxp commented Aug 14, 2020

/close
Yeah I think this is looking much more stable

@k8s-ci-robot
Contributor

@spiffxp: Closing this issue.

