1.14-1.15 Test Infra Changes. #1290

Closed
2 of 5 tasks
timothysc opened this issue Nov 29, 2018 · 23 comments
Labels
area/releasing area/test lifecycle/active priority/important-longterm

Comments

@timothysc
Member

timothysc commented Nov 29, 2018

There is a large amount of technical debt that needs to get paid down to eliminate issues in CI.

  • Move Kops to periodic and release blocking
  • Make KIND a PR blocking job
    edit: there is a KIND pre-submit now, but it is not blocking yet. it can be called on demand with /test pull-kubernetes-e2e-kind
  • Move kubeadm-kind jobs to release blocking
  • Implement upgrade and skew tests for kubeadm-kind
  • Add cluster-api aws and release blocking for kubeadm HA verification

/cc @kubernetes/sig-cluster-lifecycle
/assign @fabriziopandini @timothysc @neolit123

@neolit123 neolit123 added area/releasing priority/important-longterm area/test labels Nov 29, 2018
@neolit123 neolit123 added this to the v1.14 milestone Nov 29, 2018
@fabriziopandini
Member

@timothysc I'm working on making kind support more use cases, but it would be nice to clarify the expectations/requirements a bit better.

As a short-term goal, I'm targeting having kind ready for testing a few variations of init - join > test workflows, which is the minimum required for starting the k/a replacement.
Then I have in mind more complex variations (HA, external etcd, upgrades).

@neolit123
Member

Make KIND a PR blocking job

it would be interesting to define what the PR blocking job would be, some examples:

  • 1 master (conformance)
  • 1 master + 3 workers (non conformance??)
  • 1 master + 3 workers (conformance)

currently @BenTheElder has kind passing conformance on a periodic for single master pretty consistently.
https://k8s-testgrid.appspot.com/sig-testing-kind#conformance,%20master%20(dev)

also serial vs non-serial is an interesting topic:

  • the non-serial runs are 20 minutes only.
  • while the serial runs are ~1:30 hours.

but we should probably loop in more folks from sig-testing on this topic and/or create an issue in test-infra for the above.
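
As a rough illustration of the serial vs non-serial split: Kubernetes e2e test names carry tags such as [Conformance] and [Serial], and jobs usually select tests with focus/skip regular expressions over those names. The Go sketch below classifies names the same way. The bucket function and the test names are hypothetical, not taken from any job definition.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	conformance = regexp.MustCompile(`\[Conformance\]`)
	serial      = regexp.MustCompile(`\[Serial\]`)
)

// bucket reports which run a test would land in: "parallel" for conformance
// tests without the [Serial] tag, "serial" for conformance tests with it,
// and "" for tests that are not conformance tests at all.
func bucket(name string) string {
	switch {
	case !conformance.MatchString(name):
		return ""
	case serial.MatchString(name):
		return "serial"
	default:
		return "parallel"
	}
}

func main() {
	// Hypothetical names, shaped like e2e test descriptions.
	names := []string{
		"[sig-node] Pods should be submitted and removed [Conformance]",
		"[sig-apps] Daemon set [Serial] should run and stop simple daemon [Conformance]",
		"[sig-storage] Example test without a conformance tag",
	}
	for _, n := range names {
		fmt.Printf("%q: %s\n", bucket(n), n)
	}
}
```

A parallel job would then focus on the first pattern and skip the second, while a serial job would require both.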

@timothysc
Member Author

I think we should have kind as a PR blocking job right now and move kops to periodic to unblock the community.

@timothysc
Member Author

HA w/kind is a nice-to-have, but not a requirement. TBH I think it would be weird.

@neolit123
Member

it feels like the only good option we have in terms of testing HA without a CP.

@BenTheElder
Member

some notes:

  • technically multi-node is necessary to fully pass conformance, else one test is "skipped", but otherwise they do generally pass.
    • running the bulk of them in parallel is much faster with ~15 minute CI jobs, but recently has gotten flakier on k8s master branch :/
  • lots of e2e tests that are not conformance do hacky stuff that may need some work to port / support

I think we should have kind as a PR blocking job right now and move kops to periodic to unblock the community.

I've spoken to @justinsb about this, I think we're both onboard there. I've just finally cut a binary release of the kind CLI yesterday as things are fairly stable, modulo changes from @fabriziopandini related to HA / multi-node etc. We should be able to create jobs that use stable released versions now.

There might be some concerns from others though regarding the presubmits ...

Personally I would like to see more tests be post-submit based and release blocking rather than presubmit, and move cloud providers there as they move out of tree. That requires wider buy-in and better ownership of CI signal though I think ...

@ixdy
Member

ixdy commented Nov 29, 2018

Remove Bazel generation of .spec and .deb artifacts with standard .in files that can override build variables.

Can you elaborate on what you mean by this?

@rdodev

rdodev commented Nov 30, 2018

/watching topic

@timothysc
Member Author

/assign @liztio @rdodev

@timothysc timothysc changed the title 1.14 Build + Test Infra Changes. 1.14 Test Infra Changes. Dec 3, 2018
@timothysc
Member Author

@ixdy - Moving build details here kubernetes/kubernetes#71677

@neolit123 neolit123 pinned this issue Jan 1, 2019
@fabriziopandini fabriziopandini unpinned this issue Jan 5, 2019
@neolit123
Member

Move Kops to periodic and release blocking

ticked this item in the OP.
kops-aws was removed from PR blocking and release blocking due to an AWS account issue.
my understanding is that it might make it back in release blocking.

Make KIND a PR blocking job

there is some chance that this can happen this cycle.
but this depends on sig-testing decisions.

in terms of our dashboard we are going to have kind jobs.

@neolit123 neolit123 added the lifecycle/active label Feb 1, 2019
@miry

miry commented Feb 7, 2019

Can you help me find out whether all tests currently pass in k8s.io/kubernetes/cmd/kubeadm/app/util/config?

Currently I get the following failure:

    --- FAIL: TestConfigFileAndDefaultsToInternalConfig/incompleteYAMLToDefaultedv1beta1 (0.00s)
        initconfiguration_test.go:123: the expected and actual output differs.
            	in: testdata/defaulting/master/incomplete.yaml
            	out: testdata/defaulting/master/defaulted.yaml
            	groupversion: kubeadm.k8s.io/v1beta1
            	diff: 
            --- expected
            +++ actual
            @@ -115,7 +115,6 @@
               imagefs.available: 15%!
            (MISSING)   memory.available: 100Mi
               nodefs.available: 10%!
            (MISSING)-  nodefs.inodesFree: 5%!
            (MISSING) evictionPressureTransitionPeriod: 5m0s
             failSwapOn: true
             fileCheckFrequency: 20s

I wonder whether this is a problem with my local env or whether I missed something.
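
A side note on the %! and (MISSING) fragments in the output above: they look like what Go's fmt package emits when text containing a bare % is used as a format string with no operands. That the diff was printed this way is an assumption, not something confirmed in this thread. The sketch below reproduces the effect; the helper names are made up for illustration.

```go
package main

import "fmt"

// renderAsFormat mimics a logging call that mistakenly uses its message as
// the format string. The empty slice stands in for "no operands were passed".
func renderAsFormat(msg string) string {
	var noArgs []interface{}
	return fmt.Sprintf(msg, noArgs...)
}

// renderAsOperand is the safe variant: the message is an operand of %s,
// so any % characters inside it are left untouched.
func renderAsOperand(msg string) string {
	return fmt.Sprintf("%s", msg)
}

func main() {
	diff := "imagefs.available: 15%\n  memory.available: 100Mi"

	// The "%" followed by a newline is parsed as a verb with a missing operand.
	fmt.Printf("%q\n", renderAsFormat(diff))
	// prints "imagefs.available: 15%!\n(MISSING)  memory.available: 100Mi"

	fmt.Printf("%q\n", renderAsOperand(diff))
	// prints "imagefs.available: 15%\n  memory.available: 100Mi"
}
```

If that is what happened, the actual difference in the test is just the removed nodefs.inodesFree: 5% line; the rest is a printing artifact.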

@neolit123
Member

@miry
i tried the latest master branch and this package passes for me.

$ go test ./cmd/kubeadm/app/util/config/...
ok  	k8s.io/kubernetes/cmd/kubeadm/app/util/config	9.554s
ok  	k8s.io/kubernetes/cmd/kubeadm/app/util/config/strict	0.018s

a couple of points:

  • make sure you have the latest master and go1.11.x
  • the diff library that is failing in this test is known to be buggy.

cc @chuckha

@BenTheElder
Member

we have more CI work for kind going on (thanks @neolit123!), kops is still borked everywhere and removed due to billing account issues.

@miry

miry commented Feb 7, 2019

@neolit123 Thank you for your help. I found out why the tests are not working for me: https://github.com/kubernetes/kubernetes/pull/67709/files , because I use macOS. That explains the missing nodefs.inodesFree: 5% line.

@neolit123
Member

this is sort of unrelated to this ticket. @miry, could you please file an issue in k/k and ping the author of that PR so that an implementation for another OS is added?
thanks.

@neolit123
Member

neolit123 commented Feb 28, 2019

  • Make KIND a PR blocking job

EDIT: my mistake, i thought i was reading release blocking.

this will hopefully happen next week.
this week we were able to move k-a jobs outside of blocking into release-informing / all dashboards and kind jobs to release-informing / all dashboards too.

kubernetes/test-infra#11562
https://k8s-testgrid.appspot.com/sig-release-master-blocking

@neolit123 neolit123 changed the title 1.14 Test Infra Changes. 1.14-1.15 Test Infra Changes. Mar 7, 2019
@neolit123 neolit123 modified the milestones: v1.14, v1.15 Mar 11, 2019
@neolit123
Member

neolit123 commented Mar 14, 2019

re:

Implement upgrade and skew tests for kubeadm-kind

@fabriziopandini @timothysc
yesterday we briefly discussed what the plan is.
here is the summary again.

we need upgrade and skew tests in 1.15 as we are no longer going to use kubernetes-anywhere.

summary:

  • kind out of the box does not support what we need for e2e testing.
    the major blocker is that we cannot stop the creation of the k8s cluster from the config or the command line of the kind binary. there are also things like kubeadm reset that we really want to test.
  • kind has a kubetest deployer in test infra but it's bound to a kind binary.
  • kinder is great for our needs but we cannot use it with that kubetest deployer.

we have multiple options (which is better than having none):

  1. extend kind to support all our needs

i don't think this will happen in 1.15 (or at least the first half of the cycle) due to:

  • what we are demanding from kind is out of scope (e.g. kubeadm reset).
  • the kind alpha (or another scoped) command is ideal for us but it's invasive and will create noise in the kind repo. potential block on features we want to add.
  • putting pressure on the kind maintainers, who are busy most of the time.
  2. add built-in support for kinder in test-infra

having built-in support for that in kubetest1 is probably a bad idea and i don't want to put effort into that. kubetest1 is beyond hard to maintain at this point. also this is political. kubetest2 (WIP) is more flexible, but kubetest2 needs work and it might take a while before we get it hooked in prow jobs.

this idea is mostly unclear.

  3. bypass the kubetest deployer process completely until kubetest2 is ready and use kinder with a custom deployment process.

the whole idea of deployers is to facilitate your testing process via the --up, --test, --down flags etc.
this is great if you want to test using kind but is very limiting in our quite demanding use case.

current kind jobs for sig-testing still run using a bash script:
https://github.com/kubernetes-sigs/kind/blob/master/hack/ci/e2e.sh

we can use the same mechanics and in such a bash script we can execute kinder or possibly even pre-cook a temporary up,down,test deployer.


at this point my vote goes for 3, because i don't like the risks from 1 and 2 in terms of timing of the 1.15 cycle...

@BenTheElder
Member

  • option 3 is a fair option imho 👍
  • option 1 is possible but also perhaps not as fast as we might want / land it otherwise.
  • option 2: kubetest2 is low priority and not ready. if you want to add kinder to kubetest1 or 2 I don't think there are any political blockers (?) but kubetest is a mess and kubetest2 is still an MVP. 😬

I would recommend not using the bash for too much longer though, we should at least get some of the kind tests over to kubetest(2) soon, if not everything else. It will be easier to do complex testing in Go.

@neolit123
Member

ok, 3 it is then.
but i only see it as a temporary option and i will think about a way to bring the bash to a minimum.

@timothysc
Member Author

I think it's still a little early on 1.15 to pull the trigger on options.

If we get phases in kind we can either wrap with scripts or build macro commands in kinder.

@neolit123
Member

Implement upgrade and skew tests for kubeadm-kind

this is done now.

Make KIND a PR blocking job
Move kubeadm-kind jobs to release blocking

i will close this ticket and create a new one that only tracks the above items.
they need buy-in from sig-release and sig-testing.

Add cluster-api aws and release blocking for kubeadm HA verification

probably should be tracked in the CAPA provider.

@neolit123
Member

moved to #1599

8 participants