Update Test Plan in KEP template #3279

wojtek-t · 2022-04-14T13:47:11Z

There is a lot of context for this change across the whole #3139

For PRR
/assign @johnbelamaric @deads2k @ehashman

People involved in the discussion
/assign @liggitt @lavalamp @aojea @jberkus

@kubernetes/enhancements

wojtek-t · 2022-04-14T13:47:44Z

keps/NNNN-kep-template/README.md

@@ -452,6 +452,36 @@ You can take a look at one potential example of such test in:
 https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
 -->

+### Testing Quality


I'm actually not 100% convinced this should be in PRR as opposed to another section.
But I wanted to make a starting proposal to kick off the discussion.

yeah, seems like it belongs in ### Test Plan. Can we ask more pointed questions there?

What do you think about questions below?

As an Enhancements Owner, I agree that this belongs in the Test Plan section with the goal of asking more clarifying questions as opposed to the existing light guidance that is there now. For instance, most people seem to just reply (in that section in KEPs) something along the lines of:

> add unit and e2e tests

And that's it. So we're mostly starting from ground zero on setting expectations on what tests should include and scope of changes.

wojtek-t · 2022-04-14T13:48:07Z

So that won't merge too fast :)
/hold

kikisdeliveryservice

Some initial thoughts mostly agreeing that this goes into the Test Plan (I was going to propose changes there so you beat me to it!).

As such, let's try to be more specific in our expectations for what test coverage and passage rates should look like when moving from alpha -> beta and beta -> GA.

Let's also try to get clarity on the shape of the testing needed. Does this require a new package to adequately test it? Is this just added to an existing suite?

Re: existing tests, we need to balance between raising awareness on the current state of a SIG's tests vs putting the onus on a specific author to fix it. Asking about the current rate raises awareness (and I agree with having that memorialized in the KEP), but we will need to have a bigger policy as to whether that should block individual KEPs from merging. This almost seems like a pre-release audit of testing rates and discussion with the SIGs/leads as to whether or not they have plans to remediate (and what that timeline is).

kikisdeliveryservice · 2022-04-14T19:44:58Z

keps/NNNN-kep-template/README.md

+We need to ensure that all existing areas of the code that will be touched by
+this enhancement are adequatly tested to mitigate the risk of regressions.
+--->
+


I would add a question re: e2e specifically asking whether an existing test package already exists where tests would be added or whether a new test package needs to be created, along with clarifying whether the SIG has adequate support/capacity to create and maintain these new tests.

kikisdeliveryservice · 2022-04-14T19:46:46Z

keps/NNNN-kep-template/README.md

+So even if those hit Alpha in such state, it won't be possible to target Beta
+graduation (and thus by-default enablement) until the testing is sufficient.
+-->
+


Additionally, as one moves from Beta to GA what are our test expectations? Above you mention 80% for unit tests, what do we expect to see for e2e? In an ideal world, what would e2e on a feature look like to allow it to be targeted to GA? Let's be explicit here.

I'd also like to see (as the KEP moves through stages) passage rates for the specific implemented tests (as opposed to the general passage rates) so that we can identify flaky tests.

> In an ideal world, what would e2e on a feature look like to allow it to be targeted to GA? Let's be explicit here.

I don't think it's possible to describe it generically here.
Or are you asking just for adding a question about this? If so that makes sense.

> I'd also like to see (as the KEP moves through stages) passage rates for the specific implemented tests (as opposed to the general passage rates) so that we can identify flaky tests.

Good point - let me add that.
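
To make the idea of per-test passage rates concrete, here is a minimal sketch of how they could be computed from individual run results. It is only an illustration: the function names, the `(test_name, passed)` input shape, and the 0.99 threshold are assumptions and not part of the KEP template or any Kubernetes tooling.

```python
from collections import defaultdict


def pass_rates(runs):
    """Compute the fraction of passing runs per test from (test_name, passed) pairs."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for test_name, passed in runs:
        totals[test_name] += 1
        if passed:
            passes[test_name] += 1
    return {name: passes[name] / totals[name] for name in totals}


def likely_flaky(rates, threshold=0.99):
    """Return tests that pass sometimes but below the (hypothetical) threshold."""
    # A rate of 0.0 means the test always fails, which is broken rather than flaky.
    return sorted(name for name, rate in rates.items() if 0.0 < rate < threshold)
```

Reporting rates per implemented test, rather than one aggregate number for the suite, is what lets a reviewer separate a single flaky test from a generally unhealthy area.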

liggitt · 2022-04-14T19:58:54Z

> Re: existing tests, we need to balance between raising awareness on the current state of a SIG's tests vs putting the onus on a specific author to fix it. Asking about the current rate raises awareness (and I agree with having that memorialized in the KEP)

We definitely want the people who are proposing changes to be aware of the health of the area they are proposing changing. I'm on the fence about how much detail belongs in their KEP, since it sort of denormalizes component health info into active proposals, and is likely to go stale fast.

"Is area X healthy enough to accept changes?" seems like info that belongs more at the component/subproject/sig level, and is the responsibility of the maintainers of that area to surface, and is a prereq for KEPs in that area.

> we will need to have a bigger policy as to whether that should block individual KEPs from merging.

"the relevant area is not healthy enough to accept the changes proposed in this KEP" is a completely legitimate reason to not accept otherwise acceptable changes. I expect approvers to be making those judgement calls already, and communicating clearly when that is an issue impacting an otherwise acceptable KEP.

liggitt · 2022-04-14T19:36:47Z

keps/NNNN-kep-template/README.md

+This section must be completed when targeting alpha to a release.
+-->
+
+###### What is the current (unit) test coverage of packages that will be touched when implementing this KEP?


this seems hard to answer ahead of implementation, and likely to become stale quickly

Instead of these few questions, how about:

[ ] I understand the owners of the involved components may require updating existing tests to make this code solid enough to build on top of (everyone has to check yes)

and

[ ] statement from existing owners regarding pre-existing component test health (to be supplied by KEP reviewers)

> this seems hard to answer ahead of implementation, and likely to become stale quickly

Thinking about it, I will tweak to the following:

this part will be required only for Beta/GA and will have to report healthiness of packages that were touched - this means that you won't need to predict anything (but rather report), and it will be a double check that both authors and reviewers did a reasonable job.

> Instead of these few questions, how about:

I'm not against adding those, but I don't believe those are enough. We need some mechanism to track this too.

The only thing I feel strongly about is to give the area owners the chance / responsibility to state the testing deficiencies of the existing code. I think having some human judgment in there is going to be better than picking a coverage target, for example. We can suggest a coverage target if the owners don't have a better idea about the deficiencies.

Which I mentioned in the template explicitly too.

As an approver you will be able to say that 20% coverage is fine for you if you want.

But the first thing to achieve is to improve visibility and awareness - many people don't realize how bad our coverage is in some areas.

kikisdeliveryservice · 2022-04-14T20:02:11Z

> "Is area X healthy enough to accept changes?" seems like info that belongs more at the component/subproject/sig level, and is the responsibility of the maintainers of that area to surface, and is a prereq for KEPs in that area.

Agree. Perhaps we can surface this as a pre-release testing survey (that I just made up) during the period of time between the end of one release and the beginning of another? The idea would be to tease out if things are going well or if there are isolated portions of tests causing issues (some KEPs may go in not affecting those) or an overall shaky test suite (perhaps resulting in a decision to not merge lower priority KEPs/minimize features going in/being promoted/etc.).

lavalamp · 2022-04-14T20:04:48Z

keps/NNNN-kep-template/README.md

+<!--
+This question should be filled when targeting Beta release.
+The goal is to ensure that we don't accept enhancements with inadequate testing.
+So even if those hit Alpha in such state, it won't be possible to target Beta


IMO everything should always be tested. alpha/beta/GA doesn't matter for this requirement.

(As a note, all KEPs are required to have a test plan regardless of stage, so +1 on lavalamp's point above.)

> IMO everything should always be tested. alpha/beta/GA doesn't matter for this requirement.

I agree with the principle. But the principle was always there and didn't work very well in practice.
So I would like to have a way to ensure that we won't proceed to Beta if shortcuts were made in Alpha.

If the principle is clearly there already, and it's not being followed, then there's likely no modification to this template that will fix that, since the problem is enforcement and not the instructions.

> If the principle is clearly there already, and it's not being followed, then there's likely no modification to this template that will fix that, since the problem is enforcement and not the instructions.

I don't fully agree. Because I can also imagine that being something not fully conscious.
If you're forced to fill in (or review) the template here for Beta graduation, this will force you to think about this.

wojtek-t · 2022-04-15T06:34:17Z

@liggitt @lavalamp @kikisdeliveryservice - thanks for the feedback - PTAL

aojea · 2022-04-18T09:55:52Z

keps/NNNN-kep-template/README.md

+- <package>: <current test coverage>
+
+The data can be easily read from:
+https://testgrid.k8s.io/sig-testing-canaries#ci-kubernetes-coverage-unit


note: we should work on automating this, I think you've already opened an issue about it

Yes - I already opened an issue for that. But at least for some time, we won't enforce specific targets, because our coverage is still way too low in multiple places.

aojea · 2022-04-18T09:59:03Z

+1

kikisdeliveryservice

I left some suggestions for clarity but think this provides nice clarification and further guidance for the KEP Test Plan section.

kikisdeliveryservice · 2022-04-20T16:24:15Z

keps/NNNN-kep-template/README.md

@@ -270,6 +266,55 @@ when drafting this test plan.
 [testing-guidelines]: https://git.k8s.io/community/contributors/devel/sig-testing/testing.md
 -->

+[ ] I/we understand the owners of the involved components may require updating


Suggested change

[ ] I/we understand the owners of the involved components may require updating

[ ] I/we understand, as owners, that the involved components may require updates to

I didn't fully change that because I think the suggested change didn't reflect what I wanted.

I wanted to say that "component OWNERS" (say approvers) have a right to request additional tests before proceeding. Does that make sense?

kikisdeliveryservice · 2022-04-20T16:25:04Z

keps/NNNN-kep-template/README.md

@@ -270,6 +266,55 @@ when drafting this test plan.
 [testing-guidelines]: https://git.k8s.io/community/contributors/devel/sig-testing/testing.md
 -->

+[ ] I/we understand the owners of the involved components may require updating
+existing tests to make this code solid enough prior committing changes necessary


Suggested change

existing tests to make this code solid enough prior committing changes necessary

existing tests to make this code solid enough prior to committing the changes necessary

kikisdeliveryservice · 2022-04-20T16:30:00Z

keps/NNNN-kep-template/README.md

+<!--
+In principle every added code should be unit tested. However, the exact tests are hard
+to answer ahead of implementation. As a result, this section should only be filled when
+targeting Beta to ensure that something wasn't missed during Alpha implementation.


Do we want any updates to this for GA or just beta?

it should be no-op for GA, but it doesn't hurt to add it - added.

kikisdeliveryservice · 2022-04-20T16:34:47Z

@mrbobbytables @jeremyrickard PTAL. I believe this provides some clarifications as to what we expect to see in the Test Plan (which many times have thin details). The changes focus on stability and shouldn't be burdensome and will provide the author and sig with better insight into the stability of the features and components.

lavalamp · 2022-04-22T22:58:03Z

I think this might be a tad heavy-handed, but we should try and see if it works.

/lgtm

liggitt · 2022-04-25T15:25:18Z

keps/NNNN-kep-template/README.md

+Talking about individual tests would be an overkill, so we just require to list the
+packages that were touched during the implementation together with their current test
+coverage in the form:
+- <package>: <current test coverage>


in the KEP + beta stage still seems like the wrong time/place for this information to me; it denormalizes information that will be instantly stale into KEPs after changes were already accepted into the tree

the "I acknowledge improvements to existing code may be required" bit above seems like a good mental prompt to me

for unit testing, I would reframe this as "all new code is expected to have complete unit test coverage. if that is not possible, explain why here and explain why that is acceptable"

I'm definitely happy to add what you suggested above.

But I don't really agree that this is the wrong time/place for the coverage information.
If you're worried about staleness - let me change that to:

<package>: <date> - <coverage>

By definition, this will not go stale. And having the coverage information here will expose it to people who just read KEPs and don't watch individual PRs to see how it goes.

My main reason for ensuring it's put here is that KEP approvers are in many cases not approvers for the code itself (or at least not all the code). So putting it here verbatim will put it clearly in front of them.

+1 here. I think the stats here are going to be stale too quickly and there isn't going to be a lot of value here for alpha/beta KEP states.

to be clear about the timing aspect... I think pre-alpha/alpha is exactly the right time to be asking this question, since that's when code starts merging and maybe breaking existing undertested stuff. I just think asking the KEP author to collate it into the KEP isn't a great mechanism.

> to be clear about the timing aspect... I think pre-alpha/alpha is exactly the right time to be asking this question, since that's when code starts merging and maybe breaking existing undertested stuff

OK - so I'm all for asking that for Alpha (if you remember that was actually my initial proposal).
But your counterargument is that the exact set of packages will sometimes be hard to predict.

So before Alpha is the compromise because:
(a) a significant part of implementation is already done at this point
(b) it still isn't enabled by default, which means that enhancement owners still have big motivation to move it to Beta

I discussed this with Jordan offline and updated to some in-between point.

I think the updated unit test bullet is a reasonable level of info to include

liggitt · 2022-04-25T16:05:06Z

aside from the request for package unit test coverage stats, these additions lgtm

jeremyrickard · 2022-04-25T18:10:04Z

Thanks for all the thoughtful comments and reviews on this everyone! This generally looks good to me, although I agree with @liggitt's point above regarding package level unit test coverage stats. I think we should amend the content there, but maybe it's sufficient to just say the stats should be filled out for promotion to stable?

lavalamp · 2022-04-26T17:16:06Z

keps/NNNN-kep-template/README.md

+
+<!--
+Based on reviewers feedback describe what additional tests need to be added prior
+implementing this enhancement to ensure the enhancements have also solid foundations.


nit: prior to implementing

lavalamp · 2022-04-26T17:17:06Z

/lgtm

jeremyrickard · 2022-04-26T20:59:53Z

The updated wording and the discussion above seem good to me.

/approve

kikisdeliveryservice · 2022-04-26T21:42:06Z

Agree with the updates since last time I reviewed. After this merges, I'll send out an email to k-dev/sig leads to note the Test Plan clarifications/changes. We can also let the new RT (once it forms) know about the change and to keep an eye out on the enhancements as they review them.

I think we'll definitely get more robust Test Plans thanks to this PR :)

/approve

k8s-ci-robot · 2022-04-26T21:42:24Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jeremyrickard, kikisdeliveryservice, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~keps/OWNERS~~ [jeremyrickard,kikisdeliveryservice]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

wojtek-t · 2022-04-27T06:17:25Z

Thank you all for all the comments - I really hope it will help with our testing quality.
If we realize in a release or so that something can be done differently/better, we can always change it, but for now I don't think we see those opportunities.

I'm going to hold cancel to let it merge to ensure it's there before 1.25 release starts.

/hold cancel

tallclair · 2022-05-12T16:49:36Z

Some feedback from filling in these sections for #3310 (comment)

- The prerequisites comment could use a bit more guidance, and maybe an example.
- Generating the test coverage was painful. The comment says "The data can be easily read from: https://testgrid.k8s.io/sig-testing-canaries#ci-kubernetes-coverage-unit", but that was slow to load, and then crashed my browser (resolved with using the regex filter). Then, coverage cannot easily be copied out due to how the table selection works. If we keep this, I'd like to see a script that can generate the report in the desired format, ex: hack/generate-coverage-report.sh staging/src/k8s.io/pod-security-admission/...
- "In principle every added code should have complete unit test coverage, so providing the exact set of tests will not bring additional value." I don't know what the second half of this sentence means. Regarding complete unit test coverage, I disagree. See 7f5d72d for examples of cases that shouldn't have unit tests.
- Integration tests: took me some time to figure out how to find the PodSecurity integration tests (section would benefit from an example).
- Integration & e2e sections ask for test coverage - how do you get test coverage from integration and e2e tests?
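
As a rough illustration of the kind of report generator requested above, the sketch below turns the text printed by `go test -cover` into lines of the form `- <package>: <date> - <coverage>` discussed earlier in this thread. It is not an existing Kubernetes script: the function name `coverage_report`, the regular expression, and the choice to read already-captured output (rather than invoke `go` directly) are all assumptions made for the example.

```python
import re
from datetime import date

# Matches per-package lines printed by `go test -cover`, for example:
#   ok  	k8s.io/example/pkg	0.012s	coverage: 85.3% of statements
COVER_LINE = re.compile(
    r"^ok\s+(?P<pkg>\S+)\s+.*coverage:\s+(?P<pct>[\d.]+)% of statements"
)


def coverage_report(go_test_output: str, measured_on: date) -> list:
    """Turn `go test -cover` output into '- <package>: <date> - <coverage>' lines."""
    entries = []
    for line in go_test_output.splitlines():
        match = COVER_LINE.match(line)
        if match is None:
            # Skips failures, "[no test files]" lines, and any other output.
            continue
        entries.append(
            f"- {match['pkg']}: {measured_on.isoformat()} - {match['pct']}%"
        )
    return sorted(entries)
```

One way to use it would be to capture the output of `go test -cover` for the packages touched by the KEP, pass that text together with the measurement date to `coverage_report`, and paste the resulting lines into the Test Plan section. Recording the date alongside each number keeps the entry meaningful even after the coverage changes.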

k8s-ci-robot assigned aojea, deads2k, ehashman, jberkus, johnbelamaric, lavalamp and liggitt Apr 14, 2022

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels Apr 14, 2022

k8s-ci-robot requested review from annajung and kikisdeliveryservice April 14, 2022 13:47

wojtek-t commented Apr 14, 2022

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 14, 2022

wojtek-t mentioned this pull request Apr 14, 2022

KEP-3138: Increase the Reliability Bar proposal #3139

Closed

wojtek-t force-pushed the better_testing_in_kep branch from f230107 to 76307b7 Compare April 14, 2022 13:54

kikisdeliveryservice reviewed Apr 14, 2022

kikisdeliveryservice self-assigned this Apr 14, 2022

liggitt reviewed Apr 14, 2022

lavalamp reviewed Apr 14, 2022

wojtek-t force-pushed the better_testing_in_kep branch 2 times, most recently from 53b44a1 to 0e31c99 Compare April 15, 2022 06:33

aojea reviewed Apr 18, 2022

kikisdeliveryservice reviewed Apr 20, 2022

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 22, 2022

liggitt reviewed Apr 25, 2022

wojtek-t force-pushed the better_testing_in_kep branch from edc5d69 to a113d0c Compare April 26, 2022 15:05

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 26, 2022

Extend Test Plan section in the KEP template

a86942e

wojtek-t force-pushed the better_testing_in_kep branch from a113d0c to a86942e Compare April 26, 2022 15:07

lavalamp reviewed Apr 26, 2022

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 26, 2022

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 26, 2022

k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 27, 2022

k8s-ci-robot merged commit 278a316 into kubernetes:master Apr 27, 2022

k8s-ci-robot added this to the v1.25 milestone Apr 27, 2022

github-actions bot mentioned this pull request Apr 27, 2022

Week Ending April 24, 2022 dev-obs/actus#424

Open

tallclair mentioned this pull request May 12, 2022

KEP-2579: Pod Security GA plan #3310

Merged

gnufied mentioned this pull request Jun 8, 2022

Change milestone of recovery feature to 1.25 #3362

Closed

nikhita mentioned this pull request Jan 12, 2023

Add subresource support to kubectl #2590

Open

marosset mentioned this pull request Jan 31, 2023

Mutable scheduling directives for suspended Jobs #2926

Closed
