Execute yaml examples via go tests #2541

Closed
wants to merge 2 commits into from


bobcatfish
Collaborator

Changes

In #2540 we are seeing that some yaml tests are timing out, but it's
hard to see what yaml tests are failing. This commit moves the logic out
of bash and into individual go tests - now we will run an individual go
test for each yaml example, completing all v1alpha1 before all v1beta1
and cleaning up in between. The output will still be challenging to read
since it will be interleaved, however the failures should at least
be associated with a specific yaml file.

This also makes it easier to run all tests locally, though if you
interrupt the tests you end up with your cluster in a bad state and it
might be good to update these to execute each example in a separate
namespace (in which case we could run all of v1alpha1 and v1beta1 at the
same time as well!)
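
The "one go test per yaml example" idea above can be sketched with subtests. This is a minimal sketch, not the code in this PR: `subtestName`, `runExamples`, and the `exercise` callback are hypothetical names, and the callback stands in for whatever applies the yaml and waits for the run.

```go
package main

import (
	"path/filepath"
	"strings"
	"testing"
)

// subtestName is a hypothetical helper that turns an example's location into
// a readable subtest name, e.g. "v1beta1/taskruns/example".
func subtestName(version, run, file string) string {
	base := strings.TrimSuffix(file, filepath.Ext(file))
	return filepath.ToSlash(filepath.Join(version, run, base))
}

// runExamples starts one subtest per yaml file, so that a failure or timeout
// is reported under the name of the file that caused it.
// exercise is an assumed callback that applies the file and checks the result.
func runExamples(t *testing.T, version, run string, files []string, exercise func(t *testing.T, path string)) {
	for _, f := range files {
		f := f // capture the loop variable for the closure
		t.Run(subtestName(version, run, f), func(t *testing.T) {
			exercise(t, filepath.Join(version, run, f))
		})
	}
}
```

With this shape, `go test -v` prints a separate PASS or FAIL line per example file instead of one result for the whole directory.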

I have a feeling this won't work on the first try and that I've still got a few issues to work out, not to mention that the code is a bit icky, esp. since I'm using t.Helper so profusely.

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

See the contribution guide for more details.

Double check this list of stuff that's easy to miss:

Reviewer Notes

If API changes are included, additive changes must be approved by at least two OWNERS and backwards incompatible changes must be approved by more than 50% of the OWNERS, and they must first be added in a backwards compatible way.

@tekton-robot tekton-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 5, 2020
@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 5, 2020
Member

@imjasonh imjasonh left a comment

This is definitely progress! 🙌

var stderr, stdout bytes.Buffer
cmd.Stderr = &stderr
cmd.Stdout = &stdout
if err := cmd.Run(); err != nil {
Member

Maybe we can use CombinedOutput instead?

Collaborator Author

TIL! thanks :D

I'm not sure I do want to use combinedoutput in this case tho, cuz im using stderr if the command fails, and if not, im returning stdout. in the error case, dumping both seems fine, but in the success case, not including stderr seems like it makes sense to me. what do you think?

"serviceaccounts",
"persistentvolumeclaims",
}
for _, c := range crdTypes {
Member

Eventually we should be able to parse the YAML file and create the resource using a client, instead of involving kubectl.

For this specifically we can just use the CRD client to delete the types.

It'd be fine to have a TODO for that for now but that's a better final state I think.

Member

Yes and no. We will need to also test some kubectl create commands as this is something the user will do.

Collaborator Author

I'm not 100% sure which is better - if we want to use the clients, we need to load the yamls + find the right client for the thing being created. i think im like 60% convinced this is better in the long run, but there's also something nice about just blasting it all out and watching it. i do think the part where i look for the word "run" in the ko output is pretty hacky, ill put a comment in about that for sure


// replaceDockerRepo will look in the content f and replace the hard coded docker
// repo with the one provided via the KO_DOCKER_REPO environment variable
func replaceDockerRepo(t *testing.T, f string) string {
Member

Ugh, this is kinda gross. I think we could just drop any test that requires ko-building a package?

Collaborator Author

we could - this is what the original yaml tests do tho, do you mind if i keep this in the context of this pull request and remove in another PR?


// pollRun will use kubectl to query the specified run to see if it
// has completed. It will timeout after timeoutSeconds.
func pollRun(t *testing.T, run string, wg *sync.WaitGroup) {
Member

This could Watch instead and maybe end up faster?

Collaborator Author

hm could you give me an example of how that would work? ive never used kubectl watch - trying it out now it seems like id have to stream output from it which seems a bit more complicated given im calling kubectl with exec.Command - i think im being a bit lazy for sure but this seems pretty much fine for now? i can put in a comment to explore using watch

done
done

failed=$(go test -timeout 15m ./examples)
Member

🙌🙌🙌🙌🙌

@vdemeester
Member

level=error msg="Running error: context loading failed: failed to load program with go/packages: could not parse GOARCH and Go compiler in format \"<GOARCH> <compiler>\" from stdout of go command:\nGOROOT=/usr/local/go GOPATH=/home/prow/go GO111MODULE= PWD=/home/prow/go/src/github.com/tektoncd/pipeline go [list -f {{context.GOARCH}} {{context.Compiler}} -tags e2e -mod=vendor -- unsafe]\ndir: /home/prow/go/src/github.com/tektoncd/pipeline\nstdout: <<>>\nstderr: <<go: inconsistent vendoring in /home/prow/go/src/github.com/tektoncd/pipeline:\n\tgithub.com/google/[email protected]: is explicitly required in go.mod, but vendor/modules.txt indicates github.com/google/[email protected]\n\tgithub.com/pkg/[email protected]: is explicitly required in go.mod, but vendor/modules.txt indicates github.com/pkg/[email protected]\n\nrun 'go mod vendor' to sync, or use -mod=mod or -mod=readonly to ignore the vendor directory\n>>"

The build failure is interesting 😛

@ghost ghost added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label May 21, 2020
@bobcatfish bobcatfish force-pushed the yaml_blamo branch 4 times, most recently from 7aac3b2 to 375d820 Compare May 21, 2020 20:50
@bobcatfish
Collaborator Author

/test check-pr-has-kind-label

@bobcatfish
Collaborator Author

I dunno what was up with that weird prow error but it stopped :D

This should be ready for a real review now! Might still be some kinks to work out...

I'm hoping after this we could migrate these tests to a separate test triggered via prow maybe, tho it's a bit more complicated b/c we'll have to invoke boskos too...

@bobcatfish bobcatfish marked this pull request as ready for review May 21, 2020 22:05
@tekton-robot tekton-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 21, 2020
@ghost

ghost commented May 26, 2020

Looks like this would fix #1251 ?

Also looks like there's a similarly-intentioned PR here: #2685 ?

@ghost ghost mentioned this pull request May 26, 2020
3 tasks
#!/usr/bin/env bash

# TODO: assert something about the expected contents of $(params.output)
# TODO: assert something about the expected results in the cluster
Collaborator Author

oh no

@bobcatfish
Collaborator Author

Looks like this would fix #1251 ?

🤦‍♀️ my bad for not seeing that sooner! If this next round of tests passes, im suggesting (#2685 (comment) ) we merge this and then add improvements from #2685 on top - means more work for @thomaschandler tho :(

@bobcatfish bobcatfish added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. labels May 26, 2020
@tekton-robot
Collaborator

This PR cannot be merged: expecting exactly one kind/ label

Available kind/ labels are:

kind/bug: Categorizes issue or PR as related to a bug.
kind/flake: Categorizes issue or PR as related to a flakey test
kind/cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt.
kind/design: Categorizes issue or PR as related to design.
kind/documentation: Categorizes issue or PR as related to documentation.
kind/feature: Categorizes issue or PR as related to a new feature.
kind/misc: Categorizes issue or PR as a miscellaneous one.

@@ -30,8 +30,9 @@ install_pipeline_crd
failed=0

# Run the integration tests
header "Running Go e2e tests"
go_test_e2e -timeout=20m ./test/... || failed=1
# TODO HACK HACK HACK HACK

Is this dead code? If not could the comment do a bit more to describe why these lines have been left in but commented out?

Collaborator Author

WHOOPS

Collaborator Author

This is the bit that runs our integration tests! Unfortunately they all run BEFORE the yaml tests so I had temporarily commented them out so I could check if the tests were succeeding without having to wait for these to complete 🤦‍♀️

In the long run hopefully we can run them in parallel! The main complication that stops us from doing that right away is that both tests require a boskos cluster

if err != nil {
t.Fatalf("couldnt read contents of %s: %v", f, err)
}
return strings.Replace(string(read), "gcr.io/christiewilson-catfactory", r, -1)

Would it be worth putting "gcr.io/christiewilson-catfactory" into a named constant? I'm a bit confused why it appears here.

Collaborator Author

good call!

There's some good stuff in this doc but it's hard to remember what's in
it cuz it's kinda all over the place - maybe a TOC will help!
@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbwsg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 26, 2020
@bobcatfish
Collaborator Author

can has lgtm?

Member

@afrittoli afrittoli left a comment

Thank you for this, looking great!
The only concern I have is about stdout in case of error, but that's something we can improve on later.

cmd.Stderr = &stderr
cmd.Stdout = &stdout
if err := cmd.Run(); err != nil {
logf("couldn't run command %s %v: %v, %s", c, args, err, stderr.String())
Member

It looks to me like in case of error we do not get to see the stdout at all.
I think we should print out both out and err, perhaps in two different log commands.

Member

I tend to agree, if the command failed, I tend to want to see stderr and stdout.
gotest.tools/v3/icmd would come handy there 😝

return
}

switch status {
Member

In future we might want to allow for example metadata of some kind (annotations?) to specify an expected target status and more details about it.

}

// getYamls will look in the directory in examples indicated by version and run for yaml files
func getYamls(t *testing.T, version, run string) []string {
Member

I wonder if some of these helpers could use a unit test... we do run them as part of the tests anyways, so they most likely all do what they are expected to :)

Collaborator Author

haha probably!!!

@afrittoli
Member

/hold

@tekton-robot tekton-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 27, 2020
@afrittoli
Member

I'm afraid something is wrong, the YAML tests are timing out, logged as "FAILED" but still the CI job is marked as green:

panic: test timed out after 15m0s

(...)

goroutine 2833 [IO wait]:
internal/poll.runtime_pollWait(0x7f23b8a5d008, 0x72, 0xffffffffffffffff)
	/usr/local/go/src/runtime/netpoll.go:203 +0x55
internal/poll.(*pollDesc).wait(0xc00038a4f8, 0x72, 0x501, 0x5de, 0xffffffffffffffff)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc00038a4e0, 0xc00027ec22, 0x5de, 0x5de, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:169 +0x19b
os.(*File).read(...)
	/usr/local/go/src/os/file_unix.go:263
os.(*File).Read(0xc0005ea328, 0xc00027ec22, 0x5de, 0x5de, 0x22, 0x0, 0x0)
	/usr/local/go/src/os/file.go:116 +0x71
bytes.(*Buffer).ReadFrom(0xc000589ef0, 0x13c8460, 0xc0005ea328, 0x7f23b9c24660, 0xc000589ef0, 0xc0005f0701)
	/usr/local/go/src/bytes/buffer.go:204 +0xb1
io.copyBuffer(0x13c7120, 0xc000589ef0, 0x13c8460, 0xc0005ea328, 0x0, 0x0, 0x0, 0x406ca5, 0xc0004ae360, 0xc0005f07b0)
	/usr/local/go/src/io/io.go:391 +0x2fc
io.Copy(...)
	/usr/local/go/src/io/io.go:364
os/exec.(*Cmd).writerDescriptor.func1(0xc0004ae360, 0xc0005f07b0)
	/usr/local/go/src/os/exec/exec.go:310 +0x63
os/exec.(*Cmd).Start.func1(0xc0005e7600, 0xc00069bb00)
	/usr/local/go/src/os/exec/exec.go:436 +0x27
created by os/exec.(*Cmd).Start
	/usr/local/go/src/os/exec/exec.go:435 +0x608
FAIL	github.com/tektoncd/pipeline/examples	900.032s
FAIL'
+ ((  failed  ))

@bobcatfish
Collaborator Author

whoa, that's no good at all! thanks for noticing that @afrittoli 🙏

@thomaschandler
Contributor

thomaschandler commented Jun 1, 2020

@bobcatfish I've managed to get tests passing on #2685. How would you feel about merging #2685 instead of this MR?

Member

@vdemeester vdemeester left a comment

Few comments 👼

"sync"
"testing"
"time"

Member

nit: extra space not needed 😝

// we may want to consider either not running examples that require registry access
// or doing something more sophisticated to inject the right registry in when folks
// are executing the examples
horribleHardCodedRegistry = "gcr.io/christiewilson-catfactory"
Member

nit: Why not running a local registry (in the test namespace) ? (in any case, it would be a follow-up)

Collaborator Author

could be a good follow up! unless the test is deploying to a different namespace 🤔

t.Helper()
r := os.Getenv("KO_DOCKER_REPO")
if r == "" {
t.Fatalf("KO_DOCKER_REPO must be set")
Member

Should we error out or skip ? (like we do on the e2e go tests — mainly to not break openshift-pipelines CI 😝 )


// logRun will retrieve the entire yaml of run in namespace n and log it
func logRun(t *testing.T, n, run string) {
t.Helper()
Member

t.Helper might not be needed here (as it is called by other func that are already calling t.Helper) — see stretchr/testify#933

for i := 0; i < (timeoutSeconds / sleepBetween); i++ {
status, err := cmd(t.Logf, "kubectl", []string{"--namespace", n, "get", run, "--output=jsonpath={.status.conditions[*].status}"}, "")
if err != nil {
t.Fatalf("couldnt get status of %s: %v", run, err)
Member

t.Fatalf does t.FailNow which calls runtime.Goexit, so wg.Done() seems unnecessary 🙃

Member

This means it will quit the test on this error… If that's not what we want, we need to use t.Errorf.


t.Logf("Applying %s to namespace %s", y, n)
content := replaceDockerRepo(t, fmt.Sprintf("%s/%s/%s", version, run, y))
output, err := cmd(t.Logf, "ko", []string{"create", "--namespace", n, "-f", "-"}, content)
Member

s/ko/kubectl 🤔 ⁉️

Collaborator Author

the existing scripts are using ko :O

// getYamls will look in the directory in examples indicated by version and run for yaml files
func getYamls(t *testing.T, version, run string) []string {
t.Helper()
_, filename, _, _ := runtime.Caller(0)
Member

Any reason to make the test dependent on the test file ? 🤔 (and use runtime.Caller)

@bobcatfish
Collaborator Author

@bobcatfish I've managed to get tests passing on #2685. How would you feel about merging #2685 instead of this MR?

@thomaschandler good call, I think you're able to get to this faster than me! lemme go over to #2685 and review and we can merge your PR instead/first :D

@bobcatfish
Collaborator Author

Apologies for closing this after your careful review @vdemeester @afrittoli @imjasonh but @thomaschandler is making much faster progress over in #2685 and he's using the client libs so it's a bit cleaner so im closing this PR in favor of it.

@bobcatfish bobcatfish closed this Jun 2, 2020
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

6 participants