Epic: ginkgo: ExitWithError() considered harmful #18188

edsantiago · 2023-04-13T17:46:49Z

While deep-diving into fixing logformatter so it handles ginkgo v2, I found a log that included our number-one flake:

# podman-remote [options] exec -it test ip addr show eth0
Error: container create failed (no logs from conmon): .....

...but the test in question passed! TL;DR the root cause is:

podman/test/e2e/network_connect_disconnect_test.go

Lines 426 to 428 in 5e6c064

    
    exec = podmanTest.Podman([]string{"exec", "-it", "test", "ip", "addr", "show", "eth0"})
    exec.WaitWithDefaultTimeout()
    Expect(exec).Should(ExitWithError())

This is one of my pet peeves. All of you have heard me rant angrily about this, but I'm going to repeat it here anyway: please, please do not just check that "some error occurred"! Everywhere possible (and it's usually possible), tests should:

Check for a PRECISE EXIT CODE; and
Check for an ERROR MESSAGE.
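
To make the difference concrete, here is a minimal sketch in plain Go (no ginkgo or gomega). Everything in it is illustrative: the `result` type, `vagueCheck`, `preciseCheck`, and the "made-up message" text are hypothetical, and the exit code assigned to the flake is an assumption. Only the conmon message comes from the log above.

```go
package main

import (
	"fmt"
	"strings"
)

// result is a hypothetical stand-in for a finished command:
// its exit code and whatever it wrote to stderr.
type result struct {
	exitCode int
	stderr   string
}

// vagueCheck mimics a bare "some error occurred" assertion:
// any nonzero exit status passes.
func vagueCheck(r result) bool {
	return r.exitCode != 0
}

// preciseCheck passes only when the exit code is exactly wantCode
// and stderr contains wantSubstr.
func preciseCheck(r result, wantCode int, wantSubstr string) bool {
	return r.exitCode == wantCode && strings.Contains(r.stderr, wantSubstr)
}

func main() {
	// The failure the test author had in mind (message invented for illustration).
	expected := result{125, "Error: made-up message the test intends to provoke"}
	// An unrelated flake that also exits nonzero (exit code assumed).
	flake := result{125, "Error: container create failed (no logs from conmon)"}

	for _, r := range []result{expected, flake} {
		fmt.Printf("vague=%v precise=%v\n",
			vagueCheck(r), preciseCheck(r, 125, "made-up message"))
	}
}
```

The vague check accepts both results, so the flake is silently counted as a pass. The precise check accepts only the intended failure.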

I count over 200 instances of ExitWithError under tests/e2e. Those need to be fixed. I'll tackle some of them over the next few weeks/months/years, but please, everyone, when writing new tests, please please please be more careful.


vrothberg · 2023-04-14T09:05:04Z

Is it fair to rephrase that as: just checking for exit code != 0 will hide flakes caused by some entirely different error?

I also count 239 but will refrain from changing until @Luap99's PR migrating to v2 has merged.

Luap99 · 2023-04-14T09:15:01Z

Given that podman mostly exits 125 anyway, I see little problem with ExitWithError() itself. The big thing here is, as you said, that the tests do not check error messages, which is a big problem.

The e2e tests have a unique advantage over the system tests: we get individual stdout/stderr streams. So if a test checks Expect(output).To(Equal("foo")), it will likely work in most cases even when unexpected logrus warnings/errors are logged on stderr.

What each test should also do is Expect(stderr).To(Equal("")). And in cases where a test expects an error, it should match that error correctly.
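
A rough sketch of that discipline in plain Go, assuming a hypothetical `session` type with separately captured streams. The names `session` and `checkSuccess` and the sample warning line are invented for illustration and are not the e2e helpers.

```go
package main

import "fmt"

// session is a hypothetical record of a finished command with its
// stdout and stderr captured separately.
type session struct {
	stdout string
	stderr string
}

// checkSuccess verifies the expected stdout and also insists that
// stderr is empty, so stray warnings cannot slip through unnoticed.
func checkSuccess(s session, wantStdout string) error {
	if s.stdout != wantStdout {
		return fmt.Errorf("stdout: got %q, want %q", s.stdout, wantStdout)
	}
	if s.stderr != "" {
		return fmt.Errorf("stderr: expected empty, got %q", s.stderr)
	}
	return nil
}

func main() {
	clean := session{stdout: "foo"}
	// Right output, but an unexpected warning on stderr (text invented).
	noisy := session{stdout: "foo", stderr: "level=warning msg=\"something odd\""}

	fmt.Println(checkSuccess(clean, "foo"))
	fmt.Println(checkSuccess(noisy, "foo"))
}
```

A stdout-only comparison would accept both sessions. The extra stderr check is what flags the second one.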

edsantiago · 2023-04-17T20:43:35Z

I'm just going to leave this here:

podman/test/e2e/prune_test.go

Lines 527 to 532 in 5c70641

    
    It("podman system prune --all --external fails", func() {
    	prune := podmanTest.Podman([]string{"system", "prune", "--all", "--enternal"})
    	prune.WaitWithDefaultTimeout()
    	Expect(prune).Should(Exit(125))
    })

The test passes. This is the kind of stuff that I'm always terrified will make its way into our code base.

edsantiago · 2023-05-09T14:32:40Z

Here's a beautiful example of what I'm talking about: machine e2e tests. There is one failure, at the start: timeout waiting for machine. No other tests fail! A human reader can see error after error, but the tests don't actually check output, so la la la la. All those tests need major surgery to add message checking; I started doing that just now but it's too much to do in a day.

Luap99 · 2023-05-09T14:39:32Z

Yeah there is a lot of nasty stuff in there. I am currently going through most problems I encountered while working on ginkgo v2. Will report an issue in the next few days. Poorly written tests are definitely at the top of that list.

github-actions · 2023-06-14T00:06:41Z

A friendly reminder that this issue had no activity for 30 days.

rhatdan · 2023-06-20T18:40:10Z

@edsantiago @Luap99 Is this still an issue?

edsantiago · 2023-06-20T18:41:40Z

Very much so.

Luap99 · 2024-04-04T11:17:10Z

@edsantiago Can this be closed, since you added all these stderr checks?

edsantiago · 2024-04-04T11:21:52Z

No, this is completely different and unrelated. (The stderr checks I added were only for Exit(0)).

The purpose of this epic is:

-    Expect(ExitWithError)
+    Expect(ExactExitStatusNotJustSomeVagueErrorCode)
+    Expect(ExactErrorMessageOrAtLeastADistinctiveSubstring)

Luap99 · 2024-04-04T11:27:14Z

OK, agreed. Somehow I thought you had added many of them while you did your other changes.

edsantiago · 2024-04-04T11:32:47Z

These are hard, and I'm lazy. For every individual ExitWithError() I need to manually find a test log, guess the exit code, identify the error message, see if it differs between root/rootless/remote, and add an ExpectSubstring() that matches all cases (or, yuk, a conditional).

What I'm thinking about now is a commit hook that prevents new ExitWithError. To at least prevent the problem from getting worse.
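
One possible shape for such a check, as a sketch only: scan a unified diff and report added lines that contain the empty-args form. The function name `newBareExitWithError` and the sample diff are hypothetical. A real hook would presumably feed it the output of `git diff --cached`.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// newBareExitWithError returns the added lines in a unified diff that
// contain an empty-argument ExitWithError() call.
func newBareExitWithError(diff string) []string {
	var hits []string
	sc := bufio.NewScanner(strings.NewReader(diff))
	for sc.Scan() {
		line := sc.Text()
		// Added lines start with "+"; "+++" is the file header, not content.
		if !strings.HasPrefix(line, "+") || strings.HasPrefix(line, "+++") {
			continue
		}
		if strings.Contains(line, "ExitWithError()") {
			hits = append(hits, strings.TrimSpace(line[1:]))
		}
	}
	return hits
}

func main() {
	// Hand-written sample diff; the file name is made up.
	diff := "+++ b/test/e2e/example_test.go\n" +
		"-\tExpect(session).Should(ExitWithError())\n" +
		"+\tExpect(session).Should(ExitWithError(125, \"expected substring\"))\n" +
		"+\tExpect(other).Should(ExitWithError())\n"

	for _, h := range newBareExitWithError(diff) {
		fmt.Println("new bare ExitWithError():", h)
	}
}
```

Removed lines and lines that already pass arguments are ignored, so existing uses do not block a commit. Only newly added empty-args calls are reported.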

Luap99 · 2024-04-04T11:58:58Z

> These are hard, and I'm lazy. For every individual ExitWithError() I need to manually find a test log, guess the exit code, identify the error message, see if it differs between root/rootless/remote, and add an ExpectSubstring() that matches all cases (or, yuk, a conditional).

I wonder if this could be automated. We already have the logs and know the stderr output, so technically it would just need to be captured, matched for common substrings, and then added as a check in the right place. If there are enough of them, I think it may be worth automating this. Of course we would still need to review the results to make sure the error messages are actually the ones that should be happening in the tests.
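
The "match for common substrings" step could look roughly like the following sketch. It is a brute-force longest-common-substring search over byte slices (so it assumes ASCII messages), and the three stderr strings are invented stand-ins for root, rootless, and remote output, not real podman messages.

```go
package main

import (
	"fmt"
	"strings"
)

// inAll reports whether sub occurs in every string of msgs.
func inAll(sub string, msgs []string) bool {
	for _, m := range msgs {
		if !strings.Contains(m, sub) {
			return false
		}
	}
	return true
}

// longestCommonSubstring returns the longest string that occurs in every
// input. It tries each substring of the first input, longest first, and
// returns the leftmost one on ties.
func longestCommonSubstring(msgs []string) string {
	if len(msgs) == 0 {
		return ""
	}
	base := msgs[0]
	for length := len(base); length > 0; length-- {
		for start := 0; start+length <= len(base); start++ {
			cand := base[start : start+length]
			if inAll(cand, msgs[1:]) {
				return cand
			}
		}
	}
	return ""
}

func main() {
	// Invented messages standing in for three test environments.
	stderrs := []string{
		"Error: runtime A: widget not found",
		"Error: runtime B (rootless): widget not found",
		"Error: unable to connect: widget not found",
	}
	fmt.Printf("%q\n", longestCommonSubstring(stderrs))
}
```

For these inputs the candidate is ": widget not found". A human would still need to confirm that the suggested substring is the error the test is meant to provoke.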

> What I'm thinking about now is a commit hook that prevents new ExitWithError. To at least prevent the problem from getting worse.

This sounds like a good first step.

...and an optional error-message string, to be checked against stderr.

This is a starting point and baby-steps progress toward containers#18188. There are 249 ExitWithError() checks in test/e2e. It will take weeks to fix them all. This commit enables new functionality:

    Expect(ExitWithError(125, "expected substring"))

...while also allowing the current empty-args form. Once all 249 empty-args uses are modernized, the matcher code will be cleaned up.

I expect it will take several months of light effort to get all e2e tests transitioned to the new form. I am choosing to do so in pieces, for (relative) ease of review. This PR:

1) makes the initial changes described above; and
2) updates a small subset of e2e _test.go files such that:
   a) ExitWithError() is given an exit code and error string; and
   b) Exit(Nonzero) is changed to ExitWithError(Nonzero, "string") (when possible)

Signed-off-by: Ed Santiago <[email protected]>
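
As an illustration of how a transitional matcher might accept both forms, here is a stdlib-only sketch. It is not the code from the commit: `exitMatcher`, the lowercase `exitWithError`, and `Match` are hypothetical names, and the argument handling is an assumption based on the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// exitMatcher is a hypothetical stand-in for the e2e matcher.
type exitMatcher struct {
	legacy     bool   // true for the old empty-args form
	wantCode   int    // exact exit code expected
	wantStderr string // substring expected in stderr ("" means unchecked)
}

// exitWithError accepts no arguments (legacy: any nonzero exit), or an
// exit code optionally followed by an expected stderr substring.
func exitWithError(args ...any) (exitMatcher, error) {
	if len(args) == 0 {
		return exitMatcher{legacy: true}, nil
	}
	if len(args) > 2 {
		return exitMatcher{}, fmt.Errorf("expected at most 2 arguments, got %d", len(args))
	}
	code, ok := args[0].(int)
	if !ok {
		return exitMatcher{}, fmt.Errorf("exit code must be an int, got %T", args[0])
	}
	m := exitMatcher{wantCode: code}
	if len(args) == 2 {
		msg, ok := args[1].(string)
		if !ok {
			return exitMatcher{}, fmt.Errorf("error message must be a string, got %T", args[1])
		}
		m.wantStderr = msg
	}
	return m, nil
}

// Match reports whether an observed exit code and stderr satisfy the matcher.
func (m exitMatcher) Match(exitCode int, stderr string) bool {
	if m.legacy {
		return exitCode != 0
	}
	return exitCode == m.wantCode && strings.Contains(stderr, m.wantStderr)
}

func main() {
	m, err := exitWithError(125, "expected substring")
	if err != nil {
		panic(err)
	}
	fmt.Println(m.Match(125, "Error: expected substring in message"))
	fmt.Println(m.Match(125, "Error: something unrelated"))

	legacy, _ := exitWithError()
	fmt.Println(legacy.Match(1, "Error: anything at all"))
}
```

Keeping the empty-args path lets old call sites compile while they are migrated in pieces. Removing that branch later turns every remaining bare call into a failure.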

edsantiago · 2024-07-02T12:35:34Z

ExitWithError() now requires two args: exit status and error substring. I can't (automatically) enforce that new test code test for meaningful substrings, so we will all have to be diligent in writing and reviewing new tests.

There's still the Expect(Should(Exit(N))) pattern. There's no way to block that, because it is sometimes impossible to use ExitWithError(). Again, we'll try to flag those in review and allow them only if nothing else will work.

Current list of exceptions that I can't clean up:

$ ack '\(Exit\([^0]' test/e2e
test/e2e/containers_conf_test.go:606:		Eventually(session, DefaultWaitTimeout).Should(Exit(125), description)
test/e2e/images_test.go:301:			Expect(session).Should(Exit(result))
test/e2e/info_test.go:43:			Expect(session).Should(Exit(tt.exitCode), desc)
test/e2e/play_kube_test.go:2093:		Expect(kube).Should(Exit(-1))
test/e2e/play_kube_test.go:2105:		Expect(exec).Should(Exit(-1))
test/e2e/quadlet_test.go:638:			Expect(session).Should(Exit(exitCode))
test/e2e/quadlet_test.go:708:			Expect(session).Should(Exit(1))
test/e2e/systemd_activate_test.go:137:		Expect(activateSession).To(Exit(125))
test/e2e/version_test.go:59:			Expect(session).Should(Exit(tt.exitCode), desc)
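
For readers unfamiliar with the pattern: it looks for a literal `(Exit(` followed by any character other than `0`. A small Go sketch with the same regular expression, run against hand-written sample lines (the variable name `nonzeroExit` is arbitrary):

```go
package main

import (
	"fmt"
	"regexp"
)

// nonzeroExit mirrors the ack pattern: a literal "(Exit(" followed by
// any character other than '0'.
var nonzeroExit = regexp.MustCompile(`\(Exit\([^0]`)

func main() {
	lines := []string{
		"Expect(session).Should(Exit(125))",
		"Expect(session).Should(Exit(result))",
		"Expect(session).Should(Exit(0))",
		`Expect(session).Should(ExitWithError(125, "expected substring"))`,
	}
	for _, l := range lines {
		fmt.Printf("%-5v %s\n", nonzeroExit.MatchString(l), l)
	}
}
```

The first two lines match. `Exit(0)` does not, and neither does `ExitWithError(...)`, because `(Exit` is not directly followed by `(` there.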

I consider this closed. This was an interesting challenge.

edsantiago mentioned this issue Apr 19, 2023

Ed's pet PR with no flake retries #17831

Draft

Luap99 mentioned this issue May 11, 2023

ginkgo tests EPIC #18540

Open

13 tasks

github-actions bot added the stale-issue label Jun 14, 2023

edsantiago removed the stale-issue label Jun 14, 2023

Luap99 mentioned this issue Jun 19, 2023

bump golangci-lint to v1.53.3 #18931

Merged

edsantiago mentioned this issue Apr 5, 2024

e2e: redefine ExitWithError() to require exit code #22270

Merged

edsantiago closed this as completed Jul 2, 2024

stale-locking-app bot added the "locked - please file new issue/PR" label Oct 1, 2024

stale-locking-app bot locked as resolved and limited conversation to collaborators Oct 1, 2024
