-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kola: support denylist Warn feature #3539
Conversation
1071f29
to
0f06f66
Compare
mantle/kola/harness.go
Outdated
// Ignore the error when only denied tests with Warn:true feaute failed | ||
if runErr != nil && allTestsWarnOnError(testResults.getResults()) { | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the goal here is that in our pipeline we are able to know when a test that fails is warn: true
and "warn" in the pipline (warn in jenkins is different; the build continues but does get flagged with yellow warn status). This is crucial because otherwise we don't get notified when a warning happens.
I'm thinking we should surface this up and return a non-zero return code, but defined return code so that callers (like jenkins) can adjust their behavior. For example if we returned 77 here jenkins could check the return code if it's non-zero and if it is 77 then warn instead of fail. Here is an example where we added a new option to change the return code in #1152
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played around with this idea and posted some code over in https://github.com/dustymabe/coreos-assembler/commits/PR3539 - let me know what you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add 77
with a separate PR, but not here (to keep warn:true
feature support simple).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add
77
with a separate PR, but not here (to keepwarn:true
feature support simple).
Yes, if you don't mind we can open a followup PR for something like changing the exit code and then update coreos-ci-lib to change behavior of kola()
based on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
mantle/kola/harness.go
Outdated
if IsWarningOnFailure(test) { | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so. For example denylisted tagged warn: true
tests shouldn't affect allow-rerun-success tag=foo
failed tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh I see now. if a warn:true test didn't have tag=foo
then the loop would exit early because it found a test that didn't allow rerun success, but we don't want this to count against that.
Maybe the better thing to do here would be to not rerun warn:true tests. i.e. we'd filter it out in getRerunnable(
somehow. I'm not sure if that's a great idea or not, but spitballing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i've removed this check and now kola
simply reruns everything. Anyhow we sort denylist + warn
out later
5ccf356
to
fdedefa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly LGTM - some comments
mantle/kola/harness.go
Outdated
if IsWarningOnFailure(test) { | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh I see now. if a warn:true test didn't have tag=foo
then the loop would exit early because it found a test that didn't allow rerun success, but we don't want this to count against that.
Maybe the better thing to do here would be to not rerun warn:true tests. i.e. we'd filter it out in getRerunnable(
somehow. I'm not sure if that's a great idea or not, but spitballing.
Fail TestResult = "FAIL" | ||
Warn TestResult = "WARN" | ||
Skip TestResult = "SKIP" | ||
Pass TestResult = "PASS" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
random comment: In the future I think it would be cool if we could use colors for these outputs. I'm thinking red, yellow, white, green.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's about s/white/blue/
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did this work for you? I can't get colors to output locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me that works. tmux and gnome-terminal have no problems
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I got it working :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what was wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok now the CI unittests are failing because they don't expect the color in the output. We could add the expected color to the CI tests.. Alternatively I wonder if we should try to check to see if the output is to a terminal (tty) or not and add the color based on that. Something similar to https://github.com/coreos/rpm-ostree/blob/dd370d13d3d41da0e3d61fc9ace558d6bc349f7f/src/libpriv/rpmostree-editor.cxx#L37-L55
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another example is NetworkManager. nmcli c show
will show colors, but nmcli c show | cat
won't because the output is to a pipe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what was wrong?
I didn't have the code checked out right locally :(
38b353d
to
94beb35
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
coreos#3344 We bubble denylisted tests with 'Warn: true' option as warnings rather than hard failures: ``` kola -p qemu run --parallel 8 ext.config.ntp.* --output-dir tmp/kola ⏭️ Skipping kola test pattern "ext.config.ntp.chrony.dhcp-propagation": 👉 coreos/fedora-coreos-tracker#1508⚠️ Warning kola test pattern "ext.config.ntp.timesyncd.dhcp-propagation", snoozing expired on Jul 20 2023: 👉 coreos/fedora-coreos-tracker#1508 === RUN ext.config.ntp.chrony.coreos-platform-chrony-config === RUN ext.config.ntp.timesyncd.dhcp-propagation --- PASS: ext.config.ntp.chrony.coreos-platform-chrony-config (27.56s) --- WARN: ext.config.ntp.timesyncd.dhcp-propagation (90.72s) cluster.go:162: Error: Unit kola-runext.service exited with code 1 cluster.go:162: 2023-07-27T06:24:18Z cli: Unit kola-runext.service exited with code 1 harness.go:1236: kolet failed: : kolet run-test-unit failed: Process exited with status 1 FAIL, output in tmp/kola + rc=0 + set +x ``` Co-authored-by: Dusty Mabe <[email protected]>
With the introduction of warn: true the logic gets complicated. This is an attempt to simplify the logic and handle all corner cases.
94beb35
to
e35315b
Compare
e35315b
to
5c2f264
Compare
/retest |
@nikita-dubrovskii: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
the prow/rhcos failure seems unrelated to this. Merging. |
warn: true was added in coreos/coreos-assembler#3539 and can be used to make test failures non-fatal in the pipeline. Let's strip those from the denylist too on promotion.
This is a new feature that was added recently in coreos/coreos-assembler#3539. Now when a snooze expires we can configure it to warn us and continue to build rather than marking the entire run as a failure.
warn: true was added in coreos/coreos-assembler#3539 and can be used to make test failures non-fatal in the pipeline. Let's strip those from the denylist too on promotion.
This is a new feature that was added recently in coreos/coreos-assembler#3539. Now when a snooze expires we can configure it to warn us and continue to build rather than marking the entire run as a failure.
This is a new feature that was added recently in coreos/coreos-assembler#3539. Now when a snooze expires we can configure it to warn us and continue to build rather than marking the entire run as a failure.
This is a new feature that was added recently in coreos/coreos-assembler#3539. Now when a snooze expires we can configure it to warn us and continue to build rather than marking the entire run as a failure.
#3344
We bubble denylisted tests with 'Warn: true' option as warnings rather than hard failures:
```
kola -p qemu run --parallel 8 ext.config.ntp.* --output-dir tmp/kola