sys tests: run_podman: check for unwanted warnings/errors #19878

Merged
2 changes: 1 addition & 1 deletion test/system/001-basic.bats
@@ -35,7 +35,7 @@ function setup() {
run_podman -v
is "$output" "podman.*version \+" "'Version line' in output"

-run_podman --config foobar version
+run_podman 0+w --config foobar version
is "$output" ".*The --config flag is ignored by Podman. Exists for Docker compatibility\+" "verify warning for --config option"
}

4 changes: 3 additions & 1 deletion test/system/010-images.bats
@@ -387,7 +387,9 @@ EOF
tries=100
while [[ ${#lines[*]} -gt 1 ]] && [[ $tries -gt 0 ]]; do
# Prior to #18980, 'podman images' during rmi could fail with 'image not known'
-run_podman images --format "{{.ID}} {{.Names}}"
+# '0+w' reflects that we may see "Top layer not found" warnings.
+# FIXME FIXME: find a way to check for any other warnings
+run_podman 0+w images --format "{{.ID}} {{.Names}}"
tries=$((tries - 1))
done

13 changes: 9 additions & 4 deletions test/system/030-run.bats
@@ -181,7 +181,7 @@ echo $rand | 0 | $rand
run_podman image exists $NONLOCAL_IMAGE

# Now try running with --rmi : it should succeed, but not remove the image
-run_podman run --rmi --rm $NONLOCAL_IMAGE /bin/true
+run_podman 0+e run --rmi --rm $NONLOCAL_IMAGE /bin/true
is "$output" ".*image is in use by a container" "--rmi should warn that the image was not removed"
Comment on lines +184 to 185
Member Author

Note that this is "e", that is, the warning is actually level=error. Should it be?

Member

That LGTM. The man page states: "After exit of the container, remove the image unless another container is using it." It seems like a good idea to log when the image cannot be removed.

Member Author

Full agreement on "there should be a warning". My question is, should it be level=error?

Member

Ah, thanks for clarifying. I think a warning would be better than an error. It's perfectly fine behavior and does not qualify as an error IMO.

run_podman image exists $NONLOCAL_IMAGE

@@ -229,7 +229,7 @@ echo $rand | 0 | $rand
"conmon pidfile (= PID $conmon_pid_from_file) points to conmon process"

# All OK. Kill container.
-run_podman rm -f $cid
+run_podman rm -f -t0 $cid
if [[ -e $cidfile ]]; then
die "cidfile $cidfile should be removed along with container"
fi
@@ -946,7 +946,7 @@ EOF
if grep -- -1000 /proc/self/oom_score_adj; then
skip "the current oom-score-adj is already -1000"
fi
-run_podman run --oom-score-adj=-1000 --rm $IMAGE true
+run_podman 0+w run --oom-score-adj=-1000 --rm $IMAGE true
is "$output" ".*Requested oom_score_adj=.* is lower than the current one, changing to .*"
}

@@ -1058,7 +1058,12 @@ $IMAGE--c_ok" \
"ls /dev/tty[0-9] with --systemd=always: should have no ttyN devices"

# Make sure run_podman stop supports -1 option
-run_podman stop -t -1 $cid
+# FIXME: why is there no signal name here? Should be 'StopSignal XYZ'
Member

That looks like a fart. The double whitespace in particular suggests that the value being printed differs from the one actually used: for instance, an empty string is printed and only later normalized to SIGTERM.

Likely needs a dedicated issue.

+# FIXME: do we really really mean to say FFFFFFFFFFFFFFFF here???
Member

I do not understand the question. Can you elaborate?

Member Author

I guess I don't know the reasoning behind accepting stop -t -1. It seems meaningless to me, and even more meaningless to convert that to uint64 and display it as such.

+run_podman 0+w stop -t -1 $cid
+if ! is_remote; then
+assert "$output" =~ "StopSignal failed to stop container .* in 18446744073709551615 seconds, resorting to SIGKILL" "stop -t -1 (negative one) issues warning"
vrothberg marked this conversation as resolved.
+fi
run_podman rm -t -1 -f $cid
}
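
The 18446744073709551615 in the assertion above is what -1 looks like once it is reinterpreted as an unsigned 64-bit integer, which fits the uint64 conversion mentioned in the review comment. As a worked check, assuming plain two's-complement wraparound:

```latex
-1 \bmod 2^{64} \;=\; 2^{64} - 1 \;=\; 18446744073709551615 \;=\; \mathtt{0xFFFFFFFFFFFFFFFF}
```

The sixteen hex digits F are the "FFFFFFFFFFFFFFFF" that the FIXME refers to.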

5 changes: 4 additions & 1 deletion test/system/045-start.bats
@@ -27,7 +27,10 @@ load helpers
run_podman wait $cid_none_implicit $cid_none_explicit $cid_on_failure

run_podman rm $cid_none_implicit $cid_none_explicit $cid_on_failure
-run_podman stop -t 1 $cid_always
+run_podman 0+w stop -t 1 $cid_always
+if ! is_remote; then
Member Author

The warning is never seen in podman-remote. Should it be?

Member

Ideally yes but it's technically very difficult. logrus logs are always shown on the server side, and never on the client side.

Member Author

Ack, thanks.

+assert "$output" =~ "StopSignal SIGTERM failed to stop container .*, resorting to SIGKILL"
+fi
run_podman rm $cid_always
}

18 changes: 13 additions & 5 deletions test/system/050-stop.bats
@@ -9,8 +9,11 @@ load helpers

# Run 'stop'. Time how long it takes.
t0=$SECONDS
-run_podman stop $cid
+run_podman 0+w stop $cid
t1=$SECONDS
+if ! is_remote; then
+assert "$output" =~ "StopSignal SIGTERM failed to stop container .*, resorting to SIGKILL"
+fi

# Confirm that container is stopped. Podman-remote unfortunately
# cannot tell the difference between "stopped" and "exited", and
@@ -44,7 +47,10 @@ load helpers
is "${lines[2]}" "c3--Up.*" "podman ps shows running container (3)"

# Stop -a
-run_podman stop -a -t 1
+run_podman 0+w stop -a -t 1
+if ! is_remote; then
+assert "$output" =~ "StopSignal SIGTERM failed to stop container .*, resorting to SIGKILL"
+fi

# Now podman ps (without -a) should show nothing.
run_podman ps --format '{{.Names}}'
@@ -185,8 +191,10 @@ load helpers
@test "podman stop -t 1 Generate warning" {
skip_if_remote "warning only happens on server side"
run_podman run --rm --name stopme -d $IMAGE sleep 100
-run_podman stop -t 1 stopme
-is "$output" ".*StopSignal SIGTERM failed to stop container stopme in 1 seconds, resorting to SIGKILL" "stopping container should print warning"
+run_podman 0+w stop -t 1 stopme
+if ! is_remote; then
+is "$output" ".*StopSignal SIGTERM failed to stop container stopme in 1 seconds, resorting to SIGKILL" "stopping container should print warning"
+fi
}

@test "podman stop --noout" {
@@ -204,7 +212,7 @@ load helpers

run_podman run --rm -d --name rmstop $IMAGE sleep infinity
local cid="$output"
-run_podman stop rmstop
+run_podman stop -t0 rmstop

# Check the OCI runtime directory has removed.
is "$(ls $OCIDir | grep $cid)" "" "The OCI runtime directory should have been removed"
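
The stop tests in this file repeat the same three-line pattern: run with `0+w`, then assert on the warning unless running remote. A minimal sketch of how that check could be factored out, assuming nothing beyond what the hunks show: `expect_stop_warning` and `SKETCH_REMOTE` are hypothetical names, and the `is_remote` here is a stand-in for the real helper, not its implementation.

```shell
# Hypothetical stand-in for the real is_remote helper; SKETCH_REMOTE is an
# invented variable used only by this sketch.
is_remote() {
    [ "${SKETCH_REMOTE:-0}" = "1" ]
}

# Hypothetical helper: succeeds if the SIGKILL-fallback warning is present in
# $1, or if we are remote, where the warning is only logged on the server side.
expect_stop_warning() {
    if is_remote; then
        return 0
    fi
    printf '%s\n' "$1" |
        grep -Eq 'StopSignal SIGTERM failed to stop container .*, resorting to SIGKILL'
}
```

In a test this would follow `run_podman 0+w stop ...`, for example as `expect_stop_warning "$output" || die "missing stop warning"`.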
2 changes: 1 addition & 1 deletion test/system/055-rm.bats
@@ -72,7 +72,7 @@ load helpers
# the window for race conditions that led to #9479.
run_podman run --rm -d $IMAGE sleep infinity
local cid="$output"
-run_podman rm -af
+run_podman rm -af -t0

# Check the OCI runtime directory has removed.
is "$(ls $OCIDir | grep $cid)" "" "The OCI runtime directory should have been removed"
1 change: 0 additions & 1 deletion test/system/075-exec.bats
@@ -61,7 +61,6 @@ load helpers

is "$(check_exec_pid)" "" "there isn't any exec pid hash file leak"

-run_podman stop --time 1 $cid
run_podman rm -t 0 -f $cid
}

2 changes: 1 addition & 1 deletion test/system/250-systemd.bats
@@ -79,7 +79,7 @@ function service_cleanup() {
# Warn when a custom restart policy is used without --new (see #15284)
run_podman create --restart=always $IMAGE
cid="$output"
-run_podman generate systemd $cid
+run_podman 0+w generate systemd $cid
is "$output" ".*Container $cid has restart policy .*always.* which can lead to issues on shutdown.*" "generate systemd emits warning"
run_podman rm -f $cid

2 changes: 1 addition & 1 deletion test/system/330-corrupt-images.bats
@@ -77,7 +77,7 @@ function _corrupt_image_test() {
is "$output" "Error: locating item named \".*\" for image with ID \"$id\" (consider removing the image to resolve the issue): file does not exist.*"

# Run the requested command. Confirm it succeeds, with suitable warnings
-run_podman $*
+run_podman 0+w $*
is "$output" ".*Failed to determine parent of image.*ignoring the error" \
"$* with missing $what_to_rm"

2 changes: 1 addition & 1 deletion test/system/450-interactive.bats
@@ -81,7 +81,7 @@ function teardown() {


@test "podman run --tty -i failure with no tty" {
-run_podman run --tty -i --rm $IMAGE echo hello < /dev/null
+run_podman 0+w run --tty -i --rm $IMAGE echo hello < /dev/null
is "$output" ".*The input device is not a TTY.*" "-it _without_ a tty"

CR=$'\r'
9 changes: 6 additions & 3 deletions test/system/500-networking.bats
@@ -82,8 +82,7 @@ load helpers.network
is "$output" 'Error: failed to find published port "99/tcp"'

# Clean up
-run_podman stop -t 1 myweb
-run_podman rm myweb
+run_podman rm -f -t0 myweb
}

# Issue #5466 - port-forwarding doesn't work with this option and -d
@@ -630,7 +629,11 @@ load helpers.network
run curl --retry 2 -s $SERVER/index.txt
is "$output" "$random_1" "curl 127.0.0.1:/index.txt after auto restart"

-run_podman restart $cid
+run_podman 0+w restart $cid
+if ! is_remote; then
+assert "$output" =~ "StopSignal SIGTERM failed to stop container .* in 10 seconds, resorting to SIGKILL" "podman restart issues warning"
+fi

# Verify http contents again: curl from localhost
# Use retry since it can take a moment until the new container is ready
run curl --retry 2 -s $SERVER/index.txt
6 changes: 3 additions & 3 deletions test/system/520-checkpoint.bats
@@ -222,7 +222,7 @@ function teardown() {
local subnet="$(random_rfc1918_subnet)"
run_podman network create --subnet "$subnet.0/24" $netname

-run_podman run -d --network $netname $IMAGE sleep inf
+run_podman run -d --network $netname $IMAGE top
Member Author

I changed all sleep infs to top. This eliminates the need for 0+w because top is interruptible.

If there is some important reason for using sleep, or if checkpoint/restore does not work properly with top, please speak now.

cid="$output"
# get current ip and mac
run_podman inspect $cid --format "{{(index .NetworkSettings.Networks \"$netname\").IPAddress}}"
@@ -310,7 +310,7 @@ function teardown() {
# now create a container with a static mac and ip
local static_ip="$subnet.2"
local static_mac="92:d0:c6:0a:29:38"
-run_podman run -d --network "$netname:ip=$static_ip,mac=$static_mac" $IMAGE sleep inf
+run_podman run -d --network "$netname:ip=$static_ip,mac=$static_mac" $IMAGE top
cid="$output"

run_podman container checkpoint $cid
@@ -340,7 +340,7 @@ function teardown() {
run_podman rm -t 0 -f $cid

# now create container again and try the same again with --export and --import
-run_podman run -d --network "$netname:ip=$static_ip,mac=$static_mac" $IMAGE sleep inf
+run_podman run -d --network "$netname:ip=$static_ip,mac=$static_mac" $IMAGE top
cid="$output"

run_podman container checkpoint --export "$archive" $cid
35 changes: 30 additions & 5 deletions test/system/helpers.bash
@@ -325,8 +325,11 @@ function timestamp() {
#
function run_podman() {
# Number as first argument = expected exit code; default 0
-expected_rc=0
+# "0+[we]" = require success, but allow warnings/errors
+local expected_rc=0
+local allowed_levels="dit"
case "$1" in
+0\+[we]*) allowed_levels+=$(expr "$1" : "^0+\([we]\+\)"); shift;;
[0-9]) expected_rc=$1; shift;;
[1-9][0-9]) expected_rc=$1; shift;;
[12][0-9][0-9]) expected_rc=$1; shift;;
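
The argument handling in this hunk can be tried in isolation. The following is a minimal sketch, not the code from helpers.bash: `parse_expectation` is a hypothetical name, it strips the `0+` prefix with parameter expansion instead of `expr`, and reading "dit" as the first letters of debug/info/trace is an assumption based on the usual logrus level names.

```shell
# Hypothetical helper (not in helpers.bash).
# Prints "<expected_rc> <allowed_levels>" for a run_podman-style first argument.
parse_expectation() {
    local rc=0
    local levels="dit"   # assumed: debug, info, trace are always tolerated
    case "$1" in
        # "0+w", "0+e", "0+we": require exit status 0, tolerate extra levels
        0+[we]*) levels="${levels}${1#0+}" ;;
        # A bare number is the expected exit status
        [0-9]|[1-9][0-9]|[12][0-9][0-9]) rc=$1 ;;
    esac
    printf '%s %s\n' "$rc" "$levels"
}

parse_expectation 0+w stop -t 1 mycontainer
parse_expectation 125 run nonexistent-image
```

Anything that matches neither pattern, such as a plain subcommand name, leaves the defaults in place: exit status 0 and no warnings allowed.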
@@ -336,8 +339,8 @@ function run_podman() {
# Remember command args, for possible use in later diagnostic messages
MOST_RECENT_PODMAN_COMMAND="podman $*"

-# stdout is only emitted upon error; this echo is to help a debugger
-echo "$(timestamp) $_LOG_PROMPT $PODMAN $*"
+# stdout is only emitted upon error; this printf is to help in debugging
+printf "\n%s %s %s\n" "$(timestamp)" "$_LOG_PROMPT" "$*"
Comment on lines +342 to +343
Member Author

This is completely unrelated, I'm just sneaking it in. It adds an extra newline, which has greatly helped me scan error logs for most-recent-podman-command.

# BATS hangs if a subprocess remains and keeps FD 3 open; this happens
# if podman crashes unexpectedly without cleaning up subprocesses.
run timeout --foreground -v --kill=10 $PODMAN_TIMEOUT $PODMAN $_PODMAN_TEST_OPTS "$@" 3>/dev/null
@@ -381,6 +384,28 @@ function run_podman() {
die "exit code is $status; expected $expected_rc"
fi
fi

+# Check for "level=<unexpected>" in output, because a successful command
+# should never issue unwanted warnings or errors. The "0+w" convention
+# (see top of function) allows our caller to indicate that warnings are
+# expected, e.g., "podman stop" without -t0.
+if [[ $status -eq 0 ]]; then
+# FIXME: don't do this on Debian: runc is way, way too flaky:
+# FIXME: #11784 - lstat /sys/fs/.../*.scope: ENOENT
+# FIXME: #11785 - cannot toggle freezer: cgroups not configured
Comment on lines +393 to +395
Member Author

I don't see any way around this

+if [[ ! "${DISTRO_NV}" =~ debian ]]; then
+# FIXME: All kube commands emit unpredictable errors:
+# "Storage for container <X> has been removed"
+# "no container with ID <X> found in database"
+# These are level=error but we still get exit-status 0.
+# Just skip all kube commands completely
Comment on lines +397 to +401
Member Author

This, though, really bothers me. podman kube is noisy, scary-noisy. I would really like us to consider fixing that.

+if [[ ! "$*" =~ kube ]]; then
+if [[ "$output" =~ level=[^${allowed_levels}] ]]; then
+die "Command succeeded, but issued unexpected warnings"
+fi
+fi
+fi
+fi
}
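
The check added at the end of `run_podman` comes down to a single bracket expression: any `level=` field whose first letter is not in the allowed set counts as unexpected. A minimal sketch of that idea, assuming logrus-style `level=<name>` fields: `has_unexpected_level` is a hypothetical helper, it uses `grep -E` rather than bash's `=~`, and the `time=... level=...` layout of the sample line is an assumption (only the `msg` text comes from the tests in this PR).

```shell
# Hypothetical helper (not in helpers.bash).
# $1 = captured command output
# $2 = first letters of the tolerated log levels, e.g. "dit" or "ditw"
# Returns 0 (true) if the output contains a level= field outside that set.
has_unexpected_level() {
    printf '%s\n' "$1" | grep -Eq "level=[^$2]"
}

# Assumed sample line; only the msg text is taken from the tests in this PR.
sample='time="..." level=warning msg="StopSignal SIGTERM failed to stop container stopme in 1 seconds, resorting to SIGKILL"'

if has_unexpected_level "$sample" "dit"; then
    echo "plain run_podman: would die on this output"
fi
if ! has_unexpected_level "$sample" "ditw"; then
    echo "run_podman 0+w: warning is tolerated"
fi
```

Matching on the first letter only keeps the expression short, at the cost of treating any two levels that share an initial as equivalent.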


@@ -413,7 +438,7 @@ function wait_for_output {

t1=$(expr $SECONDS + $how_long)
while [ $SECONDS -lt $t1 ]; do
-run_podman logs $cid
+run_podman 0+w logs $cid
logs=$output
if expr "$logs" : ".*$expect" >/dev/null; then
return
@@ -426,7 +451,7 @@ function wait_for_output {
exitcode=$output

# One last chance: maybe the container exited just after logs cmd
-run_podman logs $cid
+run_podman 0+w logs $cid
if expr "$logs" : ".*$expect" >/dev/null; then
return
fi
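
`wait_for_output` relies on `expr STRING : REGEX`, which only matches from the start of the string; that is why the pattern is prefixed with `.*`. A minimal sketch of the same test as a standalone function, where `output_contains` is a hypothetical name:

```shell
# Hypothetical helper (not in helpers.bash).
# Succeeds if the basic regular expression $2 matches somewhere in $1.
# expr anchors the match at the start of the string, hence the leading ".*".
output_contains() {
    expr "$1" : ".*$2" >/dev/null
}

if output_contains "server starting... READY" "READY"; then
    echo "expected text found"
fi
```

`expr` prints the match length on stdout and signals success through its exit status, so the output is discarded and only the status is used.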