-
Notifications
You must be signed in to change notification settings - Fork 582
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tests: add apparmor prompting integration tests #14518
tests: add apparmor prompting integration tests #14518
Conversation
97ec76b
to
604508d
Compare
Interesting that jammy tests failed but noble succeeded, and that the prompting-client integration tests #14387 pass on jammy. This suggests a problem with the test setup, rather than prompting support itself. The error would seem to suggest this as well, but I'll need to run with
Edit: Yes, appears to be a difference in the way access to non-owned files in |
Also want to add a test which queues up multiple requests, then the client replies to the final one, and it checks whether the previous requests are now handled by the reply. But this requires being able to have prompt entries in the script which do not have replies, and this is not yet supported, I believe. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #14518 +/- ##
==========================================
+ Coverage 78.95% 78.96% +0.01%
==========================================
Files 1084 1084
Lines 146638 146709 +71
==========================================
+ Hits 115773 115853 +80
+ Misses 23667 23659 -8
+ Partials 7198 7197 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
I think it's worth reviewing these as they are now, and either adding more cases here or opening follow-up PRs for more cases in the future. Some remaining considerations for reviewers:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Overall, I think it looks great. To respond to your comments, I think it could be a little cleaner if, as you already mentioned, you moved the prompting-client.scripted
call to the task level. Along that same vein, since each test has a custom action that it performs followed by a custom check for success, you could place "do action" and "check for success" logic for each test inside of functions that get called at the task level. That would slightly reduce code duplication since your wait conditions could also be moved to the task level. That said, I think the way you currently laid out the tests is just fine; I suppose I would more favor the change if you think many more tests will be added in the future.
snap run --shell prompting-client.scripted -c "echo $name is written > ${TEST_DIR}/${name}" | ||
done | ||
|
||
sleep 5 # give the client a chance to write its result and exit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: If possible, I would try to get rid of all sleeps throughout this PR. Sleeps introduce unnecessary delay and mask the true wait condition. What about something like timeout 5 bash -c 'while pgrep -f "prompting-client-scripted" > /dev/null; do true; done'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've replaced sleeps with timeouts like you suggested, and set the timeout duration as a variable in task.yaml
which is passed into each test case script.
Thanks for the great suggestion about pgrep as well!
Problem seems to be the kernel locking up waiting for a reply?? I'm testing in a lxd VM. After sending the first write request, sending another one does not cause a kernel request, and trying to open another shell fails. I'll record a video to show what I'm seeing... |
Here's the recording: 2024-10-09_14-11-21.mp4Some annotations:
I'm going to test another theory to see if I can bypass the shell locking up and get the test to behave. Another (shorter) video incoming... |
Okay, very interesting. Adding a sleep (to get the inner bash shells running first before anything hangs on the write) does not help, the same thing occurs. But we also saw an actual kernel trace this time: Here's the recording, with some (overly-verbose, sorry) commentary about what's going on: https://bucket.calder.dev/integration-test-debugging/sleep-first.mp4 And here's the kernel trace:
Okay, I tried again, this time running each as Lastly, I tried with So in conclusion, once one write is initiated (and not actioned by a prompt), it seems to lock up the shell and prevent other writes from being carried out. Interestingly, snapd and the prompting client are still able to respond to the kernel, but new requests aren't received by snapd from the kernel, and the shell seems to lock up. So I'm not sure what to make of this... I think I can add some more listener debug logging to make sure it's not snapd locking up, but since the shell locks up, I think it's something else. |
What I really need to do is try on oracular, as that has the fixes for the |
I tried this on oracular, with basically the same result (though I did identify #14593 along the way 🙃). Here's the recording: https://bucket.calder.dev/integration-test-debugging/oracular.mp4 And here's the journalctl log, with Most interesting immediate takeaway for me is that the journal stops receiving messages for ~1min after each write blocks. |
Just noticed something else: the system locking up only occurs when prompting is blocked on a write, not on a read. And the things which are locked up also seem to be writes (though some writes do seem to be able to still occur, such as the client writing a response to the snapd API socket, or snapd writing a response to the kernel (using ioctl). Here's the recording: read-vs-write.mp4 |
Oh, it's more interesting than that: it's that file creation in the same directory hangs, but other file creation succeeds! In the file below, everything from |
464fd3a
to
e14860e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, looks really good! did a first pass.
Also, maybe the script files can go under a testdata directory so that they are not included in the go package? feel free to ignore this for now, it may make sense to do this for all tests anyway later.
https://pkg.go.dev/cmd/go#hdr-Package_lists_and_patterns
Directory and file names that begin with "." or "_" are ignored by the go tool, as are directories named "testdata".
tests.session -u test exec prompting-client.scripted \ | ||
--script="${TEST_DIR}/script.json" \ | ||
--grace-period=1 \ | ||
--var="BASE_PATH:${TEST_DIR}" | tee "${TEST_DIR}/result" & |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the prompting client keep running endlessly in the background?
How about using systemd-run
to manage its life cycle instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The usual prompting client daemon is running in the background, yes. This may not be necessary, but if snapd restarts, it'll check for the presence of a handler-service
app and restart it if it's not running. I could experiment with disabling the prompting-client.daemon
service...
The scripted client will terminate when it finishes its script and waits the grace period to see if any unexpected prompts occur. The scripted client needs to run as the test
user, would systemd-run
work for that? I'm wary of doing anything much different than a normal user invoking prompting-client.scripted
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I think you're right, if the test script exits early, the prompting client needs to be manually killed. I added this fix in one of the recent PRs.
Open to exploring systemd-run though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you can easily run as any user, the added benefit is that you don't need to handle pids yourself you let systemd handle that for you.
This is very similar to what you could use
tests.session -u test exec systemd-run --user --unit test-kill.service flock "$lockfile" "$sh_snap_bin" -c 'touch $SNAP_USER_COMMON/alive; sleep 100000' |
e14860e
to
37506e7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for pushing this forward, did an high-level pass, and looked only at the first of the tests, there's quite a bit of timeout/sleep/retry based code, I'm worried about the behavior of that on slow testing machines. If we cannot do better, you should consider being at least a bit more pessimistic with the numbers, it might also make sense to code the most common wait/timeout times etc as env vars instead of having to change them one by one
--) | ||
shift | ||
break | ||
;; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this change? seems a bit unrelated, also @sergiocazzolato should review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing help text didn't actually work:
show_help() {
echo "usage: tests.session exec [-u USER] [-p PID_FILE] [--] <CMD>"
echo " tests.session prepare | restore [-u USER | -u USER1,USER2,...]"
echo " tests.session kill-leaked"
echo " tests.session dump"
echo " tests.session has-system-systemd-and-dbus"
echo " tests.session has-session-systemd-and-dbus"
}
Here's the switch for the exec keyword:
exec)
action="exec"
shift
# remaining arguments are the command to execute
break
;;
So if you try to do tests.session exec -u test -- echo foo
, it'll treat -u
, test
, --
echo
foo
as the command to execute, which is wrong.
So I saw two options:
- change the behavior of
exec
so it actually expects--
and doesn'tbreak
immediately - remove the
--
from theexec
help text
I checked every usage of tests.session
in spread tests, and all of them put -u USER -p PID_FILE
before the exec
. So I did the latter with regards to exec
. And --
isn't used by any other command, so I thought it best to remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, the help needs an update, after the exec command it breaks, so what it expected is the cmd to execute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other way to do this is removing the break in exec, this will force to use -- to delimit the command to execute.
But this will require a change in the tests which should be done in a separate pr
echo "Run the test script as the test user" | ||
if ! tests.session -u test exec sh -x "${TEST_DIR}/${VARIANT}.sh" "$TEST_DIR" ; then | ||
# kill the prompting-client-scripted process for this test run | ||
pkill -f "prompting-client-scripted.*${TEST_DIR}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we always kill it in restore per above discussion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but would the scripted client running in the background prevent the execute
phase from completing, and thus prevent restore
from starting? In testing, I found that if the scripted client did not terminate during execute, the test would sit waiting until it timed out.
Old answer when I hadn't caught the "in restore" in your question:
The prompting client scripted mode will terminate successfully after it has observed all expected prompts and the grace period elapses (so it's sure no unexpected prompts occurred after the last expected one). It will terminate unsuccessfully if it sees any unexpected prompt or if it receives an error after talking to the snapd API. I believe the only case when we need to kill it explicitly is if the shell script for the test case exited early for some reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably worth expanding the comment on the pkill to mention that is still running because it might not have observed all the expected prompts?
"prompt-filter": { | ||
"snap": "prompting-client", | ||
"interface": "home", | ||
"constraints": { | ||
"path": "$BASE_PATH/.*" | ||
} | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the role of this outside of prompts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overall purpose of the scripted client is to test that the correct prompts occur in the correct order. This top-level filter applies first, so only prompts matching this filter are considered when checking the order and contents of the prompts. For these tests, the top-level filter is a way of isolating the different test cases, so only prompts which have a path starting with the passed-in base path are checked against the script, and that base path is mktemp
-unique to the test case.
@@ -1,7 +1,7 @@ | |||
#!/bin/bash -e | |||
|
|||
show_help() { | |||
echo "usage: tests.session exec [-u USER] [-p PID_FILE] [--] <CMD>" | |||
echo "usage: tests.session [-u USER] [-p PID_FILE] exec <CMD>" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could have prepare and retore following the same logic
tests.session [-u USER | -u USER1,USER2,...]" prepare | restore
In fact, in most the tests it is used like this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need just break after prepare/restore commands are found
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most tests already do -u prepare
and -u restore
, but there are a few exceptions:
me@hostname:~/Projects/snapd$ grep -r 'tests\.session .*prepare.* -u.*'
tests/regression/lp-2065077/task.yaml: tests.session prepare -u test
tests/lib/tools/tests.session: echo " tests.session prepare | restore [-u USER | -u USER1,USER2,...]"
tests/lib/tools/suite/tests.session/task.yaml: tests.session prepare -u "$USER"
tests/main/snap-user-dir-perms-fixed/task.yaml: tests.session prepare -u "$USER"
tests/main/user-session-env/task.yaml: tests.session prepare -u "$user"
me@hostname:~/Projects/snapd$ grep -r 'tests\.session .*restore.* -u.*'
tests/regression/lp-2065077/task.yaml: tests.cleanup defer tests.session restore -u test
tests/lib/tools/tests.session: echo " tests.session prepare | restore [-u USER | -u USER1,USER2,...]"
tests/lib/tools/suite/tests.session/task.yaml: tests.session restore -u "$USER"
tests/main/snap-user-dir-perms-fixed/task.yaml: tests.session restore -u "$USER"
tests/main/user-session-env/task.yaml: tests.session restore -u "$user"
Seeing as prepare/restore work fine either way, I don't see a need to change them in this PR, but the change should be simple enough to enforce -u prepare
and -u restore
always.
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
…uce sleeps Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
When creating a new file is blocked on a reply to a request prompt, the directory in which the file will be created is locked from other writes. Thus, we can't queue up multiple outstanding writes on files in the same directory. Instead, we must write files in different directories in order for this test to succeed. Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
…st dir to terminate Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Now that canonical#14538 has landed, rules may overlap as long as their outcomes do not conflict. As such, the download_file_defaults test case is no longer expected to fail. Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
eeb8e0d
to
f00813f
Compare
Test failures: Prepare failed:
Execute failed:
Restore failed:
Plus all of the opensuse tests, none of which are related to this PR: https://github.com/canonical/snapd/actions/runs/11450399848/job/31907785826?pr=14518 This PR adds a new spread test and does not affect production code, so it's clear the other test failures are unrelated to the changes here. The changes to |
Port the prompting-client integration tests to snapd and extend them to cover common use cases for AppArmor Prompting.
This work is tracked internally by https://warthogs.atlassian.net/browse/SNAPDENG-30450.