
pasta: timeout again (rawhide) #20170

Closed
edsantiago opened this issue Sep 27, 2023 · 36 comments
Labels: flakes (Flakes from Continuous Integration) · locked - please file new issue/PR (comments locked on old issues/PRs) · pasta (pasta(1) bugs or features)

Comments

@edsantiago
Member

Resurrecting #17598. Pasta test timing out again. Seen in rawhide with passt-0^20230908.g05627dc-1.fc40-x86_64:

[+1596s] ok 556 podman networking with pasta(1) - TCP/IPv4 small transfer, loopback
[+1726s] not ok 557 podman networking with pasta(1) - TCP/IPv4 large transfer, tap
         # (from function `die' in file test/system/helpers.bash, line 747,
         #  from function `run_podman' in file test/system/helpers.bash, line 384,
         #  from function `pasta_test_do' in file test/system/505-networking-pasta.bats, line 248,
         #  in test file test/system/505-networking-pasta.bats, line 700)
         #   `pasta_test_do' failed
         #
         # [20:56:52.117914919] $ /var/tmp/go/src/github.com/containers/podman/bin/podman rm -t 0 --all --force --ignore
         # [20:56:52.226806205] 42b6b5a167b84848f4b09d240ee31f69736ad905a0761bf4b53a4539b2879ceb
         #
         # [20:56:52.238402603] $ /var/tmp/go/src/github.com/containers/podman/bin/podman ps --all --external --format {{.ID}} {{.Names}}
         #
         # [20:56:52.314645311] $ /var/tmp/go/src/github.com/containers/podman/bin/podman images --all --format {{.Repository}}:{{.Tag}} {{.ID}}
         # [20:56:52.386168270] quay.io/libpod/testimage:20221018 f5a99120db64
         # 1+0 records in
         # 1+0 records out
         # 10485760 bytes (10 MB, 10 MiB) copied, 0.0408792 s, 257 MB/s
         #
         # [20:56:52.693255830] $ /var/tmp/go/src/github.com/containers/podman/bin/podman run --net=pasta -p [10.128.0.120]:5294:5294/tcp quay.io/libpod/testimage:20221018 sh -c for port in $(seq 5294 5294); do                              socat -u TCP4-LISTEN:${port},bind=[10.128.0.120] EXEC:md5sum &                          done; wait
         # [20:59:02.709781235] timeout: sending signal TERM to command '/var/tmp/go/src/github.com/containers/podman/bin/podman'
         # timeout: sending signal KILL to command '/var/tmp/go/src/github.com/containers/podman/bin/podman'
         # [20:59:02.716663367] [ rc=137 (** EXPECTED 0 **) ]

This is f40 rawhide, not f39, and it comes from an in-progress PR. Podman is not yet testing f40 rawhide, so we are unlikely to see this flake often... until that PR merges.

@edsantiago edsantiago added flakes Flakes from Continuous Integration pasta pasta(1) bugs or features labels Sep 27, 2023
@edsantiago
Member Author

Another one from a test PR. Four runs total, two failures.

@sbrivio-rh
Collaborator

cc @dgibson

@edsantiago
Member Author

And again, same story (rawhide).

edsantiago added a commit to edsantiago/libpod that referenced this issue Sep 28, 2023
...from f38 + f37.

Requires one minor e2e test change, to handle an error logging
change in conmon 2.1.8.

Also, importantly, this requires crun-1.9.1 because of a kernel
symlink change; see containers/crun#1309.
The VM images here were carefully built to include that. By the
time the next VM images get built, it should be the default.

Since we've bumped crun, remove two obsolete skips.

And, skip a flaky pasta test, containers#20170.

Signed-off-by: Ed Santiago <[email protected]>
@dgibson
Collaborator

dgibson commented Oct 11, 2023

Hrm, not much to go on here. It's possible this is a new version of #17598, but it's equally possible it's some entirely unrelated timeout. Since we've had what I believe should be a pretty robust fix for the #17598 aliasing problem, I'm inclined to suspect it's something else.

@edsantiago do you have a recipe for creating a VM with the f40 snapshot where the problem is appearing?

@edsantiago
Member Author

Hangs on my very first try just now, rawhide, passt-0^20231004.gf851084-1.fc40.x86_64

$ 1minutetip -n 1MT-Fedora-Rawhide
....
ssh into it...
git clone podman, install dependencies
...
# loginctl enable-linger fedora
# su - fedora
$ cd /path/to/podman
$ vim test/system/505*   <----- search for 20170, remove the skip
$ hack/bats --rootless  505:"TCP/IPv4 large transfer, tap"
chcon: failed to change context of '/root/go/podman/bin/podman' to ‘unconfined_u:object_r:container_runtime_exec_t:s0’: Operation not permitted
--------------------------------------------------
$ bats --filter TCP/IPv4 large transfer, tap test/system/505-networking-pasta.bats
505-networking-pasta.bats
   podman networking with pasta(1) - TCP/IPv4 large transfer, tap     

After ^C:

   [13:25:17.815281711] $ /root/go/podman/bin/podman run --net=pasta -p [10.0.184.105]:5788:5788/tcp quay.io/libpod/testimage:20221018 sh -c for port in $(seq 5788 5788); do                              socat -u TCP4-LISTEN:${port},bind=[10.0.184.105] EXEC:md5sum &                          done; wait
   [13:25:17.959441496] Error: pasta failed with exit code 1:
   No external routable interface for IPv6
   Failed to bind any port for '-t 10.0.184.105/5788-5788:5788-5788', exiting
   [13:25:17.962768061] [ rc=126 (** EXPECTED 0 **) ]

@edsantiago
Member Author

One-line reproducer:

$ bin/podman run --net=pasta -p '[127.0.0.1]':5799:5799/tcp quay.io/libpod/testimage:20221018 date
Error: pasta failed with exit code 1:
No external routable interface for IPv6
Failed to bind any port for '-t 127.0.0.1/5799-5799:5799-5799', exiting

Fails the same way in f38 with passt-selinux-0^20230908.g05627dc-1.fc38.noarch.

In f38, after dnf downgrade passt to 0^20230329.gb10b983-1.fc38, it fails differently:

Couldn't open network namespace /run/user/1000/netns/netns-aeb16e93-56d4-0a60-ad67-c4b17976d553

@dgibson
Collaborator

dgibson commented Oct 12, 2023

This message looks like a good clue:

Failed to bind any port for '-t 10.0.184.105/5788-5788:5788-5788', exiting

Usually this means that pasta was unable to bind the specified ports on the host side, which suggests the port is already in use. Are there any stale pasta processes on the host where podman is running? Or is there something else using port 5788?

@edsantiago
Member Author

Same message no matter what port number I specify.

@dgibson
Collaborator

dgibson commented Oct 12, 2023

Couldn't open network namespace /run/user/1000/netns/netns-aeb16e93-56d4-0a60-ad67-c4b17976d553

That looks like one of the SELinux problems we already fixed.

As for the Failed to bind any port error, I haven't managed to reproduce so far on my own machine (f38) with passt-0^20231004.gf851084-1.fc38.x86_64.

Does the problem occur if you disable selinux enforcement?

@edsantiago
Member Author

OK, yeah, it's SELinux; container-selinux was not installed on my VM. This is obviously not the same error as was being seen in CI, because container-selinux is installed there. I will continue investigating next week, but for now, everyone else stand down. Thank you for the suggestion.

@edsantiago
Member Author

edsantiago commented Oct 16, 2023

Yep, still broken with passt-0^20231004.gf851084-1.fc40.x86_64. Look at the timestamps: the first run passes in O(3s), the next few in O(20s), and then it hangs.

$ while :;do echo;date;hack/bats --rootless 505:"TCP/IPv4 large transfer, tap"||break;done

Mon Oct 16 10:19:20 AM EDT 2023
--------------------------------------------------
$ bats --filter TCP/IPv4 large transfer, tap test/system/505-networking-pasta.bats
505-networking-pasta.bats
 ✓ podman networking with pasta(1) - TCP/IPv4 large transfer, tap

1 test, 0 failures


Mon Oct 16 10:19:23 AM EDT 2023
--------------------------------------------------
$ bats --filter TCP/IPv4 large transfer, tap test/system/505-networking-pasta.bats
505-networking-pasta.bats
 ✓ podman networking with pasta(1) - TCP/IPv4 large transfer, tap

1 test, 0 failures


Mon Oct 16 10:19:49 AM EDT 2023
--------------------------------------------------
$ bats --filter TCP/IPv4 large transfer, tap test/system/505-networking-pasta.bats
505-networking-pasta.bats
 ✓ podman networking with pasta(1) - TCP/IPv4 large transfer, tap

1 test, 0 failures


Mon Oct 16 10:20:16 AM EDT 2023
--------------------------------------------------
$ bats --filter TCP/IPv4 large transfer, tap test/system/505-networking-pasta.bats
505-networking-pasta.bats
 ✓ podman networking with pasta(1) - TCP/IPv4 large transfer, tap

1 test, 0 failures


Mon Oct 16 10:20:42 AM EDT 2023
--------------------------------------------------
$ bats --filter TCP/IPv4 large transfer, tap test/system/505-networking-pasta.bats
505-networking-pasta.bats
   podman networking with pasta(1) - TCP/IPv4 large transfer, tap        <<<<<<---------- HANG

After 5 minutes I ^Ced and got:

  Received SIGINT, aborting ...

Twiddle thumbs five more minutes, ^C again, nothing useful. Test terminates but without a backtrace.

@edsantiago
Member Author

Here's ps on a hung job:

# ps auxww --forest
...
fedora     34534  0.0  1.2  76220 24632 ?        Ss   11:39   0:00 /usr/bin/pasta --config-net -t 10.0.184.181/5304-5304:5304-5304 -u none -T none -U none --no-map-gw --netns /run/user/1000/netns/netns-320bc153-dbe7-1873-fde9-23aa349d29da
fedora     34536  0.0  0.1   9776  2332 ?        Ss   11:39   0:00 /usr/bin/conmon [....]
fedora     34538  0.0  0.0   1616   896 ?        Ss   11:39   0:00  \_ sh -c for port in $(seq 5304 5304); do                              socat -u TCP4-LISTEN:${port},bind=[10.0.184.181] EXEC:md5sum &                          done; wait
fedora     34542  0.0  0.1   4916  2432 ?        S    11:39   0:00      \_ socat -u TCP4-LISTEN:5304,bind=[10.0.184.181] EXEC:md5sum
fedora     34544  0.0  0.0   1616   640 ?        S    11:39   0:00          \_ md5sum

FWIW it reproduces pretty easily, within 15 minutes at most.

@dgibson
Collaborator

dgibson commented Oct 17, 2023

Unfortunately those additional reproductions don't offer any additional clues.

@edsantiago, again, do you have an easy way to set up an fc40 VM? It looks like I'm going to have to reproduce this myself to investigate.

@edsantiago
Member Author

Yeah, it's just a 1minutetip VM with podman from git + dependencies. It takes seconds. Do you have 1minutetip? If not, PM me on a secure channel and I'll set one up for you. 1mt VMs have a limited lifetime, 6 hours or so, so it's much better if you set up your own.

@dgibson
Collaborator

dgibson commented Oct 17, 2023

With @edsantiago 's assistance, I've now reproduced the problem both on a VM he set up, and one I set up. I've simplified the reproducing case down a bit, but still don't have a lot in the way of clues about what's going on. Because the client/sending socat appears to be finished in the stuck case, I'm guessing it's (another) race in FIN handling.

I expect to continue digging into this tomorrow and make some more substantial progress.

@dgibson
Collaborator

dgibson commented Oct 18, 2023

Ok, through yesterday I've discovered some interesting new things.

  • The package version in use already includes fixes for upstream bug 74, so it's not identical to that bug.
  • Just to verify, I also checked the problem still occurs with upstream git passt, rather than the package
  • However, the problem can no longer be reproduced if I increase the net.core.rmem_max sysctl to 4MiB
  • That strongly suggests this is closely related to bug 74
  • When I reduce the rmem_max value, I'm now able to reproduce it on my laptop, so it doesn't look like it's specific to something in Fedora 40 after all.

Continuing to investigate.

@dgibson
Collaborator

dgibson commented Oct 18, 2023

Investigated further today. The mechanics here seem to be very similar to upstream bug 74.

  • Packet traces on the guest/server/receiver side simply show the transfer stopping for no clear reason.
  • Packet traces on the host/client/sender side show the sender getting a window full event, then no further transfers
  • We get occasional TCP keepalive packets from the sender, which the receiver (pasta) responds to with explicit zero window messages
  • The guest side packet traces show a non-zero window advertised by the guest-side server
  • Instrumentation I added in pasta shows an estimate of the amount of data in the receive buffer as zero - so we'd expect the host side window to be quite large

Our analysis indicated that bug 74 was caused by a kernel bug: it appears that our heavy use of MSG_PEEK and MSG_TRUNC causes the kernel to miscalculate the window size. We thought we'd worked around that bug by explicitly setting the TCP_WINDOW_CLAMP socket option, which seemed to force the kernel to recalculate the window correctly.

My working theory is that that workaround isn't as robust as we'd hoped, and even with the TCP_WINDOW_CLAMP calls, once the kernel hits a zero window size momentarily it's incorrectly not re-opening the window when we consume data with MSG_TRUNC.

As an aside, I tried replacing our MSG_TRUNC call with an explicit recv() into a discard buffer, but that didn't change the behaviour. That suggests the kernel miscalculation is in the handling of MSG_PEEK rather than MSG_TRUNC.

@dgibson
Collaborator

dgibson commented Oct 24, 2023

Forgot to mention, I now have a reproducer with pasta alone, not using podman. I'm using this script (which will need some tweaking for anyone using it on a different system):

#! /bin/bash

PASTA=~/src/passt/pasta
PORT=38237
FILE=~/passt/podman20170/xfer.bin

set -e

IFNAME="$(ip -j -4 route show | jq -rM '[.[] | select(.dst == "default").dev] | .[0]')"

ADDR="$(ip -j -4 addr show "${IFNAME}" | jq -rM '.[0] | .addr_info[0].local')"

md5sum $FILE

test_once() {
    trap "exit 1" SIGINT

    $PASTA --config-net -t $PORT --debug --log-file=pasta.log --pcap=guestside.pcap -- socat -u TCP4-LISTEN:$PORT EXEC:md5sum &
    sleep 1
    socat -u OPEN:$FILE TCP4:$ADDR:$PORT
    wait
}

date
while test_once; do
    date
done

@dgibson
Collaborator

dgibson commented Oct 24, 2023

There are actually two symptoms we see, both with the original podman reproducer and the pasta-alone reproducer.

  1. The transfer stalls completely and never finishes
  2. The transfer completes after roughly 30s, instead of the normal ~2s for a working run

I've been focusing on case (1), but (2) is seen as well.

As an experiment I tried entirely disabling use of TCP_WINDOW_CLAMP (on the grounds that while it made for a partial workaround, maybe it's also part of the triggering conditions). When I do that, I'm no longer able to reproduce symptom (1), but symptom (2) becomes much more common (more common than successful runs).

@dgibson
Collaborator

dgibson commented Oct 25, 2023

Tried a different experiment of not using TCP_WINDOW_CLAMP, but keeping the same window sizes we already set: if we're in STALLED state, we always adjust it upwards. This didn't appear to have any effect on the symptoms.

@dgibson
Collaborator

dgibson commented Oct 25, 2023

I tried double-checking that we're not somehow missing epoll events: I added instrumentation on a timer that did a recv() with MSG_PEEK on the sockets. After we're definitely in the stalled state, this still returns EAGAIN.

@dgibson
Collaborator

dgibson commented Nov 10, 2023

Hi @edsantiago ,

Sorry there have been no updates for a while. I've been looking into this with assistance from @sbrivio-rh and Paolo Abeni, and we think we finally figured it out.

It seems the cause of the stall is a kernel bug, where calling TCP_WINDOW_CLAMP at the wrong moment can permanently lock the TCP window to 0 bytes. However, it also appears that TCP_WINDOW_CLAMP probably didn't really do what we wanted it to, even before that bug was introduced. At least for the time being, it looks like the solution is not to use TCP_WINDOW_CLAMP at all in passt.

We have a draft fix here. Could you try it out and see if it works for you?

@edsantiago
Member Author

Nice work! I'd be happy to give it a spin, but is there any chance you could give me a scratch rpm? It's a lot easier for me to wget & rpm -Uvh than it is to git clone, figure out & install dependencies, figure out build process, figure out what to install where.

(Also, I don't work Fridays. I will be home part of tomorrow, though, and will be glad to instrument a test PR during a free moment).

@sbrivio-rh
Collaborator

Nice work! I'd be happy to give it a spin, but is there any chance you could give me a scratch rpm?

There you go:

https://passt.top/builds/copr/0%5E20231110.gcf3eeba/fedora-rawhide-x86_64/06618377-passt/passt-0%5E20231110.gcf3eeba-1.fc40.x86_64.rpm

By the way:

It's a lot easier for me to wget & rpm -Uvh than it is to git clone, figure out & install dependencies

no dependencies other than a standard C library.

figure out build process

make

@edsantiago
Member Author

Thanks. I ended up running my above reproducer in a 1mt, it ran fine for the lifetime of the VM. And each time I checked, it was O(5s), not 30. Your code LGTM, thank you!

@sbrivio-rh
Collaborator

Thanks. I ended up running my above reproducer in a 1mt, it ran fine for the lifetime of the VM. And each time I checked, it was O(5s), not 30. Your code LGTM, thank you!

This is now included in passt-0^20231110.g5ec3634-1.fc40 for Fedora Rawhide. Debian update is still pending.

@edsantiago
Member Author

Unfortunately, today's VM build picked up 2023-11-07, so the test remains skipped.

AkihiroSuda pushed a commit to AkihiroSuda/passt-mirror that referenced this issue Nov 21, 2023
On the L2 tap side, we see TCP headers and know the TCP window that the
ultimate receiver is advertising.  In order to avoid unnecessary buffering
within passt/pasta (or by the kernel on passt/pasta's behalf) we attempt
to advertise that window back to the original sock-side sender using
TCP_WINDOW_CLAMP.

However, TCP_WINDOW_CLAMP just doesn't work like this.  Prior to kernel
commit 3aa7857fe1d7 ("tcp: enable mid stream window clamp"), it simply
had no effect on established sockets.  After that commit, it does affect
established sockets but doesn't behave the way we need:
  * It appears to be designed only to shrink the window, not to allow it to
    re-expand.
  * More importantly, that commit has a serious bug where if the
    setsockopt() is made when the existing kernel advertised window for the
    socket happens to be zero, it will now become locked at zero, stopping
    any further data from being received on the socket.

Since this has never worked as intended, simply remove it.  It might be
possible to re-implement the intended behaviour by manipulating SO_RCVBUF,
so we leave a comment to that effect.

This kernel bug is the underlying cause of both the linked passt bug and
the linked podman bug.  We attempted to fix this before with passt commit
d3192f6 ("tcp: Force TCP_WINDOW_CLAMP before resetting STALLED flag").
However while that commit masked the bug for some cases, it didn't really
address the problem.

Fixes: d3192f6 ("tcp: Force TCP_WINDOW_CLAMP before resetting STALLED flag")
Link: containers/podman#20170
Link: https://bugs.passt.top/show_bug.cgi?id=74
Signed-off-by: David Gibson <[email protected]>
Signed-off-by: Stefano Brivio <[email protected]>

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Dec 19, 2023

@edsantiago any update on this one?

@edsantiago
Member Author

This is believed to be fixed in pasta-2023-11-10 but the CI VMs are still on 11-07 (even though they have been respun twice in the past few weeks). All we're doing is waiting for the new pasta rpm to propagate.

@rhatdan
Member

rhatdan commented Dec 19, 2023

Not sure I understand, but this looks like it is 2023-12-08:
IMAGE_SUFFIX: "c20231208t193858z-f39f38d13"

Or are you talking about something else?

@edsantiago
Member Author

Something else. That string is the date/time the VMs were built. The string I mentioned is the pasta version (it's confusing because pasta versions are YMD, not semantic versioning).

This table might help. This is my new little thing that displays package versions on all VM builds. Find the passt column. You can see that some OSes have newer pasta, but some still have old (1107). If run rootless, this pasta test would fail on that OS.

(Sorry about the dash in fedora. Long story short, the way my script finds versions is by looking for dnf/apt updates, and this OS is apparently keeping the default version, and I have no way to find out what that is). HTH.

@sbrivio-rh
Collaborator

This table might help. This is my new little thing that displays package versions on all VM builds. Find the passt column. You can see that some OSes have newer pasta, but some still have old (1107). If run rootless, this pasta test would fail on that OS.

Ah, yes, for Fedora 39 I waited a couple of weeks before submitting this update -- there's always the risk of obsoleting previous updates which are still in testing. This has been in stable for just 11 hours now.

I suppose it should be sorted on Debian too in a bit -- 0.0~git20231204.b86afe3-1 was accepted into unstable 10 days ago.

@edsantiago
Member Author

Thanks for the update!

Who wants me to spin up new VMs and cycle them into CI juuuuuust before a long holiday break? Anyone? Anyone?

@sbrivio-rh
Collaborator

Who wants me to spin up new VMs and cycle them into CI juuuuuust before a long holiday break? Anyone? Anyone?

I can remind you on Friday evening. Today... I don't know, it wouldn't feel professional enough.

@edsantiago
Member Author

PR #21129 merged today, with pasta 2023-12-04 in all VMs and the skip removed from that test. I've run a couple of extra CI laps, and all seems fine, so I'm going to close this and hope for the best. Thank you @dgibson and @sbrivio-rh!

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Apr 3, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 3, 2024
4 participants