podman hangs when sending large amounts of data on stdout #13779
Comments
I can reproduce. Conmon crashes with SIGSEGV.
It works correctly with @mheon @haircommander PTAL |
Adding |
Did journald terminate our connection because it felt we were spamming the logs? I was suspecting it might be a log pressure thing, denying the write because journald felt it was running out of cache or wasn't able to keep up with writes to disk, but the bad file descriptor errno makes it seem a lot more like the connection was entirely closed? |
A bit of googling leads me to https://docs.openshift.com/container-platform/4.7/logging/config/cluster-logging-systemd.html Most notable passage: "We recommend setting RateLimitIntervalSec=30s and RateLimitBurst=10000 (or even higher if necessary) to prevent the journal from losing entries." I think the journal is rate-limiting us. We should probably respond to this by applying backpressure on the logs (refusing to read from the TTY and letting it fill) while we wait for journald to start accepting our logs again. Exact details on the "wait for journald to start accepting logs again" will need to be figured out - do we just keep retrying until it succeeds? How do we tell if journald is rate-limiting us - it could just be broken (user SIGKILL'd the process, for example).
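For reference, the rate-limit settings quoted from the OpenShift documentation belong in the [Journal] section of journald's configuration. The following is a minimal sketch, assuming a drop-in file is acceptable on the host. The helper name write_journald_ratelimit and the file name ratelimit.conf are made up for illustration, and whether rate limiting is really what closes the connection is only a hypothesis at this point in the thread.

```shell
#!/bin/sh
# Hypothetical helper: write a journald drop-in with the rate-limit values
# quoted above into the directory given as $1.
write_journald_ratelimit() {
    dir="$1"
    mkdir -p "$dir"
    # RateLimitIntervalSec and RateLimitBurst are journald.conf options;
    # the values are the ones recommended in the quoted passage.
    cat >"$dir/ratelimit.conf" <<'EOF'
[Journal]
RateLimitIntervalSec=30s
RateLimitBurst=10000
EOF
}
```

On a real host the target directory would typically be /etc/systemd/journald.conf.d (which requires root), followed by a restart of systemd-journald so the new limits take effect.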
Why are we spamming the journal with There are also significant security/privacy implications here if we end up logging unhashed passwords, for example... |
The default log driver is journald, so by default all stdout/stderr will be captured. You can change it to file, but again everything will be captured. This is required to make
No application should ever log sensitive data!!! Podman just stores the output. It is no different than running your application directly in a systemd unit for example where systemd would store the output in the journal. |
My point is that plenty of applications might reasonably output sensitive data to the tty or stdout, be it either writing a password or some kind of other authentication token to the screen, or private messages from a secure messaging app that the user expected to be ephemeral. This logging (by podman) happens even in |
Well this is the default for Kubernetes, Podman and Docker. As well as systemd unit files. |
As described in containers/podman#13779, podman by default logs all of a container's stdout in order to retain info to make 'podman logs' work. But this extra copy of stdout can be overrun and cause hangs when shipping large amounts of data through stdout (as in the case when piping out a tar stream). Changing the instructions and documentation to add '--log-driver=none' to the podman commandline disables this extra unneeded logging and prevents hangs. Signed-off-by: Jim Ramsay <[email protected]>
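The workaround described in that commit message can be sketched as a small wrapper. This is an illustrative sketch, not the exact command from the referenced change: the function name export_rootfs and the PODMAN override are hypothetical, and the tar invocation simply mirrors the "piping out a tar stream" case mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: stream a tar archive of a container's root filesystem
# to stdout with podman's log copy disabled via --log-driver=none.
# PODMAN can be overridden (for example with "echo") for a dry run.
export_rootfs() {
    image="$1"
    "${PODMAN:-podman}" run --rm --log-driver=none "$image" tar c /
}

# Usage (not executed here):
#   export_rootfs docker.io/library/node:latest > rootfs.tar
```

With the log driver set to none, nothing is retained for podman logs, which is the trade-off the commit message accepts. Later comments in this thread report setups where this flag did not avoid the hang, so treat it as a mitigation rather than a fix.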
I'd vote for changing the default for |
As described in containers/podman#13779, podman by default logs all of a container's stdout in order to retain info to make 'podman logs' work. But this extra copy of stdout can be overrun and cause hangs when shipping large amounts of data through stdout (as in the case when piping out a tar stream). Changing the instructions and documentation to add '--log-driver=none' to the podman commandline disables this extra unneeded logging and prevents hangs. Signed-off-by: Jim Ramsay <[email protected]> (cherry picked from commit a2124e0)
Today I ran into the
Steps to reproduce
Result
Error: timed out waiting for file /home/username/.local/share/containers/storage/overlay-containers/<sha256 1>/userdata/<sha256 2>/exit/<sha256 1>: internal libpod error
Expected
Full output and not getting kicked out of the container.
Additional information
Comments here seem to indicate that setting --log-driver=none (or k8s-file) for podman-run will work around the issue. I could not reproduce that with this setup. For quicker reproduction during testing, you can use
As described in containers/podman#13779, podman by default logs all of a container's stdout in order to retain info to make 'podman logs' work. But this extra copy of stdout can be overrun and cause hangs when shipping large amounts of data through stdout (as in the case when piping out a tar stream). Changing the instructions and documentation to add '--log-driver=none' to the podman commandline disables this extra unneeded logging and prevents hangs. Signed-off-by: Jim Ramsay <[email protected]> Signed-off-by: Nishant Parekh <[email protected]>
A friendly reminder that this issue had no activity for 30 days. |
I believe this is fixed in the podman 4.1 |
This is not fixed, |
@Luap99 @haircommander Do we know if this is fixed in conmon-rs? |
I don't know! though, it would be sweet if it was |
Yeah, unless it is integrated into podman we cannot really test it. There is also still the podman issue, though: it just hangs when conmon has died. I know @vrothberg reworked the wait logic recently, so maybe this is fixed now?
Which logic are you referring to @Luap99? |
Your exit code work. I know there were changes in there to how we wait for the container exit as well.
I just tried it on main and it no longer hangs:
Nice catch connecting the dots, @Luap99 ! |
Hello, it seems that we are affected by this bug. Can we know when the release containing the fix will come out, as well as the commit associated with the fix? Thanks in advance,
The fix will be in Podman 4.2. It was resolved by #14685 |
I've been experiencing this issue lately. The problem "seems" to vanish after removing any
What I am doing
I'm building a set of Live CD components for both
How the issue manifests itself
When the builds take a long time (especially with
What seems to work around the problem
I "think" that I have successfully mitigated the issue by removing the
What I have tried without success
Still an issue with 4.4.3, using a kube YAML with a bash script as the entrypoint to compile AOSP. Compiling AOSP produces a lot of logging output on the terminal.
Same here :/
cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 20 | head -n 9999
...
Error: timed out waiting for file /home/<my_user_here>/.local/share/containers/storage/overlay-containers/xxxxx/userdata/xxxxx/exit/xxxxxx: internal libpod error
[~]$
(Here, I am dropped from inside the container to the terminal of my machine.)
$ podman version
Version: 3.4.4
API Version: 3.4.4
Go Version: go1.17.3
Built: Wed Dec 31 21:00:00 1969
OS/Arch: linux/amd64
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.2 LTS
Release: 22.04
Codename: jammy
$ dpkg -l podman
||/ Name Version Architecture Description
+++-==============-==================-============-==========================================
ii podman 3.4.4+ds1-1ubuntu1 amd64 engine to run OCI-based containers in Pods
@yveszoundi I am not using the --rm option: |
I fixed it by redirecting the log output into a file in my entrypoint script. |
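A minimal sketch of that kind of redirect, assuming a POSIX shell entrypoint. The helper name run_logged and the paths in the usage comment are hypothetical and not taken from the commenter's script.

```shell
#!/bin/sh
# Hypothetical helper: run a command with stdout and stderr appended to a
# log file, so the heavy output never reaches the container's stdout.
run_logged() {
    log_file="$1"
    shift
    # Remaining arguments are the command to run; 2>&1 sends stderr to the same file.
    "$@" >>"$log_file" 2>&1
}

# Usage inside an entrypoint (not executed here; ./build.sh is a placeholder):
#   run_logged /var/log/build.log ./build.sh
```

The output then has to be read from the file (for example through a mounted volume) instead of through podman logs.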
Is this still an issue, lost in the issue flood?
@rhatdan I still have this problem :/ |
This is not really a podman problem. The issue is that conmon crashes with a segfault, so it has to be fixed in conmon.
I still have this problem, any news? |
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
When trying to pipe large amounts of data via stdout, podman often hangs. There was a similar issue #9183 about that before and apparently it was fixed in containers/conmon#236 but I still see an extremely similar issue here, a year later, on new versions.
Steps to reproduce the issue:
This is a more or less 100% reliable hang for me:
If you want, you can also run it through pv to see how much data was written. It tends to stop after 3.23MiB for me:
Describe the results you received:
Both of those hang.
I'm pretty sure this isn't caused by tar getting stuck on reading a file, since, for example, podman run --rm docker.io/library/node:latest tar cf /dev/null / >/dev/null works fine and exits quickly.
Note as well that tar isn't running anymore by the time of the hang:
The hang can be interrupted by pressing ^C. When you do that, you immediately see this message:
followed exactly 5 seconds later by this:
...followed by an exit with status 130.
Describe the results you expected:
Not hanging.
Additional information you deem important (e.g. issue happens only occasionally):
Output of podman version:
Output of podman info --debug:
Package info (e.g. output of rpm -q podman or apt list podman):
Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)
No
Additional environment details (AWS, VirtualBox, physical, etc.):
Default install of Silverblue 36, standard issue RH ThinkPad X1 Gen 9.
Since this is potentially a bug in conmon again, the relevant versions: