PAM Auth with interactive input is broken. #49028
Comments
I suspect that #43756 might have improved or possibly resolved this issue. Running through the same reproduction steps as above, but using 17.0.1 instead of 16.4.7 I see the following:
Edit: Hmm, maybe not, this only seems to get this far if |
I've been able to dig into this a bit further, and it appears that nothing has changed with regard to PAM or Go runtime/cgo behavior. This looks to be a deadlock in handling When an interactive PAM auth is attempted, we execute the auth modules while processing the |
I believe this stopped working as a result of #29279.
#29279 caused PAM to deadlock when performing interactive authentication. To restore the previous semblance of functional PAM, this reverts waiting for PAM to be complete if BPF is disabled. #29279 was specifically added to prevent systemd, which may be invoked via a PAM module, from moving the exec subprocess to a different cgroup. Since cgroups are not used outside of Enhanced Session Recording, this is a stop-gap measure that gives most PAM users an immediate restoration of behavior while a more long-term and sane approach to performing PAM during the SSH handshake can be considered, evaluated, and tested. Closes #49028.
Expected behavior:
On a Teleport node with PAM support and PAM auth enabled, users should be presented with output from configured PAM directives, and should be able to provide interactive auth input (for example, to a PAM module that prompts for an OTP code).
Current behavior:
When attempting to open an SSH session, the session terminates after around 20 seconds. The Teleport node logs "Child process never became ready," and "trace.aggregate timed out waiting for continue signal." The user is presented with a generic "EOF" message.
Bug details:
Recreation steps
To reproduce this error, and demonstrate it was working with 7.x, I did the following.
/etc/pam.d/teleport
/etc/pam.d/otp_banner.sh
/etc/pam.d/otp_check.sh
/etc/teleport.yaml
Note: My 16.4.7 lab is a typical node instance that is connected to a cluster. My Teleport 7.3.26 instance was an auth+proxy+node all in one teleport.yaml file. The ssh_service pam configuration was the same in both clusters. The Teleport 7 instance used an alternate data_dir and bind address so both versions could run on the same host at the same time:
sudo /path/to/7.3.26/bin/teleport start -d -c /etc/teleport7.yaml
All tsh commands were run with a tsh version that matches the cluster version.
Expected behavior (as seen in Teleport 7.3.26):
Current behavior (as seen in Teleport 16.4.7):
The tsh command hangs for 20 seconds before the EOF message appears.
tsh debug messages
Here are the debug logs on the 16.4.7 ssh node:
Strace analysis
When reviewing the behavior of both versions of Teleport via strace, there were a few subtle differences in which calls were made. The pid that actually launches otp_banner.sh is cloned in both versions.
This is an strace snippet from the non-working version. It is nearly identical to the working version:
The new version has an mmap call that seems to be absent in the old version. The clone options are also slightly different.
I think there may be a change in how Go and/or cgo handles Go threads and loading libraries, which leads to the callback never being processed for this type of PAM message.