import nacl.bindings hangs #327
Comments
Are you sure it successfully reinstalled without using the manylinux1 binary? Ubuntu 14.04.5 should have kernel 4.4, which definitely has |
Verbatim commands I'm using to reproduce the issue
uname -a
This was a 14.04.1 install that was upgraded to 14.04.5 but it looks like it still has the 4.2 kernel
|
Upon further research, it looks like although 14.04.5 ships with the 4.4 kernel by default, a dist-upgrade from an earlier 14.04 release will not pull in the 4.4 kernel automatically. This means a 14.04.5 system is not guaranteed to be running the 4.4 kernel.
Release notes (for new installs): https://wiki.ubuntu.com/TrustyTahr/ReleaseNotes
LTS Enablement notes stating that older installs need to be explicitly upgraded: https://wiki.ubuntu.com/Kernel/LTSEnablementStack#Ubuntu_14.04_LTS_-_Trusty_Tahr
Can this be adjusted to account for the lack of the 4.4 kernel on 14.04.x installs? |
Thanks for the followup, that's good to know. Altering the way libsodium attempts to determine if the entropy pool is initialized on the /dev/urandom path would require discussion upstream with the libsodium project itself, but we could potentially explore not running |
@colinmcintosh : just curious, which kind of environment is this? Could you try to run a
to verify if the file exists and is regularly updated at system restarts? |
Another question, if this is a virtual machine, which kind of hypervisor is running it? |
@lmctv here is the output of that command. I rebooted and it appears to be updated after the system restart.
This is a virtual machine running in VMware Workstation 12. |
From past experience, I was somewhat expecting we were talking about a VM; if running under qemu, my suggestion would have been to enable the I remember stumbling in some notes about available options to gather entropy from inside the VM ranging from I think this issue can be closed, and maybe referred from a new one about documenting the runtime requirement for a fully seeded entropy pool. |
Added a documentation tag. |
Shouldn't there be an option to disable this behavior and use /dev/urandom? libsodium seems to have compiler flags for this: setting NO_BLOCKING_RANDOM_POLL and not setting USE_BLOCKING_RANDOM. From what I read online, /dev/urandom should almost always be used instead of /dev/random, and there are only a few seconds at startup during which /dev/urandom isn't cryptographically secure. |
By default libsodium only uses the This obviously results in a bad user experience, but the underlying problem is not one pynacl/libsodium can fix, we can only wait. |
There seems to be a bug with entropy pool gathering with libsodium in which calls to /dev/random fail. This manifests itself in QEMU because it has no hardware entropy (and as a result containers run in VMs). As /dev/random and /dev/urandom are blocking devices this manifests itself in slowdowns for random, sporadic calls from QUADS. libsodium which is used by paramiko cannot use /dev/urandom (non-blocking entropy) so this moves us to use a software-based entropy service which is optional. #221
More about the nacl bindings / libsodium bug: PyNACL / Sodium: pyca/pynacl#327
Symptoms / cause:
======
By default libsodium only uses the /dev/random path to determine if the CSPRNG is initialized before it starts using /dev/urandom. It also prefers to use getrandom, which gives the same behavior for free (blocking only until the CSPRNG is ready). On normal systems this happens extremely rapidly, but there exists a weird long tail of systems that can have bad entropy for long periods of time
======
The pragmatic solution here is to instead use the haveged service to provide software entropy. https://issihosts.com/haveged/
* Make haveged a dependency on QUADS via RPM installation
* For containers include and start it in the Dockerfile
* haveged is optional, just turn it off if you want to use /dev/random if you're on bare-metal or feel you have sufficient entropy.
Included are other fixes/changes below:
* Add python3-ipdb to rpm spec requirements
* Specify version requirements for ipmitool and git
* Remove contactbank wp plugin, we're not using it and it may not even work right.
* Since HTTPD is required for serving visuals/instackenv.json make it also be a dependency of quads-server.service
* Demote docker compose from recommended status
* Correct RPM spec warnings about incorrect date format
* Remove waffle.io badge / references as they are closing up shop. https://waffle.io/closing-its-doors
Fixes: #221
Change-Id: I5c20defe12871e6399cf6b1ada659caf1a5e1b94
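For anyone wanting to observe the wait described above from Python, here is a minimal sketch. It assumes the check amounts to waiting for /dev/random to become readable, as the text above describes; the function name `dev_random_readable` and the timeout value are hypothetical and not part of pynacl or libsodium.

```python
import os
import select


def dev_random_readable(timeout_ms=1000, path="/dev/random"):
    """Return True if `path` becomes readable within timeout_ms,
    False on timeout, or None if the device cannot be opened."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return None
    try:
        poller = select.poll()
        poller.register(fd, select.POLLIN)
        # poll() returns an empty list when the timeout expires
        return bool(poller.poll(timeout_ms))
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(dev_random_readable())
```

A `False` result on an affected machine would be consistent with the import hang reported in this issue, since the wait would not have completed within the timeout.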
I've experienced this issue several times, at random. I've spent hours troubleshooting, debugging, checking things, and talking with various experts. Whenever we have hiccups in the system, I most often blame this bug. It can hang for many, many seconds, 30+. It hangs with this, from python -v...
Is there any movement or any other things to test on this? Thank you, running out of ideas. |
@zefoo if this is from a virtual machine, as was the case for @colinmcintosh, did you check whether any of the suggestions in #327 (comment) could help? |
This blog makes some good points: https://www.2uo.de/myths-about-urandom |
That blog post has nothing to say about this issue, which is that libsodium blocks in the presence of an uninitialized CSPRNG. This is desirable behavior, but frustrating if you happen to be on a device with a problem bootstrapping initial entropy. Once you get that initial seed urandom (and getrandom) will never block and are the sole source libsodium uses. |
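A quick way to check whether the kernel CSPRNG has been seeded, without risking a hang, is a non-blocking getrandom call. This is only a sketch: it assumes Linux and Python 3.6 or later (where `os.getrandom` and `os.GRND_NONBLOCK` exist), and the helper name `kernel_csprng_ready` is hypothetical.

```python
import os


def kernel_csprng_ready():
    """Return True if the kernel CSPRNG is seeded, False if it is not,
    or None if this platform cannot answer the question."""
    # os.getrandom is only available on Linux with Python 3.6+.
    if not hasattr(os, "getrandom"):
        return None
    try:
        # With GRND_NONBLOCK the call raises instead of blocking
        # while the CSPRNG is still uninitialized.
        os.getrandom(1, os.GRND_NONBLOCK)
        return True
    except BlockingIOError:
        return False
    except OSError:
        # For example, a kernel without the getrandom syscall.
        return None


if __name__ == "__main__":
    print(kernel_csprng_ready())
```

If this returns `False` shortly before an import hangs, that would point at the initial seeding problem described above rather than at pynacl itself.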
@lmctv @hypercodex @reaperhulk I really appreciate the two cents. This is not running on a virtual machine that I have host access to. I am seeing this on a Digital Ocean droplet. I've been trying to track it down for months. I see processes overlapping instead of properly dying off because loading nacl hangs (per my paste above). I don't exactly understand all of the moving parts here and am not sure what to do next. Any suggestions are appreciated, and I'm happy to do anything more that would be helpful. I started noticing this after upgrading to Ubuntu 18.04. Thank you. |
I was advised to apt-get install haveged. Before, I did |
If it's inherently insecure to not read /dev/random at startup, does that mean that OpenSSH is insecure? I can spawn dozens of ssh processes and none of them will hang, which implies that they're not reading /dev/random. If I try to spawn two Python processes that both import nacl.bindings, at least one of them will hang, since both will try to read /dev/random, and the first one to succeed will drain off all the entropy for the next 10 minutes or so on a Google Cloud VM instance. Starting dozens of Python processes that all import nacl.bindings is impossible. |
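To watch the entropy estimate while starting several processes, the kernel's counter can be read directly. A minimal sketch, assuming a Linux system that exposes `/proc/sys/kernel/random/entropy_avail`; the helper name `entropy_available` is hypothetical.

```python
def entropy_available(path="/proc/sys/kernel/random/entropy_avail"):
    """Return the kernel's entropy estimate in bits, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # File missing (non-Linux system) or unexpected contents
        return None


if __name__ == "__main__":
    print(entropy_available())
```

Sampling this value before and after each import would show whether the estimate really drops the way this comment describes.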
I am experiencing the same issue running PyNaCl in a docker container in a VM under qemu/kvm
So what are the suggested workarounds? Install |
This one fixed it for me:
|
We are facing this issue on a Linux VPS used for testing. The app is restarted after each code modification, which triggers a new init call to pynacl/libsodium. Sometimes (~30% of the time) it takes 40-90 seconds to start because of this "pause" during libsodium startup. Here's where it is when we press CTRL+C during the hang. Note that the shell prompt is returned only after the hang ends, so this is a genuinely blocking call.
This does not happen with "no-binary" and compiling from source with wheel. |
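To put numbers on how often and how long the import stalls, it can be timed in a fresh interpreter each run. A minimal sketch using only the standard library; the helper name `time_import` and the 120-second timeout are hypothetical choices.

```python
import subprocess
import sys
import time


def time_import(module, timeout=120.0):
    """Import `module` in a fresh interpreter.

    Returns (elapsed_seconds, returncode), or (None, None) on timeout.
    A non-zero returncode means the import itself failed.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(
            [sys.executable, "-c", "import " + module],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None, None
    return time.monotonic() - start, proc.returncode


if __name__ == "__main__":
    # nacl.bindings is the import reported as hanging in this issue
    print(time_import("nacl.bindings"))
```

Running it in a loop after each restart would give a rough distribution of the delay, which can then be compared between the binary wheel and a source build.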
Some info about the VPS system : |
As I noted in this issue in May 2019, this is entirely a CSPRNG initialization situation within libsodium itself. There is nothing pynacl can do here. That said, pynacl ships with a newer version of libsodium than buster does, so it's possible some details of that initialization have changed. |
I ran into an issue with another library hanging on import that appears to be caused by PyNaCl. I can reproduce the issue with
Expected Behavior: command should complete in, at most, a few seconds.
Actual Behavior: command takes roughly 90 seconds to complete
Running strace on the command shows that the import is hanging here
I found this closed issue and tried following the recommendation of running
pip install pynacl -I --no-binary pynacl
but the issue still occurs.
paramiko/paramiko#1023
OS: Ubuntu 14.04.5
Python Version: 2.7.6