Thanks for the report! I'm not sure there's an easy fix, but if one turns up I'll give it a try. Hopefully this will help others find out what's going on, at least.
This is a pyzmq bug
What pyzmq version?
26.0.3
What libzmq version?
4.3.5
Python version (and how it was installed)
Python 3.9 via apt
OS
Debian
What happened?
Recently I ran into a bug in CPython that directly affects ZMQ. It triggers whenever asyncio debugging is enabled and the ZMQ future blocks for more than 0.1 seconds:
https://github.com/python/cpython/blob/7c2921844f9fa713f93152bf3a569812cee347a0/Lib/asyncio/base_events.py#L2021-L2023
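For readers unfamiliar with the trigger: in debug mode, asyncio logs a warning for any callback or task step that takes longer than `loop.slow_callback_duration` (0.1 seconds by default), and that log message includes the repr of the offending handle. The linked lines presumably point at this code path. Below is a minimal sketch that shows the warning firing without any ZMQ involved; the helper name `collect_slow_callback_warnings` and the 0.2 second blocking time are made up for illustration.

```python
import asyncio
import logging
import time


def collect_slow_callback_warnings(block_for=0.2):
    """Run a coroutine that blocks the loop and return asyncio's log messages.

    Hypothetical helper for illustration only; it does not use pyzmq.
    """
    messages = []

    class ListHandler(logging.Handler):
        def emit(self, record):
            # getMessage() formats the record, which calls repr() on the handle
            messages.append(record.getMessage())

    handler = ListHandler()
    logger = logging.getLogger("asyncio")
    old_level = logger.level
    logger.addHandler(handler)
    logger.setLevel(logging.WARNING)

    async def blocker():
        # Blocking sleep on purpose: the event loop cannot run anything else
        time.sleep(block_for)

    try:
        # debug=True has the same effect as PYTHONASYNCIODEBUG=1 for this loop
        asyncio.run(blocker(), debug=True)
    finally:
        logger.removeHandler(handler)
        logger.setLevel(old_level)
    return messages


if __name__ == "__main__":
    for message in collect_slow_callback_warnings():
        print(message)
```

With a plain coroutine the repr is cheap and the warning is harmless. The hang described in this issue only appears when the object being formatted is a task waiting on a ZMQ future.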
The bug is an unintended recursion that happens when repr() is called on an asyncio task. ZMQ's future stores references to other futures, including itself, which creates a circular reference. Because each new layer of recursion must iterate over multiple futures, a RecursionError is never reached; instead the process effectively deadlocks, with the CPU stuck at 100%.
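To get a feel for why a branching recursion over a cycle can spin instead of failing fast, here is a toy model. It is not asyncio's or pyzmq's actual repr code: the graph, the node count, and the explicit depth cap are all made-up stand-ins. Every node references every node (including itself), and a naive walk counts how many nodes it touches before the cap stops it.

```python
def count_visits(num_nodes, max_depth):
    """Count nodes touched by a naive walk over a fully cyclic graph.

    Toy model only: every node references all nodes, itself included,
    and the walk has no "already seen" check, only a depth cap.
    """
    graph = {node: list(range(num_nodes)) for node in range(num_nodes)}
    visits = 0

    def walk(node, depth):
        nonlocal visits
        visits += 1
        if depth == max_depth:
            return
        for child in graph[node]:
            walk(child, depth + 1)

    walk(0, 0)
    return visits


if __name__ == "__main__":
    for depth in (5, 10, 15):
        print(depth, count_visits(3, depth))
```

The number of visits is 1 + n + n^2 + ... + n^depth, so with even a handful of mutually referencing objects the work explodes long before a deep recursion limit would end it.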
This is mainly a bug in CPython, and it was fixed in 3.11. However, 3.10 and earlier are still vulnerable, and based on the feedback on the CPython issue, the fix will not be backported to those older versions.
python/cpython#122296
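Given the conditions above (Python older than 3.11 with asyncio debugging enabled), a quick check for whether an environment might be exposed could look like the sketch below. The function name `may_be_affected` is hypothetical, and it only inspects the PYTHONASYNCIODEBUG environment variable; debug mode switched on in code, for example via `loop.set_debug(True)` or `asyncio.run(..., debug=True)`, is not detected.

```python
import os
import sys


def may_be_affected(version_info=None, environ=None):
    """Return True if this looks like an environment where the hang can occur.

    Hypothetical helper: checks only the Python version and the
    PYTHONASYNCIODEBUG environment variable.
    """
    version_info = sys.version_info if version_info is None else version_info
    environ = os.environ if environ is None else environ
    # asyncio treats any non-empty value of the variable as "enabled"
    debug_env = bool(environ.get("PYTHONASYNCIODEBUG"))
    return tuple(version_info[:2]) < (3, 11) and debug_env


if __name__ == "__main__":
    print(may_be_affected())
```

The optional parameters exist only so the check can be exercised with other versions and environments than the running one.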
I'm creating this issue so you're aware of it, and so anyone else googling for the issue can find it. This one was a beast to track down, since it only happens when PYTHONASYNCIODEBUG=1 and when the ZMQ future blocks for more than 0.1 seconds. Hopefully it's helpful to someone.
Full traceback: python_traceback.txt
Code to reproduce bug
Traceback, if applicable
No response
More info
No response