Wurlitzer hangs when the GIL is taken by C code #20

Closed
Lothiraldan opened this issue Jul 20, 2018 · 4 comments · Fixed by #83

Comments

@Lothiraldan

Hello,

I had to debug an issue in a library where I use Wurlitzer. The script I wanted to monitor uses http://caffe.berkeleyvision.org/, which combines some C code with a Python API. This C code writes a lot of output directly from C, and the GIL is not released before the C code runs.

The result was that the process froze: the main thread was trying to write to a full pipe, and the wurlitzer consumer threads were blocked waiting for the GIL.
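To illustrate why a write to the pipe can block at all, here is a minimal sketch that measures how much data fits into a fresh pipe before a write would block. The helper name `pipe_capacity` is made up for this example and is not part of wurlitzer. It assumes a POSIX system.

```python
import os


def pipe_capacity():
    """Return how many bytes fit in a fresh pipe before a write would block."""
    r, w = os.pipe()
    try:
        # Non-blocking mode: a full pipe raises BlockingIOError instead of hanging.
        os.set_blocking(w, False)
        total = 0
        chunk = b"x" * 4096
        while True:
            try:
                total += os.write(w, chunk)
            except BlockingIOError:
                return total
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(pipe_capacity())
```

On Linux the default is typically 64 KiB. Once that much unread output is sitting in the pipe, a blocking writer stalls until a reader drains it, and here the reader threads cannot run.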

I've attached a redacted thread dump from gdb (all the Python threads are waiting for the GIL, I can send the non-redacted thread dump if needed):
thread-dumps.log

I've reworked our output monitoring code to reproduce wurlitzer's architecture, but using multiprocessing.Process instead of threads. This way the consumer processes have their own GIL and can run even while the main process is running C code. I also had to monkeypatch sys.std{out,err} to force flushing, so that the last lines of std{out,err} are captured before the end of the script.
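A rough sketch of the process-based idea, not the actual reworked code: the names `_drain` and `capture_fd_in_process` are hypothetical, and it assumes a POSIX system where the `fork` start method is available. A child process drains the pipe into a file, so it needs no cooperation from the parent's GIL.

```python
import multiprocessing as mp
import os
import sys
import tempfile


def _drain(read_fd, write_fd, out_path):
    """Runs in the child: copy everything from the pipe into a file."""
    os.close(write_fd)  # drop the inherited write end so EOF can be seen
    with os.fdopen(read_fd, "rb", buffering=0) as src, open(out_path, "ab") as dst:
        for chunk in iter(lambda: src.read(65536), b""):
            dst.write(chunk)
            dst.flush()  # make output visible while the parent is still running


def capture_fd_in_process(func, out_path, fd=1):
    """Run func() with `fd` redirected to a pipe drained by a separate process."""
    read_fd, write_fd = os.pipe()
    reader = mp.get_context("fork").Process(
        target=_drain, args=(read_fd, write_fd, out_path)
    )
    reader.start()
    os.close(read_fd)  # only the child reads
    sys.stdout.flush()
    saved = os.dup(fd)
    os.dup2(write_fd, fd)
    os.close(write_fd)
    try:
        return func()
    finally:
        sys.stdout.flush()  # push Python-level buffered text into the pipe
        os.dup2(saved, fd)  # restoring fd closes the last write end
        os.close(saved)
        reader.join()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "captured.log")
        # os.write stands in for output written directly by C code.
        capture_fd_in_process(lambda: os.write(1, b"hello from fd 1\n"), path)
        with open(path, "rb") as f:
            print(f.read())
```

The explicit flush in the `finally` block plays the same role as the forced flushing mentioned above: without it, text still buffered in `sys.stdout` could be written after the descriptor has been restored.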

I also tried to increase the pipe buffer size, taking inspiration from https://github.com/wal-e/wal-e/blob/cdfaa92698ba19bbbff23ab2421fb377f9affb60/wal_e/pipebuf.py#L56. It did work, but stdout / stderr was only captured after the C code finished, and in my case that can take hours. So that was not a solution for me.
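For reference, one way to enlarge a pipe buffer on Linux is the `F_SETPIPE_SZ` fcntl. The sketch below is an assumption about how that can be done, not a copy of the linked helper. `enlarge_pipe` is a hypothetical name, and the numeric fallbacks 1031 and 1032 are the Linux values, used when the running Python does not expose the constants.

```python
import fcntl
import os

# Linux-specific commands; newer Python versions expose them on the fcntl module.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def enlarge_pipe(fd, size=1 << 20):
    """Ask the kernel for a bigger pipe buffer.

    Returns the buffer size in effect afterwards, or None if the platform
    does not support querying it.
    """
    try:
        fcntl.fcntl(fd, F_SETPIPE_SZ, size)
    except OSError:
        pass  # request refused (e.g. over the system limit) or unsupported
    try:
        return fcntl.fcntl(fd, F_GETPIPE_SZ)
    except OSError:
        return None


if __name__ == "__main__":
    r, w = os.pipe()
    print(enlarge_pipe(w))
    os.close(r)
    os.close(w)
```

A larger buffer only postpones the moment the pipe fills up. Nothing reads it while the GIL is held, which matches the observation that output only appeared after the C code had finished.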

The changes are pretty big and I'm not sure whether they belong in wurlitzer. What do you think? In any case, I wanted to let you know about the potential issue so you can document it or apply the wal-e solution to increase the pipe buffer size.

@minrk
Owner

minrk commented Jul 30, 2018

Thanks! I think we can hold this as a "known issue" for now. I suspect we should be able to work out what call is blocking. It's quite surprising to me that anything in wurlitzer would hold the GIL and block, since it's only FD-writing calls, which shouldn't hold the GIL while they block in CPython. Maybe what's blocking is the write to the pipe on sys.stdout if the GIL never allows the reader thread to wake?

If you can make a minimal working example, ideally with a little pure c/ctypes/cython code to trigger the problem, that would help a lot. I hope finding the blocking call and making it non-blocking somehow will alleviate the issue.

@Lothiraldan
Author

Hi, I also think the issue is that the write is blocking: the main thread, which is executing C code and holding the GIL, blocks because the pipe is full, and the reader never runs because the GIL is still held by the main thread.
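The starvation half of this can be shown without any pipe. In the sketch below, `ctypes.PyDLL` stands in for an extension that keeps the GIL during a C call, while `ctypes.CDLL` releases it. The helper `ticks_during_sleep` is hypothetical, and the example assumes a POSIX libc providing `usleep` and a regular (GIL-enabled) CPython build.

```python
import ctypes
import threading
import time


def ticks_during_sleep(lib, seconds=0.3):
    """Count how often a background thread runs while lib.usleep() sleeps."""
    ticks = 0
    stop = threading.Event()

    def worker():
        nonlocal ticks
        while not stop.is_set():
            ticks += 1
            time.sleep(0.001)

    t = threading.Thread(target=worker)
    t.start()
    time.sleep(0.05)  # let the worker get going
    before = ticks
    lib.usleep(int(seconds * 1_000_000))  # the "C code" under test
    after = ticks
    stop.set()
    t.join()
    return after - before


if __name__ == "__main__":
    released = ticks_during_sleep(ctypes.CDLL(None))  # GIL released during the call
    held = ticks_during_sleep(ctypes.PyDLL(None))     # GIL kept during the call
    print("GIL released:", released, "GIL held:", held)
```

With the GIL released the worker keeps ticking throughout the call. With the GIL held it makes essentially no progress, which is the position the wurlitzer reader threads are in while the C code runs.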

I will try making a minimal working example reproducing the issue.

@Lothiraldan
Author

I pushed a repository that seems to reproduce the issue: https://github.com/Lothiraldan/wurlitzer-c-gil

Running python script.py with Cython installed is sufficient to reproduce the issue locally. I couldn't get debugging info from gdb on my local machine but I will try to confirm that the wurlitzer thread is blocked while waiting for the GIL.

@Lothiraldan
Author

I have updated https://github.com/Lothiraldan/wurlitzer-c-gil as I wasn't sure I was reproducing the same issue. The GDB traces now seem closer to the original case where I saw the issue.

In order to reproduce:

python3 setup.py build_ext --inplace
python3 helloworld_in_c.py

It should hang, and gdb should show that thread 3 is waiting on the GIL.
