concurrency issue on MRI #47
@efivash wrote:
@efivash @mperice I am pretty sure this is a deadlock caused by the GIL (Global Interpreter Lock). It's not the first time I've seen this. The provided spec passes on JRuby. We need to brainstorm on this. The connector is meant to be thread-safe and reused for performance reasons, and it provides connection pooling. You could try playing with different pool sizes (see neo4j-ruby-driver/ffi/neo4j/driver/config.rb, lines 22 to 34 in 01fab91).
Hmm, the thing is I first started seeing the connection hang issue when I had only 2 threads (2 Sidekiq workers), and the connections hung basically on the first two write transactions (right after starting the Sidekiq workers). So I don't think I exceeded the connection pool size. However, I was connecting to a remote Neo4j server (average latency of around 40 ms). When I tried the same setup against my local Neo4j database, I couldn't reproduce the issue with only 2 threads. That is why I created the "synthetic" code with many more threads. It seems that there might be two different issues here.
Yeah, I don't think I'm hitting the connection pool limit either. I'm hitting this with 2 threads and a pool size of 100. I don't run into this deadlock when each thread has its own driver, though I'm not sure how useful that information is.
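The "each thread has its own driver" observation above can be sketched roughly as follows. This is a minimal illustration, not part of the driver: `PerThreadDriver` is a hypothetical helper, and the block passed to `fetch` is assumed to build a driver however the application already does (a plain `Object.new` stands in for it here).

```ruby
# Hypothetical helper (not part of neo4j-ruby-driver): hands each thread its
# own driver instance, created lazily by the block on first use.
module PerThreadDriver
  KEY = :per_thread_neo4j_driver

  def self.fetch(&factory)
    driver = Thread.current.thread_variable_get(KEY)
    return driver if driver

    driver = factory.call
    Thread.current.thread_variable_set(KEY, driver)
    driver
  end
end

# Stand-in factory: a real application would construct its driver here.
reused = Thread.new do
  first  = PerThreadDriver.fetch { Object.new }
  second = PerThreadDriver.fetch { Object.new }
  first.equal?(second) # same thread, same instance
end.value

puts "driver reused within one thread: #{reused}"
```

Note that this gives up the shared connection pool the driver is designed around, and per the reports in this thread it only sidesteps the hang rather than addressing its cause.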
I can confirm that we're also seeing this problem on MRI with three threads running.
If anyone could confirm where in the code the deadlocked threads wait, that would be very helpful in potentially finding a solution.
@klobuczek I took @efivash's […] 4 threads are waiting for the GVL, one is waiting for a mutex inside libseabolt. The five threads start with […] Not sure if I'm interpreting this correctly, but the deadlock might be happening because the thread that's waiting for the seabolt mutex has the GVL. The thread that is waiting on the mutex is stuck in the function […] One issue that is not clear: how did 3 of the threads get past the mutex in the […]
Hello, I have the same problem. Do you think there is a chance it will be solved? And if so, when?
Hi! Any news or a temporary fix for this, guys?
It happens to me in a Rails app when Puma has more than 1 thread and I make multiple requests at the same time. So I have set Puma to 1 thread for now, but a fix would be appreciated so that I can use Sidekiq and restore Puma to 5 threads.
We're running into this as well. @wmene you're right on regarding the GVL and the mutex. I'm also not sure about those three threads, but the thread that IS waiting on the mutex has the GVL, and the other 3 are waiting to acquire the GVL. So, you're deadlocked. We have the same problem, and it's ultimately because […]
Meanwhile: […]
So Thread A has the mutex and wants the GVL, while Thread B has the GVL and wants the mutex. Deadlock. Our example is a bit easier to read since it's only two threads blocked up, not 4. Check out this gist: https://gist.github.com/joshjordan/a53b01b5feb7a0920efd4c3e8982e450 Here, Thread 10 is A from my example, and Thread 9 is B. @klobuczek is there any simple way to detach the logging functions before calling the connection pool acquire and release functions? Or does this change need to be made at the seabolt level?
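The lock-order inversion described in this comment can be reproduced in miniature. The sketch below is only an analogy: two ordinary Ruby `Mutex` objects stand in for the GVL and the libseabolt mutex (neither is the real lock), and all names are made up for illustration. Each thread takes one lock and then tries for the other with a timeout, so the demo reports the stall instead of hanging.

```ruby
# Stand-ins only: plain Ruby mutexes model the GVL and the seabolt mutex.
GVL_STAND_IN = Mutex.new
SEABOLT_MUTEX_STAND_IN = Mutex.new

# Poll try_lock until the deadline; true if the lock was acquired.
def acquire_with_timeout(mutex, seconds)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + seconds
  until mutex.try_lock
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep 0.01
  end
  true
end

def run_lock_order_demo(timeout: 0.2)
  holding = Queue.new # workers report that they hold their first lock
  start   = Queue.new # main lets both workers go for the second lock
  results = Queue.new # [name, got_second_lock] pairs
  release = Queue.new # main tells workers to drop their first lock

  worker = lambda do |name, first, second|
    Thread.new do
      first.lock
      holding << name
      start.pop
      got_second = acquire_with_timeout(second, timeout)
      second.unlock if got_second
      results << [name, got_second]
      release.pop # keep holding the first lock until both have reported
      first.unlock
    end
  end

  # Thread A: holds the "seabolt mutex", wants the "GVL".
  a = worker.call(:thread_a, SEABOLT_MUTEX_STAND_IN, GVL_STAND_IN)
  # Thread B: holds the "GVL", wants the "seabolt mutex".
  b = worker.call(:thread_b, GVL_STAND_IN, SEABOLT_MUTEX_STAND_IN)

  2.times { holding.pop }
  2.times { start << :go }
  outcome = 2.times.map { results.pop }.to_h
  2.times { release << :done }
  [a, b].each(&:join)
  outcome
end

outcome = run_lock_order_demo
puts outcome.inspect # neither thread obtains its second lock
```

Without the timeout, both workers would wait forever, which is the shape of the hang reported here. Either acquiring the locks in a consistent order or never needing the second lock while holding the first breaks the cycle.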
This is particularly unfortunate because the default behavior is for the […] On first look, it doesn't seem that it's safe to just disable logging during acquire/release, since multiple […] The positive side of the no-op logging statements is that there isn't much impact to disabling logging. For anyone looking to escape this issue short-term, check out my branch at https://github.com/joshjordan/neo4j-ruby-driver/tree/temp-disable-logging-fix-deadlock which disables all logging callbacks from seabolt to […]
Thank you @joshjordan for the spot-on analysis. There is no easy solution to the problem. We would have to implement the logging function in C and make it write to the same file as the Ruby logger.
The logging has been removed from the FFI version of the driver (1.7). The error should no longer occur, with the caveat of reduced logging.