Brick process crashed when global thread pool is enabled #3527
Comments
@xhernandez So if my assumption is wrong, in general, do you see why the
I needed to refresh my memory before fully understanding the issue (this code was written long ago). I think you are right. The order in which wake and create are executed is not good, because there's a race in which two threads could try to create another thread at the same time, and both use a non-thread-safe function. However, it was done this way to reduce the latency of the jobs pending in the queue: creating the thread before waking another one will increase the latency. The mistake here was using a shared stack to keep already-created threads. I need to figure out how to solve this. However, until I find a better solution, I think your patch is OK. I'll approve it.
We were sending the signal to transfer leadership before creating a new leader, which means there is a potential chance that the new thread creation will be executed twice, which is not what we intended. Fixes: gluster#3527 Change-Id: I8446ae3130894f351e6bd1197f26ce97d8cb1eef Signed-off-by: Mohammed Rafi KC <[email protected]>
Description of problem:
We have multiple crashes when the global thread pool is enabled. We enabled the thread pool on the brick side with 16 threads in the pool. We have also disabled io-threads.
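For reference, a configuration along these lines matches the setup described above. This is a hedged sketch: the exact option names vary between GlusterFS versions and are an assumption here, not taken from the report.

```shell
# Hypothetical sketch -- verify option names against your GlusterFS version.
# Enable the global thread pool with 16 brick-side threads:
gluster volume set <volname> config.global-threading on
gluster volume set <volname> config.brick-threads 16
# Disable the io-threads translator, as in the reported setup:
gluster volume set <volname> performance.io-threads off
```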
All the crashes point to a single backtrace, which is:
The exact command to reproduce the issue:
The full output of the command that failed:
Expected results:
Mandatory info:
- The output of the `gluster volume info` command:
- The output of the `gluster volume status` command:
- The output of the `gluster volume heal` command:
- Provide logs present on following locations of client and server nodes: `/var/log/glusterfs/`
- Is there any crash? Provide the backtrace and coredump
Additional info:
- The operating system / glusterfs version:
glusterfs-8.6
Note: Please hide any confidential data which you don't want to share in public, such as IP addresses, file names, hostnames, or any other configuration.