
posix: A brick process is getting crashed at the time of graph reconfigure #1794

Closed
mohit84 opened this issue Nov 16, 2020 · 1 comment · Fixed by #1804

mohit84 commented Nov 16, 2020

The brick process crashes at the time of graph reconfigure in the io_uring code path. It seems that when posix_reconfigure is called, it calls posix_io_uring_off, which eventually calls posix_io_uring_fini; the fini function does not check whether the io_uring object was ever configured, so it crashes (a standalone sketch of this failure mode follows the backtrace below).

warning: Unexpected size of section `.reg-xstate/2316867' in core file.
#0 0x00007f77faaedcba in io_uring_get_sqe () from /lib64/liburing.so.1
[Current thread is 1 (Thread 0x7f77fb34a700 (LWP 2316867))]
(gdb) bt
#0 0x00007f77faaedcba in io_uring_get_sqe () from /lib64/liburing.so.1
#1 0x00007f77fab2ff83 in posix_io_uring_drain (priv=0x7f77ec0723d0) at posix-io-uring.c:550
#2 posix_io_uring_fini (this=<optimized out>) at posix-io-uring.c:567
#3 0x00007f77fab30126 in posix_io_uring_off (this=<optimized out>) at posix-io-uring.c:612
#4 0x00007f77fab2a2b5 in posix_reconfigure (this=0x7f77ec008a40, options=0x7f77ec0bcc48) at posix-common.c:402
#5 0x00007f780d2a4c16 in xlator_reconfigure_rec (old_xl=0x7f77ec008a40, new_xl=0x7f77ec12feb0) at options.c:1126
#6 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec00c6a0, new_xl=0x7f77ec18af00) at options.c:1110
#7 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec00e7e0, new_xl=0x7f77ec10f990) at options.c:1110
#8 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec010ba0, new_xl=0x7f77ec0dde20) at options.c:1110
#9 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec012ae0, new_xl=0x7f77ec0defc0) at options.c:1110
#10 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec014690, new_xl=0x7f77ec18c0a0) at options.c:1110
#11 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec0163d0, new_xl=0x7f77ec088a50) at options.c:1110
#12 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec0184e0, new_xl=0x7f77ec1782e0) at options.c:1110
#13 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec01a1c0, new_xl=0x7f77ec0878b0) at options.c:1110
#14 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec01bed0, new_xl=0x7f77ec0e1300) at options.c:1110
#15 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec01dba0, new_xl=0x7f77ec177110) at options.c:1110
#16 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec01f770, new_xl=0x7f77ec086680) at options.c:1110
#17 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec0214b0, new_xl=0x7f77ec08ae60) at options.c:1110
#18 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec023ce0, new_xl=0x7f77ec0d4180) at options.c:1110
#19 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec025c10, new_xl=0x7f77ec0cf8d0) at options.c:1110
#20 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec027d70, new_xl=0x7f77ec0d1d70) at options.c:1110
#21 0x00007f780d2a4b8a in xlator_reconfigure_rec (old_xl=0x7f77ec029ee0, new_xl=0x7f77ec10e7f0) at options.c:1110
#22 0x00007f780d2a7aa9 in xlator_tree_reconfigure (old_xl=<optimized out>, new_xl=<optimized out>) at options.c:1154
#23 0x00007f780d288b67 in glusterfs_graph_reconfigure (oldgraph=oldgraph@entry=0x7f77ec003620, newgraph=newgraph@entry=0x7f77ec0c8900)
at graph.c:1129
#24 0x00007f780d288d9a in glusterfs_volfile_reconfigure (newvolfile_fp=newvolfile_fp@entry=0x7f77ec0023f0, ctx=ctx@entry=0x10892c0)
at graph.c:951
#25 0x0000000000411294 in mgmt_getspec_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7f77ec091098)
at glusterfsd-mgmt.c:2255
#26 0x00007f780d1f9dc2 in rpc_clnt_handle_reply (clnt=clnt@entry=0x1136850, pollin=pollin@entry=0x7f77ec081080) at rpc-clnt.c:759
#27 0x00007f780d1fa115 in rpc_clnt_notify (trans=0x1136a70, mydata=0x1136880, event=<optimized out>, data=0x7f77ec081080)
at rpc-clnt.c:926
#28 0x00007f780d1f6dc6 in rpc_transport_notify (this=this@entry=0x1136a70, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED,
data=data@entry=0x7f77ec081080) at rpc-transport.c:520
#29 0x00007f77fbc18f78 in socket_event_poll_in_async (xl=<optimized out>, async=async@entry=0x7f77ec081198) at socket.c:2502
#30 0x00007f77fbc1e56c in gf_async (cbk=0x7f77fbc18f50 <socket_event_poll_in_async>, xl=<optimized out>, async=0x7f77ec081198)
at ../../../../libglusterfs/src/glusterfs/async.h:189
#31 socket_event_poll_in (notify_handled=true, this=0x1136a70) at socket.c:2543
#32 socket_event_handler (event_thread_died=0 '\000', poll_err=0, poll_out=<optimized out>, poll_in=<optimized out>, data=0x1136a70,
gen=1, idx=1, fd=<optimized out>) at socket.c:2934
#33 socket_event_handler (fd=fd@entry=10, idx=idx@entry=1, gen=gen@entry=1, data=data@entry=0x1136a70, poll_in=<optimized out>,
poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000') at socket.c:2854
#34 0x00007f780d2b0bab in event_dispatch_epoll_handler (event=0x7f77fb348fe4, event_pool=0x10bf7d0) at event-epoll.c:640
#35 event_dispatch_epoll_worker (data=0x1138c40) at event-epoll.c:751
#36 0x00007f780cfdb432 in start_thread (arg=<optimized out>) at pthread_create.c:477
#37 0x00007f780cc1a9d3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
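
For illustration, here is a minimal standalone sketch (a hypothetical demo, not GlusterFS code) of the failure mode behind frame #0: calling io_uring_get_sqe() on a ring that was never initialized with io_uring_queue_init() dereferences NULL submission-queue pointers, which is the same class of crash an unguarded posix_io_uring_fini() hits when draining a ring that was never configured.

    /* Hypothetical standalone demo; build with: gcc demo.c -luring */
    #include <liburing.h>
    #include <string.h>

    int
    main(void)
    {
        struct io_uring ring;

        /* A zeroed ring mimics a brick whose io_uring state was never
         * initialized: the mmap'd submission-queue pointers are all NULL. */
        memset(&ring, 0, sizeof(ring));

        /* liburing reads the (NULL) SQ head/tail pointers here, so this
         * segfaults inside io_uring_get_sqe(), as in frame #0 above. */
        io_uring_get_sqe(&ring);

        return 0;
    }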

Reproducer:

  1. Set up a 1x3 volume.
  2. Run an add-brick operation.
    The volume status shows "N/A" because the brick process has crashed.

mohit84 commented Nov 16, 2020

@itisravi
Can you please send a patch? After putting a condition in posix_io_uring_off to check the value of priv->io_uring_capable, I am able to avoid the crash.
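
A minimal sketch of that guard, assuming simplified stand-ins for the GlusterFS types (only posix_io_uring_off, posix_io_uring_fini, and priv->io_uring_capable come from this report; everything else is illustrative):

    #include <stdbool.h>

    /* Simplified stand-ins for the real GlusterFS structs (assumptions). */
    struct posix_private {
        bool io_uring_capable; /* true only if io_uring was initialized */
        /* ... */
    };
    typedef struct xlator {
        struct posix_private *private;
    } xlator_t;

    void posix_io_uring_fini(xlator_t *this); /* drains and tears down the ring */

    void
    posix_io_uring_off(xlator_t *this)
    {
        struct posix_private *priv = this->private;

        /* Guard: only tear down io_uring state that was actually set up;
         * otherwise fini would drain a ring that was never initialized and
         * crash in io_uring_get_sqe(). */
        if (priv->io_uring_capable) {
            posix_io_uring_fini(this);
            priv->io_uring_capable = false;
        }
    }

The merged fix (#1804, commits below) takes the same approach: posix_io_uring_fini is called only if io_uring was inited to begin with.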

itisravi self-assigned this Nov 16, 2020
itisravi added a commit to itisravi/glusterfs that referenced this issue Nov 16, 2020
Call posix_io_uring_fini only if it was inited to begin with.

Fixes: gluster#1794
Reported-by: Mohit Agrawal <[email protected]>
Signed-off-by: Ravishankar N <[email protected]>

Change-Id: I0e840b6b1d1f26b104b30c8c4b88c14ce4aaac0d
pranithk pushed a commit that referenced this issue Nov 17, 2020
Call posix_io_uring_fini only if it was inited to begin with.

Fixes: #1794
Reported-by: Mohit Agrawal <[email protected]>
Signed-off-by: Ravishankar N <[email protected]>

Change-Id: I0e840b6b1d1f26b104b30c8c4b88c14ce4aaac0d