Verbs hangs in ibv_reg_mr() / fi mr caching issue #5687
Comments
I believe I see the issue, and I'll work on a fix.
shefty
added a commit
to shefty/libfabric
that referenced
this issue
Mar 13, 2020
When we build a new cache entry (via util_mr_cache_create), we allocate memory and register the region with the underlying provider. This can generate monitor notifications, for example by intercepting the alloc calls. Because the notifications acquire the cache lock in order to flush unusable entries, we cannot hold that same lock while building the entry, or deadlock can occur. This has been seen by applications. See issue ofiwg#5687.
To handle this, we build new cache entries outside of the lock, and only acquire the lock when inserting them back into the cache. This opens a race condition where a conflicting entry can be inserted into the cache between the first find() call and the insert() call. We expect such occurrences to be rare, as they require a multi-threaded app to post transfers referencing the same region simultaneously from multiple threads.
To handle the race, we need to repeat the find() check after building the new entry and before inserting it. If a conflict is found, we abort the insertion and restart the entire higher-level search operation.
Signed-off-by: Sean Hefty <[email protected]>
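The locking pattern described in this commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the libfabric code: `struct cache`, `cache_find`, `entry_build`, `cache_search`, and the `race_retries` counter are hypothetical names, the cache is a plain linked list, and `entry_build` merely stands in for allocation plus provider registration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified cache: a linked list keyed by (addr, len). */
struct entry {
    const void *addr;
    size_t len;
    struct entry *next;
};

struct cache {
    pthread_mutex_t lock;
    struct entry *head;
    int race_retries; /* restarts caused by a conflicting insert */
};

static void cache_init(struct cache *c)
{
    pthread_mutex_init(&c->lock, NULL);
    c->head = NULL;
    c->race_retries = 0;
}

/* Caller must hold c->lock. */
static struct entry *cache_find(struct cache *c, const void *addr, size_t len)
{
    for (struct entry *e = c->head; e; e = e->next)
        if (e->addr == addr && e->len == len)
            return e;
    return NULL;
}

/* Stand-in for "allocate and register". It runs WITHOUT the cache lock,
 * because in the scenario above it may trigger notifications that take
 * the lock themselves. */
static struct entry *entry_build(const void *addr, size_t len)
{
    struct entry *e = calloc(1, sizeof(*e));
    if (e) {
        e->addr = addr;
        e->len = len;
    }
    return e;
}

/* A real cache would also take a reference on the returned entry
 * before dropping the lock; that is omitted here for brevity. */
static struct entry *cache_search(struct cache *c, const void *addr, size_t len)
{
    for (;;) {
        /* First find(), under the lock. */
        pthread_mutex_lock(&c->lock);
        struct entry *e = cache_find(c, addr, len);
        pthread_mutex_unlock(&c->lock);
        if (e)
            return e;

        /* Build the new entry outside of the lock. */
        struct entry *fresh = entry_build(addr, len);
        if (!fresh)
            return NULL;

        /* Re-check before inserting: another thread may have won. */
        pthread_mutex_lock(&c->lock);
        if (cache_find(c, addr, len)) {
            c->race_retries++;
            pthread_mutex_unlock(&c->lock);
            free(fresh); /* discard outside the lock, then restart */
            continue;
        }
        fresh->next = c->head;
        c->head = fresh;
        pthread_mutex_unlock(&c->lock);
        return fresh;
    }
}
```

Calling `cache_search` twice for the same buffer from one thread returns the same entry and never increments `race_retries`; the retry path only runs when a second thread inserts a matching entry between the two `cache_find` calls.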
PR #5729 is intended to address this issue. Testing has not been completed on those changes yet.
shefty
added a commit
to shefty/libfabric
that referenced
this issue
Mar 16, 2020
Can you see if the changes in PR #5729 fix the problem for you?
Tried this on CaRT iv_test suite using verbs - passed all tests locally.
On the CaRT project we tried to update to ba597c9 of OFI and are now hitting a problem where ofi+verbs;ofi_rxm appears to hang at init time.
This has been reproduced on a few systems with different server tests.
The same tests get past this particular hang when run with the environment variable "FI_MR_CACHE_MAX_COUNT=0" set.
Attaching to the server via gdb -p and taking a backtrace shows:
(gdb) bt
#0 0x00007f65f544951d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007f65f5444e1b in _L_lock_812 () from /lib64/libpthread.so.0
#2 0x00007f65f5444ce8 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007f65f305fd98 in ofi_intercept_handler () from /home/aaoganez/github/liwei/cart/install/Linux/lib/libfabric.so.1
#4 0x00007f65f305fea2 in ofi_intercept_madvise () from /home/aaoganez/github/liwei/cart/install/Linux/lib/libfabric.so.1
#5 0x00007f65f27f759c in ibv_madvise_range.part.5 () from /lib64/libibverbs.so.1
#6 0x00007f65f27f8fd2 in ibv_reg_mr () from /lib64/libibverbs.so.1
#7 0x00007f65f3087849 in vrb_mr_cache_add_region () from /home/aaoganez/github/liwei/cart/install/Linux/lib/libfabric.so.1
#8 0x00007f65f30606a9 in util_mr_cache_create.isra.5 () from /home/aaoganez/github/liwei/cart/install/Linux/lib/libfabric.so.1
#9 0x00007f65f3060a3f in ofi_mr_cache_search () from /home/aaoganez/github/liwei/cart/install/Linux/lib/libfabric.so.1
#10 0x00007f65f30875d6 in vrb_mr_cache_reg () from /home/aaoganez/github/liwei/cart/install/Linux/lib/libfabric.so.1
#11 0x00007f65f309512e in rxm_mr_regv () from /home/aaoganez/github/liwei/cart/install/Linux/lib/libfabric.so.1
#12 0x00007f65f309525a in rxm_mr_reg () from /home/aaoganez/github/liwei/cart/install/Linux/lib/libfabric.so.1
#13 0x00007f65f4d0cdd4 in fi_mr_reg (context=0x0, mr=0x7ffe95704398, flags=0, requested_key=0, offset=0, acs=16128, len=1050672, buf=0x3034000, domain=)
at /home/aaoganez/github/liwei/cart/install/Linux/include/rdma/fi_domain.h:328
#14 na_ofi_mem_alloc (na_class=, mr_hdl=0x7ffe95704398, size=1050672) at /home/aaoganez/github/liwei/cart/_build.external-Linux/mercury/src/na/na_ofi.c:2360
#15 na_ofi_mem_pool_create (block_size=4096, block_count=256, na_class=0x187c600) at /home/aaoganez/github/liwei/cart/_build.external-Linux/mercury/src/na/na_ofi.c:2311
#16 na_ofi_mem_pool_alloc (mr_hdl=, size=4096, na_class=0x187c600) at /home/aaoganez/github/liwei/cart/_build.external-Linux/mercury/src/na/na_ofi.c:2415
#17 na_ofi_msg_buf_alloc (na_class=0x187c600, size=4096, plugin_data=0x3033b60) at /home/aaoganez/github/liwei/cart/_build.external-Linux/mercury/src/na/na_ofi.c:3649
#18 0x00007f65f4f2c012 in hg_core_alloc_na (use_sm=, hg_core_handle=0x3033a50) at /home/aaoganez/github/liwei/cart/_build.external-Linux/mercury/src/mercury_core.c:1708
#19 hg_core_create (context=context@entry=0x302f6d0, use_sm=) at /home/aaoganez/github/liwei/cart/_build.external-Linux/mercury/src/mercury_core.c:1624
#20 0x00007f65f4f2d9d8 in hg_core_context_post (use_sm=, repost=, request_count=, context=)
at /home/aaoganez/github/liwei/cart/_build.external-Linux/mercury/src/mercury_core.c:2747
#21 HG_Core_context_post (context=0x302f6d0, request_count=request_count@entry=256, repost=repost@entry=1 '\001')
at /home/aaoganez/github/liwei/cart/_build.external-Linux/mercury/src/mercury_core.c:3832
#22 0x00007f65f4f24f3a in HG_Context_create_id (hg_class=, id=id@entry=0 '\000') at /home/aaoganez/github/liwei/cart/_build.external-Linux/mercury/src/mercury.c:1082
#23 0x00007f65f58c6217 in crt_hg_ctx_init (hg_ctx=hg_ctx@entry=0x302a518, idx=0) at src/cart/crt_hg.c:651
#24 0x00007f65f5892297 in crt_context_create (crt_ctx=crt_ctx@entry=0x7ffe95704750) at src/cart/crt_context.c:239
#25 0x000000000040147f in get_self_uri (h=0x2c0c010) at src/crt_launch/crt_launch.c:163
#26 main (argc=8, argv=0x7ffe95704968) at src/crt_launch/crt_launch.c:306
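The shape of this backtrace (a `pthread_mutex_lock()` reached from an intercept handler while a cache search is still in progress on the same thread) can be reproduced in miniature. The sketch below is an assumption-laden illustration, not libfabric code: `cache_lock`, `intercept_handler`, `register_region`, and `search_holding_lock` are hypothetical names, and it deliberately uses an error-checking mutex so the self-relock is reported as `EDEADLK` rather than hanging as a default mutex may.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t cache_lock;
static int handler_result;

/* Hypothetical stand-in for the notification path: it wants the cache
 * lock so that it can flush unusable entries. */
static void intercept_handler(void)
{
    handler_result = pthread_mutex_lock(&cache_lock);
    if (handler_result == 0)
        pthread_mutex_unlock(&cache_lock);
}

/* Hypothetical stand-in for memory registration: the underlying call
 * is intercepted, so the handler runs on the calling thread. */
static void register_region(void)
{
    intercept_handler();
}

/* Registers a region while still holding the cache lock, which is the
 * pattern that leads to the hang. Returns what the handler's lock
 * attempt reported. */
static int search_holding_lock(void)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    /* Error-checking type: relocking by the owner fails with EDEADLK. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&cache_lock, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&cache_lock);
    register_region(); /* handler tries to take cache_lock again */
    pthread_mutex_unlock(&cache_lock);

    pthread_mutex_destroy(&cache_lock);
    return handler_result;
}
```

With a default (non-error-checking) mutex the second lock attempt would typically block forever, matching frames #0 through #3 above.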