You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
To make this as easy to understand as possible, I recommend making the following code modifications for the sake of testing:
Adding some print statements inside of UCS_STATIC_CLEANUP in parser.c:
Adding a print statement inside of ucs_profile_write in profile.c to hammer home the point that it's being called after the destructor has been executed:
Adding some print statements indicating enter/exit of init.c's destructor:
I build the script using a modification of the DEBIAN release pipeline:
AZ_ARTIFACT_NAME='rpi-build.tar.gz'
POSTFIX='ucx-1.0.0-rpi'# Build
./autogen.sh
./contrib/configure-devel --with-cuda --with-java=no --with-mad --enable-debug
echo"Making clean..."
make_clean distclean
echo"finish making clean.."
make dist
tarball=$(echo ucx*.tar.gz)
tar -xzvf ${tarball}# extract the sources in a subdirectorycd$(tar tf ${tarball}| head -1)# go to extracted tarball directoryecho 10 > debian/compat # https://www.debian.org/doc/manuals/maint-guide/dother.en.htmdpl#compat
dpkg-buildpackage -us -uc -Pcuda,xpmem
cd .. # Move back to the working directory# Rename DEB files
VER="${POSTFIX#ucx-}"# Remove 'ucx-' prefix from the POSTFIX string
find . -name "ucx*.deb" -exec bash -c 'mv "$1" "${1%%_*}-'"${VER}"'.deb"' _ {} \;
find . -name '*.deb'# Show new names# Remove superfluous dependency
dpkg-deb -R "ucx-cuda-${VER}.deb" tmp # Extract
sed -i 's/libnvidia-compute-[0-9]* | libnvidia-ml1, //g' tmp/DEBIAN/control
dpkg-deb -b tmp "ucx-cuda-${VER}.deb"# Rebuild
dpkg-deb -I "ucx-cuda-${VER}.deb"
dpkg-deb -I "ucx-${VER}.deb"# Package
tar -cjf "${AZ_ARTIFACT_NAME}"*.deb *.ddeb # Package all DEBs AND Debug Symbols
tar -tjf "${AZ_ARTIFACT_NAME}"
This script runs in a docker container based on the ubuntu22.04-mofed5-cuda12 image defined here.
Actual Reproduction Steps
ucx_perftest -K mlx5_0:1 is run on the listener host
Something like ucx_perftest -K mlx5_0:1 -t tag_bw lid:40 -s 10 -n 1 is run on the other host
SIGSEV:
It's worth noting that compiling with debug symbols enabled (as seen in my build script) and packaging them appropriately does produce a slightly better callstack.
Further, I compiled with Address Sanitizer (added the -fsanitize=address -fno-omit-frame-pointer flags) and got more information:
==3748572==ERROR: AddressSanitizer: heap-use-after-free on address 0xfffff2c02cd0 at pc 0xfffff7133d18 bp 0xffffffffdd70 sp 0xffffffffdd80
READ of size 4 at 0xfffff2c02cd0 thread T0
#0 0xfffff7133d14 in ucs_config_parser_get_env_vars config/parser.c:2113 #1 0xfffff7157c08 in ucs_profile_write profile/profile.c:297 #2 0xfffff7158a60 in ucs_profile_dump profile/profile.c:700 #3 0xfffff7170cac in ucs_profile_cleanup profile/profile.c:767 #4 0xfffff711ef04 (/lib/libucs.so.0+0x2ef04) #5 0xfffff7fc7394 in _dl_fini elf/dl-fini.c:142 #6 0xfffff6e2cde4 in __run_exit_handlers stdlib/exit.c:113 #7 0xfffff6e2cf08 in __GI_exit stdlib/exit.c:143 #8 0xfffff6e173fc in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:74 #9 0xfffff6e174c8 in __libc_start_main_impl ../csu/libc-start.c:392 #10 0xaaaaaaab3f6c (/usr/bin/ucx_perftest+0x13f6c)
0xfffff2c02cd0 is located 0 bytes inside of 4-byte region [0xfffff2c02cd0,0xfffff2c02cd4)
freed by thread T0 here:
#0 0xfffff75a9fe8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127 #1 0xfffff711ecf0 (/lib/libucs.so.0+0x2ecf0) #2 0xfffff7fc7394 in _dl_fini elf/dl-fini.c:142 #3 0xfffff6e2cde4 in __run_exit_handlers stdlib/exit.c:113 #4 0xfffff6e2cf08 in __GI_exit stdlib/exit.c:143 #5 0xfffff6e173fc in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:74 #6 0xfffff6e174c8 in __libc_start_main_impl ../csu/libc-start.c:392 #7 0xaaaaaaab3f6c (/usr/bin/ucx_perftest+0x13f6c)
previously allocated by thread T0 here:
#0 0xfffff75aa2f4 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145 #1 0xfffff71203c8 in kh_resize_ucs_config_env_vars config/parser.c:47 #2 0xfffff7123b2c in kh_put_ucs_config_env_vars config/parser.c:47 #3 0xfffff71345d4 in ucs_config_parser_mark_env_var_used config/parser.c:1288 #4 0xfffff7139b78 in ucs_config_parser_print_env_vars_once config/parser.c:2060 #5 0xfffff733593c in ucp_worker_create core/ucp_worker.c:2606 #6 0xaaaaaaac1d6c in ucp_perf_setup lib/libperf.c:1662 #7 0xaaaaaaab9d98 in ucx_perf_run lib/libperf.c:1790 #8 0xaaaaaaab9d98 in run_test_recurs src/tools/perf/perftest_run.c:281 #9 0xaaaaaaab09a8 in run_test src/tools/perf/perftest_run.c:340 #10 0xaaaaaaab09a8 in main src/tools/perf/perftest.c:896 #11 0xfffff6e173f8 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #12 0xfffff6e174c8 in __libc_start_main_impl ../csu/libc-start.c:392 #13 0xaaaaaaab3f6c (/usr/bin/ucx_perftest+0x13f6c)
Doing this in conjunction with my print statements makes it pretty cut and dry:
Setup and versions
OS version (e.g Linux distro) + CPU architecture (x86_64/aarch64/ppc64le/...)
Ubuntu 22.04.2 LTS (aarch64)
Personal Remarks
Because C++ doesn't offer guarantees on destructor execution ordering, needing to rely on an object to not be destructed in another destructor seems risky. Likely, the calling of ucs_config_parser_get_env_vars will need to be moved somewhere outside of destruction.
However, I'm honestly not familiar enough with what the intended outcome or value of this function is-- so I can't in good faith make a recommendation on where or how this is done.
With guidance, I'm happy to open a pull request to address.
The text was updated successfully, but these errors were encountered:
Can you try to use this revision and let me know whether it fixes the issue?
Thanks for the quick update! I'm building now and will perform testing this morning and will follow up.
farmerau
changed the title
heap-use-after-free in ucs_config_parser_get_env_vars (SIGSEV)
heap-use-after-free in ucs_config_parser_get_env_vars (SIGSEGV)
Oct 23, 2023
Describe the bug
Running ucx_perftest results in a SIGSEV for trying to read freed memory when running the destructor defined in init.cs on line 131. The call to
ucs_profile_cleanup
happens after the destructor in parser.c on line 2122 is invoked (at least on arm64 on BlueField-2 cards), which frees the memory for the hashtable.Steps to Reproduce
To make this as easy to understand as possible, I recommend making the following code modifications for the sake of testing:
UCS_STATIC_CLEANUP
in parser.c:ucs_profile_write
in profile.c to hammer home the point that it's being called after the destructor has been executed:I build the script using a modification of the DEBIAN release pipeline:
ubuntu22.04-mofed5-cuda12
image defined here.Actual Reproduction Steps
ucx_perftest -K mlx5_0:1
is run on the listener hostucx_perftest -K mlx5_0:1 -t tag_bw lid:40 -s 10 -n 1
is run on the other hostIt's worth noting that compiling with debug symbols enabled (as seen in my build script) and packaging them appropriately does produce a slightly better callstack.
Further, I compiled with Address Sanitizer (added the
-fsanitize=address -fno-omit-frame-pointer
flags) and got more information:Setup and versions
Personal Remarks
Because C++ doesn't offer guarantees on destructor execution ordering, needing to rely on an object to not be destructed in another destructor seems risky. Likely, the calling of
ucs_config_parser_get_env_vars
will need to be moved somewhere outside of destruction.However, I'm honestly not familiar enough with what the intended outcome or value of this function is-- so I can't in good faith make a recommendation on where or how this is done.
With guidance, I'm happy to open a pull request to address.
The text was updated successfully, but these errors were encountered: