-
Notifications
You must be signed in to change notification settings - Fork 626
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle thread exit #1869
Handle thread exit #1869
Conversation
6c6cd45
to
b5b142a
Compare
cf0c05d
to
7cc4fc5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -27,7 +30,7 @@ typedef struct { | |||
void | |||
run_long_task() | |||
{ | |||
// Busy waiting to be interruptible by exception | |||
// Busy waiting to be interruptible by trap or `proc_exit` | |||
for (int i = 0; i < TIMEOUT_SECONDS; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wenyongh btw I notice something while testing: if I replace this for loop with a __builtin_wasm_memory_atomic_wait32(0, 0, -1);
or even while(1);
the thread doesn't get interrupted even if the exception is propagated (in the first case it gets stuck here https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/core/shared/platform/common/posix/posix_thread.c#L192). Is that expected? I'll investigate to understand what's happening
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's what i expect to happen with the current implementation.
to terminate busy looping threads, something like wasm_cluster_thread_send_signal
would work.
i guess we eventually need to implement something (eg. send a signal on posix-like platforms) to terminate the blocking threads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wenyongh btw I notice something while testing: if I replace this for loop with a
__builtin_wasm_memory_atomic_wait32(0, 0, -1);
or evenwhile(1);
the thread doesn't get interrupted even if the exception is propagated (in the first case it gets stuck here https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/core/shared/platform/common/posix/posix_thread.c#L192). Is that expected? I'll investigate to understand what's happening
Normally runtime should terminate all threads when exception was thrown, I remember it is the expected behavior of Java VM. Seems it is an issue of the main branch, I try to fix it with the patch below for interpreter. And another issue of main branch may be that the main thread doesn't spread exception to other thread when exception was thrown, like the change of this PR in wasm_runtime_call_wasm. I will discuss with @xujuntwt95329 for how to fix them.
diff --git a/core/iwasm/interpreter/wasm_interp_fast.c b/core/iwasm/interpreter/wasm_interp_fast.c
index 3109b0c8..36863bf2 100644
--- a/core/iwasm/interpreter/wasm_interp_fast.c
+++ b/core/iwasm/interpreter/wasm_interp_fast.c
@@ -1054,6 +1054,8 @@ wasm_interp_call_func_import(WASMModuleInstance *module_inst,
#if WASM_ENABLE_THREAD_MGR != 0
#define CHECK_SUSPEND_FLAGS() \
do { \
+ if (wasm_get_exception(module)) \
+ goto got_exception; \
if (exec_env->suspend_flags.flags != 0) { \
if (exec_env->suspend_flags.flags & 0x01) { \
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eloparco I discussed the issues with @xujuntwt95329, he agreed that they are issues and will submit patch to fix them on main branch. Could you wait until we fix them and merge branch main
into dev/wasi_threads
, so as to sync up the code and avoid conflicts in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as proposed by @yamt, I tried adding a wasm_cluster_send_signal_all(cluster, WAMR_SIG_TERM);
after this line that spreads the exception
traverse_list(&cluster->exec_env_list, set_exception_visitor, exec_env); |
while(1);
but not for other cases like sleep(TIMEOUT_SECONDS);
or __builtin_wasm_memory_atomic_wait32(0, 0, -1);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But for the wasm_cluster_add_exec_env, not sure why ret can be removed.
It can be replaced with return
statements (we no longer have a lock to unlock). As I said, it's not a big deal though. The rest looks ok for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @yamt that we may need some more complex mechanism to terminate the threads when they are blocked in kernel.
@xujuntwt95329 / @wenyongh does anybody on your side actively working on that? If not, we'll take on that work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@loganek not yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok thanks, we'll look into that then.
c4adc3a
to
e355799
Compare
core/iwasm/aot/aot_runtime.c
Outdated
#if WASM_ENABLE_LIBC_WASI != 0 | ||
/* With WASI threads, proc_exit exception is not cleared because it has to | ||
* be detected and propagated to the other threads in the process */ | ||
#if WASM_ENABLE_LIBC_WASI != 0 && WASM_ENABLE_LIB_WASI_THREADS == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems lib-pthread
also need to spread the proc exit exception, I'll modify this line to #if WASM_ENABLE_LIBC_WASI != 0 && WASM_ENABLE_THREAD_MGR == 0
in main branch
And refactor clear_wasi_proc_exit_exception, refer to #1869
e355799
to
c2a3961
Compare
c2a3961
to
80b0257
Compare
Rebased onto #define TEST_TERMINATION_BY_TRAP 1 // Otherwise test `proc_exit` termination
#define TEST_TERMINATION_IN_MAIN_THREAD 0 once I got ./iwasm -v=5 wasm-apps/thread_termination.wasm
ninja: no work to do.
[05:30:26:276 - 102E6C580]: Load type section success.
[05:30:26:298 - 102E6C580]: Load import section success.
[05:30:26:332 - 102E6C580]: Load function section success.
[05:30:26:335 - 102E6C580]: Load table section success.
[05:30:26:337 - 102E6C580]: Load memory section success.
[05:30:26:340 - 102E6C580]: Load global section success.
[05:30:26:344 - 102E6C580]: Load export section success.
[05:30:26:346 - 102E6C580]: Load start section success.
[05:30:26:349 - 102E6C580]: Load table segment section success.
[05:30:26:351 - 102E6C580]: Load datacount section success.
[05:30:26:353 - 102E6C580]: Load code segment section success.
[05:30:26:356 - 102E6C580]: Load data segment section success.
[05:30:26:358 - 102E6C580]: Ignore custom section [.debug_info].
[05:30:26:360 - 102E6C580]: Ignore custom section [.debug_loc].
[05:30:26:361 - 102E6C580]: Ignore custom section [.debug_ranges].
[05:30:26:363 - 102E6C580]: Ignore custom section [.debug_aranges].
[05:30:26:364 - 102E6C580]: Ignore custom section [.debug_abbrev].
[05:30:26:365 - 102E6C580]: Ignore custom section [.debug_line].
[05:30:26:367 - 102E6C580]: Ignore custom section [.debug_str].
[05:30:26:384 - 102E6C580]: Ignore custom section [name].
[05:30:26:389 - 102E6C580]: Ignore custom section [producers].
[05:30:26:391 - 102E6C580]: Ignore custom section [target_features].
[05:30:26:393 - 102E6C580]: Found aux __heap_base global, value: 38560
[05:30:26:394 - 102E6C580]: Found aux __data_end global, value: 5788
[05:30:26:396 - 102E6C580]: Found aux stack top global, value: 38560, global index: 0, stack size: 32768
[05:30:26:399 - 102E6C580]: Found malloc function, name: malloc, index: 19
[05:30:26:401 - 102E6C580]: Found free function, name: free, index: 21
[05:30:26:444 - 102E6C580]: wasm-micro-runtime/core/iwasm/interpreter/wasm_loader.c, line 715, can not find an export 0 named _initialize in the module
[05:30:26:450 - 102E6C580]: Load module success.
[05:30:26:456 - 102E6C580]: Memory instantiate:
[05:30:26:458 - 102E6C580]: page bytes: 65536, init pages: 1, max pages: 30
[05:30:26:459 - 102E6C580]: heap offset: 65536, heap size: 0
[05:30:26:673 - 102E6C580]: Memory instantiate success.
Thread running
Waiting before terminating
Main thread running
Thread running
Assertion failed: false && "Unreachable" (wasm-micro-runtime/samples/wasi-threads/wasm-apps/thread_termination.c: __wasi_thread_start_C: 61)
[05:30:36:007 - 16D3AF000]: wasm-micro-runtime/core/iwasm/interpreter/wasm_interp_fast.c, line 3970, meet an exception Exception: unreachable
[05:30:36:570 - 16D39B000]: wasm-micro-runtime/core/iwasm/interpreter/wasm_interp_fast.c, line 3970, meet an exception Exception: unreachable
[05:30:36:541 - 102E6C580]: wasm-micro-runtime/core/iwasm/interpreter/wasm_interp_fast.c, line 3970, meet an exception Exception: unreachable i.e. the example is failing bacause the trap generate in one of the spawn threads is not progated within |
We can probably merge this PR and I'll open a separate issue to track the possible race condition that I documented above (and I've never been able to reproduce again). |
…ance#1889) And refactor clear_wasi_proc_exit_exception, refer to bytecodealliance#1869
…mination (bytecodealliance#1869) Trigger wasi_proc_exit in the sample
Implementation of what's described here: WebAssembly/wasi-threads#17:
proc_exit
in the main thread should stop all the threads in the processproc_exit
should terminate all the threads in the processThe sample has been modified to test combinations of trap or
proc_exit
for termination and termination forced in the main thread or a spawn thread.Part of #1790.