-
-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
calling join() on terminated thread hangs #71
Comments
What would be the updated condition? And what exactly is the rationale behind this change? Is there any POSIX requirement for this? |
The comment above the function indicates that |
I took a second look at the code. The problem is that the thread sets itself to To avoid this kind of deadlock I would either suggest the fix above (if there are no side effects) or somehow raise the priority of the idle thread in this situation. (Something like that is already done for mutexes, right?) |
I can't tell if the fix is ok, we need to analyse it carefully. The priority can be temporarily raised, but to what level, to be sure that it does not have negative side effects, considering that the idle thread is intended to run only when there is nothing else to be done? |
To the maximum level of all threads that are waiting on a
Maybe then the thread destruction should not be done by the idle thread
sure. I will go with my quick fix for now and we can come back to this once testing infrastructure is in place. |
I need to check the implementation, but is there a list of threads waiting on a
Please feel free to suggest an alternate solution. From my point of view, a well behaved application generally should not rely on the preemptive scheduler, and most of the times should not use a full scheduler cuanta, but enter the idle thread at the end of most scheduler ticks. Unless you have a background computation task, if your application does not reach the idle thread, I would search for busy waits or similar issues. |
I don't have a overview of all the implementation details but this is what comes to mind:
Yes, the problem occurs only on high load. Actually I noticed it due to a busy wait as described in #72. |
Because it is quite difficult for a thread to deallocate the stack it is currently using. Or even the thread object itself.
It can, but are you sure that this will help the thread waiting to join?
How? Adding a new GC thread is not exactly attractive. |
The fact that my hack works shows that the thread reaches |
I may be terribly wrong since I'm not sure I recall correctly the details, but the CMSIS OS validation tests might somehow try to reuse threads; in other words, the thread storage is allocated once and reinitialised multiple times. For this to work, it is mandatory that the thread is no longer in use. As far as I know, only threads that were joined are guaranteed to no longer be in use. If Your hack works since you probably don't reuse threads. When I asked if you are sure that raising priorities helps, I was thinking of the case when a CPU intensive thread may very well have a higher priority, so a slight priority bump might not help. |
I am not suggesting my hack as general solution to the problem. Next to what you describe, there is also the problem that if the memory for the stack does not get deallocated (because the idle thread doesn't run) it might be missing for other tasks. So you could occasionally run out of memory for no apparent reason. Inheriting the priority just makes sure that the transition terminated -> destroyed is run at the same priority as the thread function did. So you you set up priorities in a way that the thread can finish the thread will also get destroyed. |
Also if the idle thread's priority was temporarily raised to destroy a terminated thread, the idle thread should skip other idle actions. That should reduce potential sideffects |
Right. For such cases one might prefer to use static stacks.
We'll consider this mechanism, in some cases it might help, but the real problems here are:
I'm thinking of an alternate solution. How about, in addition to performing the destruction on the idle thread, as of now, it is also performed on the thread waiting to join? (if the thread is terminated, of course) |
higher priority than the idle thread, right?
I was also thinking about that. But what if the idle thread already started with destroying the thread and then you call |
higher priority than the bumped priority, whatever it is.
right, it has to wait for the destruction to complete anyway, and if this involves freeing memory, this duration is non-deterministic for the first-fit-top allocator, since it must also coalesce blocks. but if the waiting thread bumps the idle priority to its own priority, it should take about the same as if it does the destruction itself. just that we need an exclusion mechanism to prevent them both starting, probably a new state like while_destroying. |
But in that case the thread would have not even reached
Yes, that would work. But what's the advantage over taking the priority of the terminating thread? It certainly does make things more complicated. |
That's correct. Otherwise the current functionality applies. We can make things even faster if the terminating thread also awakes the thread waiting to join, and in this case the destruction is done sooner, without having to wait for the idle thread. The idle thread will do the cleanup only for threads that are of no interest for other threads
I don't know, but I can ask the same, what is the advantage of the idle thread taking the priority of the terminating thread? I suggest we use a mechanism similar to the one used by mutexes, if for no other reason that it is known to work. But if we decide to perform the destruction on the context of the waiting thread, we might not even need to bump the idle priority. |
I believe this condition should also check for
state::terminated
micro-os-plus-iii/src/rtos/os-thread.cpp
Line 873 in 9581626
The text was updated successfully, but these errors were encountered: