Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support detached threads still running when the main thread exits ("thread leaks") #1371

Open
1 of 3 tasks
RalfJung opened this issue Apr 27, 2020 · 17 comments
Open
1 of 3 tasks
Labels
A-concurrency Area: affects our concurrency (multi-thread) support C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement

Comments

@RalfJung
Copy link
Member

RalfJung commented Apr 27, 2020

With #1284 we can run multi-threaded programs in Miri. However, right now we error if, when the main thread exits, there are any other threads still left. We should properly support that case.

For that, we need to:

  • not complain about memory reachable from those threads as leaking.
  • figure out what the semantics are wrt. TLS dtors. I assume those do not run for threads that are still around (after all the thread might be working on that piece of TLS right now, so a dtor would find it in an inconsistent state), but we should confirm that and leave appropriate comments.
  • decide what to do about non-detached threads still running when the main thread exits. I am inclined to call those threads a leak, but maybe if they were spawned by a detached thread it's okay? IMO we should report a leak when e.g. the main thread spawns a thread and then neither detaches nor joins it.

Cc @vakaras

@RalfJung RalfJung added A-concurrency Area: affects our concurrency (multi-thread) support C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement labels Apr 27, 2020
@oli-obk
Copy link
Contributor

oli-obk commented Apr 27, 2020

I always found it surprising that it was fine to have threads still exist when the main thead exits. The other threads don't get shut down time at all, they are just killed. I wonder if we should at least warn about it.

@RalfJung
Copy link
Member Author

Some libraries might have "background threads" to do stuff, that does not seem unreasonable to me -- I doubt rayon's thread pool or crossbeam's epoch cleanup thread will be joined ever, and indeed I don't think there is a good way to join them.

@RalfJung
Copy link
Member Author

In other words, "detaching" a thread IMO is an explicit marker that it is entirely okay for this thread to be suddenly killed when the program is done.

I am fine warning about this for non-detached threads, see item 3 above. Detaching then serves as explicit opt-out from that warning.

@oli-obk
Copy link
Contributor

oli-obk commented Apr 27, 2020

Detaching is solely a library thing. Dropping a JoinHandle is a nop afaict, there's no operation that guarantees that a thread has been detached, the user may at any point get a thread id from the ether and then decide to join the thread.

@RalfJung
Copy link
Member Author

Dropping a JoinHandle is a nop afaict

Nope, docs explicitly say

A JoinHandle detaches the associated thread when it is dropped, which means that there is no longer any handle to thread and no way to join on it.

there's no operation that guarantees that a thread has been detached, the user may at any point get a thread id from the ether and then decide to join the thread.

It is UB to join on a detached thread, and @vakaras patch detects that UB.

@oli-obk
Copy link
Contributor

oli-obk commented Apr 27, 2020

oh, interesting. In that case it may be impossible to do a leak check without running those threads to completion after the main thread exits. Since they may never terminate, I'm not sure how we can achieve that though.

The reason I think it's impossible is that the threads may hold integer addresses of memory that they could free, but we can't tell whether that's the case.

@RalfJung
Copy link
Member Author

The reason I think it's impossible is that the threads may hold integer addresses of memory that they could free, but we can't tell whether that's the case.

That's already a problem (#1318), so I don't think it should affect our decision here.

@oli-obk
Copy link
Contributor

oli-obk commented Apr 27, 2020

so... we treat everything that the thread refers to with provenance pointers the same way we treat them in statics? That seems reasonable to me, maybe with a flag to disable it specifically when ppl want to debug leaking from threads that they didn't clean up properly (e.g. when accidentally detaching them)

@RalfJung
Copy link
Member Author

we treat everything that the thread refers to with provenance pointers the same way we treat them in statics? That seems reasonable to me

We have agreement then. :)

maybe with a flag to disable it specifically when ppl want to debug leaking from threads that they didn't clean up properly (e.g. when accidentally detaching them)

In that case I'd just say we have a flag to warn about detached threads still sitting around when the main thread exits -- there is little point in accepting that silently, but then complaining about all its stack memory leaking.

@oli-obk
Copy link
Contributor

oli-obk commented Apr 27, 2020

true! That's much better. Oh, we could even just warn when detaching, to show the user where it's happening 😄

@jonhoo
Copy link

jonhoo commented Jan 5, 2021

Following on from jonhoo/bus#29 (comment), I believe that Rust currently does run destructors for anything in thread-local storage. It does this by using various hooks that the OS provides, such as __cxa_thread_atexit_impl on Linux/glibc.

To follow-on from your comment there Ralf, I believe that function is called when each thread exits. There is also atexit, which I believe is only called after all the threads have terminated, though I'm not entirely sure. You may need someone from rustc to confirm that part.

@RalfJung
Copy link
Member Author

RalfJung commented Jan 5, 2021

I made an experiment and it confirmed what I suspected -- destructors of still-running detached threads are not run.

When the process exits, I don't think the still-running threads are considered to "exit" it any clean sense. It's more like thread cancellation -- they get abruptly stopped without any cleanup.

@jonhoo
Copy link

jonhoo commented Jan 6, 2021

Oh, that's really interesting. Makes me wonder what all that logic I linked to is then doing? It sure seems like it's trying to arrange for thread-locals to be cleaned up... Who might be a good person to ping to ask about this do you think?

@RalfJung
Copy link
Member Author

RalfJung commented Jan 6, 2021

All that logic is ensuring thread-locals get cleaned up when a thread exits cleanly.

@RalfJung
Copy link
Member Author

With #1858 it is at least possible to execute programs that leave behind some threads, by passing -Zmiri-disable-leaks -- basically we treat thread leaks and memory leaks together.

What remains (and what would actually be non-trivial work) is to support checking for memory leaks when there are remaining threads (and possibly checking for threads that are neither detached nor joined).

bors added a commit that referenced this issue Jul 27, 2021
also ignore 'thread leaks' with -Zmiri-ignore-leaks

This is a step towards #1371. The remaining hard part would be supporting checking for memory leaks when there are threads still running. For now we elegantly avoid this problem by using the same flag to control both of these checks. :)
@RalfJung
Copy link
Member Author

RalfJung commented Jun 20, 2022

Unfortunately, this error can come up when using std::thread::scoped, since that code does not actually join the spawned threads; it implements its own logic instead to ensure they are all done processing Rust code. This is not wrong in any way I can see, but unfortunately it startles Miri.

@m-ou-se I am curious, why do our scoped threads hand-roll their own implementation of join?

@m-ou-se
Copy link
Member

m-ou-se commented Jun 20, 2022

That's to make sure the memory (and pid/tid) usage doesn't grow without bounds for long-lived scopes.

For example:

fn main() {
    let config = &read_config();
    let socket = start_listening(config);
    thread::scope(|s| {
        loop {
            let connection = socket.accept();
            s.spawn(move || handle_connection(connection, config));
        }
    });
}

Detaching the threads means they clean up after themselves as soon as they finish. Keeping their join handle around until the end of the scope means this program would have to keep around an ever increasing list of join handles.

@RalfJung RalfJung changed the title Support detached threads still running when the main thread exits Support detached threads still running when the main thread exits ("thread leaks") Jun 25, 2022
akern40 added a commit to rust-ndarray/ndarray that referenced this issue Oct 30, 2024
This is carefully constructed to allow Miri to test most of `ndarray` without slowing down CI/CD very badly; as a result, it skips a number of slow tests. See #1446 for a list.

It also excludes `blas` because Miri cannot call `cblas_gemm`, and it excludes `rayon` because it considers the still-running thread pool to be a leak. `rayon` can be re-added when rust-lang/miri#1371 is resolved.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-concurrency Area: affects our concurrency (multi-thread) support C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement
Projects
None yet
Development

No branches or pull requests

4 participants