Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: 'thread lifetime #1705

Closed
wants to merge 1 commit into from
Closed

Conversation

ticki
Copy link
Contributor

@ticki ticki commented Aug 5, 2016

Add a 'thread lifetime, which denotes a thread-bounded region.

Rendered

@eddyb
Copy link
Member

eddyb commented Aug 5, 2016

See rust-lang/rust#17954 (comment) for a simpler solution.

@nikomatsakis
Copy link
Contributor

I remember @alexcrichton and @aturon and I discussing precisely this question when working on the thread_local! API -- this is basically why you have foo.with(...) instead of foo.get(). Originally, the idea was that foo.get() would return something that lets you access thread-local data via the Deref impl, but we realize that this would allow leakage -- ah, I remember now, we were thinking also of scoped thread-locals, which I think have since been removed (RIP). I wonder if we could now add get so long as the thing which is returned is not Send?

Anyway, certainly we could make special rules for &foo where foo is #[thread_local] -- it's perhaps a bit disappointing, in that one can't return that reference out of your current function, but it'd be a small step to restore soundness.

Are there other use cases for 'thread?

(I am not that concerned about #[thread_local], since it is not fully portable or general.)

@ticki
Copy link
Contributor Author

ticki commented Aug 6, 2016

would allow leakage

Uh, that's right...

Are there other use cases for 'thread?

There a no other (at least to might knowledge) where it is strictly required, but there are quite a few where it is handy (pinging @tomaka).

@ticki
Copy link
Contributor Author

ticki commented Aug 6, 2016

In any case, the current API is kinda hacky (e.g. it relies on closures, devirtualization, and so on).

@tomaka
Copy link

tomaka commented Aug 6, 2016

There a no other (at least to might knowledge) where it is strictly required, but there are quite a few where it is handy (pinging @tomaka).

Sorry. I have had several problems with multithreading in general, but nothing comes to my mind that would be solved with 'thread.

ticki added a commit to redox-os/ralloc that referenced this pull request Aug 6, 2016
The API was unsound due to allowing leakage of a reference to another thread. We solve this in a manner similar to libstd, by using a `with` method, which temporarily gives access through a reference by taking a closure. Related to [RFC 1705](rust-lang/rfcs#1705).
@Amanieu
Copy link
Member

Amanieu commented Aug 7, 2016

(I am not that concerned about #[thread_local], since it is not fully portable or general.)

Actually, since LLVM now supports TLS emulation, #[thread_local] can probably be made to work on any platform. It's still not as general as thread_local! since it doesn't support dynamic initialization and drop, but it will be on par with global statics.

ticki added a commit to redox-os/ralloc that referenced this pull request Aug 7, 2016
The API was unsound due to allowing leakage of a reference to another thread. We solve this in a manner similar to libstd, by using a `with` method, which temporarily gives access through a reference by taking a closure. Related to [RFC 1705](rust-lang/rfcs#1705).
@nrc nrc added the T-lang Relevant to the language team, which will review and decide on the RFC. label Aug 7, 2016
@nikomatsakis nikomatsakis self-assigned this Aug 11, 2016
@nrc nrc mentioned this pull request Aug 17, 2016
@arielb1
Copy link
Contributor

arielb1 commented Aug 20, 2016

The problems I remember with 'thread being equal to 'static are not data races - they are that 'thread has different meaning between threads, so that one of the threads can go down and leave the reference dangling. Are there other problems (e.g. is there UB when no thread ever exits?).

The bit about 'thread not being Send needs further elaboration. A &'thread T can be coerced to a &'a T, which can be freely passed to a scoped thread and lead to just the same problems. Yet passing an &mut &'thread u32 would allow the same dangling pointer situations.

In the presence of thread locals, a sendable Rust Fn(X) -> Y is equivalent to a for<'thread> Fn(ThreadLocals<'thread>, X) -> Y, and we must avoid capture of 'thread there. Not sure what is the best way to handle this.

@ticki
Copy link
Contributor Author

ticki commented Aug 20, 2016

The problems I remember with 'thread being equal to 'static are not data races - they are that 'thread has different meaning between threads, so that one of the threads can go down and leave the reference dangling. Are there other problems (e.g. is there UB when no thread ever exits?).

They aren't equal:

'thread outlives every lifetime with the exception of 'static.

'thread cannot coerce into 'static

The bit about 'thread not being Send needs further elaboration. A &'thread T can be coerced to a &'a T, which can be freely passed to a scoped thread and lead to just the same problems. Yet passing an &mut &'thread u32 would allow the same dangling pointer situations.

Hmm, that's a good point.

@arielb1
Copy link
Contributor

arielb1 commented Aug 20, 2016

They aren't equal

With today's Rust they are and that causes unsafety, not because of data races, but because of thread termination.

Hmm, that's a good point.

You must also stop bad things when they are done in a rather indirect way:

#![feature(thread_local)]

fn call_scoped<F, T>(f: F) -> T
    where F: FnOnce() -> T + Send
{
    // this function  can be implemented using e.g. the crossbeam API, and
    // we want to allow it.
}

fn set_bomb<'b>(bomb: &mut Option<&'b u32>)
    where 'thread: 'b
{
    // this function looks legal.
    #[thread_local]
    static LOCAL: u32 = 0;
    *bomb = Some(&LOCAL);
}

fn detonate_bomb<'b>(bomb: &mut Option<&'b u32>, set_bomb: fn(&mut Option<&'b u32>))
{
    // look at me! no `'thread` in sight!
    call_scoped(|| set_bomb(bomb));    
}

fn blow_up() {
    // no `Send` in sight here.
    let mut bomb = None;
    detonate_bomb(&mut bomb, set_bomb);
    println!("{:?}", bomb);
}

fn main() {
    blow_up()
}

@eddyb
Copy link
Member

eddyb commented Aug 20, 2016

@arielb1 I believe that in my alternative (function-local borrows), this line:

    *bomb = Some(&LOCAL);

would not, in fact, pass the borrow-checker, because &LOCAL would behave a lot like &variable (although it would outlive all function variables) and couldn't escape the function in any way.

@arielb1
Copy link
Contributor

arielb1 commented Aug 20, 2016

@eddyb

function-local borrows

Sure. That would also end up with us not needing 'thread, right? OTOH, some people may want "arbitrary scoped borrowing" - to have an

fn with_thread_locals<T, F: for<'a> FnOnce(ThreadLocals<'a>) -> T>(f: F) -> T { /* <snip> */ }

@eddyb
Copy link
Member

eddyb commented Aug 20, 2016

@arielb1 Yeah, it's intended to make thread_local safe without introducing anything to lifetimes, but rather treating thread_local like local variables (which is a semantic approximation, but it's hard to do anything else because threads are a library construct).
That function signature should work with f(&THREAD_LOCAL) as the body (T cannot capture 'a).

@nikomatsakis
Copy link
Contributor

Yeah, thinking more deeply about the idea of 'thread, I don't think it's plausible to have a system where &'a T: Send is depending on 'a. That just seems fundamentally incompatible with subtyping (not to mention just plain undesirable).

I think that the way to think about 'thread which would make it safe is to consider it instead an implicit region parameter that appears on all fns -- but I'm not sure how we would integrate that notion with our current notion of closures. IOW, to account for the "viewshift" between threads, you'd have to add some way of saying that a closure type or other thing is not dependent or tainted upon the "lifetime of current thread" idea. (Otherwise you get examples like the one that @arielb1 gave.)

Certainly @eddyb's notion of "current fn" seems much simpler -- and, thinking back to what @aturon and @alexcrichton and I were banging on a while back, I remember that a special lifetime 'caller is in fact what we were thinking of (that could be used in unsafely-implemented APIs to get the "immediately enclosing fn body").

@ticki
Copy link
Contributor Author

ticki commented Aug 22, 2016

The current fn approach is not ideal, it makes it imossible for library-level TLS vars.

@eddyb
Copy link
Member

eddyb commented Aug 22, 2016

@ticki You can always create your own handle type which is not Send and which implement Deref unsafely, borrowing the handle type, or have a safe with method, or just a regular free function.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Aug 23, 2016

@ticki

The current fn approach is not ideal, it makes it imossible for library-level TLS vars.

Yes, I agree it's not ideal -- but it is workable. =)

I think you can imagine something like 'thread that also works in a similar fashion. The idea would be roughly that 'thread is a special lifetime parameter and only usable in fn signatures (or perhaps constants and/or other rvalues) as a kind of early lifetime parameter.

When checking a fn body, we create a skolemized lifetime (let's call it 'current-thread) to represent the current thread. 'current-thread <= 'static, naturally, but that's probably all we know about it (and I guess some relation to the current fn body). When a fn is referenced whose signature includes 'thread, it is instantiated with a fresh region variable. This region variable is bounded to be <= 'current-thread. Sorry if this is vague but rough idea is that, when you instantiate something, 'thread is replaced with the 'current-thread representing the point of reference.

XXX -- this got posted before I was finished thinking it through ;)

I realize now that while this scheme makes sense, it would not cause an error @arielb1's example, I don't think. But then, is that a problem? In particular, using this scheme, one could allow a pointer to a thread-local slot to escape to another thread -- but is that necessarily an issue? It seems like it's only a problem if the type of that thread-local variable is a RefCell or something, and in that case the reference will not be Send.

@nikomatsakis
Copy link
Contributor

Ah, never mind, I see the error in my thinking. I think the key part of @arielb1's example that is problematic is here:

fn set_bomb<'b>(bomb: &mut Option<&'b u32>)
    where 'thread: 'b
{
    // this function looks legal.
    #[thread_local]
    static LOCAL: u32 = 0;
    *bomb = Some(&LOCAL);
}

In particular, we can't replace 'thread with 'current-thread at the point where we refer to set_bomb, we have to replace it at the point where we call set_bomb. IOW, 'thread would have to be late-bound which would forbid it from appearing in the where-clause, and would in turn render the fn body illegal.

This isn't a fully thought out thought yet either though :)

Certainly 'thread is full of subtle-ty. I am not sure yet if treating it "properly" is worth it. The only reason I can see pursuing it is precisely if you want to enable references to thread-local data to be shared between (scoped) threads (e.g., when using rayon). Otherwise non-send wrappers seem better.

@eddyb
Copy link
Member

eddyb commented Aug 23, 2016

@nikomatsakis You can always just pass it down the stack in the "current fn" approach, it's as valid as any reference to a stack-local variable

@nikomatsakis
Copy link
Contributor

@eddyb yeah, true. It can cross threads, you just can't return it. But I guess you can return it with a wrapper and share that.

I have to say that adding a bunch of machinery to cover this corner case doesn't feel very good, especially since I consider directly using #[thread_local] a kind of anti-pattern, at least for now (perhaps I will revise my opinion in light of #1705 (comment) though).

@nikomatsakis
Copy link
Contributor

@rfcbot fcp close

I move we close this RFC. The motivation (supporting references to #[thread-local] data) doesn't feel that strong: that attribute isn't even stable, and thread_local! is perfectly adequate (and performant!) in practice. thread_local! already addresses the soundness problems via its API. Moreover, the 'thread "lifetime" doesn't compose well, since you want (safe) pointers to thread-local data to be thread-local, which would mean that &'a T would be Send unless 'a == 'thread, something which we cannot support.

@rfcbot
Copy link
Collaborator

rfcbot commented Nov 11, 2016

Team member @nikomatsakis has proposed to close this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot
Copy link
Collaborator

rfcbot commented Dec 20, 2016

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Dec 20, 2016
@nrc
Copy link
Member

nrc commented Jan 6, 2017

Since this PR has been in FCP for 16 days and there are no new comments, I'm closing it. For justification see #1705 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants