Simplify io::Lazy #58768

pitdicker · 2019-02-27T07:09:37Z

io::Lazy is used inside the standard library by the stdio types as a helper to initialize a global static, and run the destructor at process exit.

The previous code took me some effort to read, as it gets creative with raw pointers to use Box inside a static, and with 0 and 1 as special values.

It now uses an atomic to set the current state, only acquires a mutex during initialization instead of every time, and no longer requires the data to be wrapped in an Arc.

There is one minor change: if for example Stdout were to be used during process exit, which it isn't, three things can happen:

It may still exist at that point, so no problem in using it;
It may be cleaned up, in which case Lazy::get returns None;
It may not yet be initialized. The old code would hand out an Arc, but not initialize the static. Now it also returns None.

But since this case doesn't happen in the standard library, it is not really interesting.

Another change is that stdin_raw(), stdout_raw() and stderr_raw() no longer return a result. Since that design the code has changed to be able to pick up changes to the stdio file descriptors / handles. The state at initialization is not the reliable end-state it will have for the duration of the process. As all sys::stdio types always return Ok, I just removed the needless complexity, including Maybe::fake.

r? @alexcrichton

I am still working on changing the Stdout buffer strategy and detecting a terminal. That is for another PR.

src/libstd/io/stdio.rs

src/libstd/io/lazy.rs

alexcrichton · 2019-02-27T15:17:01Z

Thanks for this! Looking this over though I think this may allow UB because child threads can still be running when the atexit destructors are running, so returning &'static T isn't possible here but rather Arc<T> was intended to protect against that and ensure that child threads would still have a strong reference as they're running around.

Other than that though I think this definitely looks like an improvement!

pitdicker · 2019-02-27T17:15:36Z

Thank you, another edge case I didn't think about. So Drop never runs for the stdio types if there is another thread that holds a reference, even if it gets terminated almost instantly afterwards. And for a quick test this doesn't drop either:

use std::io;
use std::mem;

fn main() {
    let throw_away = io::stdout();
    mem::forget(throw_away);
    print!("Hello, world!");
}

==28634== Memcheck, a memory error detector
==28634== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==28634== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==28634== Command: ./target/debug/test_lazy
==28634== 
==28634== 
==28634== HEAP SUMMARY:
==28634==     in use at exit: 1,144 bytes in 3 blocks
==28634==   total heap usage: 19 allocs, 16 frees, 3,409 bytes allocated
==28634== 
==28634== LEAK SUMMARY:
==28634==    definitely lost: 80 bytes in 1 blocks
==28634==    indirectly lost: 1,064 bytes in 2 blocks
==28634==      possibly lost: 0 bytes in 0 blocks
==28634==    still reachable: 0 bytes in 0 blocks
==28634==         suppressed: 0 bytes in 0 blocks
==28634== Rerun with --leak-check=full to see details of leaked memory
==28634== 
==28634== For counts of detected and suppressed errors, rerun with: -v
==28634== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Not sure what to think about this; but soundness is more important than preventing leaks.

Ideally we would drop in the at_exit handler, and have an option and atomic somewhere so the other threads know the time has come. I have to think about that a bit.

alexcrichton · 2019-02-27T17:21:15Z

That may also be thread locals on the main thread which we've historically had a bad time with

pitdicker · 2019-02-27T17:37:04Z

That may also be thread locals on the main thread which we've historically had a bad time with

Another thing I don't want to know about 😄. I am not sure I understand the current situation. at_exit seems to work pretty well from that issue. The main problem that remains are pthread TLS destuctors that have their own issue #19776?

Or do you mean something more?

alexcrichton · 2019-02-27T18:45:02Z

Oh I just mean that the main thread tls dtor leak may be related to the leak you saw above, it doesn't necessarily indicate a bug. I may be misdiagnosing the leak valgrind you pointed out above though

bors · 2019-02-28T14:43:31Z

☔ The latest upstream changes (presumably #58208) made this pull request unmergeable. Please resolve the merge conflicts.

pitdicker · 2019-03-01T18:08:00Z

Ah, now I get it. It does look very similar, but I intentionally caused it with the mem::forget(io::stdout()).

It took me a bit to understand the issue. Two possible choice are:

Make things live on the heap, and store one Arc to it in a static, which is what the current code did.
Store it in the static, but keep track of the number of references.

I gave the second one a try, it feels slightly cleaner and also just for the fun of it.

pitdicker · 2019-03-03T17:49:22Z

Just realized that my previous attempt did two loads of the atomic: one in the match that guards against DESTRUCTOR_RUNNING, another in fetch_add. Which was wrong, because the atomic can change in-between. Sorry for the extra pushes.

alexcrichton · 2019-03-04T16:59:33Z

No worries for the pushes! This version though looks somewhat complicated and possibly more error-prone, so I'm curious if there's some advantages over the previous iteration?

pitdicker · 2019-03-04T17:15:19Z

This version though looks somewhat complicated and possibly more error-prone, so I'm curious if there's some advantages over the previous iteration?

I am not really happy that it now turns out to be about as complicated or more than before I touched io::Lazy. Is that what you mean with the previous iteration?

I do think it is an advantage it doesn't lock a mutex on every access but only on initialization, is less creative with raw pointers in a Cell, and avoids an allocation. But it is all minor.

alexcrichton · 2019-03-04T17:27:41Z

Ok no worries! I do mean yeah as just before this PR, and I think this isn't really performance critical per se so while it does indeed look a bit faster, I think for maintenance for now we'll probably want to stick with the previous Arc-based version. I think it'd be good to keep the other refactorings though!

pitdicker · 2019-03-04T18:08:39Z

🙁 Just when I was feeling somewhat proud... More serious I think you are right, the complexity is not really worth it.

And I learned some things.

alexcrichton · 2019-03-04T21:22:42Z

It looks like there's still a constant for SHUTDOWN, but after it's stored I can't see where it's later read to return None?

RalfJung · 2019-03-05T10:41:04Z

src/libstd/io/lazy.rs

 pub struct Lazy<T> {
-    // We never call `lock.init()`, so it is UB to attempt to acquire this mutex reentrantly!


You removed this comment, and the one below about reentrancy -- but it doesn't seem like guard.init() is ever actually called?

The comment should be left until we merge #56410 I guess. After that there will not be any init() method and it would not be UB to call it reentrantly even (it would deadlock instead).

These mutexes are not used reentrantly, so it's fine to not call init() now.

It is only moved to the top of the file, hopefully the warning is enough there.

These mutexes are not used reentrantly, so it's fine to not call init() now.

This relies on the users of this abstractions -- but the comment saying so has been removed from get.

It is only moved to the top of the file, hopefully the warning is enough there.

When I added these comments I systematically added them next to the place where the Mutex is declared. I think this consistency is helpful.

RalfJung · 2019-03-05T10:44:05Z

I forgot, why can't this use sync::Once?

pitdicker · 2019-03-05T12:20:16Z

I forgot, why can't this use sync::Once?

Good question, and I am just not sure.

My motivation for avoiding it was because part of what io::Lazy does happens in rt::lang_start_internal. I am not sure if there are any extra guarantees that should be uphold there, and distrust the poison part of mutexes and thread waiter part of Once. But that may turn out to be completely irrelevant. Maybe that all works, and the panic machinery writes directly to StderrRaw, and the rest of stdio should not be used, let alone initialized there.

One maybe more valid reason: We do need to have an atomic somewhere that is set with Ordering::Release after running the destructor, and it makes sense to combine that with one for the initialization.

pitdicker · 2019-03-05T12:22:12Z

It looks like there's still a constant for SHUTDOWN, but after it's stored I can't see where it's later read to return None?

I used it only to 'broadcast' the new contents of the UnsafeCell, but I suppose it can also be set to AVAILABLE. Then it becomes nothing more than a bool, and we can use AtomicBool.

alexcrichton · 2019-03-05T15:31:19Z

I think though that this code still has a data race during shutdown? The write of None isn't synchronized with .as_ref().cloned() which means that if a thread is getting a handle it's racing with the main thread exiting.

In general this is so non-performance-critical I'd prefer to avoid as much complication as possible and favor dirt-simple code. Especially when we start talking about atomic orderings for code like this I think it's probably overkill, so perhaps the old mutex/cell solution could be used?

Dylan-DPC-zz · 2019-03-18T19:30:05Z

ping . from triage @pitdicker waiting for your reply on the last review

Dylan-DPC-zz · 2019-03-25T13:42:42Z

ping from triage @pitdicker
Unfortunately we haven't heard from you on this in a while, so I'm closing the PR to keep things tidy. Don't worry though, if you'll have time again in the future please reopen this PR, we'll be happy to review it again! Thanks for taking time to contribute

rust-highfive assigned alexcrichton Feb 27, 2019

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 27, 2019

Centril reviewed Feb 27, 2019

View reviewed changes

src/libstd/io/stdio.rs Show resolved Hide resolved

src/libstd/io/lazy.rs Show resolved Hide resolved

pitdicker mentioned this pull request Feb 27, 2019

Expose raw Stdout/err/in #58326

Open

pitdicker added 2 commits February 28, 2019 18:07

Rewrite io::lazy

271d990

Make construction of raw stdio infallible

1b04b45

Hand out Arc's in io::lazy

b90d70f

RalfJung reviewed Mar 5, 2019

View reviewed changes

Dylan-DPC-zz added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 18, 2019

Dylan-DPC-zz closed this Mar 25, 2019

Dylan-DPC-zz added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 25, 2019

pitdicker mentioned this pull request Jan 17, 2020

Add lazy initialization primitives to std #68198

Closed

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify io::Lazy #58768

Simplify io::Lazy #58768

pitdicker commented Feb 27, 2019

alexcrichton commented Feb 27, 2019

pitdicker commented Feb 27, 2019

alexcrichton commented Feb 27, 2019

pitdicker commented Feb 27, 2019

alexcrichton commented Feb 27, 2019

bors commented Feb 28, 2019

pitdicker commented Mar 1, 2019 •

edited

Loading

pitdicker commented Mar 3, 2019

alexcrichton commented Mar 4, 2019

pitdicker commented Mar 4, 2019

alexcrichton commented Mar 4, 2019

pitdicker commented Mar 4, 2019 •

edited

Loading

alexcrichton commented Mar 4, 2019

RalfJung Mar 5, 2019

faern Mar 5, 2019

pitdicker Mar 5, 2019

RalfJung Mar 5, 2019

RalfJung commented Mar 5, 2019

pitdicker commented Mar 5, 2019

pitdicker commented Mar 5, 2019

alexcrichton commented Mar 5, 2019

Dylan-DPC-zz commented Mar 18, 2019

Dylan-DPC-zz commented Mar 25, 2019

		pub struct Lazy<T> {
		// We never call `lock.init()`, so it is UB to attempt to acquire this mutex reentrantly!

Simplify io::Lazy #58768

Simplify io::Lazy #58768

Conversation

pitdicker commented Feb 27, 2019

alexcrichton commented Feb 27, 2019

pitdicker commented Feb 27, 2019

alexcrichton commented Feb 27, 2019

pitdicker commented Feb 27, 2019

alexcrichton commented Feb 27, 2019

bors commented Feb 28, 2019

pitdicker commented Mar 1, 2019 • edited Loading

pitdicker commented Mar 3, 2019

alexcrichton commented Mar 4, 2019

pitdicker commented Mar 4, 2019

alexcrichton commented Mar 4, 2019

pitdicker commented Mar 4, 2019 • edited Loading

alexcrichton commented Mar 4, 2019

RalfJung Mar 5, 2019

Choose a reason for hiding this comment

faern Mar 5, 2019

Choose a reason for hiding this comment

pitdicker Mar 5, 2019

Choose a reason for hiding this comment

RalfJung Mar 5, 2019

Choose a reason for hiding this comment

RalfJung commented Mar 5, 2019

pitdicker commented Mar 5, 2019

pitdicker commented Mar 5, 2019

alexcrichton commented Mar 5, 2019

Dylan-DPC-zz commented Mar 18, 2019

Dylan-DPC-zz commented Mar 25, 2019

pitdicker commented Mar 1, 2019 •

edited

Loading

pitdicker commented Mar 4, 2019 •

edited

Loading