Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove TLS #14

Closed
wants to merge 6 commits into from
Closed

Remove TLS #14

wants to merge 6 commits into from

Conversation

newpavlov
Copy link
Member

@newpavlov newpavlov commented Mar 17, 2019

Closes: #13

@dhardy
Copy link
Member

dhardy commented Mar 18, 2019

Any plans to extend this to the other platforms (besides WASM) using TLS?

@newpavlov
Copy link
Member Author

Yes, but I am not sure if Haiku is considered a part of UNIX family by Rust. Also current implementation has a bug, which I'll fix later.

@newpavlov
Copy link
Member Author

newpavlov commented Mar 18, 2019

I think I've finished what I wanted to do with non-WASM targets. Result is a bit more complex than that I initially had in mind, but oh well. :) There are certain code duplications, but due to subtle differences I don't see how they can be abstracted nicely. I think orderings are correct, but it's better to double check.

I guess we could remove TLS for WASM targets as well and replace them with "atomics" (which in future may become true atomics) using a similar approach.

@newpavlov newpavlov changed the title WIP: remove TLS Remove TLS Mar 18, 2019
@newpavlov
Copy link
Member Author

newpavlov commented Mar 18, 2019

BTW maybe we can remove rust_122_compat macro from __wbg_shims and simplify relevant code a bit? Also it looks like rustwasm/wasm-bindgen#201 is fixed, so it may be worth to try move shims to bindgen module.

@dhardy
Copy link
Member

dhardy commented Mar 19, 2019

Good call, but lets keep that in a separate PR. You can test the WASM stuff by following the CI script.

src/dragonfly_haiku_emscripten.rs Outdated Show resolved Hide resolved
src/dragonfly_haiku_emscripten.rs Outdated Show resolved Hide resolved
src/dragonfly_haiku_emscripten.rs Outdated Show resolved Hide resolved
@dhardy
Copy link
Member

dhardy commented Mar 22, 2019

@newpavlov, I would say you've done a good job on this, but I'm still not keen to accept it. The reason is that making heavy use of low-level tools (atomics, bit-ops) makes the code considerably less readable, which makes the code:

  • harder to review, thus less trustworthy
  • harder to modify (esp. without breaking it)

Also, if we are going to take this approach, it would be good to have someone who really understands atomic synchronisation primitives to review this code.

I'm not saying a definite no to this, but I would like us to look at alternatives first (e.g. an AtomicBool is-ready flag + Once init + Cell/RefCell storage, or a higher-level wrapper over atomics).

(I guess I also owe you an apology for the slow review and initially-positive comments; I've had a busy week.)

@newpavlov
Copy link
Member Author

newpavlov commented Mar 22, 2019

While I agree that this code should be carefully reviewed, I think that increasing reader-facing complexity a bit, but getting in return the most efficient implementation for given platform is a not bad trade-off for foundational low-level library.

Regarding Once, as I've already mentioned earlier, the problem is that initialization function may fail, and with Once we lose an ability to retry calls. And RefCell approach is less memory efficient, because we have to use None as an initialization value.

@dhardy
Copy link
Member

dhardy commented Mar 22, 2019

It may be best then to leave this PR unmerged for now (there's no reason optimisations cannot land in a point release). Would you mind making a separate PR for the module changes + wasm-bindgen clean-up, then rebasing this?

@newpavlov
Copy link
Member Author

newpavlov commented Mar 22, 2019

If you don't want to delay getrandom release, then yes, we can postpone this PR, though I hope it will not stay open for too long (i.e. it will be merged or a better approach will be found).

What do you mean by module changes? Also how about bumping MSRV to 1.31 and migrating to 2018 edition? I think 3 release difference is not a big deal, especially considering that upcoming Debian stable will package Rust 1.32.

@dhardy
Copy link
Member

dhardy commented Mar 22, 2019

You merged four modules into one in this PR; I think this is easy to do without the other changes here? Also you renamed 'solarish'; I don't really care about the name but it would be nicer if this PR didn't change it.

If it makes the code significantly nicer, go ahead and use 1.31. There seems to be very little interest in support for old compilers.

@dhardy
Copy link
Member

dhardy commented Apr 6, 2019

I think this could be tidied up a little if you still want to do so, e.g. avoid renaming solaris_illumos and rebase.

@newpavlov
Copy link
Member Author

newpavlov commented Apr 6, 2019

Done! Looks like CI failure is due to rust-lang/rust#59731.

To be safe it would be nice to ask someone to review atomics usage. For example right now I use Ordering::Acquire for first atomic loads, while Ordering::Relaxed may work as well (considering that it's followed by AcqRel operation) and be a bit more efficient.

Copy link

@HadrienG2 HadrienG2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not have time to check all implementations, but hopefully those that I have reviewed should give you an idea of how to review the rest.

As a general comment, since lock-free code can be a pain to maintain, it might be a good idea to somehow centralize all the tricky synchronization in a single place that is shared by all implementations.

const STATE_USE_FD: usize = 1 << 2;

pub fn getrandom_inner(dest: &mut [u8]) -> Result<(), Error> {
let state = RNG_STATE.load(Ordering::Acquire);
Copy link

@HadrienG2 HadrienG2 May 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may be able to reduce synchronization overhead by using a Relaxed load, and only using an Acquire fence after the fact if...

  1. Some state has indeed been initialized
  2. You need to synchronize with extra state besides the atomic value itself (e.g. another atomic value)

But cross-check that this is worthwhile with a benchmark on a relaxed-memory architecture (Arm, Power...), because compilers are less skilled at optimizing fences than they are at optimizing ordered atomic ops.

);
fn init_loop(dest: &mut [u8]) -> Result<(), Error> {
loop {
let state = RNG_STATE.fetch_or(STATE_INIT_ONGOING, Ordering::AcqRel);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Release ordering is about allowing clients who read STATE_INIT_ONGOING with Acquire ordering to synchronize with other shared state that you modified before writing to this variable. I do not think that it is useful here. If this is just a state flag, even Relaxed could be enough, as long as you use an Acquire fence as discussed above where necessary.

fn init(dest: &mut [u8]) -> Result<(), Error> {
match use_syscall(&mut []) {
Ok(()) => {
RNG_STATE.store(STATE_USE_SYSCALL, Ordering::Release);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for Release ordering if you have no writes to other shared memory locations that you want to share with other threads before the atomic store.

match init_fd() {
Ok(fd) => {
RNG_FD.store(fd as isize, Ordering::SeqCst);
RNG_STATE.store(STATE_USE_FD, Ordering::SeqCst);
Copy link

@HadrienG2 HadrienG2 May 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relaxed followed by Release on this side, and an Acquire fence on the reader's side, are enough to guarantee that the value of RNG_FD loaded by the reader is at least as recent as that of RNG_STATE in this case.

use_init(f, getrandom_init, |source| getrandom_fill(source, dest))
})

let state = RNG_STATE.load(Ordering::Acquire);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need Acquire here, since as far as I can see you're not synchronizing any shared state besides the value of the atomic variable itself...

} else {
STATE_INIT_DONE
},
Ordering::Release,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and thus, Release should not be needed here either.

@newpavlov
Copy link
Member Author

@HadrienG2
Thank you a lot for review! So if I understand correctly the current code is too conservative and orderings can be relaxed, but otherwise it's correct? In that case I think we can go ahead without any changes and open a separate issue/PR for relaxing orderings later.

it might be a good idea to somehow centralize all the tricky synchronization in a single place that is shared by all implementations

I've tried to think about how it can be done, bit I couldn't find good ideas, each case has a slightly different requirements (some platforms got merged into use_file.rs, but that's it), so I don't see an elegant way to abstract it.

@dhardy
What do you think?

@dhardy
Copy link
Member

dhardy commented May 29, 2019

I think it still needs a very careful review, and it would be worth trying to address @HadrienG2's points before this.

Alternatively, we could still avoid TLS by using regular mutexes. Not as efficient perhaps, but I don't think that is so important here, and it's much easier to ensure correctness. (In this case I'm not sure if it's better to use std::sync or parking_lot; although there was discussion of using the latter to implement std::sync, parking_lot still has the better API.)

@newpavlov
Copy link
Member Author

newpavlov commented May 29, 2019

it would be worth trying to address @HadrienG2's points before this

I think it will be better to do it in a separate issue/PR, in other words let's introduce a conservative version first and then work on relaxing it if needed. Though I think it may be better to return to yield_now() as was suggested here as part of this PR.

src/solaris_illumos.rs Outdated Show resolved Hide resolved
src/solaris_illumos.rs Outdated Show resolved Hide resolved
This was referenced Jun 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Removing per-thread file descriptors
4 participants