Skip to content
This repository has been archived by the owner on Jun 21, 2019. It is now read-only.

Tokio reform #1

Closed
wants to merge 3 commits into from
Closed

Tokio reform #1

wants to merge 3 commits into from

Conversation

carllerche
Copy link
Member

@carllerche carllerche commented Aug 18, 2017

@alexcrichton
Copy link
Contributor

@aturon so @carllerche and I discussed more during RustConf and I've changed the commit here to do a slight variant of the "handle passing" solution. Instead of one function that takes no handle and one function that takes a handle all functions take handles.

My rationale here looks like:

  • Carl brought up an interesting use case. While a future is executing you don't know when it creates I/O objects. As a result, with a TLS based approach, it's not not certain which scope should be wrapped. While this can be fixed, it's non-obvious at least.
  • Creating I/O handles and timeouts isn't really all that common. My thinking is that calling Handle::global() isn't really that bad.
  • No duplicate API surface area is necessary.
  • API migration is actually easier with this design to boot!


let done = socket.incoming().for_each(move |(socket, addr)| {
let (reader, writer) = socket.split();
let amt = copy(reader, writer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the interaction here if a user called copy(reader, writer).wait() in here? Is it like today in that it's a big no-no, and it will likely deadlock the event loop?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd do the same before/after the rfc, right? In that it'd deadlock the event loop no matter what?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, just wanted to be sure, since it sounded like wait would maybe do some extra "magic", and I wondered if that changed how it would work in here.

@alexcrichton
Copy link
Contributor

I've pushed another commit to avoid breaking tokio-core, which to me seems like the best path forward if we're keeping Handle.

@seanmonstar
Copy link
Member

Having the ability for implicit global handles sounds like probably the right default for most people, so 👍 to that. Here's some assorted thoughts, I would like to try to come up with some solutions to these when I get a chance.

  • If we're going to be changing the design, though, I think it's a prime opportunity to make some anti-patterns harder to trigger. Specifically around the use of future.wait(). It's a bit too easy (and sadly far too common) for people to block the thread running the event loop, by using some_future.wait() inside somewhere. This may be because of lack of understanding, laziness, or something else.

    I'd probably want to consider removing Future::wait completely. This could be replaced with something like CurrentThread::block_until(fut).

  • I wonder if CurrentThread being an empty struct could lead to some confusion. Specifically, after some method calls and nested callbacks, it might not be obvious that a closure could be running on a different thread than you realize, and so simply calling CurrentThread.spawn(fut) would look correct, but perhaps push the future into a thread local of a temporary thread that will shutdown in a second (and then the future is never actually run).

    It seems one could either call Handle::global().spawn(fut) (though I realize you're proposing that tokio no longer spawn futures, so perhaps not this exactly), or make CurrentThread a struct that isn't Send, and encourage creating it back with the global handle.

@carllerche
Copy link
Member Author

Thoughts

A summarization of my thoughts:

I/O Reactor

Thoughts regarding the I/O reactor (Core and friends).

Making &Handle explicit.

Given that in most cases, one will always want to use the same I/O reactor
during the lifetime of a process, requiring &Handle as an argument is going to
be annoying boilerplate. This will add additional friction to the guides (as
illustrated in the provided examples).

The API could look something like:

some_executor::spawn(
  TcpStream::connect(&addr)
    .and_then(|tcp| io::read_to_end(tcp))
    .and_then(|buf| println!("num: {}", buf.len()));

I believe that TcpStream::connect and TcpListener::bind should not take
a &Handle argument. However, there should be a way to specify an alternate
I/O reactor. For this, maybe use a builder API. There already is a builder
pattern to do more advanced configuration, maybe add a handle function to
that:

TcpStream::new_v4()
  .handle(&handle)
  .connect(&addr)
  .unwrap();

Of course, this would require all libs (like hyper) to provide an API to specify the
&Handle, but this is already the case w/ the current API. The downside is that
libs could forget to provide an API by which the &Handle can be configured.
This is mitigated by education & examples. Also, given that most of the
time, the default I/O reactor will be used (especially w/ the point below). and
adding a new API to a lib to specify the &Handle is most likely not a breaking
change, it seems like a fine trade off.

I also think that there could be value in figuring out Tokio lib "patterns",
such as how to allow configuring a lib.

Should it be possible to change the default I/O Reactor?

If the default I/O reactor is a constant, then in order to bind a TcpStream to
any other reactor, a &Handle must be passed in as an argument. However, I
think that there is value to be able to change the default I/O reactor, just not
on a "oneshot" basis.

I think that it could be useful for an executor to setup an I/O reactor that is
dedicated to tasks it runs. In which case, an API to switch the global &Handle
is required. The difference is that it would be documented that this function
should not be used by "end users" but it is a building block level API for
executors to build "environments".

How should the global I/O reactor be configured

The I/O reactor potentially has many knobs to turn. How should the global one be
configured? This actually tends to be a fairly important requirement (based on
looking at libs in other langs). The usual answer is some ENV variables, or
other language specific configuration system.

The best answer is probably some configuration API that must be called before
the global reactor is touched. For example:

tokio::reactor::configure(|c| {
  c.set_max_sockets(1_000_000);
});

I don't really have a particular opinion on the specific configuration API. I
haven't put much thought into it yet.

APIs on Core

IMO, turn should be deprecated and removed, leaving only Core::run<F: Future>(f: F). To enable this change, Core::run should always "turn" at least
once, even if the provided future is already complete.

CurrentThread

It wasn't immediately obvious to me: CurrentThread needs to be a type (vs a
set of mod fns) in order to have a type that implements Executor. That said, I
think the two aspects of the executor (spawning and running) shouldn't be
combined into a single type (similar to Core / Handle with the I/O reactor).

I don't understand what CurrentThread::poll is trying to do.

I also believe that the current proposed API makes it too easy to just "lose"
any futures that were spawned onto CurrentThread. The issue is that the
current thread could exit while futures are currently queued. For example:

thread::spawn(|| {
  let my_future = my_fn();
  CurrentThread::spawn(my_future);
});

To handle this, I think there should be some fn that takes a closure or uses
RAII, and all calls to CurrentThread::spawn that don't happen in that
closure should panic.

Maybe something like (with RAII):

// The `context` is what `CurrentThread::spawn` pushes into.
let context = current_thread::setup();

let my_future = my_fn();
CurrentThread::spawn(my_future);

if run_to_completion {
  // By default, dropping will run the thread-local executor until it is "idle",
  // aka, there are no more futures spawned into it.
  drop(context);
} else {
  // Run until idle w/ a timeout. This takes `&self`
  context.run_with_timeout(Duration::from_millis(1000));

  // Run until the future completes
  context.run_until(shutdown_future);

  // Takes `self`. Drops any remaining futures that are spawned into CurrentThread.
  context.forget();
}

There can also be a "simple" fn that takes a closure and runs until
CurrentThread is idle.

current_thread::with(|| {
  let my_future = my_fn();
  CurrentThread::spawn(my_future);
});

I also don't think this should go in the unsync module.

Changes to futures-rs

As I have said many times, I think usage of Future::wait should be
significantly de-emphasized and used only for the sync / async bridge.

There also are a multitude of ways to be smart about blocking a future. For
example, when running on a thread-pool executor, a "blocking" action can be
coordinated w/ spawning a new thread for the thread pool. However, any "smart"
action taken when blocking the current thread has problems. As such, it should
probably be a de-emphasized API anyway.

There also already are "custom" functions that block the current thread to wait
for a future to complete. For example: Core::run.

I agree w/ @seanmonstar thatFuture::wait should be deprecated in favor of
custom blocking utilities. For example:

futures::current_thread::wait(my_future)

By having individual functions that handle blocking the current thread, they can
each provide the warnings associated w/ using the functions.

There should also be a way for inappropriate calls to a blocking function to
panic. For example, if Core::run is called in a future on the thread-local
executor, that should result in a panic.

Default thread-safe executor

futures-rs should probably provide an API similar to CurrentThread but for
spawning thread-safe futures. However, since there are many strategies by which
this could be implemented (as opposed to current-thread executor), futures-rs
will probably want to make the actual executor in question pluggable. There
could be a default implementation provided by futures-cpupool.

For example:

let dur = Duration::from_milis(100);

futures::executor::ThreadPool::spawn(
  timer.sleep(dur)
    .and_then(|_| {
      println!("Hello");
      Ok(())
    }));

@aturon
Copy link
Contributor

aturon commented Aug 29, 2017

@carllerche Thanks much for all of this feedback! @alexcrichton and I had a chance to work through it today and have various thoughts and questions.

First, a few easy cases:

  • Swapping out the default reactor. Seems good. One implication is that we need the API that gets the current default handle to return an owned value, rather than &'static Handle, but that's fine.

  • Configuring the global reactor. Seems good. Rayon has some very similar APIs we should be able to crib from.

  • Changes to Core API. Seems good.

  • Default thread-safe executor. Definitely agreed.

Now on to the hairier cases!

Passing Handle

We'd like to propose punting on this question for now, for a couple of reasons:

  • All existing libraries in the Tokio ecosystem currently take Handles, so the migration here would involve (1) a temporary poorly-named but more ergonomic method and (2) renaming things at the next major release of libs. It seems like less churn in the near future to just stick with Handle-based APIs -- which we want to provide anyway! -- and see how things play out.

  • This is really a straightforward case of default arguments. If we take the more conservative approach, we could potentially have a single API surface with sweet names that supports both uses.

One point @alexcrichton made is that in Rust, app authors tend to pretty quickly become lib authors: it's very very common to break up an app into multiple crates. So the tradeoffs here are not totally clear, which is one argument in favor of a more conservative route to start.

CurrentThread

I agree that the footgun with CurrentThread is real and needs to be mitigated. To me, the key mitigation is that you get a panic if you try to CurrentThread::spawn on a threadpool or other executor.

Thus, I'd propose that we introduce a TLS key that essentially tracks "What executor, if any, does this thread belong to?" When using CurrentThread:

  • if the key is unset, the thread is now considered to belong to the CurrentThread executor
  • if the key points to CurrentThread, all is good
  • if the key points to anything else, panic

I believe this will catch the bugs we're concerned with, while keeping the APIs on all sides minimal. (It does mean that executors need to call into some setup APIs, but very few people should be writing new executors.)

wait

I agree that wait is another footgun, but want to be careful not to veer too far on this front. In particular, it's important that it's easy to use the futures-based ecosystem within a synchronous context at the edges.

I think we could mitigate this footgun in two ways:

  • By employing a similar panic strategy to the one proposed above for CurrentThread, i.e. calling wait on any executor results in a panic.
  • By renaming the method to something like block.

General thoughts

I'm curious what you think of the above responses, but also whether you're open to getting an iteration posted in the near future (incorporating your next round of feedback). The RFC process is all about design iteration and getting insights from a broader set of people. So I feel like once we have something that's at least in the ballpark, we're best off opening it up. I'm wary of pushing too hard for perfection before we even open for feedback.

Of course, that's all modulo the framing of the RFC, which needs significant work (and I will do once we have rough consensus on the technical starting point).

@carllerche
Copy link
Member Author

All existing libraries in the Tokio ecosystem currently take Handles

Do you have anything specific in mind?

All existing libraries in the Tokio ecosystem currently take Handles

If this will be available on stable this year, I think that could be a good option. Otherwise, I think that keeping the handle argument is a lost opportunity.

To me, the key mitigation is that you get a panic if you try to CurrentThread::spawn on a threadpool or other executor.

I'm not sure I follow how this handles the issue of the current thread executor never getting run?

the thread is now considered to belong to the CurrentThread executor

How does a thread become unassociated from the current thread executor?

it's important that it's easy to use the futures-based ecosystem within a synchronous context at the edges.

I don't think calling a free function is a huge burden. Especially since the current thread executor will become a thing, wouldn't you want to block the current thread by running the CurrentThread executor? Where as wait is just "block and do nothing while blocked".

Anyway, I'm fine opening up the RFC once the write up looks good.

@carllerche
Copy link
Member Author

Also:

One point @alexcrichton made is that in Rust, app authors tend to pretty quickly become lib authors: it's very very common to break up an app into multiple crates.

I'm not sure I follow what is being said here. The APIs of libs will have a &Handle fn as part of their builder / configurator API no matter what (as far as I can tell). How does transitioning from app -> lib change things?

@seanmonstar
Copy link
Member

It seems like less churn in the near future to just stick with Handle-based APIs

I wouldn't want worry of churn to mean a lesser design. If a new design can exist that also happens to prevent churn, that's nice. But a confusing API is worse, IMO. I quite like the proposal to remove it as a usual argument, and put in builder style, like all the rest of configuring a TcpStream. And with the default arguments idea, we can do the opposite: require using the builder for now if you want to do the weird thing, we can add a default argument later if the language ever supports them.

Thus, I'd propose that we introduce a TLS key that essentially tracks "What executor, if any, does this thread belong to?"

This seems quite tricky, and still able to lead to obscure bugs, since it still sounds like using CurrentThread inside a closure that was moved to a thread::spawn may not panic (since the key isn't set), and the future is just lost to the void.

I agree that wait is another footgun, but want to be careful not to veer too far on this front. In particular, it's important that it's easy to use the futures-based ecosystem within a synchronous context at the edges.

I've had to deal with a large number of reports in the hyper IRC channel and in hyper issues of people using fut.wait() inside a Service and not understanding why the server is locked. If you go look at the Scala Future docs, you'll also notice over and over and over "blocking on a future is strong discouraged".

Also, speaking with Oliver about helping engineers who were just trying to get stuff done using Finagle, it was far too common that someone would wait on a future, and block the event loop.

The footgun here is more of a footrailgun. People will blow off their entire foot.

Maybe a side effort can be to add a lint to clippy to find all instances of calling a blocking API in the context of an event loop thread, and if so, that'd be fantastic.

But we also need to save people's feet from themselves. wait being a method on Future is so so easy to reach for. We should make it harder (I'm not saying impossible). Perhaps as a free function, futures::wait(fut), perhaps with a scarier name, futures::block_until(fut), I don't know.

@aturon
Copy link
Contributor

aturon commented Sep 14, 2017

I've pushed up a very heavily-revised RFC here.

@carllerche
Copy link
Member Author

Closing in favor of #2

@carllerche carllerche closed this Sep 15, 2017
@vitalyd vitalyd mentioned this pull request Sep 20, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants