
More flexible reactor/executor API #79

Open
sdroege opened this issue Aug 18, 2019 · 17 comments
Labels
api design Open design questions feedback wanted Needs feedback from users

Comments

@sdroege

sdroege commented Aug 18, 2019

While IMHO this shouldn't be a blocker for 1.0, it would be good to start discussing what such an API could look like and what requirements different folks have here.

While this is somewhat related to #60, my main point here is having control over the lifetime of the reactor/executor, being able to run multiple of them, and deciding which one is used when and where. See also rustasync/runtime#42, a similar issue of mine for the runtime crate, on which everything that follows is based.


Currently the executor, reactor, and thread pools are all global and lazily started when first needed; there is no way to e.g. start them earlier, stop them at some point, or run multiple separate ones.

This simplifies the implementation a lot at this point (the code is extremely clean and easy to follow right now!) and is also potentially more performant than passing state around via thread-local storage (as e.g. tokio does).

However, it limits usability in at least two scenarios where I'd like to make use of async-std.

Anyway, here are the reasons why this would be useful (I'll call the reactor/executor/threadpool combination a runtime in the following):

  1. Usage in library crates without interfering with any other futures code that other library crates or the application might use. This could also come with per-thread configuration inside the library crate, e.g. setting thread priorities of the runtime in a way that is meaningful for what this specific library is doing. (See also Ability to have custom initialization for each worker thread rustasync/runtime#8)
  2. Similar to the above, but with more requirements: plugins. A plugin might want to use a runtime internally, but at some point you might want to unload the plugin again. As Rust generally does static linking at this point, each plugin would have its own copy of async-std etc. included, so unloading a plugin also requires being able to shut down the runtime at a specific point and to ensure that none of the plugin's code is still running.
  3. Error isolation. While this is probably done even better with separate processes, being able to compartmentalize the application into parts that don't implicitly share any memory with each other could be useful, also for debuggability.
@vertexclique
Member

Truly, I am into this. I've mentioned this implicitly here:

#14 (comment)

... We should explicitly share part of the thread pool for polling purposes with our own task-management mechanisms. If we don't, writing abstractions on top will become cumbersome over time.

@ghost

ghost commented Aug 19, 2019

This is a feature we should and will implement, but we'll need to explore the design space first. I think the biggest blocker right now is improving our scheduler. Once the scheduler matures, it should become more obvious how to start/stop runtimes manually instead of always relying on a single static one.

I really want to do something radically different from Tokio here. In particular, instead of splitting the runtime into fully independent components like threadpool, reactor, timer, and so on, which can all be individually started/stopped, I'd like to try a more tightly-coupled design where lines between these components become blurry.

@sdroege
Author

sdroege commented Aug 19, 2019

I really want to do something radically different from Tokio here. In particular, instead of splitting the runtime into fully independent components like threadpool, reactor, timer, and so on, which can all be individually started/stopped, I'd like to try a more tightly-coupled design where lines between these components become blurry.

If you use the Runtime API in tokio, you get the whole bundle together. There is API to run each of the components separately, but IMHO that doesn't really need to be exposed.

What ideas do you have for the scheduler and how would that play into the design here?

Basically, something like the API on the tokio Runtime type would already be sufficient; the tricky part is passing the runtime to everything that needs it.

@rw

rw commented Oct 30, 2019

Now that rustasync/runtime has been deprecated and archived, does this issue have increased importance?

My use case is that, when running integration tests, I need to be able to plug in my custom runtime in place of Tokio.

@ghost

ghost commented Nov 1, 2019

How does everyone feel about an API like this?

use async_std::rt::Runtime;
use async_std::task;

// Create a new runtime with default settings.
let rt = Runtime::new();

rt.block_on(async {
    // Since we're inside `rt` now, tasks get spawned onto `rt`.
    task::spawn(async { ... });
});

// What happens here?
drop(rt);

Something I'm unsure about is what should happen when we drop the runtime instance. If there are threads executing tasks at the moment drop(rt) happens, do we block until those tasks are completed? Do we just let them go and signal the threadpool that it should shut down ASAP? What kinds of shutdown procedures do you need?

@yoshuawuyts
Contributor

My use case is that, when running integration tests, I need to be able to plug in my custom runtime in place of Tokio.

@rw What purpose does this custom runtime serve? Are you mocking out any APIs perhaps?

@sdroege
Author

sdroege commented Nov 1, 2019

How does everyone feel about an API like this?

That would work as the most minimal starting point, yes. It would also be good to have some kind of rt.spawn(fut) that does not block, or otherwise a (cloneable) handle to something that can spawn onto this specific runtime.
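To make the handle idea concrete: neither `Runtime` nor such a handle exists in async-std today, so the following is a std-only sketch with plain closures standing in for futures and a single worker thread standing in for the scheduler. All names (`Runtime`, `Handle`) are hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

type Task = Box<dyn FnOnce() + Send>;

/// Hypothetical runtime owning a single worker thread.
struct Runtime {
    sender: mpsc::Sender<Task>,
}

/// Cloneable, sendable handle that spawns onto that runtime.
#[derive(Clone)]
struct Handle {
    sender: mpsc::Sender<Task>,
}

impl Runtime {
    fn new() -> Runtime {
        let (sender, receiver) = mpsc::channel::<Task>();
        // The worker loop ends once the runtime and all handles are dropped.
        thread::spawn(move || {
            for task in receiver {
                task();
            }
        });
        Runtime { sender }
    }

    /// The non-blocking `rt.spawn(fut)` from the comment above,
    /// with a closure standing in for the future.
    fn spawn(&self, task: impl FnOnce() + Send + 'static) {
        self.sender.send(Box::new(task)).unwrap();
    }

    fn handle(&self) -> Handle {
        Handle { sender: self.sender.clone() }
    }
}

impl Handle {
    fn spawn(&self, task: impl FnOnce() + Send + 'static) {
        self.sender.send(Box::new(task)).unwrap();
    }
}

fn main() {
    let rt = Runtime::new();
    let (tx, rx) = mpsc::channel();
    let handle = rt.handle();
    // A handle can be cloned and moved to another thread.
    thread::spawn(move || handle.spawn(move || tx.send(2 + 2).unwrap()));
    assert_eq!(rx.recv().unwrap(), 4);
}
```

Backing the handle by a channel sender makes `Clone` trivial and keeps spawning non-blocking; a real design would additionally return a join handle.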

What kinds of shutdown procedures do you need?

Personally, shutting down ASAP on drop would be sufficient. The active time of the runtime would be defined by the given future, and any other shutdown behaviour should be achievable by writing that future in a specific way.

You probably do not want to wait for all spawned tasks, though: there might easily be e.g. interval timers (running forever) or timeouts (far in the future) scheduled that you don't really want to wait on.

@sdroege
Author

sdroege commented Nov 1, 2019

@stjepang Or maybe even have rt.block_on() consume the runtime by value, so that it's clear that after it returns nothing else is going to happen anymore.

@vorner

vorner commented Nov 1, 2019

I've found that being able to reuse the runtime for multiple iterative block_ons is a nice optimisation (I've used that with tokio). Actually, tokio even allows concurrent block_ons from multiple threads.

As for the drop… there should be some way to wait for the runtime to drain itself before dropping it. And I think that should be the default, because oftentimes leaving a thread actively doing something while the main thread shuts down leads to weird behaviour and bugs. If one wants the runtime to outlive the main thread, it would be fine to mem::forget(rt) instead.
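The drain-on-drop default and the `mem::forget` escape hatch can be sketched with a std-only toy runtime; closures stand in for futures, and the `Runtime` type is hypothetical:

```rust
use std::mem;
use std::sync::mpsc;
use std::thread;

type Task = Box<dyn FnOnce() + Send>;

/// Hypothetical runtime whose Drop drains queued tasks before returning.
struct Runtime {
    sender: Option<mpsc::Sender<Task>>,
    worker: Option<thread::JoinHandle<()>>,
}

impl Runtime {
    fn new() -> Runtime {
        let (sender, receiver) = mpsc::channel::<Task>();
        let worker = thread::spawn(move || {
            // Runs every task that was queued before the channel closed.
            for task in receiver {
                task();
            }
        });
        Runtime { sender: Some(sender), worker: Some(worker) }
    }

    fn spawn(&self, task: impl FnOnce() + Send + 'static) {
        self.sender.as_ref().unwrap().send(Box::new(task)).unwrap();
    }
}

impl Drop for Runtime {
    fn drop(&mut self) {
        // Closing the channel lets the worker finish the queued tasks,
        // and `join` blocks until it has: drain, then shut down.
        drop(self.sender.take());
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let rt = Runtime::new();
    rt.spawn(move || tx.send("done").unwrap());
    drop(rt); // waits for the queued task to run
    assert_eq!(rx.try_recv().unwrap(), "done");

    // To detach instead: leak the runtime, keeping its thread alive
    // until the process exits.
    mem::forget(Runtime::new());
}
```

Note the trade-off being discussed above: this drains only tasks that can finish; a task equivalent to a forever-ticking interval timer would make `drop` block indefinitely.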

@ghost

ghost commented Nov 1, 2019

because oftentimes leaving a thread that actively does something while the main thread shuts down leads to weird behaviour and bugs.

What if we leave only idling threads that have no tasks to execute when the main thread shuts down?

@vorner

vorner commented Nov 1, 2019

I don't know. I mean, if nothing runs it might be fine ‒ at least from the Rust user's point of view. But I've heard it might be problematic on Windows ‒ I don't know any details there, though.

If the thread is already synchronized to know it should stay idle, is it a problem to shut it down instead?

@ghost added the feedback wanted (Needs feedback from users) label Nov 2, 2019
@dignifiedquire
Member

Adding this here since #137 points here. One blocker for adopting async-std in some places for me is that I have no control over the thread pools at all. The two minimum things I would need are a) setting a maximum thread count and b) setting properties on the threads, like their niceness level. The way rayon solves this works quite well for me in this regard.

@skade
Collaborator

skade commented Nov 6, 2019

@dignifiedquire To my reading, that describes a separate threadpool library with async/.await integration though?

@dignifiedquire
Member

@skade While that would be great, it's not what I meant. What I need is to be able to control the number of threads, and their properties, that anything in my Rust stack creates, including the executor/runtime.

I don't need direct access to the thread pools, I just need to control them. For manual thread spawning I would still use a different solution.

This is hard enough today, with a bunch of libraries already creating threads here and there however they see fit, which is why I hope async-std can help stop that trend :)
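For the thread-property side: rayon's `spawn_handler` hands the caller full control over how each pool thread is built. With the standard library alone, the available knobs are those on `std::thread::Builder` (name and stack size; niceness would need platform APIs such as `setpriority(2)`, not shown here). A minimal sketch of the kind of per-thread control being asked for:

```rust
use std::thread;

fn main() {
    // Configure the worker thread before it starts. The name used here
    // is just an example.
    let builder = thread::Builder::new()
        .name("my-runtime-worker-0".into())
        .stack_size(512 * 1024);

    let handle = builder
        .spawn(|| {
            // The name is visible from inside the thread (and in debuggers
            // and profilers, which is half the point of setting it).
            thread::current().name().map(str::to_owned)
        })
        .expect("failed to spawn worker");

    assert_eq!(
        handle.join().unwrap().as_deref(),
        Some("my-runtime-worker-0")
    );
}
```

A runtime exposing a rayon-style `spawn_handler` hook would let the embedding application run exactly this kind of builder code (plus platform priority calls) for every worker it creates.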

@ghost

ghost commented Nov 6, 2019

@dignifiedquire Can you say more on how you want to set properties on threads and how Rayon solves the problem?

Is this the method you would be using in Rayon? https://docs.rs/rayon/1.2.0/rayon/struct.ThreadPoolBuilder.html#method.spawn_handler

You've also mentioned you need to control the number of threads. But I wonder what we should do in the case of spawn_blocking(), where the number of threads is dynamic and we essentially spawn an unlimited number of threads on demand?

@dignifiedquire
Member

This is roughly how I plan to use rayon's capabilities: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=38ed5e90ea0fff7ca5cea1b196118bf3 (plus using spawn_handler to set properties).

My expectation would be that any thread spawning is limited to a maximum that I set, and as such it would block until there is a free thread to use / there is room to spawn another one.
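The blocking behaviour described above can be sketched with a condition variable gating thread creation. The `Limiter` type is hypothetical; `spawn` blocks while the cap is reached, and each finished thread frees its slot and wakes one waiter:

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical cap on runtime-spawned threads: `spawn` blocks while
/// `max` threads are already running.
struct Limiter {
    max: usize,
    state: Arc<(Mutex<usize>, Condvar)>,
}

impl Limiter {
    fn new(max: usize) -> Limiter {
        Limiter { max, state: Arc::new((Mutex::new(0), Condvar::new())) }
    }

    fn spawn(&self, task: impl FnOnce() + Send + 'static) -> thread::JoinHandle<()> {
        let (lock, cvar) = &*self.state;
        // Block until we are below the cap.
        let mut running = lock.lock().unwrap();
        while *running >= self.max {
            running = cvar.wait(running).unwrap();
        }
        *running += 1;
        drop(running);

        let state = Arc::clone(&self.state);
        thread::spawn(move || {
            task();
            // Free our slot and wake one blocked `spawn` call.
            let (lock, cvar) = &*state;
            *lock.lock().unwrap() -= 1;
            cvar.notify_one();
        })
    }
}

fn main() {
    // At most 2 of these 8 tasks run concurrently; the 3rd `spawn`
    // call blocks until one of the first two threads finishes.
    let limiter = Limiter::new(2);
    let handles: Vec<_> = (0..8)
        .map(|i| limiter.spawn(move || println!("task {} ran", i)))
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}
```

This is exactly the semantics that makes the dynamic spawn_blocking() case tricky: with a hard cap, an unbounded demand for blocking threads turns into callers waiting at the cap.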

@vavrusa

vavrusa commented Nov 18, 2019

I would be interested in this as well. The 0.3 futures have a decent Spawn mechanism for this. Perhaps async-std's task module could implement something like:

pub fn spawn_on<F, T, S>(spawner: S, future: F) -> JoinHandle<T>
where
    F: Future<Output = T> + Send + 'static,
    T: Send + 'static,
    S: Spawn;

The caller would be responsible for providing an appropriate implementation (e.g. a spawner that can handle blocking tasks, if the task is blocking).
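A std-only sketch of this shape: the `Spawn` trait below is a local stand-in for `futures::task::Spawn`, closures stand in for futures, and the `JoinHandle` return value is omitted for brevity. The caller chooses the spawner, so a blocking-aware one can be substituted where needed.

```rust
use std::thread;

/// Local stand-in for `futures::task::Spawn`, using boxed closures
/// instead of boxed futures to keep the sketch self-contained.
trait Spawn {
    fn spawn(&self, task: Box<dyn FnOnce() + Send>);
}

/// Example implementor that gives every task its own OS thread; a real
/// implementation would hand the task to an executor instead.
struct ThreadSpawner;

impl Spawn for ThreadSpawner {
    fn spawn(&self, task: Box<dyn FnOnce() + Send>) {
        thread::spawn(task);
    }
}

/// The proposed entry point: generic over whatever spawner the caller
/// provides, so libraries never touch a global runtime.
fn spawn_on<S: Spawn>(spawner: &S, task: impl FnOnce() + Send + 'static) {
    spawner.spawn(Box::new(task));
}

fn main() {
    let (tx, rx) = std::sync::mpsc::channel();
    spawn_on(&ThreadSpawner, move || tx.send("ran").unwrap());
    assert_eq!(rx.recv().unwrap(), "ran");
}
```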


8 participants