Adds optional async interface #126

xfbs · 2023-05-04T17:57:46Z

Hey!

In reference to #110, this adds an optional async feature which, when enabled, enables the AsyncDependencyProvider trait.

This PR also cleans up the resolver code somewhat (I like it when code is not nested too deeply, as it makes it easier for me to follow).

mpizenberg · 2023-05-04T18:11:35Z

Cool! I didn’t know async traits were a thing yet. Would you mind splitting this PR in two, so we can first review the code cleanup and then the async part?

xfbs · 2023-05-04T18:13:26Z

For sure! Sounds like a good idea. I'll pull the refactoring out into a separate PR. Will also add some tests for the async stuff in here.

mpizenberg · 2023-05-04T18:14:31Z

Also, this PR would need to target the dev branch, not the release one.

xfbs · 2023-05-04T18:15:11Z

Ha! I did not even know there was a dev branch (used to PRing against master with a stable branch for the release, hehe).

Eh2406

A couple of interesting things to consider before the next version of this.

Eh2406 · 2023-05-05T17:50:58Z

src/solver.rs

+/// Main function of the library.
+/// Finds a set of packages satisfying dependency bounds for a given package + version pair.
+#[cfg(feature = "async")]
+pub async fn resolve_async<P: Package, V: Version>(


It seems unfortunate for us to need to completely duplicate all of this code.

xfbs · 2023-05-07

I definitely agree.

My rough idea is to refactor this method into smaller methods, so that only the async parts need to be duplicated.

I will wait for the refactoring PR (#127) to go through before starting this however.

Eh2406 · 2023-05-08

Taking long methods and splitting them up seems like good value all on its own. I look forward to seeing your work on it.
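A minimal sketch of the refactoring idea discussed in this thread, assuming the bookkeeping can be separated from the provider calls. `Core`, `Step`, `walk_sync`, and `walk_async` are hypothetical names for illustration and are not part of pubgrub. The shared logic lives in one place, and the synchronous and asynchronous drivers differ only in how they call the provider.

```rust
use std::future::Future;

/// What the shared (synchronous) core asks its driver to do next.
enum Step {
    /// The core needs the dependencies of this package.
    NeedDeps(u32),
    /// Nothing is left to explore.
    Done,
}

/// Hypothetical stand-in for the solver state; not pubgrub's real `State`.
struct Core {
    pending: Vec<u32>,
    visited: Vec<u32>,
}

impl Core {
    fn new(root: u32) -> Self {
        Core { pending: vec![root], visited: Vec::new() }
    }

    /// Pure bookkeeping, shared by both drivers.
    fn next_step(&mut self) -> Step {
        while let Some(p) = self.pending.pop() {
            if !self.visited.contains(&p) {
                self.visited.push(p);
                return Step::NeedDeps(p);
            }
        }
        Step::Done
    }

    /// Also shared: record what the provider returned.
    fn add_deps(&mut self, deps: Vec<u32>) {
        self.pending.extend(deps);
    }
}

/// Synchronous driver: only the provider call differs.
fn walk_sync(root: u32, mut deps_of: impl FnMut(u32) -> Vec<u32>) -> Vec<u32> {
    let mut core = Core::new(root);
    while let Step::NeedDeps(p) = core.next_step() {
        let deps = deps_of(p);
        core.add_deps(deps);
    }
    core.visited
}

/// Asynchronous driver: the same loop with an `.await` on the provider call.
async fn walk_async<F, Fut>(root: u32, mut deps_of: F) -> Vec<u32>
where
    F: FnMut(u32) -> Fut,
    Fut: Future<Output = Vec<u32>>,
{
    let mut core = Core::new(root);
    while let Step::NeedDeps(p) = core.next_step() {
        let deps = deps_of(p).await;
        core.add_deps(deps);
    }
    core.visited
}
```

In this shape the duplicated code is limited to the two short driver loops; whether the real resolver can be split this cleanly is an open question in the thread.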

Eh2406 · 2023-05-05T17:53:17Z

src/solver.rs

+            .should_cancel()
+            .map_err(|err| PubGrubError::ErrorInShouldCancel(err))?;
+
+        state.unit_propagation(next)?;


This function call is usually cheap, but sometimes it will need to do a lot of work (potentially an exponential amount) and will therefore block the asynchronous runner. On the other hand, if we do some kind of spawn_blocking call here, the overhead will be significant, as the work is almost always trivial.

I think this is a profound and serious issue we are going to need to explore before we can merge something like this.

I was very lucky to get @carllerche's opinion on this. The two alternatives he suggested were:

Have unit_propagation do a fixed amount of work and, if there's more work to be done, do the rest in a spawn_blocking. In theory, this is the best solution. In practice, I'm not sure that we can clearly identify how much work is being done by each sub-call within unit_propagation, so I don't know if it's practical. We would also need extensive tests to verify that we don't accidentally have a loop occurring before we call spawn_blocking.

Do not provide an async interface. Callers can block_in_place before calling resolve, and call tokio::runtime::Handle::block_on within the callbacks.

I'm not sure which solution is more unpleasant/inefficient, but it's comforting to know that it is a real problem.
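The first alternative could be sketched roughly as follows, assuming the work can be counted in discrete steps (which, as noted above, may not hold for the real unit_propagation). `propagate_with_budget` and `propagate` are hypothetical names, the queue of integers is a placeholder for real propagation work, and `std::thread::spawn` stands in for a runtime's spawn_blocking so that the example needs only the standard library.

```rust
use std::thread;

/// Hypothetical unit of propagation work: drain a queue of pending items.
/// Returns `true` if the queue was emptied within `budget` steps.
fn propagate_with_budget(queue: &mut Vec<u64>, done: &mut Vec<u64>, budget: usize) -> bool {
    for _ in 0..budget {
        match queue.pop() {
            Some(item) => done.push(item),
            None => return true,
        }
    }
    queue.is_empty()
}

/// Run cheap cases inline and hand the remainder to another thread.
/// The returned flag reports whether the slow path was taken.
fn propagate(mut queue: Vec<u64>, budget: usize) -> (Vec<u64>, bool) {
    let mut done = Vec::new();
    if propagate_with_budget(&mut queue, &mut done, budget) {
        return (done, false); // fast path: no thread was needed
    }
    // Slow path. `std::thread::spawn` is an assumption standing in for a
    // runtime's blocking pool; async code would await the handle instead
    // of calling `join`.
    let handle = thread::spawn(move || {
        propagate_with_budget(&mut queue, &mut done, usize::MAX);
        done
    });
    (handle.join().expect("worker panicked"), true)
}
```

The budget keeps the common, trivial case free of thread overhead, at the cost of having to pick a threshold and to count work reliably.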

xfbs · 2023-05-07

To be honest, I did not know of the block_in_place approach, but that sounds like something that might be easy to use and still provide the ability to interface with async code.

Maybe one idea is to add this to the documentation with some code examples? This should be a good starting point for other people that interface with async stuff.

Eh2406 · 2023-05-08

My main concern with that approach is how it works with other executors. It can work well for Tokio, but are there equivalents in the others? What about when running as Wasm in a browser?

Anyway, 110% on documentation. We should have a section in the book about how to shoehorn asynchronous communication into the current API, one that documents, to the best of our ability, the available options and their advantages and disadvantages. That would unblock people without waiting for us to release a new version of the crate.
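As a starting point for such documentation, here is a minimal sketch of the second alternative: keep the API synchronous and block on async work inside the callbacks. To stay runtime-independent it uses a small std-only `block_on` as a stand-in for `tokio::runtime::Handle::block_on`. `SyncProvider`, `Bridge`, and `fetch_dependencies` are hypothetical names and do not reflect pubgrub's actual trait.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the blocked thread when the future can make progress.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal std-only `block_on`; a stand-in for a real runtime's equivalent.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical synchronous callback trait, loosely shaped like a
/// dependency provider.
trait SyncProvider {
    fn dependencies(&self, package: &str) -> Vec<String>;
}

/// Hypothetical async lookup (imagine a registry request).
async fn fetch_dependencies(package: &str) -> Vec<String> {
    match package {
        "root" => vec!["a".to_string(), "b".to_string()],
        _ => Vec::new(),
    }
}

struct Bridge;

impl SyncProvider for Bridge {
    fn dependencies(&self, package: &str) -> Vec<String> {
        // The synchronous callback blocks until the async lookup finishes.
        block_on(fetch_dependencies(package))
    }
}
```

With Tokio, the caller would wrap the call to resolve in `tokio::task::block_in_place` and use `tokio::runtime::Handle::block_on` inside the callback instead of the hand-written `block_on`. Note that `block_in_place` requires the multi-threaded runtime, which is part of the portability concern raised above.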

Eh2406 · 2023-05-05T17:57:28Z

src/solver.rs

+        };
+
+        let decision = dependency_provider
+            .choose_package_version(potential_packages)


It would be really nice if the asynchronous calls to get package information could be run in parallel, allowing this await to return when the first of them returns. Of course, this would make reproducibility much more complicated, and it would need to be documented carefully. But it should make many real-world cases much more efficient.
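A rough sketch of the "first to return" idea, using threads and a channel from the standard library as a stand-in for concurrent futures. `fetch` and `spawn_lookups` are hypothetical names, and the latencies are simulated with sleeps. It also illustrates the reproducibility concern: the order in which results arrive depends on timing, not on the order in which the requests were issued.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical lookup with a simulated per-package latency in milliseconds.
fn fetch(package: &'static str, latency_ms: u64) -> (&'static str, u64) {
    thread::sleep(Duration::from_millis(latency_ms));
    (package, latency_ms)
}

/// Start every lookup at once; results arrive on the channel in
/// completion order, so the caller can act on the first one immediately.
fn spawn_lookups(requests: Vec<(&'static str, u64)>) -> mpsc::Receiver<(&'static str, u64)> {
    let (tx, rx) = mpsc::channel();
    for (package, latency_ms) in requests {
        let tx = tx.clone();
        thread::spawn(move || {
            // Ignore send errors: the receiver may have stopped listening.
            let _ = tx.send(fetch(package, latency_ms));
        });
    }
    // The original sender is dropped here, so the channel closes once
    // every worker has finished.
    rx
}
```

A caller would take the first result with `rx.recv()` and continue resolving while the remaining lookups finish in the background. To keep resolution reproducible, one option is to use early results only to warm a cache and still make decisions in a fixed order.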

Adds optional async interface

99e7bb4

mpizenberg closed this May 4, 2023

Eh2406 reviewed May 5, 2023


