Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why so many lifetimes? #7342

Closed
dpc opened this issue Sep 9, 2019 · 16 comments
Closed

Why so many lifetimes? #7342

dpc opened this issue Sep 9, 2019 · 16 comments
Labels
A-cargo-api Area: cargo-the-library API and internal code issues

Comments

@dpc
Copy link
Contributor

dpc commented Sep 9, 2019

I'm sorry for bothering you, but I'm interacting with cargo library a lot and I can't help but wonder - why is there so many complications in the API? There are many lifetimes, RefCells (that make stuff !Sync etc.

Like, why can't Config be taken by move so that Workspace and bunch of other structs doesn't require lifetime? Why not just Arc<Mutex<_>> stuff instead of playing with RefCells? I don't think there's much performance wins to be had in the code that will always be blocked on IO anyway. Is there something I am missing here?

@alexcrichton
Copy link
Member

I think the answer here is "Alex thought it was fun to avoid RefCell and Mutex", there's no real technical motivation.

@dpc
Copy link
Contributor Author

dpc commented Sep 9, 2019

Is it something that can be easily fixed? If I submitted a PR would it get accepted? It does affect my code quite a lot, because many times I do want to store these structs and it bubble up in the code.

@alexcrichton
Copy link
Member

To be honest it's not really something that's ever been considered I think. It would be a pretty massive and major refactoring of Cargo, so I suspect this would not be easily fixed. I would encourage discussing with the cargo team (e.g. not just me) about this before proceeding.

@dpc
Copy link
Contributor Author

dpc commented Sep 11, 2019

@alexcrichton
Copy link
Member

Sure! Or trying to get more of them on this thread, it's sort of up to other folks if they want to respond.

@ehuss
Copy link
Contributor

ehuss commented Sep 12, 2019

From my perspective, the cargo library is provided as-is. Although small changes to accommodate other users is occasionally done, I wouldn't want to do any large changes for that purpose. I also personally like the lifetime usage, as it works towards rust's strengths, and would not want to use more arc/mutex whenever possible.

A more interesting question is why you need the cargo library at all? Most users are trying to move off it, and it is desired to provide command-based APIs to support most use cases.

@dpc
Copy link
Contributor Author

dpc commented Sep 12, 2019

I don't understand why would anyone ... enjoy the lifetime usage. I avoid lifetimes like a plague, and only deal with them when I absolutely need to avoid some clones or allocations somewhere. Lifetimes are not a strength of Rust. There are the price we pay for getting both safety and performance. They are lowering ergonomics and causing confusion and complexity.

it is desired to provide command-based APIs to support most use cases.

How is that desired? It would mean that my Rust-related tool, instead of coming with the in built support for Rust, using code that was tested, etc. now will require user to install cargo, deal with the potential problems of spawning processes / shell commands, serializing and de-serializing input and output. And then I'll have to handle support issues for all the people for who something in that chain failed (serialization changed or had a bug is some version of cargo, cargo could not be found, etc). From my perspective, this has only downsides.

Cargo library has actually quite OK API already, and these couple of warts and missing documentation (nice to have, not mandatory, really) are the only thing left to make it quite pleasant to use. I don't mind upgrading and fixing API breakage. At least it happens when I want it to, and can be tested and released when ready.

@Mark-Simulacrum
Copy link
Member

Using cargo as a library has a big problem today: it means you need to closely track latest stable, and even then nightly-using projects will likely break often and be unusable with the cargo-library using tool. I personally deal with this quite a bit in my work on rust-lang/rust and similar so would really prefer that we moved tools to consuming e.g. cargo metadata output and similar as much as possible.

As somewhat of an aside, I personally find lifetimes enormously helpful for determining whether, and how, an API is intended to be used; I haven't worked much with Cargo's API in particular but in both Cargo and other libraries (within rustc, for example) I find Mutex/RefCell more confusing than & or &mut as it's much harder to determine where, and if, any mutation is happening. Arc/Rc are much less prone to this since they're not all that dissimilar from &, so I personally don't mind them all that much.

@dpc
Copy link
Contributor Author

dpc commented Sep 12, 2019

You might be thinking about lifetimes on arguments, which are usually OK. However lifetimes on whole structs are just a pain, binding lifetime of the struct to some other structs making it impossible to store them without some self-rental tricks etc.

The cargo metadata vs library is just dynamic vs static linking, dynamic vs static typing, etc. debate. Both have pros and cons. I am sure for many tools it's better to use cargo metadata. For some, it will be better to use a library. I don't think "we're going to keep the API meh, to force people to use cargo metadata" is a reasonable thing to say. "No time right now." or something works with me, but then again... I could attempt to submit a PR and get rid of at least the Workspace not taking Config by move, which is getting in my way.

I would be surprised if the lifetimes I'm pointing out, aren't making the internal code of cargo unnecessarily complex and difficult to work with. From what I can tell, some lifetimes already disappeared in few recent versions, because I remember some weird <'static> bounds on some structures (Package?) and now I can't find them.

These couple of lifetimes in cargo API doesn't really seem to have a good reason to be there (from what I can tell, that's why opened this ticket), and RefCells could be just upgraded to Mutex to make things Send.

@Mark-Simulacrum
Copy link
Member

If there's transformations that clearly make the code simpler, then I don't see why those would not be accepted.

I disagree on "library is just dynamic vs static linking, dynamic vs static typing" -- cargo metadata is a intentionally stable (via --format-version, we can have breakage if needed) API. I do think that having a crate in the ecosystem that consumes cargo metadata and provides helpful APIs atop that would be quite useful; it may even be feasible to use a similar crate or so internally, I'm not sure.

@dpc
Copy link
Contributor Author

dpc commented Sep 12, 2019

I think the whole cargo metadata discussion is off topic. I'll see if I can get a working PR addressing my issues, and then I will just keep using cargo as a library. 😛

Now, not being able to resist being off topic... Whatever cargo metadata is doing, one could do via hypothetical fn cargo_metadata(version: u32) -> Result<String> function inside cargo the library. The fundamental difference is exactly "dynamic vs static linking". Either the code that handles Cargo.toml is the one I linked during build of my binary release, or is it provided by the end user. Everything else is just a method of delivering a stable interface (in this case via serialization and execution of a binary program). And it seems to me that for some reason most people believe that user's cargo binary always have to be more up to date than the library I linked to, or that the user even wants to have cargo installed in the first place. There can be other reasons to interact with Rust code then to build it.

cargo metadata is probably very convenient for many tools, and almost inevitable for ones that are written in other programming languages, especially ones that can assume the user will have Rust toolchain installed, since they probably want to... write some Rust code. So it is great to have it.

However for a tool that is written in Rust, and can link to libcargo, serialization and deserialization are a drag, and shell as an API is inferior to a native Rust-API. With cargo the library I can do a lot of non-trivial operations that I don't even know if are possible by calling cargo the binary. Also, please be mindful that by encouraging calling cargo the binary instead of linking the library, you're effectively encourage ossification of the cargo shell interface, which (AFAIK) is not versioned.

Using cargo as a library has a big problem today: it means you need to closely track latest stable, and even then nightly-using projects will likely break often and be unusable with the cargo-library using tool.

My suggestion, would be to just handle more gracefully unsupported nightly features etc. and not just "blow up" as soon soon as unknown (and often irrelevant) feature have been spotted. AFAII, these sortfs of problems have already been kind of addressed, somewhat at least. I would also expect this to be less of a problem going forward, as everything matures. And in the very end ... if my tool can't work with a particular package... it's not the end of the world.

@Mark-Simulacrum
Copy link
Member

Agreed on being off topic. I do think a PR here is a good way to move forward.

@dpc
Copy link
Contributor Author

dpc commented Sep 13, 2019

What's the deal with interning?

lazy_static::lazy_static! {

@Eh2406
Copy link
Contributor

Eh2406 commented Sep 17, 2019

We don't just intern strings, we do the same for package_ids #6332 and source_ids #6342. The structure of interning has changed, but the justification is the same as when it was introduced #5121 (I used blame in the github UI to find that PR).

There are 3 reasons interning is grate:

  1. A significant percent of all the strings cargo makes are the string "serde". keeping only one copy of each string significantly reduces memory usage.
  2. Having deduplicated a string, EQ is O(1) not O(n), and does not require a cache miss.
  3. An interned string has O(1) clone, no allocations no O(n) copy and no cache miss. The Resolver needs to copy most of its internals for each of the crates in the dependency tree, so clone speed really matters.

BTW, what project are you working on?

@dpc
Copy link
Contributor Author

dpc commented Sep 17, 2019

Oh. So it's a perf optimization. I wonder if this really makes a difference. I mean... how many dependencies people have etc.? Anyway - thanks for answer.

I am working on https://github.com/crev-dev

@ehuss ehuss added the A-cargo-api Area: cargo-the-library API and internal code issues label Apr 6, 2020
@epage
Copy link
Contributor

epage commented Nov 2, 2023

Considering cargo-the-library is provided as-is, a major refactor like this would likely be driven by the needs of cargo-the-bin, rather than users of the API. It'll also likely be done opportunistically as other feature development is happening. As such, I'm going to go ahead and close this out. If there is a reason for us to reconsider, let us know!

@epage epage closed this as not planned Won't fix, can't repro, duplicate, stale Nov 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cargo-api Area: cargo-the-library API and internal code issues
Projects
None yet
Development

No branches or pull requests

6 participants