RFC: Nested Cargo packages #3452
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,186 @@ | ||
- Feature Name: `nested_publish` | ||
- Start Date: 2023-06-30 | ||
- RFC PR: [rust-lang/rfcs#3452](https://github.com/rust-lang/rfcs/pull/3452) | ||
- Rust Issue: ... | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Allow Cargo packages to be bundled within other Cargo packages when they are published (not just in unpublished workspaces). | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
There are a number of reasons why a Rust developer currently may feel the need to create multiple library crates, and therefore multiple Cargo packages (since one package contains at most one library crate). These multiple libraries could be: | ||
|
||
* A trait declaration and a corresponding derive macro (which must be defined in a separate proc-macro library). | ||
Review comment: Should This was talked about a little at #2224 (comment) |
||
* A library that uses a build script that uses another library or binary (e.g. for precomputation or bindings generation). | ||
* A logically singular library broken into multiple parts to speed up compilation. | ||
Comment on lines +16 to +18
Review comment: While I understand the value of this RFC, and these pain points are truly painful, I am a bit uncomfortable about the complexity exposed to Cargo users. With the upcoming public/private dependencies in Edition 2024, the situation becomes quite awkward.

    # in a `foo` package
    foo-priv-types = { path = "priv-types", public = false, publish = "nested" }
    foo-core = { version = "0.1", path = "core", public = true }
    foo-util = { version = "0.1", path = "util", public = false }
    foo-derive = { path = "derive", public = true, publish = "nested" }

The above example is very likely to happen, but it is not immediately clear the mixed meaning of
I may have over-complicated the situation, but it does introduce cognitive overhead when different concepts are combined. I don't know how complex inline-module would be, but that might be a chance to change the compilation unit from crate to module (don't bash on my head, just an idea).
Review comment: (Not to mention when open namespaces come and join the party. While it's a pretty independent feature, the learning curve doesn't look too good when everything gathers…)
Review comment: For me, the big concern with
Comment on lines +14 to +18
Review comment: Another use case is that this provides another way for us to break dependency cycles that involve Currently, the solution involves dropping the dependency on publish (by not specifying a With this feature, we can instead nest the path dev-dependency.
Review comment: Similarly, a |
||
|
||
Currently, developers must publish these packages separately. This has several disadvantages (see the [Rationale](#rationale-and-alternatives) section for further details): | ||
|
||
* Clutters the public view of the registry with packages not intended to be usable on their own, and which may even become obsolete as internal architecture changes. | ||
* Requires multiple `cargo publish` operations (this could be fixed with bulk publication) and writing public metadata for each package. | ||
* Can result in semver violations and thus compilation failures, due to the developer not thinking about semver compatibility within the group. | ||
Review comment: While some of these indirectly touch on it, one I'd explicitly add is sheer boilerplate. In working on #3424, one of the things I've noticed is the commentary from people who are looking to further drop boilerplate. This also came up in a recent blogpost and HN discussion of it
Review comment: This proposal will still require
Review comment: Any of the standard manifest fields that crates.io requires. Granted workspace inheritance helps with those (which will automatically be used in Combine that with "cargo script" (if we support
Review comment: I couldn't find a definitive statement of exactly which fields |
||
|
||
This RFC will allow developers to avoid all of these inconveniences and hazards by publishing a single package. | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
By default (and always, prior to this RFC's implementation): | ||
|
||
* If your package contains any sub-packages, Cargo [excludes](https://doc.rust-lang.org/cargo/reference/manifest.html#the-exclude-and-include-fields) them from the `.crate` archive file produced by `cargo package` and `cargo publish`. | ||
* If your package contains any non-`dev` dependencies which do not give a `version = "..."`, it cannot be published to `crates.io`. | ||
|
||
* If your package contains `[dev-dependencies]` which do not give a `version = "..."`, they are stripped out on publication. | ||
|
||
(By “**sub-package**” we mean a package (directory with `Cargo.toml`) which is a subdirectory of another package. We shall call the outermost such package, the package being published, the “**parent package**”.) | ||
|
||
|
||
You can change this default by placing in the manifest (`Cargo.toml`) of a sub-package: | ||
|
||
```toml | ||
[package] | ||
publish = "nested" | ||
``` | ||
|
||
If this is done, Cargo's behavior changes as follows: | ||
|
||
* If you publish the parent package, the sub-package is included in the `.crate` file (unless overridden by explicit `exclude`/`include`) and will be available to the parent package whenever the parent package is downloaded and compiled. | ||
* The parent package (and other sub-packages) may have `path =` dependencies upon the sub-package. (Such dependencies must not have a `version =` or `git =`; that is, the `path` must be the _only_ source for the dependency.) | ||
* You cannot `cargo publish` the sub-package, just as if it had `publish = false`. (This is a safety measure against accidentally publishing the sub-package separately when this is not intended.) | ||
|
||
Nested sub-packages may be freely placed within other nested sub-packages. | ||
|
||
When a group of packages is published in this way, and depended on, this has a number of useful effects (which are not things that Cargo explicitly implements, just consequences of the system): | ||
|
||
* The packages are a single unit for all versioning purposes; there is no way for a version mismatch to arise since all the code was published together. Version resolution does not apply (in the same way that it does not for any other `path =` dependency). | ||
* The sub-package is effectively “private”: it cannot be named by any other package on `crates.io`, only by its parent package and sibling sub-packages. | ||
|
||
|
||
## Example: trait and derive macro | ||
|
||
Suppose we want to declare a trait-and-derive-macro package. We can do this as follows. The parent package would have this manifest `foo/Cargo.toml`: | ||
|
||
```toml | ||
[package] | ||
name = "foo" | ||
version = "0.1.0" | ||
edition = "2021" | ||
publish = true | ||
|
||
[dependencies] | ||
foo-macros = { path = "macros" } # newly permitted | ||
``` | ||
|
||
The sub-package manifest `foo/macros/Cargo.toml`: | ||
|
||
```toml | ||
[package] | ||
name = "foo-macros" # matches the dependency key in the parent manifest; this name need not be claimed on crates.io | ||
version = "0.1.0" # this version is not used for dependency resolution | ||
edition = "2021" | ||
publish = "nested" # new syntax | ||
|
||
[lib] | ||
proc-macro = true | ||
``` | ||
|
||
Then you can `cargo publish` from within the parent `foo` directory, and this will create a single `foo` package on `crates.io`, with no `macros` (or `foo-macros`) package visible except when inspecting the source code or in compilation progress messages. | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
|
||
The following changes must be made across Cargo and `crates.io`: | ||
|
||
|
||
* **Manifest schema** | ||
* The Cargo manifest now allows `"nested"` as a value for the `package.publish` key. | ||
Review comment: As it is already a term in
Review comment: I get the similarity, but vendoring normally means making a copy of a package that is available by other means, and one of the design goals here is to discourage any such copies existing (because they are likely to be accidental, and if they aren't, then they may create the same kinds of problems as multiple major versions do). I think reusing the term would create more confusion than it avoids.
Review comment: The perspective I was using when I came up with "vendor" was that instead of getting a dependency through the registry, we are copying it into our package. It's not vendored within the repo but in the This also ties into whether we should generalize this across dependency sources, at which point it feels like it becomes even more applicable.
Review comment: I still don't think that vendoring is the right term, especially as one of the things that came up during my review of prior art is the concept of using nested-packages-or-whatever for vendoring, e.g. to fix a bug before upstream accepts the patch. I think these need to be kept distinct ideas. That being considered, what do we need to do here with the RFC text to resolve this thread? Should there be an unresolved question for terminology, or can we just proceed as-is?
Review comment: I think "vendoring" isn't all-inclusive of the situations where one might use this. Usually vendoring refers to "taking some third party dependency's entire source code and jamming it into some
Review comment: I haven't seen any "seconding" of support for the
Review comment: At least keeping this unresolved to centralize any name bikeshedding conversations |
||
* **`cargo package` & `cargo publish`** | ||
* Should refuse to publish a package if that package (not its sub-packages) has `publish = "nested"`. | ||
* Exclude/include rules should, upon finding a sub-package, check if it is `publish = "nested"` and not automatically exclude it. Instead, they should treat it like any other subdirectory; in particular, it should be affected by explicitly specified exclude/include rules. | ||
* Nested `Cargo.toml`s should be normalized in the same way the root `Cargo.toml` is, if they declare `publish = "nested"`, and not if they do not. | ||
* This avoids modifying the publication behavior for existing packages, even if they contain project templates or invoke `cargo` to compile sub-packages to probe the behavior of the compiler. | ||
* If the nested `Cargo.toml` has a syntax error such that its `package.publish` value cannot be determined, then if it is depended upon, emit an error; if it is not, emit a warning and do not normalize it. | ||
* **`crates.io`** | ||
* Should allow `path` dependencies that were previously prohibited, at least provided that the named package in fact exists in the `.crate` archive file. The path must not contain any upward traversal (`../`) or other hazardous or non-portable components. | ||
|
||
* **Build process** | ||
* Probably some messages will need to be adjusted; currently, `path` dependencies' full paths are always printed in progress messages, but they would be long noise here (`/home/alice/.cargo/registry/src/index.crates.io-6f17d22bba15001f/...`). Perhaps progress for sub-packages could look something like “`Compiling foo/macros v0.1.0`”. | ||
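The `cargo package` & `cargo publish` rules in the list above amount to a small decision table. The following is a minimal sketch of that table only, not Cargo's implementation; the `Publish` and `Action` types and the `classify` function are hypothetical names introduced for illustration. The RFC text does not say whether an unparseable, non-depended-upon sub-package ends up in the archive, so the sketch records only the warning and the skipped normalization for that case.

```rust
/// Hypothetical, simplified model of a sub-package's `package.publish` value.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Publish {
    /// `publish = "nested"`
    Nested,
    /// Anything else: `true`, `false`, a registry list, or absent.
    Other,
}

/// What the packaging step does with a sub-package directory (hypothetical names).
#[derive(Debug, PartialEq)]
enum Action {
    /// Treat it like any other subdirectory (explicit include/exclude rules
    /// still apply) and normalize its manifest.
    TreatAsDirectoryAndNormalize,
    /// Existing behavior: the sub-package is automatically excluded.
    AutoExclude,
    /// The manifest could not be parsed and the sub-package is depended upon.
    Error,
    /// The manifest could not be parsed and nothing depends on it.
    WarnAndSkipNormalization,
}

/// `publish` is `None` when a syntax error prevents reading `package.publish`.
fn classify(publish: Option<Publish>, depended_upon: bool) -> Action {
    match (publish, depended_upon) {
        (Some(Publish::Nested), _) => Action::TreatAsDirectoryAndNormalize,
        (Some(Publish::Other), _) => Action::AutoExclude,
        (None, true) => Action::Error,
        (None, false) => Action::WarnAndSkipNormalization,
    }
}

fn main() {
    println!("{:?}", classify(Some(Publish::Nested), true));
    println!("{:?}", classify(Some(Publish::Other), false));
    println!("{:?}", classify(None, true));
    println!("{:?}", classify(None, false));
}
```

Keeping `AutoExclude` as the outcome for every non-`"nested"` value mirrors the stated goal of not changing publication behavior for existing packages.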
|
||
The presence or absence of a `[workspace]` has no effect on the new behavior, just as it has no effect on existing package publication. | ||
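The `crates.io` restriction on `path` dependencies described in the list above (no upward traversal, nothing that escapes the archive) could be checked along the following lines. This is a sketch under assumptions: `is_safe_nested_path` is a hypothetical helper, not part of Cargo or crates.io, and it covers only relative-path and `..` checks using the standard library. It does not check that the named package exists in the `.crate` file, nor other non-portable components such as reserved file names.

```rust
use std::path::{Component, Path};

/// Returns true if `dep_path` (the `path = "..."` value of a dependency)
/// is relative and contains no `..`, root, or drive-prefix components.
/// Hypothetical helper for illustration only.
fn is_safe_nested_path(dep_path: &str) -> bool {
    let mut saw_normal = false;
    for component in Path::new(dep_path).components() {
        match component {
            // An ordinary directory name such as `macros`.
            Component::Normal(_) => saw_normal = true,
            // A leading `./` is harmless.
            Component::CurDir => {}
            // `..`, `/`, or a Windows prefix would leave the package root.
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => return false,
        }
    }
    // Reject empty paths and paths consisting only of `.`.
    saw_normal
}

fn main() {
    for p in ["macros", "./crates/core", "../sibling", "/etc/passwd", ""] {
        println!("{p:?}: {}", is_safe_nested_path(p));
    }
}
```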
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
* This increases the number of differences between “Cargo package (on disk)” and “Cargo package (that may be published in a registry, or downloaded as a unit)” in a way which may be confusing; it would be good if we had different words for these two entities, but we don't. | ||
* If Cargo were to add support for multiple libraries per package, that would be largely redundant with this feature. | ||
|
||
* It is not possible to publish a bug fix to a sub-package without republishing the entire parent package. | ||
|
||
* Suppose `foo` has a sub-package `foo-core`. Multiple major versions of `foo` cannot share the same instance of `foo-core` as they could if `foo-core` were separately published and the `foo`s depended on the same version of `foo-core`. Thus, choosing nested publication may lead to type incompatibilities (and greater compile times) that would not occur if the same libraries had been separately published. | ||
* If this situation comes up, it can be recovered from by newly publishing `foo-core` separately (as would have been done if nested publishing were not used) and using the [semver trick](https://github.com/dtolnay/semver-trick) to maintain compatibility. | ||
Review comment: The semver trick does solve this problem, but only when those packages bump their MSRV, so this feature won't get adopted until some point. Some popular crates using proc-macros hold a relatively conservative MSRV. The adoption rate might be mildly annoying in the short to mid term. |
||
|
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
|
||
The reason for doing anything at all in this area is that publishing multiple packages is often a bad solution to the problems that motivate it; in particular: | ||
|
||
* Non-lockstep versioning risk: If you publish `foo 1.0.0` and `foo-macros 1.0.0`, then later publish `foo 1.1.0` and `foo-macros 1.1.0`, then it is _possible_ for users' `Cargo.lock`s to get into a state where they select `foo-macros 1.1.0` and `foo 1.0.0`, and this then breaks because `foo-macros 1.1.0` assumed that items from `foo 1.1.0` would be present. Arguably, this is a deficiency in the proc-macro system (`foo-macros` has a _de facto_ dependency on `foo` but does not declare it), but not one that is likely to be corrected any time soon. This can be worked around by having `foo` specify an exact dependency `foo-macros = "=1.0.0"`, but this is a subtlety that library authors do not automatically think of; semver is easy to get wrong silently. | ||
* The crates.io registry may be cluttered with many packages that are not relevant to users browsing packages. (Of course, there are many other reasons why such clutter will be found.) | ||
* When packages are implementation details, publishing them separately makes a permanent mark on the `crates.io` registry even if the implementation of the parent package stops needing that particular subdivision. By allowing sub-packages we can allow package authors to create whatever sub-packages they imagine might be useful, and delete them in later versions with no consequences. | ||
* It is possible to depend on a published package that is intended as an implementation detail. Ideally, library authors would document this clearly and library users would obey the documentation, but that doesn't always happen. By allowing nested packages, we introduce a simple “visibility” system that is useful in the same way that `pub` and `pub(crate)` are useful within Rust crates. | ||
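To make the non-lockstep versioning risk in the list above concrete, here is a simplified sketch of how a default (caret) requirement and an exact (`=`) requirement differ. It is not the `semver` crate or Cargo's resolver: `Version`, `caret_matches`, and `exact_matches` are hypothetical names, versions are modeled as plain tuples, and the caret rule shown is only valid for major versions of 1 or higher.

```rust
/// (major, minor, patch); a deliberately simplified stand-in for a semver version.
type Version = (u64, u64, u64);

/// Simplified caret requirement, as written `foo-macros = "1.0.0"`:
/// same major version and at least the requested version (major >= 1 only).
fn caret_matches(req: Version, candidate: Version) -> bool {
    candidate.0 == req.0 && candidate >= req
}

/// Exact requirement, as written `foo-macros = "=1.0.0"`.
fn exact_matches(req: Version, candidate: Version) -> bool {
    candidate == req
}

fn main() {
    // Versions of `foo-macros` available in the registry.
    let published: [Version; 2] = [(1, 0, 0), (1, 1, 0)];
    // The requirement that `foo 1.0.0` places on `foo-macros`.
    let req: Version = (1, 0, 0);

    let with_caret: Vec<Version> = published
        .iter()
        .copied()
        .filter(|v| caret_matches(req, *v))
        .collect();
    let with_exact: Vec<Version> = published
        .iter()
        .copied()
        .filter(|v| exact_matches(req, *v))
        .collect();

    // With the caret requirement, `foo 1.0.0` may be paired with `foo-macros 1.1.0`.
    println!("caret candidates: {:?}", with_caret);
    // With the exact requirement, only the matching release is eligible.
    println!("exact candidates: {:?}", with_exact);
}
```

Under the caret rule both releases of `foo-macros` satisfy `foo 1.0.0`, which is the mismatch described above; the exact rule admits only `1.0.0`.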
|
||
The alternative to nested packages that I have heard of as a possibility would be to support multiple library targets per package. That would be arguably cleaner, but has these disadvantages: | ||
|
||
* It would require new manifest syntax, not just for declaring the multiple libraries, but for referring to them, and for making per-target dependencies (e.g. only a proc-macro lib should depend on `proc-macro2`+`quote`+`syn`, not the rest of the libraries in the package). | ||
* It would require many new mechanisms in Cargo. | ||
* It might have unforeseen problems; by contrast, nested packages are compiled exactly the same way `path` dependencies currently are, and the only new element is the ability to publish them, so the risk of surprises is lower. | ||
|
||
Also, nested packages enable nesting *anything* that Cargo packages can express now and in the future; the feature is composable with other Cargo functionality. | ||
|
||
We could also do nothing, except for warning the authors of paired macro crates that they should use exact version dependencies. The consequence of this will be continued hassle for developers; it might even be that useful proc-macro features might not be written simply because the author does not want to manage a second package. | ||
|
||
## Details within this proposal | ||
|
||
Instead of introducing a new value for the `publish` key, we could simply allow sub-packages to be published when they would previously be errors. However, this would be problematic when an existing package has a dev-dependency on a sub-package; either that sub-package would suddenly start being published as nested, or there would be no way to specify the sub-package *should* be published. | ||
|
||
We could also introduce an explicit `[subpackages]` table in the manifest. However, I believe `publish = "nested"` has the elegant and worthwhile property that it simultaneously enables nested publication and prohibits accidental un-nested publication of the sub-package. | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
I am not aware of other package systems that have a relevant similar concept, but I am not broadly informed about package systems. I have designed this proposal to be a **minimal addition to Cargo**, building on the existing concept of `path` dependencies to add lots of power with little implementation cost; not necessarily to make sense from a blank slate. | ||
|
||
# Unresolved questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
I see no specific unclear design choices, but we might want to incorporate one or more of the below _Future possibilities_ into the current RFC, particularly omitting version numbers. | ||
|
||
# Future possibilities | ||
Review comment: Combined with bin deps, we could allow delegating build scripts to a nested package, allowing a more complete environment for its development. |
||
[future-possibilities]: #future-possibilities | ||
|
||
## Omit version numbers | ||
|
||
Nested packages don't really have any use for version numbers; arguably, they should be omitted and even prohibited, since they may mislead a reader into thinking that the version numbers are used for some kind of version resolution. However, this is a further change to Cargo that is not strictly necessary to solve the original problem, and it disagrees with the precedent of how local `path` dependencies currently work (local packages must have version numbers even though they are not used). | ||
|
||
## Nested packages with public binary targets | ||
|
||
One common reason to publish multiple packages is in order to have a library and an accompanying tool binary, without causing the library to have all of the dependencies that the binary does. Examples: `wasm-bindgen` (`wasm-bindgen-cli`), `criterion` (`cargo-criterion`), `rerun` (`rerun-cli`). | ||
|
||
This RFC currently does not address that — if nothing is done, then `cargo install` will ignore binaries in sub-packages. It would be easy to make a change which supports that; for example, `cargo install` could traverse sub-packages and install all found binaries — but that would also install binaries which are intended as testing or (once [artifact dependencies] are implemented) code-generation helpers, which is undesirable. Thus, additional design work is needed to support `cargo install`ing from subpackages: | ||
|
||
* Should there be an additional manifest key which declares the binary target “public”? | ||
* Should targets be explicitly “re-exported” from the parent package? | ||
* Should there be an additional option to `cargo install` which picks subpackages? (This would cancel out the user-facing benefit from having a single package name.) | ||
|
||
## Nested packages with public library targets | ||
|
||
Allowing nested libraries to be named and used from outside the package would allow use cases which are currently handled by Cargo `features` and conditional compilation (optional functionality with nontrivial costs in dependencies or compilation time) to be instead handled by defining additional public libraries within one package. | ||
|
||
This would allow library authors to avoid writing fragile and hard-to-test conditional compilation, and allow library users to avoid accidentally depending on a feature being enabled despite not having enabled it explicitly. It would also allow compiling the optional functionality and its dependencies with maximum parallelism, by not introducing a single `feature`-ful library crate which acts as a single node in the dependency graph. | ||
|
||
However, it requires additional syntax and semantics, and these use cases might be better served by [#3243 packages as namespaces] or some other namespacing proposal, which would allow the libraries to be published independently. (I can also imagine a world in which both of these exist, and the library implementer can transparently use whichever publication strategy best serves their current needs.) | ||
|
||
## Additional privileges between crates | ||
|
||
Since nested packages are versioned as a unit, we could relax the trait coherence rules and allow implementations that would otherwise be prohibited. | ||
|
||
This would be particularly useful when implementing traits from large optional libraries; for example, package `foo` with subpackages `foo_core` and `foo_tokio` could have `foo_tokio` write `impl tokio::io::AsyncRead for foo_core::DataSource`. This would improve the dependency graph compared to `foo_core` having a dependency on `tokio` (which is the only way to do this currently), though it would not have the maximum possible benefit unless we also added public library targets as above, since the package as a whole still only exports one library and thus one dependency graph node. | ||
|
||
[artifact dependencies]: https://github.com/rust-lang/rfcs/pull/3028 | ||
[#3243 packages as namespaces]: https://github.com/rust-lang/rfcs/pull/3243 |
Review comment: A minor question: If a Git repository contains a package with nested packages, can another package depend on any of those nested packages as a Git dependency? Currently, a Git dependency searches recursively inside the repository for a package whose name matches.