From 6c1d9a2e9c92bb8f7cf9f0e6511fcd7a00251439 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 1 Sep 2017 16:15:15 -0700 Subject: [PATCH 01/12] RFC: Cargo build system integration --- text/0000-build-systems.md | 444 +++++++++++++++++++++++++++++++++++++ 1 file changed, 444 insertions(+) create mode 100644 text/0000-build-systems.md diff --git a/text/0000-build-systems.md b/text/0000-build-systems.md new file mode 100644 index 00000000000..bef61247405 --- /dev/null +++ b/text/0000-build-systems.md @@ -0,0 +1,444 @@ +- Feature Name: N/A +- Start Date: 2017-09-01 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +This **experimental RFC** lays out a high-level plan for improving Cargo's +ability to integrate with other build systems and environments. As an +experimental RFC, it opens the door to landing [unstable features] in Cargo to +try out ideas, but *not* to stabilizing those features, which will require +follow-up RFCs. + +[unstable features]: https://github.com/rust-lang/cargo/pull/4433/ + +# Motivation +[motivation]: #motivation + +One of the first hurdles for using Rust in production is integrating it into +your organization's build system. The level of challenge depends on the level of +integration required: it's relatively painless to invoke Cargo from a makefile +and let it fully manage building Rust code, but gets harder as you want the +external build system to exert finer-grained control over how Rust code is built. +The goal of this RFC is to lay out a vision for making integration at *any* scale +much easier than it is today. + +After extensive discussion with stakeholders, there appear to be two distinct +kinds of use-cases (or "customers") involved here: + +- **Mixed build systems**, where building already involves a variety of + language- or proeject-specific build systems. For this use case, the desire is + to use Cargo as-is, except for some specific concerns. Those concerns take a + variety of shapes: customizing caching, having a local crate registry, custom + handling for native dependencies, and so on. Addressing these concerns well + means adding new points of extensibility or control to Cargo. + +- **Homogenous build systems** like [Bazel], where there is a single prevailing + build system and methodology that works across languages and projects and is + expected to drive all aspects of the build. In such cases the goal of Cargo + integration is largely *interoperability*, including easy use of the crates.io + ecosystem and Rust-centric tooling, both of which expect Cargo-driven build + management. + +[Bazel]: https://bazel.build/ + +The interoperability constraints are, in actuality, hard constraints around +*any* kind of integration. + +In more detail, a build system integration *must*: + +- Make it easy for the outer build system to control the aspects of building + that are under its purview (e.g. artifact management, caching, network access). +- Make it easy to depend on arbitrary crates in the crates.io ecosystem. +- Make it easy to use Rust tooling like `rustfmt` or the RLS with projects that + depend on the external build system. + +A build system integration *should*: + +- Provide Cargo-based or Cargo-like workflows when developing Rust projects, so + that documentation and guidance from the Rust community applies even when + working within a different build system. +- To the extent possible, support Cargo concepts in a smooth, first-class way in + the external build system (e.g. Cargo features, profiles, etc) + +This RFC does not attempt to provide a detailed solution for all of the needed +extensibility points in Cargo, but rather to outline a general plan for how to +get there over time. Individual components that add significant features to +Cargo will need follow-up RFCs before stabilization. + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +The plan proposed in this RFC is to address the two use-cases from the +[motivation] section in parallel: + +- **For the mixed build system case**, we will triage feature requests and work on + adding further points of extensibility to Cargo based on expected impact. Each + added point of extensibility should ease build system integration for another + round of customers. + +- **For the homoegenous build system case**, we will immediately pursue + extensibility points that will enable the external build system to perform + many of the tasks that Cargo does today--but while still meeting our + interoperability constraints. We will then work on smoothing remaining rough + edges, which have a high degree of overlap with the work on mixed build + systems. + +In the long run, these two parallel lines of work will converge, such that we +offer a complete spectrum of options (in terms of what Cargo controls versus an +external system). But they start at critically different points, and working on +those in parallel is the key to delivering value quickly and incrementally. + +## A high-level model of what Cargo does + +Before delving into the details of the plan, it's helpful to lay out a mental +model of the work that Cargo does today, broken into several stages: + +| **Step** | **Conceptual output** | **Related concerns** | +| --------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------- | +| Dependency resolution | Lock file | Custom registries, mirrors, offline/local, native deps, ... | +| Build configuration | Cargo settings per crate in graph | Profiles | +| Build lowering | A build plan: a series of steps that must be run in sequence, including rustc and binary invocations | Build scripts, plugins | +| Build execution | Compiled artifacts | Caching | + +The first stage, dependency resolution, is the most complex; it's where our +model of semver comes into play, as well as a huge list of related +concerns. Interestingly, it's also the area where external build systems tend to +have the *least* to say, because it's not really about *building* so much as +*figuring out what even goes into a build*. + +Dependency resolution produces a lockfile, which records what crates are +included in the dependency graph, coming from what sources and at what versions, +as well as interdependencies. It operates independently of the requested Cargo +workflow. + +The next stage is build *configuration*, which conceptually is where things like +profiles come into play: of the crates we're going to build, we need to decide, +at a high level, "how" we're going to build them. This configuration is at the +"Cargo level of abstraction", i.e. in terms of things like profiles rather than +low-level rustc flags. There's strong desire to make this system more +expressive, for example by allowing you to always optimize certain dependencies +even when otherwise in the debug profile. + +After configuration, we know at the Cargo level exactly what we want to build, +but we need to *lower* the level of abstraction into concrete, individual +steps. This is where, for example, profile information is transformed into +specific rustc flags. Lowering is done independently for each crate, and results +in a sequence of process invocations, interleaving calls to `rustc` with +e.g. running the binary for a build script. You can think of these sequences as +expanding what was previously a "compile this crate with this configuration" +node in the dependency graph into a finer-grained set of nodes for running rustc +etc. + +Finally, there's the actual build *execution*, which is conceptually +straightforward: we analyze the dependency graph and existing, cached artifacts, +and then actually perform any un-cached build steps (in parallel when +possible). Of course, this is the bread-and-butter of many external build +systems, so we want to make it easy for them to tweak or entirely control this +part of the process. + +The first two steps -- dependency resolution and build configuration -- need to +operate on an entire dependency graph at once. Build lowering, by contrast, can +be performed for any crate in isolation. + +### Cargo plugins + +**A key point is that, in principle, each of these steps is separable from the +others**. That is, we should be able to rearchitect Cargo so that each of these +steps is managed by a distinct component, and the components have a stable -- +and public! -- way of communicating with one another. That in turn will enable +replacing any particular component while keeping the others. + +More concretely, this RFC proposed that we pursue *some* kind of plugin system +in which you can inform Cargo that part of its normal work should instead by +done by an external tool. This might live in `Cargo.toml`, or in Cargo +configuration. In any case, each phase of Cargo's work should be swappable. + +Note that the four-step division above is a useful conceptual model, but is +ultimately too coarse-grained; we will likely need to divide up Cargo's +functionality into numerous small pieces that can be re-used when integrating +into a larger build system. This finer division is left as a question for +experimentation. + +## Specifics for the homogenous build system case + +For homogenous build systems, there's a key question: how is the Rust code +itself managed, through a crate registry or though some external system? Any +integration has to handle the first case (to have access to crates.io or a +mirror thereof), but organizations can choose whether to manage their own crates +through a custom registry (more on that below) or some other means. + +### Using crates managed by a crate registry + +Using a crate registry implies using Cargo's dependency resolution, and, in +particular, `Cargo.toml`. In this case, the external build system should invoke +Cargo for *at least* the dependency resolution and build configuration steps, +and likely the build lowering step as well. In such a world, Cargo is +responsible for *planning* the build (which involves largely Rust-specific +concerns), but the external build system is responsible for *executing* it. + +One important concern is: how do you depend on code from other languages, which +is being managed by the external build system? That's a narrow version of a more +general question around *native dependencies*, which will be addressed +separately in a later section. + +#### Workflow and interop story + +On the external build system side, a rule or plugin will need to be written that +knows how to invoke Cargo to produce a build plan, then translate that build +plan back into appropriate rules for the build system. Thus, when doing normal +builds, the external build system drives the entire process, but invokes Cargo +for guidance during the planning stage. + +When *developing* a crate, it should be possible to invoke Cargo commands as +usual. We do this via a plugin. When invoking, for example, `cargo build`, the +plugin will translate that to a request to the external build system, which will +in turn re-invoke Cargo to request a build plan (the exact mechanics here are +TBD), and then execute the build. For `cargo run`, the same steps are followed +by putting the resulting build artifact in an appropriate location, and then +following Cargo's usual logic. And so on. + +A similar story plays out when using, for example, the RLS or rustfmt. Ideally, +these tools will have no idea that a Cargo plugin is in play; the information +and artifacts they need can be obtained by using Cargo in the appropriate way, +transparently. + +While the details here are quite hazy, the overall point is that control swaps +back and forth between Cargo and the external build system, depending on the +concerns at play. We set things up so that the Rust-specific pieces (including +Cargo workflows) continue to be handled by Cargo whenever possible. + +### Using "unmanaged" crates + +In some cases, an organization may want to employ a their own strategy for +maintaining and versioning code, and for resolving dependencies. In this case, +they may wish to entirely forgo writing a meaningful Cargo.toml, instead having +one that simply forwards to a plugin. The description of dependencies is then +written in the external build system's rule format. Here, Cargo acts primarily +as a *workflow and tool orchestrator*, since it is not involved in either +planning or executing the build. + +#### Workflow and interop story + +As with the workflow for crate registries, a plugin is used to manage control +passing back-and-forth between Cargo and the external build system. The main +difference is that Cargo is used for fewer steps. So, for example, when running +`cargo test`, the external build system is invoked directly (in an appropriate +mode for building tests) and performs the build without consulting Cargo at all +(or, perhaps it uses Cargo strictly for lowering, i.e. to determine `rustc` +invocations from higher-level configuration). Once the build is complete, Cargo +takes over, actually executing the resulting test binary. + +For the RLS or other tools that need to explore the dependency structure of the +crate, again they should work with a clear Cargo interface that hides any use of +plugins. The plugin for the external build system must be able to translate its +dependencies back into something equivalent to a lockfile, at least. + +### A hybrid + +In general, any integration with a homogenous build system needs to be able to +handle crate registries, because access to crates.io is a hard constraint. + +However, it's possible to *mix* this model for crates.io with the model for +unmanaged crates. All that's needed is a distinction within the external build +system between these two kinds of dependencies, which then drives the plugin +interactions accordingly. + +## Cross-cutting concern: native dependencies + +One important point left out of the above explanation is the story for +dependencies on non-Rust code. These dependencies should be built and managed by +the external build system. But there's a catch: existing "sys" crates on +crates.io that provide core native dependencies use custom build scripts to +build or discover those dependencies. We want to *reroute* those crates to +instead use the dependencies provided by the build system. + +Here, there's a short-term story and a long-term story. + +### Short term: white lists with build script overrides + +Cargo today offers the ability to [override the build script] for any crate +using the `links` key (which is generally how you signal *what* native +dependency you're providing), and instead provide the library location +directly. This feature can be used to instead point at the output provided by +the external build system. Together with whitelisting crates that use build +scripts, it's possible to use the existing crates.io ecosystem while managing +native dependencies via the external build system. + +[override the build script]: http://doc.crates.io/build-script.html#overriding-build-scripts + +There are some downsides, though. If the sys crates change in any way -- for +example, altering the way they build the native dependency, or the version they +use -- there's no clear heads-up that something may need to be adjusted within +the external build system. It might be possible, however, to use +version-specific whitelisting to side-step this issue. + +Even so, whitelisting itself is a laborious process, and in the long run there +are advantages to offering a higher-level way of specifying native dependencies +in the first place. + +### Long term: declarative native dependencies + +Reliably building native dependencies in a cross-platform way +is... challenging. Today, Rust offers some help with this through crates like +[`gcc`] and `[pkgconfig]`, which provide building blocks for writing build +scripts that discover or build native dependencies. But still, today, each build +script is a bespoke affair, customizing the use of these crates in arbitrary +ways. It's difficult, error-prone work. + +[`gcc`]: https://docs.rs/gcc +`[pkgconfig`]: https://docs.rs/pkg-config + +This RFC proposes to start a *long term* effort to provide a more first-class +way of specifying native dependencies. The hope is that we can get coverage of, +say, 80% of native dependencies using a simple, high-level specification, and +only in the remaining 20% have to write arbitrary code. And, in any case, such a +system can provide richer information about dependencies to help avoid the +downsides of the whitelisting approach. + +The likely approach here is to provide [some mechanism] for using a dependency +*as* a build script, so that you could specify high-level native dependency +information directly in `Cargo.toml` attributes, and have a general tool +translate that into the appropriate build script. + +[some mechanism]: https://internals.rust-lang.org/t/pre-rfc-cargo-build-and-native-dependency-integration-via-crate-based-build-scripts/5708 + +Needless to say, this approach will need significant experimentation. But if +successful, it would have benefits not just for build system integration, but +for using external dependencies *anywhere*. + +## Specifics for the mixed build system case + +Switching gears, let's look at mixed build systems. Here, we generally don't +want to use Cargo plugins, because Cargo is still doing the vast majority of the +work. Instead, we need to incorporate new features or points of +customization/extensibility to handle specific concerns. The primary ones on the +radar right now are as follows. + +- **Multiple/custom registries**. There is a longstanding desire to support + registries other than crates.io, e.g. for private code, and to allow them to + be used *in conjunction* with crates.io. In particular, this is a key pain + point for customers who are otherwise happy to use Cargo as-is, but want a + crates.io-like experience for their own code. There's + an [early RFC](https://github.com/rust-lang/cargo/pull/4036) on this topic, + and more work here is planned soon. + +- **Network and source control**. We've already put significant work into + providing control over where sources live (though vendoring) and tools for + preventing network access. However, we could do more to make the experience + here first class, and to give people a greater sense of control and assurance + when using Cargo on their build farm. + +- **Caching and artifact control**. Many organizations would like to provide a + shared build cache for the entire organization, across all of its + projects. More generally, it seems plausible and useful to provide a way of + customizing caching and artifact storage without having to go all the way to a + full plugin. + +These bullets are quite vague, and that's because, while we know there are needs +here, the precise problem -- let alone the solution -- it not yet clear. The +point, though, is that these are the most important problems we want to get our +head around in the foreseeable future. + +## Additional areas where revisions are expected + +Beyond all of the above, it seems very likely that some existing features of +Cargo will need to be revisited to fit with the build system integration +work. For example: + +- **Profiles**. Putting the idea of the "build configuration" step on firmer + footing will require clarifying the precise role of profiles, which today blur + the line somewhat between *workflows* (e.g. `test` vs `bench`) and flags + (e.g. `--release`). Moreover, integration with a homogenous build system + effectively requires that we can translate profiles on the Cargo side back and + forth to *something* meaningful to the external build system, so that for + example we can make `cargo test` invoke the external build system in a + sensible way. Additional clarity here might help pave the way for [custom + profiles] and other enhancements. On a very different note, it's not currently + possible to control enough about the `rustc` invocation for at least some + integration cases, and the answer may in part lie in improvements to profiles. + +- **Build scripts**. Especially for homogenous build systems, build scripts can + pose some serious pain, because in general they may depend on numerous + environmental factors invisibly. It may be useful to grow some ways of telling + Cargo the precise inputs and outputs of the build script, declaratively. + +- **Vendoring**. While we have [support for vendoring] dependencies today, it is + not treated uniformly as a mirror. We may want to tighten up Cargo's + understanding, possibly by treating vendoring in a more first-class way. + +[custom profiles]: https://github.com/rust-lang/cargo/issues/2007 +[support for vendoring]: https://github.com/alexcrichton/cargo-vendor/ + +There are undoubtedly other aspects of Cargo that will need to be touched to +achieve better build system integration; the plan as a whole is predicated on +making Cargo much more modular, which is bound to reveal concerns that should be +separated. As with everything else in this RFC, user-facing changes will require +a full RFC prior to stabilization. + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +This is an experimental RFC. Reference-level details will be presented in +follow-up RFCs after experimentation has concluded. + +# Drawbacks +[drawbacks]: #drawbacks + +It's somewhat difficult to state drawbacks for such a high-level plan; they're +more likely to arise through the particulars. + +That said, it's unquestionable that following the plan in this RFC will result +in greater overall complexity for Cargo. The key to managing this complexity +will be ensuring that it's surfaced only on an as-needed basis. That is, uses of +Cargo in the pure crates.io ecosystem should not become more complex -- if +anything, they should become more streamlined, through improvements to features +like profiles, build scripts, and the handling of native dependencies. + +# Rationale and Alternatives +[alternatives]: #alternatives + +Numerous organizations we've talked to who are considering, or already are, +running Rust in production complain about difficulties with build system +integration. There's often a sense that Cargo "does too much" or is "too +opinionated", in a way that works fine for the crates.io ecosystem but is "not +realistic" when integrating into larger build systems. + +It's thus critical to take steps to smooth integration, both to remove obstacles +to Rust adoption, but also to establish that Cargo has an important role to play +even within opinionated external build systems: coordinating with Rust tooling +and workflows. + +This RFC is essentially a *strategic vision*, and so the alternatives are +different strategies for tackling the problem of integration. Some options +include: + +- Focusing entirely on one of the use-cases mentioned. For example: + - We could decide that it's not worthwhile to have Cargo play a role within a + build system like [Bazel], and instead focus on users who just need to + customize a particular aspect of Cargo. However, this would be giving up on + the hope of providing strong integration with Rust tooling and workflows. + - We could decide to focus solely on the [Bazel]-style use-cases. But that + would likely push people who would otherwise be happy to use Cargo to manage + most of their build (but need to customize some aspect) to instead try to + manage more of the concerns themselves. + +- Attempting to impose more control when integrating with hommogenous build + systems. In the most extreme case presented above, for internal crates Cargo + is little more than a middleman between Rust tooling and the external build + system. We could instead support only using custom registries to manage + crates, and hence always use Cargo's dependency resolution and so on. This + would, however, be a non-starter for many organizations who want a + single-version, mono-repo world internally, and it's not clear what the gains + would be. + +# Unresolved questions +[unresolved]: #unresolved-questions + +Since this is an experimental RFC, there are more questions here than +answers. However, one question that would be good to tackle prior to acceptance +is: how should we prioritize various aspects of this work? Should we have any +specific customers in mind that we're trying to target (or who, better yet, are +working directly with us and plan to test and use the results)? From a732f6a5b9b029f376c2e2de795a8e324641ce94 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 1 Sep 2017 16:17:13 -0700 Subject: [PATCH 02/12] Update summary --- text/0000-build-systems.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-build-systems.md b/text/0000-build-systems.md index bef61247405..6ef36ef5c5c 100644 --- a/text/0000-build-systems.md +++ b/text/0000-build-systems.md @@ -10,7 +10,9 @@ This **experimental RFC** lays out a high-level plan for improving Cargo's ability to integrate with other build systems and environments. As an experimental RFC, it opens the door to landing [unstable features] in Cargo to try out ideas, but *not* to stabilizing those features, which will require -follow-up RFCs. +follow-up RFCs. It proposes a variety of features which, in total, permit a wide +spectrum of integration cases -- from customizing a single aspect of Cargo to +letting an external build system run almost the entire show. [unstable features]: https://github.com/rust-lang/cargo/pull/4433/ From e8cb125b99a6fbc3a6dc38ced8592fa2a3d7b5a4 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 1 Sep 2017 20:30:38 -0700 Subject: [PATCH 03/12] Tone down complexity argument --- text/0000-build-systems.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-build-systems.md b/text/0000-build-systems.md index 6ef36ef5c5c..72a3f3ed825 100644 --- a/text/0000-build-systems.md +++ b/text/0000-build-systems.md @@ -285,13 +285,13 @@ in the first place. Reliably building native dependencies in a cross-platform way is... challenging. Today, Rust offers some help with this through crates like -[`gcc`] and `[pkgconfig]`, which provide building blocks for writing build +[`gcc`] and [`pkgconfig`], which provide building blocks for writing build scripts that discover or build native dependencies. But still, today, each build script is a bespoke affair, customizing the use of these crates in arbitrary ways. It's difficult, error-prone work. [`gcc`]: https://docs.rs/gcc -`[pkgconfig`]: https://docs.rs/pkg-config +[`pkgconfig`]: https://docs.rs/pkg-config This RFC proposes to start a *long term* effort to provide a more first-class way of specifying native dependencies. The hope is that we can get coverage of, @@ -392,7 +392,7 @@ follow-up RFCs after experimentation has concluded. It's somewhat difficult to state drawbacks for such a high-level plan; they're more likely to arise through the particulars. -That said, it's unquestionable that following the plan in this RFC will result +That said, it's plausible that following the plan in this RFC will result in greater overall complexity for Cargo. The key to managing this complexity will be ensuring that it's surfaced only on an as-needed basis. That is, uses of Cargo in the pure crates.io ecosystem should not become more complex -- if From 3591258b17dd0dc271bc350e1129bf17923b87b1 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 1 Sep 2017 20:54:24 -0700 Subject: [PATCH 04/12] Clarify native deps managed externally --- text/0000-build-systems.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/text/0000-build-systems.md b/text/0000-build-systems.md index 72a3f3ed825..cdc7043f75e 100644 --- a/text/0000-build-systems.md +++ b/text/0000-build-systems.md @@ -311,6 +311,15 @@ Needless to say, this approach will need significant experimentation. But if successful, it would have benefits not just for build system integration, but for using external dependencies *anywhere*. +### The story for externally-managed native dependencies + +Finally, in the case where the external build system is the one specifying and +providing a native dependency, all we need is for that to result in the +appropriate flags to the lowered `rustc` invocations. If the external build +system is producing those lowered calls itself, it can completely manage this +concern. Otherwise, we will need for the plugin interface to provide a way to +plumb this information through to Cargo. + ## Specifics for the mixed build system case Switching gears, let's look at mixed build systems. Here, we generally don't From e1fdd82408bd304225fcfe180e5bab7c2945aac6 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 1 Sep 2017 21:15:32 -0700 Subject: [PATCH 05/12] Clarify the homogeneous story --- text/0000-build-systems.md | 124 ++++++++++++++++++------------------- 1 file changed, 62 insertions(+), 62 deletions(-) diff --git a/text/0000-build-systems.md b/text/0000-build-systems.md index cdc7043f75e..c7061332071 100644 --- a/text/0000-build-systems.md +++ b/text/0000-build-systems.md @@ -31,7 +31,7 @@ After extensive discussion with stakeholders, there appear to be two distinct kinds of use-cases (or "customers") involved here: - **Mixed build systems**, where building already involves a variety of - language- or proeject-specific build systems. For this use case, the desire is + language- or project-specific build systems. For this use case, the desire is to use Cargo as-is, except for some specific concerns. Those concerns take a variety of shapes: customizing caching, having a local crate registry, custom handling for native dependencies, and so on. Addressing these concerns well @@ -166,20 +166,30 @@ experimentation. ## Specifics for the homogenous build system case -For homogenous build systems, there's a key question: how is the Rust code -itself managed, through a crate registry or though some external system? Any -integration has to handle the first case (to have access to crates.io or a -mirror thereof), but organizations can choose whether to manage their own crates -through a custom registry (more on that below) or some other means. - -### Using crates managed by a crate registry - -Using a crate registry implies using Cargo's dependency resolution, and, in -particular, `Cargo.toml`. In this case, the external build system should invoke -Cargo for *at least* the dependency resolution and build configuration steps, -and likely the build lowering step as well. In such a world, Cargo is -responsible for *planning* the build (which involves largely Rust-specific -concerns), but the external build system is responsible for *executing* it. +For homogenous build systems, there are two kinds of code that must be dealt +with: code originally written using vanilla Cargo and a crate registry, and code +written "natively" in the context of the external build system. Any integration +has to handle the first case (to have access to crates.io or a vendored mirror +thereof). + +### Using crates vendored from or managed by a crate registry + +Whether using a registry server or a vendored copy, if you're building Rust code +that is written using vanilla Cargo, you will at some level need to use Cargo's +dependency resolution and `Cargo.toml` files. In this case, the external build +system should invoke Cargo for *at least* the dependency resolution and build +configuration steps, and likely the build lowering step as well. In such a +world, Cargo is responsible for *planning* the build (which involves largely +Rust-specific concerns), but the external build system is responsible for +*executing* it. + +A typical pattern of usage is to have a whitelist of "root dependencies" from an +external registry which will be permitted as dependencies within the +organization, often pinning to a specific version and set of Cargo +features. This whitelist can be described as a single `Cargo.toml` file, which +can then drive Cargo's dependency resolution just once for the entire registry. +The resulting lockfile can be used to guide vendoring and construction of a +build plan for consumption by the external build system. One important concern is: how do you depend on code from other languages, which is being managed by the external build system? That's a narrow version of a more @@ -189,64 +199,54 @@ separately in a later section. #### Workflow and interop story On the external build system side, a rule or plugin will need to be written that -knows how to invoke Cargo to produce a build plan, then translate that build -plan back into appropriate rules for the build system. Thus, when doing normal -builds, the external build system drives the entire process, but invokes Cargo -for guidance during the planning stage. +knows how to invoke Cargo to produce a build plan corresponding to a whitelisted +(and potentially vendored) registry, then translate that build plan back into +appropriate rules for the build system. Thus, when doing normal builds, the +external build system drives the entire process, but invokes Cargo for guidance +during the planning stage. + +### Using crates managed by the build system + +Many organization want to employ a their own strategy for maintaining and +versioning code, and for resolving dependencies. In this case, they may wish to +entirely forgo producing a meaningful Cargo.toml for the code the write, instead +having one that just forwards to a plugin. The description of dependencies is +then written in the external build system's rule format. Here, Cargo acts +primarily as a *workflow and tool orchestrator*, since it is not involved in +either planning or executing the build. + +#### Workflow and interop story + +Even though the external build system is entirely handling both dependency +resolution and build execution for the crates under its management, it may still +use Cargo for *lowering*, i.e. to produce the actual `rustc` invocations from a +higher-level configuration. Cargo will provide a way to do this. When *developing* a crate, it should be possible to invoke Cargo commands as usual. We do this via a plugin. When invoking, for example, `cargo build`, the plugin will translate that to a request to the external build system, which will -in turn re-invoke Cargo to request a build plan (the exact mechanics here are -TBD), and then execute the build. For `cargo run`, the same steps are followed -by putting the resulting build artifact in an appropriate location, and then -following Cargo's usual logic. And so on. +then execute the build (possibly re-invoking Cargo for lowering). For `cargo +run`, the same steps are followed by putting the resulting build artifact in an +appropriate location, and then following Cargo's usual logic. And so on. A similar story plays out when using, for example, the RLS or rustfmt. Ideally, these tools will have no idea that a Cargo plugin is in play; the information -and artifacts they need can be obtained by using Cargo in the appropriate way, -transparently. - -While the details here are quite hazy, the overall point is that control swaps -back and forth between Cargo and the external build system, depending on the -concerns at play. We set things up so that the Rust-specific pieces (including -Cargo workflows) continue to be handled by Cargo whenever possible. - -### Using "unmanaged" crates - -In some cases, an organization may want to employ a their own strategy for -maintaining and versioning code, and for resolving dependencies. In this case, -they may wish to entirely forgo writing a meaningful Cargo.toml, instead having -one that simply forwards to a plugin. The description of dependencies is then -written in the external build system's rule format. Here, Cargo acts primarily -as a *workflow and tool orchestrator*, since it is not involved in either -planning or executing the build. - -#### Workflow and interop story - -As with the workflow for crate registries, a plugin is used to manage control -passing back-and-forth between Cargo and the external build system. The main -difference is that Cargo is used for fewer steps. So, for example, when running -`cargo test`, the external build system is invoked directly (in an appropriate -mode for building tests) and performs the build without consulting Cargo at all -(or, perhaps it uses Cargo strictly for lowering, i.e. to determine `rustc` -invocations from higher-level configuration). Once the build is complete, Cargo -takes over, actually executing the resulting test binary. - -For the RLS or other tools that need to explore the dependency structure of the -crate, again they should work with a clear Cargo interface that hides any use of -plugins. The plugin for the external build system must be able to translate its -dependencies back into something equivalent to a lockfile, at least. +and artifacts they need can be obtained by using Cargo's in a standard way, +transparently -- but the underlying information will be coming from the external +build system, via the plugin. Thus the plugin for the external build system must +be able to translate its dependencies back into something equivalent to a +lockfile, at least. -### A hybrid +### The complete picture In general, any integration with a homogenous build system needs to be able to -handle crate registries, because access to crates.io is a hard constraint. +handle (vendored) crate registries, because access to crates.io is a hard constraint. -However, it's possible to *mix* this model for crates.io with the model for -unmanaged crates. All that's needed is a distinction within the external build -system between these two kinds of dependencies, which then drives the plugin -interactions accordingly. +Usually, you'll want to combine the handling of these external registries with +crates managed purely by the external build system, meaning that there are +effectively *two* modes of building crates at play overall. All that's needed to +do this is a distinction within the external build system between these two +kinds of dependencies, which then drives the plugin interactions accordingly. ## Cross-cutting concern: native dependencies From 1f004ce01c357c3447e392498b84aedad4f10a59 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 1 Sep 2017 21:15:50 -0700 Subject: [PATCH 06/12] Spelling fix --- text/0000-build-systems.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/text/0000-build-systems.md b/text/0000-build-systems.md index c7061332071..395575da660 100644 --- a/text/0000-build-systems.md +++ b/text/0000-build-systems.md @@ -37,7 +37,7 @@ kinds of use-cases (or "customers") involved here: handling for native dependencies, and so on. Addressing these concerns well means adding new points of extensibility or control to Cargo. -- **Homogenous build systems** like [Bazel], where there is a single prevailing +- **Homogeneous build systems** like [Bazel], where there is a single prevailing build system and methodology that works across languages and projects and is expected to drive all aspects of the build. In such cases the goal of Cargo integration is largely *interoperability*, including easy use of the crates.io @@ -164,9 +164,9 @@ functionality into numerous small pieces that can be re-used when integrating into a larger build system. This finer division is left as a question for experimentation. -## Specifics for the homogenous build system case +## Specifics for the homogeneous build system case -For homogenous build systems, there are two kinds of code that must be dealt +For homogeneous build systems, there are two kinds of code that must be dealt with: code originally written using vanilla Cargo and a crate registry, and code written "natively" in the context of the external build system. Any integration has to handle the first case (to have access to crates.io or a vendored mirror @@ -239,7 +239,7 @@ lockfile, at least. ### The complete picture -In general, any integration with a homogenous build system needs to be able to +In general, any integration with a homogeneous build system needs to be able to handle (vendored) crate registries, because access to crates.io is a hard constraint. Usually, you'll want to combine the handling of these external registries with @@ -362,7 +362,7 @@ work. For example: - **Profiles**. Putting the idea of the "build configuration" step on firmer footing will require clarifying the precise role of profiles, which today blur the line somewhat between *workflows* (e.g. `test` vs `bench`) and flags - (e.g. `--release`). Moreover, integration with a homogenous build system + (e.g. `--release`). Moreover, integration with a homogeneous build system effectively requires that we can translate profiles on the Cargo side back and forth to *something* meaningful to the external build system, so that for example we can make `cargo test` invoke the external build system in a @@ -371,7 +371,7 @@ work. For example: possible to control enough about the `rustc` invocation for at least some integration cases, and the answer may in part lie in improvements to profiles. -- **Build scripts**. Especially for homogenous build systems, build scripts can +- **Build scripts**. Especially for homogeneous build systems, build scripts can pose some serious pain, because in general they may depend on numerous environmental factors invisibly. It may be useful to grow some ways of telling Cargo the precise inputs and outputs of the build script, declaratively. From 733f988ddc2524c07b3e79a3531d53af247449be Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 1 Sep 2017 21:17:16 -0700 Subject: [PATCH 07/12] Typo fix --- text/0000-build-systems.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-build-systems.md b/text/0000-build-systems.md index 395575da660..f0c5e111916 100644 --- a/text/0000-build-systems.md +++ b/text/0000-build-systems.md @@ -207,7 +207,7 @@ during the planning stage. ### Using crates managed by the build system -Many organization want to employ a their own strategy for maintaining and +Many organization want to employ their own strategy for maintaining and versioning code, and for resolving dependencies. In this case, they may wish to entirely forgo producing a meaningful Cargo.toml for the code the write, instead having one that just forwards to a plugin. The description of dependencies is From 562d116d38c28d1cf14724926863bc8ca7c50c2b Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 Sep 2017 20:16:37 -0700 Subject: [PATCH 08/12] Additional clarifications --- text/0000-build-systems.md | 97 +++++++++++++++++++++++++++++--------- 1 file changed, 76 insertions(+), 21 deletions(-) diff --git a/text/0000-build-systems.md b/text/0000-build-systems.md index f0c5e111916..dded8dd8f8a 100644 --- a/text/0000-build-systems.md +++ b/text/0000-build-systems.md @@ -145,32 +145,86 @@ The first two steps -- dependency resolution and build configuration -- need to operate on an entire dependency graph at once. Build lowering, by contrast, can be performed for any crate in isolation. -### Cargo plugins +### Customizing Cargo **A key point is that, in principle, each of these steps is separable from the others**. That is, we should be able to rearchitect Cargo so that each of these steps is managed by a distinct component, and the components have a stable -- and public! -- way of communicating with one another. That in turn will enable -replacing any particular component while keeping the others. +replacing any particular component while keeping the others. (To be clear, the +breakdown above is just a high-level sketch; in reality, we'll need a more +nuanced and layered picture of Cargo's activities). -More concretely, this RFC proposed that we pursue *some* kind of plugin system -in which you can inform Cargo that part of its normal work should instead by -done by an external tool. This might live in `Cargo.toml`, or in Cargo -configuration. In any case, each phase of Cargo's work should be swappable. +This RFC proposes to provide *some* means of customizing Cargo's activities at +various layers and stages. The details here are *very much* up for grabs, and +are part of the experimentation we need to do. -Note that the four-step division above is a useful conceptual model, but is -ultimately too coarse-grained; we will likely need to divide up Cargo's -functionality into numerous small pieces that can be re-used when integrating -into a larger build system. This finer division is left as a question for -experimentation. +#### Likely design constraints + +Some likely constraints for a Cargo customization/plugin system are: + +- It should be possible for Rust tools (like `rustfmt`, IDEs, linters) to "call + Cargo" to get information or artifacts in a standardized way, while remaining + oblivious to any customizations. Ideally, `Cargo` workflows (including custom + commands) would also work transparently. + +- It should be possible to customize or swap out a *small part* of Cargo's + behavior without understanding or reimplementing other parts. + +- The interface for customization should be *forward-compatible*: existing + plugins should continue to work with new versions of Cargo. + +- It should be difficult or impossible to introduce customizations that are + "incoherent", for example that result in unexpected differences in the way + that `rustc` is invoked in different workflows (because, say, the testing + workflow was customized but the normal build workflow wasn't). In other words, + customizations are subject to *cross-cutting concerns*, which need to be + identified and factored out. + +We will iterate on the constraints to form core design principles as we +experiment. + +#### A concrete example + +Since the above is quite hand-wavy, it's helpful to see a very simple, concrete +example of what a customization might look like. You could imagine something +like the following for supplying manifest information from an external build +system, rather than through `Cargo.toml`: + +**Cargo.toml** + +```toml +[plugins.bazel] +generate-manifest = true +``` + +**$root/.cargo/meta.toml** + +```toml +[plugins] + +# These dependencies cannot themselves use plugins. +# This file is "staged" earlier than Cargo.toml + +bazel = "1.0" # a regular crates.io dependency +``` + +**Semantics** + +If any `plugins` entry in `Cargo.toml` defines a `generate-manifest` key, +whenever Cargo would be about to return the parsed results of `Cargo.toml` , +instead: + +- look for the associated plugin in `.cargo/meta.toml`, and ask it to generate the manifest +- return that instead ## Specifics for the homogeneous build system case For homogeneous build systems, there are two kinds of code that must be dealt with: code originally written using vanilla Cargo and a crate registry, and code -written "natively" in the context of the external build system. Any integration -has to handle the first case (to have access to crates.io or a vendored mirror -thereof). +written "natively" in the context of the external build system. Any integration +has to handle the first case to have access to crates.io or a vendored mirror +thereof. ### Using crates vendored from or managed by a crate registry @@ -208,12 +262,13 @@ during the planning stage. ### Using crates managed by the build system Many organization want to employ their own strategy for maintaining and -versioning code, and for resolving dependencies. In this case, they may wish to -entirely forgo producing a meaningful Cargo.toml for the code the write, instead -having one that just forwards to a plugin. The description of dependencies is -then written in the external build system's rule format. Here, Cargo acts -primarily as a *workflow and tool orchestrator*, since it is not involved in -either planning or executing the build. +versioning code and for resolving dependencies, *in addition* to build +execution. In this case, they may wish to entirely forgo producing a meaningful +Cargo.toml for the code the write, instead having one that just forwards to a +plugin (as in the example plugin above). The description of dependencies is then +written in the external build system's rule format. Here, Cargo acts primarily +as a *workflow and tool orchestrator*, since it is not involved in either +planning or executing the build. Let's dig into that. #### Workflow and interop story @@ -229,7 +284,7 @@ then execute the build (possibly re-invoking Cargo for lowering). For `cargo run`, the same steps are followed by putting the resulting build artifact in an appropriate location, and then following Cargo's usual logic. And so on. -A similar story plays out when using, for example, the RLS or rustfmt. Ideally, +A similar story plays out when using, for example, the RLS or `rustfmt`. Ideally, these tools will have no idea that a Cargo plugin is in play; the information and artifacts they need can be obtained by using Cargo's in a standard way, transparently -- but the underlying information will be coming from the external From ff9bb8e5e20d004e834d37cefee53f60a28afebe Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 Sep 2017 20:25:03 -0700 Subject: [PATCH 09/12] Clarify mixed build system relationship to plugins --- text/0000-build-systems.md | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/text/0000-build-systems.md b/text/0000-build-systems.md index dded8dd8f8a..45328754f88 100644 --- a/text/0000-build-systems.md +++ b/text/0000-build-systems.md @@ -377,31 +377,30 @@ plumb this information through to Cargo. ## Specifics for the mixed build system case -Switching gears, let's look at mixed build systems. Here, we generally don't -want to use Cargo plugins, because Cargo is still doing the vast majority of the -work. Instead, we need to incorporate new features or points of -customization/extensibility to handle specific concerns. The primary ones on the -radar right now are as follows. +Switching gears, let's look at mixed build systems. Here, we may address the +need for customization with a mixture of plugins and new core Cargo +features. The primary ones on the radar right now are as follows. - **Multiple/custom registries**. There is a longstanding desire to support registries other than crates.io, e.g. for private code, and to allow them to be used *in conjunction* with crates.io. In particular, this is a key pain point for customers who are otherwise happy to use Cargo as-is, but want a crates.io-like experience for their own code. There's - an [early RFC](https://github.com/rust-lang/cargo/pull/4036) on this topic, - and more work here is planned soon. + an [RFC](https://github.com/rust-lang/rfcs/pull/2141) on this topic, and more + work here is planned soon. Note: here, we address the needs via a + straightforward enhancement to Cargo's features, rather than via a plugin + system. - **Network and source control**. We've already put significant work into providing control over where sources live (though vendoring) and tools for preventing network access. However, we could do more to make the experience here first class, and to give people a greater sense of control and assurance - when using Cargo on their build farm. + when using Cargo on their build farm. Here again, this is probably more about + flags and configuration than plugins per se. - **Caching and artifact control**. Many organizations would like to provide a shared build cache for the entire organization, across all of its - projects. More generally, it seems plausible and useful to provide a way of - customizing caching and artifact storage without having to go all the way to a - full plugin. + projects. Here we'd likely need some kind of plugin. These bullets are quite vague, and that's because, while we know there are needs here, the precise problem -- let alone the solution -- it not yet clear. The From 8b7ce1981f90727d67a5683554913693bd631824 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 Sep 2017 20:25:53 -0700 Subject: [PATCH 10/12] Remove unnecessary sentence --- text/0000-build-systems.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/text/0000-build-systems.md b/text/0000-build-systems.md index 45328754f88..8da5d72aa3b 100644 --- a/text/0000-build-systems.md +++ b/text/0000-build-systems.md @@ -107,9 +107,7 @@ model of the work that Cargo does today, broken into several stages: The first stage, dependency resolution, is the most complex; it's where our model of semver comes into play, as well as a huge list of related -concerns. Interestingly, it's also the area where external build systems tend to -have the *least* to say, because it's not really about *building* so much as -*figuring out what even goes into a build*. +concerns. Dependency resolution produces a lockfile, which records what crates are included in the dependency graph, coming from what sources and at what versions, From 9fe67c4cda4a213f812dac423d30d821cde5aaf1 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 19 Sep 2017 16:14:24 -0700 Subject: [PATCH 11/12] Additional clarifications around tool interop --- text/0000-build-systems.md | 29 +++++++++++++++++++++++------ 1 file changed, 23 insertions(+), 6 deletions(-) diff --git a/text/0000-build-systems.md b/text/0000-build-systems.md index 8da5d72aa3b..b65dfac5716 100644 --- a/text/0000-build-systems.md +++ b/text/0000-build-systems.md @@ -261,12 +261,22 @@ during the planning stage. Many organization want to employ their own strategy for maintaining and versioning code and for resolving dependencies, *in addition* to build -execution. In this case, they may wish to entirely forgo producing a meaningful -Cargo.toml for the code the write, instead having one that just forwards to a -plugin (as in the example plugin above). The description of dependencies is then -written in the external build system's rule format. Here, Cargo acts primarily -as a *workflow and tool orchestrator*, since it is not involved in either -planning or executing the build. Let's dig into that. +execution. + +In this case, the big question is: how can we arrange things such that the Rust +tooling ecosystem can understand what the external build system is doing, to +gather the information needed for the tools to operate. + +The possibility we'll examine here is using Cargo **purely as a conduit for +information from the external build system to Rust tools** (see Alternatives for +more discussion). That is, tools will be able to call into Cargo in a uniform +way, with Cargo subsequently just forwarding those calls along to custom user +code hooking into an external build system. In this approach, Cargo.toml will +generally consist of a single entry forwarding to a plugin (as in the example +plugin above). The description of dependencies is then written in the external +build system's rule format. Thus, Cargo acts primarily as a *workflow and tool +orchestrator*, since it is not involved in either planning or executing the +build. Let's dig into it. #### Workflow and interop story @@ -497,6 +507,13 @@ include: single-version, mono-repo world internally, and it's not clear what the gains would be. +One key open question is: what, exactly, do Rust tools need to do their work? +Tool interop is a major goal for this effort, but ideally we'd support it with a +minimum of fuss. It may be that the needs are simple enough that we can get away +with a separate interchange format, which both Cargo and other build tools can +create. As part of the "experimental" part of this RFC, the Cargo team will work +with the Dev Tools team to fully enumerate their needs. + # Unresolved questions [unresolved]: #unresolved-questions From b5c2615e1989d2c4051acf69639347c425aa38b0 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 1 Feb 2018 12:48:09 -0800 Subject: [PATCH 12/12] RFC 2136 is: Cargo build system integration --- text/{0000-build-systems.md => 2136-build-systems.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-build-systems.md => 2136-build-systems.md} (99%) diff --git a/text/0000-build-systems.md b/text/2136-build-systems.md similarity index 99% rename from text/0000-build-systems.md rename to text/2136-build-systems.md index b65dfac5716..65251e6d157 100644 --- a/text/0000-build-systems.md +++ b/text/2136-build-systems.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2017-09-01 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/2136 +- Rust Issue: N/A # Summary [summary]: #summary