Add cargo_crate repository rule #2
I would love to help out with this feature. I am very interested in using Bazel to build projects with Rust code, and Cargo support is a must-have. Is the linked document about repository rules the best place to start, or are there other good documents to read?
Awesome! The document linked above is the original design doc for Skylark repository rules. We now have documentation for repository rules as well. I took a brief look at the
Ideally, it would be nice to reuse some of the code in Cargo itself, but I haven't looked into the implementation in enough detail to know how feasible it is.
Thanks for the information! I've been looking at Cargo's libraries to see what code we can reuse. It'd be really nice if I could write a cargo command to do what we need.
There is a third-party Cargo subcommand
Bikeshed: I might prefer this be called "cargo_source" or "cargo_crate_source", to indicate that these are not valid targets for an arbitrary
Good point. The word "crate" in Rust parlance seems to more accurately refer to compiled binaries rather than the source code. Let's go with that.

Would you like to take this one since you already have code for fetching crates in your cargo2bazel tool? Having rules call a Python command line tool is completely fine and is the approach taken by the
Yeah, I can take this issue.
Thanks!
Update: TL;DR: [cargo_source + WORKSPACE codegen + BUILD codegen] isn't likely to pan out as well as a hack around cargo, at least in the short term.

I don't have a PR up yet, but I'm currently spiking something more like a

This line of reasoning was spurred initially when I was exploring repository rules in the abstract -- and I realized that these

I recall that kythe is using something quick and dirty that calls cargo directly. I'd bet we could support a 100% integration into cargo by wrapping and cleaning that up, since cargo already builds targets that our own

All of that said, we could always go the whole way, and replicate as much of cargo's functionality as possible, generate cargo sources, multiple build rules for each feature flag subset, and handle target platforms, but I'm not confident that's the fastest (or most reasonable) approach.
For calling cargo directly, do you mean doing so for the most troublesome crates, such as

I think the

Apologies for the delayed reply. I will post a more detailed proposal for the approach that I have in mind shortly.
My own apologies -- I haven't updated this with my latest stuff. Agreed -- "cargo compile" probably isn't where we want to work from -- we should prefer "rust_library" generally. However, I'm still skeptical of "cargo_source" as a useful unit. It papers over the relationships between sources that are intrinsic to a "source", and really is synonymous with "new_http_archive". As mentioned before, I think having a unique external_repository for each source adds more complexity, and leads to more ambiguity than necessary, especially when multiple versions of a crate are involved. My impression is that we'll be fighting cargo a bit more than is useful.

I'll provide more feedback on a future doc, but I'm thinking we should lean on a cargo plugin to generate a single workspace with all dependencies. cargo-vendor is a good starting point there. After that, I think we're on the same page -- use another (or the same) cargo plugin to generate build rules and link the dependencies using cargo's own build planning. I don't think we need to wait for a special cargo change to do this, provided we have an appetite for cargo spelunking. I've already started on that myself.

In short, a cargo_toml(...) workspace rule accepts an existing toml and does some cargo-vendor-like behavior into a genfiles directory, and some custom code runs cargo's compile machinery to generate the build targets. We can map those targets to rust_library rules without much fuss. Looking forward to seeing your proposal!
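To make that proposal concrete, a hypothetical WORKSPACE usage of such a cargo_toml rule might look like the sketch below. The rule name, load path, and attribute names are all illustrative assumptions drawn from this comment, not an implemented API:

```python
# Hypothetical sketch only: the load path and attributes are assumptions
# based on the proposal above, not an implemented API.
load("@io_bazel_rules_rust//cargo:cargo.bzl", "cargo_toml")

cargo_toml(
    name = "crates",
    toml = "//:Cargo.toml",      # existing Cargo.toml to vendor from
    lockfile = "//:Cargo.lock",  # pins the resolved dependency versions
)
```

Under this sketch, the rule would shell out to cargo's own machinery to vendor sources into a generated directory and emit rust_library targets for each resolved crate.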
Here's a quick hack up of the cargo part of the workflow: https://github.com/acmcarther/cargo-raze-vendor

It generates WORKSPACEs like this: https://github.com/acmcarther/cargo-raze-vendor/blob/prototype/cargo/WORKSPACE

I haven't built any of the bazel-side glue (cargo_toml, new_crate_repository, crate_repository), but I think they'll be easier to work through. The gist of it is that cargo_toml shells out to cargo-raze-vendor, and the *crate_repository rules work just like git_repository rules except that they know how to pull off of crates.io. I should have a PR up specifically for this issue very soon (though it might be called new_crate_repository -- we can bikeshed on that when the PR is up).

EDIT: Let's sync up some time soon. There's some thinking behind this super roundabout workflow (bazel_rule -> cargo -> bazel_rules). It's my way of addressing target-specific dependencies -- defer the dependency planning.

EDIT2: I'm not sure we have anything to gain from a cargo_source or new_crate_repository rule or anything like that. There's no additional things we can infer about the crate (because it depends on the crates around it). I've opted to use

The fundamental issue, and the dissonance between what folks expect and what they'll get, is that a "crate" doesn't make sense in a vacuum. You need a set of desired crates and a resolver to nail down the precise dependencies, versions, and feature flags.
I'm interested in helping push this issue forward. What else needs to be done?
@softprops TL;DR: cargo-raze is the cutting edge, with no great story for multiplat or build.rs files yet. rules_cargo is a totally different philosophy (and has awful docs). Both are pretty unpolished. I've been exploring the solution space for the last couple of months. I've investigated:
I'm making the most forward progress on (1), but I haven't really sat down to write up a design doc. (2) is blocked by weird linker issues (as mentioned above). The largest unmet milestone before really publishing and integrating (1) here is a formal support story for replacing the role of build.rs. I have a notion of an "override", which I think should do that job via dependency replacement, additional compiler flags, and environment variables, but I haven't got an example yet.

EDIT: I'm going to really clean up docs on both repos so you can get a better picture for the state of the world.
Awesome. Cargo raze sounds like the perfect solution. https://github.com/johnynek/bazel-deps/blob/master/README.md follows a similar approach for the JVM community. I wouldn't get held up on 2 and 3; 1 solves for the majority of cases, I think, at least in my usage of cargo. I only used build.rs as a stopgap for using serde on stable Rust, but now that macro derives have arrived I can use serde on stable without build.rs files.

Really appreciate the work you're putting into this. I've had a great experience with Bazel's remote build cache, and I'm excited about what it can do for Rust productivity. Never having to build the same Rust code twice on any machine with the same arch will be awesome.
Has this gone anywhere lately? I see that the traffic on the Rust RFC has died down, but I don't know how that influences the outcome of this issue. I'm not really in a position to help push this along, though I would like to be able to use Rust and Bazel together with the crates.io ecosystem.
I may not be the best person to push this forward. I've chased my own tail all the way to reinventing stackage for cargo[1] as an imagined prerequisite. My thought process isn't super well grounded, but I found that the naive vendoring approach breaks down in practice quickly, as the cargo ecosystem is way too keen to pull down many versions of dependencies. It's also impossible to verify that your vendored dependencies actually work, as cargo doesn't lock dev_dependencies -- so you can't actually

I've been working on this in a vacuum though, and if other people will find utility in a version of this that naively brings in dependencies, I can polish that up.
This gets more into thoughts on the ecosystem, but part of the reason why this is both a difficult problem, and particularly acute for Bazel integration is that (a) means that we really need a solution to this problem before we can use bazel with any efficacy, and (b) means that we have to resolve a lot of crates for the average environment if we want to get anything done.
@acmcarther can you expand on the problems caused by naively vendoring multiple versions of dependencies? I imagine at a minimum the current rules don't handle multiple versions of a crate in transitive dependencies.
Issue #47 fixes 'multiple versions of the same crate'. With that and cargo-raze, it is possible to perform resolution for a single Cargo toml without much issue. I don't have a cohesive reasoning for going down the "alternative resolution strategy" track except that:
Not a great set of reasons. If you feel that your use case would directly benefit by polishing what already exists, I can prioritize that. I have a repo that demonstrates auto vendoring and BUILD generation here: cargo-raze-examples. EDIT: s/hobby project/repo: I was going to link next_space_coop, but ended up linking the more up to date cargo-raze-examples.
I'll have to play with raze-one-toml before I ask you to do anything. Is [2] addressed by any rust issues? It seems pretty important to be able to test the vendored crates.
TL;DR: No, there are no issues to my knowledge regarding locking dev_dependencies. I believe that locking dev_dependencies would be considered a "breaking change", as it would change the existing output of resolution passes, and make some arrangements of crates that currently resolve become unresolvable.
@acmcarther raze solution works great so far. I wasn't using the forked rules (since I'm currently using my own fork), but haven't needed multiple versions of crates just yet. I was a little surprised that

In general I think raze is a pretty reasonable way to do the cargo integration.
Ah yeah, that's correct. I was planning on setting that up this weekend in that project, but am waiting on some other PRs to go through (some issues related to TravisCI mean something needs to be done about CI, and moving to building with Bazel seems to make sense).
I realistically don't have enough knowledge of any of those pieces to put together something meaningful here... Can you point me at an example today where you have to do something you don't like, which you think this could help with?
I don't think you really need any knowledge of those sections of the rules. Each demonstrates some unique behavior or interaction that is currently solved by Cargo raze. If you have an understanding of Cargo raze, then I think you have all you need to try and replace the use of

The one I'm most interested in is
I think as of google/cargo-raze@0d76371 (release 0.9.0) the functionality missing from the library is now there, unblocking some of the exploration needed for
I just pushed an additional commit to the branch which pulls in the latest cargo-raze, both fixing some issues and meaning we're no longer using a fork :)
Disclaimer: I have very little experience with raze and in general with the repository rules that fetch stuff from package managers like

Some questions of an uninformed reader:
There is already a
This is in the control of the rule author. Our private RBE deployment has a very minimal remote execution image, and we build system-like packages either using support for compiling them in -sys crates, or using things like

If folks want to have some run-time environment installed on their remote execution systems and use that, it's up to them to make sure it's compatible with wherever they're going to be deploying. In practice, we've used the minimal environment approach to build things like brotli, bzip2, fsevent, inotify, libgit2, openssh, openssl, ring, ...
We haven't experimented with this, but because everything uses platforms and toolchains, things should work properly, as long as the toolchains are properly configured. But I can't vouch for it personally through experience :)
We haven't tried, but it should be - there's no magic here, it's just generating
I don't know - https://buck.build/ only mentions rust_{binary,library,test} rules, and a way of pulling in a .rlib. I know I've seen talk on internals.rust-lang.org of consuming the output of

It's worth noting, though, that this branch really just adds a wrapper around
Right now, within each universe, the implementation allows for one version per cargo-style configuration, so you can have independent versions in normal/build/dev deps, but within each configuration your version requirements need to be compatible and exactly one will be chosen. You can theoretically create multiple universes each of which contains a different version (though right now it's possible this will lead to naming clashes for the repositories we generate, but that's easily fixed) by doing something like:

crate_universe(
name = "rust_deps_old_libc",
packages = [
crate.spec(
name = "libc",
semver = "=0.2.76",
),
],
)
crate_universe(
name = "rust_deps_new_libc",
packages = [
crate.spec(
name = "libc",
semver = "=0.2.86",
),
],
)
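To illustrate the "exactly one compatible version per configuration" constraint described above, here is a toy resolver sketch. It is purely illustrative -- not the crate_universe implementation -- and its caret semantics are simplified (real cargo treats 0.x requirements more strictly):

```python
# Toy illustration of "one version per configuration": all requirements
# for a crate must be satisfiable by a single version, and the newest
# matching version is chosen. NOT the real crate_universe resolver.
def pick_version(requirements, available):
    """Return the newest available version satisfying every requirement,
    or None if the requirements conflict. Supports '=x.y.z' (exact) and
    bare 'x.y.z' (simplified caret: same major, version >= requirement)."""
    def parse(v):
        return tuple(int(p) for p in v.split("."))

    def satisfies(version, req):
        if req.startswith("="):
            return version == parse(req[1:])
        base = parse(req)  # simplified caret semantics
        return version[0] == base[0] and version >= base

    matching = [v for v in (parse(v) for v in available)
                if all(satisfies(v, r) for r in requirements)]
    return ".".join(map(str, max(matching))) if matching else None

# Within one universe, '=0.2.76' and '=0.2.86' conflict -- hence the two
# separate universes in the example above.
print(pick_version(["=0.2.76", "=0.2.86"], ["0.2.76", "0.2.86"]))  # None
```

Splitting the two libc pins into separate universes, as in the snippet above, sidesteps exactly this conflict.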
🎉 Thanks! I think the big issue we need to resolve before PRing is how to bootstrap. We need a compiled rust binary, which has many dependencies, in order to invoke our repository rule. Internally, we've been building one on CI as a merge hook, and having a function in the workspace create an

Alternatively, as we've been talking about releases elsewhere, we could add "build binaries and put them in a tar" to the release process, but including per-platform compiled binaries would make it a sizeable tar... Thoughts very welcome here!

Once we've worked that out, I'll put together a PR so we can do a proper review. Because it's a bit of a code-dump, as noted above there are a few areas where there are a few hacks in place, but it'd be great if we could tidy those up after-the-fact in small targeted reviews, rather than trying to get everything perfect up-front, if that's ok with folks!
I believe other rules like rules_docker (maybe rules_go and bazel-gazelle too?) build and publish binaries into https://storage.googleapis.com, hopefully we can set up something similar for the rust_external resolver:

if "go_puller_darwin" not in excludes:
http_file(
name = "go_puller_darwin",
executable = True,
sha256 = "4855c4f5927f8fb0f885510ab3e2a166d5fa7cde765fbe9aec97dc6b2761bb22",
urls = [("https://storage.googleapis.com/rules_docker/" + RULES_DOCKER_GO_BINARY_RELEASE + "/puller-darwin-amd64")],
)

I think the CI config is:

# Step: push the puller release binary for Linux AMD64
- name: "gcr.io/cloud-builders/gsutil"
args:
- "cp"
- "-a"
- "public-read"
- "bazel-bin/container/go/cmd/puller/puller_/puller"
- "gs://rules_docker/$COMMIT_SHA/puller-linux-amd64"
id: "push-puller-linux-amd64"
      waitFor: ["build-linux-amd64"]
I'd love to see some of the built-in rules get converted to use

edit. Actually, the
Though that said, I'd love to be able to utilize the existing
Not generating a .bzl file is one of the improvements we agreed on wanting as well, so 👍 to that. From our point of view it would be nice to land this if it's useful, experimentally maybe, so we can iterate here instead of maintaining this as a branch and also internally.
I think that's fine then. My only asks would be to have detailed documentation on the starlark and rust code that went into this and move the rule into
I just updated
I think it's time to open a PR for this then 😄 so we can appropriately comment on the changes.
Understanding there's a sort of bootstrapping issue, I think it'd still be worth getting the code in. Maybe BuildKite has a way to build artifacts and commit them back to the repo?
This rule dynamically looks up crate dependencies as a repository rule. See bazelbuild#2 for context.
I just created #598 adding the
This rule dynamically looks up crate dependencies as a repository rule. See #2 for context. Note that this rule is currently experimental, and offers absolutely no stability guarantees. Expect significant churn in the near future.
bazelbuild#1: Include details such as the target architecture in the cache folder hash. Previously only the path to rustc and the compilation mode were used to calculate the hash, and this was likely the cause of ICEs. We now include all of the env vars passed into the request in the hash, so that separate folders are created for different crate names, crate versions, and target architectures.

bazelbuild#2: A new worker was previously being created for every rustc invocation, because the env parameter contained crate-specific details that caused WorkerKey to change each time. Instead of passing the environment in directly, persist it to a file, and pass it to process-wrapper with --env-file instead. We also use the contents of this file in order to generate the hash mentioned above. This will mean you'll be limited to 4 workers by default; you may want to also pass in --worker_max_instances=Rustc=HOST_CPUS*0.5 or similar, as JIT startup is not a concern for the Rust workers, and parts of rustc's compilation process are unable to take advantage of more than 1 CPU core.

bazelbuild#3: Instead of storing incremental files in $TEMP, the worker is now enabled by providing it with a folder that it should store its incremental files in, such as:

bazel build --@rules_rust//worker:cache_root=/path/to/cache/files

This mimics the behaviour of --disk_cache and --repository_cache, and makes it clear where the files are being persisted to.
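The hashing scheme described in #1 and #2 can be sketched as follows. This is an illustrative Python model, not the actual worker code (the real implementation persists the env to a file and hashes that file's contents), and the env var names shown are hypothetical:

```python
import hashlib

def cache_folder_name(env):
    """Toy model of the scheme above: digest every env var passed into the
    request, so different crate names, versions, and target architectures
    land in separate incremental-cache folders, while identical requests
    share one folder."""
    # Serialize deterministically (sorted keys) before hashing; the real
    # implementation hashes the contents of the persisted --env-file.
    serialized = "\n".join("{}={}".format(k, v) for k, v in sorted(env.items()))
    return hashlib.sha256(serialized.encode()).hexdigest()[:16]

# Hypothetical env vars: different targets get different cache folders.
linux = cache_folder_name({"CRATE_NAME": "serde", "TARGET": "x86_64-unknown-linux-gnu"})
mac = cache_folder_name({"CRATE_NAME": "serde", "TARGET": "aarch64-apple-darwin"})
```

The same idea explains why hashing only the rustc path and compilation mode was insufficient: two crates on different target architectures would collide in one folder.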
@illicitonion @hlopko I think this issue could be closed now that #598 is in
Makes sense to me. Please complain if anybody disagrees :)
Add a cargo_crate repository rule for fetching Rust crates from crates.io.