Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow path dependencies without having everything in one Cargo workspace #1773

Closed
shelbyd opened this issue Jan 13, 2023 · 10 comments
Closed

Comments

@shelbyd
Copy link

shelbyd commented Jan 13, 2023

My team has a monorepo for our Rust + Go + JS project. We're trying to get all of our rust code compiling in Bazel. We have path dependencies for our own code inside the monorepo.

AFAICT this function

pub fn is_workspace_owned(manifest: &Manifest) -> bool {
effectively requires us to have all of our Rust code in a single Cargo workspace if we're going to use path dependencies. This isn't a huge problem except for dependency resolution.

Cargo can't find compatible versions of dependencies, and I'm not aware of a way to resolve that. It's possible that specifying versions in the WORKSPACE file could work, or vendoring the dependencies. But I'd prefer not to do either of those options.

If we remove the root [workspace], normal Cargo builds everything fine (path dependencies included). So it seems that the requirement of a Cargo workspace when using path dependencies is unnecessary and would be the most effective way to resolve our issue.

Does this sound reasonable? Is there a better way to resolve our situation?

@hobofan
Copy link
Contributor

hobofan commented Jan 13, 2023

I think this is a duplicate (or at least has large overlap) with #1525.

@shelbyd
Copy link
Author

shelbyd commented Jan 17, 2023

This is a large issue for my team. We have a few binaries in our repo that we cannot bring into Bazel because of this.

Looking at this more, the issue is that we have:

  1. bin0 indirectly requires some_crate-v0 and our_lib
  2. bin1 indirectly requires some_crate-v1 and our_lib
  3. our_lib does not require some_crate (but may in the future)

There should be no target that actually ends up with both some_crate-v0 and some_crate-v1 in the build.

I tried to specify packages in our root crates_universe by aliasing the different versions of some_crate. But { "some_crate-v0": crate.spec(package = "some_crate", version = "0") } did not generate any rules related to some_crate, so I don't know what to do there. I could probably get this working if that package spec worked.

The only remaining solution I see is to vendor our crates and have a single version of some_crate vendored. They're almost surely compatible, or at least minimal changes required. But I'd prefer not to include all of the crates we use. Is there a way to vendor only certain crates that we need to modify but keep them in the same crate_universe?

As I said, this is a serious issue for my team, so I'd at least like some idea of which direction to go in, or potential alternatives. Thanks.

@hobofan
Copy link
Contributor

hobofan commented Jan 20, 2023

So I figured out a workaround (that I've used in variant with cargo-raze in the past which has similar limitations): symlinking the crates, so that they are present in multiple places in the directory tree.

I've built an example repository here: https://github.com/hobofan/rules_rust_multiple_workspaces - I'll see that I can describe it in more detail in the coming days after trying out that approach more in depth.

There are some downsides with that approach (but it seems to be enough for our team at least to make the switch), so first-class support would still be great.

Some limitations (I'm sure there are some I've yet to encounter):

  • You can't rely on the default native.package_name() for the package_name value of e.g. all_crate_deps macros, and as soon as you mix consuming workspaces, you have to define the value for all the ones in a BUILD.bazel file
  • The above causes a lot of duplication, which especially for deep local dependency trees could be a pain to maintain - I assume that can be combated with writing macros for reused targets that have the same structure
  • As packages are present multiple times in the dependency tree, this can cause things to be rebuilt multiple times, e.g. test suites may be executed multiple times if naively running bazel test //... (which depending on your needs may also have a positive property of testing all configurations, though it's not what we want) - With cargo-raze we were able to work around that with .bazelignore, but the crates_universe mechanism doesn't work if a symlinks dependency is ignored.

@shelbyd
Copy link
Author

shelbyd commented Jan 21, 2023

I was able to get my package compiling with [patch]ing dependencies with conflicts and updating them to remove the conflicts in another repo. See #1645 (comment) for some more info.

I'm not sure what the best way to resolve the core problem is. It's reasonable to require [patch]ing conflicting dependencies. But I shouldn't have to have them in a separate repo, so [patch]ing just the conflict in a locally vendored directory is reasonable.

@irajtaghlidi
Copy link

I encountered the same core issue in my monorepo, which includes multiple Cargo workspaces with path dependencies between different workspace crates. The symlink that @hobofan provided worked in my test project, but for the actual monorepo that I have, it would be very difficult to manage. I just wanted to ask about your recent findings. Have you found any better solutions for this problem?

@UebelAndre
Copy link
Collaborator

This is something we think would be good but haven't had time to really think about and implement. My primary concern with the feature is that folks start using path dependencies as a way to inject dependencies external to the current Bazel workspace. So Ideally relative paths leading beyond BUILD_WORKSPACE_DIRECTORY and absolute paths would be prohibited or some very loud and obvious warning raised informing users they're in dangerous waters.

If someone were to open a PR showing these things were covered in addition to the working feature then I'd be happy to review 😄

@irajtaghlidi
Copy link

irajtaghlidi commented Apr 29, 2024

Since using external dependencies is handled with crates universe and also path dependences (the ones which are outside of Cargo workspace but still inside the Bazel workspace) works fine if we don't use crates_repository and directly include them in BUILD files. What if we add a new flag option to ask the crate universe to accept and just add path dependencies outside of the Cargo workspace as is to all_crate_deps() output array? I just want to start working on a PoC to see how it acts. at least in my setup.

@hobofan
Copy link
Contributor

hobofan commented Apr 29, 2024

So we've used a variation of the symlink based approach I described in an earlier comment for more than half a year at the company I worked at until it went bankrupt. I still have access and rights to use the code, so I'll try to distill the "sugar" we built around that approach before I forget how it worked 😅. The rules_rust version used in the repo is 0.17.0, so I hope this isn't too outdated today.

At the end we had used the approach with 10 different crate indices (I've truncated it to two in the examples below). It worked quite decently for reducing the blast radius of dependency updates between different parts of the monorepo and resolving some walls that we hit with dependency resolution in building for different targets (Linux, macOS and WASM), but also had the downside that some compile-heavy deep dependencies (in our case the patched version of RocksDB that came with oxigraph at the time) was compiled ~4 times which could make up the bulk of our total build time. (The impact of that can possibly be mitigated in some way by building that crate more manually and providing that to crates_universe.)

Rough Cargo.toml and symlink structure

|
-- some_package
|   |- Cargo.toml
|   |- multi_index_targets.bzl
|   -  BUILD.bazel
| 
-- other_package
    |- Cargo.toml
    - BUILD.bazel
    - local_vendor
        |- some_package  <- symlink to /some_package

some_package and other_package are in two separate workspaces. some_package is symlinked into the workspace of other_package.

In the Cargo.toml of other_package:

[workspace.dependencies]
some_package = { path = "./local_vendor/some_package" }

Here, some_package appears as a normal local dependency, so on the consuming end there isn't anything special that has to be done with crates_universe. IIRC this worked a lot better than having it as one of the [members] of the workspace, as I did in the example repo I posted initially.

/bazel_tools/rust_workspace_indices.bzl

Used to switch between different aliases and all_crate_deps implementations based on index key.

load("@oxolotl_index//:defs.bzl", oxolotl_aliases = "aliases", oxolotl_all_crate_deps = "all_crate_deps")
load("@query_server_index//:defs.bzl", query_server_aliases = "aliases", query_server_all_crate_deps = "all_crate_deps")

def resolve_rust_workspace_index(index):
    alias_macro = None
    all_crate_deps_macro = None
    if index == "@query_server_index":
        alias_macro = query_server_aliases
        all_crate_deps_macro = query_server_all_crate_deps
    if index == "@oxolotl_index":
        alias_macro = oxolotl_aliases
        all_crate_deps_macro = oxolotl_all_crate_deps

    if alias_macro == None:
        fail("""
        ============
        ============
        It looks like the index `{index}` is not known. Please register it in `/bazel_tools/rust_workspace_indices.bzl`.
        ============
        ============""".format(index = index))

    return (alias_macro, all_crate_deps_macro)

some_package/multi_index_targets.bzl

For each target that should be built for multiple indices we created a thin rust_library macro that utilized resolve_rust_workspace_index that looked something like this example. The macros also take other targets that were specifically built for this index as an argument (here dep_oxolotl_proto_types).

load("@rules_rust//rust:defs.bzl", "rust_library")
load("//bazel_tools:rust_workspace_indices.bzl", "resolve_rust_workspace_index")

VISIBILITY = [
    "//oxolotl:__subpackages__",
    "//query_server:__subpackages__",
    "//msg_server:__subpackages__",
]

def oxolotl_library(name, package_name, index, dep_oxolotl_proto_types, **kwargs):
    (alias_macro, all_crate_deps_macro) = resolve_rust_workspace_index(index)

    rust_library(
        name = name,
        srcs = native.glob(
            [
                "src/**/*.rs",
            ],
        ),
        aliases = alias_macro(
            package_name = package_name,
        ),
        crate_name = "oxolotl",
        edition = "2021",
        proc_macro_deps = all_crate_deps_macro(
            package_name = package_name,
            proc_macro = True,
        ),
        visibility = VISIBILITY,
        deps = all_crate_deps_macro(
            package_name = package_name,
            normal = True,
        ) + [
            dep_oxolotl_proto_types,
        ],
        **kwargs
    )

some_package/BUILD.bazel

The macro was then used in the BUILD.bazel file of the package like so (as seen in package_name this BUILD file needs to know about it's consumers so some coupling is introduced). As seen in the rust_test, in this file the package_name has to be explicitly provided in it's non-symlinked version, because otherwise the BUILD file has problems during evaluation (I don't remember the exact reason).

oxolotl_library(
    name = "oxolotl",
    package_name = "oxolotl/oxolotl",
    dep_oxolotl_proto_types = "//oxolotl/oxolotl_proto_types:oxolotl_proto_types",
    index = "@oxolotl_index",
)

oxolotl_library(
    name = "oxolotl_for_query_server",
    package_name = "query_server/local_vendor/oxolotl",
    dep_oxolotl_proto_types = "//query_server/local_vendor/oxolotl_proto_types:oxolotl_proto_types_for_query_server",
    index = "@query_server_index",
)

rust_test(
    name = "oxolotl_lib_test",
    crate = ":oxolotl",
    env = {
        "INSTA_WORKSPACE_ROOT": "$(rootpath :oxolotl)",
    },
    deps = oxolotl_all_crate_deps(
        package_name = "oxolotl/oxolotl",
        normal_dev = True,
    ) + [":oxolotl"],
)

Hope this gives some insight and that I didn't forget an integral part about it 😬

@criemen
Copy link
Contributor

criemen commented Dec 6, 2024

@illicitonion was this issue also fixed by #3025?

@illicitonion
Copy link
Collaborator

Yes, I believe this should have been fixed as of #3030

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants