Version management is kind of terrible #745

ThatGeoGuy · 2024-10-08T20:49:42Z

So I use the protobuf crate alongside protobuf-codegen to manage .proto → .rs conversions.

By and large, this is fine. However, I have a recurring problem with how versions are managed via constants with the library. Take for example my build.rs script:

fn main() {
    println!("cargo:rerun-if-changed=protos");
    _ = std::fs::create_dir("src/protos");
    protobuf_codegen::Codegen::new()
        .pure()
        // This is protos in the root of the crate
        .includes(["protos"])
        .input("protos/my_schema.proto")
        // This is protos in src in the crate
        .out_dir("src/protos")
        .run_from_script();
}

Fairly simple, and generally follows the kinds of practices you tend to see in build.rs. In particular, notice the println!("cargo:rerun-if-changed=protos"); line, which tells Cargo not to rerun this file unless the schemas themselves have changed.

Now here's the problem, in my Cargo.toml, this dependency is listed as:

protobuf = "3.5"

Generally, I expect a few things from this:

The crate adheres to semver (even if loosely speaking)
Newer versions of the crate should (according to semver) not break code (including generated code) unless there is an actual breaking change, which you expect in the major version.

This is how most crates work, and even crates that generate rust code (e.g. bindgen, capnproto-rust, etc) behave this way. The whole point of using protobufs is that they should be backwards compatible if nothing else.

Nevertheless, if I check the generated rust files in or am unfortunate enough to blow away my lockfile / run cargo update and a new version of the protobuf crate exists, I get a version error because the previous version constant:

/// Generated files are compatible only with the same version
/// of protobuf runtime.
const _PROTOBUF_VERSION_CHECK: () = ::protobuf::VERSION_3_5_1;

No longer exists. This seems problematic (as a policy) for a few reasons:

This basically means every release of this crate (including patch releases) is a major-semver breakage, because I need to regenerate all of this old code.
My options for getting around this restriction are all bad. I can remove the rerun-if-changed=protos line from my build.rs and stop checking these files into version control, but then this build script runs every time I run any build/lint/check/test command from Cargo. This sucks because if you have a lot of schemas, this can delay builds by tens of seconds on slow CI machines (and be careful you don't have to build for multiple architectures / platforms, now you're looking at potentially tens of minutes wasted on each build). I can set a hard specification of protobuf = "=3.5.1" in my Cargo.toml, but that means that now I have to track releases and bug fixes for this crate, as well as manage that across lockfiles, dependency versions, library dependencies, etc. This is do-able, but it's tough, and I would rather not have to export a 'major' breaking change from my generated code just to upgrade a patch release of the protobuf crate.
This makes it really hard to track what is a major semver breakage in my own code because these types and interfaces might get transitively pulled in.
This isn't how I would expect backwards compatibility for protobufs to work in any other language, and I certainly don't think it's inherent to the technology or codegen mechanisms being used. capnproto-rust, for instance, does not have this issue whatsoever.

I can see that the title of this issue (Version management is kind of terrible) comes off a bit strong, but this is a lot of work invoked downstream every time a new crate is published. Specifically, dealing with what is effectively a major version breakage every time the crate updates sucks, and works directly against how Cargo functions. Maybe this seems somewhat downstream of the fact that the "major" version is somewhat tied to the Protobuf 2 / 3 split, and I would be willing to accept if the crate had some sort of pseudo-zerover setup going on because of that. However, even patch changes can break this.

I guess at the end of it all, I really just want to know if this is something that:

The maintainers are interested in fixing
There is any better advice or workarounds

Cargo doesn't currently have any flag akin to rerun-if-changed that I can set to rerun if and only if the protobuf version in the lock file changes. I also think setting rerun-if-changed=Cargo.lock isn't really a solution either, because a given lockfile can change for a lot of reasons.

Note: on the side, it's also really hard to determine in workspaces if the same version of the codegen and protobuf crates were used because different features may be enabled in different crates in a workspace. This issue blows up about 100x more when you have a workspace that may have multiple versions of the protobuf dependency present, and it's awful :(

At the end of the day: thank you for providing this crate! It's very useful, and nobody at my company has really had issues with the library itself. But policies around backwards compatibility and how these constants change every release generate a lot of churn, and documentation hasn't really covered that up by providing any best practices for managing generated code either in a workspace or outside of it.

The text was updated successfully, but these errors were encountered:

ThatGeoGuy · 2024-10-17T15:21:18Z

For others' sake: we ended up working around this by just locking the version in our Cargo.toml, but by separating every protobuf message definition into its own crate.

Separating the message definitions into their own crate (which in turn gets imported by other crates in our workspace), we can control the major/minor/patch versions in an appropriate way. If we decide to upgrade from protobuf=@=3.6.0 to some other version, we have to export a major version bump in our "protobuf messages" crate. Otherwise, we're able to manage semver breakages normally.

We also chose to re-export the protobuf crate within our custom crate (so you'd import use protobuf_messages::protobuf; instead of use protobuf;). This was to avoid pulling in mismatched versions of protobuf with respect to the message definitions, as other crates might also use a different version of protobuf and pull that in transitively.

It's somewhat of a frustrating workaround: now we need to publish one more crate (with a unique name on our registry) for the purposes of working around versions in a dependency. Further, we needed to re-export protobuf from this new crate to be able to effectively lock the dependency without placing the same version restrictions on every other dependency that might transitively use the protobuf crate. It isolates the problem and helps regain that semver consistency that Cargo craves, but also just blows.

Either way, it would be lovely if the maintainers could do something about this to avoid all the extra work being invoked downstream. This is far from an ideal solution and works very differently to how almost every other similar system operates.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Version management is kind of terrible #745

Version management is kind of terrible #745

ThatGeoGuy commented Oct 8, 2024

ThatGeoGuy commented Oct 17, 2024

Version management is kind of terrible #745

Version management is kind of terrible #745

Comments

ThatGeoGuy commented Oct 8, 2024

ThatGeoGuy commented Oct 17, 2024