Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: support easy way to install a remote binary while respecting remote 'replace' and 'exclude' directives #31173

Open
thepudds opened this issue Mar 31, 2019 · 11 comments
Labels
early-in-cycle A change that should be done early in the 3 month dev cycle. modules NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@thepudds
Copy link
Contributor

thepudds commented Mar 31, 2019

What version of Go are you using (go version)?

$ go version
go version go1.12.1 linux/amd64

Does this issue reproduce with the latest release?

Yes, including tip.

What did you do?

cd /tmp
go get foo

where:

  • foo is a module with replace directives chosen by the module author, and
  • foo is a binary command, and
  • I want foo to build as specified by the module author

What did you expect to see?

The replace directives in the go.mod for foo respected in my resulting foo binary.

What did you see instead?

The replace directives in the go.mod for foo are currently ignored.

Summary

The suggestion in this issue is to provide some easy way to install a remote binary while respecting replace and exclude directives in the remote module.

This seems necessary based on observed usage, as well as this seems part of the promise from the proposal and elsewhere that "a module author is in complete control of that module's build when it is the main program being built". That promise of control by the module author seems especially important when the author is publishing a binary for use by others.

Background

As far I as I was able to follow the design discussions:

  • By design, modules are a less expressive system than a more traditional approach, and
  • It seems the complete control given to the top-level build via replace and exclude directives was important aspect of balancing out that reduction in expressiveness.

For example, the "Build Control" section in the official proposal has a discussion about this, including (with emphasis added):

Minimal version selection gives the top-level module in the build additional control, allowing it to exclude specific module versions or replace others with different code, but those exclusions and replacements only apply when found in the top-level module, not when the module is a dependency in a larger build.

A module author is therefore in complete control of that module's build when it is the main program being built, but not in complete control of other users' builds that depend on the module. I believe this distinction will make this proposal scale to much larger, more distributed code bases than the Bundler/Cargo/Dep approach.

There are similar sentiments expressed in the initial vgo blog series, such as:

Minimal version selection is very simple. It achieves simplicity by
eliminating all flexibility about what the answer must be: the build
list is exactly the versions specified in the requirements. A real system
needs more flexibility, for example the ability to exclude certain module
versions or replace others
.

Those arguments seem reasonably compelling. It seems, however, the current Go modules system does not quite deliver on that promise when, for example:

  • The author of module foo decided to use a replace or exclude, and
  • The module foo is "the main program being built" via a go get foo executed outside another module.

In that scenario, the current modules system ignores the remote replace or exclude in foo. In other words, that scenario seems to illustrate not delivering on the promise of "A module author is therefore in complete control of that module's build when it is the main program being built".

"Don't use replace" as an alternative solution

When the concern in this issue has been raised in the past, sometimes the response has been something like "People shouldn't really use replace when releasing a module". However, I think that falls short as a solution.

@rogpeppe has stated for example that juju cannot currently be built without replace directives.

In general, one could imagine that the need for replace directives going down over time as the ecosystem adapts modules and semver more faithfully, but it is hard to imagine the need for need for replace directives going so low that the need for replace could be approximated by zero. For example:

  • Use of semver is never perfect, especially in the face of human error.
  • Changes can break a consumer without changing the statically checkable API.
  • v0 is a "compatibility free zone", yet people in practice and by necessity still use v0 dependencies. MVS can deliver incompatible results in the face of multiple require directives for a v0 dependency.
  • Etc.

I think the on-going need for replace is especially true given the purposeful reduction in expressivity elsewhere in the modules system.

"git clone" as an alternative solution

In discussions on this topic, a response is sometimes made along the lines of "If an author of a binary needs to use replace directives, they can always just update their readme to ask users to not do go get or go install and instead do a git clone followed by go install". A readme in theory could also include git clone --branch. However, I think a git clone solution falls short given:

  • The future benefits of GOPROXY mirrors and notaries, etc.
  • The benefits of the go command automatically picking a good semver tag for you, with a default of @latest (e.g., the semver-aware logic described in "Module aware go get").
  • The need to update the readme over time (e.g., changing the recommended semver tag if the readme supplies a specific recommended semver tag to use with git, or changing the readme if it normally specifies a more standard go get variation but then is only temporarily updated to specify using git clone in order to respect a replace directive for a few months while waiting for resolution of an upstream problem, etc.).
  • go get some/cool/cmd has proven to be popular within the Go community, including as a concrete "gateway" to Go for people who are not yet developing Go themselves.
  • go get nicely hides most VCS differences.

Hugo is an example of an early modules adopter that currently has a replace directive in its go.mod. It has the following installation instructions:

Since Hugo 0.48, Hugo uses the Go Modules support built into Go 1.11 to build. The easiest is to clone Hugo in a directory outside of GOPATH, as in the following example:

mkdir $HOME/src
cd $HOME/src
git clone https://github.com/gohugoio/hugo.git
cd hugo
go install

That might be the right choice for Hugo based on its needs and the current state of modules in Go 1.12, but those instructions for example ignore the benefit of semver tags.

Personally, I would view it as a step backwards if something along those lines became the recommended solution if you have a replace directive.

Other possible solutions

There are likely many possible solutions, but here are three sample solutions to help start conversation here.

Under all three of these options, it could be reasonable reject any filesystem-based replace directives in a remote go.mod that reach outside the module (as suggested by Bryan in #24250 (comment)). To help avoid placing a testing burden on authors to check for that, go release could warn if that condition is found, which is likely a good thing to do anyway as suggested in the go release issue in #26420 (comment).

Option 1

If #30515 is adopted with a new -b flag (for binary or bare), and go get -b foo ends up meaning "install foo while ignoring any current module's go.mod and without altering the current module's go.mod", it could be natural for go get -b foo to also respect any replace or exclude directives in foo. The rationale could be that there is no other top-level module being considered aside from foo when -b is specified.

The same could apply if a different spelling than -b is selected for #30515 (e.g., perhaps #30515 is resolved via a go get -global, go get -clone, go install, etc.).

Option 2

If #30515 is not adopted, then go get foo when run outside of a module could be redefined in 1.13 to respect any replace or exclude directives in foo. The rationale could be that there is no other top-level module being considered aside from foo when doing go get foo outside of a module.

This behavior was suggested by multiple people for Go 1.12 (including by Bryan in #24250 (comment)), but the ultimate Go 1.12 behavior was different, including Bryan commented in CL 148517 that his 1.12 change was "as minimal a change as I could comfortably make to enable 'go get' outside of a module for 1.12". (Part of the context for Bryan's comment in the CL is I think that minimal change might have been implemented after the 1.12 freeze).

Option 3

The behavior in Option 1 and Option 2 could both be the 1.13 behavior. In other words:

  • go get foo when run outside of a module in 1.13 would now respect any replace or exclude directives in foo (in addition to respecting the remote module's require directives and not changing any local go.mod).
  • go get -b foo in general means "run as if in a directory outside any module". (Under this definition, it would imply replace or exclude directives in a remote foo are respected).

Relationship to other issues

This has been discussed in multiple issue such as #27643, #24250, #30515 and others, but usually as a side topic. Perhaps this issue here can be closed as a duplicate of #30515, but @mvdan, the author of #30515, asked that this aspect be discussed in a different issue than #30515, so hence this issue is being opened now.

@bcmills
Copy link
Contributor

bcmills commented Apr 1, 2019

I'm still not convinced that long-term use of replace — other than by developers during development of their own modules — is actually something we should encourage.

(Note that we do support it, in the sense that it remains possible to check out the repository and run go build within its working tree, but dedicating a go subcommand or flag to it would significantly change the perception around when and how replace directives should be used.)

In particular, the compatibility argument for using replace seems off too me: if someone is building your module as the only thing passed to go get, then they get exactly the module versions that you've selected — so even if there are breaking changes after that point, why are they relevant?

So I would like some more detail: @rogpeppe, why does juju require replace directives today, and what is preventing you from migrating away from them over the long term?

@bcmills bcmills added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Apr 1, 2019
@rogpeppe
Copy link
Contributor

rogpeppe commented Apr 1, 2019

So I would like some more detail: @rogpeppe, why does juju require replace directives today, and what is preventing you from migrating away from them over the long term?

OK, here's some detail on the replacements used by Juju currently. I speak at some remove, because I haven't been directly involved in the Juju code base for a little while now, but I have been historically familiar with some of the issues.

Summary

It might be possible to move away from replace statements in the long term, but it's not necessarily possible in practice without hard-forking multiple external repositories, which has its own on-going cost. It's certainly not going to happen in the short to medium term, which means that replace statements are going to be a necessity for building juju for a while to come.

And even the need for these particular replace statements is dropped eventually, another similar issue might well arise in the meantime. The open source world is not always clean. Some kind of replace statement is a crucial band-aid at times, and one which anyone installing the juju commands, not just developers, will need to apply.

github.com/altoros/gosigma => github.com/juju/gosigma

Fixes races in tests and code. https://github.com/Altoros/gosigma/pull/1/files.
No new or changed API.

There's been a PR open since 2017-05-23 without response.

Making a hard fork of this probably wouldn't be that hard to do, although might
be politically awkward, as github.om/altoros/gosigma is the "official" API client.

gopkg.in/yaml.v2 => github.com/juju/yaml

A tweak to the accepted floating point syntax. No new API. A small but
significant tweak - the two repositories disagree on whether the a
member in:

a: 123456e1

holds a floating point number or a string. Changing this behaviour might
break backward compatibility with existing YAML files that are held in
immutable databases online.

Changing this unilaterally inside juju could break other code. For
example, code might return a yaml.MapSlice which is intended to be
marshaled as an order-preserving map, but would instead be marshaled as
an array because its type is not recognized.

Changing this across the entire code base would be hard, as many external
repositories have dependencies on this package.

gopkg.in/mgo.v2 => github.com/juju/mgo

19 commits. Some new API.

The parent repository is no longer maintained, and was not generally
accepting of patches even when it was. The main fixes here are to the
mgo/txn code, which is heavily used by Juju and has been the source of
some major operational issues. The patches address some of these issues,
amongst other things.

Changing this dependency to use github.com/juju/mgo throughout
would be a major piece of work. It is a dependency of many external
repositories which don't themselves care about the fixes that Juju
needs (because they don't use mgo/txn); changing the import path in
those repositories would require a major version change, and subsequent
cascading changes. Persuading the owners of all these repositories that
they should switch to Juju's fork of mgo for fixes that don't directly
affect them would require significant politics, if it could be achieved
at all.

@thepudds
Copy link
Contributor Author

thepudds commented Apr 1, 2019

Hi @bcmills, briefly cherry picking a couple points you made:

In particular, the compatibility argument for using replace seems off too me: if someone is building your module as the only thing passed to go get, then they get exactly the module versions that you've selected

...unless of course part of the way you selected the module versions as the binary author is via a replace or exclude directive, in which case your consumers don't get the module versions that you've selected if your consumers do a go get under the current system.

so even if there are breaking changes after that point, why are they relevant?

That is not my concern -- modules provide 100% reproducible builds, etc.

The compatibility argument is not about future breaking changes causing problems for consumers, and is more about incompatibilities that exist when a binary is being published given for example (a) semver is not always perfectly followed in practice and (b) people use v0 dependencies from the "compatibility free zone". Either of those mean MVS by itself can select problematic versions of foo's dependencies that the human author of foo can resolve with replace prior to publishing foo. As Russ wrote, "A real system needs more flexibility, for example the ability to exclude certain module versions or replace others". It seems to me incompatibilities are things that arise in a real system in ways that MVS (by design) cannot always resolve by itself.

@bcmills
Copy link
Contributor

bcmills commented Apr 1, 2019

unless of course part of the way you selected the module versions as the binary author is via a replace or exclude directive

That's a bit of a circular argument, though. 🙂

The central question here, as I see it, is: should replace directives be a tool for selecting module versions, or only for relocating them? (I can see a clear use-case for relocation: namely, local development of dependencies. The use-case for selection seems less clear-cut, because it partially overlaps with the usual version selection by MVS.)

@bcmills
Copy link
Contributor

bcmills commented Apr 1, 2019

@rogpeppe, thanks for the detail.

It seems to me that since gosigma and mgo are effectively unmaintained, they should probably be hard-forked (if they cannot be transferred to more active maintainers). That makes replace (and especially #26904) an important migration tool, but a temporary one.

Specifically in relation to mgo, you note:

changing the import path in those repositories would require a major version change, and subsequent cascading changes.

Could you give some more detail about why changing the import path would require a major version change? (Do packages pass through mgo types in their exported APIs in a way that cannot be transitioned using type aliases, interfaces, or a mix of the two?)

@bcmills
Copy link
Contributor

bcmills commented Apr 1, 2019

The issue with yaml seems very unfortunate, but if the two packages fundamentally disagree on the meaning of certain YAML files, then it seems even more important that dependencies outside your project receive the version of the package they were developed against, in order to consume their own YAML files (if any) correctly.

That is, I suspect you'll want to maintain a fork of the package for compatibility with existing YAML files, but you'll want to use that fork only for the parts of the code that consume existing files.

On the other hand, if you replacement really is benign for your dependencies, then perhaps the yaml package should have an explicit, supported hook (a runtime function call, or perhaps a build tag) that you could use to configure the mainline package. (Have you discussed that possibility with the go-yaml maintainers, presumably @niemeyer?)

@rogpeppe
Copy link
Contributor

rogpeppe commented Apr 1, 2019

Could you give some more detail about why changing the import path would require a major version change? (Do packages pass through mgo types in their exported APIs in a way that cannot be transitioned using type aliases, interfaces, or a mix of the two?)

Yes. For example see this package. The forked mgo package can't type-alias the original Collection type (because then it can't define its own methods on it); neither can the mgorootkeystore package change the type it's using to an interface.

Changing the mgorootkeystore package to use a different version of mgo would definitely be an API-breaking change.

I agree that replacements are not desirable and can usually be worked around in time (with negotiation with upstream maintainers, hard forks etc), but the fact remains that they are important at the time for making working products.

Even if the replacement is just a "temporary" fix, it can be an important one, and I consider it important that it's possible to publish a predictably-versioned Go binary with the fix in place until the replacement can be removed. Telling the user to do their own git clone (working out the correct incantation for that is non-obvious with vanity import paths) or similar is not a good solution.

I think Russ's words, aptly quoted above, bear repeating once again:

Minimal version selection is very simple. It achieves simplicity by eliminating all flexibility about what the answer must be: the build list is exactly the versions specified in the requirements. A real system needs more flexibility, for example the ability to exclude certain module versions or replace others.

Please, let's provide that flexibility for all users of Go modules.

BTW I would be perfectly happy if file-path replacements yielded an error in this case. I do consider file-path replacements suitable for developers only (and they're not a fundamental part of VCS, which replacements and exclusions in general are).

@bcmills bcmills added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Apr 1, 2019
@gopherbot gopherbot removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Apr 1, 2019
@bcmills bcmills added this to the Go1.14 milestone May 10, 2019
@bcmills bcmills added the early-in-cycle A change that should be done early in the 3 month dev cycle. label Aug 8, 2019
@bcmills
Copy link
Contributor

bcmills commented Oct 4, 2019

This has an interesting interaction with #33848.

#33848 proposes to use the main module's vendor directory automatically when it exists, in much the same way that we use the main module's replace and exclude directives. However, the main module's vendor directory — much like filesystem targets of replace directives — may include go.mod files that cause it to be pruned out from the module's .zip file.

Also like a replace directive, the vendor directory may include local edits and patches, although I would not recommend patching the vendor directory in that way.

@bcmills
Copy link
Contributor

bcmills commented Oct 4, 2019

Perhaps #26904 provides a better way forward: one option we're considering there (from among many alternatives) is adding some new keyword to the go.mod file, which would remap or alias the import paths themselves rather than their source code.

Those paths are necessarily always import paths, not filesystem paths, which obviates the concern about treating filesystem paths and module paths differently.

Unfortunately, applying only a subset of the main module's directives still poses a testing problem: it would still mean that the module could be built — and would need to be tested in — a third configuration that is neither the same as the view from inside that module, nor from an external module.

@bcmills
Copy link
Contributor

bcmills commented Oct 4, 2019

The more I think about it, though, the more convinced I am that an “install from semi-inside” command should not be go get with a flag.

go get currently serves two purposes: it resolves and adds dependencies, and it installs binaries from within those dependencies. A command to install binaries from within one specific module is meaningfully different: it constrains the module relationship of the arguments in a way that go get in general does not.

You might even imagine that a build in this mode should not resolve missing imports: if the point is to exactly reproduce what the developer would have built, then the developer necessarily needs to record all of the versions that they intended to use.

@rogpeppe
Copy link
Contributor

rogpeppe commented Oct 4, 2019

You might even imagine that a build in this mode should not resolve missing imports: if the point is to exactly reproduce what the developer would have built, then the developer necessarily needs to record all of the versions that they intended to use.

This seems like a good idea to me.

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
posener added a commit to posener/diff that referenced this issue Jan 17, 2020
posener added a commit to posener/gitfs that referenced this issue Jan 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
early-in-cycle A change that should be done early in the 3 month dev cycle. modules NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

5 participants