cmd/go: build cache does not check #included headers for changes #24355

Open
FlorianUekermann opened this issue Mar 12, 2018 · 30 comments
Labels
help wanted, NeedsInvestigation

@FlorianUekermann
Contributor

go 1.10 linux/amd64

The go build and go test commands don't rebuild packages if included cgo files changed. I guess a solution would be to run the preprocessor first or disable caching for packages that use cgo altogether.

@andybons added the NeedsInvestigation label Mar 13, 2018
@andybons added this to the Go1.11 milestone Mar 13, 2018
@andybons changed the title from "cgo: false build cache hits if cgo dependencies changed" to "cmd/cgo: false build cache hits if cgo dependencies changed" Mar 13, 2018
@andybons
Member

@ianlancetaylor

@ianlancetaylor
Member

Can you show us a small standalone example? I'm not clear on what you mean by "included cgo files". Do you mean files explicitly included using #include?

@ianlancetaylor changed the title from "cmd/cgo: false build cache hits if cgo dependencies changed" to "cmd/go: false build cache hits if cgo dependencies changed" Mar 27, 2018
@FlorianUekermann
Contributor Author

FlorianUekermann commented Mar 27, 2018

> Do you mean files explicitly included using #include?

Yes.

@ianlancetaylor
Member

One possibility would be to use -MD when running the C compiler, and add the listed files to the hash.

@FlorianUekermann
Contributor Author

FlorianUekermann commented Mar 27, 2018

> use -MD when running the C compiler, and add the listed files to the hash.

Just to make sure I'm following. Did you mean add the contents of all listed files or just the list?

I had running with -E and hashing the output in mind, but I guess your idea may be more efficient (is it?).

@ianlancetaylor
Member

I meant the contents of the listed files, as in cache.FileHash in cmd/go/internal/cache/hash.go. The idea is to be able to use the cache to detect whether we can skip running the compiler. If we use -E then we have to run the compiler anyhow to see whether the cache is up to date.
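To make the idea concrete, here is a minimal sketch of hashing the contents of the files a compiler lists, so that a change to any header changes the key. It is not the code in cmd/go/internal/cache; the names combineDigests and hashFiles and the exact key format are assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"sort"
)

// combineDigests folds the name and content hash of every file into one
// hex-encoded SHA-256 key. Names are sorted so the key does not depend on
// map iteration order or on the order the compiler listed the files in.
// (Hypothetical helper, not part of cmd/go.)
func combineDigests(files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)

	h := sha256.New()
	for _, name := range names {
		fmt.Fprintf(h, "file %q %x\n", name, sha256.Sum256(files[name]))
	}
	return hex.EncodeToString(h.Sum(nil))
}

// hashFiles reads each path from disk and returns the combined key.
func hashFiles(paths []string) (string, error) {
	files := make(map[string][]byte, len(paths))
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			return "", err
		}
		files[p] = data
	}
	return combineDigests(files), nil
}

func main() {
	// Usage: pass the files listed in the compiler's dependency output.
	key, err := hashFiles(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(key)
}
```

Because the key depends only on file contents, it can be compared against a stored key without invoking the C compiler, which is the advantage over the -E approach described above.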

@ghost

ghost commented Apr 9, 2018

This also affects me in a more general sense: with 1.10 there is no longer a way to model the build dependency on statically linked libraries. If the library changes, the cached files are still reused and I silently end up using an older version of the library unless I run go clean -cache -i <package> before the build. With Go versions before 1.10, I had cmake touch my cgo wrappers to have them rebuilt.

@rsc
Contributor

rsc commented Apr 18, 2018

I think this is basically working as expected.
If you change the underlying C code, or you change the compiler,
then you have to rebuild with -a. I'll leave it open in case there is
a simple fix but I don't think there is.

@rsc changed the title from "cmd/go: false build cache hits if cgo dependencies changed" to "cmd/go: build cache does not check #included headers for changes" Apr 18, 2018
@navytux
Contributor

navytux commented Apr 18, 2018

There is a way to make it work with the help of gcc -MD & friends:

(hello.h)

#define WORLDNUM        3

(hello.c)

#include <stdio.h>
#include "hello.h"

int main() {
        printf("Hello world (%d)!\n", WORLDNUM);
}
$ gcc -c -MMD -MT cdeps hello.c
$ cat hello.d 
cdeps: hello.c hello.h
$ gcc -c -MD -MT cdeps hello.c 
$ cat hello.d 
cdeps: hello.c /usr/include/stdc-predef.h /usr/include/stdio.h \
 /usr/include/x86_64-linux-gnu/bits/libc-header-start.h \
 /usr/include/features.h /usr/include/x86_64-linux-gnu/sys/cdefs.h \
 /usr/include/x86_64-linux-gnu/bits/wordsize.h \
 /usr/include/x86_64-linux-gnu/bits/long-double.h \
 /usr/include/x86_64-linux-gnu/gnu/stubs.h \
 /usr/include/x86_64-linux-gnu/gnu/stubs-64.h \
 /usr/lib/gcc/x86_64-linux-gnu/7/include/stddef.h \
 /usr/include/x86_64-linux-gnu/bits/types.h \
 /usr/include/x86_64-linux-gnu/bits/typesizes.h \
 /usr/include/x86_64-linux-gnu/bits/types/__FILE.h \
 /usr/include/x86_64-linux-gnu/bits/types/FILE.h \
 /usr/include/x86_64-linux-gnu/bits/libio.h \
 /usr/include/x86_64-linux-gnu/bits/_G_config.h \
 /usr/include/x86_64-linux-gnu/bits/types/__mbstate_t.h \
 /usr/lib/gcc/x86_64-linux-gnu/7/include/stdarg.h \
 /usr/include/x86_64-linux-gnu/bits/stdio_lim.h \
 /usr/include/x86_64-linux-gnu/bits/sys_errlist.h hello.h

This way if there is something like

// #include "mycode.cinc"
import "C"

Cgo could see the dependency on mycode.cinc and other files mycode.cinc includes.
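As an illustration of how the hello.d output above could be consumed, the following sketch extracts the prerequisite paths from Makefile-style dependency text. The function name parseDepfile is hypothetical, and the parser is deliberately simplified: it assumes the target contains no colon and that paths contain no spaces or escaped characters.

```go
package main

import (
	"fmt"
	"strings"
)

// parseDepfile returns the prerequisites from dependency output such as
// "cdeps: hello.c hello.h". It joins backslash-continued lines and drops
// the target before the first colon.
// Simplification (assumption of this sketch): paths with spaces, escaped
// characters, or colons are not handled.
func parseDepfile(content string) []string {
	// A backslash directly followed by a newline continues the rule.
	joined := strings.ReplaceAll(content, "\\\n", " ")

	var deps []string
	for _, line := range strings.Split(joined, "\n") {
		_, rest, found := strings.Cut(line, ":")
		if !found {
			continue // blank line, or no rule on this line
		}
		deps = append(deps, strings.Fields(rest)...)
	}
	return deps
}

func main() {
	sample := "cdeps: hello.c /usr/include/stdio.h \\\n /usr/include/features.h hello.h\n"
	for _, dep := range parseDepfile(sample) {
		fmt.Println(dep)
	}
}
```

The resulting list is what a build tool would feed into a content hash; with -MMD instead of -MD the system headers are left out, which keeps the list short at the cost of missing libc header changes.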

@FlorianUekermann
Contributor Author

Sorry guys, I misclicked. Didn't mean to close. What @ianlancetaylor and @navytux suggest seems like a good fix to me.

@karalabe
Contributor

karalabe commented Jul 3, 2018

> I think this is basically working as expected.
> If you change the underlying C code, or you change the compiler,
> then you have to rebuild with -a.

While true, there's a hidden security subtlety here. Let's suppose there's a crypto library called go-crypto, which internally wraps the c-crypto project (random names). The devs of c-crypto find a fatal flaw, fix it, and notify go-crypto, who update their vendored C code and issue a new release too.

I, as a user of the go-crypto library, see this and do a go get -u to fetch the new code, sleeping easy that I'm all protected. Except Go didn't bother to actually recompile anything because only the C code changed, so my binary is still vulnerable, even though I built it with the new code.

This same issue can happen arbitrarily high up a dependency chain, where anyone forgetting to rebuild with -a could potentially be vulnerable.


Btw, I'm not saying I know how to fix this or whether it's even fixable. I just wanted to add a bit of weight behind this issue.

@FlorianUekermann
Contributor Author

FlorianUekermann commented Jul 3, 2018

Just a quick note because nobody has mentioned solutions to the more general issue @rsc pointed out:

> If you change the underlying C code, or you change the compiler, then you have to rebuild with -a.

This is going to cause confusing issues in practice. I doubt that everyone is aware of all packages that use C in some sub-dependency. Similarly a lot of people won't always know whether the compiler got updated recently.

As @karalabe points out this is a potential security risk. But it is also a general usability problem, as it may very well break builds or even the resulting binaries.

These problems seem pretty similar to the issues ccache and zapcc face. I don't know where this is documented for zapcc, but ccache has a few pointers here: https://ccache.samba.org/manual/latest.html#_common_hashed_information

In general I don't see much harm in hashing a little more of the environment ($CC -MD, relevant environment variables, $CC -v or the binary itself). I'm starting to doubt that this will ever be perfect, but a couple of safeguards could save a lot of people a lot of time and confusion.
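A rough sketch of what "hashing a little more of the environment" could look like follows. The list of variables, the use of the compiler's --version banner, and the name fingerprint are all assumptions made for this example; it does not describe what cmd/go actually hashes.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"os/exec"
)

// envVars lists the variables mixed into the fingerprint. The selection is
// an assumption made for this sketch.
var envVars = []string{"CC", "CGO_CFLAGS", "CGO_CPPFLAGS", "CGO_LDFLAGS"}

// fingerprint hashes the compiler's version banner together with the given
// environment values, in the fixed order of envVars.
func fingerprint(ccVersion string, env map[string]string) string {
	h := sha256.New()
	fmt.Fprintf(h, "cc-version %q\n", ccVersion)
	for _, name := range envVars {
		fmt.Fprintf(h, "env %s=%q\n", name, env[name])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	cc := os.Getenv("CC")
	if cc == "" {
		cc = "gcc" // assumed default for this sketch
	}
	// Both gcc and clang accept --version.
	out, err := exec.Command(cc, "--version").Output()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot query compiler version:", err)
		os.Exit(1)
	}
	env := make(map[string]string, len(envVars))
	for _, name := range envVars {
		env[name] = os.Getenv(name)
	}
	fmt.Println(fingerprint(string(out), env))
}
```

A compiler upgrade or a changed flag yields a different fingerprint, which could then be mixed into a cache key alongside the header contents.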

@vcaesar

vcaesar commented Aug 14, 2018

I think so too, a couple of safeguards could save a lot of people a lot of time and confusion.
> If you change the underlying C code, or you change the compiler, then you have to rebuild with -a. This is going to cause confusing issues in practice.

Many people update the package code and never realize that the go1.10 build cache is the reason the bug is still not fixed.

@seebs
Contributor

seebs commented Sep 30, 2018

As a naive user, I got bitten by this, but at least it was really obvious: there was a bug in the glfw package's C code (caught by a newer compiler, which issued a warning). So I changed the code, ran go build... same error. It was not at all obvious why it was giving me an error that couldn't possibly refer to any existing file on the disk. Apparently "-a" would have helped, but that's extremely non-obvious, and there's no reason that I should have to rebuild other unrelated packages to hint "actually, this package has a thing that has changed".

My actual quick workaround: adding a blank line to the .go file that includes the affected .c file.

@dhobsd
Contributor

dhobsd commented Aug 8, 2019

We ran into this yesterday in our CI environment. I'm happy to look into fixing this for 1.14; the -MD solution seems like a good first step, but as @FlorianUekermann points out above, we may want to consider mixing in a bit more information.

I wonder if it is too late to consider adding info to cgo documentation for 1.13?

@andybons
Member

andybons commented Aug 8, 2019

@dhobsd for 1.13, documentation is fine. Code changes not so much :)

@dhobsd
Contributor

dhobsd commented Aug 8, 2019

Ah, actually I see that there is already material included about GOCACHE and cgo interaction in the go tool docs; I'll hold off until the 1.14 cycle is open to poke at fixing this.

@cpuguy83

So it seems like GOCACHE is simply not safe when linking to C at all.
Although, even the stdlib is linking C.
What happens if you update libc?

@rfjakob

rfjakob commented Feb 8, 2020

> Ah, actually I see that there is already material included about GOCACHE and cgo interaction in the go tool docs

As I have trouble finding what you meant there, I'm gonna copy-paste it from https://golang.org/cmd/go/#hdr-Build_and_test_caching for future readers of this ticket:

> However, the build cache does not detect changes to C libraries imported with cgo. If you have made changes to the C libraries on your system, you will need to clean the cache explicitly or else use the -a build flag (see 'go help build') to force rebuilding of packages that depend on the updated C libraries.

In other words, if you use Cgo, you MUST use go build -a and go test -a, otherwise you'll never know what ended up in your binary, or what C code you were actually testing.

rfjakob added a commit to rfjakob/earlyoom that referenced this issue Feb 8, 2020
Go does not notice when the C code changes, so we have to
use `go test -a`.

Workaround for golang/go#24355 .
@rfjakob

rfjakob commented Feb 8, 2020

Actually, go test -a does not seem to be enough to get the cache up to date. Subsequent go test runs without -a still use some older cached version of the C code.

The !!! enter message comes from C. No code was changed between the two test runs. First result is up to date, the second result is some older version of the C code:

$ go test -run Test_get_process_stats -a
!!! enter333
[...]

$ go test -run Test_get_process_stats
!!! enterYYY
[...]

This seems to fix it, so probably better to use this instead of (in addition to?) the -a flag:

go clean -cache -testcache .

rfjakob added a commit to rfjakob/earlyoom that referenced this issue Feb 9, 2020
The behavior after `go test -a` is somewhat surprising,
so add `go clean`, which seems to actually bring everything
up to date.

golang/go#24355 (comment)
rfjakob added a commit to rfjakob/earlyoom that referenced this issue Feb 9, 2020
As described in https://golang.org/cmd/cgo/ ,
Cgo only notices changes to .c and .h files in the
same folder.

Move the testsuite to the top-level folder to get rid of
the manual cache cleaning which made running the tests
so much slower.

golang/go#24355
@cpuguy83

It may be nice to have a mode that invalidates all cgo but not proper Go.

Kubuxu pushed a commit to filecoin-project/filecoin-ffi that referenced this issue Jun 14, 2021
Possibly works around golang/go#24355

Signed-off-by: Jakub Sztandera <[email protected]>
@mxmauro

mxmauro commented Mar 21, 2024

After 6 years this issue is still open :(

@fearpro13

fearpro13 commented May 1, 2024

Got it fixed with the -a go build flag, but it runs extremely slowly :(
It would be helpful to have an option to invalidate the cgo cache when any of the related files has changed.

@nickh-stripe

Is there any way of triggering a cgo rebuild (or relink in the case of #29843) without passing -a? Is there an environment variable or something that can be set to invalidate the cache? Changing compile flags via CGO_CFLAGS or CGO_LDFLAGS, maybe? Using -a as a workaround makes the dev loop very slow, especially when running tests.

@ianlancetaylor
Member

You can clean the build cache using go clean -cache.

@ianlancetaylor
Member

But I guess that might also lead to a slow rebuild...

@nickh-stripe

nickh-stripe commented May 23, 2024

Yeah, avoiding the rework and doing the minimum amount of build work is what I'm after. If I moved the CGO-related code into a separate package with little else and just invalidated that package in the cache, would that avoid -a and rebuilding all the Go code but still ensure that the CGO compile and relink happen? I'll have to test it, I guess.

similar to this comment: #24355 (comment)

@cavokz

cavokz commented May 24, 2024

I think the point is whether or not your external C dependencies change between CGO builds. If they don't (e.g. system libraries that do not get updated in the meantime) and you don't modify the way you use them (e.g. different C macros), it's safe to use the cached results.

Remember that you can point Go to different cache locations by setting the environment variable GOCACHE. In the Pygolo Project, where we want to jump between different Python environments and avoid cleaning the cache all the time, we simply set GOCACHE depending on the current interpreter.
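A small sketch of this per-environment GOCACHE idea is shown below. The helper cacheDirFor, the directory naming scheme, and the CGO_ENV_ID variable are hypothetical and chosen only for illustration; the Pygolo Project may do this differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// cacheDirFor derives a cache directory under base from an identifier for
// the C-side environment (for example an interpreter path or version).
// Hashing the identifier keeps the directory name short and filesystem-safe.
func cacheDirFor(base, envID string) string {
	sum := sha256.Sum256([]byte(envID))
	return filepath.Join(base, "gocache-"+hex.EncodeToString(sum[:8]))
}

func main() {
	base, err := os.UserCacheDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// CGO_ENV_ID is a hypothetical variable naming the current environment.
	envID := os.Getenv("CGO_ENV_ID")

	cmd := exec.Command("go", "build", "./...")
	// When a key appears more than once in Env, the last value is used.
	cmd.Env = append(os.Environ(), "GOCACHE="+cacheDirFor(base, envID))
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Each distinct identifier gets its own cache, so switching environments never reuses objects built against another one, at the cost of rebuilding pure Go packages once per cache.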

I think it's good hygiene to have different GOCACHE settings for different CGO projects and for pure Go, though the problem remains of how to know when there is a CGO module deep in the dependency chain that deserves a -a. This can only be underestimated.

There could be room for a tool that examines the C dependencies of a CGO module and builds a sort of fingerprint to detect any change. It remains to be proven that it would be faster than rebuilding with -a all the time and, more importantly, that you could fully rely on it.

@nickh-stripe

nickh-stripe commented May 24, 2024

Yeah, that usage of GOCACHE works in that example, but in the scenario where no Go code is changing, just the linked C library, changing GOCACHE actually makes it worse, as it forces all Go code to miss the cache.

go clean <package> didn't work effectively either. The hack that does work (but is quite hacky) is to (ab)use go:embed as a proxy for customising which non-.go files are part of the Go build inputs. So, tl;dr:

  1. change the build for the C shared library being linked via CGO to produce a file containing the hash of the shared library, and generate the hash file into the package path so go:embed can find it
  2. add a .gitignore entry for the generated file so it's ignored
  3. add a go:embed entry with a wildcard within the package that uses CGO
  4. success

Doing it this way means the shared library is indirectly part of the Go build cache inputs, and changes to the library (well, changes to the hash file) will trigger a Go rebuild without any cleaning or popping of caches. A less hacky way would be nice, but I can live with this.
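The first step of the list above could be sketched as a small generator like the one below. The program, the names digest and writeHashFile, and the .libhash extension are assumptions for illustration, not the commenter's actual setup; the embed side is shown only as a comment because it needs the generated file to exist at build time.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
)

// digest returns the hex-encoded SHA-256 of the library bytes.
func digest(lib []byte) string {
	sum := sha256.Sum256(lib)
	return hex.EncodeToString(sum[:])
}

// writeHashFile stores the digest of the library at libPath in outPath,
// which should live inside the cgo package directory.
func writeHashFile(libPath, outPath string) error {
	lib, err := os.ReadFile(libPath)
	if err != nil {
		return err
	}
	return os.WriteFile(outPath, []byte(digest(lib)+"\n"), 0o644)
}

// In the package that uses cgo, the generated file would then be embedded
// (hypothetical file extension and variable name):
//
//	import "embed"
//
//	//go:embed *.libhash
//	var libHash embed.FS

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: libhash <library> <output file>")
		os.Exit(2)
	}
	if err := writeHashFile(os.Args[1], os.Args[2]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Run as part of the C library's build, this rewrites the hash file whenever the library changes, and the embedded file is what makes the Go package's inputs differ.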

@iameli-streams

It's surprising to me that there's no way to remove the build cache for a specific package. I know my CGO dependencies are in github.com/go-gst/go-gst/gst. So when I change up the GStreamer C library, I need to invalidate that specific package. Can't be done.

go clean -r github.com/go-gst/go-gst/gst

Doesn't work.

go clean -i github.com/go-gst/go-gst/gst

Doesn't work.

go clean -cache github.com/go-gst/go-gst/gst
go: clean -cache cannot be used with package arguments

But... why not? Surely cleaning the build cache for a specific package is a thing that would be useful? I know this because I'm trying to do it right now.
