Implement an equivalent to CCACHE_BASEDIR
#35
Comments
Can't we always strip the cwd by default and have this for override? Maybe yet another env |
I am looking at this seriously, and after doing some cleanup of my newbie Rust, it should be ready for a pull request tonight. |
Do out of source object directories present a problem given the way the Firefox source is currently built. Say I have
When compiling To avoid that it would seem like |
In #104 (comment), @glandium said,
That seems incompatible with @jwatt's approach in #35 (comment). @glandium, can you explain what you were thinking, and comment on @jwatt's approach? |
Regarding @ncalexan's comment above: I'm involved in a project that builds both locally for developers and on automated builders, and we would like to share the cache among all users. On Windows we currently pass full include paths via the compiler command line. Automated builders download Visual Studio and the Windows SDK as an archive and extract it to a specific directory. Individual developers often have Visual Studio and the Windows SDK installed in the default system locations. Our source code is then downloaded to a completely separate directory. For example, a value like this on the automated builders:
Would be equivalent to this on a developer machine:
Being able to specify multiple basedir values would allow us to share the cache without requiring everyone to install files in the same locations. |
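The multiple-basedir idea can be sketched as follows. This is only an illustration in Python, not sccache code: the function name `normalize_args`, the tokens `<SDK>` and `<SRC>`, and every path shown are hypothetical, and only the `-I` flag prefix is handled.

```python
def normalize_args(args, basedirs):
    """Replace a leading base directory in each argument with a stable token.

    `basedirs` maps a machine-specific absolute directory to a token that is
    meant to be identical on every machine (hypothetical convention).
    """
    # Try longer prefixes first so nested directories match their most
    # specific entry.
    ordered = sorted(basedirs.items(), key=lambda kv: len(kv[0]), reverse=True)
    out = []
    for arg in args:
        # Split off an include flag so "-I/opt/sdk/include" is handled too.
        flag, path = ("-I", arg[2:]) if arg.startswith("-I") else ("", arg)
        for base, token in ordered:
            base = base.rstrip("/")
            # Match whole path components only, never a partial directory name.
            if path == base or path.startswith(base + "/"):
                path = token + path[len(base):]
                break
        out.append(flag + path)
    return out


# Hypothetical layouts: an automated builder and a developer machine.
builder_args = normalize_args(
    ["-I/ci/toolchain/sdk/include", "/ci/work/src/a.c"],
    {"/ci/toolchain/sdk": "<SDK>", "/ci/work": "<SRC>"},
)
dev_args = normalize_args(
    ["-I/opt/sdk/include", "/home/dev/proj/src/a.c"],
    {"/opt/sdk": "<SDK>", "/home/dev/proj": "<SRC>"},
)
# Both lists are now ['-I<SDK>/include', '<SRC>/src/a.c'].
```

With the arguments normalized this way, a hash over them would no longer depend on where the toolchain or the sources were installed.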
On Linux you could solve that quite easily by keeping your build tools, including sccache, in a Docker image and mounting the sources into the container at the same location on all your machines. |
To add an additional use case to this discussion. I am currently experimenting with building our internal libraries and tools with conan and just realized that effectively using sccache in this environment would need support for something like multiple base directories or something like path ignore patterns. In short, when building a package More than one base directory/ignore pattern is needed in the common case where the package additionally depends on other conan packages. For example, when |
Question: Was it ever considered to use sha256 sums of the files instead of their full paths? I know this is a different approach than what's taken in the ccache project, but I think it's much more reliable. The idea is that, in the same way that every file path is mapped to an object in the cache, its sha256sum could be used instead. |
The primary problem is that we hash commandline arguments to the compiler, and often full source paths wind up there. We did take a change in #208 to fix one issue around this where full paths wound up in the C compiler's preprocessor output. I suspect we could fix this fairly easily for most cases nowadays--it'd mostly just mean filtering commandline arguments that have full paths. |
So does it mean that when sccache is wrapping GCC, the path arguments it is given are never absolute, and when it wraps rustc it may or may not be given absolute paths? My suggestion is that, whether the paths are absolute or not, we can read the input files (assuming we can know the caller's working directory in case the paths are relative) and compute a sha256 sum of all of them. Then we can map this sha256 hash, instead of the command-line arguments, to the cached output. This way, whenever there's a match in the hash, the cache could be used. I'm no Rust developer and I'm not familiar at all with the internals of this project, so maybe my idea will be hard to implement, but I hope my explanation is good enough. BTW, implementing this idea will mean that existing caches of users will no longer be valid, so in a certain sense it'll be backwards incompatible, but not in usage. |
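A minimal sketch of that suggestion, in Python for brevity. It is not how sccache computes its keys: the function name `content_key` is hypothetical, and the sketch ignores headers pulled in by the preprocessor, the compiler's identity, and environment variables, all of which a real cache key would have to cover.

```python
import hashlib


def content_key(input_paths, common_args):
    """Derive a cache key from file contents and path-free arguments.

    The file paths themselves are never fed into the hash, so the same
    sources stored in different directories produce the same key.
    """
    h = hashlib.sha256()
    for path in input_paths:
        with open(path, "rb") as f:
            data = f.read()
        # Length-prefix each blob so adjacent inputs cannot run together.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    for arg in common_args:
        encoded = arg.encode("utf-8")
        h.update(len(encoded).to_bytes(8, "little"))
        h.update(encoded)
    return h.hexdigest()
```

Two checkouts with identical file contents and identical flags such as `-O2` would then map to the same key, while a change to either the contents or the flags would produce a different one.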
@doronbehar It's a bit more complicated than that and I'd suggest you read the existing implementation first. We are already using the input files as part of the hash. For C/C++ compilation the Lines 482 to 511 in 5855673
For Rust compilation the Lines 926 to 1129 in 5855673
|
@luser thanks for pointing to the From a limited test run that I've performed, the contents of the Instead, I can see that on the function call: Lines 270 to 277 in 5855673
only "common args" are passed. When debugging my case, I can see that these are flags such as warning flags, optimization level, -std C++ standard and so on. None of these contain paths in them, although obviously that could be the case of the project I'm compiling.
Digging a bit deeper, I can see that the flags with paths in them wind up as preprocessor flags. And the input file itself is also handled differently. Am I correct to assume the following:
|
Hi, I am wondering, what is the status of |
One way to avoid the problem altogether is to pass flags to your compiler that normalize file paths. e.g. |
But aren't the command lines hashed in sccache? The paths would still be different. You mean like this? |
Is there any update to this? |
Another user wanting to use sccache with Conan reporting in, thanks @niosHD for doing some groundwork! I guess a working (easy to activate?) Conan integration for sccache could be a very useful feature for many users. As Conan uses often-changing absolute build paths, but always in the same manner, it should be possible to implement it in a standard way. Conan is also gaining traction, so a feature like that could be a win-win for both tools. |
Presumably you'd also have to add special handling in sccache for those arguments so that, e.g., for
Mozilla currently has very little interest in a feature like this, so Mozilla is unlikely to do development on this particular issue. |
I gave --remap-path-prefix a try and it causes everything to be a cache miss likely because of this code: Line 984 in 385f738
I think one possible way to solve this issue would be to implement path prefix support for sccache:
|
FWIW, the distributed compilation code already uses Line 1628 in 385f738
|
We use |
Anyone could share how to use |
Sccache would be really helpful for local builds of Rust projects, especially when you have a great many with common dependencies (🙋♂️) and especially in a team environment (🙋♂️). Unfortunately the lack of this feature makes that quite painful in practice. So I'd love to better understand its status. In particular, is it stalled because:
I'm just having trouble figuring out which one (or N) it is. Thanks! 🙇♂️ |
While sccache is awesome, there are a few caveats with Rust and it is not all 🦄 unfortunately. In a nutshell and unless you take good care of aligning things, sccache will likely not work as you'd expect if:
That's a lot of caveats, and in the meantime the Rust compiler has improved a lot. Nowadays, unless you have a very specific env that ticks all the boxes above, you will likely be better off not using sccache. That may explain the low activity. In the end, if sccache works for you, great; if not, you can also check cargo remote, which can also help for teams. |
In my case all these things are in fact aligned except for user accounts and the other reasons for paths to differ. Is there a problem with having different user accounts other than it making paths different? As for differing paths, I had thought that is what this whole issue is about. Perhaps I misunderstood? I'm thinking of making my own rustc wrapper that takes care of setting up an environment for sccache (files in a consistent location, maybe even doing a chroot or something if necessary) that ensures a cache hit whenever it should be possible. It might end up being the simplest path forward, as it sidesteps the questions about what is the correct way to handle all the messy details of path remapping. |
Yes, this is why I mentioned it :) On macOS it is less trivial, since macOS does NOT let users create hard links, especially not at the root, unless you do some trickery that can be dangerous (i.e. temporarily brick your disk, ask me how I know...).
That sounds interesting and I would love to hear about the journey. I definitely do NOT want to discourage you, just set expectations so you know that a few things that sound trivial... will not be. I am not saying you cannot get it to work, if indeed your team is not too "wildly spread" regarding how the envs are set.
One of the main issues you will face is that Alice and Bob will not use the same Rust version all the time. They may even both be on nightly, but Alice updates in the morning and Bob in the evening. That is still an easy and good case, however: Alice will build up the cache and Bob will benefit. Issues arise if, for whatever reason, those users decide to use different versions, and it is hard to force a team (presumably working on several projects...) to use a unified and synchronized version of Rust. It is not impossible and some tools can help, but it is often not the case by default. |
Well, I guess if I have any success you're bound to hear about it. 😀 (But yes, I would report back here as well.) I have some experience building vaguely related sorts of tools, so I expect complications but am also confident I could find ways around them. It's more a question of getting around to it... (very limited time right now)
In my case this is simple: pretty much everything is in a monorepo with a single pinned version that gets bumped by a bot (via a pull request) when a new stable Rust release comes out. We used to have some exceptions, but I'm pretty sure we deleted the last one recently. In this regard at least, you could say that we are playing on easy mode due to the rigid homogeneity of our dev environments. |
CCACHE_BASEDIR allows ccache to get cache hits for the same source files stored at different paths, which is useful for developers building from different source trees: https://ccache.samba.org/manual.html#_configuration_settings
We want this so that we can get cross-branch cache hits on our buildbot builds, which have the branch name in their build directory, and we also want this so we can get cache hits from the S3 cache for local developers' builds.
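As a rough illustration of the behaviour being requested: ccache's base directory setting rewrites absolute paths that start with the base directory into paths relative to the current working directory before hashing. The Python sketch below mimics that idea under simplifying assumptions. It is not ccache or sccache code, the function name `rewrite_under_basedir` and all paths are hypothetical, and it assumes POSIX-style paths and an absolute `basedir`.

```python
import posixpath


def rewrite_under_basedir(arg, basedir, cwd):
    """Rewrite an absolute path under `basedir` as a path relative to `cwd`.

    Anything that is not an absolute path under `basedir` is returned
    unchanged. `basedir` and `cwd` are assumed to be absolute.
    """
    if not posixpath.isabs(arg):
        return arg
    norm = posixpath.normpath(arg)
    base = posixpath.normpath(basedir)
    # commonpath compares whole components, so "/builds/branch-ab" is not
    # treated as being inside "/builds/branch-a".
    if posixpath.commonpath([norm, base]) == base:
        return posixpath.relpath(norm, cwd)
    return arg


# Two buildbot checkouts that differ only in the branch name (hypothetical).
a = rewrite_under_basedir(
    "/builds/branch-a/src/main.c", "/builds/branch-a", "/builds/branch-a/obj"
)
b = rewrite_under_basedir(
    "/builds/branch-b/src/main.c", "/builds/branch-b", "/builds/branch-b/obj"
)
# Both a and b are "../src/main.c", so a hash over them would match.
```

Paths outside the base directory, such as system headers, are left alone, which keeps genuinely different inputs from colliding.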