-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CI's cache key results in a single snapshot being created in first build and never updated #102
Comments
Hey, yep, CI caches are not very easy to use effectively for the reasons you state. Cascading hashes are the way to squeeze something out of it, but it still doesn't solve the problem of the underlying directories (the cache values) growing without bound, until the save and restore become intolerably slow. And manually computing cache keys like that is a mess. Really the best way to work with Pants is (as your # FIXME implies) a Pants-aware remote cache, which caches at a very fine-grained resolution (every process that runs is cached), and on the same hashes Pants already generates for every process for local caching. Shameless plug: We at Toolchain (https://toolchain.com/), the lead sponsors of Pants, offers such a cache as a SaaS product. Happy to set you up with a trial! |
Feel free to reach out to me on Pants Slack if you want to chat about this option. |
Should this be closed as answered? |
This tripped me up and had our system doing a bunch of pointless cache downloading and surprisingly slow builds that gave a poor early impression of pants (and took me a while to diagnose). So, from my perspective, it'd be good to do at least one of:
(I'd love 1, especially the 'easy' part 😄 I'm in discussion with @benjyw on Slack, thanks for the tip.) |
Thanks for the update. It's definitely a good idea to have the demo repos demonstrate remote caching, since that is a strength of Pants, and we should not demonstrate obviously bad CI cache practices... |
Let's leave this issue open until we resolve that. |
Assigning this to @chrisjrn who is working on it. |
See #104 for improved comments and example. |
Re-reading your initial comment @huonw , I missed the part whert the cache is never updated at all if the key doesn't change. But reading the GHA cache docs confirms this, and it makes sense that the cache should be immutable. |
Addressed in #104 and pantsbuild/pants#16503 Thanks for opening this issue! |
Looks great, thanks for the explanation, that definitely clarifies things for me! |
We've been setting up our pants system and cribbing off this repo, so thanks for publishing it. However, we noticed that our CI was quickly getting much slower and discovered it was due to the use of a very simple cache key:
${{ runner.os }}-
.It seems like if there's an exact match for a key, GitHub Actions will restore the cache, and never update it even if the cached files changed during the run. The "Post Run" step for
actions/cache
saysCache hit occurred on the primary key Linux-, not saving cache.
. This means that the first build with the cache will seed it, and all future builds will reuse that old one, and that cache will get less useful slowly (or not so slowly, in our case 😄 ) and builds longer.We're experimenting with an alternative that incorporates progressively more detailed hashes:
(This will result in... a lot of cache usage for a reasonably large project, but will be better than caches that are barely useful.)
The text was updated successfully, but these errors were encountered: