Problems with caching on self-hosted runners #52
Comments
For example, here are some cache files that were created on one of my runners:
runner@runner-pod-h79x4:~$ find | grep workspace
./scala-steward/workspace
./scala-steward/workspace/repos
./scala-steward/workspace/repos/my-org
./scala-steward/workspace/store
./scala-steward/workspace/store/refresh_error
./scala-steward/workspace/store/refresh_error/v1
./scala-steward/workspace/store/refresh_error/v1/github
./scala-steward/workspace/store/refresh_error/v1/github/my-org
./scala-steward/workspace/store/refresh_error/v1/github/my-org/my-repo
./scala-steward/workspace/store/refresh_error/v1/github/my-org/my-repo/refresh_error.json
After deleting those, my Scala Steward action no longer fails with issues like this:
|
A possible workaround might be to add a GHA step that removes all/some of those files...? |
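A minimal sketch of what such a cleanup step could run, assuming the workspace lives under ~/scala-steward/workspace as in the listing above. The function name `clear_refresh_errors` is hypothetical, and the demo works on a throwaway directory rather than a real runner home. It removes only the `refresh_error` part of the store and leaves the rest of the workspace alone.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Remove only the persisted refresh errors from a Scala Steward workspace,
# leaving the rest of the store and any cloned repos in place.
# Assumption: the default location mirrors the `find` listing in this thread.
clear_refresh_errors() {
  local workspace="${1:-$HOME/scala-steward/workspace}"
  local target="$workspace/store/refresh_error"
  if [ -d "$target" ]; then
    rm -rf -- "$target"
    echo "removed $target"
  else
    echo "nothing to remove under $workspace"
  fi
}

# Demo against a temporary directory so the sketch is safe to run anywhere.
demo_ws="$(mktemp -d)"
mkdir -p "$demo_ws/store/refresh_error/v1/github/my-org/my-repo"
touch "$demo_ws/store/refresh_error/v1/github/my-org/my-repo/refresh_error.json"
mkdir -p "$demo_ws/repos/my-org"
clear_refresh_errors "$demo_ws"
```

In a workflow this would be the body of a `run:` step placed before the Scala Steward step, called without arguments so that the default path is used.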
Caching is an interesting topic that I'd like to discuss a bit more. I think there are some cases where you'd like to use results from previous action runs, for example, a big project with lots of dependencies and/or lots of open PRs from Scala Steward, or an action that handles updates for multiple projects. Running from scratch every time will eat into your Actions minutes allotment. Also, Scala Steward has some features that use state persisted in the workspace, like scanning frequency, etc. I was actually thinking about adding a caching step to my own projects just to deal with all of this. So, if we're talking about adding a behaviour in the action to wipe out the workspace before calling SS, I would suggest to
Thoughts? |
The current/built-in caching of Scala Steward is only per runner, and that cache might not be as helpful here as it is when running Scala Steward standalone (against multiple repos). This is in contrast with this action, where the default is to run against the same repo only. I’ll see if I can troubleshoot which cached issues I’ve hit (the ones preventing a rerun from succeeding). Essentially, whether any caching is performed would be less of an issue if I could rerun and get it to retry creating PRs etc. (when it fails on an error that is resolved by just retrying). |
Leaving a note to remember:
Another workaround could be to just remove that file (until we've found a suitable way forward regarding caching per runner/repo/etc). |
Do we have any suggestions on how one might do caching using the actions/cache action? (to cache Scala Steward workspace files between different runners/runs for the same repo) Had you given this a try @bpg? |
To share, here's what I'm trying at the moment:
diff --git a/.github/workflows/scala-steward.yml b/.github/workflows/scala-steward.yml
index 9c79225..d8cbe8f 100644
--- a/.github/workflows/scala-steward.yml
+++ b/.github/workflows/scala-steward.yml
@@ -1,25 +1,29 @@
 name: Scala Steward
 on:
   # This workflow will launch at 00:00 every Sunday
   schedule:
     - cron: '0 0 * * 0'
   repository_dispatch:
     types: [scala-steward]
 jobs:
   scala-steward:
     runs-on: self-hosted
     name: Launch Scala Steward
     steps:
+      - run: rm -rf ~/scala-steward
+      - uses: actions/cache@v2
+        with:
+          path: ~/scala-steward
+          key: ${{ runner.os }}
       - name: Setup Java and Scala
         uses: olafurpg/setup-scala@v5
       - name: Launch Scala Steward
         uses: scala-steward-org/scala-steward-action@v2
         with:
           github-token: ${{ secrets.ORG_LEVEL_GITHUB_TOKEN }}
           author-email: [email protected]
           author-name: org-level-robot-user
-      - run: rm -rf ~/scala-steward
I get these results when running the above:
Before scala-steward-action:
After:
🎉 And in the next job:
😭 |
Tried with caching based on
diff --git a/.github/workflows/scala-steward.yml b/.github/workflows/scala-steward.yml
index d8cbe8f..43c2249 100644
--- a/.github/workflows/scala-steward.yml
+++ b/.github/workflows/scala-steward.yml
@@ -18,7 +18,7 @@ jobs:
       - uses: actions/cache@v2
         with:
           path: ~/scala-steward
-          key: ${{ runner.os }}
+          key: ${{ runner.os }}-sbt-${{ hashFiles('**/*.sbt') }}
       - name: Setup Java and Scala
         uses: olafurpg/setup-scala@v5
       - name: Launch Scala Steward
Unfortunately, that just resolved into
I added that now, and get:
However, one problem here is that even though a dependency might have changed, the cache isn't updated. Changes to my |
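To illustrate why a key built from the build files stays the same when only an upstream dependency has changed, here is a rough sketch. It approximates the idea behind `hashFiles('**/*.sbt')` with `sha256sum` and does not reproduce the digest GitHub actually computes. The function name `sbt_cache_key`, the sample project and the dependency versions are made up for the demo. Since actions/cache does not overwrite an entry that already exists for a key, an unchanged key means the stored workspace is not refreshed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Derive a cache key from the paths and contents of all *.sbt files under a
# directory. Assumption: this only mimics hashFiles('**/*.sbt') in spirit.
sbt_cache_key() {
  local root="$1"
  local os="${2:-Linux}"   # stand-in for ${{ runner.os }}
  local digest
  digest="$(
    cd "$root"
    find . -type f -name '*.sbt' \
      | LC_ALL=C sort \
      | while IFS= read -r f; do sha256sum "$f"; done \
      | sha256sum \
      | cut -d' ' -f1
  )"
  echo "${os}-sbt-${digest}"
}

# Throwaway project with illustrative build files.
project="$(mktemp -d)"
mkdir -p "$project/project"
echo 'libraryDependencies += "org.typelevel" %% "cats-core" % "2.1.1"' > "$project/build.sbt"
echo 'addSbtPlugin("org.scalameta" % "sbt-scalafmt" % "2.4.0")' > "$project/project/plugins.sbt"
key_before="$(sbt_cache_key "$project")"

# A new upstream release changes nothing inside the repo: same key.
echo "unrelated change" > "$project/README.md"
key_unrelated="$(sbt_cache_key "$project")"

# Only an edit to a build file moves the key.
echo 'libraryDependencies += "org.typelevel" %% "cats-core" % "2.2.0"' > "$project/build.sbt"
key_edited="$(sbt_cache_key "$project")"

echo "before:    $key_before"
echo "unrelated: $key_unrelated"
echo "edited:    $key_edited"
```

The first two keys are identical and only the third differs, which matches the behaviour described above: the key tracks the repo's build files, not the state of the dependencies they point at.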
Also, note how the cache is expanded into my workspace for GitHub Actions:
|
Maybe we should revisit the changes made in #42 (where we moved the workspace into ~/scala-workspace)... See: scala-steward-action/src/files.ts Line 26 in 3199096
...and maybe move the Scala Steward workspace into the GitHub Actions workspace for the repo instead @alejandrohdezma (so caching can be done per repo, and not per runner)? |
Hey! So sorry, I don't know why but I totally missed this issue/conversation. Thank you both so much for all this investigation, this is indeed really interesting! I did some testing with cache on one of the alpha versions using toolkit/cache, but I remember I found some errors and postponed it. One first thing we could do is use the post action to clean up the directory so people don't hit this problem on self-hosted runners. |
Also, @fthomas could you tell us a bit about which things should be cached from the Scala Steward workspace? That way I can do some testing with it on a different branch :) |
I haven't got far, will try again on the weekend. In my other project I have a CircleCI Scala Steward custom job running on multiple projects, and caching is a huge help there, taking build time from 6..8 minutes to tens of seconds. So I think it is important to have an option to keep SS cache at the end, and reuse it next time, one way or another. |
I would cache everything of the Scala Steward workspace except the |
Thank you very much @fthomas for that explanation! 😸 I'll try to address that in a new PR today |
Or, could we change it like this?
Then we could benefit from cross-runners caching (1 + |
Either way, I think we should remove all the files created by the action, since that is what's encouraged by GitHub. Then it won't matter whether the workspace is in the home directory or the repo workspace, and it will be fixed by using |
Could you share a link/example for “encouraged by GitHub”? I searched but couldn’t find that, for example here: https://help.github.com/en/actions/reference/virtual-environments-for-github-hosted-runners#filesystems-on-github-hosted-runners Also, were you able to use |
Right, the |
okey dokey! Thanks @fthomas |
@alejandrohdezma To add another viewpoint to this: If dependencies are cached for X days, then there won't be much value in running this action more often than every X days, no? I have a use case where we'd like to try using Scala Steward for fast, semi-automatic bumping of downstream versions from an upstream repo. Would setting a TTL for the cache be a suitable solution for this? For example, I could have:
Alternatively, there could be a setting in Scala Steward itself (@fthomas) to drop the cache for a specific dependency and force-update just that one (similar to what I suggested in scala-steward-org/scala-steward#1470). |
Caches are stored for a maximum of 7 days, so yes, it won't have any effect if the action is run more often than that.
As far as I know, setting TTL for actions/cache is not allowed 😿
If this is enabled in scala-steward it shouldn't be too hard to add it to the action :) |
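Since actions/cache has no TTL option, one common way to approximate one is to put a time bucket into the cache key, so that a new key (and therefore a fresh cache) is used once per period. The sketch below shows that idea only; it is not what #62 implements, and the function name `bucketed_cache_key` and the key prefix are hypothetical. Buckets are aligned to the Unix epoch rather than to when the cache was saved, so an entry lives for at most `ttl_days` days and often for less.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a cache key that changes once per bucket of `ttl_days` days.
# now_epoch is a parameter so the function can be checked deterministically;
# it defaults to the current time in seconds since the Unix epoch.
bucketed_cache_key() {
  local prefix="$1"
  local ttl_days="$2"
  local now_epoch="${3:-$(date -u +%s)}"
  local bucket=$(( now_epoch / (ttl_days * 86400) ))
  echo "${prefix}-ttl${ttl_days}d-${bucket}"
}

# Example: a key that rolls over daily (prefix is an illustrative name).
bucketed_cache_key "scala-steward-Linux" 1
```

A step could write this value to its outputs (for example by appending `key=...` to the file named by `$GITHUB_OUTPUT`) and a later actions/cache step could use it as its `key`.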
Scala Steward already has a |
Opened #62 to be able to set custom cache TTL. |
As mentioned in #38 (comment), Scala Steward seems to do some kind of caching (that maybe isn't useful for ephemeral GitHub Action runners), opening this issue for that.