Problems with caching on self-hosted runners #52

MPV · 2020-06-04T15:23:56Z

As mentioned in #38 (comment), Scala Steward seems to do some kind of caching (which may not be useful for ephemeral GitHub Actions runners), so I'm opening this issue for that.

MPV · 2020-06-04T15:26:54Z

For example, here are some cache files that were created on one of my runners:

runner@runner-pod-h79x4:~$ find | grep workspace
./scala-steward/workspace
./scala-steward/workspace/repos
./scala-steward/workspace/repos/my-org
./scala-steward/workspace/store
./scala-steward/workspace/store/refresh_error
./scala-steward/workspace/store/refresh_error/v1
./scala-steward/workspace/store/refresh_error/v1/github
./scala-steward/workspace/store/refresh_error/v1/github/my-org
./scala-steward/workspace/store/refresh_error/v1/github/my-org/my-repo
./scala-steward/workspace/store/refresh_error/v1/github/my-org/my-repo/refresh_error.json

After deleting those, my Scala Steward action doesn't fail anymore with issues like this:

Launching org.scala-steward:scala-steward-core_2.13:0.5.0-385-e5e4789c-SNAPSHOT
  2020-06-01 10:02:06,285 INFO   
    ____            _         ____  _                             _
   / ___|  ___ __ _| | __ _  / ___|| |_ _____      ____ _ _ __ __| |
   \___ \ / __/ _` | |/ _` | \___ \| __/ _ \ \ /\ / / _` | '__/ _` |
    ___) | (_| (_| | | (_| |  ___) | ||  __/\ V  V / (_| | | | (_| |
   |____/ \___\__,_|_|\__,_| |____/ \__\___| \_/\_/ \__,_|_|  \__,_|
   v0.5.0-385-e5e4789c-SNAPSHOT
   
  2020-06-01 10:02:06,290 INFO  Run self checks
  2020-06-01 10:02:07,163 INFO  Add global sbt plugins
  2020-06-01 10:02:07,177 INFO  Clean workspace /home/runner/scala-steward/workspace
  2020-06-01 10:02:07,196 INFO  ──────────── Steward my-org/my-repo ────────────
  2020-06-01 10:02:07,199 INFO  Check cache of my-org/my-repo
  2020-06-01 10:02:07,230 INFO  Skipping due to previous error
  2020-06-01 10:02:07,237 INFO  ──────────── Total time: Steward my-org/my-repo: 40ms ────────────
  2020-06-01 10:02:07,239 INFO  ──────────── Total time: run: 957ms ────────────

MPV · 2020-06-04T15:28:46Z

A possible workaround might be to add a GHA step that removes all/some of those files...?
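
A minimal sketch of that workaround, assuming the workspace lives at ~/scala-steward/workspace as in the listing above. The step name is made up for illustration, and only the refresh_error store is removed so the rest of the workspace is left alone:

```yaml
steps:
  # Hypothetical cleanup step: drop the stored refresh errors so that a
  # previous failure does not make Scala Steward skip the repo on this run.
  - name: Remove stale refresh_error store
    run: rm -rf ~/scala-steward/workspace/store/refresh_error
  # ...the existing "Launch Scala Steward" step follows here.
```

Since rm -rf does not fail on a missing path, the step should also be safe on a fresh runner where the directory does not exist yet.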

bpg · 2020-06-04T16:01:35Z

Caching is an interesting topic that I'd like to discuss a bit more. I think there are some cases where you'd like to utilize results from previous action runs, for example, a big project with lots of dependencies and/or lots of open PRs from Scala Steward, or an action that handles updates for multiple projects.

Running from scratch every time will eat into your action minutes allotment. Also, Scala Steward has some features that use the state persisted in the workspace, like scanning frequency, etc. I was really thinking about adding a caching step to my own projects just to deal with all of this.

So, if we're talking about adding behaviour to the action that wipes the workspace before calling SS, I would suggest that we:

- add a configuration input to enable/disable this behaviour
- have it enabled by default (as I assume the majority of action users have small, single-project repos)
- explain all of this in the documentation and add configuration examples with explicit workspace caching

Thoughts?

MPV · 2020-06-04T22:00:36Z

The current/built-in caching in Scala Steward is only per runner, and that cache might not be as helpful here as when running Scala Steward standalone against multiple repos. This is in contrast with this action, where the default is to run against just the one repo.

I’ll see if I can troubleshoot which errors got cached (preventing a rerun from succeeding). Essentially, whether any caching is performed would be less of an issue if I could rerun and get it to retry creating PRs etc. (when it fails on an error that is resolved by just retrying).

MPV · 2020-06-05T11:52:02Z

Leaving a note to remember:
If I (or anyone else) run into this again, we should probably check the contents of that file (from the output above):

./scala-steward/workspace/store/refresh_error/v1/github/my-org/my-repo/refresh_error.json

Another workaround could be to just remove that file (until we've found a suitable way forward regarding caching per runner/repo/etc).

MPV · 2020-06-05T15:19:31Z

refresh_error aside...

Do we have any suggestions on how one might do caching using the actions/cache action?

(to cache Scala Steward workspace files between different runners/runs for the same repo)

Had you given this a try @bpg?

MPV · 2020-06-05T16:04:52Z

To share, here's what I'm trying at the moment:

diff --git a/.github/workflows/scala-steward.yml b/.github/workflows/scala-steward.yml
index 9c79225..d8cbe8f 100644
--- a/.github/workflows/scala-steward.yml
+++ b/.github/workflows/scala-steward.yml
@@ -1,25 +1,29 @@
 name: Scala Steward

 on:
   # This workflow will launch at 00:00 every Sunday
   schedule:
     - cron: '0 0 * * 0'
   repository_dispatch:
     types: [scala-steward]

 jobs:
   scala-steward:
     runs-on: self-hosted
     name: Launch Scala Steward
     steps:
+      - run: rm -rf ~/scala-steward
+      - uses: actions/cache@v2
+        with:
+          path: ~/scala-steward
+          key: ${{ runner.os }}
       - name: Setup Java and Scala
         uses: olafurpg/setup-scala@v5
       - name: Launch Scala Steward
         uses: scala-steward-org/scala-steward-action@v2
         with:
           github-token: ${{ secrets.ORG_LEVEL_GITHUB_TOKEN }}
           author-email: [email protected]
           author-name: org-level-robot-user
-      - run: rm -rf ~/scala-steward

I get these results when running the above:

Before scala-steward-action:

Run actions/cache@v2
  with:
    path: ~/scala-steward
    key: Linux
Cache not found for input keys: Linux

After:

Post Run actions/cache@v2
Cache saved successfully
Post job cleanup.
/bin/tar -z -cf cache.tgz -P -C /home/runner/_work/my-repo/my-repo --files-from manifest.txt
Cache saved successfully

🎉

And in the next job:

Run actions/cache@v2
Cache Size: ~0 MB (5867 B)
/bin/tar -z -xf /home/runner/_work/_temp/1a7847a1-58fa-482c-9b00-0bcf25638667/cache.tgz -P -C /home/runner/_work/my-repo/my-repo
Cache restored from key: Linux

Post Run actions/cache@v2
Post job cleanup.
Cache hit occurred on the primary key Linux, not saving cache.

😭

MPV · 2020-06-05T17:19:24Z

Tried with caching based on *.sbt files:

diff --git a/.github/workflows/scala-steward.yml b/.github/workflows/scala-steward.yml
index d8cbe8f..43c2249 100644
--- a/.github/workflows/scala-steward.yml
+++ b/.github/workflows/scala-steward.yml
@@ -18,7 +18,7 @@ jobs:
       - uses: actions/cache@v2
         with:
           path: ~/scala-steward
-          key: ${{ runner.os }}
+          key: ${{ runner.os }}-sbt-${{ hashFiles('**/*.sbt') }}
       - name: Setup Java and Scala
         uses: olafurpg/setup-scala@v5
       - name: Launch Scala Steward

Unfortunately, that just resolved to Linux-sbt- (no hash), since I had forgotten to check out the repo (I didn't have any actions/checkout step).
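
For reference, a sketch of the step order that lets hashFiles see the build files. hashFiles only matches files inside the GitHub Actions workspace, so the checkout has to come before the cache step; the path and key are the ones from the diff above:

```yaml
steps:
  # Check out the repo first. Without it the workspace is empty and
  # hashFiles('**/*.sbt') evaluates to an empty string.
  - uses: actions/checkout@v2
  - uses: actions/cache@v2
    with:
      path: ~/scala-steward
      key: ${{ runner.os }}-sbt-${{ hashFiles('**/*.sbt') }}
```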

I added that now, and get:

Run actions/cache@v2
Cache Size: ~0 MB (8052 B)
/bin/tar -z -xf /home/runner/_work/_temp/4d38ab15-b49a-4a1c-ab30-7ce433cc8161/cache.tgz -P -C /home/runner/_work/sbt-services/sbt-services
Cache restored from key: Linux-sbt-b19d6d84297c3469e702a9fdf501c41235ffe8bd4b62aed495a8ca62cd1b8fd0

Post Run actions/cache@v2
Post job cleanup.
Cache hit occurred on the primary key Linux-sbt-b19d6d84297c3469e702a9fdf501c41235ffe8bd4b62aed495a8ca62cd1b8fd0, not saving cache.

However, one problem here is that even though a dependency might have changed, the cache isn't updated. My *.sbt files need to change before the cache is saved again; note the "not saving cache" above.
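
One possible way around this, sketched here as an assumption rather than something tried in this thread: make the primary key unique per run so there is never an exact hit, and use restore-keys to fall back to the most recent cache with the same prefix. The -scala-steward- key prefix is made up for illustration:

```yaml
- uses: actions/cache@v2
  with:
    path: ~/scala-steward
    # github.run_id is unique per workflow run, so the primary key does not
    # match an existing cache and the post step saves a fresh one.
    key: ${{ runner.os }}-scala-steward-${{ github.run_id }}
    # Prefix match: restores the most recently created cache whose key
    # starts with this string.
    restore-keys: |
      ${{ runner.os }}-scala-steward-
```

The trade-off is that every run stores a new cache entry, leaving older entries to be evicted by GitHub's normal cache limits.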

MPV · 2020-06-05T17:20:47Z

Also, note how the cache is expanded into my workspace for GitHub Actions:
(-C /home/runner/_work/my-repo/my-repo and not ~/scala-steward)

Run actions/cache@v2
  with:
    path: ~/scala-steward
    key: Linux-sbt-b19d6d84297c3469e702a9fdf501c41235ffe8bd4b62aed495a8ca62cd1b8fd0
Cache Size: ~0 MB (8052 B)
/bin/tar -z -xf /home/runner/_work/_temp/17139ec5-0305-4040-9307-8e15c1de1709/cache.tgz -P -C /home/runner/_work/my-repo/my-repo
Cache restored from key: Linux-sbt-b19d6d84297c3469e702a9fdf501c41235ffe8bd4b62aed495a8ca62cd1b8fd0

MPV · 2020-06-05T17:23:59Z

Maybe we should revisit the changes made in #42 (where we moved the workspace into ~/scala-steward)...

See:

scala-steward-action/src/files.ts

Line 26 in 3199096

const stewarddir = `${os.homedir()}/scala-steward`

...and maybe move the Scala Steward workspace into the GitHub Actions workspace for the repo instead @alejandrohdezma (so caching can be done per repo, and not per runner)?

alejandrohdezma · 2020-06-05T17:55:02Z

Hey! So sorry, I don't know why but I totally missed this issue/conversation. Thank you both so much for all this investigation, this is indeed really interesting!

I did some testing with caching on one of the alpha versions using toolkit/cache, but I remember I ran into some errors and postponed it.

A first step could be to use the post action to clean up the directory so people don't hit this problem on self-hosted runners.

alejandrohdezma · 2020-06-05T17:56:51Z

Also, @fthomas could you tell us a bit about which things should be cached from the Scala Steward workspace? That way I can do some testing with it on a different branch :)

bpg · 2020-06-05T19:18:22Z

> Do we have any suggestions on how one might do caching using the actions/cache action?
>
> (to cache Scala Steward workspace files between different runners/runs for the same repo)
>
> Had you given this a try @bpg?

I haven't got far, will try again on the weekend.

In my other project I have a custom CircleCI Scala Steward job running on multiple projects, and caching is a huge help there, taking build time from 6-8 minutes to tens of seconds. So I think it is important to have an option to keep the SS cache at the end and reuse it next time, one way or another.

fthomas · 2020-06-05T19:51:00Z

I would cache everything in the Scala Steward workspace except the store/refresh_error directory. The refresh_error store is used to temporarily ignore repos whose build can't be loaded by Scala Steward. This is useful when Scala Steward is working on a lot of repos and should not be slowed down by a few repos that have broken builds. I guess it is less useful if Scala Steward runs as a GH action, where it isn't desirable to ignore broken builds because the operator of the action is able to fix the builds or the action.

alejandrohdezma · 2020-06-06T09:15:58Z

@MPV I have created #57 to address the problem of files created by this action being left behind. Let me know if that would be enough :)

alejandrohdezma · 2020-06-06T09:16:43Z

Thank you very much @fthomas for that explanation! 😸

I'll try to address that in a new PR today

MPV · 2020-06-06T09:23:48Z

Or, could we change it like this?

1. The action (or Steward) has its workspace within the repo workspace.
2. The action (or Steward) cleans any refresh_error files/directories after each run.

Then we could benefit from cross-runner caching (1 + actions/cache) and still not get blocked checks due to refresh error locks (2).

alejandrohdezma · 2020-06-06T10:40:05Z

Either way, I think we should remove all the files created by the action, since that is what's encouraged by GitHub. Then it won't matter whether the workspace is in the home directory or the repo workspace, and it will be fixed by using actions/cache in any case.

MPV · 2020-06-07T06:21:39Z

> Either way, I think we should remove all the files created by the action, since that is what's encouraged by GitHub. Then it won't matter whether the workspace is in the home directory or the repo workspace, and it will be fixed by using actions/cache in any case.

Could you share a link/example for “encouraged by GitHub”? I searched but couldn’t find that, for example here: https://help.github.com/en/actions/reference/virtual-environments-for-github-hosted-runners#filesystems-on-github-hosted-runners

Also, were you able to use actions/cache to cache anything outside the repo workspace?

alejandrohdezma · 2020-06-08T07:45:08Z

Hey @MPV, @bpg: #59 should solve the cache problem :)

@fthomas I've noticed that Scala Steward always removes the repos directory inside the workspace upon start. Does it make sense then to cache the workspace/repos directory?

fthomas · 2020-06-08T10:45:08Z

Right, the repos directory does not need to be cached.
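
Putting the two answers together, a hedged sketch of a workflow fragment that caches the workspace but drops repos and store/refresh_error before the cache is saved. It relies on the actions/cache post step running after all regular steps, as seen in the logs earlier in the thread. The prune step name and the cache key are illustrative, and the sketch assumes the action itself leaves the workspace in place until then:

```yaml
steps:
  - uses: actions/cache@v2
    with:
      path: ~/scala-steward/workspace
      # Unique key per run plus a prefix fallback, so a cache is saved
      # after every successful run.
      key: ${{ runner.os }}-scala-steward-${{ github.run_id }}
      restore-keys: |
        ${{ runner.os }}-scala-steward-
  - name: Launch Scala Steward
    uses: scala-steward-org/scala-steward-action@v2
    with:
      github-token: ${{ secrets.ORG_LEVEL_GITHUB_TOKEN }}
  # Runs even if the launch step failed (useful on self-hosted runners,
  # where files persist), and before the cache post step, so neither
  # directory ends up in the saved cache.
  - name: Prune workspace before caching
    if: always()
    run: |
      rm -rf ~/scala-steward/workspace/repos
      rm -rf ~/scala-steward/workspace/store/refresh_error
```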

alejandrohdezma · 2020-06-08T10:48:25Z

okey dokey! Thanks @fthomas

MPV · 2020-06-08T11:55:01Z

@alejandrohdezma To add another viewpoint to this:
If we enable caching as in #59, how long will dependencies be cached for?

If dependencies are cached for X days, then there won't be much value in running this action more often than every X days, no?

I have a use-case where we'd like to try using Scala Steward to create fast and semi-automatic bumping of downstream versions from an upstream repo.

Would setting a TTL for the cache be a suitable solution for this?

For example, I could have:

- a weekly workflow with a longer TTL for generic dependencies
- a repository_dispatch workflow with a shorter TTL, triggered by changes in upstream repos

Alternatively, there could be a setting for Scala Steward itself (@fthomas) to drop the cache for a specific dependency and force-update just that one (similar to what I suggested in scala-steward-org/scala-steward#1470).

alejandrohdezma · 2020-06-08T12:54:43Z

> @alejandrohdezma To add another viewpoint to this:
> If we enable caching as in #59, how long will dependencies be cached for?
>
> If dependencies are cached for X days, then there won't be much value in running this action more often than every X days, no?

Caches are stored for a maximum of 7 days, so yes, it won't have any effect if the action is run more often than that.

> I have a use-case where we'd like to try using Scala Steward to create fast and semi-automatic bumping of downstream versions from an upstream repo.
>
> Would setting a TTL for the cache be a suitable solution for this?

As far as I know, setting TTL for actions/cache is not allowed 😿

> Alternatively, there could be a setting for Scala Steward itself (@fthomas) to drop the cache for a specific dependency and force-update just that one (similar to what I suggested in fthomas/scala-steward#1470).

If this is enabled in scala-steward, it shouldn't be too hard to add it to the action :)

fthomas · 2020-06-08T18:16:41Z

Scala Steward already has a --cache-ttl option that controls how often it checks for new versions. The default value is 2hours.

MPV · 2020-06-09T13:24:33Z

> Scala Steward already has a --cache-ttl option that controls how often it checks for new versions. The default value is 2hours.

Opened #62 to be able to set custom cache TTL.

MPV mentioned this issue Jun 4, 2020

Add author-* inputs and update README.md #38

Merged

alejandrohdezma mentioned this issue Jun 6, 2020

Add post step to cleanup files created by the action #57

Merged

alejandrohdezma linked a pull request Jun 6, 2020 that will close this issue

Add post step to cleanup files created by the action #57

Merged

alejandrohdezma removed a link to a pull request Jun 6, 2020

Add post step to cleanup files created by the action #57

Merged

alejandrohdezma mentioned this issue Jun 8, 2020

Save/restore Scala Steward workspace to/from cache #59

Merged

alejandrohdezma closed this as completed in #59 Jun 8, 2020

MPV mentioned this issue Jun 9, 2020

Support overriding default "--cache-ttl 2hours" #62

Closed
