Is your feature request related to a problem? Please describe.
`stack_snapshot` calls out to `stack` multiple times at every fetch:

- `stack update`, to update the Hackage database (factored out into `stack_update`).
- `stack unpack`, to download the package sources.
- `stack ls dependencies` and `stack dot`, to determine the dependency graph.
This causes some issues:

- Concurrency issues with `stack update`, which does not support multiple simultaneous invocations. We can work around this within one Bazel repository, see Call stack update once #1199. But this also causes issues with multiple simultaneous jobs on one CI runner, which we cannot address well within `rules_haskell`, see Concurrency issues with `stack_snapshot()` #1167.
- What `stack unpack` fetches is opaque to Bazel and is not cached well in Bazel's repository cache, see `stack_snapshot` leads to a big download that can't be cached by bazel #1112 and, relatedly, Cache the stackage snapshot repository #1168.

Describe the solution you'd like
I would like `stack_snapshot` to be more transparent to Bazel. The new `stack ls dependencies json` feature provides most of the required information: the package name, version, dependencies, and the package location (Hackage, URL, repository). We could then use `download_and_extract` or `git_repo` to let Bazel fetch the sources in a way that Bazel can cache.
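To illustrate the idea, here is a minimal Python sketch (standing in for the Starlark a repository rule would actually use) of how entries from `stack ls dependencies json` could be mapped to URLs for `download_and_extract`. The JSON field names and shapes below are assumptions for illustration, not the tool's exact output.

```python
import json

# Hypothetical, simplified shape of `stack ls dependencies json` output.
# The exact field names are an assumption for illustration.
sample = '''
[
  {"name": "text", "version": "1.2.4.0",
   "location": {"type": "hackage"},
   "dependencies": ["array", "base"]},
  {"name": "mypkg", "version": "0.1.0",
   "location": {"type": "archive", "url": "https://example.com/mypkg-0.1.0.tar.gz"},
   "dependencies": ["base"]}
]
'''

def fetch_url(pkg):
    """Map a package entry to a URL that download_and_extract could fetch."""
    loc = pkg["location"]
    if loc["type"] == "hackage":
        # Hackage source archives follow a predictable URL scheme.
        return "https://hackage.haskell.org/package/{0}-{1}/{0}-{1}.tar.gz".format(
            pkg["name"], pkg["version"])
    if loc["type"] == "archive":
        return loc["url"]
    raise ValueError("unsupported location type: " + loc["type"])

urls = {p["name"]: fetch_url(p) for p in json.loads(sample)}
```

Each resulting URL could then be handed to Bazel's downloader, so the fetched archives land in Bazel's repository cache like any other external dependency.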
Unfortunately, `stack ls dependencies json` does not provide the `sha256`. However, Bazel gives access to the `sha256` after the download, so we could generate a lock file containing the `sha256` (or the `commit` and `shallow_since` in the case of `git_repo`).

Similar to `rules_jvm_external`, I would like `stack_snapshot` to provide a pinning mechanism. Given a lock file, `stack_snapshot` would not have to call out to `stack` at all, avoiding any of the issues around `stack update`.
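A lock file along these lines could be produced after the first, unpinned fetch, recording the hash observed for each download. The sketch below uses hypothetical field names and a JSON layout of my own choosing; the actual lock-file format is an open design question here.

```python
import hashlib
import json

def lock_entry(name, version, url, archive_bytes):
    # Record the archive hash so later fetches can pass it to
    # download_and_extract and be verified and cached by Bazel.
    return {
        "name": name,
        "version": version,
        "url": url,
        "sha256": hashlib.sha256(archive_bytes).hexdigest(),
    }

def write_lock_file(path, entries):
    # A stable, sorted JSON file keeps diffs small when re-pinning.
    with open(path, "w") as f:
        json.dump({"packages": entries}, f, indent=2, sort_keys=True)

entry = lock_entry("text", "1.2.4.0",
                   "https://example.com/text-1.2.4.0.tar.gz",
                   b"archive contents")
```

With such entries checked into the user's repository, subsequent builds could skip `stack` entirely and verify every download against the pinned hash.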
Describe alternatives you've considered
Instead of shelling out to `stack`, we could also implement a dedicated tool using the `pantry` library. However, this causes a bootstrapping issue.
Cache the stackage snapshot repository #1168 proposes to generate a tarball bundling all sources fetched by `stack unpack`, enabling `stack_snapshot` to download that from a user-defined remote cache instead of calling `stack unpack`. However, this would probably require a fair bit of setup on the user's side and is not a solution that most users could easily apply. To avoid any call to `stack`, this would also need to generate some form of lock file that provides the required package metadata.

Additional context

It would be interesting to see how `download_and_extract` would compare to `stack unpack`, or to the uber tarball proposed in Cache the stackage snapshot repository #1168, in terms of speed. However, with Bazel handling the downloads, this would benefit from remote repository caching if that becomes available, see Allow using remote cache for repository cache bazelbuild/bazel#6359.
@aherrmann do I understand correctly that if the SHA256 were exposed in `stack ls dependencies json` (as suggested in commercialhaskell/stack#5274), the `//:pin` script would no longer be necessary?
The pin script itself is just a small script that copies the lock file into the user's repository (similar to `rules_jvm_external`). So we'd probably keep that script if `stack` exposed the `sha256`. However, we could simplify the generation of the lock file if we didn't have to query `all-cabal-hashes` or download archive dependencies to determine hashes ourselves.
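As a rough illustration of what such a pin script amounts to (the destination file name here is an assumption, not the actual target):

```python
import shutil
import tempfile
from pathlib import Path

def pin(generated_lock, workspace_dir):
    # Copy the generated lock file into the user's repository,
    # mirroring what rules_jvm_external's pinning step does.
    dest = Path(workspace_dir) / "stackage_snapshot.json"  # name is an assumption
    shutil.copyfile(generated_lock, dest)
    return dest

# Demonstrate on a throwaway directory standing in for a workspace.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "generated.json"
    src.write_text('{"packages": []}')
    copied = pin(src, tmp)
    pinned_contents = copied.read_text()
```

The interesting work is in generating the lock file's contents, not in this copy step, which is why the script would survive even if `stack` exposed the hashes directly.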