Write pantry hashes into a .lock file instead of emitting warnings #4288

Closed
sol opened this issue Sep 4, 2018 · 18 comments · Fixed by #4746

@sol (Contributor) commented Sep 4, 2018

Currently stack build will emit warnings if I do not specify size and sha256 for github dependencies in stack.yaml.

Running stack freeze will output the relevant information that I can then add to stack.yaml manually.

I propose to write the information that is currently produced by stack freeze to a .lock file after every successful build (e.g. pantry.lock; a rough sketch follows the list below):

  • when the .lock file exists then for each listed dependency:
    • stack uses the information as if it were directly listed in stack.yaml; specifically it errors out on any hash mismatches
    • when stack detects that a dependency was updated in stack.yaml it silently ignores the information from the .lock file
  • after a successful build stack creates the .lock file, overwriting any existing versions (which in effect means that we remove unused entries and add missing entries)
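
To make the proposal above concrete, here is a rough, purely hypothetical sketch of what a pantry.lock entry for a github dependency could look like (field names and layout are illustrative only; a concrete format is proposed later in this thread):

dependencies:
  - github: hspec/hspec
    commit: 3c0f266bd3ce71958bf6b6daaf7d0cbcda7e7227
    subdirs:
      - hspec-core
    # plus the information that stack freeze currently prints, e.g.:
    size: <archive size>
    sha256: <archive hash>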

Regarding revision control, we can follow conventions that have been developed elsewhere:
When developing an application, the .lock file should be checked into revision control. When developing a library, the .lock file should be added to .gitignore.

A word on naming: if we expect the .lock file to be specific to stack, we may want to name it stack.lock (or possibly .stack.lock, to not interfere with file-name completion); if, on the other hand, we think it will be relevant to other build tools that implement support for pantry, then we could name it pantry.lock.

Assuming we adopt this approach we can:

  • remove any warnings related to pantry hashes
  • remove the stack freeze command

CC @snoyberg

@sol changed the title from "Write pantry hashes into a .lock file instead of warning" to "Write pantry hashes into a .lock file instead of emitting warnings" on Sep 4, 2018
@qrilka (Contributor) commented Sep 9, 2018

This looks like a sensible alternative approach, though it seems a bit more intrusive and a bit less explicit about the resulting constraints. For example, shouldn't I set explicit hashes for my library so that users can get reproducible results when building on their machines? With a somewhat implicit lock file, setting them will look optional, so in my opinion visible warnings are a good thing to promote more accurate constraints. I'd prefer to follow "explicit is better than implicit" in this case, but @snoyberg probably has more arguments on this question.

@snoyberg (Contributor) commented

Let's separate out the case of snapshot files and stack.yaml files. For snapshot files: a fair assumption is that authors will want to share these files outside of their project at some point. Having a lock file to go along with snapshot files in that context probably wouldn't make much sense. For example: imagine saying "if you download a snapshot file from http://example.com/snapshot.yaml, also check http://example.com/snapshot.yaml.lock." That seems pretty arbitrary and awkward.

By contrast, I can picture having lock files for stack.yaml. It's definitely nicer than barfing warnings at the user (my current design choice 😄) and doesn't require manual intervention from users. However:

  • Unlike other systems, stack.yaml files have always been intended to be "locked" already. These additional lock files are simply providing more secure locking. In contrast, Rust's lock files are providing the same level of information that our stack.yaml files provide. The difference could be confusing. (Note: the choice of freeze name is also a bit of a problem according to this argument.)
  • It's not clear to me what the exact workflow would be for how we update these lock files.
  • I do like the idea that everything necessary for reproducing the build is in the stack.yaml file itself. On the other hand, I also like the idea that the stack.yaml file is easily writeable by hand.

I'd say a good next step is identifying exactly what Stack would do in the possible cases it could encounter:

  • extra-deps value is present with missing hash info and no entry in the lock file
  • Info is available in the lock file but not in extra-deps

As I'm starting to think this through, it may not be as complicated as I was worried it would be.

@qrilka (Contributor) commented Nov 23, 2018

@snoyberg I think the last two questions you posted are easily resolved by what @sol originally proposed: a missing hash will be completed and saved in the lock file after a successful build, and that new lock file will contain only the data corresponding to stack.yaml.
But there's still a question about snapshots: we could also add that info to a pantry.lock (or stack.lock), but shouldn't snapshot maintainers be warned if they have incomplete data in their snapshots? And if they should, then stack freeze could probably be useful for them?

@qrilka (Contributor) commented Nov 23, 2018

And another thing bothers me while implementing this: what to do with changes the user specifies on the command line? Those could influence the dependency graph quite a bit, and it doesn't seem meaningful to store exact dependencies for every CLI flag combination used.

@sol (Contributor, Author) commented Nov 23, 2018

I don't have a strong opinion on custom snapshots; what I care about is extra-deps in stack.yaml.

what to do with changes the user specifies on the command line?

That makes things more complicated indeed. I'll describe what I would ideally want, from a user's perspective:

  1. I would want the lock file to be created as if all build targets were requested (libs, executables, tests, benchmarks). I guess this implies that stack would have to construct the whole build plan whenever there is no pantry.lock (or the existing file is stale). We would still only want to actually build what the user requested.

  2. I would tend to ignore the fact that flags can influence the content of pantry.lock. That is, just use the flags that are specified by the user and produce the pantry.lock that results from that combination of flags.

Discussion

Regarding (2), I think this is a corner case that should be relatively rare (you would need both a dependency that is guarded by a flag and that same dependency listed under extra-deps in stack.yaml).

If a user still runs into this then I think there is a workaround that s/he can use in most situations: The user can change the .cabal or package.yaml file so that the dependencies that are listed under extra-deps are not guarded by flags (e.g. make them top-level dependencies in package.yaml; this won't work for Win32 but for almost anything else it should be ok).
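
As a purely illustrative sketch of that workaround (hpack/package.yaml syntax; the production flag and the dependency foo are made up), it amounts to turning the first form below into the second:

# before: foo is a dependency only when the (made-up) production flag is
# enabled, so it may be missing from the generated lock file
flags:
  production:
    manual: true
    default: false
library:
  when:
    - condition: flag(production)
      dependencies:
        - foo

# after (the workaround): foo is an unconditional dependency, so it is always
# part of the build plan and therefore always gets locked
library:
  dependencies:
    - foo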

Why is this ok?

  • A user really only cares about the content of pantry.lock (== put it under revision control) for applications (not for libraries)
  • For applications (in contrast to libraries) it's not crucial to minimize dependencies => guarding dependencies with flags is not crucial for applications

Finally, there is only one thing I'm worried about. Let's consider the following scenario:

I have a flag -fproduction that guards a dependency foo; and I also have foo listed under extra-deps. Locally during development I don't build with -fproduction, so my pantry.lock does not contain a hash for foo. My CI/CD pipeline on the other hand builds with -fproduction and will use a version of foo that is not properly locked.

One way to address this could be to fail on missing entries in the lock file on --pedantic.

Implementation

I'm not sure what (1) would mean on the implementation side.

@sol (Contributor, Author) commented Nov 23, 2018

  • when stack detects that a dependency was updated in stack.yaml it silently ignores the information from the .lock file

At the risk of stating the obvious, I think this implies that if we e.g. have

extra-deps:
  - github: hspec/hspec
    commit: 3c0f266bd3ce71958bf6b6daaf7d0cbcda7e7227
    subdirs:
      - hspec-core

in stack.yaml then we need to include the commit hash 3c0f266bd3ce71958bf6b6daaf7d0cbcda7e7227 in the .lock file so that we can detect when the "dependency was updated in stack.yaml".

That is, the .lock file is a superset of extra-deps.

@snoyberg (Contributor) commented

But there's still a question about snapshots: we could also add that info to a pantry.lock (or stack.lock), but shouldn't snapshot maintainers be warned if they have incomplete data in their snapshots? And if they should, then stack freeze could probably be useful for them?

I'd say, for now, let's ignore the problems of creating snapshot files. That seems like something distinctly out of scope for Stack itself. Perhaps we'll have a Pantry executable that is useful for filling in missing info in a snapshot. After all, Pantry snapshots will hopefully work for more than just Stack users (e.g., Nix users).

what to do with changes the user specifies on the command line?

Seems to be totally out of scope. The lock file should only address things that are in the stack.yaml itself. Command line changes are non-reproducible by nature.


Let's try to make things just a bit more concrete. Let's say we have @sol's example stack.yaml:

extra-deps:
  - github: hspec/hspec
    commit: 3c0f266bd3ce71958bf6b6daaf7d0cbcda7e7227
    subdirs:
      - hspec-core

I'm imagining we'll end up with a lock file that looks like this:

dependencies:
- original:
    github: hspec/hspec
    commit: 3c0f266bd3ce71958bf6b6daaf7d0cbcda7e7227
    subdirs:
    - hspec-core
  complete:
  - size: 99713
    subdir: hspec-core
    url: https://github.com/hspec/hspec/archive/3c0f266bd3ce71958bf6b6daaf7d0cbcda7e7227.tar.gz
    cabal-file:
      size: 4506
      sha256: f79008d723c203f36815b304c0db90b065fa1b20cc06033e5055ccea8b096c3b
    name: hspec-core
    version: 2.6.0
    sha256: ad03a807857ce4ed07c6a8410dbce8b5244c4e193e86767ce0ca93a3ba28dadd
    pantry-tree:
      size: 3751
      sha256: 1bad19b4c4afde31c5f57d919ed768a0e034587e30070e4ac3420896fcb48b90

Rules:

  • Any package in the extra-deps or any snapshots with incomplete information will appear in this lock file, regardless of whether or not it's used by the build.
  • If a package description in extra-deps or a snapshot changes (e.g., new commit, changes to subdirs), then a new entry will be added and the old one removed.
  • Similarly, if any packages are removed, they'll be removed from the lockfile.
  • Each time additions, changes, or removals happen, we'll print a message to the user.

I think this approach will address all of the reproducibility and performance concerns I have, and mean that the original stack.yaml can be even more lightweight than it is today, e.g. including the @rev:0 information there will no longer be necessary at all.
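
For example (acme-missiles here is purely an illustration, not a format proposal): instead of pinning a cabal-file revision in stack.yaml itself, the plain version would be enough and the resolved revision would live in the generated lock file:

# today: the cabal-file revision is pinned directly in stack.yaml
extra-deps:
  - acme-missiles-0.3@rev:0

# with a lock file: the plain version suffices; the resolved revision and
# hashes would be recorded in the generated .lock file instead
extra-deps:
  - acme-missiles-0.3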

@qrilka @sol does that make sense?

@sol (Contributor, Author) commented Nov 23, 2018

I'd say, for now, let's ignore the problems of creating snapshot files. That seems like something distinctly out of scope for Stack itself. Perhaps we'll have a Pantry executable that is useful for filling in missing info in a snapshot. After all, Pantry snapshots will hopefully work for more than just Stack users (e.g., Nix users).

👍

what to do with changes the user specifies on the command line?

Seems to be totally out of scope. The lock file should only address things that are in the stack.yaml itself. Command line changes are non-reproducible by nature.

That makes sense. I was misled by the wrong assumption that you can specify Hackage dependencies as extra-deps without exact versions (but that is not the case, and it wouldn't really make sense anyway).

Now everything actually looks neat and simple! That's great!

@qrilka (Contributor) commented Nov 23, 2018

Actually, I still have one question: the original proposal talks about saving the lock file only on a successful build, but what if that build was modified by some flags? In that case we don't have any guarantees about the original (completed) dependencies. Should we exclude that requirement of a successful build then, @snoyberg, and use the lock file only to lock package versions without giving extra guarantees?

@qrilka (Contributor) commented Nov 23, 2018

Regarding the question of pantry.lock vs stack.lock: this feature looks to be out of Pantry's scope, at least with the current separation between Stack and Pantry. The details of stack.yaml live in Stack, and Pantry has no entity combining a list of packages (i.e. extra-deps plus project packages, though the latter don't need to be locked) with a snapshot. So I would propose sticking with the Stack-based name (stack.lock) if there are no objections to that.
Also, it hasn't been stated explicitly in this ticket yet, but it's worth noting that besides the information about completed packages we also need to complete the package locations.

@sol (Contributor, Author) commented Nov 23, 2018

Should we exclude that requirement of a successful build then

At least, I would be fine with that. My original feature request was influenced by what I implemented in sol/tinc, where I do dependency solving and have the successful-build requirement. But for stack it seems like this can just be an (impure) function from stack.yaml to stack.lock.

Regarding the question of pantry.lock vs stack.lock

I used pantry.lock as a placeholder name only. I'm ok with stack.lock or anything else. One drawback of stack.lock is that completion a la

vim sta<tab><enter>

to edit stack.yaml won't work anymore. But not a major issue for me.

Still, should we consider .stack.lock?

@snoyberg (Contributor) commented

the original proposal talks about saving the lock file only on a successful build, but what if that build was modified by some flags? In that case we don't have any guarantees about the original (completed) dependencies. Should we exclude that requirement of a successful build then, @snoyberg, and use the lock file only to lock package versions without giving extra guarantees?

Yes, I think so, good catch.

Regarding the file name, we need to consider the fact that the stack.yaml file may not be called stack.yaml. It might be stack-lts-12.yaml, for example, or even something ridiculous like my-stack-yaml.txt. How about simply appending .lock to the filename?

@qrilka (Contributor) commented Nov 24, 2018

Sounds good to me. I also think it makes sense to resolve symlinks to get the real file name, so a stack.yaml that points to stack-lts-12.yaml will work the same as using the file stack-lts-12.yaml itself.

@sol (Contributor, Author) commented Nov 25, 2018

How about simply appending .lock to the filename?

I like this 👍

@snoyberg (Contributor) commented

This is also vital for improving performance. Currently, Stack will spend significant time completing package information on each startup. Automatically writing a .lock file will bypass that overhead.

@snoyberg (Contributor) commented Mar 6, 2019

Specification for this lock file implementation is now written here: https://github.com/commercialhaskell/stack/blob/f2ac3e64d70c990c6a5045f02c13f5869a77c5c4/doc/lock_files.md

@sol (Contributor, Author) commented Mar 6, 2019

Looks great 👍

QUESTION Do we want to go the easy way at first and later implement the more complicated update procedure?

No strong preference, but it's the approach I tend to go for. Maybe even simplify the file format by skipping originals until we actually need them.
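
Concretely, skipping originals would mean keeping only the completed entries from the format sketched above, along these lines (illustrative only; values copied from the earlier hspec-core example):

dependencies:
- size: 99713
  subdir: hspec-core
  url: https://github.com/hspec/hspec/archive/3c0f266bd3ce71958bf6b6daaf7d0cbcda7e7227.tar.gz
  name: hspec-core
  version: 2.6.0
  # ...remaining hashes as in the earlier example, just without the "original" key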

@qrilka (Contributor) commented Mar 6, 2019

@simonmichael regarding originals: as far as I understand, in #4550 they are used to simplify tracking between the locations in the source yaml files and the resulting completed locations, so I'd say that we need them already.

@snoyberg added this to the Stack 2.0 milestone on Apr 22, 2019