Custom snapshots #111

Closed
UnkindPartition opened this issue May 29, 2015 · 22 comments

@UnkindPartition
Contributor

Custom snapshots are essential for me and generally important for user freedom.

By 'custom' I mean the ability to specify both custom constraints and a custom download repository.

I am happy to try to work on this myself, but wanted to find out your thoughts first.

@snoyberg
Contributor

Definitely in favor. For the download repository, see #29. This is already somewhat supported by the (undocumented) URL overriding feature, something like:

urls:
    package-index-http-url: "some-url"

For custom snapshots, there are two approaches we can take:

  • Continue with the extra-deps approach, and possibly add an extra-deps-file field so that the custom snapshot info can be kept in a separate file from stack.yaml. You could then combine this with something like resolver: ghc-7.10.
  • Allow a new resolver type for custom snapshots. The advantage is that the custom snapshot database could be shared among multiple projects (as opposed to extra-deps, which are project-specific). This is actually pretty easy to do; we'd just have to decide on the naming. In particular, we'd need some assurance that a custom snapshot doesn't change arbitrarily, otherwise your installed snapshot database could become corrupted. The two approaches I can think of are (1) trust the user, and (2) base the name on a hash of the snapshot contents.
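
To make option (2) concrete, here is a minimal Python sketch of content-based naming. It is an illustration only, not stack's implementation: the function name `snapshot_dir_name`, the choice of SHA-256, and the 12-character truncation are all assumptions made for the example.

```python
import hashlib
from typing import Mapping


def snapshot_dir_name(label: str, packages: Mapping[str, str]) -> str:
    """Hypothetical scheme: user-chosen label plus a short hash of the contents."""
    # Sort the entries so the same package set always serializes identically.
    canonical = "\n".join(
        f"{name}-{version}" for name, version in sorted(packages.items())
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # 12 hex characters is an arbitrary illustrative length.
    return f"{label}-{digest[:12]}"
```

With a scheme like this, two projects that both call their snapshot "roche1" but pin different versions would get different directory names, so neither build would overwrite the other's package database. Note that any change to the package set, including adding a package, also produces a new name.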

@snoyberg
Contributor

@feuerbach Did you have any thoughts on which direction you think this should take?

@UnkindPartition
Contributor Author

I'd certainly like to add a new resolver type.

Although I understand the need for immutability, I'm a bit confused about "the naming". Do you mean naming the snapshot itself (which is presumably done by the user), or naming the internal package database, or something else?

I don't like relying on a dumb hash; e.g. adding packages shouldn't be a problem, only changing the existing versions is. Right? Although a dumb hash could serve as a fast check.

Do I understand correctly that a resolver in general doesn't assume one version per package (e.g. the ghc resolver doesn't have this property), but a "snapshot resolver" does?

Should we also try to check that a custom package store doesn't replace foo-1.0 with a different foo-1.0? I don't think this will lead to any breakage; in the worst case the change will simply be ignored (if foo-1.0 is already installed).

@snoyberg
Contributor

The naming situation I'm worried about is for the internal directory names. Imagine if you name your custom snapshot "roche1". Then stack installs packages in ~/.stack/snapshots/roche1/pkgdb (or something like that). If you then tweak the version of some package, stack would have to know to unregister that package and anything that depends on it, which it doesn't know about now (since it works under the assumption that snapshots are immutable). That's not a big deal of course, we can add that logic. The difficult thing is: what if a different project you're working on also uses roche1, but has a slightly different definition? Then you end up in the cabal-ish situation where each build will break your other project.

The answer to all of this may be: caveat emptor, custom snapshots are an advanced feature, and users need to think about these things when using them. I'm OK with that being the answer, I just want to make sure we make such a decision consciously instead of accidentally.

@UnkindPartition
Contributor Author

FYI, I'm studying stack's codebase to implement this feature, but it isn't going fast, so in case anyone wants to do this, go ahead and don't be blocked on me. But if not, I'm still determined to get this done eventually.

@snoyberg
Contributor

snoyberg commented Jun 1, 2015

I'm deep inside the multiple package indices right now, which is a prereq for this. I'd be happy to help you get through this at the same time (it would be good to have more people familiar with the codebase). Let me know if there's any assistance I can give you. If you're still stuck on it when I get my other high priority tasks for this milestone complete, let's touch base again.

@UnkindPartition
Contributor Author

If we could get on a call (say, later today?) and walk around the codebase together, that'd be super helpful, actually.

@snoyberg
Contributor

snoyberg commented Jun 1, 2015

Cool, let's do it. I'm free most of the morning and early afternoon, just ping me on hangout/jabber.

On Mon, Jun 1, 2015, 9:05 AM Roman Cheplyaka wrote:

If we could get on a call (say, later today?) and walk around the codebase together, that'd be super helpful, actually.

@snoyberg snoyberg modified the milestones: Third release, Second release Jun 5, 2015
@UnkindPartition
Contributor Author

Here's my design so far (there's also some code, but it's not finished yet).

The snapshot.yaml has the following format:

packages: >
  base-1.0
  foo-bar-2.1.3
flags:
  unification-fd:
    base4: true

The flags section is optional. The rationale behind this format (as opposed to a more structured mapping from PackageName to Version) is to be able to copy-paste the output of ghc-pkg list.
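
As a rough sketch of how the whitespace-separated packages field could be read, the Python function below splits each identifier at its last hyphen, so foo-bar-2.1.3 becomes the name foo-bar and the version 2.1.3. The function name `parse_package_ids` and the duplicate check are assumptions for illustration and are not taken from stack's code.

```python
from typing import Dict


def parse_package_ids(packages_field: str) -> Dict[str, str]:
    """Split identifiers such as 'foo-bar-2.1.3' into a name -> version map."""
    result: Dict[str, str] = {}
    for ident in packages_field.split():
        # Package names may contain hyphens, so split at the last one only.
        name, sep, version = ident.rpartition("-")
        if not sep or not name or not version[:1].isdigit():
            raise ValueError(f"not a package identifier: {ident!r}")
        # Assumption: a snapshot pins exactly one version per package.
        if name in result:
            raise ValueError(f"package listed twice: {name!r}")
        result[name] = version
    return result
```

Because the input is split on any whitespace, it does not matter whether the YAML folded scalar arrives as one line or several.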

In stack.yaml, a URL to snapshot.yaml is given. It can be file:, http:, or https:. We store this file locally under ~/.stack upon first download and never re-download it, which ensures immutability of the snapshot.

The name of the package db contains the host, the path, and a short hash of the full URL of the snapshot, so it is both human-readable and uniquely determined by the URL. The same name is used for the local copy of snapshot.yaml.
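
One way such a name could be derived is sketched below in Python. This is a hedged illustration of the idea rather than the actual design: the function name `package_db_name`, the character sanitizing rule, SHA-256, and the 8-character hash length are all assumptions.

```python
import hashlib
import re
from urllib.parse import urlsplit


def package_db_name(url: str) -> str:
    """Hypothetical name: readable host and path, then a short hash of the full URL."""
    parts = urlsplit(url)
    # Keep letters, digits and dots; collapse everything else into single hyphens.
    readable = re.sub(r"[^A-Za-z0-9.]+", "-", parts.netloc + parts.path).strip("-")
    # The hash covers the whole URL, including the scheme, which the readable part omits.
    short_hash = hashlib.sha256(url.encode("utf-8")).hexdigest()[:8]
    return f"{readable}-{short_hash}"
```

For example, `package_db_name("https://example.com/snapshots/roche1.yaml")` yields a name beginning with `example.com-snapshots-roche1.yaml-`, followed by eight hex digits. The readable part alone would be identical for the http: and https: variants of a URL, which is why the hash is taken over the full URL.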

@snoyberg
Contributor

snoyberg commented Jun 7, 2015

Sounds like a good approach to me.

@UnkindPartition UnkindPartition removed their assignment Jun 9, 2015
@snoyberg snoyberg modified the milestones: Later improvements, First stable release (0.1.0.0?) Jun 15, 2015
@3noch
Member

3noch commented Jul 14, 2015

How's this coming?

@UnkindPartition
Contributor Author

Very slowly -- I haven't had time to work on this during the past few weeks, unfortunately.

@snoyberg
Contributor

If I have an opportunity to do so, are you OK if I tackle this?

On Wed, Jul 15, 2015, 12:09 AM Roman Cheplyaka wrote:

Very slowly -- I haven't had time to work on this during the past few weeks, unfortunately.

@UnkindPartition
Contributor Author

That'd be great.

snoyberg added a commit that referenced this issue Jul 15, 2015
@snoyberg snoyberg assigned snoyberg and unassigned UnkindPartition Jul 15, 2015
@snoyberg snoyberg modified the milestones: 0.2.0.0, Later improvements Jul 15, 2015
@snoyberg
Contributor

I've done an initial implementation, available on the 111-custom-snapshot branch. Can you guys have a look and try this out? The syntax ended up slightly different than mentioned above.

@3noch
Member

3noch commented Jul 15, 2015

This looks amazing! I won't be able to test thoroughly until tomorrow. I can't wait to try it out.

I didn't realize how strongly this correlates to supporting GHC variants. Since GHC variants will often have their own version of base, integer library, etc., using a GHC variant will essentially beg the user to build a custom snapshot. This might be a way to properly support integer-simple GHC variants, by hosting snapshots for it and allowing users to select them here. @manny-fp

@3noch
Member

3noch commented Jul 15, 2015

Could the snapshot file specify URLs for GHC bindists on each platform?

@snoyberg
Contributor

That's an interesting idea; we could consider it. I haven't thought it through fully yet, but I'm worried that it might break some invariants around unique names.

On Wed, Jul 15, 2015, 4:02 PM Elliot Cameron wrote:

Could the snapshot file specify URLs for GHC bindists on each platform?

@3noch
Member

3noch commented Jul 16, 2015

What do you mean by "invariants around unique names"?

I suppose to truly "normalize" the union of snapshots and compilers, you might prefer some sort of "linking table" that connects GHCs to their bindists. Then the question becomes, is commercial Haskell's "linking table" extensible/configurable like its snapshots are?

@snoyberg
Contributor

What I mean is this: the benefit we're trying to get from custom snapshots (vs just large extra-deps lists) is reusable builds. (There's also a bit of convenience in an external format, but that's much more easily achieved.) In order to make that work well, we need to have a stable mapping from custom snapshot identifier to set of packages/flags.

@3noch
Member

3noch commented Jul 20, 2015

This is a huge step forward. Thank you. I'll open another issue to document the enhancement we've been discussing.

@UnkindPartition
Contributor Author

Thanks a lot, Michael!
