[RFC] Move dependency solver to library #2768

bgamari · 2015-08-16T16:10:58Z

One of the major challenges when working within cabal-install is the monolithic nature of the codebase. Moreover, since the package provides only an executable component, features like sandbox support and the solver are inaccessible to third-party tooling (a problem which I am current feeling with cabal-ghc-dynflags). This takes the first step towards breaking up cabal-install into a set of reusable libraries by taking the largest largely-independent chunk of code, the constraint solvers, and moving them into a separate library, cabal-install-solver.

Of course, the solvers do have some dependencies on the rest of cabal-install. I place these in a third package, cabal-install-lib.

This consists of two commits,

Split out the solvers themselves (Distribution.Client.Dependency.*) into cabal-install-solver, moving the bare minimum modules necessary to cabal-install-lib. This exclusively involves file renames and cabal file mangling.
Move Distribution.Client.Dependency itself out to cabal-install-solver. This involves bringing over a few more modules to cabal-install-solver and a bit of surgery to remove the policies

Looking forward, I would like to continue to move functionality into more fine-grained libraries. The package index (e.g. PackageIndex currently in cabal-install-lib and perhaps a few others still in cabal-install) is probably the next logical step here (moving it to cabal-install-index). I would especially like to expose some of the sandbox interfaces as sandboxes are currently entirely opaque to third-party tools. This, however, may require substantial effort as much of cabal-install is written specifically as the backend to a command-line application.

More on this later.

This involves doing some surgery to factor out the sandbox bits.

23Skidoo · 2015-08-16T17:14:06Z

/cc @dcoutts, @kosmikus

bgamari · 2015-08-17T07:34:02Z

Up to merge conflicts this should now pass the testsuite. However, I'm going to hold off on addressing the conflicts until there is some positive feedback on the idea and approach.

phadej · 2015-08-17T15:45:20Z

I personally very like the idea! 👍

Other thing worth splitting is a code reading package database from tar file (I guess it's duplicated in a way or another in cabal-db and various other tools).

mietek · 2015-08-18T16:14:14Z

I like the idea. I would like to understand how this PR would bring Cabal closer to OPAM, with the ability to use CUDF-compatible solvers:

/cc @mboes @tomjaguarpaw

mboes · 2015-08-18T22:20:51Z

👍

@mietek this is a first step. The major difference with CUDF framework is that there package managers (such as OPAM) have a textual interface with the solvers, rather than interacting via a library API. So CUDF decouples components more strongly. But it could well be pretty easy to slap a CUDF frontend in front of the existing solvers, now that they are standalone packages.

hvr · 2015-08-18T22:32:28Z

One thing I couldn't figure out from the CUDF primer and skimming through the CUDF paper: Can .cabal specifications actually be fully expressed in the CUDF model? For instance, I haven't seen an obvious way to express cabal flags & conditionals.

erikd · 2015-08-19T02:52:40Z

I'm very much 👍 in this.

bgamari · 2015-08-19T08:31:05Z

Alright, so it sounds as though there is fairly wide support for this idea. It would be great if I could get a more thorough code review. This will be a very difficult patch to keep up-to-date and will be rather annoying for anyone with outstanding patches, so I'd like to ensure that we plan this carefully.

@dcoutts, do you have any objections here?

tomjaguarpaw · 2015-08-19T17:13:24Z

since the package provides only an executable component, features like sandbox support and the solver are inaccessible to third-party tooling

Isn't the correct solution to this to just turn the whole package into a library? One could literally make everything a library apart from the Main module. I think this would actually make a lot of sense.

bgamari · 2015-08-19T17:25:34Z

@tomjaguarpaw personally I would rather see it split up a bit. Currently cabal-install is a bit of a monolith and unnecessarily so. A more modular layout would improve readability substantially IMHO.

mietek · 2015-08-19T17:28:54Z

Seems like turning cabal-install into a library would be a good first step towards further modularisation.

BardurArantsson · 2015-09-05T08:52:56Z

@mietek and @tomjaguarpaw see #1597 for reasons why that would not be a good idea. (But certainly general-enough and disconnected parts could perhaps.)

@bgamari I'll have a go at reviewing the code; I'm a noob round these parts (even more so in the solver), so I'm not sure what it's worth -- perhaps I can spot style issues and such.

ttuegel · 2015-09-15T14:50:32Z

I would prefer if we separated the network-aware components of the solver from the agnostic components and move the latter into the Cabal library itself. I think separating the solver from the transport is an important step toward enabling different package distribution schemes. Also, from a purely intellectual standpoint, I think the solver should care about the contents of the package index, not where it came from. Having the transport-agnostic parts of the solver in Cabal itself would help solve #2777. I still think it's a good idea to separate the transport-sensitive parts of the solver from cabal-install, too.

kosmikus · 2015-10-11T11:59:54Z

Sorry for being late to join this discussion. Could we please make it explicit what exactly it is we hope to achieve by doing this? I'm not opposed to it, but I'm not sure what the goals are. If the goals are clearly stated, it would make it easier for me to evaluate if the currently proposed split is a good way to achieve them. Note that if the goal is mostly to be able to plug in external solvers (such as SMT solvers or CUDF solvers), this can in principle be done already, without separating out the solvers into extra packages. But I think there's more involved here, such as having library interfaces for cabal-install itself and/or the solvers so that functions can be called easily from other tools/libraries. It's here where I'm not quite sure what exactly it is we want or are hoping for.

BardurArantsson · 2015-11-08T07:03:19Z

So, I think I think I'd like to take a stab at this -- if not thing else then for my own edification. However, I want to try do it in a somewhat more... "disciplined" way than this PR. (That is, to not just create a 3rd "commons"-type library which both Cabal and cabal-install depend on.)

At the very least I can hopefully see what some of the difficulties with the "disciplined" approach are and provide a summary here.

There seems to be some disagreement on whether the solver should be moved to its own entirely-separate-from-Cabal library or "just" moved into Cabal-the-library. Would moving it into Cabal be an acceptable first step for the "solver" portion of your use case @bgamari?

@ttuegel If it turns out that it's feasible to separate the solver entirely from Cabal (and to provide an API which would remain long-term stable), would you object to that? I'm thinking of moving only the solver, no transports, file access, or anything like that.

bgamari · 2015-11-08T09:41:43Z

Bardur Arantsson [email protected] writes:

So, I think I think I'd like to take a stab at this -- if not thing else then for my own edification. However, I want to try do it in a somewhat more... "disciplined" way than this PR. (That is, to not just create a 3rd "commons"-type library which both Cabal and cabal-install depend on.)

I'm very glad to hear this. While I would love to see things start move
in this direction, I'm afraid I just don't have the time to perform the
sort of restructuring that a more principled approach might require.
Good luck!

BardurArantsson · 2015-11-09T04:24:02Z

Somewhat leading question for everyone in here: Is anybody actually still using the TopDown solver? Given that it's not part of any stable API, does it make sense to keep it around?

After having fiddled about with the code a bit, here's my tentative plan:

Move the modular solver into the Cabal library with absolutely minimal changes. It will initially remain in an .Internal. module namespace which will be exposed such that cabal-install can use it, but as such will be clearly marked as "unstable API" which does not adhere to the PVP (or any stability guarantees, really.) If anyone starts to use it from non-Cabal packages, then any breakage is on them and will not be considered when moving forward.
The TopDown solver will remain in cabal-install and no effort will be expended on trrying to move it. This is because it has quite a few (rather larger) dependencies on bits of cabal-install; bits which I'm not sure it makes sense to move.
Splitting TopDown from Modular will require splitting a few definitions cabal-install/Distribution/Client/Dependency/Types.hs in some sensible way -- here's where the leading question comes in. It would be much easier if we could just outright remove the TopDown solver. I'm pretty confident the split could be achieved without removing TopDown, but it would make my life a lot easier :).
Once this works and all tests still pass, etc. this becomes a "checkpoint" for merging. This is mainly to avoid stalling other work on the solver in the mean time.
At this point we can start refining the solver's "exposed & stable" API. We can also start thinking about how/if the solver can be separated from Cabal itself and whether that makes sense.

Does this plan make sense to everyone? Objections? Comments?

23Skidoo · 2015-11-09T04:52:37Z

@BardurArantsson

Somewhat leading question for everyone in here: Is anybody actually still using the TopDown solver?

See #2531.

BardurArantsson · 2015-11-09T04:55:44Z

So that would be... no? ;)

23Skidoo · 2015-11-09T05:50:08Z

@BardurArantsson The old solver is deprecated, but the plan is to keep it for a while (1-2 releases, I think).

hvr · 2015-11-09T08:54:29Z

@BardurArantsson re

Move the modular solver into the Cabal library with absolutely minimal changes.

Tbh, I'm not sure this belongs into the Cabal library, but I think this rather belongs in a new package/library, especially if this incurs additional build-depends which we'd have to integrate into GHC's build-system, which is quite a pain that causes problems on several levels. I'd rather see Cabal remain as minimal as possible (in order to satisfy the CABAL specification) with as few external dependencies as possible.

PS: another problem I see is that tying this into Cabal makes cabal-install less flexible if we want to extend/improve/fix the solver inbetween major releases of Cabal. cabal-install is tied to whatever solver is available in the linked against Cabal library.

kosmikus · 2015-11-09T09:24:03Z

Yes, I agree with @hvr. I said a while ago I'd think about whether to include the solvers in the library, and after doing so, I don't think it's a good idea. The Cabal lib is not tied up to the solver, and won't need to be, and putting the solver(s) in there is only going to make changes to the solvers unnecessarily harder. So I'd also go for a new package.

kosmikus · 2015-11-09T09:25:41Z

@BardurArantsson Regarding topdown: I don't think it's needed anymore, as also @23Skidoo says. I would very much like to keep it though. Even though it's mostly inferior to the modular solver, it's always good to have two implementations of an interface, and it's also worthwhile to have two independent solvers in the first place. I sometimes still use it for testing purposes.

BardurArantsson · 2015-11-09T16:26:50Z

@hvr : Right, for some reason it didn't cross my mind that I could just create a separate package which depends on Cabal and which cabal-install then depends on in turn. That's what I'll do. (Btw, the only added dependency I stumbled upon is "mtl".)

@kosmikus I definitely agree that it's better to have two implementations of the same interface in principle, but given how the code looks, I'm not really sure that that's actually what we have at the moment. There's also a non-trivial cost to having the two implementations. In particular the top-down implementation looks as though is quite tied in to internal cabal-install data structures. (Though, keep in mind I haven't spent that much time analyzing the code yet, so I may be mistaken.) I'll see what can be done about preserving the top-down solver as an option in cabal-install, but I am pretty sure that it won't be possible to split it from cabal-install in any sensible way -- it looks like quite a lot of (ultimately pointless since it's already deprecated) work to do so.

BardurArantsson · 2015-11-15T21:04:15Z

(Not sure how much the following summary will make for people not intimately familiar with the solver code, but here it is anyway: :) )

OK, so I've got a basic split-up implemented, but it needs a little bit of cleanup. Basically, it turned out that it wasn't that hard to move the two solvers to a new cabal-solver package which goes "between" Cabal and cabal-install. The biggest sticking point remaining right now is that the new cabal-solver package depends on network because the PackageLocation data type transitively contains a URI. Since no details of the PackageLocation type are actually used in the solvers, I'm pretty confident that the dependency can be broken by adding a new type parameter to SourcePackage. Thus it should be possible to keep PackageLocation in cabal-install -- I just haven't done the work quite yet.

I'll probably open an RFC Pull Request sometime during the coming week.

phadej · 2015-11-16T13:26:06Z

@BardurArantsson sounds great! 👍

BardurArantsson · 2015-11-17T06:33:56Z

See #2928.

ezyang · 2015-12-23T07:13:59Z

The discussion here seems to have glossed over a point @ttuegel, which is that currently the Cabal library has a very dumb solver for dependencies and flags which occasionally can use up a lot of memory (#2777).

It seems to me that if cabal-installs is having its solver factored out, then so should Cabal. Specifically, whatever interface the separated modular solver ends up having, Cabal's solver should also be factored to follow the same interface. Though as a type this, I'm not sure how possible this is, since Cabal's solver is really dumb (it does not even attempt to make sure that the dependencies are globally consistent; it just checks it afterwards.) It is too bad we cannot just retire the solver!

BardurArantsson · 2016-01-17T08:25:45Z

It seems to me that if cabal-installs is having its solver factored out, then so should Cabal.

How does this follow? cabal-install is (naturally) higher up in the software stack.

It is too bad we cannot just retire the solver!

I'm guessing you're thinking of the relatively hard requirement that "./Setup configure" should work? I could imagine splitting Cabal further such that the really core bits (reading the cabal files, etc.) were split up, but it's more work than I'm will to go through, at least. (Given the compatibility requirements it'd probably be a multi-year undertaking, I'm guessing. Perhaps the hassle could be alleviated by making Cabal into an "umbrella" project for compatibility, but still...) Would be nice, though...

hvr · 2016-01-17T08:50:05Z

@BardurArantsson Taking a step back and considering this from a meta-perspective: Sadly there's also an associated cost with splitting up Cabal into smaller pieces. Even though I'd like myself to be able to tap directly into the solver as a library conveniently, there needs to be a significant benefit for the Cabal project to justify the incurred cost of management overhead of managing and coordinating multiple interdependent packages, including diligent PVP conformance.

Then there's the decision to make whether to keep all sub-packages in a single Git repo (Git branching and tagging becomes confusing and unnatural, cabal get -s doesn't properly support this) or split up into different Git repos (Git repo inter-dependencies, Git submodule mess). And also how the issue-tracker is affected by this.

At this point I'd argue that we desperately need all resources to move forward Cabal+cabal-install development, and consequently postpone such a package refactoring to a time when the big features that are in the pipeline right now are closer to completeness, and we also have a better picture how the various moving parts of the finished thing depend on each other.

BardurArantsson · 2016-01-17T09:01:15Z

@hvr What I'm trying to do is actually the much simpler version (which I think you actually suggested?) where we have Cabal -> cabal-solver -> cabal-install as a simple dependency chain and all in the same repo. That will at least allow others to depend on just Cabal + cabal-solver. It is the 80% solution that I personally can live with and which I think adds a lot of value -- enough to overcome the downsides. Besides, it's just good software development practice to separate modules as strictly as possible. (See #2928 for a little more detail on what I've done to this end. It needs a refresh, but I'm waiting until the solver fixes/updates from @grayjay hopefully make it in at some point.)

(What I wrote in the previous comment is merely "wishlist", it's not something I'm actively pursuing.)

EDIT: The higher-level "end game" here is to get to a place where we can experiment with completely new front ends which aren't cabal-install relatively easily. At least that's my ultimate goal -- AFAICT the code base of cabal-install has become so crufty over time that getting any significant new functionality (and removing old) is nigh impossible unless it's in some niche/specialized area. (That and people have actually started to rely on the command line interface, partly because of the solver, etc.) But let's take things one step at a time...

At this point I'd argue that we desperately need all resources to move forward Cabal+cabal-install development, and consequently postpone such a package refactoring to a time when the big features that are in the pipeline right now are closer to completeness, and we also have a better picture how the various moving parts of the finished thing depend on each other.

Unless you're going to just actively turn away contributors then this is what's called "herding cats". Welcome to project management! :). I don't know if anybody around here is getting paid to do Cabal/cabal-install development, but I'm certainly not... which means I'm going to work primarily work on things that interest me... or which impede my progress on things that interest me (see e.g. the new test framework for cabal-install).

If the conclusion of "the management" is that no separation of the solver will be accepted, then I'll just find a more productive (and, frankly, fun) use of my time. (I'm not suggesting that this is about to actually happen right now, but if the decision on my approach is "no", then that's what's probably going to happen.)

hvr · 2016-01-17T09:32:21Z

@BardurArantsson

(btw, I'm not the project lead and I'm just yet another of those unpaid contributors voicing his opinion)

Don't get me wrong, I still think that separating out the solver is a good thing long-term, I'm just arguing that maybe now is not the best time to do that just yet until cabal has stabilized again, unless it's apparent that factoring out the solver is orthogonal and won't hinder progress in the big new features.

Btw, do you have a ticket/wiki-page for the new test framework you're working on? I wasn't aware of that

BardurArantsson · 2016-01-17T09:37:30Z

Sure, no problem with having opinions :). I'm also just providing counterpoints.

EDIT: This problem with not doing long-term work (even when said long-term work may impede short-term work temporarily) is that it leaves completely stuck in a local minimum with no way to get out. Progress on big new features is already slow enough (because the cabal-install code base is so crufty) that a little bit of extra slowness won't matter. That said, I believe that my proposal is basically orthogonal to everything other than the solver. The first step at least is a pretty simple refactor that basically just adds a type parameter here and there.

Re: test framework: Already merged a few months ago. It was most of my work was in #2864 and @phadej took it from there: #2868

grayjay · 2016-01-17T22:48:50Z

@BardurArantsson I don't want to hold up this change, so I'll just resolve any conflicts with my branch.

ezyang · 2016-01-18T00:40:50Z

(The conversation has moved on, but I mentioned Cabal the library because there is a solver, it's kind of a delicate and not very performant piece of code (#2777), and it is fully subsumed by a better solver, like the one cabal-install has. I just think it's something to keep in mind, let's do small refactors when we can!)

BardurArantsson · 2016-01-18T06:26:00Z

@grayjay Thank you, but I don't think there's necessarily the necessary consensus that my approach is actually a good (enough) idea yet... though, given your offer I may try to actually redo my RFC PR just so people have something concrete to look at. We'll see. Thanks regardless.

...

All: Maybe this is all getting too messy at this point what with two RFC pull requests and descriptions scattered over tens of comments, etc. I'll try to make a new GH Issue for this where I try to lay out my suggestion as clearly as I possibly can in the issue description and we can try to take it from here. I'll try to get to that over the next few days.

Once that's done, I'll also close the two pending RFC PRs for this (mine and this one) and possibly related issues. @bgamari From what you've said so far, I take it you have no objection to this, right?

BardurArantsson · 2016-01-29T19:10:49Z

(Assigned to self just so that I remember once I get a breather from work duties.)

BardurArantsson · 2016-03-12T18:52:48Z

This is continued in #3222.

@bgamari : If you think this should remain open, please just let me know and I'll reopen. (I'm not sure if you can re-open, but if so, feel free to do it yourself if it's relevant.)

bgamari force-pushed the split-up branch 3 times, most recently from 33a5311 to 2ddd877 Compare August 16, 2015 16:52

bgamari added 3 commits August 16, 2015 18:53

Move dependency solver to library

426e7b1

Move Distribution.Client.Dependency to cabal-install-solver

18c5fe9

This involves doing some surgery to factor out the sandbox bits.

bootstrap: Add support for cabal-install-* packages

30b1cf6

Fix cabal bounds

a824564

bgamari force-pushed the split-up branch from 2ddd877 to a824564 Compare August 16, 2015 17:55

bgamari added 2 commits August 17, 2015 09:20

cabal-install: Add missing imports to unit tests

ade3d4c

Move Modular.PSQ test to cabal-install-solver

ec38e9a

bgamari force-pushed the split-up branch from 568698a to ec38e9a Compare August 17, 2015 07:21

bgamari mentioned this pull request Aug 18, 2015

Request: 'cabal freeze --dry-run --no-installed-constraint' #2178

Closed

bgamari mentioned this pull request Sep 5, 2015

Cabal-install as a library #1597

Closed

RichiH mentioned this pull request Sep 15, 2015

./Setup configure eats memory, cabal configure does not #2777

Closed

ezyang added Cabal: other cabal-install: other labels Sep 24, 2015

BardurArantsson mentioned this pull request Nov 17, 2015

[RFC] Split modular solver into separate package #2928

Closed

ezyang added the blocked: action-required label Dec 23, 2015

BardurArantsson self-assigned this Jan 29, 2016

23Skidoo added the post-1.24 label Jan 29, 2016

23Skidoo mentioned this pull request Feb 18, 2016

No ability to use --allow-newer when configuring with Setup.hs #3163

Closed

BardurArantsson closed this Mar 12, 2016

[RFC] Move dependency solver to library #2768

[RFC] Move dependency solver to library #2768

Conversation

bgamari commented Aug 16, 2015

23Skidoo commented Aug 16, 2015

bgamari commented Aug 17, 2015

phadej commented Aug 17, 2015

mietek commented Aug 18, 2015

mboes commented Aug 18, 2015

hvr commented Aug 18, 2015

erikd commented Aug 19, 2015

bgamari commented Aug 19, 2015

tomjaguarpaw commented Aug 19, 2015

bgamari commented Aug 19, 2015

mietek commented Aug 19, 2015

BardurArantsson commented Sep 5, 2015

ttuegel commented Sep 15, 2015

kosmikus commented Oct 11, 2015

BardurArantsson commented Nov 8, 2015

bgamari commented Nov 8, 2015

BardurArantsson commented Nov 9, 2015

23Skidoo commented Nov 9, 2015

BardurArantsson commented Nov 9, 2015

23Skidoo commented Nov 9, 2015

hvr commented Nov 9, 2015

kosmikus commented Nov 9, 2015

kosmikus commented Nov 9, 2015

BardurArantsson commented Nov 9, 2015

BardurArantsson commented Nov 15, 2015

phadej commented Nov 16, 2015

BardurArantsson commented Nov 17, 2015

ezyang commented Dec 23, 2015

BardurArantsson commented Jan 17, 2016

hvr commented Jan 17, 2016

BardurArantsson commented Jan 17, 2016

hvr commented Jan 17, 2016

BardurArantsson commented Jan 17, 2016

grayjay commented Jan 17, 2016

ezyang commented Jan 18, 2016

BardurArantsson commented Jan 18, 2016

BardurArantsson commented Jan 29, 2016

BardurArantsson commented Mar 12, 2016