Make the modular solver the default always, and deprecate the top-down solver #2531

edsko · 2015-04-09T16:58:34Z

This PR is a single commit. It looks big because it depends on all the previous PRs.

This PR makes the modular solver the default solver even for GHC < 7, because we can now deal with the base shim package (PR #2530). To make sure that this is a sensible thing to do, I ran both the top-down solver and the modular solver against every package on Hackage using GHC 6.12.3. Here are some statistics:

7936 packages on Hackage
Out of those 7936, the topdown solver found solutions for 4545 packages.
Out of those 4545, the modular solver gave precisely the same solution for 2893 packages, a different solution for 1485 packages, and failed to give a solution (with default flags) for 166 packages.
Conversely, the modular solver found solutions for 4954 packages. This means that there are (4954-4545+166=) 575 packages for which the modular solver found a solution where the topdown solver did not.
The top-down solver failed to terminate (at least, failed to terminate within half an hour) on two packages: leaky and seqaid.
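
The arithmetic behind the 575 figure can be spelled out as a small sketch. The numbers are the ones quoted above; the names are made up for illustration.

```haskell
-- Counts quoted in the description above (GHC 6.12.3, all of Hackage).
-- The identifiers are illustrative only.
topDownSolved, modularSolved, modularFailedOnTopDown :: Int
topDownSolved          = 4545  -- packages the top-down solver found a plan for
modularSolved          = 4954  -- packages the modular solver found a plan for
modularFailedOnTopDown = 166   -- solved by top-down, but not by modular

-- Packages for which both solvers found a plan.
solvedByBoth :: Int
solvedByBoth = topDownSolved - modularFailedOnTopDown

-- Packages for which only the modular solver found a plan.
modularOnly :: Int
modularOnly = modularSolved - solvedByBoth
```

Evaluating `modularOnly` in GHCi gives 575, matching the figure in the description.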

Performance-wise, the top-down solver usually took around 11 seconds to solve these goals (and rarely faster than that), while the modular solver was an order of magnitude faster:

I took a look at the cases where the modular solver failed to find a solution. Adding --reorder-goals did not help very much: it only solved an additional 28 cases, leaving 138 packages where the top-down solver found a solution but the modular solver did not. This also slowed down the modular solver quite a bit:

(Time in seconds is on the x-axis on all these plots.) I tried running with --max-backjumps=-1 and a timeout of 5 minutes, but on the packages that I tried this still did not enable the modular solver to find a solution within the allotted time.

The conclusion from all this is, in my opinion, that it's okay to switch the default now: we find solutions for more packages than we did before, and find them much faster. I guess it's not yet okay to remove the top-down solver completely, because of the remaining 138 packages, but we can deprecate the top-down solver and see who complains (which is what this PR does). Meanwhile, we could look at these failing packages to see if we can add any heuristics to the modular solver that would enable it to find solutions. For completeness' sake, I added these failures to the Wiki.

It turns out not to be the right solution for general private dependencies and is just complicated. However, we keep qualified goals, in a much simpler form: dependencies now simply inherit the qualification of their parent goal. This gets us closer to the intended behaviour for the --independent-goals feature, and for the simpler case of private dependencies for setup scripts. When not using --independent-goals, the solver behaves exactly as before (tested by comparing solver logs for a hard Hackage goal). When using --independent-goals, every dependency of each independent goal is now qualified, so the dependencies are solved completely independently (which is actually still too much).
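
As a rough illustration of "dependencies inherit the qualification of their parent goal", here is a minimal sketch. The types and the signature of `qualifyDeps` are simplified assumptions for this example, not the solver's actual definitions.

```haskell
-- Hypothetical, simplified types; the solver's real definitions differ.
type PackageName = String

data Qualifier
  = Unqualified      -- ordinary goal
  | Independent Int  -- goal belonging to the n-th independent top-level goal
  deriving (Eq, Show)

data Goal = Goal Qualifier PackageName
  deriving (Eq, Show)

-- Every dependency inherits the qualifier of the goal that introduced it.
qualifyDeps :: Goal -> [PackageName] -> [Goal]
qualifyDeps (Goal q _) deps = [ Goal q dep | dep <- deps ]
```

Under this sketch, two independent top-level goals that both depend on the same package produce two differently qualified goals for it, which is why they end up being solved separately.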

In particular, in the definition of dependencyInconsistencies. One slightly annoying thing is that in order to validate an install plan, we need to know if the goals are to be considered independent. This means we need to pass an additional Bool to a few functions; to limit the number of functions where this is necessary, we also record whether or not goals are independent as part of the InstallPlan itself.

I don't know why we constructed this graph manually here rather than calling `graphFromEdges`; it doesn't really matter, except that we will want to change the structure of this graph somewhat once we have more fine-grained dependencies, and then the manual construction becomes a bit more painful. It is easier to use the standard construction.
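
For readers unfamiliar with `graphFromEdges` from `Data.Graph` (in the `containers` package), a small sketch of the standard construction follows. The package names are invented for the example and have nothing to do with the actual code being changed.

```haskell
import Data.Graph (Graph, Vertex, graphFromEdges, topSort)

-- Toy dependency list: (package, the packages it depends on).
-- The package names are made up for illustration.
deps :: [(String, [String])]
deps =
  [ ("app",  ["lib", "base"])
  , ("lib",  ["base"])
  , ("base", [])
  ]

-- graphFromEdges takes (node, key, [key]) triples; here node and key coincide.
depGraph :: (Graph, Vertex -> (String, String, [String]), String -> Maybe Vertex)
depGraph = graphFromEdges [ (pkg, pkg, ds) | (pkg, ds) <- deps ]

-- Packages in topological order: each package appears before its dependencies.
buildOrder :: [String]
buildOrder = [ pkg | v <- topSort g, let (pkg, _, _) = fromVertex v ]
  where
    (g, fromVertex, _) = depGraph
```

The second and third components of the result map between vertices and the original nodes, which is the bookkeeping a manual construction has to reimplement.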

The modular solver has its own representation for a package (PInfo). In this commit we modify PInfo to keep track of the different kinds of dependencies. This is a bit intricate because the solver also regards top-level goals as dependencies, but of course those dependencies are not part of any 'component' as such, unlike "real" dependencies. We model this by adding a type parameter to FlaggedDeps and co which indicates whether or not we have component information; crucially, underneath flag choices we _always_ have component information available. Consequently, the modular solver itself will not make use of the ComponentDeps datatype (it only uses the Component type, which classifies components); we will use ComponentDeps when we translate the results of the modular solver into cabal-install's main datatypes. We don't yet _return_ fine-grained dependencies from the solver; this will be the subject of the next commit.

In this commit we modify the _output_ of the modular solver (CP, the modular solver's internal version of ConfiguredPackage) to have fine-grained dependencies. This doesn't yet modify the rest of cabal-install, so once we translate from CP to ConfiguredPackage we still lose the distinctions between different kinds of dependencies; this will be the topic of the next commit. In the modular solver (and elsewhere) we use Data.Graph to represent the dependency graph (and the reverse dependency graph). However, now that we have more fine-grained dependencies, we really want an _edge-labeled_ graph, which unfortunately is not available in the `containers` package. Therefore I've written a very simple wrapper around Data.Graph that supports edge labels; we don't need many fancy graph algorithms, and can still use the Data.Graph functions on these edge-labeled graphs when we want (by calling them on the underlying unlabeled graph), so adding a dependency on `fgl` does not seem worth it.
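
One way such a thin wrapper might look is sketched below. This is an assumption-laden illustration, not the wrapper from this commit: the names `LabeledGraph`, `buildLabeled` and `edgeLabel` are invented here.

```haskell
import qualified Data.Graph as G
import qualified Data.Map as M

-- Hypothetical wrapper: an ordinary Data.Graph plus a map from edges to labels.
data LabeledGraph e = LabeledGraph
  { underlying :: G.Graph          -- plain graph, usable with Data.Graph
  , edgeLabels :: M.Map G.Edge e   -- label for each (from, to) edge
  }

-- Build a labeled graph from vertex bounds and a list of labeled edges.
buildLabeled :: G.Bounds -> [(G.Vertex, G.Vertex, e)] -> LabeledGraph e
buildLabeled bounds es = LabeledGraph
  { underlying = G.buildG bounds [ (from, to) | (from, to, _) <- es ]
  , edgeLabels = M.fromList [ ((from, to), l) | (from, to, l) <- es ]
  }

-- Look up the label on an edge, if that edge exists.
edgeLabel :: LabeledGraph e -> G.Vertex -> G.Vertex -> Maybe e
edgeLabel g from to = M.lookup (from, to) (edgeLabels g)
```

Existing algorithms keep working through the unlabeled graph, for example `G.reachable (underlying g) 0`, while `edgeLabel` recovers what kind of dependency an edge represents.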

The crucial change in this commit is the change to PackageFixedDeps to return a ComponentDeps structure, rather than a flat list of dependencies, along with corresponding changes in ConfiguredPackage and ReadyPackage to accommodate this. We don't actually take _advantage_ of these more fine-grained dependencies yet (any use of depends is now a use of CD.flatDeps . depends), but we will :) Note that I have not updated the top-down solver, so in the output of the top-down solver we cheat and pretend that all dependencies are library dependencies.

Although we don't use the new setup dependency component anywhere yet, I've replaced all uses of CD.flatDeps with CD.nonSetupDeps. This means that when we do introduce the setup dependencies, all code in Cabal will still use all dependencies except the setup dependencies, just like now. In other words, using the setup dependencies in some places would be a conscious decision; the default is that we leave the behaviour unchanged.
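
To make the difference between the two accessors concrete, here is a simplified sketch. The representation (a map keyed by component) and the constructor names are assumptions made for this example; only the names `ComponentDeps`, `Component`, `flatDeps` and `nonSetupDeps` come from the commit messages.

```haskell
import qualified Data.Map as M

-- Simplified sketch; the constructor names here are illustrative.
data Component
  = ComponentLib
  | ComponentExe String
  | ComponentTest String
  | ComponentBench String
  | ComponentSetup
  deriving (Eq, Ord, Show)

-- Dependencies grouped by the component that needs them.
newtype ComponentDeps a = ComponentDeps (M.Map Component a)

-- All dependencies, ignoring which component they belong to.
flatDeps :: Monoid a => ComponentDeps a -> a
flatDeps (ComponentDeps m) = mconcat (M.elems m)

-- All dependencies except those of the setup script.
nonSetupDeps :: Monoid a => ComponentDeps a -> a
nonSetupDeps (ComponentDeps m) = mconcat (M.elems (M.delete ComponentSetup m))
```

As long as no package has setup dependencies, the two functions return the same result, which is why swapping one for the other leaves current behaviour unchanged.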

The only problematic thing is that when we call `cabal clean` or `cabal haddock` (and possibly others) _without_ first having called `configure`, we attempt to build the setup script without calling the solver at all. This means that if you run, say, `cabal configure`, `cabal clean`, `cabal clean` (in that order) for a package with a custom setup script that really needs setup dependencies (for instance, because there are two versions of Cabal in the global package DB and the setup script needs the _older_ one), then the first call to `clean` will succeed, but the second call will fail, because we will try to build the setup script without the solver.

23Skidoo · 2015-04-16T12:21:22Z

@edsko

So we're waiting for @kosmikus to review this, since solver is his domain. Meanwhile, maybe you could take a look/leave a comment at #1575 - @kosmikus mentioned that your work on independent goals provides us with a way forward on that issue.

kosmikus · 2015-04-16T12:27:31Z

Sorry for taking so long ...

23Skidoo · 2015-04-23T23:14:13Z

For some reason GitHub didn't notify me about review comments that @kosmikus made here. /cc @edsko in case he also missed them.

dcoutts · 2015-05-21T21:30:13Z

So I'm happy switching now that the new solver works for 6.12 and older. And of course it'll help if ghc ships with multiple base versions again in future.

dcoutts and others added 30 commits March 27, 2015 15:38

Add union operation to PSQ

2085511

Prefer base no matter the qualifier

6b7fe10

Make PP (PackagePath) structured type

3a1f1f2

Introduce POption

66f2b23

POption annotates a package choice with a "linked to" field. This commit just introduces the datatype and deals with the immediate fallout; it doesn't actually use the field for anything.
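
A minimal sketch of what a package choice with a "linked to" field could look like. The `Instance` and `PackagePath` synonyms are stand-ins invented for this example, and `isLinked` is a hypothetical helper.

```haskell
-- Hypothetical stand-ins for the solver's instance and package-path types.
type Instance    = String
type PackagePath = String

-- A package choice, optionally annotated with the goal it is linked to.
data POption = POption Instance (Maybe PackagePath)
  deriving (Eq, Show)

-- True if this choice reuses the decision made for another goal.
isLinked :: POption -> Bool
isLinked (POption _ linkedTo) = linkedTo /= Nothing
```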

Add single instance restriction

6b85cdc

Prefer to link when possible

ce955ec

Actually add link nodes

7e192b2

This is implemented as a separate pass so that it can be understood independently of the rest of the solver.

Link validation

ae377ae

Unit tests for the solver

1885fb8

Since we didn't really have a unit test setup for the solver yet, this introduces some basic tests for the solver, as well as tests for independent goals specifically.

Add Modular.Linking to other-modules

c178ef7

Compatibility for 7.4 and 7.8

ff89079

This addresses @23Skidoo's comment #2500 (comment)

Code layout

c2c73da

This commit does nothing but rearrange the Modular.Dependency module into a number of separate sections, so that it's a bit clearer to see what's what. No actual code changes here whatsoever.

Introduce ComponentDeps

6019667

The ComponentDeps datatype will give us fine-grained information about the dependencies of a package's components. This commit just introduces the datatype; we don't use it anywhere yet.

Allow for dups in configuredPackageProblems

f88c9b6

Extend .cabal format with a custom-setup section

1cfec90

This patch adds it to the package description types and to the parser. There is a new custom setup section which contains the setup script's dependencies. It also adds some sanity checks.

Add setup dependenices to modular solver's input

e6a88ea

(and, therefore, also to the modular solver's output)

Treat setup dependencies as independent (always)

d78cfec

Add "defer setup choices" heuristic.

afeb48f

By choosing setup dependencies after regular dependencies we get more opportunities for linking setup dependencies against regular dependencies.
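
The idea of deferring setup choices can be sketched as a stable reordering of goals. This is only an illustration under simplified assumptions: the `Goal` record and `deferSetupChoices` are invented here, and the real heuristic operates on the solver's search tree rather than on a flat list.

```haskell
import Data.List (partition)

-- Hypothetical goal type: a package name plus whether it is a setup dependency.
data Goal = Goal { goalName :: String, isSetupGoal :: Bool }
  deriving (Eq, Show)

-- Move setup goals to the back, keeping relative order otherwise, so regular
-- dependencies are chosen first and setup dependencies can link to them.
deferSetupChoices :: [Goal] -> [Goal]
deferSetupChoices goals = regular ++ setup
  where
    (setup, regular) = partition isSetupGoal goals
```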

Take setup deps into account in plan validation

e733f53

Unit tests for setup dependencies

a721fbf

Abstract out qualification of goals

21b6b2b

This happened independently in a number of places, which was bad; and was about to get worse with the base 3/4 thing.

Better implementation of qualifyDeps

e8cf0ac

Never consider flag choices as independent from their package.

edsko added 4 commits April 7, 2015 15:53

Treat base special in goal qualification

1ce1307

Only qualify base if a base shim is present

390f837

Unit tests for dealing with base shims

72e2ea1

Make the modular solver the default always

17bba08

(previously the default was the topdown solver for GHC < 7). Also adds a deprecation warning when the topdown solver is selected.
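
The resulting behaviour can be summarised in a small sketch: the compiler version no longer plays a role, and only an explicit request for the top-down solver triggers a warning. The types, the pure signature and the warning text are assumptions made for this example and may not match cabal-install's code.

```haskell
-- Hypothetical types; cabal-install's own definitions may differ.
data PreSolver = AlwaysTopDown | AlwaysModular | Choose
  deriving (Eq, Show)

data Solver = TopDown | Modular
  deriving (Eq, Show)

-- Resolve the user's preference to a solver plus an optional warning.
-- The warning text is a placeholder, not the message cabal-install prints.
chooseSolver :: PreSolver -> (Solver, Maybe String)
chooseSolver AlwaysTopDown = (TopDown, Just "the top-down solver is deprecated")
chooseSolver AlwaysModular = (Modular, Nothing)
chooseSolver Choose        = (Modular, Nothing)
```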

23Skidoo mentioned this pull request Apr 16, 2015

Handle apparent package dependency cycles due to test suites / benchmarks #1575

Open

dcoutts added a commit that referenced this pull request May 21, 2015

Merge pull request #2531 from edsko/pr/switch-default

4083456

Make the modular solver the default always, and deprecate the top-down solver

dcoutts merged commit 4083456 into haskell:master May 21, 2015

edsko deleted the pr/switch-default branch June 1, 2015 13:12

edsko mentioned this pull request Jun 1, 2015

Address remaining solver comments #2635

Merged

23Skidoo mentioned this pull request Nov 9, 2015

[RFC] Move dependency solver to library #2768

Closed

dcoutts mentioned this pull request Apr 21, 2016

RFC: Remove top-down solver (v1) #3364

Closed
