Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solver loops when there are cycles through setup dependencies #3160

Closed
edsko opened this issue Feb 18, 2016 · 9 comments
Closed

Solver loops when there are cycles through setup dependencies #3160

edsko opened this issue Feb 18, 2016 · 9 comments

Comments

@edsko
Copy link
Contributor

edsko commented Feb 18, 2016

Currently the solver does not try to do anything clever when there are cyclic dependencies in the package dependency graph. For example, if A depends on B and B depends on A, and we solve for A, the solver will simply return that we need to install A and B; cabal will then subsequently fail with an internal error about in invalid install plan. Not great, but not a major problem either.

However, the problem gets worse when we have setup dependencies (or other kinds of qualified dependencies). Suppose package C has a setup dependency on D, and D depends (normally) on C, and we solve for C. Now the solver starts adding an infinite number goals:

      C
      C-setup.D
(*)   C-setup.C
      C-setup.C-setup.D
      C-setup.C-setup.C
      C-setup.C-setup.C-setup.D
      ...

and so calling, say, cabal new-build will not terminate.

This might seem like a contrived situation, but it is not. To see how this might arise in practice, consider:

  • unbounded-delays has a setup dependency on Cabal (at least, it does when using the new-build command; unbounded-delays has a custom setup script without an explicit declaration of setup dependencies, so new-build adds an implicit setup dependency on Cabal)
  • Cabal, with the flag +tests enabled, has a regular dependency on tasty
  • tasty has a regular dependency on, you guessed it, unbounded-delays

Of course, the solver should instead pick Cabal -tests, but it it picks + tests first, it will go down an infinite path and never return; it won't notice that this solution cannot lead anywhere.

I'm working on a solution to this problem, and will hopefully submit a PR, but submitting the issue now so that you are aware that there is a problem.


By the by: The line marked (*) looks odd, but it's morally correct; compare to the goals that we would get if D instead relied on E, which would have a setup dependency on F:

       C
       C-setup.D
(**)   C-setup.E
(***)  C-setup.E-setup.F

Line (**) means that the version we pick for E when building C's setup script is independent of the version we pick elsewhere in the solution, which is correct. Similarly, line (***) says that the version of F that we pick to build E's setup script, for the version of E that we we use to build C's setup script, is independent from all other uses of F.

@edsko
Copy link
Contributor Author

edsko commented Feb 18, 2016

Qualified Goals Tradeoffs

Qualified goals were introduced to increase the solution space. The original motivation was explicit setup dependencies. If we have a package DB containing

C-1.0 -> D-*
C-1.0 -> Cabal-2.0
D-1.0 -setup-> Cabal-1.0

then we want to be able install C-1.0, even though it means that we need two versions of Cabal: version 1.0 to build the setup script for D-1.0, and then version 2.0 to link C-1.0 against.

When we solve for C, the solver introduces a goal C. It then picks a version (in this example, only one choice) and introduces goals for the dependencies of that version: here D and Cabal. The same happens when the solver subsequently picks version 1.0 for D. The introduction of qualified goals means that the solver will introduce the goal D-setup.Cabal. This means that the solver can pick one version of Cabal for the top-level goal Cabal and a different version for the goal D-setup.Cabal.

Current approach

The general idea is that within a certain namespace (top-level, or the name space determined by a particular series of qualifiers) we can only install a single version/instance of a package. Introducing more qualifiers means decoupling package choices, thus allowing for more solutions. Asking for versions C-1.0 and C-1.1 to be installed at top-level will not work because they will share the goal C, but if we tell the solver that these are independent goals, we qualify them as "0.C-1.0" and "1.C-1.0" (where 0 and 1 are just prefixes) and decouple the two package choices.

Currently if we are solving for a particular package P with a bunch of qualifiers qq1, qq2, etc., (i.e, in the name space qq1.qq2.*), then if we find that P has setup dependencies we add an additional qualifier to record that this choice is independent from the current namespace (introducing the name namespace qq1.qq2.P-setup.*).

Enlarging the solution space somewhat

Currently we introduce qualifiers only for setup dependencies (and when the user indicated that the top-levels goals are independent, although we do not expose this at the command line at the moment). However, as discussed in #1575 (comment), we might also want to introduce a qualifier for the dependencies of test suites (at least those dependencies that are not shared with the library itself). Consider:

optparse-applicative-1.0 -test-> tasty-*
optparse-applicative-1.1 -test-> tasty-*
tasty-1.0 -> optparse-applicative-.*

Right now this situation is impossible; the solver will not be able to find a solution. However, we could introduce qualified goals for dependencies of test suites, just like we do for setup dependencies. Then when we solve for optparse-applicative+tests, we would end up with goals optparse-applicative,
optparse-applicative-test.tasty, and optparse-applicative-test.optparse-applicative; we can then pick a different version of optparse-applicative (without tests, presumably) for the qualified goal optparse-applicative-test.optparse-applicative than we do for the top-level optparse-applicative goal.

This illustrates that a goal of the form P-qq.P might in general be useful.

Remark: self-dependencies for setup scripts

Although it's probably not very important, if we have

C-1.0 (no dependencies)
C-1.1 -setup-> C-*

we also end up with a goal of the same form C-setup.C, and we can indeed build this in the same way; build the setup script of C-1.1 by compiling it against version 1.0 of C.

Problem in the solver

If we solved for C in the example from the previous section, the solver would actually start creating an infinite series of goals to solve for; C, C-setup.C, C-setup.C-setup.C, ad infinitum. It's not so easy to fix this in a principled, general way. After all, how many goals should it generate? As many as there are versions of C? That hard-codes the single instance restriction right into the building of the solution tree; what happens when we want to lift that restriction (which, ultimately, we want to do to solve Cabal Hell once and for all)? In general the maximum length of this chain depends on how many different ways we can satisfy the constraint, which is a highly non-trivial property of the package database.

Reducing the solution space

Hence, we need to approximate and artificially restrict the solution space.

My suggestion is to remove the notion of a package path, and just allow for a single qualifier. This still allows for setup dependencies to be satisfied independently from top-level dependencies, and even means the setup dependencies of different packages are independent from each other. It also still allows the optparse-applicative -test-> tasty -> optparse-applicative cycle discussed above.

Thus, where we currently would create a goal C-setup.D-setup.E, we instead no longer inherit the C-setup qualifier and just use the goal D-setup.E instead, artificially coupling it to the version of E that we use for the setup dependency of the top-level goal D, if it exists (this matters only if the top-level version of D and the version of D that we pick when building the setup script for C have different requirements on E).

In reality, I think the impact of restricting qualifiers to a single package name and component, rather than a series of them (a package path), is very small. Most setup dependencies will be simple, so the situation where we want different versions for the setup dependencies of setup dependencies than we want for those setup dependencies [sic] seems somewhat far-fetched. For test suites it is even less relevant because we generally want to build test suites for top-level goals only anyway, and we don't want to build the test suites of the dependencies of our test suites.

We can consider other heuristics (allow package paths up to a certain length, package paths where qualifiers cannot occur more than once, package paths without infinitely repeating subpaths [how would we even detect that?]), but I don't increased expressive power is worth it, and the more complicated the heuristic the more difficult it would be understand why the solver cannot find a solution
in particular circumstances.

Moreover, any such heuristic (at least the ones I just mentioned) as really just as arbitrary as restricting paths to be of length 1; you can always find an example that would be solvable without the heuristic but no longer solvable with the heuristic. Just the longer the paths are that you allow, the more deeply nested the constraints will need to be before we fail to find a solution. In a way, the introduction of qualified goals changed the maximum paths length from 0 to be infinitely long; I'm suggesting that if we pick a finite path length, 1 will probably suffice.

So my vote is to allow for a single qualifier, unless there are very clear reasons (example use cases) not to.

@23Skidoo
Copy link
Member

/cc @kosmikus

@edsko
Copy link
Contributor Author

edsko commented Feb 18, 2016

Actually, I just realized my comment above isn't quite right. Bear with me.

@edsko
Copy link
Contributor Author

edsko commented Feb 18, 2016

Ok, comment updated. Section on existing limitations was incorrect. My conclusion still stands however.

@23Skidoo
Copy link
Member

Related: #1575. One other example where this occurs in the wild is the containers test suite.

@edsko
Copy link
Contributor Author

edsko commented Mar 19, 2016

Note that this is addressed by #3220 .

@grayjay
Copy link
Collaborator

grayjay commented Apr 23, 2016

Should this be closed now that #3220 is merged?

@edsko
Copy link
Contributor Author

edsko commented Apr 23, 2016

Yes, it should. @kosmikus I suspect this is the cycle issue that @hvr was talking about (instead of the test suite dependencies stuff). But I could be wrong.

@edsko edsko closed this as completed Apr 23, 2016
@kosmikus
Copy link
Contributor

@edsko Yes, indeed. This sounds like the issue @dcoutts and I looked at yesterday.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants