-
Notifications
You must be signed in to change notification settings - Fork 696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rerun dependency solver to generate a better error message (issue #4823). #5012
Conversation
…kell#4823). This commit changes the way that the solver generates the summarized log that it displays at normal verbosity. Previously, the solver saved the full log from the start to the first backjump. Then it filtered the log using the conflict set from the node where the first backjump occurred, i.e., it removed all lines from the log that did not relate to variables in the conflict set. The solver also printed the final conflict set at the end of the log. This approach had several problems: 1. It was possible for the conflicts at the first backjump to be unrelated to the final conflict set (issue haskell#941). The conflicts in the summarized log could be irrelevant to the failure, for example, if they were caused by only a single version of a dependency, which the solver could skip, and the real problem was a different dependency that was missing from the index. Even if the summarized log was relevant, showing two different explanations for the same failure could be confusing. 2. Filtering the full log was error prone and could remove the wrong lines. It caused bugs mentioned in haskell#2853 and haskell#4154. 3. The conflict set at the first backjump contains the variables directly involved in the conflicts at that level and the variables that introduced them, but it doesn't contain the whole chain of variables starting with the user targets (issue haskell#4792). When the log is filtered with that conflict set, it can be unclear why the solver needed to choose the conflicting packages in the first place. This commit creates the summarized log by rerunning the solver with a backjump limit of zero and using the full log. Using an unfiltered log avoids (2) and (3). However, it is also important to shorten the log by only showing choices that are relevant to conflicts. This commit uses different approaches for the two types of solver failures. No solution: This commit makes the solver prefer variables from the first run's final conflict set when choosing goals in the second run. This means that the log to the first backjump is more likely to be relevant to the final failure, because it only mentions packages, flags, and stanzas from the final conflict set. Backjump limit reached: There is no final conflict set in this case, since the solver did not traverse the whole tree. This commit tries to create a final conflict set by rerunning the solver with a subtree of the original search tree that contains the path to the first backjump. Then it uses the final conflict set from that run to generate a log message, as in the case where the solver found that there was no solution. Here is an example of the differences between the new and old logs, using the command from issue haskell#4792 and GHC 8.2.1: Before: $ cabal install --dry-run --index-state=2018-01-04T21:05:55Z thorn Resolving dependencies... cabal: Could not resolve dependencies: trying: base-4.10.0.0/installed-4.1... (dependency of thorn) next goal: profunctors (dependency of thorn) rejecting: profunctors-5.2.1, profunctors-5.2, profunctors-5.1.2, profunctors-5.1.1, profunctors-5.1, profunctors-5.0.1, profunctors-5.0.0.1, profunctors-5 (conflict: thorn => profunctors<5) trying: profunctors-4.4.1 next goal: transformers (dependency of profunctors) rejecting: transformers-0.5.2.0/installed-0.5..., transformers-0.5.5.0, transformers-0.5.4.0, transformers-0.5.2.0, transformers-0.5.1.0, transformers-0.5.0.1, transformers-0.5.0.0 (conflict: profunctors => transformers>=0.2 && <0.5) rejecting: transformers-0.4.3.0, transformers-0.4.2.0 (conflict: base==4.10.0.0/installed-4.1..., transformers => base>=2 && <4.9) rejecting: transformers-0.4.1.0, transformers-0.3.0.0, transformers-0.2.2.1 (conflict: base==4.10.0.0/installed-4.1..., transformers +/-applicativeinbase => base>=1.0 && <4.8) rejecting: transformers-0.2.1.0, transformers-0.2.0.0 (conflict: base==4.10.0.0/installed-4.1..., transformers +/-applicativeinbase => base>=1.0 && <4.3) rejecting: transformers-0.1.4.0, transformers-0.1.3.0, transformers-0.1.1.0, transformers-0.1.0.1, transformers-0.0.1.0, transformers-0.0.0.0, transformers-0.5.3.1, transformers-0.5.3.0, transformers-0.5.0.2 (conflict: profunctors => transformers>=0.2 && <0.5) rejecting: transformers-0.4.0.0 (conflict: base==4.10.0.0/installed-4.1..., transformers +/-applicativeinbase => base>=1.0 && <4.8) rejecting: transformers-0.2.2.0 (conflict: base==4.10.0.0/installed-4.1..., transformers +/-applicativeinbase => base>=1.0 && <4.6) rejecting: transformers-0.1.0.0 (conflict: profunctors => transformers>=0.2 && <0.5) After searching the rest of the dependency tree exhaustively, these were the goals I've had most trouble fulfilling: transformers, contravariant, base, thorn After: $ cabal install --dry-run --index-state=2018-01-04T21:05:55Z thorn Resolving dependencies... cabal: Could not resolve dependencies: [__0] trying: thorn-0.2 (user goal) [__1] next goal: contravariant (dependency of thorn) [__1] rejecting: contravariant-1.4, contravariant-1.3.3, contravariant-1.3.2, contravariant-1.3.1.1, contravariant-1.3.1, contravariant-1.3, contravariant-1.2.2.1, contravariant-1.2.2, contravariant-1.2.1, contravariant-1.2.0.1, contravariant-1.2, contravariant-1.1, contravariant-1.0 (conflict: thorn => contravariant<1) [__1] trying: contravariant-0.6.1.1 [__2] next goal: transformers (dependency of contravariant) [__2] rejecting: transformers-0.5.2.0/installed-0.5..., transformers-0.5.5.0, transformers-0.5.4.0, transformers-0.5.2.0, transformers-0.5.1.0, transformers-0.5.0.1, transformers-0.5.0.0 (conflict: contravariant => transformers>=0.2 && <0.5) [__2] trying: transformers-0.4.3.0 [__3] next goal: base (dependency of thorn) [__3] rejecting: base-4.10.0.0/installed-4.1... (conflict: transformers => base>=2 && <4.9) [__3] rejecting: base-4.10.1.0, base-4.10.0.0, base-4.9.1.0, base-4.9.0.0, base-4.8.2.0, base-4.8.1.0, base-4.8.0.0, base-4.7.0.2, base-4.7.0.1, base-4.7.0.0, base-4.6.0.1, base-4.6.0.0, base-4.5.1.0, base-4.5.0.0, base-4.4.1.0, base-4.4.0.0, base-4.3.1.0, base-4.3.0.0, base-4.2.0.2, base-4.2.0.1, base-4.2.0.0, base-4.1.0.0, base-4.0.0.0, base-3.0.3.2, base-3.0.3.1 (constraint from non-upgradeable package requires installed instance) After searching the rest of the dependency tree exhaustively, these were the goals I've had most trouble fulfilling: transformers, contravariant, base, thorn Differences: - The new summary has level numbers, like the full log. - The conflicts are different. The old log mentions thorn, base, profunctors, and transformers, and the new log mentions the four packages from the conflict set: thorn, contravariant, transformers, and base. - The new log starts with the solver choosing a user goal, thorn. The solver continues to display the conflicts at the first backjump when it reaches the backjump limit, i.e, it shows profunctors instead of contravariant: Before: $ cabal install --dry-run --index-state=2018-01-04T21:05:55Z thorn --max-backjumps=10 Resolving dependencies... cabal: Could not resolve dependencies: trying: base-4.10.0.0/installed-4.1... (dependency of thorn) next goal: profunctors (dependency of thorn) rejecting: profunctors-5.2.1, profunctors-5.2, profunctors-5.1.2, profunctors-5.1.1, profunctors-5.1, profunctors-5.0.1, profunctors-5.0.0.1, profunctors-5 (conflict: thorn => profunctors<5) trying: profunctors-4.4.1 next goal: transformers (dependency of profunctors) rejecting: transformers-0.5.2.0/installed-0.5..., transformers-0.5.5.0, transformers-0.5.4.0, transformers-0.5.2.0, transformers-0.5.1.0, transformers-0.5.0.1, transformers-0.5.0.0 (conflict: profunctors => transformers>=0.2 && <0.5) rejecting: transformers-0.4.3.0, transformers-0.4.2.0 (conflict: base==4.10.0.0/installed-4.1..., transformers => base>=2 && <4.9) rejecting: transformers-0.4.1.0, transformers-0.3.0.0, transformers-0.2.2.1 (conflict: base==4.10.0.0/installed-4.1..., transformers +/-applicativeinbase => base>=1.0 && <4.8) rejecting: transformers-0.2.1.0, transformers-0.2.0.0 (conflict: base==4.10.0.0/installed-4.1..., transformers +/-applicativeinbase => base>=1.0 && <4.3) rejecting: transformers-0.1.4.0, transformers-0.1.3.0, transformers-0.1.1.0, transformers-0.1.0.1, transformers-0.0.1.0, transformers-0.0.0.0, transformers-0.5.3.1, transformers-0.5.3.0, transformers-0.5.0.2 (conflict: profunctors => transformers>=0.2 && <0.5) rejecting: transformers-0.4.0.0 (conflict: base==4.10.0.0/installed-4.1..., transformers +/-applicativeinbase => base>=1.0 && <4.8) rejecting: transformers-0.2.2.0 (conflict: base==4.10.0.0/installed-4.1..., transformers +/-applicativeinbase => base>=1.0 && <4.6) rejecting: transformers-0.1.0.0 (conflict: profunctors => transformers>=0.2 && <0.5) Backjump limit reached (currently 10, change with --max-backjumps or try to run with --reorder-goals). After: $ cabal install --dry-run --index-state=2018-01-04T21:05:55Z thorn --max-backjumps=10 Resolving dependencies... cabal: Could not resolve dependencies: [__0] trying: thorn-0.2 (user goal) [__1] next goal: profunctors (dependency of thorn) [__1] rejecting: profunctors-5.2.1, profunctors-5.2, profunctors-5.1.2, profunctors-5.1.1, profunctors-5.1, profunctors-5.0.1, profunctors-5.0.0.1, profunctors-5 (conflict: thorn => profunctors<5) [__1] trying: profunctors-4.4.1 [__2] next goal: transformers (dependency of profunctors) [__2] rejecting: transformers-0.5.2.0/installed-0.5..., transformers-0.5.5.0, transformers-0.5.4.0, transformers-0.5.2.0, transformers-0.5.1.0, transformers-0.5.0.1, transformers-0.5.0.0 (conflict: profunctors => transformers>=0.2 && <0.5) [__2] trying: transformers-0.4.3.0 [__3] next goal: base (dependency of thorn) [__3] rejecting: base-4.10.0.0/installed-4.1... (conflict: transformers => base>=2 && <4.9) [__3] rejecting: base-4.10.1.0, base-4.10.0.0, base-4.9.1.0, base-4.9.0.0, base-4.8.2.0, base-4.8.1.0, base-4.8.0.0, base-4.7.0.2, base-4.7.0.1, base-4.7.0.0, base-4.6.0.1, base-4.6.0.0, base-4.5.1.0, base-4.5.0.0, base-4.4.1.0, base-4.4.0.0, base-4.3.1.0, base-4.3.0.0, base-4.2.0.2, base-4.2.0.1, base-4.2.0.0, base-4.1.0.0, base-4.0.0.0, base-3.0.3.2, base-3.0.3.1 (constraint from non-upgradeable package requires installed instance) Backjump limit reached (currently 10, change with --max-backjumps or try to run with --reorder-goals). One downside of this change is that the solver may reach the backjump limit when generating the summarized log, if the backjump limit is very low: $ cabal install --dry-run --index-state=2018-01-04T21:05:55Z thorn --max-backjumps=1 Resolving dependencies... cabal: Backjump limit reached (currently 1, change with --max-backjumps or try to run with --reorder-goals). Failed to generate a summarized dependency solver log due to low backjump limit. Another downside is the performance impact of rerunning the solver. It looks like there isn't a big change in run time when the solver finds a solution or fails after an exhaustive search. However, rerunning the solver to the first backjump after it reaches the backjump limit can take a significant amount of time. The worst case I could find was acme-everything with GHC 7.10.3, where that step took 13 seconds. The difference was normally small, though. I ran hackage-benchmark on packages from Hackage to try to find packages where the run time changed by more than a few percent. I stopped it after all packages starting with "b" (That includes all uppercase packages). compiler: GHC 8.2.1 index state: 2018-01-04T21:05:55Z parameters: --min-run-time-percentage-difference-to-rerun=1 --pvalue=0.01 --trials=20 --print-skipped-packages Out of 2219 packages, 1064 were skipped because the run times in the first trial were within 1%, 1065 differed by more than 1% in the first trial but did not show a significant difference in run time in 20 trials, and 90 did show a significant difference in run time. Here are the counts of packages for different ranges of speedup, for those 90 packages: speedup (master avg. run time / branch avg. run time) package count [0.93, 0.94) 1 [0.94, 0.95) 0 [0.95, 0.96) 0 [0.96, 0.97) 1 [0.97, 0.98) 7 [0.98, 0.99) 29 [0.99, 1.00) 47 [1.00, 1.01) 3 [1.01, 1.02) 2 The package with the biggest percentage change was bittorrent, which ran for 3.85 seconds on master and 4.12 seconds on this branch. It reached the backjump limit.
Hi @grayjay. This looks good, and I agree that it's better and less fragile. I don't understand the reasoning about failure to produce a log with low backjump limit. The second invocation is with a limit of 0 anyway, so why exactly does it matter? |
Thanks! Sorry, I didn't explain the "backjump limit reached" case very well. In that case, this branch reruns the solver with the original backjump limit to try to calculate a final conflict set for the subtree that contains the first backjump (since it couldn't calculate one for the whole tree). It may need more than one backjump to backtrack from the last leaf back to the root, though, since every level that adds a variable to the conflict set counts as one backjump. When the solver finishes traversing the subtree, this branch runs the solver a third time, with --max-backjumps=0, and makes it prefer variables from the final conflict set from the second run. It might be better to remove the backjump limit in the first rerun, since returning to the root of the tree probably doesn't take much time. |
AppVeyor failure is due to breaking command-line UI changes in |
Merged, thanks! |
@grayjay Looks like I'll have to revert this, it's causing a strange failure on Windows:
|
This reverts commit 11d8a77, reversing changes made to 94a7374. That change was causing failures on AppVeyor, see haskell#5012 (comment)
This reverts commit 11d8a77, reversing changes made to 94a7374. That change was causing failures on AppVeyor, see haskell#5012 (comment)
Previously, the solver generated the summarized log by using a goal order that preferred goals from the final conflict set. It completely ignored the original goal order parameter. This caused a failure in one of the unit tests that set a goal order and then checked the summarized log, when the solver's goal order changed for other reasons. See haskell#5012.
After haskell#5012, the solver generated the summarized log by using a goal order that preferred goals from the final conflict set. It completely ignored the original goal order parameter. This caused a failure in one of the unit tests that set a goal order and then checked the summarized log, when the solver's goal order changed for other reasons (See haskell#5012).
After haskell#5012, the solver generated the summarized log by using a goal order that preferred goals from the final conflict set. It completely ignored the original goal order parameter. This caused a failure in one of the unit tests that set a goal order and then checked the summarized log, when the solver's goal order changed for other reasons (See haskell#5012). This commit combines the two goal orders, with the goal order that prefers goals from the conflict set taking precedence over the original goal order.
After haskell#5012, the solver generated the summarized log by using a goal order that preferred goals from the final conflict set. It completely ignored the original goal order parameter. This caused a failure in one of the unit tests that set a goal order and then checked the summarized log, when the solver's goal order changed for other reasons (See haskell#5012). This commit combines the two goal orders, with the goal order that prefers goals from the conflict set taking precedence over the original goal order.
Please include the following checklist in your PR:
[ci skip]
is used to avoid triggering the build bots.Please also shortly describe how you tested your change. Bonus points for added tests!
This commit changes the way that the solver generates the summarized log that it
displays at normal verbosity.
Previously, the solver saved the full log from the start to the first backjump.
Then it filtered the log using the conflict set from the node where the first
backjump occurred, i.e., it removed all lines from the log that did not relate
to variables in the conflict set. The solver also printed the final conflict
set at the end of the log.
This approach had several problems:
It was possible for the conflicts at the first backjump to be unrelated to
the final conflict set (issue Incorrect error messages for non-existing dependencies #941). The conflicts in the summarized log
could be irrelevant to the failure, for example, if they were caused by only
a single version of a dependency, which the solver could skip, and the real
problem was a different dependency that was missing from the index. Even if
the summarized log was relevant, showing two different explanations for the
same failure could be confusing.
Filtering the full log was error prone and could remove the wrong lines. It
caused bugs mentioned in Always warn when 'cabal install' cannot find a dependency #2853 and new-build dependency solver failure with GHC 7.0 #4154.
The conflict set at the first backjump contains the variables directly
involved in the conflicts at that level and the variables that introduced
them, but it doesn't contain the whole chain of variables starting with the
user targets (issue Solver sometimes filters out relevant lines in the summarized log. #4792). When the log is filtered with that conflict set,
it can be unclear why the solver needed to choose the conflicting packages in
the first place.
This commit creates the summarized log by rerunning the solver with a backjump
limit of zero and using the full log. Using an unfiltered log avoids (2) and
(3). However, it is also important to shorten the log by only showing choices
that are relevant to conflicts. This commit uses different approaches for the
two types of solver failures.
No solution:
This commit makes the solver prefer variables from the first run's final
conflict set when choosing goals in the second run. This means that the log to
the first backjump is more likely to be relevant to the final failure, because
it only mentions packages, flags, and stanzas from the final conflict set.
Backjump limit reached:
There is no final conflict set in this case, since the solver did not traverse
the whole tree. This commit tries to create a final conflict set by rerunning
the solver with a subtree of the original search tree that contains the path to
the first backjump. Then it uses the final conflict set from that run to
generate a log message, as in the case where the solver found that there was no
solution.
Here is an example of the differences between the new and old logs, using the
command from issue #4792 and GHC 8.2.1:
Before:
After:
Differences:
and transformers, and the new log mentions the four packages from the conflict
set: thorn, contravariant, transformers, and base.
The solver continues to display the conflicts at the first backjump when it
reaches the backjump limit, i.e, it shows profunctors instead of contravariant:
Before:
After:
One downside of this change is that the solver may reach the backjump limit when
generating the summarized log, if the backjump limit is very low:
Another downside is the performance impact of rerunning the solver. It looks
like there isn't a big change in run time when the solver finds a solution or
fails after an exhaustive search. However, rerunning the solver to the first
backjump after it reaches the backjump limit can take a significant amount of
time. The worst case I could find was acme-everything with GHC 7.10.3, where
that step took 13 seconds. The difference was normally small, though.
I ran hackage-benchmark on packages from Hackage to try to find packages where
the run time changed by more than a few percent. I stopped it after all
packages starting with "b" (That includes all uppercase packages).
compiler: GHC 8.2.1
index state:
2018-01-04T21:05:55Z
parameters:
--min-run-time-percentage-difference-to-rerun=1 --pvalue=0.01 --trials=20 --print-skipped-packages
Out of 2219 packages, 1064 were skipped because the run times in the first trial
were within 1%, 1065 differed by more than 1% in the first trial but did not
show a significant difference in run time in 20 trials, and 90 did show a
significant difference in run time. Here are the counts of packages for
different ranges of speedup, for those 90 packages:
The package with the biggest percentage change was bittorrent, which ran for
3.85 seconds on master and 4.12 seconds on this branch. It reached the backjump
limit.
/cc @kosmikus