Skip to content

Commit

Permalink
Solver: Check for cycles after every step.
Browse files Browse the repository at this point in the history
Previously, the solver only checked for cycles after it had already found a
solution. That reduced the number of times that it performed the check in the
common case when there were no cycles. However, when there was a cycle, the
solver could spend a lot of time searching subtrees that already had a cyclic
dependency and therefore could not lead to a solution. This is part of
haskell#3824.

Changes in this commit:
- Store the reverse dependency map on all choice nodes in the search tree, so
  that 'detectCyclesPhase' can access it at every step.
- Check for cycles incrementally at every step. Any new cycle must contain the
  current package, so we just check whether the current package is reachable
  from its neighbors.
- If there is a cycle, we convert the map to a graph and find a strongly
  connected component, as before.
- Instead of using the whole strongly connected component as the conflict set,
  we select one cycle. Smaller conflict sets are better for backjumping.
- The incremental cycle detection automatically fixes a bug where the solver
  filtered out the message about cyclic dependencies when it summarized the full
  log. The bug occurred when the failure message was not immediately after the
  line where the solver chose one of the packages involved in the conflict. See
  haskell#4154.

I tried several approaches before I found something with reasonable performance.
Here is a comparison of runtime and memory usage. I turned off assertions when
building cabal.

Index state: index-state(hackage.haskell.org) = 2016-12-03T17:22:05Z
GHC 8.0.1

Runtime in seconds:
Packages                    Search tree depth   Trials   master   This PR   haskell#1      haskell#2
yesod                       343                 3        2.00     2.00      2.13    2.02
yesod gi-glib leksah        744                 3        3.21     3.31      4.10    3.48
phooey                      66                  3        3.48     3.54      3.56    3.57
stackage nightly snapshot   6791                1        186      193       357     191

Total memory usage in MB, with '+RTS -s':
Packages                                        Trials   master    This PR   haskell#1     haskell#2
yesod                                           1         189       188       188     198
yesod gi-glib leksah                            1         257       257       263     306
stackage nightly snapshot                       1        1288      1338      1432   12699

haskell#1 - Same as master, but with cycle checking (Data.Graph.stronglyConnComp) after
     every step.
haskell#2 - Store dependencies in Distribution.Compat.Graph in the search tree, and
     check for cycles containing the current package at every step.
  • Loading branch information
grayjay committed Dec 19, 2016
1 parent 83498e0 commit e075c2a
Show file tree
Hide file tree
Showing 8 changed files with 226 additions and 172 deletions.
33 changes: 18 additions & 15 deletions cabal-install/Distribution/Solver/Modular/Builder.hs
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,7 @@ extendOpen qpn' gs s@(BS { rdeps = gs', open = o' }) = go gs' o' gs
go g o (ng@(OpenGoal (Simple (Dep _ qpn _) c) _gr) : ngs)
| qpn == qpn' = go g o ngs
-- we ignore self-dependencies at this point; TODO: more care may be needed
| qpn `M.member` g = go (M.adjust ((c, qpn'):) qpn g) o ngs
| qpn `M.member` g = go (M.adjust (addIfAbsent (c, qpn')) qpn g) o ngs
| otherwise = go (M.insert qpn [(c, qpn')] g) (cons' ng () o) ngs
-- code above is correct; insert/adjust have different arg order
go g o ( (OpenGoal (Simple (Ext _ext ) _) _gr) : ngs) = go g o ngs
Expand All @@ -79,6 +79,9 @@ extendOpen qpn' gs s@(BS { rdeps = gs', open = o' }) = go gs' o' gs

cons' = P.cons . forgetCompOpenGoal

addIfAbsent :: Eq a => a -> [a] -> [a]
addIfAbsent x xs = if x `elem` xs then xs else x : xs

-- | Given the current scope, qualify all the package names in the given set of
-- dependencies and then extend the set of open goals accordingly.
scopedExtendOpen :: QPN -> I -> QGoalReason -> FlaggedDeps Component PN -> FlagInfo ->
Expand Down Expand Up @@ -125,11 +128,11 @@ addChildren :: BuildState -> TreeF () QGoalReason BuildState
-- If we have a choice between many goals, we just record the choice in
-- the tree. We select each open goal in turn, and before we descend, remove
-- it from the queue of open goals.
addChildren bs@(BS { rdeps = rds, open = gs, next = Goals })
| P.null gs = DoneF rds ()
| otherwise = GoalChoiceF $ P.mapKeys close
$ P.mapWithKey (\ g (_sc, gs') -> bs { next = OneGoal g, open = gs' })
$ P.splits gs
addChildren bs@(BS { rdeps = rdm, open = gs, next = Goals })
| P.null gs = DoneF rdm ()
| otherwise = GoalChoiceF rdm $ P.mapKeys close
$ P.mapWithKey (\ g (_sc, gs') -> bs { next = OneGoal g, open = gs' })
$ P.splits gs

-- If we have already picked a goal, then the choice depends on the kind
-- of goal.
Expand All @@ -142,15 +145,15 @@ addChildren (BS { index = _ , next = OneGoal (OpenGoal (Simple (Lang _
error "Distribution.Solver.Modular.Builder: addChildren called with Lang goal"
addChildren (BS { index = _ , next = OneGoal (OpenGoal (Simple (Pkg _ _ ) _) _ ) }) =
error "Distribution.Solver.Modular.Builder: addChildren called with Pkg goal"
addChildren bs@(BS { index = idx, next = OneGoal (OpenGoal (Simple (Dep _ qpn@(Q _ pn) _) _) gr) }) =
addChildren bs@(BS { rdeps = rdm, index = idx, next = OneGoal (OpenGoal (Simple (Dep _ qpn@(Q _ pn) _) _) gr) }) =
-- If the package does not exist in the index, we construct an emty PChoiceF node for it
-- After all, we have no choices here. Alternatively, we could immediately construct
-- a Fail node here, but that would complicate the construction of conflict sets.
-- We will probably want to give this case special treatment when generating error
-- messages though.
case M.lookup pn idx of
Nothing -> PChoiceF qpn gr (W.fromList [])
Just pis -> PChoiceF qpn gr (W.fromList (L.map (\ (i, info) ->
Nothing -> PChoiceF qpn rdm gr (W.fromList [])
Just pis -> PChoiceF qpn rdm gr (W.fromList (L.map (\ (i, info) ->
([], POption i Nothing, bs { next = Instance qpn i info gr }))
(M.toList pis)))
-- TODO: data structure conversion is rather ugly here
Expand All @@ -159,8 +162,8 @@ addChildren bs@(BS { index = idx, next = OneGoal (OpenGoal (Simple (Dep _ qpn@(Q
-- that is indicated by the flag default.
--
-- TODO: Should we include the flag default in the tree?
addChildren bs@(BS { next = OneGoal (OpenGoal (Flagged qfn@(FN (PI qpn _) _) (FInfo b m w) t f) gr) }) =
FChoiceF qfn gr weak m (W.fromList
addChildren bs@(BS { rdeps = rdm, next = OneGoal (OpenGoal (Flagged qfn@(FN (PI qpn _) _) (FInfo b m w) t f) gr) }) =
FChoiceF qfn rdm gr weak m (W.fromList
[([if b then 0 else 1], True, (extendOpen qpn (L.map (flip OpenGoal (FDependency qfn True )) t) bs) { next = Goals }),
([if b then 1 else 0], False, (extendOpen qpn (L.map (flip OpenGoal (FDependency qfn False)) f) bs) { next = Goals })])
where
Expand All @@ -172,8 +175,8 @@ addChildren bs@(BS { next = OneGoal (OpenGoal (Flagged qfn@(FN (PI qpn _) _) (FI
-- the stanza by replacing the False branch with failure) or preferences
-- (try enabling the stanza if possible by moving the True branch first).

addChildren bs@(BS { next = OneGoal (OpenGoal (Stanza qsn@(SN (PI qpn _) _) t) gr) }) =
SChoiceF qsn gr trivial (W.fromList
addChildren bs@(BS { rdeps = rdm, next = OneGoal (OpenGoal (Stanza qsn@(SN (PI qpn _) _) t) gr) }) =
SChoiceF qsn rdm gr trivial (W.fromList
[([0], False, bs { next = Goals }),
([1], True, (extendOpen qpn (L.map (flip OpenGoal (SDependency qsn)) t) bs) { next = Goals })])
where
Expand Down Expand Up @@ -218,7 +221,7 @@ addChildren bs@(BS { next = Instance qpn i (PInfo fdeps fdefs _) _gr }) =
-- https://github.com/haskell/cabal/issues/2899
addLinking :: LinkingState -> TreeF () c a -> TreeF () c (Linker a)
-- The only nodes of interest are package nodes
addLinking ls (PChoiceF qpn@(Q pp pn) gr cs) =
addLinking ls (PChoiceF qpn@(Q pp pn) rdm gr cs) =
let linkedCs = fmap (\bs -> Linker bs ls) $
W.fromList $ concatMap (linkChoices ls qpn) (W.toList cs)
unlinkedCs = W.mapWithKey goP cs
Expand All @@ -229,7 +232,7 @@ addLinking ls (PChoiceF qpn@(Q pp pn) gr cs) =
goP :: POption -> a -> Linker a
goP (POption i Nothing) bs = Linker bs $ M.insertWith (++) (pn, i) [pp] ls
goP _ _ = alreadyLinked
in PChoiceF qpn gr allCs
in PChoiceF qpn rdm gr allCs
addLinking ls t = fmap (\bs -> Linker bs ls) t

linkChoices :: forall a w . LinkingState
Expand Down
123 changes: 95 additions & 28 deletions cabal-install/Distribution/Solver/Modular/Cycles.hs
Original file line number Diff line number Diff line change
@@ -1,50 +1,117 @@
{-# LANGUAGE TypeFamilies #-}
module Distribution.Solver.Modular.Cycles (
detectCyclesPhase
) where

import Prelude hiding (cycle)
import Data.Graph (SCC)
import qualified Data.Graph as Gr
import qualified Data.Map as Map
import qualified Data.Map as M
import qualified Data.Set as S

import qualified Distribution.Compat.Graph as G
import Distribution.Simple.Utils (ordNub)
import Distribution.Solver.Modular.Dependency
import Distribution.Solver.Modular.Flag
import Distribution.Solver.Modular.Package
import Distribution.Solver.Modular.Tree
import qualified Distribution.Solver.Modular.ConflictSet as CS
import Distribution.Solver.Types.ComponentDeps (Component)
import Distribution.Solver.Types.PackagePath

-- | Find and reject any solutions that are cyclic
-- | Find and reject any nodes with cyclic dependencies
detectCyclesPhase :: Tree d c -> Tree d c
detectCyclesPhase = cata go
where
-- The only node of interest is DoneF
-- Only check children of choice nodes.
go :: TreeF d c (Tree d c) -> Tree d c
go (PChoiceF qpn gr cs) = PChoice qpn gr cs
go (FChoiceF qfn gr w m cs) = FChoice qfn gr w m cs
go (SChoiceF qsn gr w cs) = SChoice qsn gr w cs
go (GoalChoiceF cs) = GoalChoice cs
go (FailF cs reason) = Fail cs reason

-- We check for cycles only if we have actually found a solution
-- This minimizes the number of cycle checks we do as cycles are rare
go (DoneF revDeps s) = do
case findCycles revDeps of
Nothing -> Done revDeps s
go (PChoiceF qpn rdm gr cs) =
PChoice qpn rdm gr $ fmap (checkChild qpn) cs
go (FChoiceF qfn@(FN (PI qpn _) _) rdm gr w m cs) =
FChoice qfn rdm gr w m $ fmap (checkChild qpn) cs
go (SChoiceF qsn@(SN (PI qpn _) _) rdm gr w cs) =
SChoice qsn rdm gr w $ fmap (checkChild qpn) cs
go x = inn x

checkChild :: QPN -> Tree d c -> Tree d c
checkChild qpn x@(PChoice _ rdm _ _) = failIfCycle qpn rdm x
checkChild qpn x@(FChoice _ rdm _ _ _ _) = failIfCycle qpn rdm x
checkChild qpn x@(SChoice _ rdm _ _ _) = failIfCycle qpn rdm x
checkChild qpn x@(GoalChoice rdm _) = failIfCycle qpn rdm x
checkChild _ x@(Fail _ _) = x
checkChild qpn x@(Done rdm _) = failIfCycle qpn rdm x

failIfCycle :: QPN -> RevDepMap -> Tree d c -> Tree d c
failIfCycle qpn rdm x =
case findCycles qpn rdm of
Nothing -> x
Just relSet -> Fail relSet CyclicDependencies

-- | Given the reverse dependency map from a 'Done' node in the tree, check
-- | Given the reverse dependency map from a node in the tree, check
-- if the solution is cyclic. If it is, return the conflict set containing
-- all decisions that could potentially break the cycle.
findCycles :: RevDepMap -> Maybe ConflictSet
findCycles revDeps =
case cycles of
[] -> Nothing
c:_ -> Just $ CS.unions $ map (varToConflictSet . P) c
--
-- TODO: The conflict set should also contain flag and stanza variables.
findCycles :: QPN -> RevDepMap -> Maybe ConflictSet
findCycles pkg rdm =
-- This function has two parts: a faster cycle check that is called at every
-- step and a slower calculation of the conflict set.
--
-- 'hasCycle' checks for cycles incrementally by only looking for cycles
-- containing the current package. It searches for cycles in the 'RevDepMap',
-- which is the data structure used to store reverse dependencies in the
-- search tree. We store the reverse dependencies in a map, because Data.Map
-- is smaller and/or has better sharing than Distribution.Compat.Graph.
--
-- If there is a cycle, we call G.cycles to find a strongly connected
-- component. Then we choose one cycle from the component to use for the
-- conflict set. Choosing only one cycle can lead to a smaller conflict set,
-- such as when a choice to enable testing introduces many cycles at once.
-- In that case, all cycles contain the current package and are in one large
-- strongly connected component.
--
if hasCycle
then let scc :: G.Graph RevDepMapNode
scc = case G.cycles $ revDepMapToGraph rdm of
[] -> findCyclesError "cannot find a strongly connected component"
c : _ -> G.fromList c

next :: QPN -> QPN
next p = case G.neighbors scc p of
Just (n : _) -> G.nodeKey n
_ -> findCyclesError "cannot find next node in the cycle"

oneCycle :: [QPN]
oneCycle = case iterate next pkg of
[] -> findCyclesError "empty cycle"
x : xs -> x : takeWhile (/= x) xs
in Just $ CS.fromList $ map P oneCycle
else Nothing
where
cycles :: [[QPN]]
cycles = [vs | Gr.CyclicSCC vs <- scc]
hasCycle :: Bool
hasCycle = pkg `elem` closure (neighbors pkg)

closure :: [QPN] -> S.Set QPN
closure = foldl go S.empty
where
go :: S.Set QPN -> QPN -> S.Set QPN
go s x =
if x `S.member` s
then s
else foldl go (S.insert x s) $ neighbors x

neighbors :: QPN -> [QPN]
neighbors x = case x `M.lookup` rdm of
Nothing -> findCyclesError "cannot find node"
Just xs -> map snd xs

findCyclesError = error . ("Distribution.Solver.Modular.Cycles.findCycles: " ++)

data RevDepMapNode = RevDepMapNode QPN [(Component, QPN)]

scc :: [SCC QPN]
scc = Gr.stronglyConnComp . map aux . Map.toList $ revDeps
instance G.IsNode RevDepMapNode where
type Key RevDepMapNode = QPN
nodeKey (RevDepMapNode qpn _) = qpn
nodeNeighbors (RevDepMapNode _ ns) = ordNub $ map snd ns

aux :: (QPN, [(comp, QPN)]) -> (QPN, QPN, [QPN])
aux (fr, to) = (fr, fr, map snd to)
revDepMapToGraph :: RevDepMap -> G.Graph RevDepMapNode
revDepMapToGraph rdm = G.fromList
[RevDepMapNode qpn ns | (qpn, ns) <- M.toList rdm]
20 changes: 10 additions & 10 deletions cabal-install/Distribution/Solver/Modular/Explore.hs
Original file line number Diff line number Diff line change
Expand Up @@ -94,15 +94,15 @@ assign tree = cata go tree $ A M.empty M.empty M.empty
where
go :: TreeF d c (Assignment -> Tree Assignment c)
-> (Assignment -> Tree Assignment c)
go (FailF c fr) _ = Fail c fr
go (DoneF rdm _) a = Done rdm a
go (PChoiceF qpn y ts) (A pa fa sa) = PChoice qpn y $ W.mapWithKey f ts
go (FailF c fr) _ = Fail c fr
go (DoneF rdm _) a = Done rdm a
go (PChoiceF qpn rdm y ts) (A pa fa sa) = PChoice qpn rdm y $ W.mapWithKey f ts
where f (POption k _) r = r (A (M.insert qpn k pa) fa sa)
go (FChoiceF qfn y t m ts) (A pa fa sa) = FChoice qfn y t m $ W.mapWithKey f ts
go (FChoiceF qfn rdm y t m ts) (A pa fa sa) = FChoice qfn rdm y t m $ W.mapWithKey f ts
where f k r = r (A pa (M.insert qfn k fa) sa)
go (SChoiceF qsn y t ts) (A pa fa sa) = SChoice qsn y t $ W.mapWithKey f ts
go (SChoiceF qsn rdm y t ts) (A pa fa sa) = SChoice qsn rdm y t $ W.mapWithKey f ts
where f k r = r (A pa fa (M.insert qsn k sa))
go (GoalChoiceF ts) a = GoalChoice $ fmap ($ a) ts
go (GoalChoiceF rdm ts) a = GoalChoice rdm $ fmap ($ a) ts

-- | A tree traversal that simultaneously propagates conflict sets up
-- the tree from the leaves and creates a log.
Expand All @@ -120,22 +120,22 @@ exploreLog enableBj (CountConflicts countConflicts) t = cata go t M.empty
go (FailF c fr) = \ !cm -> failWith (Failure c fr)
(c, updateCM c cm)
go (DoneF rdm a) = \ _ -> succeedWith Success (a, rdm)
go (PChoiceF qpn gr ts) =
go (PChoiceF qpn _ gr ts) =
backjump enableBj (P qpn) (avoidSet (P qpn) gr) $ -- try children in order,
W.mapWithKey -- when descending ...
(\ k r cm -> tryWith (TryP qpn k) (r cm))
ts
go (FChoiceF qfn gr _ _ ts) =
go (FChoiceF qfn _ gr _ _ ts) =
backjump enableBj (F qfn) (avoidSet (F qfn) gr) $ -- try children in order,
W.mapWithKey -- when descending ...
(\ k r cm -> tryWith (TryF qfn k) (r cm))
ts
go (SChoiceF qsn gr _ ts) =
go (SChoiceF qsn _ gr _ ts) =
backjump enableBj (S qsn) (avoidSet (S qsn) gr) $ -- try children in order,
W.mapWithKey -- when descending ...
(\ k r cm -> tryWith (TryS qsn k) (r cm))
ts
go (GoalChoiceF ts) = \ cm ->
go (GoalChoiceF _ ts) = \ cm ->
let (k, v) = getBestGoal' ts cm
in continueWith (Next k) (v cm)

Expand Down
14 changes: 7 additions & 7 deletions cabal-install/Distribution/Solver/Modular/Linking.hs
Original file line number Diff line number Diff line change
Expand Up @@ -74,15 +74,15 @@ validateLinking index = (`runReader` initVS) . cata go
where
go :: TreeF d c (Validate (Tree d c)) -> Validate (Tree d c)

go (PChoiceF qpn gr cs) =
PChoice qpn gr <$> T.sequence (W.mapWithKey (goP qpn) cs)
go (FChoiceF qfn gr t m cs) =
FChoice qfn gr t m <$> T.sequence (W.mapWithKey (goF qfn) cs)
go (SChoiceF qsn gr t cs) =
SChoice qsn gr t <$> T.sequence (W.mapWithKey (goS qsn) cs)
go (PChoiceF qpn rdm gr cs) =
PChoice qpn rdm gr <$> T.sequence (W.mapWithKey (goP qpn) cs)
go (FChoiceF qfn rdm gr t m cs) =
FChoice qfn rdm gr t m <$> T.sequence (W.mapWithKey (goF qfn) cs)
go (SChoiceF qsn rdm gr t cs) =
SChoice qsn rdm gr t <$> T.sequence (W.mapWithKey (goS qsn) cs)

-- For the other nodes we just recurse
go (GoalChoiceF cs) = GoalChoice <$> T.sequence cs
go (GoalChoiceF rdm cs) = GoalChoice rdm <$> T.sequence cs
go (DoneF revDepMap s) = return $ Done revDepMap s
go (FailF conflictSet failReason) = return $ Fail conflictSet failReason

Expand Down
Loading

0 comments on commit e075c2a

Please sign in to comment.