Pull request: Add the 'fork' command to cabal-install. #2
Conversation
+1 (adding this in case it helps influence this code getting accepted, if it's just obnoxious then I apologize!) |
Hmm. From a UI perspective, I wonder if we should share this with the existing "unpack" command. |
I don't really want to bikeshed, but it really sounds like 'cabal branch' should be some sub-option of 'cabal fetch' (--from-vcs maybe). Also, 'branch' is only used by bzr and mercurial; git uses 'checkout' and darcs 'get'. And by popularity, git and then darcs are the most-used VCSs, with a very large majority (6000 packages) compared to bzr/mercurial (600 packages), per @jmillikin's numbers. Otherwise the feature seems like a worthy addition. |
So the existing UI has "fetch", which gets tarballs and stashes them in the download cache. It doesn't yet, but could, have a -o flag or similar to stick the tarball in the current dir or wherever. Then we also have "unpack", which downloads a tarball and unpacks it (by default into a subdirectory of the current dir). |
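For concreteness, a quick sketch of what those two existing commands do today (the package name and version are placeholders, and the cache location is only described loosely):

```
$ cabal fetch foo    # download the foo-x.y tarball into the local download cache
$ cabal unpack foo   # download the tarball and unpack it into ./foo-x.y/
```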
I strongly prefer making this a separate command, because it's doing something different from either of them. The term "branch" is fairly universal among version control systems: Bazaar, Git, Mercurial, and Subversion all use it for basically the same operation (creating a new lightweight fork from an existing line of development). The only significant VCS that does not use "branch" is Darcs, but we shouldn't use the Darcs terminology here. The only workable alternative to "branch" is "clone", but I think "branch" is the better fit. |
I agree with John's justifications. The only thing I would change is to go back in time and implement a plugin system. |
I'd also strongly prefer a separate command (for the same reasons @jmillikin pointed out). At first I found For comparison:
What about @dagit Same thought here about plugins. Looking at |
I'm not sure I'm convinced. Keep in mind that cabal is not itself a dvcs, so the verbs that are common for a dvcs are not necessarily appropriate, simply because the user has a different context in mind. For a dvcs the implicit context is operations on repositories; for cabal or other package management tools, the implicit context is operations on packages. Branching a repo makes sense; I'm not sure branching a package makes sense. Is it really doing something different from unpack? Or alternatively, perhaps we want a new subcommand for source repo operations. In that case the context changes from operations on packages to operations on repos, and checkout/get/clone etc. become appropriate terms. |
I think I'd vote for |
I don't particularly want cabal to become a vcs abstraction, and this is why I suggested fetch with an option in the first place. Particularly in this case, I imagined it as "you'll fetch the package's repository instead of fetching a package's tarball". I'm also fine with unpack (which I didn't know about, hence my suggestion of fetch), although maybe "unpacking a package's repository" is a bit odd. Clone, as in "cloning a package", seems to make perfect sense: cloning the whole history of the package, hence the repository. |
I'd be fine with The reason I don't want to make this part of |
Gentle ping -- @dcoutts, would renaming the command to |
I've briefly scanned, but not tested, the code. It looks ok to me. I'd be willing to merge if this is renamed to 'fork'. |
Apologies for the delay; I got distracted by another project. Pull request updated to use 'fork' instead of 'branch'. I left one of the internal types (Brancher) unchanged, because otherwise it'd be called a Forker and that's just silly. |
This command reads the source-repositories from a package's description, determines which VCS to use, and then creates a local repository or branch of the package's repository.
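For reference, the source-repository information being read is the standard stanza in the package's .cabal file; a typical example looks like this (the URL and tag are illustrative):

```
source-repository head
  type:     git
  location: git://github.com/example/foo.git

source-repository this
  type:     git
  location: git://github.com/example/foo.git
  tag:      v1.0
```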
@dcoutts and I had more naming discussion, and I think we came to the conclusion that we'd like to name this 'get'. |
I'd really rather not call this 'get'.
The two "types" of source can be very different, such as when the Hackage tarball is generated with a preprocessor, or when multiple packages share the same source repository (example: Cabal and cabal-install). Since they do different things, the commands should not share a name just because they both happen to be dealing with some sort of source code. As to the command name "get", that's just plain confusing. Come on. If you tell someone "cabal supports branching a package from its source-repo", they ought to be able to guess the command to use in only 2-3 tries. If the command is "get", the only way to discover it is |
They're doing different but similar things. I think there's value in keeping the number of top-level commands small. Tarballs can be seen as snapshots of the repository at certain times, so I don't see a huge problem in joining the two under a common top-level command. I think we have to think carefully about how to design the user interface so that it does not become confusing, which is why I'd suggest for As to |
I really care about this feature, and not so much about how it's named. Anyway, rather than building something that may be turned into a consistent UI eventually, it makes more sense to me to build that consistent UI right now. So I'll throw in another suggestion (in the hope that this will not turn into extensive bikeshedding).
Where Would that make sense? @jmillikin If I understand correctly |
I agree with |
I think I agree with sol's suggestion here.
(Note that if you think it's too long, don't worry, you can use any unique prefix, e.g. --source) The only difference here is instead of
And we have pre-defined meanings for different repo kinds, "head" and "this", and also allow other named kinds. The idea of allowing --dry-run here may be useful; however, in practice we may not have enough info to know where we'll unpack to until we do it. We could probably say something useful though. |
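To make the proposal concrete, the interface being discussed would look roughly like this (command and flag spellings are the ones floated in the thread, not a settled design):

```
$ cabal get foo                           # unpack the released tarball, as 'unpack' does today
$ cabal get foo --source-repository       # check out the package's source repository instead
$ cabal get foo --source-repository=head  # use the repo stanza of kind 'head' from foo.cabal
$ cabal get foo --source-repository=this  # use the repo stanza matching the released version
```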
+1 for this feature. |
@jmillikin It seems like there is agreement on what is to be done; are you interested in pushing this through, making the agreed changes, and getting it committed? |
I disagree with the proposed user interface changes, so I won't be changing my patch to integrate them. Feel free to use the current state of the patch as a starting place in implementing |
If nobody wants to take this on, I can make the necessary changes once I'm done with the sandbox patches. This code could be useful for implementing references to remote source repositories from sandboxes (a-la |
I think fork is the widely accepted terminology in distributed version control, where everything is a fork. It's used for this action on both GitHub and BitBucket. |
I propose |
I don't particularly like any of the previous suggestions. How about |
I like simonmar's suggestion. In the future we may want to support additional vcs commands, so it might be best to have something more general. I would be fine with anything here being implemented though :) |
Not sure if it makes any difference, but I don't like the overloaded semantics of that UI. @dcoutts even if you say that you agree with my suggestion, it is different from what I proposed. Please note that my last suggestion is not my favorite. It was just an attempt to foster consensus, and more importantly to prevent a state of "we do it like that for now, and fix it later" (which IMHO is almost always a bad idea). That said, I think whatever UI is agreed upon, the person who did the initial investment (@jmillikin) has to be ok with it (it does not have to be his favorite, but it still has to be ok with him). Then, even if I don't agree with the final result, I can live with it, because he did the work. This also has practical advantages. If writing code entitles me to have a say when it comes to bikeshedding, then there is an incentive to write code. But if I have to be prepared that I do the work, and then it will end up as something I just don't like, then I'd rather think twice before investing any of my time. |
@sol I'd actually prefer having a separate |
So I think we can close this topic now; 23Skidoo has made a new patch and that's now close to going in. That said, I don't mean to close down the UI discussion. There are some interesting suggestions for future directions here with more vcs integration, so don't assume this is now set in stone. We can make UI changes in a somewhat backwards-compatible way (like leaving unpack as an alias for the new command). |
Previously, the solver only checked for cycles after it had already found a solution. That reduced the number of times that it performed the check in the common case where there were no cycles. However, when there was a cycle, the solver could spend a lot of time searching subtrees that already had a cyclic dependency and therefore could not lead to a solution. This is part of #3824.

Changes in this commit:
- Store the reverse dependency map on all choice nodes in the search tree, so that 'detectCyclesPhase' can access it at every step.
- Check for cycles incrementally at every step. Any new cycle must contain the current package, so we just check whether the current package is reachable from its neighbors.
- If there is a cycle, we convert the map to a graph and find a strongly connected component, as before.
- Instead of using the whole strongly connected component as the conflict set, we select one cycle. Smaller conflict sets are better for backjumping.
- The incremental cycle detection automatically fixes a bug where the solver filtered out the message about cyclic dependencies when it summarized the full log. The bug occurred when the failure message was not immediately after the line where the solver chose one of the packages involved in the conflict. See #4154.

I tried several approaches and compared performance when solving for packages with different numbers of dependencies. Here are the results. None of these runs involved any cycles, so they should have only tested the overhead of cycle checking. I turned off assertions when building cabal.

Index state: index-state(hackage.haskell.org) = 2016-12-03T17:22:05Z
GHC 8.0.1

Runtime in seconds:

Packages                   Search tree depth  Trials  master  This PR  #1    #2
yesod                      343                3       2.00    2.00     2.13  2.02
yesod gi-glib leksah       744                3       3.21    3.31     4.10  3.48
phooey                     66                 3       3.48    3.54     3.56  3.57
Stackage nightly snapshot  6791               1       186     193      357   191

Total memory usage in MB, with '+RTS -s':

Packages                   Trials  master  This PR  #1    #2
yesod                      1       189     188      188   198
yesod gi-glib leksah       1       257     257      263   306
Stackage nightly snapshot  1       1288    1338     1432  12699

#1 - Same as master, but with cycle checking (Data.Graph.stronglyConnComp) after every step.
#2 - Store dependencies in Distribution.Compat.Graph in the search tree, and check for cycles containing the current package at every step.
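A minimal Haskell sketch of the incremental check described above (simplified; not the actual solver code): any new cycle must pass through the package that was just chosen, so it suffices to test whether that package can be reached again starting from its neighbours in the dependency map.

```
import qualified Data.Map as M
import qualified Data.Set as S

-- Map from a package to its neighbours in the dependency graph.
type DepMap pkg = M.Map pkg [pkg]

-- | Did choosing 'p' introduce a cycle?  We only search outward from
-- p's neighbours, so the cost is bounded by the part of the graph
-- reachable from p rather than the size of the whole install plan.
introducesCycle :: Ord pkg => pkg -> DepMap pkg -> Bool
introducesCycle p deps = go S.empty (neighbours p)
  where
    neighbours q = M.findWithDefault [] q deps
    go _    []     = False
    go seen (q:qs)
      | q == p            = True            -- back at p: cycle found
      | q `S.member` seen = go seen qs      -- already explored
      | otherwise         = go (S.insert q seen) (qs ++ neighbours q)
```

When a cycle is found, the change described above still converts the map to a graph and computes a strongly connected component, then picks a single cycle out of it to keep the conflict set small.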
* Fix bug causing spurious app or lib dirs to be generated by libraries and executables respectively.
* Add unit tests for init flag correctness
* Export useful testing functions from `Init/Command.hs` and `Init.Types.hs`
* Modify `cabal-install.cabal.dev` to reflect change in test file name for Init/FileCreators -> Init.hs
This command reads the source-repositories from a package's description,
determines which VCS to use, and then creates a local branch or checkout
of the package's repository.