Pull request: Add the 'fork' command to cabal-install. #2

jmillikin · 2012-04-24T03:34:57Z

This command reads the source-repositories from a package's description,
determines which VCS to use, and then creates a local branch or checkout
of the package's repository.

dagit · 2012-04-24T05:03:36Z

+1 (adding this in case it helps influence this code getting accepted, if it's just obnoxious then I apologize!)

dcoutts · 2012-04-24T11:36:54Z

Hmm. From a UI perspective, I wonder if we should share this with the existing "unpack" command.

vincenthz · 2012-04-24T11:42:05Z

I don't really want to bikeshed, but it really sounds like 'cabal branch' should be some sub-option of 'cabal fetch' (--from-vcs maybe).

Also 'branch' is only used by bzr and mercurial, git uses 'checkout' and darcs 'get'. And by popularity git, then darcs are the main used vcs, which have a very big majority (6000 packages) compared to bzr/mercurial (600 packages), per @jmillikin 's numbers.

Otherwise the feature seems like a worthy addition.

dcoutts · 2012-04-24T19:24:07Z

So the existing UI has "fetch" which gets tarballs and stashes them in the download cache. It doesn't yet but could have a -o flag or similar to stick the tarball in the current dir or wherever. Then we also have "unpack" which downloads a tarball and unpacks it (by default in the current subdir).

jmillikin · 2012-04-25T01:05:20Z

I strongly prefer making this a separate command, because it's doing something different from either fetch or unpack. Giving two separate operations the same command name, then distinguishing based on a flag, leads to UI catastrophes like Git.

The term "branch" is fairly universal among version control systems. Bazaar, Git, Mercurial, and Subversion all use it for basically the same operation (creating a new lightweight fork from an existing line of development). The only significant VCS that does not use "branch" is Darcs, but we shouldn't use the Darcs terminology because cabal get some-package would be confusing.

The only workable alternative to "branch" is "clone", but I think cabal clone some-package would also be confusing.

dagit · 2012-04-25T01:36:01Z

I agree with John's justifications.

The only thing I would change, is to go back in time and implement a plugin
architecture for cabal so that this functionality could be implemented as a
plugin instead of part of cabal directly. In my idealized world with cabal
plugins the plugins would be similar to git plugins. Which is to say there
would be a directory somewhere that cabal searches when the user types
"cabal branch ..." and when it finds "cabal-branch" it invokes that with
the supplied command line arguments. We would also be able to teach cabal
how to download plugins from hackage and install them when they are
missing. My motivation for mentioning the non-existant plugins is that I'm
trying to find further evidence to justify them. Maven (a java build
system) uses plugins to great success and extensibility.

Jason

On Tue, Apr 24, 2012 at 6:05 PM, jmillikin <
[email protected]

wrote:

I strongly prefer making this a separate command, because it's doing
something different from either fetch or unpack. Giving two separate
operations the same command name, then distinguishing based on a flag,
leads to UI catastrophes like Git.

The term "branch" is fairly universal among version control systems.
Bazaar, Git, Mercurial, and Subversion all use it for basically the same
operation (creating a new lightweight fork from an existing line of
development). The only significant VCS that does not use "branch" is Darcs,
but we shouldn't use the Darcs terminology because cabal get some-package
would be confusing.

The only workable alternative to "branch" is "clone", but I think cabal clone some-package would also be confusing.

Reply to this email directly or view it on GitHub:
#2 (comment)

sol · 2012-04-25T09:34:29Z

I'd also strongly prefer a separate command (for the same reasons @jmillikin pointed out).

At first I found branch very confusing, but I probably could cope with it. Anyway, I think it is fair to say that it is biased towards bzr (and it has a completely different meaning in git).

For comparison:

git clone
hg clone
darcs get
bzr branch
svn checkout

What about cabal fork? It is generic in the way that none of the supported systems use it.

@dagit Same thought here about plugins. Looking at sbt shows how awesome it can get.

dcoutts · 2012-04-25T10:15:11Z

I'm not sure I'm convinced. Keep in mind that cabal is not itself a dvcs so the verbs that are common for a dvcs are not necessarily appropriate, simply because the user has a different context in mind. For a dvcs the implicit context is operations on repositories, for cabal or other package management tools, the implicit context is operations on packages. Branching a repo makes sense, I'm not sure branching a package makes sense.

Is it really doing something different from unpack? Isn't it doing the same thing but for a different target? Both give you the source tree locally, but one comes from a tarball and the other from a source repo. I admit that unpack is a verb we use on tarballs and not on remote source repos, but it seems like we ought to be able to find something that covers both.

Or alternatively, perhaps we want a new sub command for source repo operations. In that case the context changes from operations on packages to operations on repos, and checkout/get/clone etc become appropriate terms.

kosmikus · 2012-04-25T10:26:53Z

I think I'd vote for clone or get. To me, merging it with unpack in some way sounds like a good idea as well. I don't like branch very much, because while it exists as terminology in most VCS, it actually has quite different meanings. Perhaps we can find a name that is fresh. I like the fork suggestion, too.

vincenthz · 2012-04-25T10:45:00Z

I don't particularly want cabal to become a vcs abstraction, and this is why i suggested fetch with an option in the first place. particularly in this case, i imagined it as "you'll fetch the package's repository instead of fetching a package's tarball".

I'm also fine with unpack (which i didn't know hence my suggestion of fetch) although maybe "unpacking a package's repository" is a bit odd.

clone as in "cloning a package" seems to make perfect sense as cloning the whole history of the package hence the repository.

jmillikin · 2012-04-25T15:38:33Z

I'd be fine with fork -- you're right, it's generic across versioning systems, and means pretty much exactly what's being done in this command.

The reason I don't want to make this part of unpack is because I consider the final product to be different. Someone who wanted to get a versioned copy of a branch's source would be unhappy if they ended up with just a directory grabbed from a tarball.

jmillikin · 2012-04-27T00:36:56Z

Gentle ping -- @dcoutts, would renaming the command to fork satisfy your concerns, or do you really want it to be a sub-sub-command of something else?

kosmikus · 2012-04-27T07:14:38Z

I've briefly scanned, but not tested the code. It looks ok to me. I'd be willing to merge if this is renamed to fork, if @dcoutts is ok with it, too.

jmillikin · 2012-05-05T02:31:11Z

Apologies for the delay; I got distracted by another project.

Pull request updated to use 'fork' instead of 'branch'. I left one of the internal types (Brancher) unchanged, because otherwise it'd be called a Forker and that's just silly.

This command reads the source-repositories from a package's description, determines which VCS to use, and then creates a local repository or branch of the package's repository.

kosmikus · 2012-05-07T19:34:34Z

@dcoutts and I had more naming discussion, and I think we came to the conclusion that we'd like to name this get rather than fork now, and that while we can then merge your change under that name for now, we'd like to aim for a future where get takes over the responsibilities of unpack, too.

jmillikin · 2012-05-08T03:06:05Z

I'd really rather not call this get, or merge it with unpack, because the two commands are doing different things.

cabal fork creates a version-controlled branch of a package's source repository.

cabal unpack creates an unversioned directory containing the contents of the tarball uploaded to Hackage.

The two "types" of source can be very different, such as when the Hackage tarball is generated with a preprocessor, or when multiple packages share the same source repository (example: Cabal and cabal-install).

Since they do different things, the commands should not share a name just because they both happen to be dealing with some sort of source code.

As to the command name "get", that's just plain confusing. Come on. If you tell someone "cabal supports branching a package from its source-repo", they ought to be able to guess the command to use in only 2-3 tries. If the command is "get", the only way to discover it is cabal --help | less.

kosmikus · 2012-05-08T08:20:31Z

They're doing different but similar things. I think there's value in keeping the number of top-level commands small. Tarballs can be seen as snapshots of the repository at certain times, so I don't see a huge problem in joining the two under a common top-level command. I think we have to think carefully about how to design the user interface so that it does not become confusing, which is why I'd suggest for get and unpack to coexist for a potentially long transition period.

As to branch vs. fork vs. get: I simply disagree. I don't think I'd ever just blindly type in a guessed command I've never used before and expect it to work correctly. But even if I did: I'd never be able to guess branch or fork, whereas get is quite descriptive to me. You're saying it's confusing. That suggests you'd associate a completely different meaning with get? Such as?

sol · 2012-05-08T10:44:30Z

I really care about this feature, and not so much about how it's named.

Anyway, rather than building something that may be turned into a consistent UI eventually, it makes more sense to me to build that consistent UI right now.

So I'll throw in an other suggestion (in the hope that this will not turn into extensive bikeshedding).

cabal get <package>
cabal get <package> --upstream

Where get is the same as unpack and get --upstream is get/clone/branch/fork/whatever.

Would that make sense?

@jmillikin If I understand correctly cabal fork foo puts it's result into directory ./foo-x.y.z, wouldn't it make more sense to put it into ./foo instead? And have you considered a --dry-run option? I think that could be useful if you want to know in advance what vcs/repo will be used.

bmillwood · 2012-05-08T13:07:14Z

I agree with get --upstream, for what it's worth. I see a prominent use of this command being not forking at all, but just getting the latest source code to look at it or test it for compatibility with something else. Hence fork is kind of inappropriate (and I would never guess it, either).

dcoutts · 2012-05-10T20:00:15Z

I think I agree with sol's suggestion here.

cabal get <package>
cabal get <package> --source-repository
cabal get <package> --source-repository=[head|this|...]

(Note that if you think it's too long, don't worry, you can use any unique prefix, e.g. --source)

The only difference here is instead of --upstream, we're using the exact same terminology as we already have in .cabal files, namely source-repository. For example in .cabal files we have:

source-repository head
  type:     darcs
  location: http://darcs.haskell.org/cabal/
  subdir:   Cabal

And we have pre-defined meanings for different repo kinds, "head", "this" and also allow other named kinds.

So cabal get <package> --source-repository=[head|this|...] lets you be explicit about which one. When you don't specify then we should pick one sensibly: if there's just one declared use that, if there's several, pick 'head' then 'this' then the first other one specified.

The idea of allowing --dry-run here may be useful, however in practice we may not have enough info to know where we'll unpack to until we do it. We could probably say something useful though.

byorgey · 2012-07-25T13:25:23Z

+1 for this feature. cabal get <package> --source-repository sounds good to me, though honestly I really don't care too much what it's named.

tibbe · 2012-08-13T23:41:21Z

@jmillikin It seems like there is an agreement on what is to be done, are you interested in pushing through, making the agreed changes and have this committed?

jmillikin · 2012-08-15T02:24:25Z

I disagree with the proposed user interface changes, so I won't be changing my patch to integrate them. Feel free to use the current state of the patch as a starting place in implementing cabal get, if cabal fork is rejected.

23Skidoo · 2012-08-15T14:26:12Z

If nobody wants to take this on, I can make the necessary changes once I'm done with the sandbox patches. This code could be useful for implementing references to remote source repositories from sandboxes (a-la cabal-meta).

tibbe · 2012-08-15T15:48:21Z

I think fork is the widely accepted terminology in distributed versions control, where everything is a fork. It's used for this action on both GitHub and BitBucket.

dag · 2012-08-25T05:38:14Z

I propose cabal source: it's clear that you're getting the source (unlike cabal get which could mean anything), doesn't suffer from VCS terminology issues and is consistent with e.g. apt-get source and yumdownloader --source. It also goes well with the existing terminology in Cabal around "source repositories".

simonmar · 2012-08-28T11:00:36Z

I don't particularly like any of the previous suggestions. How about cabal get-repo?

gregwebs · 2012-10-01T15:25:20Z

I like simonmar's suggestion. In the future we may want to support additional vcs commands. So it might be best to have cabal repo as a base, with cabal repo get as the command discussed here.

I would be fine with anything here being implemented though :)

23Skidoo · 2012-10-20T19:18:30Z

I have implemented the UI changes @dcoutts requested.

sol · 2012-10-30T11:33:10Z

Not sure if it makes any difference, but I don't like the overloaded semantics of that UI.

@dcoutts even if you say that you agree with my suggestion, it is different from what I proposed.

Please note that my last suggestion is not my favorite. It was just an attempt to foster consensus, and more importantly to prevent a state of "we do it like that for now, and fix it later" (which IMHO is almost always a bad idea).

That said, I think whatever UI is agreed upon, the person who did the initial investment (@jmillikin) has to be ok with it (it does not have to be his favorite, but it still has to be ok with him). Then, even if I don't agree with the final result, I can live with it, because he did the work.

This also has practical advantages. If writing code entitles me to have a say when it comes to bikeshedding, then there is an incentive to write code. But if I have to be prepared that I do the work, and then it will end up as something I just don't like, then I'd rather think twice before investing any of my time.

23Skidoo · 2012-10-30T14:09:24Z

@sol I'd actually prefer having a separate cabal fork command, but I think that it's better to have this included in some form instead of not including at all. Supporting command aliases looks like a good compromise.

dcoutts · 2012-11-01T13:04:43Z

So I think we can close this topic now, 23Skidoo has made a new patch and that's now close to going in.

That said, I don't mean to close down the UI discussion. There's some interesting suggestions for future directions here with more vcs integration. So don't assume this is now set in stone. We can make UI changes in a somewhat backwards compat way (like leaving unpack as an alias for the new command).

Previously, the solver only checked for cycles after it had already found a solution. That reduced the number of times that it performed the check in the common case when there were no cycles. However, when there was a cycle, the solver could spend a lot of time searching subtrees that already had a cyclic dependency and therefore could not lead to a solution. This is part of haskell#3824. Changes in this commit: - Store the reverse dependency map on all choice nodes in the search tree, so that 'detectCyclesPhase' can access it at every step. - Check for cycles incrementally at every step. Any new cycle must contain the current package, so we just check whether the current package is reachable from its neighbors. - If there is a cycle, we convert the map to a graph and find a strongly connected component, as before. - Instead of using the whole strongly connected component as the conflict set, we select one cycle. Smaller conflict sets are better for backjumping. - The incremental cycle detection automatically fixes a bug where the solver filtered out the message about cyclic dependencies when it summarized the full log. The bug occurred when the failure message was not immediately after the line where the solver chose one of the packages involved in the conflict. See haskell#4154. I tried several approaches before I found something with reasonable performance. Here is a comparison of runtime and memory usage. I turned off assertions when building cabal. Index state: index-state(hackage.haskell.org) = 2016-12-03T17:22:05Z GHC 8.0.1 Runtime in seconds: Packages Search tree depth Trials master This PR haskell#1 haskell#2 yesod 343 3 2.00 2.00 2.13 2.02 yesod gi-glib leksah 744 3 3.21 3.31 4.10 3.48 phooey 66 3 3.48 3.54 3.56 3.57 stackage nightly snapshot 6791 1 186 193 357 191 Total memory usage in MB, with '+RTS -s': Packages Trials master This PR haskell#1 haskell#2 yesod 1 189 188 188 198 yesod gi-glib leksah 1 257 257 263 306 stackage nightly snapshot 1 1288 1338 1432 12699 haskell#1 - Same as master, but with cycle checking (Data.Graph.stronglyConnComp) after every step. haskell#2 - Store dependencies in Distribution.Compat.Graph in the search tree, and check for cycles containing the current package at every step.

Previously, the solver only checked for cycles after it had already found a solution. That reduced the number of times that it performed the check in the common case where there were no cycles. However, when there was a cycle, the solver could spend a lot of time searching subtrees that already had a cyclic dependency and therefore could not lead to a solution. This is part of haskell#3824. Changes in this commit: - Store the reverse dependency map on all choice nodes in the search tree, so that 'detectCyclesPhase' can access it at every step. - Check for cycles incrementally at every step. Any new cycle must contain the current package, so we just check whether the current package is reachable from its neighbors. - If there is a cycle, we convert the map to a graph and find a strongly connected component, as before. - Instead of using the whole strongly connected component as the conflict set, we select one cycle. Smaller conflict sets are better for backjumping. - The incremental cycle detection automatically fixes a bug where the solver filtered out the message about cyclic dependencies when it summarized the full log. The bug occurred when the failure message was not immediately after the line where the solver chose one of the packages involved in the conflict. See haskell#4154. I tried several approaches and compared performance when solving for packages with different numbers of dependencies. Here are the results. None of these runs involved any cycles, so they should have only tested the overhead of cycle checking. I turned off assertions when building cabal. Index state: index-state(hackage.haskell.org) = 2016-12-03T17:22:05Z GHC 8.0.1 Runtime in seconds: Packages Search tree depth Trials master This PR haskell#1 haskell#2 yesod 343 3 2.00 2.00 2.13 2.02 yesod gi-glib leksah 744 3 3.21 3.31 4.10 3.48 phooey 66 3 3.48 3.54 3.56 3.57 Stackage nightly snapshot 6791 1 186 193 357 191 Total memory usage in MB, with '+RTS -s': Packages Trials master This PR haskell#1 haskell#2 yesod 1 189 188 188 198 yesod gi-glib leksah 1 257 257 263 306 Stackage nightly snapshot 1 1288 1338 1432 12699 haskell#1 - Same as master, but with cycle checking (Data.Graph.stronglyConnComp) after every step. haskell#2 - Store dependencies in Distribution.Compat.Graph in the search tree, and check for cycles containing the current package at every step.

Previously, the solver only checked for cycles after it had already found a solution. That reduced the number of times that it performed the check in the common case where there were no cycles. However, when there was a cycle, the solver could spend a lot of time searching subtrees that already had a cyclic dependency and therefore could not lead to a solution. This is part of #3824. Changes in this commit: - Store the reverse dependency map on all choice nodes in the search tree, so that 'detectCyclesPhase' can access it at every step. - Check for cycles incrementally at every step. Any new cycle must contain the current package, so we just check whether the current package is reachable from its neighbors. - If there is a cycle, we convert the map to a graph and find a strongly connected component, as before. - Instead of using the whole strongly connected component as the conflict set, we select one cycle. Smaller conflict sets are better for backjumping. - The incremental cycle detection automatically fixes a bug where the solver filtered out the message about cyclic dependencies when it summarized the full log. The bug occurred when the failure message was not immediately after the line where the solver chose one of the packages involved in the conflict. See #4154. I tried several approaches and compared performance when solving for packages with different numbers of dependencies. Here are the results. None of these runs involved any cycles, so they should have only tested the overhead of cycle checking. I turned off assertions when building cabal. Index state: index-state(hackage.haskell.org) = 2016-12-03T17:22:05Z GHC 8.0.1 Runtime in seconds: Packages Search tree depth Trials master This PR #1 #2 yesod 343 3 2.00 2.00 2.13 2.02 yesod gi-glib leksah 744 3 3.21 3.31 4.10 3.48 phooey 66 3 3.48 3.54 3.56 3.57 Stackage nightly snapshot 6791 1 186 193 357 191 Total memory usage in MB, with '+RTS -s': Packages Trials master This PR #1 #2 yesod 1 189 188 188 198 yesod gi-glib leksah 1 257 257 263 306 Stackage nightly snapshot 1 1288 1338 1432 12699 #1 - Same as master, but with cycle checking (Data.Graph.stronglyConnComp) after every step. #2 - Store dependencies in Distribution.Compat.Graph in the search tree, and check for cycles containing the current package at every step.

* Fix bug causing spurious app or lib dirs to be generated by libraries and executables respectively. * Add unit tests for init flag correctness * Export useful testing functions from `Init/Command.hs` and `Init.Types.hs` * Modify `cabal-install.cabal.dev` to reflect change in test file name for Init/FileCreators -> Init.hs *

Add the 'fork' command to cabal-install.

beaea12

This command reads the source-repositories from a package's description, determines which VCS to use, and then creates a local repository or branch of the package's repository.

This was referenced May 24, 2012

Clean up hooks #127

Closed

Handle framework paths (-F) in Cabal #182

Closed

gregwebs mentioned this pull request Oct 1, 2012

fix git support; add darcs support yesodweb/cabal-meta#15

Merged

sol mentioned this pull request Oct 30, 2012

Provide a run command #1088

Closed

sol mentioned this pull request Oct 30, 2012

Add support for command aliases to cabal-install #1089

Closed

dcoutts closed this Nov 1, 2012

23Skidoo mentioned this pull request Nov 19, 2012

Syntax for setting option to no value in ~/.cabal/config #1053

Closed

23Skidoo mentioned this pull request Nov 30, 2012

Replace the 'unpack' command with a more general 'get'. #1075

Merged

sol mentioned this pull request Oct 27, 2013

Top-level command for Freezing dependencies #1519

Merged

mzero mentioned this pull request Jul 22, 2014

Emergency workaround for GHC 7.8.1 on OSX #1740

Closed

grayjay mentioned this pull request Dec 19, 2016

Solver: Check for cycles after every step. #4176

Merged

gbaz mentioned this pull request Feb 11, 2018

Is it possible to force consistent versions for a library and an associated build tool with new-build? #5105

Open

emilypi mentioned this pull request Jan 30, 2021

Proposal for Init #7265

Closed

emilypi mentioned this pull request Aug 25, 2021

Update Logo #7432

Draft

emilypi mentioned this pull request Sep 22, 2021

cabal init does not put cabal-version on the first line #7680

Closed

sol mentioned this pull request Nov 15, 2021

Upadte README sol/doctest#321

Merged

andreasabel added the cabal-install: cmd/get label Nov 16, 2021

sol mentioned this pull request Feb 23, 2023

Add Distribution.Client.Main #8793

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pull request: Add the 'fork' command to cabal-install. #2

Pull request: Add the 'fork' command to cabal-install. #2

jmillikin commented Apr 24, 2012

dagit commented Apr 24, 2012

dcoutts commented Apr 24, 2012

vincenthz commented Apr 24, 2012

dcoutts commented Apr 24, 2012

jmillikin commented Apr 25, 2012

dagit commented Apr 25, 2012

sol commented Apr 25, 2012

dcoutts commented Apr 25, 2012

kosmikus commented Apr 25, 2012

vincenthz commented Apr 25, 2012

jmillikin commented Apr 25, 2012

jmillikin commented Apr 27, 2012

kosmikus commented Apr 27, 2012

jmillikin commented May 5, 2012

kosmikus commented May 7, 2012

jmillikin commented May 8, 2012

kosmikus commented May 8, 2012

sol commented May 8, 2012

bmillwood commented May 8, 2012

dcoutts commented May 10, 2012

byorgey commented Jul 25, 2012

tibbe commented Aug 13, 2012

jmillikin commented Aug 15, 2012

23Skidoo commented Aug 15, 2012

tibbe commented Aug 15, 2012

dag commented Aug 25, 2012

simonmar commented Aug 28, 2012

gregwebs commented Oct 1, 2012

23Skidoo commented Oct 20, 2012

sol commented Oct 30, 2012

23Skidoo commented Oct 30, 2012

dcoutts commented Nov 1, 2012

Pull request: Add the 'fork' command to cabal-install. #2

Pull request: Add the 'fork' command to cabal-install. #2

Conversation

jmillikin commented Apr 24, 2012

dagit commented Apr 24, 2012

dcoutts commented Apr 24, 2012

vincenthz commented Apr 24, 2012

dcoutts commented Apr 24, 2012

jmillikin commented Apr 25, 2012

dagit commented Apr 25, 2012

sol commented Apr 25, 2012

dcoutts commented Apr 25, 2012

kosmikus commented Apr 25, 2012

vincenthz commented Apr 25, 2012

jmillikin commented Apr 25, 2012

jmillikin commented Apr 27, 2012

kosmikus commented Apr 27, 2012

jmillikin commented May 5, 2012

kosmikus commented May 7, 2012

jmillikin commented May 8, 2012

kosmikus commented May 8, 2012

sol commented May 8, 2012

bmillwood commented May 8, 2012

dcoutts commented May 10, 2012

byorgey commented Jul 25, 2012

tibbe commented Aug 13, 2012

jmillikin commented Aug 15, 2012

23Skidoo commented Aug 15, 2012

tibbe commented Aug 15, 2012

dag commented Aug 25, 2012

simonmar commented Aug 28, 2012

gregwebs commented Oct 1, 2012

23Skidoo commented Oct 20, 2012

sol commented Oct 30, 2012

23Skidoo commented Oct 30, 2012

dcoutts commented Nov 1, 2012