Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a jj parallelize command #1079

Closed
martinvonz opened this issue Jan 20, 2023 · 20 comments
Closed

Add a jj parallelize command #1079

martinvonz opened this issue Jan 20, 2023 · 20 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@martinvonz
Copy link
Member

martinvonz commented Jan 20, 2023

This came up in arxanas/git-branchless#757.

Assume we have a linear history like this:

@  D
│
o  C
│
o  B
│
o  A
│
~ 

We should provide a command that's something like jj parallelize B:C, which would reshape the history to be this:

@    D'
├─╮
o │  C'
│ │
│ o  B
├─╯
o  A
│
~ 

I don't know if we should insert an empty merge commit or let D' be a merge commit. Other VCSs might call that an "evil merge", but since merge commits are easier to work with in jj than in most other VCSs, it might be best to let D' be the merge commit. We could also make it optional.

@martinvonz martinvonz added enhancement New feature or request good first issue Good for newcomers labels Jan 20, 2023
@ilyagr
Copy link
Contributor

ilyagr commented Jan 20, 2023

I often do this as rebase -r B -d B- and then rebase -s D -d B -d C'. This only works if B is a single revision.

I'm not sure rebase -r belongs in the rebase command. Perhaps it would be good to replace it with a "ripout" command (better name, anyone? tear?) that's like rebase -r X -d X- but can apply to a sequence of several revisions. Or there could be a way to have rebase -r to apply to a sequence.

Finally, this is a bit of a distraction, but the inverse operation to this "parallelize" it's tricky, especially if the merge commit resolves a conflict.

@martinvonz
Copy link
Member Author

I often do this as rebase -r B -d B- and then rebase -s D -d B -d C'. This only works if B is a single revision.

Yes, for parallelizing three commits, you have to do something like jj rebase -r C -d A; jj rebase -r D -d A; jj rebase -s E -d B -d C' -d D'

I'm not sure rebase -r belongs in the rebase command.

I'm not sure either, but note that the command will hopefully learn to accept many -r and will then preserve their internal order. That's what hg rebase -r does, so e.g. hg rebase -r X:: -d Y is equivalent to hg rebase -s X -d Y.

the inverse operation to this "parallelize" it's tricky, especially if the merge commit resolves a conflict.

How so? It seems very similar to how you might get conflicts in C' in the example above.

@samueltardieu
Copy link
Contributor

We should provide a command that's something like jj parallelize B:C, which would reshape the history to be this:

I'm having a hard time understanding B:C in this example. Does it extract the first commit of the revset and do a merge with the last commit? What if B:C resolves to more than two commits, let's say 5, how would the split be organized?

For example, let's assume we have the following starting point:

o  G
│
o  F
│
o  E
│
o  D
│
o  C
│
o  B
│
o  A
│
~

How would we get (cli-syntax-wise) to

o  G'
│─╮
│ o F'
│ │
│ o E'
│ │
o │ D
│ │
o │ C
│ │
o │ B
│─╯
o  A
│
~

or even to

o  G'
│─╮
│ o F'
│ │
│ o D'
│ │
o │ E'
│ │
o │ C
│ │
o │ B
│─╯
o  A
│
~

?

@martinvonz
Copy link
Member Author

Does it extract the first commit of the revset and do a merge with the last commit?

I had imagined jj parallelize would make all the commits parallel. I think the occasions when it's useful to parallelize many commits are rare, but it happens once in a while. For me, it has typically happened recently when I've imported a bunch of Rust crates into Google's monorepo. It's easier to just add new commits on top when doing that. When I'm done importing, I can often send many of the imports for review in parallel. However, as I think more about this, it's quite often the case that some crates depend on each other (e.g. prost, prost-derive, and prost-build, but also among less closely related crated), so I wouldn't have been able to parallelize all commits in my case. I don't know if there's a good CLI for describing the cases you came up with in a simple way.

@ilyagr
Copy link
Contributor

ilyagr commented Jan 20, 2023

I'm not sure either, but note that the command will hopefully learn to accept many -r and will then preserve their internal order.

Agreed, that's very close to what I was thinking, and more important than whether we keep it as part of rebase. Do you think there would still be a need for parallelize if that were possible?

(While I was writing this, you posted a comment that implies I misunderstood what parallelize is about. So, I guess the answer is yes)

How so? It seems very similar to how you might get conflicts in C' in the example above.

Let's say you have two parallel branches with several commits each that merge, and the merge resolves conflicts. Then, you want to stack the branches, so you rebase one on top of the tip of the other.

The conflicts are now smeared out across the various commits of the branch that's now on top. It's often easier to resolve them again than to reuse the resolutions you already made in the merge commit.

Perhaps a way to address that would be something like hg absorb that was smart enough to understand conflict resolutions.

@ilyagr
Copy link
Contributor

ilyagr commented Sep 20, 2023

Here's an idea of how jj parallelize could work. I'm not sure whether it matches Martin's original idea or not.

Let's say we start with 5 commits:

        1  2  3  4  5  6
(root)--*--*--*--*--*--*

Running jj parallelize 1::5 could open a text editor that asks the user to label the commits to something like (probably with better syntax, and certainly with more instructions):

1: A
2: B
3: C
4: B
5: A

Then, jj would reshape the graph as:

(root)---------*--*---
        \      1  5   \
         --*-*----------*
         | 2 4       /  6
          \         /
           -*------/
            3

Some notes:

  • There's probably going to be a special group for "unmentioned or empty label". So, the user could just delete 3 from the file instead of writing 3:C.
  • jj parallelize --create-branches could also create branches A and B (and C if it's listed).
  • This seems similar to what tools with names like "git stack" do. (Though those tools usually use commit description for tags)
  • It seems somewhat important to me that the command is flexible enough to create trees of depth more than 1. Otherwise, it feels like it'll be too situational.
  • I'm not sure what the default should be (if the user doesn't edit the file), but one option is to put each commit on its own branch.

WDYT?

I'm not 100% sure myself whether I'd find such a command useful, but it seems possible and, again, there seem to be tools with the explicit purpose of making such a workflow work in git.

@martinvonz
Copy link
Member Author

I had originally wanted a simple command for just making a few commits siblings. Maybe your version could be jj parallelize --interactive :)

For my use case (importing a bunch of Rust crates into Google's monorepo), I think the simple version that parallelizes all input commits would actually be enough. In the example you provided, I think I could then do this:

$ jj parallelize -r 1::5
$ jj rebase -s 5 -d 1
$ jj rebase -s 4 -d 2

I hadn't thought of that multi-step process involving rebase before, but it seems like that would actually work well.

@ilyagr
Copy link
Contributor

ilyagr commented Sep 20, 2023

Maybe your version could be jj parallelize --interactive :)

True.

In the example you provided, I think I could then do this: ...

This would only work well if either

  • 5 does not have any children initially

or

  • jj parallelize rebases children of 5 onto the parent of 1 (like another branch)

I'm not sure how important the case of 5 having children is, but if it does, I'm not sure rebasing them onto the parent of 1 is a good idea.

@ilyagr
Copy link
Contributor

ilyagr commented Sep 20, 2023

Another random thought (probably not actionable at this point): this is actually similar to the potential direction the absorb command could go into eventually that I described on Discord. This could matter if, instead of adding crates, you were upgrading crates and some of them were on neighboring lines of Cargo.toml.

@martinvonz
Copy link
Member Author

In the example you provided, I think I could then do this: ...

This would only work well if either

  • 5 does not have any children initially

In the example above, it has commit 6 as child. That's what I based my example on.

or

  • jj parallelize rebases children of 5 onto the parent of 1 (like another branch)

I'm not sure how important the case of 5 having children is, but if it does, I'm not sure rebasing them onto the parent of 1 is a good idea.

Ah, I had assumed that that jj parallelize 1::5 would make children of 5 children of all the commits, i.e. becoming commits. Sorry that I didn't think to mention that.

@ilyagr
Copy link
Contributor

ilyagr commented Sep 20, 2023

Ah, I had assumed that that jj parallelize 1::5 would make children of 5 children of all the commits, i.e. becoming commits. Sorry that I didn't think to mention that.

I'm still a bit confused, though maybe I figured it out.

I was imagining that jj parallelize would make children of 5 into the children of each commit (or each branch for jj parallelize --interactive), I only considered the other option briefly. I was thinking that if you did the commands you proposed with my example above, you'd get approximately the same result as what I proposed, but 6 would also be a direct descendant of 2 and of 1.

Now I'm recalling that maybe jj simplifies that situation automatically, since I think we do not allow the same commit to be both a direct and an indirect descendant of the same other commit. So, perhaps that's what I was missing.

@martinvonz
Copy link
Member Author

Now I'm recalling that maybe jj simplifies that situation automatically, since I think we do not allow the same commit to be both a direct and an indirect descendant of the same other commit. So, perhaps that's what I was missing.

Ah, good point. Yes, I was relying on that behavior without really thinking about it.

@ilyagr
Copy link
Contributor

ilyagr commented Sep 20, 2023

I don't find this behavior obvious, but we could maybe still have people rely on it if we explain it enough and encourage people enough to do it.

(This was previously an update to a previous message)

@martinvonz
Copy link
Member Author

I removed the "Good first issue" label from this earlier today, but given that I again think the basic version I initially had intended is good enough, I'll add it back.

@martinvonz martinvonz added the good first issue Good for newcomers label Sep 20, 2023
emesterhazy added a commit that referenced this issue Mar 31, 2024
This refactor will allow us to reuse new `rebase_descendants` function for the
`jj split --siblings` feature (#2274) and later possibly for `jj parallelize`
(#1079).
@emesterhazy
Copy link
Contributor

I have the basic version of this implemented and will push a PR soon.

@emesterhazy
Copy link
Contributor

emesterhazy commented Mar 31, 2024

Now I'm recalling that maybe jj simplifies that situation automatically, since I think we do not allow the same commit to be both a direct and an indirect descendant of the same other commit. So, perhaps that's what I was missing.

Ah, good point. Yes, I was relying on that behavior without really thinking about it.

I'm not observing this behavior in practice. jj seems happy to allow a commit to be a direct and indirect ancestor of another commit.

Here's an example. I replaced the actual change ids with the numbers from above to make it easier to compare.

jj log
◉   6
◉   5
◉   4
◉   3
◉   2
◉   1
◉
jj parallelize 1::5
jj log
◉           6
├─┬─┬─┬─╮
│ │ │ │ ◉   4
│ │ │ ◉ │   3
│ │ │ ├─╯
│ │ ◉ │   2
│ │ ├─╯
│ ◉ │   1
│ ├─╯
◉ │   5
├─╯
◉
jj rebase -s 5 -d 1
jj log
◉           6
├─┬─┬─┬─╮
│ │ │ │ ◉   4
│ │ │ ◉ │   3
│ │ │ ├─╯
│ │ ◉ │   2
│ │ ├─╯
◉ │ │   5
├─╯ │
◉   │   1
├───╯
◉

You can see here that 6 is a direct descendant of both 5 and 1. This issue is somewhat orthogonal to the jj parallelize command though.

emesterhazy added a commit that referenced this issue Mar 31, 2024
emesterhazy added a commit that referenced this issue Mar 31, 2024
emesterhazy added a commit that referenced this issue Mar 31, 2024
@martinvonz
Copy link
Member Author

I'm not observing this behavior in practice.

Yes, that's correct. I actually changed the default quite recently. Now it's only the rebase command that simplifies ancestor merges.

I agree that it's orthogonal to jj parallelize.

@emesterhazy
Copy link
Contributor

Now it's only the rebase command that simplifies ancestor merges.

I used jj rebase in the example though, so I'm not sure what you mean. Mind clarifying?

emesterhazy added a commit that referenced this issue Mar 31, 2024
@martinvonz
Copy link
Member Author

Now it's only the rebase command that simplifies ancestor merges.

I used jj rebase in the example though, so I'm not sure what you mean. Mind clarifying?

Sorry, I was confused. Even rebase preserves ancestor merges these days. The state I was thinking of was very short-lived and existed only within PR #3080, between commit 2bbefcc and commit bbda146.

emesterhazy added a commit that referenced this issue Apr 1, 2024
This refactor will allow us to reuse new `rebase_descendants` function for the
`jj split --siblings` feature (#2274) and later possibly for `jj parallelize`
(#1079).
emesterhazy added a commit that referenced this issue Apr 1, 2024
emesterhazy added a commit that referenced this issue Apr 1, 2024
emesterhazy added a commit that referenced this issue Apr 1, 2024
This refactor will allow us to reuse new `rebase_descendants` function for the
`jj split --siblings` feature (#2274) and later possibly for `jj parallelize`
(#1079).
emesterhazy added a commit that referenced this issue Apr 1, 2024
This refactor will allow us to reuse new `rebase_descendants` function for the
`jj split --siblings` feature (#2274) and later possibly for `jj parallelize`
(#1079).
emesterhazy added a commit that referenced this issue Apr 1, 2024
@emesterhazy
Copy link
Contributor

Since this has been implemented I'm going to close the issue. There were some interesting suggestions in this thread about how to further extend the command. Perhaps they can be moved to a separate issue if people are still interested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

4 participants