Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Being able to call exp show with multiple branches (--rev) and a number of commits (-n) #9390

Closed
sroy3 opened this issue May 2, 2023 · 10 comments · Fixed by #9391
Closed
Assignees
Labels
A: experiments Related to dvc exp feature request Requesting a new feature product: VSCode Integration with VSCode extension

Comments

@sroy3
Copy link

sroy3 commented May 2, 2023

In the VS Code extension, we would like to show multiple branches at the same time in the table. Each with its own number of commits. We're currently doing this by calling exp show once per branch (dvc exp show --rev main -n 5, dvc exp show --rev other-branch -n 3...). We'd like to be able to do this in one call only (something like dvc exp show --rev main -n 5 --rev other-branch -n 3...). Would it be possible?

@daavoo
Copy link
Contributor

daavoo commented May 2, 2023

Would it be possible?

The internal API already supports multiple revs.

This should be a simple change to make for the CLI (TBH I thought the CLI already supported it)

@daavoo daavoo added product: VSCode Integration with VSCode extension A: experiments Related to dvc exp feature request Requesting a new feature labels May 2, 2023
@daavoo daavoo self-assigned this May 2, 2023
daavoo added a commit that referenced this issue May 2, 2023
@daavoo daavoo linked a pull request May 2, 2023 that will close this issue
@sroy3
Copy link
Author

sroy3 commented May 2, 2023

That was fast! Thanks!

@mattseddon
Copy link
Member

@daavoo we have an issue with this where if we call for a revision multiple times we only get the revision returned once. Would it be difficult (and/or dangerous) to change the logic or introduce a flag to change the behaviour to return duplicates?

The use case is a user displaying multiple instances of the same commit across different branches.

example:

make a new branch from main (branch-a)
make two commits
running dvc exp show --rev branch-a~0 branch-a~1 branch-a~2 branch-a~3 main~0 main~1 will not give me duplicate entries for main.

The reason we need this is to avoid making our own mapping of commits to branches before calling dvc or calling dvc multiple times.

@dberenbaum
Copy link
Collaborator

Sorry @mattseddon, I should have also realized this when we discussed yesterday.

Are you also going to duplicate experiments in that case? More importantly, if we can make dvc exp show return stable results, is it an option to drop the duplicates in VS Code also, at least as a first step? It hasn't been an issue in DVC or Studio so far, so I'm wondering if there's something different about the VS Code UI where duplicates are particularly important?

@dberenbaum
Copy link
Collaborator

Also, I see in iterative/vscode-dvc#3941 (comment) that you are calling git log to get the commit messages. Since we now have support for custom commit messages, we could probably add those to dvc exp show. Not sure they make since in the CLI table, but they could be in the JSON.

@shcheklein
Copy link
Member

Tbh, it feels we should have the mapping on the VS Code side and pass only unique hashes to the API. @mattseddon how complicated is it? Why is it an issue considering that we already run git log, etc? I though that we had this information already tbh.

@mattseddon
Copy link
Member

Are you also going to duplicate experiments in that case? More importantly, if we can make dvc exp show return stable results, is it an option to drop the duplicates in VS Code

If a commit is on both branch-a and branch-b and we are displaying both branches separately then how would we de-duplicate and not have to explain in detail to a user why a commit is missing from one of the selected branches?

Tbh, it feels we should have the mapping on the VS Code side and pass only unique hashes to the API. @mattseddon how complicated is it? Why is it an issue considering that we already run git log, etc? I though that we had this information already tbh.

We need to rework what has been written to do this. Atm exp show provides the list of commits which we then pass to git log to get the commit information. I did not dig into the code before writing #9390 (comment)

@dberenbaum
Copy link
Collaborator

Is the goal to get branches working in some simple way, or to have a UI that duplicates commits for every branch? Was duplication a priority at the start, or was it a natural consequence of the old approach with multiple dvc exp show calls? It feels like it is taking extra work to try to duplicate commits across all branches when I haven't seen a lot of complaints about it in Studio or DVC, and I thought from the last discussion with the team that the goal was to do something simple. It's an interesting idea, but it also looks like it's significant extra effort, it has its own downsides (commits and especially experiments in multiple places could be confusing), and it's inconsistent with the other products. Curious why duplicating commits across branches became a blocker for VS Code?

Re: git log calls, if we add the commit message to dvc exp show, could we get rid of the git log calls? I'd like to reduce the number of calls if we can.

@sroy3
Copy link
Author

sroy3 commented May 25, 2023

Re: git log calls, if we add the commit message to dvc exp show, could we get rid of the git log calls? I'd like to reduce the number of calls if we can.

We don't know what we are asking for until after it was sent back. If we're asking for branch~1 and it's the same as main~3 and it's not returned because of that, we have no way of associating it with branch, and the branch association gets shifted by one. This can be circumvented by taking calling git log before with the revision branch~1 and getting a commit hash that we will associate with branch; later, we will run git log on the revision main~3 and get the same hash and add main to that hash as well. When we get the results from exp show, we'll add the branch to the output with the previous mapping. If a hash has more than one branch, we'll duplicate the output for the other branch. Adding the commit message to the exp show output won't allow us to remove the git log calls for that reason, unfortunately.

We need duplication because when asking for more commits, we need to show more commits. We don't know ahead of time which commit is duplicated. Even if we call git log with a previous revision (branch~2, branch~3...) until we get another unique hash (or not), there is no way for us to know if we can allow the user to show more commits or not previous to that.

@dberenbaum
Copy link
Collaborator

Thanks @sroy3! Sorry to make you explain all that, but now I understand.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: experiments Related to dvc exp feature request Requesting a new feature product: VSCode Integration with VSCode extension
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants