Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plots diff: duplicate revisions not returned #7265

Closed
mattseddon opened this issue Jan 13, 2022 · 10 comments
Closed

plots diff: duplicate revisions not returned #7265

mattseddon opened this issue Jan 13, 2022 · 10 comments
Labels
A: plots Related to the plots bug Did we break something? diff/show Related to the diff/show feature p3-nice-to-have It should be done this or next sprint product: VSCode Integration with VSCode extension

Comments

@mattseddon
Copy link
Member

mattseddon commented Jan 13, 2022

Bug Report

Description

If I give plots diff two revisions that are actually the same (e.g f123def and HEAD when that commit is the HEAD or simply f123def & f123def) then neither are returned with the output.

Reproduce

❯ dvc plots diff $(git rev-parse --short HEAD) HEAD  -o .dvc/tmp/plots --show-json
No plots were loaded, visualization file will not be created.
dvc plots diff $(git rev-parse --short HEAD~1) HEAD~1 HEAD HEAD  -o .dvc/tmp/plots --show-json
No plots were loaded, visualization file will not be created.
❯ dvc plots diff $(git rev-parse --short HEAD~1) HEAD~1 HEAD HEAD HEAD  -o .dvc/tmp/plots --show-json
No plots were loaded, visualization file will not be created.

A mixture of duplicate and non-duplicate entries will return the output only for non-duplicates:

❯ dvc plots diff $(git rev-parse --short HEAD~1) HEAD~1 HEAD  -o .dvc/tmp/plots --show-json

output for HEAD only

Expected

Duplicate revs return a single consolidated entry for that revision with a consistent identifier (should probably always be the short sha but unsure).

Environment information

Output of dvc doctor:

DVC version: 2.9.3 (pip)
---------------------------------
Platform: Python 3.9.5 on macOS-12.1-x86_64-i386-64bit
Supports:
        webhdfs (fsspec = 2021.11.1),
        http (aiohttp = 3.7.4.post0, aiohttp-retry = 2.4.5),
        https (aiohttp = 3.7.4.post0, aiohttp-retry = 2.4.5),
        s3 (s3fs = 2021.11.1, boto3 = 1.17.49)
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s5s1
Caches: local
Remotes: s3
Workspace directory: apfs on /dev/disk1s5s1
Repo: dvc (subdir), git

Additional Information (if any): Please let me know if you need any further background on this. Thank you

@mattseddon mattseddon added the product: VSCode Integration with VSCode extension label Jan 13, 2022
@daavoo daavoo added bug Did we break something? A: plots Related to the plots diff/show Related to the diff/show feature labels Jan 14, 2022
@dberenbaum
Copy link
Collaborator

What is the situation that is causing you to need to pass duplicate revisions?

Also, why is deduplication the behavior you expect? This seems like it could be surprising and confusing if someone passes multiple identifiers not realizing they resolve to the same revision.

@mattseddon
Copy link
Member Author

What is the situation that is causing you to need to pass duplicate revisions?

A couple of examples of things that I ran into are branch and HEAD for the previous commit and checkpoint tip and HEAD when the HEAD is being moved around in a checkpoint experiment. It seems like this doesn't happen with workspace which helps a lot.

Also, why is deduplication the behavior you expect? This seems like it could be surprising and confusing if someone passes multiple identifiers not realizing they resolve to the same revision.

You are correct, I got that wrong. Passing back all of the revisions would be the least surprising thing to do.

@dberenbaum
Copy link
Collaborator

Passing back all of the revisions would be the least surprising thing to do.

Would this behavior work for vscode needs?

@mattseddon
Copy link
Member Author

Would this behaviour work for vscode needs?

Yep

@mattseddon
Copy link
Member Author

mattseddon commented Jun 30, 2022

I've just run into this again whilst testing an apply experiment workflow. After applying an experiment to the workspace I get:

dvc plots diff 2f10596 -o .dvc/tmp/plots --split --show-json - FAILED with code 255 (716ms)
ERROR: unexpected error - 2f10596: ambiguous SHA1 prefix - multiple matches for prefix: 2f105960ab947472f479d4f7354b6fd4fd0c441a

Where 2f105960ab947472f479d4f7354b6fd4fd0c441a is the sha most recently finished experiment (not the applied experiment).

@dberenbaum
Copy link
Collaborator

@mattseddon Is it the same as above? It seems like the original issue was about duplicate refs that pointing to the same commit. From the error ambiguous SHA1 prefix - multiple matches for prefix, it seems like the latest issue is about one ref that could refer to multiple commits. Is there any more to the error message? It says "multiple matches" but I only see one.

@mattseddon
Copy link
Member Author

I saw it as duplicate because I assumed that when you checkout the commit into the workspace it changes the ref for workspace underneath and fails but I could be wrong.

That was the entirety of the error message. The default behaviour is that when you call diff with only one commit it will returns the workspace as well so that it has something to diff against.

@dberenbaum
Copy link
Collaborator

Can you reproduce this consistently with any experiment ref or was it a one-time thing? I'm able to exp apply and then run the above plots diff command without a problem, so I wonder if it's truly an ambiguous reference. Is there a way you can get/use the full hash instead?

@mattseddon
Copy link
Member Author

Can you reproduce this consistently with any experiment ref or was it a one-time thing? I'm able to exp apply and then run the above plots diff command without a problem, so I wonder if it's truly an ambiguous reference. Is there a way you can get/use the full hash instead?

Looks like you are right @dberenbaum I cannot recreate. We can use the full hash it may actually simplify the code. We will still need to return the short-sha whenever we show it in a toast popup.

@dberenbaum dberenbaum added the p3-nice-to-have It should be done this or next sprint label Mar 16, 2023
@mattseddon
Copy link
Member Author

Closing this as not required for the next iteration of plots.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: plots Related to the plots bug Did we break something? diff/show Related to the diff/show feature p3-nice-to-have It should be done this or next sprint product: VSCode Integration with VSCode extension
Projects
None yet
Development

No branches or pull requests

3 participants