Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd: document target granularity for push/pull/etc, et al. #1384

Merged
merged 71 commits into from
Aug 6, 2020

Conversation

efiop
Copy link
Contributor

@efiop efiop commented May 31, 2020

Fixes #886

UPDATE: See also #1384 (comment) (could be extracted into another PR)

UPDATE: Merge #1602 after this.

@shcheklein shcheklein temporarily deployed to dvc-landing-fix-886-bvk3cbmqka May 31, 2020 23:53 Inactive
@calibre-analytics

This comment has been minimized.

@jorgeorpinel

This comment has been minimized.

@efiop

This comment has been minimized.

@jorgeorpinel

This comment has been minimized.

@jorgeorpinel

This comment has been minimized.

Copy link
Contributor

@jorgeorpinel jorgeorpinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions on the current changes:

content/docs/command-reference/pull.md Outdated Show resolved Hide resolved
content/docs/command-reference/push.md Outdated Show resolved Hide resolved
content/docs/command-reference/status.md Outdated Show resolved Hide resolved
@efiop
Copy link
Contributor Author

efiop commented Jul 9, 2020

@jorgeorpinel Both, tbh. Was trying to find the best way to explain it here and in dvc help, then other stuff got in the way, so not enough focus for this for now.

The granularity itself might not need any explanation at all, actually, because it is something that is expected, if you think about it. So I'm leaning towards just documenting that a target can be a dvc-file or a file/directory tracked by dvc (without saying that subfiles and subdirs are supported, since that seems kinda expected).

@jorgeorpinel jorgeorpinel self-requested a review July 9, 2020 22:35
@jorgeorpinel jorgeorpinel self-assigned this Jul 9, 2020
@jorgeorpinel
Copy link
Contributor

Say no more, I'll take this over 🙂

The granularity itself might not need any explanation at all

Good point. Will read the cmd refs and determine this.

Say no more

Actually... Can you just confirm which commands does this impact @efiop? I'm guessing checkout, fecth, pull, push, and list.

@efiop
Copy link
Contributor Author

efiop commented Jul 9, 2020

@jorgeorpinel Thank you so much! 🙏 It affects push, pull, fetch, status -c(non -c one is dumb), list, get, import.

@jorgeorpinel jorgeorpinel changed the title [WIP] docs: document target granularity for push/pull/etc cmd: document target granularity for push/pull/etc Jul 9, 2020
@jorgeorpinel jorgeorpinel changed the title cmd: document target granularity for push/pull/etc cmd: document target granularity for push/pull/etc, et al. Jul 9, 2020
@jorgeorpinel
Copy link
Contributor

jorgeorpinel commented Jul 10, 2020

I'm leaning towards just documenting that a target can be a dvc-file or a file/directory tracked by dvc (without saying that subfiles and subdirs are supported
status -c(non -c one is dumb)

Maybe for most commands (will review them now) but for status, since this behavior only exists in the remote mode, a complex explanation is needed. Risking being dumb, can you remind me why that only works on remote mode BTW? Maybe should be universal 🙂

UPDATE: Mentioned in iterative/dvc/issues/4191

@efiop

This comment has been minimized.

| remote | `--remote` | Comparisons are made between the cache, and the given remote. Remote storage is defined using the `dvc remote` command. |
| remote | `--cloud` | Comparisons are made between the cache, and the default remote (typically defined with `dvc remote --default`). |

Without arguments, this command checks all stages (in `dvc.yaml` and `dvc.lock`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it union of two files? or only one of them determines the list of stages to check?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's effectively the union, from a conversation I had with @efiop once: It checks dvc.yaml to determine the stages, but later those not found in dvc.lock are skipped. Maybe a warning is printed.

Copy link
Contributor Author

@efiop efiop Aug 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorgeorpinel It is not the union, we've discussed that dvc.yaml is the ultimate source of all truth. Can't say it is a union, really. Even if something is in dv.cyaml but not in dvc.lock, status will reflect that (-c or not -c)

Copy link
Contributor

@jorgeorpinel jorgeorpinel Aug 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. But since stages not found in dvc.lock are then skipped, that's why I say it's effectively the union.

But in any case, that wasn't the intention of this text. It mentions dvc.lock because it talks about the hash values of the stage outputs. Updated a bit in 6ac8b0f.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But since stages not found in dvc.lock are then skipped

if they are skipped w/o a single message then it's intersection? Only those that exist in both are checked?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIR there is a WARNING message.

@jorgeorpinel jorgeorpinel requested a review from shcheklein August 4, 2020 06:15
command synchronizes the workspace data with the versions specified in the
current `.dvc` and `dvc.lock` files.
This command is usually needed after `git checkout`, `git clone`, or any other
operation that changes the `dvc.lock` or `.dvc` in the workspace. It restores
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if dvc.yaml is the source of truth for checkout then we should mention it as well probably .. or come up with some way to mention them at once like abbr DVC files abbr.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree... But this one is pretty tricky. Mentioned dvc.yaml as a note for now in a919f85.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or come up with some way to mention them at once like abbr DVC files abbr

I think we definitely need to do this and not just here but in many documents... But I'm going to extract it: see #1663


- After checking out a fresh copy of a <abbr>DVC repository</abbr>, to get
DVC-tracked data from multiple project branches or tags into your machine.
- To use comparison commands across different Git commits, for example
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the first one here is actually needed for the second point ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the first one is more general, the 2nd one is a specific application of the 1st one indeed.

| remote | `--cloud` | Comparisons are made between the cache, and the default remote (typically defined with `dvc remote --default`). |

Without arguments, this command checks all stages (in `dvc.yaml`) and `.dvc`
files to compare the hash values of their <abbr>outputs</abbr> (in `dvc.lock`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(for stages output hashes as defined in dvc.lock)?

Copy link
Contributor

@jorgeorpinel jorgeorpinel Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"checks stages (in dvc.yaml) for output hash values as defined in dvc.lock" ? That's contradictory/ also confusing. Let me think 🧠

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to

"checks all stages (defined in dvc.yaml) and .dvc files, and compares the hash values of their outputs (found in dvc.lock for stages) against"

in acf0196.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1663 can probably simplify this too.

Copy link
Member

@shcheklein shcheklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Some final comments to check, please check it online - how does it look, any issue etc, and let's merge!

@jorgeorpinel jorgeorpinel temporarily deployed to dvc-landing-fix-886-jaxjy72t1y August 6, 2020 00:18 Inactive
@jorgeorpinel jorgeorpinel temporarily deployed to dvc-landing-fix-886-jaxjy72t1y August 6, 2020 00:23 Inactive
@jorgeorpinel
Copy link
Contributor

Sanity checks passed! (Redeployed https://dvc-landing-fix-886-jaxjy72t1y.herokuapp.com/doc/command-reference)

@jorgeorpinel jorgeorpinel merged commit 81fbb66 into master Aug 6, 2020
@jorgeorpinel jorgeorpinel deleted the fix-886 branch August 11, 2020 18:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cmd ref: document target granularity for push/pull/fetch/checkout/status -c et al.
4 participants