-
Notifications
You must be signed in to change notification settings - Fork 393
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ref: data status
updates
#3924
ref: data status
updates
#3924
Conversation
`dvc-commit`-ed that differs with the last Git commit. There might be more | ||
detailed state on how each of those files changed: _added_, _modified_, | ||
_deleted_ and _unknown_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question though: what's an unknown
change? Does it apply to untracked files too? I didn't know where to mention that in the updated text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, let's say you clone a repo with a tracked directory but haven't done dvc fetch/pull
. You have an md5 in the .dvc file but the directory is empty. If you modify the directory (for example, add a file to it), DVC can tell that the directory doesn't match the md5 but has no info about the granular changes to the files in that directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DVC can tell that the directory doesn't match the md5 but has no info about the granular changes to the files in that directory
So it only applies to --granular
right? (Assuming that generalizing that example is the one scenario for this.) I added a note in that option instead of trying to cover it up here (it still counts as "new, modified, or deleted"). aca9d8a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skshetry Do you know if it can ever apply to non-granular data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it only applies to granular.
There may be other things for you to include here based on pending comments under #3812 (comment) @dberenbaum Thanks! |
Link Check Report
1/3 links failed. |
@@ -83,31 +88,6 @@ option. | |||
|
|||
- `-v`, `--verbose` - displays detailed tracing information. | |||
|
|||
## Examples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why removing it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest removing it because it's already used in the description.
You asked me to make a PR with the most important changes from my feedback and IMO this is one of them: #3812 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#3812 (comment) - the problem is it was discussed already. Question - do you have an actually strong opinion about this? :) (I'm personally fine and Dave was fine it seems)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't realize it had been discussed (I didn't read all the resolved comments). I still think it's unnecessary since the --granular
example can still refer to the one in the description.
It's not a very strong opinion but it does go against good practices for cmd ref examples, I think: they should add special value to the doc, not just cover obvious cases. And to avoid redundancy in general.
But anyway, I rolled it back in 725c7e4 since it was discussed by them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorgeorpinel we go into another round of discussions here I feel, again there are copy edits + substnatial product changes together. It makes review complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per #3924 (review)
Co-authored-by: Restyled.io <[email protected]>
This may happen after cloning a DVC repository but before using `dvc pull` (or | ||
`dvc fetch`) to download the data; or after using `dvc gc`. | ||
|
||
- _DVC committed changes_ are new, modified, or deleted tracked files or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's call it added
to keep it in sync with the output.
- _DVC committed changes_ are new, modified, or deleted tracked files or | |
- _DVC committed changes_ are added, modified, or deleted tracked files or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same on DVC uncommitted changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not using back ticks because this doesn't intend to reflect the exact output. I think "added" is misleading and the output itself seems to be up for debate. Please see/ comment in iterative/dvc#7943 (review).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorgeorpinel It's a good point, but @skshetry also has a reasonable rationale for using "added," so what do you think about continuing discussion in another issue and leaving "added" to get this merged?
@@ -1,6 +1,6 @@ | |||
# data status | |||
|
|||
Show changes in the data tracked by DVC in the workspace. | |||
Show changes to the files and directories tracked by DVC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The command name is data
. I am not sure what we get by replacing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are specific about what we mean by data. Docs should explain 🙂
directories. Note that some granular changes may be reported as `unknown` as | ||
DVC tracks | ||
[directory-level hash values](doc/user-guide/project-structure/internal-files#directories). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this comment tbh.
From aca9d8a#r83797370
Do you mean you're not familiar with the underlying situation this refers to @shcheklein? Please see #3924 (comment) if so. It's kind of confusing (maybe I didn't even get it right, waiting for another review from the core team).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another alternative is to not even document this specifically, if it's evident what unknown means in the CLI output (self-documenting).
directories. Note that some granular changes may be reported as `unknown` as | |
DVC tracks | |
[directory-level hash values](doc/user-guide/project-structure/internal-files#directories). | |
directories. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might understand this if read the comment (i haven't yet), but I would expect to be able to understand it from the text, right?
do you mean that some files (not granular changes) can be reported as "unknown" if (when?
as DVC tracks is not informative)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's not that easy to explain briefly. How about this?
directories. Note that some granular changes may be reported as `unknown` as | |
DVC tracks | |
[directory-level hash values](doc/user-guide/project-structure/internal-files#directories). | |
directories. Note that granular changes may have "unknown" state if tracked | |
[directory contents](doc/user-guide/project-structure/internal-files#directories) | |
are not present in the local cache (see also `dvc fetch`). |
Probably still doesn't explain why but at least it describes the scenario more clearly, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Status for files inside DVC-tracked directories may be reported as "unknown" if information about directories is missing from the cache yet (see ...)
This comment was marked as resolved.
This comment was marked as resolved.
improve top admons in data status and dvc data status
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK well I guess it's all approved now. The only change after that is improved cross-tip admon/links:
<admon type="tip"> | ||
|
||
For [pipelines](/doc/user-guide/pipelines) status, use `dvc status`. | ||
For the status of <abbr>data pipelines</abbr>, see `dvc status`. | ||
|
||
</admon> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This
<admon type="tip"> | ||
|
||
For an equivalent to `git status`, use `dvc data status`. | ||
For the status of tracked data, see `dvc data status` (similar to `git status`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and this
Link Check ReportThere were no links to check! |
Follow up to #3812 (comment)