ref: document import-url cloud versioning changes #4142
Changes from all commits
a87198b
b1a2349
fccda92
60df383
e3255f2
d5a4e9a
2d3e24a
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -265,6 +265,7 @@ These include a subset of the fields in `.dvc` file | |
| `persist` | Whether the output file/dir should remain in place during `dvc repro` (`false` by default: outputs are deleted when `dvc repro` starts) | | ||
| `checkpoint` | (Optional) Set to `true` to let DVC know that this output is associated with [checkpoint experiments](/doc/user-guide/experiment-management/checkpoints). These outputs are reverted to their last cached version at `dvc exp run` and also `persist` during the stage execution. | | ||
| `desc` | (Optional) User description for this output. This doesn't affect any DVC operations. | | ||
| `push` | Whether or not this file or directory, when previously <abbr>cached</abbr>, is uploaded to remote storage by `dvc push` (`true` by default). | | ||
Should we plan to recommend this a lot in Data Pipeline docs? Specifically for intermediate pipeline outputs. Assuming the happy path out there is to push only raw data and likely final ML model files (everything else may be best to …). If we don't at least emphasize the possibility, users may realize too late they have pushed a bunch of intermediate output versions and they are pretty difficult to clean up with …

I'm not sure that not pushing is the right default behavior, even for intermediate outputs. If the user wants to take advantage of run-cache to not re-run stages that have already been reproduced, they still need to push/pull intermediate outs.

@jorgeorpinel Thinking about it some more, I like the suggestion and think it makes sense as a possible product direction to make it easier to get started with pipelines, so let's brainstorm more on it.
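
For reference while discussing this, here is a minimal sketch of what opting an intermediate output out of `dvc push` could look like in `dvc.yaml`. The stage names, scripts, and paths are hypothetical and only illustrate the `push` field described in the table row above.

```yaml
# Hypothetical two-stage pipeline; all names and paths are assumptions.
stages:
  featurize:
    cmd: python featurize.py data/raw.csv data/features.csv
    deps:
      - featurize.py
      - data/raw.csv
    outs:
      # Intermediate output: still cached locally, but skipped by `dvc push`.
      - data/features.csv:
          push: false
  train:
    cmd: python train.py data/features.csv model.pkl
    deps:
      - train.py
      - data/features.csv
    outs:
      # Final model: `push` defaults to true, so it is uploaded as usual.
      - model.pkl
```

As noted in the thread, the trade-off is that anyone who wants to reuse run-cache results for `featurize` would have to re-run that stage, since its output is not available from the remote.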
||
|
||
<admon type="warn"> | ||
|
||
|
Let's also explain that DVC will pull that version from the source location even if it's overwritten, and will not push another copy of it to the remote.
cc @jorgeorpinel Is there somewhere in the data management user guide where we want to add this info as well?

We'll definitely need user guide (UG) updates to go over cloud versioning (feel free to open a separate docs issue); we can't explain everything in an option's text. For now I'd focus on what the flag does and put some explanation in the Description (which in this case is already super long and should eventually be rewritten or moved to the UG).

P.S. Specifically, there is some draft content about this in https://github.com/iterative/dvc.org/pull/4119/files#diff-d01612907e4ab14238625d537eaf42852d8566901d1bfe2fd3f2d4406a2d1dfc at the moment.