cmd: rewrite push/pull et al. #1602
Changes from 6 commits
7fa290e
be0423b
431ab3d
36ceb9a
491f271
308092c
7553c2f
b2b6204
dc87fe5
9af40e9
4717b78
246f446
999a35d
1d4d56d
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,55 +18,48 @@ positional arguments: | |
## Description | ||
|
||
The `dvc pull` and `dvc push` commands are the means for uploading and | ||
downloading data to and from remote storage. These commands are similar to | ||
`git pull` and `git push`, respectively (with some key differences given the | ||
nature of DVC, see details below). | ||
downloading data to and from remote storage (S3, SSH, GCS, etc.). These commands | ||
are similar to `git pull` and `git push`, respectively. | ||
|
||
[Data sharing](/doc/use-cases/sharing-data-and-model-files) across environments, | ||
and preserving data versions (input datasets, intermediate results, models, | ||
[metrics](/doc/command-reference/metrics), etc.) | ||
[remotely](/doc/command-reference/remote) are the two most common use cases for | ||
these commands. | ||
[metrics](/doc/command-reference/metrics), etc.) remotely are the most common | ||
use cases for these commands. | ||
|
||
The `dvc push` command allows us to upload data to remote storage. It doesn't | ||
save any changes to the code, `dvc.yaml`, or `.dvc` files (those should be saved | ||
with `git commit` and `git push`). | ||
The `dvc push` command allows us to upload data to | ||
[remote storage](/doc/command-reference/remote). It doesn't save any changes to | ||
the code, `dvc.yaml`, or `.dvc` files (that should be saved with `git commit` | ||
and `git push`). | ||
|
||
💡 For convenience, a Git hook is available to automate running `dvc push` after | ||
`git push`. See `dvc install` for more details. | ||
|
||
Under the hood a few actions are taken: | ||
The default remote is used (see `dvc config core.remote`) unless the `--remote` | ||
option is used. See `dvc remote` for more information on how to configure a | ||
remote. | ||
|
||
- The push command by default uses all stages (in `dvc.yaml` and `dvc.lock`) and | ||
`.dvc` files in the <abbr>workspace</abbr>. The command options will either | ||
limit or expand the set of stages or `.dvc` files to consult. | ||
Without arguments, it uploads all files and directories missing from remote | ||
storage, found as <abbr>outputs</abbr> in the stages (in `dvc.lock`) or `.dvc` | ||
dvc.yaml?
Hmmm yeah... Kinda. dvc.lock is where you find the output file names; that's what this meant. But probably this paragraph should be simplified and not even mention either one. See 246f446. |
||
files present in the workspace (the `--all-branches` and `--all-tags` enable | ||
using multiple workspace versions). | ||
|
||
- For each <abbr>output</abbr> referenced in every selected stage or `.dvc` | ||
file, DVC finds a corresponding file or directory in the <abbr>cache</abbr>. | ||
DVC then checks whether it exists in the remote. From this, DVC gathers a list | ||
of files missing from the remote storage. | ||
The `targets` given to this command (if any) limit what to push. It accepts | ||
paths to tracked files or directories (including paths inside tracked | ||
directories), `.dvc` files, or stage names (found in `dvc.yaml`). | ||
|
||
- Upload the cache files missing from remote storage, if any, to the remote. | ||
💡 For convenience, a Git hook is available to automate running `dvc push` after | ||
`git push`. See `dvc install` for more details. | ||
|
||
The DVC `push` command always works with a remote storage, and it is an error if | ||
none are specified on the command line nor in the configuration. The default | ||
remote is used (see `dvc config core.remote`) unless the `--remote` option is | ||
used. See `dvc remote` for more information on how to configure a remote. | ||
Under the hood, a few actions are taken: | ||
|
||
With no arguments, just `dvc push` or `dvc push --remote REMOTE`, it uploads | ||
only the files (or directories) that are new in the local repository to remote | ||
storage. It will not upload files associated with earlier commits in the | ||
<abbr>repository</abbr> (if using Git), nor will it upload files that have not | ||
changed. | ||
- The push command checks the appropriate `dvc.lock` and `.dvc` files in the | ||
dvc.yaml?
It's a command-reference, but I don't think this information should be here (perhaps on Dvc files?). Also, @jorgeorpinel,
could you clarify please what information exactly do you mean?
we come back to a single term "DVC files" w/o specifying details with some tooltip that mentions that DVC files == dvc.yaml + dvc.lock + .dvc . All of those files play their role here. We can expand later and bit explicit if needed where it is needed.
dvc.yaml is mentioned just before: "
This is not a bad idea... We do go over implementation details in cmd refs though, in fact they're probably the lowest-level docs we have. But we could def. hide some of those details in expandable sections.
Yes, let's try to merge this one as best we can though, and follow up on that in #1663 ?
I turn this whole bullet list into a more simple paragraph in dc87fe5. PTAL |
||
<abbr>workspace</abbr>. | ||
- For each <abbr>output</abbr> referenced in each stage or `.dvc` file, DVC | ||
finds a corresponding file or directory in the <abbr>cache</abbr>. DVC then | ||
gathers a list of files missing from the remote storage. | ||
- The cached files missing from remote storage, if any, are uploaded. | ||
|
||
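The three steps in the list above can be sketched in a few lines of Python. This is only an illustration of the described logic, not DVC's actual implementation: the `Remote` class, the `push` function, and the dictionary shapes used for tracked outputs and the cache are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Remote:
    """Hypothetical stand-in for remote storage, keyed by content hash."""

    objects: dict = field(default_factory=dict)

    def has(self, digest):
        return digest in self.objects

    def upload(self, digest, data):
        self.objects[digest] = data


def push(tracked_outputs, cache, remote):
    """Upload cached outputs that the remote does not have yet.

    tracked_outputs: mapping of output path -> hash, standing in for what
        would be read from dvc.lock and .dvc files (assumed structure).
    cache: mapping of hash -> bytes, standing in for the local cache.
    Returns the sorted list of hashes that were uploaded.
    """
    # Steps 1 and 2: find each tracked output in the cache and collect
    # the hashes that are missing from the remote.
    missing = sorted(
        {h for h in tracked_outputs.values() if h in cache and not remote.has(h)}
    )
    # Step 3: upload only the missing cached files.
    for h in missing:
        remote.upload(h, cache[h])
    return missing


# Usage: one output is already on the remote, the other is not.
cache = {"a1": b"data", "b2": b"model"}
remote = Remote({"a1": b"data"})
outputs = {"data/raw.csv": "a1", "model.pkl": "b2"}
uploaded = push(outputs, cache, remote)
print(uploaded)  # ['b2']
```

Running `push` a second time with the same inputs uploads nothing, which mirrors the documented behavior of only sending files missing from remote storage.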
The `dvc status -c` command can list files tracked by DVC that are new in the | ||
cache (compared to the default remote.) It can be used to see what files | ||
Note that the `dvc status -c` command can list files tracked by DVC that are new | ||
in the cache (compared to the default remote.) It can be used to see what files | ||
`dvc push` would upload. | ||
|
||
The `targets` given to this command (if any) limit what to push. It accepts | ||
paths to tracked files or directories (including paths inside tracked | ||
directories), `.dvc` files, or stage names (found in `dvc.yaml`). | ||
|
||
## Options | ||
|
||
- `-a`, `--all-branches` - determines the files to upload by examining | ||
|
should include dvc.lock here ...
True. That whole sentence is kind of a note though, now that I think about it. I changed it a little in 4717b78. It's still not perfect but we'll have to review this again soon, when addressing #1663.