Run "dvc push" before running "git push"? #1765

Closed
seisman opened this issue Feb 24, 2022 · 4 comments · Fixed by #1776
Labels
documentation Improvements or additions to documentation
Milestone

Comments

@seisman
Member

seisman commented Feb 24, 2022

pygmt/doc/contributing.md

Lines 640 to 641 in cd8dcef

git push
dvc push

The contributing guide tells users to run git push and then dvc push.

The problem is that when someone pushes changes to GitHub (git push), the dvc-diff.yml workflow is triggered immediately.

Once the workflow starts, it takes about 45 seconds to set everything up before it begins pulling images from DagsHub (i.e., running the dvc pull command). This means users must run dvc push shortly after git push; otherwise the dvc-diff.yml workflow will fail.

From the commit message of e30c708, it seems that the order of dvc push and git push doesn't matter. I think running dvc push before git push is a better and safer option.
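The proposed order could be sketched as a small shell helper. This is only an illustration, not text from the contributing guide: the function name `push_data_then_code` and the `RUN` variable (set it to `echo` to print the commands instead of running them) are hypothetical, and the remote name `upstream` is taken from the suggestion further down this thread.

```shell
# Sketch: upload DVC-tracked data first, then push the Git commits, so the
# images are already on the DVC remote when the CI workflow runs dvc pull.
# RUN is a hypothetical dry-run hook: RUN=echo prints each command instead.
push_data_then_code() {
    remote="${1:-upstream}"   # assumed DVC remote name

    # Show what differs between the local cache and the remote (for review).
    ${RUN:-} dvc status --remote "$remote" &&
    # Upload the data; if this fails, the chain stops before git push.
    ${RUN:-} dvc push &&
    # Only now push the commits, which triggers the workflow.
    ${RUN:-} git push
}
```

Running `RUN=echo; push_data_then_code` prints the three commands in order without touching any remote, which is a quick way to check the sequence.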

@weiji14 weiji14 added the question Further information is requested label Mar 1, 2022
@maxrjones
Member

I think we should also add dvc status --remote upstream before dvc push as an option for checking that the appropriate files will be pushed.

@seisman
Member Author

seisman commented Mar 1, 2022

> I think we should also add dvc status --remote upstream before dvc push as an option for checking that the appropriate files will be pushed.

Sounds good.

@seisman seisman added documentation Improvements or additions to documentation and removed question Further information is requested labels Mar 1, 2022
@seisman seisman added this to the 0.6.0 milestone Mar 1, 2022
@seisman
Member Author

seisman commented Mar 1, 2022

Do you know what the difference is between dvc status and dvc status --remote upstream? They give different outputs for me, and we use dvc status many times in the contributing guide.

$ dvc status
Data and pipelines are up to date.
$ dvc status --remote upstream
Cache and remote 'upstream' are in sync.

@maxrjones
Member

dvc status will tell you if there are differences between the "workspace" files and your local files in .dvc/cache while dvc status --remote upstream will tell you if there are differences between the files in your local .dvc/cache and the .dvc/cache of the remote repository set as upstream.

For example, if you start with an up-to-date repository and run dvc status and dvc status --remote upstream, you'll see the report that you mentioned above. If you then replace one of the dvc-tracked image files inside pygmt/tests/baseline with a new file, the output from dvc status changes but the output from dvc status --remote upstream does not (because the new file has not yet been added to your cache). If you then run dvc add and commit the changed .dvc file to git, dvc status reports Data and pipelines are up to date, while dvc status --remote upstream reports that the file is missing from the remote repository.
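The sequence above might look roughly like the following. Treat it as a sketch: the function name `compare_status_views`, the `RUN` dry-run variable, the commit message, and the image path in the usage note are all hypothetical, and the comments only restate what each command is expected to report according to the description above.

```shell
# Sketch of the walkthrough: compare the two status views before and after
# adding a replaced baseline image to the local cache.
# RUN is a hypothetical dry-run hook: RUN=echo prints each command instead.
compare_status_views() {
    img="$1"   # a DVC-tracked image that has just been overwritten

    # Before dvc add: the workspace differs from the local cache,
    # but the local cache still matches the remote.
    ${RUN:-} dvc status
    ${RUN:-} dvc status --remote upstream

    # Add the new file to the local cache and commit the updated .dvc file.
    ${RUN:-} dvc add "$img"
    ${RUN:-} git add "$img.dvc"
    ${RUN:-} git commit -m "Update baseline image"

    # After dvc add: the workspace matches the local cache again,
    # and the new file is now missing from the remote until dvc push.
    ${RUN:-} dvc status
    ${RUN:-} dvc status --remote upstream
}
```

For instance, `compare_status_views pygmt/tests/baseline/test_example.png` would run the steps for a placeholder image name; substitute a real tracked file.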

As an aside, I generally find dvc diff more informative than dvc status. It does not matter as much for PyGMT with its 1:1 matching of .dvc files to images, but for GMT it reports much more helpful information about which specific files were changed.

3 participants