list: document usage for data export/archive #1521
Comments
@jorgeorpinel User can do that with tar or zip no problem. Wouldn't bother with this until someone asks for this functionality and has good reasons why he can't use tar or zip 🙂 There is a special reason for |
What about huge data files? I'm talking about a lightweight copy of the repo as if you just cloned it with Git (but for |
@jorgeorpinel Ah, got it. Yes, I can now see the value there. Thanks for clarifying! 🙂 Please feel free to raise the priority if you need this feature, otherwise I would probably wait until there is a clear useful scenario in which someone will actually use this. |
No problem. Yes, I agree maybe no one really needs this haha. |
p.s. another variant of this feature could be something like |
Another alternative solution might also be provided by |
We've received a very similar question from a user https://opendatascience.slack.com/archives/CGGLZJ119/p1574762045023000?thread_ts=1574761369.020000&cid=CGGLZJ119 (Russian-only, sorry 🙁 ). Long story short, the guy is creating an archive with code and data to send to the customer, which is very similar to the idea from @jorgeorpinel described above. I've asked him to leave a comment here too. |
Hi, So, I would like to have something like |
@RomanSteinberg Thanks for your comment! I'm pretty sure Speaking about |
Thanks guys! There's some confusion though, my original idea here is for DVC projects that DO NOT use a Git repository as base. There would be no .git/ dir or .gitignore file.
@efiop actually I was not thinking to include the tracked data in the export. So maybe Maybe the idea of Seems a bit risky to me, TBH. You would lose any outputs from other Git versions (not linked in checked out DVC-files) and if there's no other copy of the project, it's gone forever. |
So to summarize, we're basically talking about 2 different things:
|
@jorgeorpinel I couldn't imagine that someone would use dvc without git. I don't understand this case at all. How can one version data and not version code? So, I can't give any feedback about your idea. |
In fact without Git you could not version the data either. But still we offer |
Just to clarify, we could have
Then Use case: deploy releases without SCM/DVC: I'd agree that |
For the record: |
iterative/dvc#4108 I see :) |
Right, so to archive a snapshot:
git archive -o code.zip HEAD
dvc list . -R --dvc-only | zip -@ data.zip
Perfect! Maybe we just need to put a note about this in the dvc list cmd ref. and link from a few more places? If so please move this issue to the docs repo. Thanks |
Closed by #2075. |
For projects created with dvc init --no-scm, since there's no Git repo to version all the files NOT tracked by DVC (code, DVC-files), it could be useful to have a dvc export <external-location> command to easily create a lightweight copy of the project (for backup). It's "lightweight" because it wouldn't include any of the data tracked by DVC. Similar to git archive. Could even include an --archive flag to make a tar/zip bundle of the export.
Just a random idea! (It came from reading some conversations about non-Git projects on Discord.)
UPDATE: To see the latest discussion go to #1521 (comment), but in summary:
Don't need a new command for now, just document this to archive a snapshot:
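The two commands agreed on in the discussion are `git archive -o code.zip HEAD` for the Git-tracked files and `dvc list . -R --dvc-only | zip -@ data.zip` for the DVC-tracked data. Below is a minimal sketch that wraps them in one shell function. The function name `snapshot` and its output-directory argument are hypothetical, not part of DVC. It assumes a Git-backed DVC project, run from the project root, with `git`, `dvc`, and `zip` on `PATH`, and with the tracked data already present in the workspace, since `zip` reads the listed files from disk.

```shell
# Hypothetical helper (not a DVC command): write code.zip and data.zip
# into the directory given as $1 (default: current directory).
snapshot() {
    out_dir=${1:-.}
    mkdir -p "$out_dir" || return 1

    # Files tracked by Git at the current commit (code, DVC-files).
    git archive -o "$out_dir/code.zip" HEAD || return 1

    # Paths of DVC-tracked files, one per line, passed to zip on stdin (-@).
    dvc list . -R --dvc-only | zip -@ "$out_dir/data.zip"
}
```

Keeping the two archives separate mirrors the original commands: `code.zip` stays lightweight, and `data.zip` can be skipped or shipped on its own.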