June gems #1510
consuming, and the dependencies and outputs haven't changed. You can use the
`--no-exec` flag to get around this:

```
add `dvc`
add `$` before the command, here and in other places
there are some bugs like this in the previous Gems btw
good catch. might be good to revise previous gems then too

_Just like this but with technical documentation._

### Q: After I pushed my local data to remote S3 storage, I noticed the file names are different in S3- they're hash values. [Can I make them more meaningful names?](https://discord.com/channels/485586884165107732/563406153334128681/717737163122540585)
no need to mention S3 - we can generalize it
also, would be great to briefly provide motivation, e.g. deduplication, security (files are immutable), GitFlow, etc.
In addition to `dvc list`, mention the data registry article and/or other commands: `dvc get`, `dvc import`, Python `dvc.api`. All of them provide a holistic data access layer for DVC-tracked objects (files, ML models, directories), which can usually be used as a drop-in replacement for regular data access libraries (e.g. `aws boto`, `aws cli`, in the case of S3)
OK have developed this answer more in the next version, let me know what you think
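For reference, the Python side of the data access layer mentioned above could look roughly like the sketch below. It assumes the `dvc` package is installed and relies on `dvc.api.open` and `dvc.api.get_url`; the wrapper names `read_tracked_text` and `tracked_url` are hypothetical and only there for illustration.

```python
from typing import Optional


def read_tracked_text(path: str, repo: str, rev: Optional[str] = None) -> str:
    """Read a DVC-tracked text file by its human-readable path.

    `path` is the file name as it appears in the Git repo, not the
    hash-based name used in remote storage.
    """
    import dvc.api  # imported lazily so the sketch loads without DVC installed

    with dvc.api.open(path, repo=repo, rev=rev) as handle:
        return handle.read()


def tracked_url(path: str, repo: str) -> str:
    """Return the storage URL that DVC resolves for a tracked file."""
    import dvc.api

    return dvc.api.get_url(path, repo=repo)
```

A call such as `read_tracked_text("get-started/data.xml", repo="https://github.com/iterative/dataset-registry")` should fetch that file from the public dataset registry, assuming network access; the caller never has to deal with the hash-based names in storage.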
@shcheklein revisions are pushed
### Q: After I pushed my local data to remote storage, I noticed the file names are different in my storage repository- they're hash values. [Can I make them more meaningful names?](https://discord.com/channels/485586884165107732/563406153334128681/717737163122540585)

No, but for a good reason! What you're seeing are cached files, and they're
stored in a special format that makes DVC versioning and addressing possible-
format -> way (we don't change format, it might confuse some folks). CSV stays CSV, we only change its name
good point
i'm going to say "naming convention"
This has been updated in the latest commit
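To make the "naming convention" point concrete, here is a minimal sketch of content-addressed naming. It assumes an MD5 digest split into a two-character directory plus the remaining characters, which is how the DVC cache appeared to be laid out at the time of this PR; `cache_relpath` is a hypothetical helper, not a DVC function.

```python
import hashlib


def cache_relpath(content: bytes) -> str:
    """Derive a storage path from file contents alone.

    Assumption: MD5 hex digest, first two characters used as a directory
    and the remaining thirty as the file name.
    """
    digest = hashlib.md5(content).hexdigest()
    return f"{digest[:2]}/{digest[2:]}"


# Identical contents always map to the same path, whatever the original
# file was called, so duplicates are stored once.
same_a = cache_relpath(b"hello")
same_b = cache_relpath(b"hello")

# Any change to the contents produces a different path, so an existing
# stored object is never overwritten in place.
changed = cache_relpath(b"hello!")

print(same_a)
print(same_a == same_b, same_a == changed)
```

The file itself is untouched (a CSV stays a CSV); only the name under which it is stored is derived from its contents.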
I think all issues are addressed. Aiming to publish tomorrow AM so let's merge then?
Co-authored-by: Restyled.io <[email protected]>
minor comments
DVC cache and remote if the contents of your dataset change frequently.

Generally, we would recommend first trying a plain unzipped directory. DVC is
designed to work with large numbers of files (on the order of millions) and has
extra `has` :)
must set the `endpointurl` too. For example:

```dvc
$ dvc remote add -d myremote s3://mybucket/path/to/dir
isn't it best to use long flag names for commands in documentation? `--default` better than `-d`? Otherwise people may accidentally change their default remote when copy-pasting this command.
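As a sketch of what the long-flag version could look like when scripted, the helper below builds (but does not run) the two commands involved: `dvc remote add --default` and `dvc remote modify <name> endpointurl <url>`. The helper name `remote_setup_commands` and the endpoint `https://s3.example.com` are placeholders assumed for illustration; only the remote name and bucket URL come from the snippet under review.

```python
import shlex
from typing import List


def remote_setup_commands(name: str, url: str, endpoint_url: str) -> List[List[str]]:
    """Build the argv lists for adding an S3-compatible default remote.

    Nothing is executed here; the lists can be passed to subprocess.run
    by the caller if DVC is installed.
    """
    return [
        # Long flag spelled out so the effect on the default remote is explicit.
        ["dvc", "remote", "add", "--default", name, url],
        # S3-compatible storage also needs its endpoint configured.
        ["dvc", "remote", "modify", name, "endpointurl", endpoint_url],
    ]


commands = remote_setup_commands(
    "myremote",
    "s3://mybucket/path/to/dir",
    "https://s3.example.com",  # placeholder endpoint, not from the original
)
rendered = [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]
for line in rendered:
    print("$ " + line)
```

Spelling out `--default` makes it obvious to someone copy-pasting the first command that it changes the default remote.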
June Gems are here. Going to try to get it out while it's still June! :)