api: create docs #908
More Qs. Will try to answer myself but any answers are welcome.
- `dvc.exceptions.NoRemoteError` - no `remote` is found.

## Example: Use data tracked in a DVC repository online
I don't quite understand the name, to be honest
How about "Use data from a DVC repository online"?
I would remove online
... instead, in rare cases we can figure out how to specify "offline"/local repos ... if it's needed at all
I see the word "online" is throwing you off. I removed it from the text below but I'm not sure how best to replace it here. I can't say "remote DVC repo" because that's confusing: DVC remotes are not repositories but storage (cache backups, basically).
"...repo on the cloud" ? Sounds like marketing.
"...repo on GitHub" ? Too specific for the title.
Notice the word "online" is also used in the explanation of the `repo` param, so maybe it's OK to keep it here too.
Just omit it? :)
OK. I guess using local repos is an edge case? Currently that's only shown as a side note in the 2nd example (and explained in the `remote` param). Removed.
    'get-started/data.xml',
    repo='https://github.com/iterative/dataset-registry'
) as fd:
    xmldom = parse(fd)
A super relevant example would be to show a SAX or StAX parser instead of a DOM one; that's where it shines. Or we can make the CSV example the main one and show how we process it in a streaming fashion (e.g. calculating a sum or average). It would show the "streaming" aspect of `open()` much better.
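For illustration, here is a minimal sketch of the SAX idea using only the standard library. It assumes the object yielded by `dvc.api.open()` behaves like an ordinary file object; an `io.BytesIO` stands in for it here, and the `<row>` element name and the `RowCounter` / `count_rows` helpers are hypothetical, not taken from the PR.

```python
import io
import xml.sax


class RowCounter(xml.sax.ContentHandler):
    """Counts <row> elements as they are parsed, without building a DOM."""

    def __init__(self):
        super().__init__()
        self.rows = 0

    def startElement(self, name, attrs):
        # Called once per opening tag, as soon as the parser reaches it.
        if name == "row":
            self.rows += 1


def count_rows(fd):
    """Parse a binary file-like object incrementally and return the row count."""
    handler = RowCounter()
    xml.sax.parse(fd, handler)
    return handler.rows


# Stand-in for the file object from dvc.api.open() (an assumption for this sketch).
sample = io.BytesIO(b"<posts><row Id='1'/><row Id='2'/><row Id='3'/></posts>")
print(count_rows(sample))  # prints 3
```

Unlike the DOM version, the handler only keeps a counter in memory, so the size of the XML file does not affect memory use.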
> make CSV example the main one and show how we process it in a streaming fashion (e.g. calculating sum or avg)

Thinking about this, I don't think we're talking about real-time data streaming (e.g. from a Kafka server), where continuously calculating a metric would be logical. Or maybe I missed the point?
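For reference, the CSV suggestion quoted above could look roughly like the sketch below. It is not real-time streaming; it only reads the file row by row instead of loading it whole. An `io.StringIO` stands in for the text-mode file object that `dvc.api.open()` would yield, and the `score` column and `column_mean` helper are made up for the example.

```python
import csv
import io


def column_mean(fd, column):
    """Compute the mean of a numeric column, reading one row at a time."""
    total = 0.0
    count = 0
    for row in csv.DictReader(fd):
        total += float(row[column])
        count += 1
    # Return None for an empty file rather than dividing by zero.
    return total / count if count else None


# Stand-in for the file object from dvc.api.open() (an assumption for this sketch).
sample = io.StringIO("id,score\n1,10\n2,20\n3,60\n")
print(column_mean(sample, "score"))  # prints 30.0
```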
I merged the PR, but still thinking about this one. What's the advantage of streaming files in open/read? Probably just making a big file available quickly so you can start processing it before it's all downloaded, but again, I don't think you'll want to show the progress of such processing, or is that a major use case you guys see? Cc @Suor @shcheklein
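One way to picture the advantage being asked about: a consumer that reads fixed-size chunks can start working as soon as the first bytes arrive and never holds the whole file in memory. The sketch below assumes a binary file-like object (for example, one opened in binary mode); `io.BytesIO` stands in for it, and `sha256_of_stream` is a hypothetical helper.

```python
import hashlib
import io


def sha256_of_stream(fd, chunk_size=64 * 1024):
    """Hash a binary stream chunk by chunk, using constant memory."""
    digest = hashlib.sha256()
    # iter() with a sentinel keeps calling fd.read() until it returns b"".
    for chunk in iter(lambda: fd.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a large streamed file (an assumption for this sketch).
sample = io.BytesIO(b"x" * 200_000)
print(sha256_of_stream(sample))
```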
Moved this discussion to a new PR: #1037
PTAL ... we are almost there - this time mostly some content improvements, minor fixes
per #908 (review) and several other comments
It's good to merge. Up to you to address the other comments that are left.
Thanks for the review. Seems like it's all addressed and also per #908 (review) this is marked mergeable.
* api: update to latest definitions to match iterative/dvc.org/pull/908
* Update dvc/api.py
* metrics show: update -h output to match docs per iterative/dvc.org@2c34521
* api: update docstrings to match iterative/dvc.org/pull/908
* api: refactor DVC repo check in get_url, and document it
* dvc: cosmetic edits as I explored exceptions that api functions may raise
* api: copy default info to read() docstring from open() per #3426 (review)
* api: improve open() docstring for clarity and add example per #3426 (review)
* api: remove unnecessary info from get_url docstring per #3426 (comment)
* api: produces->generated in open() docstring
* api: simplify open docstring per #3426 (comment)

Co-authored-by: Ruslan Kuprieiev
* api: update to latest definitions to match iterative/dvc.org/pull/908
* Update dvc/api.py
* metrics show: update -h output to match docs per iterative/dvc.org@2c34521
* api: update docstrings to match iterative/dvc.org/pull/908
* api: refactor DVC repo check in get_url, and document it
* dvc: cosmetic edits as I explored exceptions that api functions may raise
* api: copy default info to read() docstring from open() per #3426 (review)
* api: improve open() docstring for clarity and add example per #3426 (review)
* api: remove unnecessary info from get_url docstring per #3426 (comment)
* api: produces->generated in open() docstring
* api: simplify open docstring per #3426 (comment)
* term: "metrics" plural in output messages iterative/dvc.org#1848 (review)
* typo

Co-authored-by: Ruslan Kuprieiev
Closes #463