-
Notifications
You must be signed in to change notification settings - Fork 394
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
4504: Existence of the configuration option explained #1941
Conversation
- `core.jobs` - number of workers started. Accepts positive integers. The | ||
default value is `4 * cpu_count()`. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no such option. You probably want to add jobs
for remote section. https://dvc.org/doc/command-reference/remote/modify#available-parameters-for-all-remotes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. I don't understand for now how the config sections merge and under what circumstances, thus some confusions are coming up. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanations! I'm confused a little now with the documentation sections that I have to update.
@@ -35,6 +35,9 @@ manual editing could be used to change the configuration. | |||
- `-u`, `--unset` - delete configuration value for the given config `option`. | |||
Don't provide a `value` when employing this flag. | |||
|
|||
- `--jobs` - number of workers started. Accepts positive integers. The default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, it looks like I added two identical explanations of identical command options. Is there a place and need to explain configuration field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@I159 dvc remote
doesn't have --jobs
flag though. It has jobs
config option. See verify
in "Available parameters for all remotes" below and add the same thing for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Co-authored-by: Ruslan Kuprieiev <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! As noted in #1941 (comment) , the only thing left is to add that --jobs
note to every command that uses `--jobs).
content/docs/command-reference/gc.md
content/docs/command-reference/status.md
content/docs/command-reference/push.md
content/docs/command-reference/fetch.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @I159! Please try to address the few comments left by Ruslan and myself, but in general this seems correct and we can also finish this up if needed.
Co-authored-by: Jorge Orpinel <[email protected]>
Thank you @jorgeorpinel and @efiop! The notes addressed. Looks better now, also brought some understanding. |
content/docs/command-reference/gc.md
Outdated
the default is `4`. Using more jobs may improve the overall transfer speed. | ||
the default is `4`. Note that the default value can be set using the `jobs` | ||
config option with `dvc remote modify`. Using more jobs may improve the | ||
overall transfer speed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gc
is not about transfer, @efiop do we parallelize delete operations in gc
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't. Same as with status, this might speedup checking files on remotes. Could just delete the last sentence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gc is not about transfer
Changed to "connection speed" in gc and status for now.
|
||
Accepts positive integers. The default value is `4 * cpu_count()`. For SSH | ||
remotes, the default is `4`. Using more jobs may improve the overall transfer | ||
speed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, not sure transfer applies well to gc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also status
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is useful for most of the commands, so I don't think this is worth clarifying more here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using "connection speed" here as well for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
connection speed
has a very precise meaning though, and --jobs
doesn't change it. speedup the operation
sounds good. For pull/push - etc transfer is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"speed up the operation" it is, in gc and status. Note that in remote modify I removed that sentence. Applied in 89dd826 (see #1945 (review)).
transfer speed. | ||
SSH remotes, the default is `4`. Note that the default value can be set using | ||
the `jobs` config option with `dvc remote modify`. Using more jobs may improve | ||
the overall transfer speed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, status
- what exactly do we parallelize here? especially considering that exists checks can be complicated (vary from remote to remote)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly parallelizing remote listings (depends on specific implementation and wether there is an index and how complete it is).
Merged this to move on, but lmk if you see pending problems, guys. Also, thanks again @I159 ! |
Per iterative/dvc/pull/4872 (rel iterative/dvc/issues/4504)
❗ Please read the guidelines in the Contributing to the Documentation list if you make any substantial changes to the documentation or JS engine.
🐛 Please make sure to mention
Fix #issue
(if applicable) in the description of the PR. This causes GitHub to close it automatically when the PR is merged.Please choose to allow us to edit your branch when creating the PR.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏