Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4504: Existence of the configuration option explained #1941

Merged
merged 13 commits into from
Nov 16, 2020

Conversation

I159
Copy link
Contributor

@I159 I159 commented Nov 13, 2020

Per iterative/dvc/pull/4872 (rel iterative/dvc/issues/4504)

You may disregard these recommendations if you used the Edit on GitHub button from dvc.org to improve a doc in place.

❗ Please read the guidelines in the Contributing to the Documentation list if you make any substantial changes to the documentation or JS engine.

🐛 Please make sure to mention Fix #issue (if applicable) in the description of the PR. This causes GitHub to close it automatically when the PR is merged.

Please choose to allow us to edit your branch when creating the PR.

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

Comment on lines 98 to 100
- `core.jobs` - number of workers started. Accepts positive integers. The
default value is `4 * cpu_count()`.

Copy link
Contributor

@efiop efiop Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no such option. You probably want to add jobs for remote section. https://dvc.org/doc/command-reference/remote/modify#available-parameters-for-all-remotes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I don't understand for now how the config sections merge and under what circumstances, thus some confusions are coming up. Thanks!

Copy link
Contributor Author

@I159 I159 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanations! I'm confused a little now with the documentation sections that I have to update.

@@ -35,6 +35,9 @@ manual editing could be used to change the configuration.
- `-u`, `--unset` - delete configuration value for the given config `option`.
Don't provide a `value` when employing this flag.

- `--jobs` - number of workers started. Accepts positive integers. The default
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, it looks like I added two identical explanations of identical command options. Is there a place and need to explain configuration field?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@I159 dvc remote doesn't have --jobs flag though. It has jobs config option. See verify in "Available parameters for all remotes" below and add the same thing for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@I159 I159 requested a review from efiop November 13, 2020 16:34
@I159 I159 requested a review from efiop November 13, 2020 16:49
Copy link
Contributor

@efiop efiop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! As noted in #1941 (comment) , the only thing left is to add that --jobs note to every command that uses `--jobs).

content/docs/command-reference/gc.md
content/docs/command-reference/status.md
content/docs/command-reference/push.md 
content/docs/command-reference/fetch.md

jorgeorpinel
jorgeorpinel previously approved these changes Nov 14, 2020
Copy link
Contributor

@jorgeorpinel jorgeorpinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @I159! Please try to address the few comments left by Ruslan and myself, but in general this seems correct and we can also finish this up if needed.

@I159
Copy link
Contributor Author

I159 commented Nov 15, 2020

Thank you @jorgeorpinel and @efiop! The notes addressed. Looks better now, also brought some understanding.

@I159 I159 requested a review from efiop November 15, 2020 20:27
the default is `4`. Using more jobs may improve the overall transfer speed.
the default is `4`. Note that the default value can be set using the `jobs`
config option with `dvc remote modify`. Using more jobs may improve the
overall transfer speed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gc is not about transfer, @efiop do we parallelize delete operations in gc?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't. Same as with status, this might speedup checking files on remotes. Could just delete the last sentence.

Copy link
Contributor

@jorgeorpinel jorgeorpinel Nov 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gc is not about transfer

Changed to "connection speed" in gc and status for now.


Accepts positive integers. The default value is `4 * cpu_count()`. For SSH
remotes, the default is `4`. Using more jobs may improve the overall transfer
speed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, not sure transfer applies well to gc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also status

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is useful for most of the commands, so I don't think this is worth clarifying more here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using "connection speed" here as well for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

connection speed has a very precise meaning though, and --jobs doesn't change it. speedup the operation sounds good. For pull/push - etc transfer is fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"speed up the operation" it is, in gc and status. Note that in remote modify I removed that sentence. Applied in 89dd826 (see #1945 (review)).

transfer speed.
SSH remotes, the default is `4`. Note that the default value can be set using
the `jobs` config option with `dvc remote modify`. Using more jobs may improve
the overall transfer speed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, status - what exactly do we parallelize here? especially considering that exists checks can be complicated (vary from remote to remote)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly parallelizing remote listings (depends on specific implementation and wether there is an index and how complete it is).

@jorgeorpinel jorgeorpinel self-assigned this Nov 16, 2020
@jorgeorpinel jorgeorpinel dismissed their stale review November 16, 2020 19:03

I'm taking this one over.

@jorgeorpinel jorgeorpinel merged commit e12ba4a into iterative:master Nov 16, 2020
@jorgeorpinel
Copy link
Contributor

Merged this to move on, but lmk if you see pending problems, guys.

Also, thanks again @I159 !

jorgeorpinel added a commit that referenced this pull request Nov 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants