-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] pull: display a lot of jobs #3453
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, @JIoJIaJIu, it'd be good if you could create a separate PR and close this if you intend to change the implementation. 🙂
It could be useful later.
It's definitely a good first step. The main intention is too show that something is happening and probably it's fine to limit the number of
how would it look like? do we show names, or only number of files? can we take a look at how other tools like |
@skshetry I have plans to change implementation, but would like to get your opinion about the current one. Maybe things change I will keep/improve this one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you're trying to do here, but it's not really a full fix (it's just the temporary patch I was thinking of in #2905 (comment)).
There's a more meta comment which maybe needs to be addressed first: #2905 (comment)
Hi @casperdcl , you mentioned that it's not a full fix, but how do you see it should work at the latest stage?
As I got for network operations user gets benefit with multiple open connections, it would be great if you can share some info about it @efiop. Another point that it's just an interface - we provide option to point |
@JIoJIaJIu there are a few small bugs I can see with this PR but it's not really worth getting into if we can avoid it altogether by say setting Apart from that (in a general use case/if we decide there should be no upper bound on threads) there probably should be a PR on the upstream https://github.com/tqdm/tqdm repo... Anyway for DVC purposes, the best way to go about this is to collect info about all tasks first (e.g. total number of bytes) and have them all report back to just one total bar if there are more than say 10 tasks. @shcheklein / @efiop lemme know if actually you would prefer some per-task info printed at least after each task completes. If you do, this may be fairly easy to solve using something similar to this PR. |
@casperdcl there are definitely cases where 10 or even 20 jobs is not enough. Since some workloads are IO bound it might be beneficial to have 100 jobs, so I doubt that it's a good solution to limit to some upper bound. Could we check some other tools that do parallel processing? sync tools, package managers, etc .. to see some ideas and how people solve this problem. |
Well I thought originally nobody wanted this approach (#2905 (comment)) but it looks like a temporary patch is indeed best for now until jobs are combined into a single bar in other PR(s). Rather than commenting a lot on this PR I'll open a new one. |
@JIoJIaJIu feel free to take over #3500 |
Closing in favour or #3500 |
Disclaimer
I would like to change the implementation completely but shared for collaboration
How to try
There is a script for simulating behavior
Before
After
TLDR
I tried an approach with limiting amount of visible pbars but would like to change the solution to the single bar that contains information about all files.
The idea was show only N pbars, and when one is finished - add new from the background. It allows to avoid flickering, but it has problems:
Tqdm
stuffClose #2905
❗ Have you followed the guidelines in the Contributing to DVC list?
📖 Check this box if this PR does not require documentation updates, or if it does and you have created a separate PR in dvc.org with such updates (or at least opened an issue about it in that repo). Please link below to your PR (or issue) in the dvc.org repo.
❌ Have you checked DeepSource, CodeClimate, and other sanity checks below? We consider their findings recommendatory and don't expect everything to be addressed. Please review them carefully and fix those that actually improve code or fix bugs.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏