Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pull: going completely crazy on large dataset #2905

Closed
shcheklein opened this issue Dec 6, 2019 · 12 comments · Fixed by #3500
Closed

pull: going completely crazy on large dataset #2905

shcheklein opened this issue Dec 6, 2019 · 12 comments · Fixed by #3500
Assignees
Labels
bug Did we break something? p1-important Important, aka current backlog of things to do research ui user interface / interaction

Comments

@shcheklein
Copy link
Member

shcheklein commented Dec 6, 2019

Version:

DVC version: 0.73.0
Python version: 3.6.8
Platform: Linux-4.15.0-1056-aws-x86_64-with-Ubuntu-18.04-bionic
Binary: False
Package: None
Filesystem type (cache directory): ('xfs', '/dev/nvme1n1')
Filesystem type (workspace): ('xfs', '/dev/nvme1n1')

Reproduce:

git clone [email protected]:iterative/dataset-registry-private.git
cd dataset-registry-private
dvc pull -j 100 ILSVRC.dvc

Output:

(sorry for the video in this format, was the fastest way to capture it while it was running)

https://www.dropbox.com/s/m821sgc1flk770o/dvc-pull-going-crazy.mov?dl=0

@triage-new-issues triage-new-issues bot added the triage Needs to be triaged label Dec 6, 2019
@shcheklein shcheklein added bug Did we break something? ui user interface / interaction labels Dec 6, 2019
@triage-new-issues triage-new-issues bot removed the triage Needs to be triaged label Dec 6, 2019
@shcheklein shcheklein changed the title pull: linking state going completely crazy on large dataset pull: going completely crazy on large dataset Dec 6, 2019
@Suor
Copy link
Contributor

Suor commented Dec 6, 2019

Should be fixed by combining pbars. @casperdcl

@casperdcl
Copy link
Contributor

-j 100 would be the issue (nested bars scrolling off the screen). I should be able to come up with a temp patch for now. P.S. what happened with the old method? No scrolling but lots of flickering and very slow?

@casperdcl casperdcl self-assigned this Dec 6, 2019
@casperdcl casperdcl added the p1-important Important, aka current backlog of things to do label Dec 6, 2019
@Suor
Copy link
Contributor

Suor commented Dec 6, 2019

@casperdcl what's the point? Let's go for the actual fix, i.e. combining into a single pbar.

@casperdcl
Copy link
Contributor

The point would be I can downgrade a p1 to p2/p3 today, but don't think I'll have time for a full fix soon.

@Suor
Copy link
Contributor

Suor commented Dec 6, 2019

@casperdcl p1 is not p0 (fix asap), p1 can wait. TBH, I see combining threads as top priority in pbars (maybe even UI as a whole) now since it's the highest value, so no point doing something else instead/in the meantime unless it makes dvc unusable.

@efiop
Copy link
Contributor

efiop commented Dec 6, 2019

Totally agree with @Suor .

@casperdcl
Copy link
Contributor

Btw dvc push (without any explicit multithread args) also has this issue.

@gurobokum
Copy link
Contributor

The problem appears when amount of the bars is more then rows in the terminal.
We use threads where Tqdm instance is created. So when we have amount of jobs (threads) are more then rows in the terminal Tqdm draws bars on top each other

  • I saw the solution that pointed in the PR [WIP] pull: display a lot of jobs #3453, with hidden bars:
    limit visible bars to some number, and replace finished one with some from background. Unfortunately solution is not ideal, and interacts with Tqdm internals, I think if needs to apply something like this then better to do it on the library side.
  • Another way is to return from self.display when self.pos >= ROWS, but there are also problems with internals

Hi @casperdcl, I know that you are an author of the Tqdm library (awesome tool 👍 🥇 ), could you take a look on my PR and share your thoughts please?

Another approach is to limit bars to only to one and show accumulated progress. It requires to rework few Tqdm calls in dvc pull, dvc push, dvc add(on adding >1Gb files), provide wrapper that can retrieve updates through Queue and draws single bar. I like that approach but it will affect a lot of places of the system

@casperdcl
Copy link
Contributor

casperdcl commented Mar 16, 2020

I should ask - are there any opetations where a user would ever benefit from having more than about 5 threads? Surely I/O is a bottleneck for anything more? Maybe we can just set an upper limit to number of threads?

@efiop
Copy link
Contributor

efiop commented Mar 18, 2020

@casperdcl 5 is not a limit for many applications. Downloading millions of tiny files from s3 benefits from dozens of workers, for example.

@efiop
Copy link
Contributor

efiop commented Mar 18, 2020

@casperdcl "limiting number of threads because of the progress bar is a very weird approach, feels like we are not solving the right issue here 😄

@casperdcl
Copy link
Contributor

not because of the progress bar, hah. I'm just asking if we can avoid solving the progress bar problem by solving a different problem.

Apparently not. :)

@efiop efiop assigned casperdcl and unassigned gurobokum Mar 24, 2020
casperdcl added a commit to casperdcl/dvc that referenced this issue Mar 28, 2020
efiop pushed a commit that referenced this issue Mar 29, 2020
* progress: limit maximum running bars

Fixes #2905
Closes #3453

* progress: add hidden message for overflow bars

* progress: fix nested bars properly

Just use latest `tqdm` :)
casperdcl added a commit to casperdcl/dvc that referenced this issue Apr 2, 2020
casperdcl added a commit that referenced this issue Apr 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Did we break something? p1-important Important, aka current backlog of things to do research ui user interface / interaction
Projects
None yet
6 participants