On Windows, dvc repro spends an awful lot of time in rwlock #3653
Comments
Thanks for the investigation! That is indeed pretty strange, we need to take a closer look. Could you show your
Actually, that's what we used to do, but rwlock was introduced to support granular locking, so that you could run dvc in parallel.
Useful link: https://dvc.org/doc/user-guide/dvc-files-and-directories 🙂 |
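For readers following along, here is a rough sketch of what "granular locking" can mean: instead of one lock for the whole repository, each process records which paths it reads and which it writes, and only conflicting claims are rejected. This is not dvc's actual implementation; the file layout, the `acquire` function and the `LockError` class are all hypothetical, and a real version would also need a short-lived repo-wide lock around the edit itself.

```python
import json
import os


class LockError(Exception):
    """Raised when a path is already claimed in a conflicting mode."""


def _load(lock_file):
    # A missing file means nobody holds any lock yet.
    if not os.path.exists(lock_file):
        return {"read": {}, "write": {}}
    with open(lock_file) as fobj:
        return json.load(fobj)


def acquire(lock_file, pid, read=(), write=()):
    """Register `pid` as reader/writer of the given paths, or raise LockError."""
    state = _load(lock_file)

    # Nobody may read or write a path that another process is writing.
    for path in list(read) + list(write):
        writer = state["write"].get(path)
        if writer is not None and writer != pid:
            raise LockError(f"{path} is being written by pid {writer}")

    # Nobody may write a path that another process is reading.
    for path in write:
        readers = [p for p in state["read"].get(path, []) if p != pid]
        if readers:
            raise LockError(f"{path} is being read by pids {readers}")

    # No conflicts: record the claims and persist them.
    for path in read:
        readers = state["read"].setdefault(path, [])
        if pid not in readers:
            readers.append(pid)
    for path in write:
        state["write"][path] = pid
    with open(lock_file, "w") as fobj:
        json.dump(state, fobj)
```

With a scheme like this, two processes that touch disjoint outputs can run side by side, which a single repository-wide lock would not allow. The cost is that the lock file is rewritten for every stage, which is where the time reported in this issue goes.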
@charlesbaynham Btw, have you tried running that on Linux? I wonder if this is Windows-specific or if we have trouble everywhere. |
Thanks for the rundown @efiop, that's helpful. The version that ran the above was I'll give it a try on Linux: annoyingly, the repository I tested on has some case-sensitivity problems which I've glossed over so far on Windows but which prevent it from working on Linux, but I should fix that anyway. The cProfile results for my repo on Windows: |
Here's the cProfile results for It looks like the same fraction of time is spent writing to |
@charlesbaynham Sorry for the delay. Admittedly didn't have much time to look deeply into it, but could you give my patch a try, please:
? |
But doesn't that just effectively disable the lock? |
I also notice that we're making 1.4 million calls to |
Took a quick look at the profiler results to see if the excessive |
@charlesbaynham Sorry for the delay. It doesn't really disable it, just doesn't try to be paranoid about sync-ing to disk. We'll need to take a closer look. We don't have the capacity to do it right now, but we'll try to do that in the near future. |
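To make the distinction concrete: writing a file and closing it still leaves the lock visible to other processes, while an explicit `os.fsync` additionally forces the data onto the storage device, which can be slow. The sketch below assumes the patch amounts to dropping such a sync call; the patch itself is not shown in this thread, and `dump_lock` and `time_writes` are hypothetical helpers, not dvc functions.

```python
import json
import os
import tempfile
import time


def dump_lock(path, data, sync=False):
    """Write `data` as JSON; with sync=True also force it to disk."""
    with open(path, "w") as fobj:
        json.dump(data, fobj)
        if sync:
            fobj.flush()  # push Python's buffer to the OS
            os.fsync(fobj.fileno())  # ask the OS to write through to the device


def time_writes(sync, n=200):
    """Return the seconds taken to rewrite a small lock file n times."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "rwlock")
        start = time.perf_counter()
        for i in range(n):
            dump_lock(path, {"read": {}, "write": {}, "n": i}, sync=sync)
        return time.perf_counter() - start


if __name__ == "__main__":
    print(f"without fsync: {time_writes(False):.3f}s")
    print(f"with fsync:    {time_writes(True):.3f}s")
```

The numbers depend heavily on the OS, filesystem and disk, so this is only a way to check on a given machine whether syncing is the dominant cost. Either way the file content is the same, so other processes reading the lock file see the same entries.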
No need for any apology, the work you guys do is amazing. Maybe a compromise could be to only flush the lock if dvc determines that actual work needs to be done to reproduce a stage. In that scenario it'd be a small fraction of the time taken. |
The thing is that it modifies rwlock before checking that something has changed, so that no other stage is in the middle of modifying something that we will be checking. 🙁 It might be as simple as my patch that removes paranoid fsync-ing, but need to double check that. If not, I'm sure we'll find another way to optimize this. You and @pmrowla have also noticed something that we know is problematic: relpath calls. The solution there is not stop trying to |
@efiop, I think we can get rid of the In my synthetic benchmark, I see 12X improvement (146s vs 12s) just by removing We can assume Also, it's interesting to see that the |
@skshetry Sounds good. Let's get rid of it for now. |
BTW, we already discussed this |
You guys are great, thank you! |
@charlesbaynham, can you give |
Btw, for the record: we'll add a benchmark soon for this (or related) iterative/dvc-bench#73 . |
Windows 7 / 10, dvc 0.93.0+f5d89f (after merging #3645)
Running cProfile on dvc repro -P in a repository with around 200 steps, all of which are up to date. On my system this takes 65s, of which 42s is spent in _edit_rwlock.
I wonder, is it necessary to lock each individual file here? Couldn't the whole repository be locked instead?
Maybe a broader question: what are the respective purposes of .dvc/tmp/rwlock, .dvc/lock and .dvc/updater.lock?
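For anyone wanting to collect similar measurements, a minimal sketch of profiling a Python callable with the standard library's cProfile and pstats follows. The `profile_call` helper and the `busy` workload are hypothetical stand-ins, not dvc code, and this is not necessarily how the numbers above were gathered.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Cumulative time includes callees, so a hot wrapper function such as a
    # lock helper shows up near the top even if its own body is cheap.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def busy(n):
    # Stand-in workload; hypothetical, not part of dvc.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, report = profile_call(busy, 100_000)
    print(report)
```

Sorting by cumulative time is what makes a figure like "42s of 65s spent in one function" easy to read off the report.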