Performance regression - max_workers not being used #235
Comments
Possibly #211. I haven't merged it yet, but #217 is another change in a similar vein, with a better implementation of backups.
I checked out the commit immediately prior to #211, but I'm still seeing the same problem. EDIT: It is of course possible that the cause is something I've changed in the configuration rather than in the code.
This looks like the case described in #222, which is a rechunk operation with two stages of 5000 and 334 tasks. You can check by seeing if there is an This isn't a new problem, but it may have arisen after changing memory settings.
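To illustrate why a two-stage operation could show up on a timeline as a separate group of 334 tasks finishing later, here is a minimal sketch. It assumes the second stage cannot start until the first has finished; that barrier is an assumption for illustration, not a description of this project's executor. `run_in_stages` and the `max_workers=8` value are hypothetical names and settings.

```python
from concurrent.futures import ThreadPoolExecutor, wait
import threading


def run_in_stages(stage_sizes, max_workers):
    """Run each stage to completion before starting the next (hypothetical model)."""
    completed = []  # (stage_index, task_index) pairs, in completion order
    lock = threading.Lock()

    def task(stage, i):
        # Stand-in for real work: just record that this task finished.
        with lock:
            completed.append((stage, i))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for stage, n in enumerate(stage_sizes):
            futures = [pool.submit(task, stage, i) for i in range(n)]
            wait(futures)  # barrier: the next stage is not submitted until this one ends
    return completed


# Stage sizes taken from the thread: 5000 tasks, then 334 tasks (5334 in total).
completed = run_in_stages([5000, 334], max_workers=8)
stages = [stage for stage, _ in completed]
print(len(completed), stages.index(1))
```

Under that assumption, every one of the 334 second-stage tasks completes after all 5000 first-stage tasks, which would look like a distinct late bunch even if the worker pool is fully used within each stage.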
I'm pretty sure there was a performance regression in the last few weeks, which appears for the quadratic means workload on the timeline visualization:
Before:
After:
Other things changed too, so ignore the other differences between these plots. Just notice that for some steps there is a group of tasks (exactly 334 out of 5334) that complete as a separate bunch later on. Looking at this again makes me wonder whether this is instead some issue with the visualization...
@tomwhite you said you had an idea which commit might have caused this? If you tell me which one(s) you think are suspect, I can check out a version prior to that and try it out now.
Ideally this sort of thing would be caught by an automated regression test, see #234.