100% CPU usage during idle #51
Comments
Right now, this is partly on us (that loop you point out), partly on Timely (TimelyDataflow/differential-dataflow#114). A first step to rectifying this was event-driven scheduling landing in Timely, but some pieces are still missing. And it is not clear whether we even want it for large-scale, data-processing use cases. If you increase the polling timeout here: declarative-dataflow/src/bin/server.rs Line 222 in 038d659
Edit: tried it with 100 ms, which reduced the usage to 0.1%.
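To illustrate why the timeout matters, here is a minimal sketch using only the standard library. It is not the code in server.rs; `poll_loop` and its parameters are hypothetical names, and a `std::sync::mpsc` channel stands in for whatever the server actually polls. With a zero timeout the loop spins as fast as the CPU allows, while a longer timeout lets the thread sleep between checks.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Hypothetical polling loop: waits at most `timeout` per iteration for a
/// command and gives up after `run_for`. Returns (iterations, commands seen).
fn poll_loop(commands: &Receiver<String>, timeout: Duration, run_for: Duration) -> (u64, u64) {
    let deadline = Instant::now() + run_for;
    let mut iterations = 0;
    let mut received = 0;
    while Instant::now() < deadline {
        iterations += 1;
        match commands.recv_timeout(timeout) {
            // A command arrived; a real server would apply it here.
            Ok(_command) => received += 1,
            // Nothing arrived within `timeout`; go around again.
            Err(RecvTimeoutError::Timeout) => {}
            // All senders are gone, so no more commands can ever arrive.
            Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    (iterations, received)
}
```

The trade-off is latency: a command that arrives through some other path than the polled channel can wait up to one full timeout before the loop notices it.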
Right now I think we could make inputs completely blocking, because we always step the worker to catch up with all inputs before continuing the event loop. But that is something we want to keep flexible. Edit: nope, that would be dangerous, because the other workers would stop receiving commands.
Understandably, dedicated DF nodes could peg at 100%, but if adoption is a priority, then it seems worthwhile to be able to run DF (perhaps in debug mode) against blocking channels so it doesn't use up one of my eight cores. Since mutations arrive via evented websockets, it seems those events could wake up the poll thread. For distributed replication, maybe a layer of crossbeam-channel indirection could feed into the main loop?
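A rough sketch of the blocking-channel idea, for the single-worker case only. It uses `std::sync::mpsc` as a stand-in for crossbeam-channel, and a plain thread as a stand-in for the websocket handler; `Command`, `spawn_event_source`, and `run_blocking_loop` are hypothetical names, not part of the server.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread;
use std::time::Duration;

/// Hypothetical command type standing in for client requests.
#[derive(Debug)]
enum Command {
    Transact(u64),
    Shutdown,
}

/// Stand-in for the websocket handler: emits `n` commands, then a shutdown.
fn spawn_event_source(tx: Sender<Command>, n: u64) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for id in 0..n {
            thread::sleep(Duration::from_millis(5));
            if tx.send(Command::Transact(id)).is_err() {
                return; // the main loop has gone away
            }
        }
        let _ = tx.send(Command::Shutdown);
    })
}

/// Main loop that blocks on the channel instead of polling it.
fn run_blocking_loop(n: u64) -> Vec<u64> {
    let (tx, rx) = mpsc::channel();
    let source = spawn_event_source(tx, n);
    let mut applied = Vec::new();
    // `recv` puts the thread to sleep until a command arrives, so an idle
    // loop costs no CPU. It returns `Err` once every sender is dropped.
    while let Ok(command) = rx.recv() {
        match command {
            // A real server would step the worker here until it catches up.
            Command::Transact(id) => applied.push(id),
            Command::Shutdown => break,
        }
    }
    source.join().expect("event source panicked");
    applied
}
```

With more than one worker this shape runs into the problem noted earlier in the thread: a worker blocked on its own input would stop taking part in command sequencing for the others.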
Yeah, I agree. Does the polling timeout workaround help? With superficial testing I don't notice any immediate problems. Running with a blocking input source in a single-worker development mode would be a helpful feature. I tried it just now and it stalls sometimes, because the command-sequencing dataflow is not taken into consideration when the server decides whether more work is needed to process all inputs. Fixing that shouldn't be too hard, but needs a bit of work.
Update: server now comes with a "blocking" compile-time feature flag. Another potential problem there is sources, which is why we now have some scheduling logic (1a393c4). That has to be integrated and the sequencer dataflow has to be probed.
This is the first step to remediate #51 without running the risk of accidentally slowing down useful work.
* Add until_next() method to scheduler
* Use step_or_park in server loop
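The idea behind these two changes can be sketched as follows. This is an assumption-laden illustration, not the code from the commit: the `Scheduler` type, its fields, and the signature of `until_next` are hypothetical, and `std::thread::park_timeout` stands in for Timely's `step_or_park`, which takes an optional duration. The scheduler reports how long it is until the next source wants to run, and the loop parks for at most that long instead of spinning.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::time::{Duration, Instant};

/// Hypothetical scheduler: a min-heap of instants at which some source
/// has asked to be run again.
#[derive(Default)]
struct Scheduler {
    activations: BinaryHeap<Reverse<Instant>>,
}

impl Scheduler {
    /// Registers a future activation.
    fn schedule_at(&mut self, at: Instant) {
        self.activations.push(Reverse(at));
    }

    /// Time from `now` until the earliest activation. `None` means nothing
    /// is scheduled; a zero duration means an activation is already due.
    fn until_next(&self, now: Instant) -> Option<Duration> {
        self.activations
            .peek()
            .map(|Reverse(at)| at.saturating_duration_since(now))
    }

    /// Removes every activation that is due at `now` and returns the count.
    fn drain_due(&mut self, now: Instant) -> usize {
        let mut due = 0;
        while let Some(&Reverse(at)) = self.activations.peek() {
            if at > now {
                break;
            }
            self.activations.pop();
            due += 1;
        }
        due
    }
}

/// Stand-in for one idle step of the server loop: sleep until the next
/// activation is due, or indefinitely if nothing is scheduled.
fn park_until_next(scheduler: &Scheduler) {
    match scheduler.until_next(Instant::now()) {
        Some(wait) => std::thread::park_timeout(wait),
        // Needs another thread to call `unpark`, otherwise this never returns.
        None => std::thread::park(),
    }
}
```

Bounding the park by the scheduler's next deadline is what avoids slowing down useful work: the thread only sleeps through time in which no source has anything to do.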
Fixed, now that TimelyDataflow/timely-dataflow#268 landed in Timely master.
Not sure if this is Timely-related, websockets, or something about polling, but I'm seeing the release executable using 100% CPU during idle. Is this expected?
Probably this loop