Tracking profiling run results #686
The developer onboarding says that we currently use ASV. pyinstrument would be another option.
The maintainability (?) issue jumps out as something of a red flag to me, but:
- Checkout repository
- Setup conda
- Setup conda environment from developer/user docs
- Install pyinstrument into the environment
- Run pyinstrument producing an HTML output (and maybe a session output so we can reload later; a rough sketch of this step follows the list)
- Push HTML file somewhere? Maybe to a separate branch so that we can manually view the files with htmlpreview?
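To make the "run pyinstrument" step a little more concrete, here is a rough sketch using pyinstrument's Python API to produce both an HTML report and a reloadable session file. The `scale_run` import, its call signature and the output paths are placeholders, not the actual workflow code:

```python
# Sketch: profile a scale run with pyinstrument, writing an HTML report
# (for viewing via htmlpreview / GitHub Pages) and a raw session file
# (so the results can be reloaded and re-rendered later).
from pathlib import Path

from pyinstrument import Profiler

from src.scripts.profiling import scale_run  # placeholder import path

output_dir = Path("profiling_results")
output_dir.mkdir(exist_ok=True)

profiler = Profiler()
profiler.start()
scale_run.main()  # placeholder: run the (e.g. 1-month) simulation under the profiler
profiler.stop()

# Human-viewable HTML report
(output_dir / "scale_run.html").write_text(profiler.output_html())
# Raw pyinstrument session for later reloading
profiler.last_session.save(str(output_dir / "scale_run.pyisession"))
```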
A couple of options (more details in this file).
Opinions welcome: the github-pages branch of this repository is unused, so we can initially send the HTML outputs there for viewing.
Some notes from a meeting of @tamuri, @willGraham01 and myself today to discuss this issue:
Statistics to potentially capture (the kinds of things to monitor):
- File sizes of the pyisession outputs. NOTE: Even a 1-month simulation produces a
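To give a concrete (if hypothetical) idea of what capturing such statistics could look like, a saved pyinstrument session can be reloaded and summarized along these lines. The attribute names assume pyinstrument 4's `Session` API, and the session path is a placeholder:

```python
# Sketch: extract some headline statistics from a saved pyinstrument session,
# e.g. for writing to a tracked results file or a job summary.
from pathlib import Path

from pyinstrument.session import Session

session_path = Path("profiling_results/scale_run.pyisession")  # placeholder path
session = Session.load(str(session_path))

stats = {
    "wall_time_s": session.duration,        # wall-clock duration of the profiled run
    "cpu_time_s": session.cpu_time,         # CPU time recorded by the profiler
    "sample_count": session.sample_count,   # number of stack samples taken
    "session_file_bytes": session_path.stat().st_size,  # size of the .pyisession output
}
print(stats)
```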
At some point, we can move the profiling repo into the TLOmodel org (https://github.com/TLOmodel).
Closing this as the profiling workflow is now capturing statistics and working reliably.
We would like to be able to track how the timings measured in profiling runs of the `src/scripts/profiling/scale_run.py` script change as new pull requests are merged in. This would help identify when PRs lead to performance regressions and allow us to be more proactive in fixing performance bottlenecks.

Ideally this should be automated using GitHub Actions workflows. Triggering the workflow on pushes to `master` would give the most detail, in that it gives a direct measurement of the performance differences arising from a particular PR, but when lots of PRs are going in it could create a large backlog of profiling runs, so an alternative would be to run on a schedule (for example nightly) using the `cron` event. It would probably also be worth allowing manual triggering, either via the `workflow_dispatch` event or via the comment-triggered workflow functionality, for PRs that are thought likely to have a significant effect on performance before merging.

Key questions to be resolved are what profiling outputs we want to track (for example at what level of granularity, and using which profiling tool) and how we want to visualize the outputs. One option would be to save the profiler output as a workflow artifact. While this would be useful in giving access to the raw profiling data, the only way of accessing workflow artifacts appears to be downloading them as compressed zip files, so this is not in itself that useful for visualizing the output. One option for visualizing the profiling results would be the GitHub Actions job summary, which allows using Markdown to produce customized output shown on the job summary page (a sketch of this is below). Another option would be to output the profiling results to HTML files and then deploy these to either a GitHub Pages site or potentially to a static site on Azure storage.
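As a rough, hypothetical sketch of the job summary option: GitHub Actions exposes a `GITHUB_STEP_SUMMARY` environment variable pointing at a file, and any Markdown appended to that file is rendered on the job summary page. A small Python step along these lines could surface headline timings directly in the workflow run (the statistics values here are placeholders for whatever we decide to capture):

```python
# Sketch: append a small Markdown table of profiling statistics to the
# GitHub Actions job summary, if running inside a workflow.
import os

# Placeholder values - in practice these would come from the pyinstrument session
stats = {"wall_time_s": 1234.5, "cpu_time_s": 1180.2, "sample_count": 456_789}

summary_path = os.environ.get("GITHUB_STEP_SUMMARY")
if summary_path is not None:
    lines = [
        "## Profiling results for scale_run.py",
        "",
        "| Statistic | Value |",
        "| --- | --- |",
    ]
    lines += [f"| {name} | {value} |" for name, value in stats.items()]
    with open(summary_path, "a") as summary_file:
        summary_file.write("\n".join(lines) + "\n")
```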
Potentially useful links
The airspeed velocity package allows tracking the results of benchmarks of Python packages over time and visualizing the results as plots in a web interface. While focused on suites of benchmarks, it also has support for running single benchmarks with profiling (a minimal benchmark sketch follows below).
htmlpreview allows directly previewing HTML files in a GitHub repository in the browser; GitHub serves such files with the "text/plain" content-type, so they cannot otherwise be rendered directly.
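For comparison, an airspeed velocity suite is just a Python module of functions discovered by naming convention (`time_*` functions are timed, `track_*` functions return an arbitrary value to plot over time). A minimal, hypothetical benchmark wrapping a short scale run might look like the following; the import path and call signature are placeholders:

```python
# benchmarks/benchmarks.py - minimal asv-style benchmark sketch.
# asv discovers benchmarks by prefix: time_* are timed, track_* record a value.


def time_short_scale_run():
    # Placeholder: run a very short simulation so the benchmark stays cheap.
    from src.scripts.profiling import scale_run  # placeholder import path
    scale_run.main(months=1)  # placeholder call signature


def track_session_file_size():
    # track_* benchmarks let asv plot an arbitrary number over time,
    # here the size of a generated session file (placeholder path).
    import os
    return os.path.getsize("profiling_results/scale_run.pyisession")


track_session_file_size.unit = "bytes"
```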