Plan how to rework test suite to reduce memory usage #9553
Comments
koernerfelicia commented: Related: #9734 I think this will be more pressing with the upgrade to TF 2.6

stale[bot] commented: This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

m-vdb commented: @joejuzl @TyDunn after discussion with @usc-m, I think that the first 2 items should be driven by Atom. And depending on findings, a sync with Engine for next steps, especially for ML changes.
So I'm in favour of keeping this in Atom; if we need to loop someone from Engine at some point, we can always do that.

Maxime Verger commented: 💡 Heads up! We're moving issues to Jira: https://rasa-open-source.atlassian.net/browse/OSS. From now on, this Jira board is the place where you can browse (without an account) and create issues (you'll need a free Jira account for that). This GitHub issue has already been migrated to Jira and will be closed on January 9th, 2023. Do not forget to subscribe to the corresponding Jira issue! ➡️ More information in the forum: https://forum.rasa.com/t/migration-of-rasa-oss-issues-to-jira/56569.
We're currently using test suite fixtures in a way that causes them to live much longer than they should. This means a test runner keeps them around for far longer than necessary, creating memory pressure. Additionally, because of how Python handles multithreading, `pytest` spawns separate worker processes, and each process duplicates this memory: with the 2 workers we run now, we pay these costs twice. Without careful management we are therefore likely to exceed the memory limits of the GitHub Actions runners. Windows/Py3.7 appears to be the canary for this, since it seems to have higher resource overheads than the other configurations.

One solution is to rework the test suite to reduce how long fixtures live, so that they are not loaded into memory and kept longer than necessary by each worker. This would mean that, as much as possible, workers do not hold duplicate copies of fixtures in memory.
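To make that first idea concrete, here is a minimal, hypothetical sketch (the fixture names and the `HeavyModel` stand-in are illustrative, not taken from the Rasa test suite) of how fixture scope controls how long a heavy object stays resident in each worker:

```python
import pytest


class HeavyModel:
    """Stand-in for an expensive object such as a loaded TensorFlow model."""

    def __init__(self):
        # Roughly 80 MB of list storage per instance, duplicated in every xdist worker.
        self.weights = [0.0] * 10_000_000


# A session-scoped fixture keeps one HeavyModel alive for the entire run of *each*
# worker, even for tests that never request it, so two workers pay the cost twice
# for the whole session.
@pytest.fixture(scope="session")
def session_model():
    return HeavyModel()


# Narrowing the scope (module/class/function) lets pytest finalize the fixture as
# soon as the last test in that scope has used it, freeing the memory earlier.
@pytest.fixture(scope="module")
def module_model():
    model = HeavyModel()
    yield model
    # Teardown runs once the module's tests finish; the object can then be GC'd.
```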
Another possibility (not exclusive with the first) is to reduce the time fixtures take to load or set up (mostly this is an issue with TensorFlow model loading), so that broadly scoped fixtures are no longer needed at all: data would be loaded only as needed and not persisted across tests. A sketch of this pattern follows below.
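As a rough sketch of that second idea (all helper and fixture names here are hypothetical), one pattern is to cache only the cheap artifact, such as a path to a trained model on disk, at broad scope, and load the heavy object per test so it is released immediately afterwards:

```python
import pickle

import pytest


def train_tiny_model(output_path):
    """Stand-in for an expensive training/setup step (hypothetical helper)."""
    with open(output_path, "wb") as f:
        pickle.dump({"weights": list(range(1_000))}, f)


def load_model(path):
    """Stand-in for loading a model from disk (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)


@pytest.fixture(scope="session")
def cached_model_path(tmp_path_factory):
    # The expensive step runs once per worker, but only the *path* stays alive
    # for the session, not the in-memory model.
    path = tmp_path_factory.mktemp("models") / "tiny_model.pkl"
    train_tiny_model(path)
    return path


@pytest.fixture  # function scope: the loaded object is released after each test
def model(cached_model_path):
    return load_model(cached_model_path)


def test_model_has_weights(model):
    assert len(model["weights"]) == 1_000
```

With this shape, setup stays fast (a load from disk) and no worker holds the heavy object for longer than a single test.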
We should come up with a plan on how best to resolve this and create tickets to track that plan.
Definition of done