Plan how to rework test suite to reduce memory usage #9553
Comments
koernerfelicia commented: Related: #9734 I think this will be more pressing with the upgrade to TF 2.6

stale[bot] commented: This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

m-vdb commented: @joejuzl @TyDunn after discussion with @usc-m, I think that the first 2 items should be driven by Atom. And depending on findings, a sync with Engine for next steps, especially for ML changes.
So I'm in favour of keeping this in Atom; if we need to loop someone from Engine at some point, we can always do that.

Maxime Verger commented: 💡 Heads up! We're moving issues to Jira: https://rasa-open-source.atlassian.net/browse/OSS. From now on, this Jira board is the place where you can browse (without an account) and create issues (you'll need a free Jira account for that). This GitHub issue has already been migrated to Jira and will be closed on January 9th, 2023. Do not forget to subscribe to the corresponding Jira issue! ➡️ More information in the forum: https://forum.rasa.com/t/migration-of-rasa-oss-issues-to-jira/56569.
We're currently using test suite fixtures in a way that causes them to live much longer than they should. This means a test runner keeps them around for far longer than necessary, creating memory pressure. Additionally, because of how Python handles multithreading, `pytest` spawns separate worker processes, and each process duplicates this memory: with the 2 workers we run now, we pay these costs twice. Without careful management we are therefore likely to exceed the memory limits of the GitHub Actions runners. Windows/Py3.7 appears to be the canary for this, since it seems to have higher resource overheads than the other configurations.

One solution is to rework the test suite to reduce how long fixtures live, so that they are not loaded into memory and kept longer than necessary by each worker. This would mean that, as much as possible, workers do not hold duplicate copies of fixtures in memory.
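To make that first idea concrete, here is a minimal, hypothetical sketch (the fixture names and the `HeavyModel` stand-in are illustrative, not taken from the Rasa test suite) of how fixture scope controls how long a heavy object stays resident in each worker:

```python
import pytest


class HeavyModel:
    """Stand-in for an expensive object such as a loaded TensorFlow model."""

    def __init__(self):
        # Roughly 80 MB of list storage per instance, duplicated in every xdist worker.
        self.weights = [0.0] * 10_000_000


# A session-scoped fixture keeps one HeavyModel alive for the entire run of *each*
# worker, even for tests that never request it, so two workers pay the cost twice
# for the whole session.
@pytest.fixture(scope="session")
def session_model():
    return HeavyModel()


# Narrowing the scope (module/class/function) lets pytest finalize the fixture as
# soon as the last test in that scope has used it, freeing the memory earlier.
@pytest.fixture(scope="module")
def module_model():
    model = HeavyModel()
    yield model
    # Teardown runs once the module's tests finish; the object can then be GC'd.
```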
Another possibility (not exclusive with the first) is to reduce the time fixtures take to load or set up (mostly this is an issue with TensorFlow model loading), so that broadly scoped fixtures are no longer needed at all: data would be loaded only as needed and not persisted across tests. A sketch of this pattern follows below.
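As a rough sketch of that second idea (all helper and fixture names here are hypothetical), one pattern is to cache only the cheap artifact, such as a path to a trained model on disk, at broad scope, and load the heavy object per test so it is released immediately afterwards:

```python
import pickle

import pytest


def train_tiny_model(output_path):
    """Stand-in for an expensive training/setup step (hypothetical helper)."""
    with open(output_path, "wb") as f:
        pickle.dump({"weights": list(range(1_000))}, f)


def load_model(path):
    """Stand-in for loading a model from disk (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)


@pytest.fixture(scope="session")
def cached_model_path(tmp_path_factory):
    # The expensive step runs once per worker, but only the *path* stays alive
    # for the session, not the in-memory model.
    path = tmp_path_factory.mktemp("models") / "tiny_model.pkl"
    train_tiny_model(path)
    return path


@pytest.fixture  # function scope: the loaded object is released after each test
def model(cached_model_path):
    return load_model(cached_model_path)


def test_model_has_weights(model):
    assert len(model["weights"]) == 1_000
```

With this shape, setup stays fast (a load from disk) and no worker holds the heavy object for longer than a single test.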
We should come up with a plan on how best to resolve this and create tickets to track that plan.
Definition of done