[Core][Scale] Large number of launchplans yielded (8k+) in dynamic workflows cause memory bloat #1660
Closed
2 tasks done
Labels
enhancement
New feature or request
flytekit
FlyteKit Python related issue
scale
Scale, Reliability and Performance of the platform
Motivation: Why do you think this is important?
Consider an example of this type
for a large number of input_integers (8k in our example). In this case, the node IDs created by flytekit are extremely verbose and, for 8k nodes, cause a total overhead of more than 1MB. The name of each node, which is derived from the function name, is also extremely long and causes a further 0.6MB of bloat.
The node IDs cause so much bloat because they are repeated in
In the above example the node ID is of the form `dynamic-launch-lps-n7999`, when it could simply be `n7999` or `dn7999`.
Moreover, the name is about 60 characters long as it is fully qualified, which is unnecessary.
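As a rough back-of-the-envelope check on the node ID claim, the sketch below sums the byte length of the verbose and short ID forms across 8k nodes. It assumes the nodes are numbered `n0` through `n7999` and that every ID uses the same `dynamic-launch-lps-` prefix shown above. The helper name `total_id_bytes` is hypothetical, and the count ignores any serialization overhead in the compiled closure.

```python
# Assumption: IDs follow the pattern "<prefix>n<i>" for i in 0..7999,
# matching the example ID "dynamic-launch-lps-n7999" from this issue.
VERBOSE_PREFIX = "dynamic-launch-lps-"
NUM_NODES = 8000


def total_id_bytes(prefix: str, count: int) -> int:
    """Sum the UTF-8 byte length of all IDs of the form <prefix>n<i>."""
    return sum(len(f"{prefix}n{i}".encode("utf-8")) for i in range(count))


verbose_bytes = total_id_bytes(VERBOSE_PREFIX, NUM_NODES)  # dynamic-launch-lps-n<i>
short_bytes = total_id_bytes("", NUM_NODES)                # n<i>
saved_per_copy = verbose_bytes - short_bytes

print(f"verbose IDs: {verbose_bytes} bytes")    # 190890
print(f"short IDs:   {short_bytes} bytes")      # 38890
print(f"saved per occurrence of the ID set: {saved_per_copy} bytes")  # 152000
```

Under these assumptions, each place in the closure where the full set of node IDs is repeated carries about 152 KB of avoidable prefix bytes, so the savings scale with the number of repetitions.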
**Note: This example should probably use a map task, but for debugging it is a good enough use case.**
Goal: What should the final outcome look like, ideally?
Compiled closures should be small, because their size drastically affects performance.
Describe alternatives you've considered
NA
Propose: Link/Inline OR Additional context
No response
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?