I noticed that commit "401891edef8aebefb129f59d7422fb2b26b0f746" added code to cache the label and features columns. This caching causes Intel-MLlib to consume more memory when running the Bayes algorithm. Previously, we could run the Bayes algorithm on a 450GB dataset with 1TB of memory. Now that more data is cached, even 1.5TB of memory is not enough.
Performance is always a trade-off, and this change trades memory for speed. In our experiments, caching did improve some end-to-end workloads when enough memory was available, because without caching the input RDD is recomputed again and again.
So you would need to show that the trade-off is not worthwhile. Otherwise I will not treat this as an issue, even if the workload you mentioned can no longer run.
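The trade-off discussed above can be illustrated with a small plain-Python sketch. It is only an analogy, not Spark or Intel-MLlib code: `LazyDataset`, `load`, and `run_passes` are hypothetical names invented for this example, and the row counts are arbitrary. It shows that an uncached lazy input is rebuilt on every pass, while a cached one is built once but keeps all rows resident in memory.

```python
class LazyDataset:
    """Toy stand-in for a lazily evaluated input (hypothetical, not Spark's API)."""

    def __init__(self, compute, cached=False):
        self._compute = compute    # function that rebuilds the rows from the source
        self._use_cache = cached   # whether to keep the rows after the first build
        self._rows = None          # materialized rows held in memory when cached
        self.compute_calls = 0     # how many times the source was rebuilt

    def rows(self):
        # Serve from memory if a cached copy already exists.
        if self._use_cache and self._rows is not None:
            return self._rows
        self.compute_calls += 1
        rows = self._compute()
        if self._use_cache:
            self._rows = rows      # memory cost: the whole input stays resident
        return rows

    def held_rows(self):
        # Number of rows this dataset keeps resident between passes.
        return 0 if self._rows is None else len(self._rows)


def load():
    # Assumed toy input: (label, feature) pairs.
    return [(i % 2, float(i)) for i in range(1000)]


def run_passes(dataset, passes=3):
    # An iterative algorithm reads its input once per pass.
    total = 0.0
    for _ in range(passes):
        total += sum(feature for _, feature in dataset.rows())
    return total


uncached = LazyDataset(load)
cached = LazyDataset(load, cached=True)
uncached_total = run_passes(uncached)
cached_total = run_passes(cached)

print("uncached:", uncached.compute_calls, "builds,", uncached.held_rows(), "rows held")
print("cached:  ", cached.compute_calls, "builds,", cached.held_rows(), "rows held")
```

Both variants produce the same result. The difference is where the cost lands: repeated recomputation for the uncached input, and resident memory for the cached one, which is the cost the original report runs into at 450GB scale.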