We are using oneDAL distributed algorithms to optimize Spark ML. Some metrics are missing. Could you check whether the following statistics can be added to distributed low-order moments (basic statistics)?
Clarification details per our discussion with Xiaochang:
Count: [Xiaochang]: Users usually request several metrics rather than a single one, so it is convenient to get the observation count from the result along with the other metrics; otherwise the user needs extra coding effort.
numNonzeros: [Xiaochang]: just count the number of values that are not 0.0.
weightSum: [Xiaochang]: Spark's DataFrame has a separate weight column providing a weight for each row.
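A hypothetical sketch (not oneDAL code) of why these three metrics fit the distributed low-order moments pattern: each is additive, so every worker can compute a partial result over its block of rows and a final reduction merges the partials. The function names and result layout here are assumptions for illustration only.

```python
def partial_stats(rows, weights):
    """Compute a partial (count, numNonzeros, weightSum) for one data block.

    rows    -- list of feature vectors (lists of floats)
    weights -- per-row weights, as in Spark's weight column
    """
    ncols = len(rows[0]) if rows else 0
    nnz = [0] * ncols  # per-column count of values != 0.0
    for row in rows:
        for j, v in enumerate(row):
            if v != 0.0:
                nnz[j] += 1
    return {
        "count": len(rows),
        "numNonzeros": nnz,
        "weightSum": sum(weights),
    }

def merge_stats(a, b):
    """Merge two partial results; all three metrics merge by simple addition."""
    return {
        "count": a["count"] + b["count"],
        "numNonzeros": [x + y for x, y in zip(a["numNonzeros"], b["numNonzeros"])],
        "weightSum": a["weightSum"] + b["weightSum"],
    }
```

Because the merge is associative and commutative, the reduction can be done in any order (e.g. tree-style across nodes), which suggests the addition should not disturb the existing distributed computation scheme.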
Need to investigate the possibility of adding corresponding APIs to compute_input and compute_result.
Also, need to check how much adding all these metrics will affect the performance of the default case (when all metrics are calculated).
For details, see: https://spark.apache.org/docs/latest/api/scala/org/apache/spark/mllib/stat/MultivariateStatisticalSummary.html