-
Notifications
You must be signed in to change notification settings - Fork 359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inconsistency between docstring and implementation in explanation_dashboard_input.py #1270
Comments
@smrnvdhy great find! I think we should make it 100,000 rows, since we can sometimes handle 10k-20k rows for certain datasets. However, after 15k rows usually the visualization becomes very slow to load in the browser, and past 20k rows it usually fails with some error. There are some ideas in our team on how we might be able to lift these restrictions - specifically, by doing more of the calculations/clustering in the backend and using streaming so that we don't load all of the data into the UI. I'll send a PR to update the docstrings. |
@imatiach-msft Thank you for addressing this so quickly! I also observe long load time just like you mentioned (40 minutes wall time for 14k rows). I'm wondering if you could share which functionality/calculation is making the UI slow to load. Is there an easy way to make that part optional? My team love that we can easily investigate model performance and feature importance by subgroups, but the load time is a major concern. |
@smrnvdhy And also in the notebook (where the sample_dataset parameter doesn't need to be passed since the original explanation was downsampled): I even have a POC working of running Error Analysis Dashboard on spark (millions of rows) for the tree view. |
closing issue as PR has been merged: |
The docstring mentions dataset must have fewer than 10000 rows and fewer than 1000 columns. However, the code checks for 100000 rows and 1000 columns.
Should be an easy fix that anyone could do. I'd fork and PR, but I don't know which one is the intended value. Slightly leaning towards 10000, because it also appears in explanation_dashboard.py docstring
The text was updated successfully, but these errors were encountered: