[ML] Data Frame Analytics: Improved error handling for scatterplot matrix. #91993

Merged
walterra merged 5 commits into elastic:master from ml-fix-blank-scatterplot-matrix on Feb 22, 2021

Conversation

Contributor

@walterra walterra commented Feb 19, 2021

Summary

Part of #84420.
Fixes #91001.

Improves error handling for the scatterplot matrix. Documents that contain array-valued fields cannot be visualized in the scatterplot matrix. This PR adds a warning callout when the fetched data includes such documents.
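
A minimal sketch of the kind of check described above, not the code in this PR. It assumes the fetched documents arrive as flat objects keyed by field name; the `Row` type and the `filterArrayValuedRows` helper are hypothetical names used only for illustration. The skipped count is what a warning callout could report.

```typescript
// Hypothetical row shape: field name -> fetched value (assumption, not the PR's type).
type Row = Record<string, unknown>;

interface FilteredRows {
  // Rows where every selected field holds a single (non-array) value.
  plottable: Row[];
  // Number of rows dropped because at least one selected field was an array.
  skippedCount: number;
}

// Split the fetched rows into plottable rows and a count of skipped rows.
function filterArrayValuedRows(rows: Row[], fields: string[]): FilteredRows {
  const plottable = rows.filter((row) =>
    fields.every((field) => !Array.isArray(row[field]))
  );
  return { plottable, skippedCount: rows.length - plottable.length };
}

// Usage: show the warning only when something was actually skipped.
const sample: Row[] = [
  { price: 10, quantity: 2 },
  { price: [10, 12], quantity: 3 }, // array-valued field, cannot be plotted
];
const filtered = filterArrayValuedRows(sample, ['price', 'quantity']);
const showWarningCallout = filtered.skippedCount > 0;
```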

Vega has some issues with field names that contain dots. By default, it treats them as paths to attributes in nested objects. There are workarounds for escaping dots, but these don't seem to be picked up in the scatterplot matrix, where the field names are part of the row/column configuration. To work around this, we now replace dots in field names with a different but visually similar Unicode character so that Vega won't treat it as a dot. The only drawback I can think of is that a user copying and pasting a field name would get the wrong character, but since the visualization is rendered as a bitmap within a canvas element, text cannot be copied from it anyway.
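
The replacement can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the helper names are hypothetical, and U+2024 (ONE DOT LEADER) is picked here only as one example of a visually similar character. The same escaping has to be applied both to the field names in the chart configuration and to the keys of the data rows so that they still match.

```typescript
// Assumption: U+2024 ONE DOT LEADER stands in for "." in names passed to Vega.
const DOT_REPLACEMENT = '\u2024';

// Replace every "." so Vega does not read the name as a nested object path.
function escapeVegaFieldName(fieldName: string): string {
  return fieldName.split('.').join(DOT_REPLACEMENT);
}

// Apply the same escaping to the keys of a data row.
function escapeRowKeys(row: Record<string, unknown>): Record<string, unknown> {
  const escaped: Record<string, unknown> = {};
  for (const key of Object.keys(row)) {
    escaped[escapeVegaFieldName(key)] = row[key];
  }
  return escaped;
}

// Usage: escape the selected fields and the rows with the same helper.
const escapedFields = ['products.price', 'taxful_total_price'].map(escapeVegaFieldName);
const escapedRow = escapeRowKeys({ 'products.price': 11.99, taxful_total_price: 14.5 });
```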

@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

];

const queryFallback = searchQuery !== undefined ? searchQuery : { match_all: {} };
Contributor

Will the searchQuery passed in already be the default one when no query is set?

Contributor Author

Yes, I noticed while updating this PR that the fallback to match_all here is unnecessary.

@walterra
Contributor Author

@elasticmachine merge upstream

@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| ml | 6.4MB | 6.4MB | +2.1KB |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @walterra

@peteharverson
Contributor

peteharverson commented Feb 22, 2021

Testing this out with a regression job on the ecommerce data set, I noticed there is an issue with the Total Feature Importance ExpandableSection, where the 'no data callout' does not get displayed.

Job config is:

  "analysis": {
    "regression": {
      "dependent_variable": "taxful_total_price",
      "num_top_feature_importance_values": 5,
      "prediction_field_name": "taxful_total_price_prediction",
      "training_percent": 10,
      "randomize_seed": 6066456206323220000,
      "loss_function": "mse",
      "early_stopping_enabled": true
    }
  },

Raised #92181 for this issue, which is not connected to the changes in this PR.

Contributor

@peteharverson peteharverson left a comment

Tested and LGTM.

Added a comment with an unrelated issue I found when creating a regression job with the ecommerce data set.

Contributor

@alvarezmelissa87 alvarezmelissa87 left a comment

LGTM ⚡

@walterra walterra added the auto-backport and bug labels Feb 22, 2021
@walterra walterra merged commit 1c3515f into elastic:master Feb 22, 2021
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Feb 22, 2021
…trix. (elastic#91993)

Improves error handling for the scatterplot matrix. Documents with fields with arrays of values cannot be visualized in the scatterplot matrix. This adds a warning callout when the fetched data includes such documents.
@kibanamachine
Contributor

💚 Backport successful

7.12 / #92203

Successful backport PRs will be merged automatically after passing CI.

@walterra walterra deleted the ml-fix-blank-scatterplot-matrix branch February 22, 2021 15:33
gmmorris added a commit to gmmorris/kibana that referenced this pull request Feb 22, 2021
* master:
  Ability to filter alerts by string parameters (elastic#92036)
  [APM] Fix for flaky correlations API test (elastic#91673) (elastic#92094)
  [Enterprise Search] Migrate shared role mapping components (elastic#91723)
  [file_upload] move ml Importer classes to file_upload plugin (elastic#91559)
  [Discover] Always show the "hide missing fields" toggle (elastic#91889)
  v2 migrations should exit process on corrupt saved object document (elastic#91465)
  [ML] Data Frame Analytics exploration page: filters improvements (elastic#91748)
  [ML] Data Frame Analytics: Improved error handling for scatterplot matrix. (elastic#91993)
  [coverage] speed up merging results of functional tests (elastic#92111)
  Adds a Reason indicator to the onClose handler in AddAlert and EditAlert (elastic#92149)
walterra added a commit to walterra/kibana that referenced this pull request Feb 22, 2021
…trix. (elastic#91993)

Improves error handling for the scatterplot matrix. Documents with fields with arrays of values cannot be visualized in the scatterplot matrix. This adds a warning callout when the fetched data includes such documents.
kibanamachine added a commit that referenced this pull request Feb 22, 2021
…trix. (#91993) (#92203)

Improves error handling for the scatterplot matrix. Documents with fields with arrays of values cannot be visualized in the scatterplot matrix. This adds a warning callout when the fetched data includes such documents.

Co-authored-by: Walter Rafelsberger <[email protected]>
walterra added a commit that referenced this pull request Feb 22, 2021
…trix. (#91993) (#92242)

Improves error handling for the scatterplot matrix. Documents with fields with arrays of values cannot be visualized in the scatterplot matrix. This adds a warning callout when the fetched data includes such documents.
Labels
auto-backport (Deprecated - use backport:version if exact versions are needed), bug (Fixes for quality problems that affect the customer experience), Feature:Data Frame Analytics (ML data frame analytics features), :ml, release_note:fix, v7.12.0, v7.13.0, v8.0.0
Development

Successfully merging this pull request may close these issues.

[ML] Data Frame Analytics: scatterplot charts sometimes blank
5 participants