Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add scope-related fields to Annotation #56417

Merged
merged 1 commit into from
May 13, 2020

Conversation

przemekwitek
Copy link
Contributor

@przemekwitek przemekwitek commented May 8, 2020

This PR implements scoped annotations by adding 7 new fields to Annotation type: detector_index, partition_field_name, partition_field_value, over_field_name, over_field_value, by_field_name, by_field_value.

Rationale
Currently the user of Elastic ML has an ability to attach an annotation to a moment in time (or time range) when something interesting happens (e.g.: Black Friday sale causing a sudden bump in request rate). The scope of such an annotation is an ML job. So even if the user configures anomaly detection job with partition_field_name, over_field_name or by_field_name parameters, the annotation is unaware of them. This may cause usability issues for a number of reasons:

  • the user could have forgotten the exact time series they were looking at while creating the annotation
  • another user seeing the annotation may not have the knowledge (context) their colleague had while creating the annotation
  • we (ML team) plan to introduce more system-generated annotations based on model changes. Such annotations only make sense in the context of the model/time series they were created for

Because of these reasons, we should implement scoped (or scope-aware) annotations that carry the relevant context information with them.

Relates #55781

@przemekwitek przemekwitek removed the WIP label May 11, 2020
@przemekwitek przemekwitek marked this pull request as ready for review May 11, 2020 07:04
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml)

@benwtrent
Copy link
Member

Design comment:

Is it possible to be more general with the change? Could we add two fields? scope: [<scope>] and scope_value: [<score values>] or something?

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@droberts195 droberts195 added v7.9.0 and removed v7.8.0 labels May 11, 2020
@przemekwitek
Copy link
Contributor Author

przemekwitek commented May 13, 2020

Is it possible to be more general with the change? Could we add two fields? scope: [<scope>] and scope_value: [<score values>] or something?

I was thinking about an abstraction like this but then we loose the information which field was partition, over or by. In order to maintain this information (which is AFAIU needed by the UI), we would need to:

  1. either have an "enum" field to distinguish between roles:
{
  ...
  scopes: [
    {
      "name": "country_name",
      "value": "Poland",
      "role": "partition_field"
    },
    {
      "name: "city_name",
      "value: "Cracow",
      "role: "by_field"
    }
  ]
}
  1. or make the scope abstraction a key-value pair essentially:
{
  ...
  "partition_scope": {
    "name": "country_name",
    "value": "Poland"
  },
  "by_scope": {
    "name": "city_name",
    "value": "Cracow"
  }
}

Given that we are already using flat structure in ModelPlotData and other places, I decided to stick to this format. If you think that one of the formats I listed above is better (or have another one I should consider) please let me know

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants