Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create an annotation when a model snapshot is stored #53783

Merged

Conversation

przemekwitek
Copy link
Contributor

@przemekwitek przemekwitek commented Mar 19, 2020

This PR creates an annotation every time a model snapshot is stored and deletes an annotation when the model snapshot is deleted.
It is achieved the following way:

  1. The logic to persist annotation documents is extracted into AnnotationPersister class in a separate file.
  2. AnnotationPersister is used in DatafeedJob which creates annotations for delayed data
  3. AnnotationPersister is re-used in AutodetectResultProcessor on every new ModelSnapshot object received from C++
  4. JobDataDeleter is extended so that it also deletes model snapshot annotation from .ml-annotations* index

Points 1. and 2. is just refactoring.
Point 3. is the new behavior.

Relates #52149

@przemekwitek przemekwitek force-pushed the annotation_on_model_snapshot branch 4 times, most recently from ed54e46 to ef9fa5d Compare March 20, 2020 10:28
@przemekwitek przemekwitek changed the title [DRAFT] Create an annotation when a model snapshot is stored Create an annotation when a model snapshot is stored Mar 20, 2020
@przemekwitek przemekwitek removed the WIP label Mar 20, 2020
@przemekwitek przemekwitek marked this pull request as ready for review March 20, 2020 13:11
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml)

@benwtrent benwtrent self-requested a review March 20, 2020 14:13
@@ -66,7 +66,7 @@ public void clusterChanged(ClusterChangedEvent event) {
// The atomic flag prevents multiple simultaneous attempts to create the
// index if there is a flurry of cluster state updates in quick succession
if (event.localNodeMaster() && isIndexCreationInProgress.compareAndSet(false, true)) {
AnnotationIndex.createAnnotationsIndexIfNecessary(settings, client, event.state(), ActionListener.wrap(
AnnotationIndex.createAnnotationsIndexIfNecessary(client, event.state(), ActionListener.wrap(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still leaves open the following bug (unsure how to fix it really).

  • Node running the job is updated
  • Job gets reallocated to a new node
  • Job writes annotation for a model snapshot
  • The master node is still Old, consequently never crated the annotations index
  • Annotation persistence auto-created the index and now the UI fails to read from the index.

It seems to me that we need to verify if the index exists already and create it if necessary when we write annotations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that scenario was a problem in 6.7 and 6.8 as there were 6.x indices that didn't have the annotations index - it probably explains a few bug reports we had when the annotations index was not available as expected. But now any rolling upgrades must be starting from 6.8 at the earliest, so the old version will have the annotations index already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, my concern is that this will occur again whenever we update the mapping of the annotation index.

We can sort of 'boot' that for now I suppose. But we should definitely make sure that whomever updates the mapping of the annotations index knows that they need make this sort of change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, my concern is that this will occur again whenever we update the mapping of the annotation index.

My PR does not update mappings of annotations index.
The new annotations I'm adding fit the existing mappings so there was no point in changing them.

@przemekwitek przemekwitek force-pushed the annotation_on_model_snapshot branch 6 times, most recently from ea8be29 to c9615ca Compare March 27, 2020 15:37
Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I just left a couple of minor comments.

After this is merged I think this would also be good to test on a real cluster that's left running for a few days with a real-time job so that the effect of periodic model snapshots and nightly maintenance can be observed, and the annotations viewed in Kibana.

@@ -66,7 +66,7 @@ public void clusterChanged(ClusterChangedEvent event) {
// The atomic flag prevents multiple simultaneous attempts to create the
// index if there is a flurry of cluster state updates in quick succession
if (event.localNodeMaster() && isIndexCreationInProgress.compareAndSet(false, true)) {
AnnotationIndex.createAnnotationsIndexIfNecessary(settings, client, event.state(), ActionListener.wrap(
AnnotationIndex.createAnnotationsIndexIfNecessary(client, event.state(), ActionListener.wrap(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that scenario was a problem in 6.7 and 6.8 as there were 6.x indices that didn't have the annotations index - it probably explains a few bug reports we had when the annotations index was not available as expected. But now any rolling upgrades must be starting from 6.8 at the earliest, so the old version will have the annotations index already.

@przemekwitek przemekwitek force-pushed the annotation_on_model_snapshot branch from c9615ca to 8717e21 Compare March 30, 2020 06:47
@przemekwitek
Copy link
Contributor Author

run elasticsearch-ci/default-distro

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@przemekwitek przemekwitek force-pushed the annotation_on_model_snapshot branch from 03b0c58 to 70fbc6e Compare March 30, 2020 11:10
@przemekwitek przemekwitek merged commit 7c5c912 into elastic:master Mar 30, 2020
Comment on lines +392 to +393
modelSnapshot.getTimestamp(),
modelSnapshot.getTimestamp(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if these should have been modelSnapshot.getLatestResultTimeStamp()?

See #56076 (comment) for more details on this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just raised #56093 to address that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants