[ML] Deprecate anomaly detection post data endpoint #66347

Merged

droberts195
Contributor

There is little evidence of this endpoint being used
and there is quite a lot of code complexity associated
with the various formats that can be used to upload
data and the different errors that can occur when direct
data upload is open to end users.

In a future release we can make this endpoint internal
so that only datafeeds can use it, and remove all the
options and formats that are not used by datafeeds.

End users will have to store their input data for
anomaly detection in Elasticsearch indices (which we
believe all do today) and use a datafeed to feed it
to anomaly detection jobs.
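
As an illustration of the datafeed route described above, here is a minimal sketch of the REST calls that would take the place of a direct upload to `POST _ml/anomaly_detectors/<job_id>/_data`. The helper name `datafeed_requests` and the example job, datafeed, and index names are hypothetical; the sketch only builds the request descriptions and does not contact a cluster. It assumes the anomaly detection job already exists and that the input data is already in the named index.

```python
import json


def datafeed_requests(job_id, datafeed_id, indices):
    """Build the (method, path, body) calls for feeding a job from indices.

    Hypothetical helper for illustration only: it returns request
    descriptions rather than sending anything to Elasticsearch.
    """
    return [
        # Create a datafeed that reads the job's input from the given indices.
        (
            "PUT",
            f"/_ml/datafeeds/{datafeed_id}",
            {"job_id": job_id, "indices": list(indices)},
        ),
        # The job must be open before its datafeed can be started.
        ("POST", f"/_ml/anomaly_detectors/{job_id}/_open", None),
        # Start the datafeed; it then sends the indexed data to the job.
        ("POST", f"/_ml/datafeeds/{datafeed_id}/_start", None),
    ]


if __name__ == "__main__":
    # Example names are assumptions, not taken from the pull request.
    for method, path, body in datafeed_requests(
        "my-job", "datafeed-my-job", ["my-index"]
    ):
        print(method, path, json.dumps(body) if body is not None else "")
```

The calls can be issued with any HTTP client or from the Kibana console; the point is that the data reaches the job through the datafeed rather than through a user-facing upload.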

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

Contributor

@przemekwitek przemekwitek left a comment

LGTM

@lcawl
Contributor

lcawl commented Dec 15, 2020

Will the flush jobs API be deprecated as well? Per https://www.elastic.co/guide/en/elasticsearch/reference/master/ml-flush-job.html#ml-flush-job-desc "The flush jobs API is only applicable when sending data for analysis using the post data API".

@droberts195
Contributor Author

> Will the flush jobs API be deprecated as well? Per https://www.elastic.co/guide/en/elasticsearch/reference/master/ml-flush-job.html#ml-flush-job-desc "The flush jobs API is only applicable when sending data for analysis using the post data API".

Not immediately. There are troubleshooting use cases where the flush jobs API can be used in conjunction with a datafeed, to skip or advance time between opening the job and starting the datafeed. Potentially we could deprecate calling the flush jobs API without arguments, as that call would no longer serve a purpose. But there isn't time to think this through properly for 7.11, so let's leave it for 7.12 or later.
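
To make the troubleshooting sequence mentioned above concrete, here is a rough sketch of the order of calls: open the job, flush with `advance_time`, then start the datafeed. The helper name `flush_then_start_requests` and the example identifiers and timestamp are hypothetical; `advance_time` (and likewise `skip_time`) is an existing option of the flush jobs API. The sketch only builds request descriptions and sends nothing.

```python
def flush_then_start_requests(job_id, datafeed_id, advance_time):
    """Build the (method, path, body) calls for advancing time before a datafeed starts.

    Hypothetical helper for illustration only; no request is sent.
    """
    return [
        # Open the job so that it can accept a flush.
        ("POST", f"/_ml/anomaly_detectors/{job_id}/_open", None),
        # Flush with advance_time to move the job's clock forward.
        # skip_time could be passed in the same way instead.
        (
            "POST",
            f"/_ml/anomaly_detectors/{job_id}/_flush",
            {"advance_time": advance_time},
        ),
        # Only then start the datafeed.
        ("POST", f"/_ml/datafeeds/{datafeed_id}/_start", None),
    ]


if __name__ == "__main__":
    # The identifiers and the timestamp are assumptions for the example.
    for request in flush_then_start_requests(
        "my-job", "datafeed-my-job", "2020-12-15T00:00:00Z"
    ):
        print(request)
```

A flush with no arguments has no counterpart in this sequence, which is the case the comment suggests could be deprecated later.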

@droberts195
Contributor Author

Jenkins run elasticsearch-ci/2

Contributor

@lcawl lcawl left a comment

Documentation LGTM

@droberts195 droberts195 merged commit c5bef7f into elastic:master Dec 15, 2020
@droberts195 droberts195 deleted the deprecate_ml_post_data_endpoint branch December 15, 2020 18:37
droberts195 added a commit to droberts195/elasticsearch that referenced this pull request Dec 15, 2020
Backport of elastic#66347
droberts195 added a commit that referenced this pull request Dec 15, 2020
Backport of #66347