[ML] _xpack/usage should return stats about forecasting #31395

Closed
hendrikmuhs opened this issue Jun 18, 2018 · 9 comments

Comments

@hendrikmuhs

hendrikmuhs commented Jun 18, 2018

_xpack/usage currently does not return any stats about forecasts of ml jobs. Information of interest:

  • general usage (overall, per job)
  • resource usage
    • memory
    • runtime
  • counts
    • duration
    • data points
@elasticmachine
Collaborator

Pinging @elastic/ml-core

@hendrikmuhs hendrikmuhs self-assigned this Jun 18, 2018
@tsullivan
Member

Given that this is more info than what we're displaying in the Monitoring dashboards, it would be great if we could get this data into the monitoring indices for telemetry, and also use it for charting in the Monitoring UI.

@hendrikmuhs
Author

After some investigation, I suggest adding forecast stats to _xpack/ml/anomaly_detectors/{jobid|_all}/_stats, which could look like this:

[image: example job stats output including forecast stats]

Note: This is just an example of the job stats integration; the categories of interest are still to be decided. From there it is easy to expose the stats to _usage and monitoring.

An alternative solution would be to give forecasts their own _stats endpoint.

@sophiec20
CC @elastic/ml-core

@sophiec20
Contributor

Interesting, I like the idea of forecast_stats. Team discussion required, for API and UI (as stats are shown in the Job List).

@hendrikmuhs
Author

I checked:

The UI (I guess you mean the Counts tab) is not affected; it looks like it has an overlay (CC @peteharverson) over the results from the job stats API. So fortunately no immediate action is required for the UI. Longer term it would be nice to make use of these new stats, of course.

The main question, which API endpoint:

  1. Proposal, reuse: _xpack/ml/anomaly_detectors/{jobid|_all}/_stats

  2. New: _xpack/ml/forecast_stats/{jobid|_all}

  3. New: _xpack/ml/forecast/{jobid|_all}/_stats

  4. New: _xpack/ml/anomaly_detectors/{jobid|_all}/_forecast_stats

Note: A forecast is triggered via _xpack/ml/anomaly_detectors/{jobid|_all}/_forecast, so options 2 and 3 are, in my opinion, not good choices unless we want to change triggering as well.

Interesting, I like the idea of forecast_stats

Which of the above choices match that?

I will collect some more material and then schedule a team discussion, but I would happily receive more feedback here in the meantime. Maybe there are more options than the above?
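
To make the comparison concrete, here is a minimal Python sketch of the four candidate paths and of the argument about the trigger endpoint. The helper names `candidate_stats_paths` and `shares_trigger_prefix` are hypothetical and exist only for illustration; only the path templates come from the list above.

```python
from typing import Dict

# Prefix of the existing forecast trigger endpoint, as quoted in this thread.
TRIGGER_PREFIX = "_xpack/ml/anomaly_detectors/{job_id}/"


def candidate_stats_paths(job_id: str = "_all") -> Dict[int, str]:
    """Return the four proposed endpoint paths, keyed by option number."""
    base = "_xpack/ml"
    return {
        1: f"{base}/anomaly_detectors/{job_id}/_stats",
        2: f"{base}/forecast_stats/{job_id}",
        3: f"{base}/forecast/{job_id}/_stats",
        4: f"{base}/anomaly_detectors/{job_id}/_forecast_stats",
    }


def shares_trigger_prefix(path: str, job_id: str = "_all") -> bool:
    """True if the path lives under the same resource as the _forecast trigger."""
    return path.startswith(TRIGGER_PREFIX.format(job_id=job_id))


# Option numbers whose path stays consistent with how forecasts are triggered.
consistent = [
    option
    for option, path in candidate_stats_paths("my-job").items()
    if shares_trigger_prefix(path, "my-job")
]
```

Under these assumptions only options 1 and 4 stay under the same resource as the trigger endpoint, which is the reason given above for ruling out options 2 and 3.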

@stevedodson

  1. Proposal, reuse: _xpack/ml/anomaly_detectors/{jobid|_all}/_stats

+1 for reusing the endpoint

@davidkyle
Member

Can we add units to the forecast memory stats (min, max, avg, sum)? I presume they are bytes.

@droberts195
Contributor

Can we add units to the forecast memory stats (min, max, avg, sum)? I presume they are bytes.

There are two ways that suggestion can be interpreted: add units to the names or add units to the values. For consistency with model_size_stats and data_counts I think _bytes should be added to the field names rather than units added to the field values. If we were starting from scratch I think it would be better to add units to the field values, but doing that in this case would create an inconsistency with other fields returned by the same endpoint.
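
As an illustration of the "units in the field names" reading, the following Python sketch renames the keys of a memory stats mapping so that each carries a `_bytes` suffix. The mapping, its values, and the helper `with_byte_suffix` are assumptions made for this example; the actual field names in the job stats response are not specified in this thread.

```python
from typing import Dict


def with_byte_suffix(memory_stats: Dict[str, float]) -> Dict[str, float]:
    """Return a copy whose keys carry the unit as a `_bytes` suffix.

    Values stay plain numbers, so the unit lives in the name only.
    Keys that already end in `_bytes` are left unchanged.
    """
    return {
        key if key.endswith("_bytes") else f"{key}_bytes": value
        for key, value in memory_stats.items()
    }


# Hypothetical forecast memory stats without units (illustrative values).
raw = {"min": 1024.0, "max": 4096.0, "avg": 2560.0, "sum": 5120.0}
named = with_byte_suffix(raw)
```

The alternative reading, units in the values, would instead turn each number into a string such as "1kb", which is the inconsistency with the other fields that the comment above wants to avoid.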

hendrikmuhs pushed a commit that referenced this issue Jul 4, 2018
…sage API (#31647)

This change adds stats about forecasts to the jobstats API as well as _xpack/usage. The following
information is collected:

_xpack/ml/anomaly_detectors/{jobid|_all}/_stats:

 -  total number of forecasts
 -  memory statistics (mean/min/max)
 -  runtime statistics
 -  record statistics
 -  counts by status

_xpack/usage

 -  collected by job status as well as overall (_all):
     -  total number of forecasts
     -  number of jobs that have at least 1 forecast
     -  memory, runtime, record statistics
     -  counts by status

Fixes #31395
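
The roll-up described in the commit message can be sketched in Python roughly as follows. This is not the code from #31647; the class and field names (`JobForecastStats`, `UsageBucket`, `summarize`) are hypothetical, and weighting the average by forecast count is an assumption of this sketch.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class JobForecastStats:
    """Per-job forecast stats (hypothetical shape)."""
    job_state: str
    total: int
    memory_min_bytes: float
    memory_max_bytes: float
    memory_avg_bytes: float
    status_counts: Dict[str, int]


@dataclass
class UsageBucket:
    """Aggregate over many jobs, either overall or for one job state."""
    total_forecasts: int = 0
    jobs_with_forecasts: int = 0
    memory_min_bytes: float = float("inf")  # stays inf until a forecast is seen
    memory_max_bytes: float = 0.0
    memory_sum_bytes: float = 0.0
    status_counts: Counter = field(default_factory=Counter)

    def add(self, job: JobForecastStats) -> None:
        if job.total == 0:
            return  # jobs without forecasts contribute nothing
        self.total_forecasts += job.total
        self.jobs_with_forecasts += 1
        self.memory_min_bytes = min(self.memory_min_bytes, job.memory_min_bytes)
        self.memory_max_bytes = max(self.memory_max_bytes, job.memory_max_bytes)
        # Recover the per-job sum so the merged average is weighted by count.
        self.memory_sum_bytes += job.memory_avg_bytes * job.total
        self.status_counts.update(job.status_counts)

    @property
    def memory_avg_bytes(self) -> float:
        if self.total_forecasts == 0:
            return 0.0
        return self.memory_sum_bytes / self.total_forecasts


def summarize(jobs: Iterable[JobForecastStats]) -> Dict[str, UsageBucket]:
    """Group stats by job state and also under the overall `_all` key."""
    buckets: Dict[str, UsageBucket] = defaultdict(UsageBucket)
    for job in jobs:
        buckets["_all"].add(job)
        buckets[job.job_state].add(job)
    return dict(buckets)


usage = summarize([
    JobForecastStats("opened", 2, 100.0, 300.0, 200.0, {"finished": 2}),
    JobForecastStats("closed", 1, 50.0, 50.0, 50.0, {"failed": 1}),
    JobForecastStats("closed", 0, 0.0, 0.0, 0.0, {}),
])
```

Runtime and record statistics would merge the same way as the memory fields, so they are omitted here to keep the sketch short.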
@hendrikmuhs
Author

@tsullivan Regarding your request:

Given that this is more info than what we're displaying in the Monitoring dashboards, it would be great if we could get this data into the monitoring indices for telemetry, and also use it for charting in the Monitoring UI.

As far as I can see, the change in #31647 automatically adds the new telemetry data to monitoring as well, because monitoring indexes the jobstats object as is. Querying the .monitoring-es... indices returns the added data points. So this part is fine.

For follow-up, I suggest a new issue, since I think there are several follow-up items. Short term (and probably easy to implement), we could add the number of forecasts to the already existing Jobs table. Long term I would like to see charts, first for jobs and second for forecasts.

8 participants