Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Fix datafeed start time is incorrect when the job has trailing empty buckets #71976

Merged
merged 6 commits into from
Jul 16, 2020

Conversation

qn895
Copy link
Member

@qn895 qn895 commented Jul 15, 2020

Summary

This PR addresses #69537 which corrects the earliest start/continue time from dataCounts?.latest_record_timestamp to max(latest_record_timestamp, latest_bucket.timestamp + bucket_span).

Example below is the gallery1_1594832288_800_7370 job with bucket_span of 1h where:

  • latest_record_timestamp = 20:25:27
  • latest_bucket.timestamp = 20:00:00

So the earliest available timestamp is 21:00

Before
unmodified

After

Screen Shot 2020-07-15 at 5 33 23 PM

Checklist

Delete any items that are not applicable to this PR.

@elasticmachine
Copy link
Contributor

Pinging @elastic/ml-ui (:ml)

bucketSpan?: Duration | null | undefined
): number | undefined {
if (latestRecordTimestamp !== undefined && latestBucketTimestamp !== undefined) {
// if bucket span is availble (e.g. 15m) add it to the latest bucket timestamp in ms
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// if bucket span is availble (e.g. 15m) add it to the latest bucket timestamp in ms
// if bucket span is available (e.g. 15m) add it to the latest bucket timestamp in ms

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed here dcce6b6

): number | undefined {
if (latestRecordTimestamp !== undefined && latestBucketTimestamp !== undefined) {
// if bucket span is availble (e.g. 15m) add it to the latest bucket timestamp in ms
const latestBucketStartTime = bucketSpan
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

latestBucketStartTime might be a confusing name for future readers of this code, as bucket timestamps are the bucket start time, so latestBucketTimestamp is the start time of the latest bucket we have results for. When you add on the bucket span you're getting the start time of the first bucket we don't have results for.

It might be clearer to rename latestBucketStartTime in this code to adjustedBucketStartTime.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed here dcce6b6

@qn895
Copy link
Member Author

qn895 commented Jul 16, 2020

@elasticmachine merge upstream

Copy link
Contributor

@peteharverson peteharverson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@jgowdyelastic jgowdyelastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment, but generally LGTM

@@ -171,6 +172,11 @@ export function jobsProvider(mlClusterClient: ILegacyScopedClusterClient) {
description: job.description || '',
groups: Array.isArray(job.groups) ? job.groups.sort() : [],
processed_record_count: job.data_counts?.processed_record_count,
earliestStartTimestampMs: getEarliestDatafeedStartTime(
dataCounts?.latest_record_timestamp as number,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't look like the number type assertions are needed here.
they're also not need for the call to getLatestDataOrBucketTimestamp

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the casting check for both here 09ae890 since they are both not needed.

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

page load bundle size

id value diff baseline
ml 4.3MB +579.0B 4.3MB

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@qn895 qn895 merged commit 7868a56 into elastic:master Jul 16, 2020
@qn895 qn895 deleted the incorrect-datafeed-start-time-69537 branch July 16, 2020 19:42
qn895 added a commit to qn895/kibana that referenced this pull request Jul 16, 2020
qn895 added a commit that referenced this pull request Jul 16, 2020
qn895 added a commit that referenced this pull request Jul 16, 2020
…ling empty buckets (#71976) (#72145)

Co-authored-by: Elastic Machine <[email protected]>

Co-authored-by: Elastic Machine <[email protected]>
gmmorris added a commit to gmmorris/kibana that referenced this pull request Jul 17, 2020
* master: (214 commits)
  replacing hard coded links for ela.st (elastic#72240)
  skip flaky suite (elastic#60865)
  chore(NA): teardown dynamic dll plugin (elastic#72096)
  Register navLink actions for declared applications (elastic#72109)
  Fix value for process.hash.sha256 draggable (elastic#72142)
  Call setupIngest before fleet_install tests (elastic#72214)
  [Security Solution][Detections] Better toast errors (elastic#72205)
  skip flaky suite (elastic#64696)
  [Security Solution][Detections] Disable exceptions for Threshold and ML rules (elastic#72137)
  [Security Solution][Detections,Lists] Miscellaneous post-FF fixes (elastic#71990)
  [baseline/capture] use high-memory nodes with ramDisks (elastic#71894)
  skip flaky suite (elastic#77207)
  [Maps] Fix issue preventing TMS from rendering correctly (elastic#71946)
  using test_user with minimum privs (elastic#71988)
  Fixed Webhook connector doesn't retain added HTTP header settings (elastic#71924)
  [Ingest Manager] Do not show enrolling and unenrolling agents as online in agent counters (elastic#71921)
  [Maps] fix 'New Map' from getting added to recently accessed (elastic#72125)
  [Visualizations] Pass 'aggs' parameter to custom request handlers (elastic#71423)
  [Monitoring] Out of the box alert tweaks (elastic#71942)
  [ML] Fix datafeed start time is incorrect when the job has trailing empty buckets (elastic#71976)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants