-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report timing stats as part of the Job stats response #42709
Report timing stats as part of the Job stats response #42709
Conversation
Pinging @elastic/ml-core |
run elasticsearch-ci/bwc |
acf1415
to
ca04d84
Compare
I think you need to add a method Additionally, the new field names should go in Finally, are you still planning to add another field for some sort of exponential moving average in addition to the average over all time? |
@@ -273,11 +286,13 @@ public void writeTo(StreamOutput out) throws IOException { | |||
out.writeOptionalString(assignmentExplanation); | |||
out.writeOptionalTimeValue(openTime); | |||
out.writeOptionalWriteable(forecastStats); | |||
out.writeOptionalWriteable(timingStats); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be version-aware so it works in a mixed version cluster:
if (out.getVersion().onOrAfter(V_7_3_0)) {
out.writeOptionalWriteable(timingStats);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -185,6 +190,7 @@ public JobStats(StreamInput in) throws IOException { | |||
assignmentExplanation = in.readOptionalString(); | |||
openTime = in.readOptionalTimeValue(); | |||
forecastStats = in.readOptionalWriteable(ForecastStats::new); | |||
timingStats = in.readOptionalWriteable(TimingStats::new); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be version-aware so it works in a mixed version cluster:
if (in.getVersion().onOrAfter(V_7_3_0)) {
timingStats = in.readOptionalWriteable(TimingStats::new);
} else {
timingStats = null;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@droberts195 is right. Additionally, making it onOrAfter(V_7_3_0)
immediately could break the bwc tests as the actual v7.3.0 code has not been backported to yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, since this is the first BWC issue JobStats serialization, BWC should be added for job stats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@droberts195: Done.
@benwtrent: Any hints on how to add BWC for job stats?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@przemekwitek you should be able to add them to the BWC YAML tests. These are in xpack.qa.rolling-upgrade.resources
. Probably cases added in 30_ml_jobs_crud.yml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I discussed BWC tests with @dimitris-athanasiou and he suggested that it will be easier if I add them (in a separate PR) after merging in this PR into master and backporting into 7.x. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it will be much easier to add the BWC tests in a follow-up PR. Otherwise you'd have to add them on the basis that 7.x doesn't contain this functionality, then adjust both master and 7.x once 7.x does contain this functionality. So I agree with the idea of adding them in a separate PR.
@@ -185,6 +190,7 @@ public JobStats(StreamInput in) throws IOException { | |||
assignmentExplanation = in.readOptionalString(); | |||
openTime = in.readOptionalTimeValue(); | |||
forecastStats = in.readOptionalWriteable(ForecastStats::new); | |||
timingStats = in.readOptionalWriteable(TimingStats::new); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@droberts195 is right. Additionally, making it onOrAfter(V_7_3_0)
immediately could break the bwc tests as the actual v7.3.0 code has not been backported to yet.
@Override | ||
public void writeTo(StreamOutput out) throws IOException { | ||
out.writeString(jobId); | ||
out.writeOptionalLong(bucketCount); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bucketCount
is not optional. If it wrote a null
there would be errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
* Returns true if given stats objects differ from each other by more than 10% for at least one of the statistics. | ||
*/ | ||
public static boolean differSignificantly(TimingStats stats1, TimingStats stats2) { | ||
return differSignificantly(stats1.minBucketProcessingTimeMs, stats2.minBucketProcessingTimeMs, 0.1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make 0.1
a private constant so that it is self-documenting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
I've also removed the "p" parameter as it is not used in any meaningful way at the moment.
|
||
public TimingStats(StreamInput in) throws IOException { | ||
this.jobId = in.readString(); | ||
this.bucketCount = in.readOptionalLong(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bucketCount
is not optional
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -65,8 +67,12 @@ protected Response createTestInstance() { | |||
if (randomBoolean()) { | |||
openTime = parseTimeValue(randomPositiveTimeValue(), "open_time-Test"); | |||
} | |||
Response.JobStats jobStats = new Response.JobStats(jobId, dataCounts, sizeStats, forecastStats, jobState, node, explanation, | |||
openTime); | |||
TimingStats timingStats = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#nit
TimingStats timingStats = null; | |
TimingStats timingStats = randomBoolean() ? null : TimingStatsTests.createTestInstance("foo"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (refactored the whole method this way).
/** | ||
* Updates the statistics (min, max, avg) for the given data point (bucket processing time). | ||
*/ | ||
public void updateStats(double bucketProcessingTimeMs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we disallow negatives?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -124,6 +135,11 @@ public Builder setModelSizeStats(ModelSizeStats modelSizeStats) { | |||
return this; | |||
} | |||
|
|||
public Builder setTimingStats(TimingStats timingStats) { | |||
this.timingStats = timingStats; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this.timingStats = timingStats; | |
this.timingStats = new TimingStats(timingStats); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate on why do you think this change is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@przemekwitek TimingStats
has a method that allows its internal state to mutate. Not doing a copy would allow the parameter timingStats#updateStats
to be called AFTER it is passed to this method causing the Builder#timingStats
to be updated as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that. It's just this setter in builder is only called once and the temporary object is not saved anywhere else:
paramsBuilder.setTimingStats(parseSearchHit(hit, TimingStats.PARSER, errorHandler));
But ok, I applied your suggestion to prevent some random refactoring of breaking this assumption.
@@ -185,6 +190,7 @@ public JobStats(StreamInput in) throws IOException { | |||
assignmentExplanation = in.readOptionalString(); | |||
openTime = in.readOptionalTimeValue(); | |||
forecastStats = in.readOptionalWriteable(ForecastStats::new); | |||
timingStats = in.readOptionalWriteable(TimingStats::new); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, since this is the first BWC issue JobStats serialization, BWC should be added for job stats.
Done.
Done.
Yes, but I'd rather do it in a separate PR as I wanted to check in and verify the base work first. |
run elasticsearch-ci/2 |
3ab9022
to
dc385ac
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Initial version of timing stats reporting.
TimingStats object returned as part of the GetJobStats response contains 3 numbers: min, max and avg of bucket processing times.
Additionally, it contains bucket count. This bucket count may differ from the bucket count stored in DataCounts.
Related issue: #29857