
Add handoff wait time to IngestionStatsAndErrorsTaskReportData #11090

Merged

Conversation

@capistrant commented Apr 9, 2021

Description

Follow-on to #10676.

The main change is adding segmentAvailabilityWaitTimeMs to IngestionStatsAndErrorsTaskReportData. This is the number of milliseconds the task waited for its segments to be handed off to Historical nodes. If there is no wait for handoff, the value is simply 0; otherwise it falls between 0 and the configured timeout.
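As a rough illustration only (not the exact patch), the new report field might look something like the sketch below, assuming the report class keeps using Jackson annotations like the other task report classes; the class name and getter here are illustrative:

```java
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;

// Illustrative sketch of the new field on the report data class; not the actual Druid source.
public class IngestionStatsAndErrorsTaskReportDataSketch
{
  // Milliseconds the task spent waiting for segment handoff; 0 when no wait occurred.
  private final long segmentAvailabilityWaitTimeMs;

  @JsonCreator
  public IngestionStatsAndErrorsTaskReportDataSketch(
      @JsonProperty("segmentAvailabilityWaitTimeMs") long segmentAvailabilityWaitTimeMs
  )
  {
    this.segmentAvailabilityWaitTimeMs = segmentAvailabilityWaitTimeMs;
  }

  @JsonProperty
  public long getSegmentAvailabilityWaitTimeMs()
  {
    return segmentAvailabilityWaitTimeMs;
  }
}
```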

Secondly, I made a small refactor of AbstractBatchIndexTask#waitForSegmentAvailability. The method now sets the value of segmentAvailabilityConfirmationCompleted as well as returning the boolean assigned to that variable. Previously the code only returned the boolean and relied on the caller to set the value, which seemed like poor practice.
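Roughly, the shape of the refactor is as follows (a sketch only; the Coordinator-polling helper is a hypothetical stand-in for the real logic):

```java
// Sketch of the refactored method: it records the handoff outcome on the task
// and returns it, instead of relying on every caller to set the flag.
abstract class AbstractBatchIndexTaskSketch
{
  // Set to true once segment availability was confirmed before the timeout.
  protected boolean segmentAvailabilityConfirmationCompleted = false;

  // Milliseconds spent waiting for handoff; stays 0 when no wait is configured.
  protected long segmentAvailabilityWaitTimeMs = 0;

  protected boolean waitForSegmentAvailability(long timeoutMillis)
  {
    final long start = System.currentTimeMillis();
    final boolean confirmed = timeoutMillis > 0 && pollCoordinatorForAvailability(timeoutMillis);
    segmentAvailabilityWaitTimeMs = System.currentTimeMillis() - start;
    segmentAvailabilityConfirmationCompleted = confirmed;
    return confirmed;
  }

  // Hypothetical stand-in for the real Coordinator polling logic.
  protected abstract boolean pollCoordinatorForAvailability(long timeoutMillis);
}
```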


Key changed/added classes in this PR
  • IngestionStatsAndErrorsTaskReportData
  • AbstractBatchIndexTask

This PR has:

  • been self-reviewed.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

@capistrant (Author)

@jihoonson you mentioned in #10676 (discussion) that it would be nice to have this added to the report. If you get a chance, could you take a peek at this in the next few weeks? It would be great to get it into the code before the 0.22.0 release cycle starts so that it lands at the same time as the PR that adds the handoff wait. Thanks!

@jihoonson

Thank you for the follow-up! Code changes LGTM, but can you please add some integration tests? I guess you can modify those integration tests you added in the previous PR to verify that the waitTime is always larger than 0.

@capistrant (Author)

> Thank you for the follow-up! Code changes LGTM, but can you please add some integration tests? I guess you can modify those integration tests you added in the previous PR to verify that the waitTime is always larger than 0.

Good point. I went with a simple assertion in the existing code to verify that the report has a non-zero wait time for the ITs that perform a handoff wait. I believe that should suffice to make sure the report stays valid.
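Roughly, the added check looks like this (a sketch only; the way the IT fetches the parsed report payload and the helper names here are hypothetical):

```java
import java.util.Map;

import org.junit.Assert;

// Sketch of the assertion added to the existing IT; names are illustrative.
public class SegmentAvailabilityReportCheck
{
  static void verifyHandoffWaitTimeReported(Map<String, Object> ingestionStatsReportData)
  {
    final long waitTimeMs =
        ((Number) ingestionStatsReportData.get("segmentAvailabilityWaitTimeMs")).longValue();
    // Tasks that were configured to wait for handoff should report a positive wait time.
    Assert.assertTrue("expected a non-zero segment availability wait time", waitTimeMs > 0);
  }
}
```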

@suneet-s left a comment


LGTM! Do we need any accompanying docs for this change?

Nice addition to get the IT for free :)

@@ -198,7 +198,6 @@ A sample task is shown below:
 |id|The task ID. If this is not explicitly specified, Druid generates the task ID using task type, data source name, interval, and date-time stamp. |no|
 |spec|The ingestion spec including the data schema, IOConfig, and TuningConfig. See below for more details. |yes|
 |context|Context containing various task configuration parameters. See below for more details.|no|
-|awaitSegmentAvailabilityTimeoutMillis|Long|Milliseconds to wait for the newly indexed segments to become available for query after ingestion completes. If `<= 0`, no wait will occur. If `> 0`, the task will wait for the Coordinator to indicate that the new segments are available for querying. If the timeout expires, the task will exit as successful, but the segments were not confirmed to have become available for query. Note for compaction tasks: you should not set this to a non-zero value because it is not supported by the compaction task type at this time.|no (default = 0)|
@capistrant (Author)

Realized that this doc entry was misplaced and duplicated in the native-batch file. It is found below in the tuningConfig section where it should be. This is just cleanup.

@capistrant (Author)

> LGTM! Do we need any accompanying docs for this change?
>
> Nice addition to get the IT for free :)

Good point. Just added this to the tasks.md docs to help inform users.

@capistrant (Author)

@jihoonson do we need to do anything more than adding this to the release notes for a future release to notify clients of the JSON payload change? I suppose it is possible that clients rely on a static schema for this report JSON, so the new field could break code that doesn't ignore newly added fields.
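For what it's worth, clients that bind the report JSON to their own static types can opt in to tolerating the new field with standard Jackson settings; a minimal sketch (the ClientReportView type and its field are made up for illustration):

```java
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

// Option 1: per type, ignore fields the client-side view doesn't know about.
@JsonIgnoreProperties(ignoreUnknown = true)
class ClientReportView
{
  public String taskId; // hypothetical client-side field
}

class LenientReportMapper
{
  // Option 2: globally, tell the mapper not to fail on unknown properties.
  static ObjectMapper create()
  {
    return new ObjectMapper()
        .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
  }
}
```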

@jihoonson

LGTM, thanks @capistrant. I'm not particularly worried about it, but we can include a warning about it in the release notes if you want.

@jihoonson merged commit 5c3f3da into apache:master on Sep 21, 2021
@maytasm commented Sep 21, 2021

This change is causing the build to fail. It seems like change #11688, which went in a few days ago, uses the IngestionStatsAndErrorsTaskReportData constructor that this PR changes. The Travis green check on this PR was from 25 days ago. In the future, I recommend rerunning Travis before merging when the checks passed a long time ago.

@capistrant (Author)

> This change is causing the build to fail. It seems like change #11688, which went in a few days ago, uses the IngestionStatsAndErrorsTaskReportData constructor that this PR changes. The Travis green check on this PR was from 25 days ago. In the future, I recommend rerunning Travis before merging when the checks passed a long time ago.

Agree 100%. I've been trying to stay on top of merging with master on a regular basis to keep CI fresh, but I let this one slip. Thanks for the quick fix for master.
