-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add handoff wait time to IngestionStatsAndErrorsTaskReportData #11090
Add handoff wait time to IngestionStatsAndErrorsTaskReportData #11090
Conversation
@jihoonson you mentioned that it would be nice to have this added to the report in #10676 (discussion) If you get a chance, could you take a peek at this in the next few weeks? would be great to get it into the code before 0.22.0 release cycle starts so it goes in as the same time as the pr to add the handoff wait. Thanks! |
Thank you for the follow-up! Code changes LGTM, but can you please add some integration tests? I guess you can modify those integration tests you added in the previous PR to verify that the waitTime is always larger than 0. |
…s wait for segments happened
good point. I went with a simple assertion in the existing code to verify the report has a non-zero wait time for the IT that perform a handoff wait. I believe that should suffice in making sure the report stays valid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Do we need any accompanying docs for this change?
Nice addition to get the IT for free :)
@@ -198,7 +198,6 @@ A sample task is shown below: | |||
|id|The task ID. If this is not explicitly specified, Druid generates the task ID using task type, data source name, interval, and date-time stamp. |no| | |||
|spec|The ingestion spec including the data schema, IOConfig, and TuningConfig. See below for more details. |yes| | |||
|context|Context containing various task configuration parameters. See below for more details.|no| | |||
|awaitSegmentAvailabilityTimeoutMillis|Long|Milliseconds to wait for the newly indexed segments to become available for query after ingestion completes. If `<= 0`, no wait will occur. If `> 0`, the task will wait for the Coordinator to indicate that the new segments are available for querying. If the timeout expires, the task will exit as successful, but the segments were not confirmed to have become available for query. Note for compaction tasks: you should not set this to a non-zero value because it is not supported by the compaction task type at this time.|no (default = 0)| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Realized that this doc was misplaced and duplicated in the native-batch file. It is found below in the tuningConfig section where it should be. This is just clean up.
good point. just added to the tasks.md docs to help inform users |
@jihoonson should we have to do anything more than adding this to release notes in future release to notify clients of json payload change? I suppose it is possible that clients are relying on some static schema for this report JSON so the new field could break code that doesn't ignore new fields being added. |
LGTM, Thanks @capistrant. I don't particularly worry about it but we can include some warning about it in the release notes if you want. |
This change is causing build to fail. Seems like the change #11688 that went in a few days ago use the IngestionStatsAndErrorsTaskReportData constructor which is now changed in this PR. Travis green check in this PR was from 25 days ago. In the future, I recommend rerunning Travis before merging for checks that passed long time ago. |
agree 100%. I've been trying to stay on top of merging with master on a regular basis to make sure I keep CI fresh but let this one slip. Thanks for the quick fix for master |
Description
Follow on to #10676.
The main change is adding
segmentAvailabilityWaitTimeMs
toIngestionStatsAndErrorsTaskReportData
. This is the milliseconds that a task waited for its segments to be handed off to Historical nodes. If there is no wait for handoff, the value is simply 0. Otherwise the handoff value can be between 0 and the configured timeout.Secondly, I made a small refactor of
AbstractBatchIndexTask#waitForSegmentAvailability
. The method now sets the value forsegmentAvailabilityConfirmationCompleted
as well as returns the boolean value assigned to that variable. In the past the code only returned the boolean and relied on the caller to set the value, this seemed like poor practice.Key changed/added classes in this PR
IngestionStatsAndErrorsTaskReportData
AbstractBatchIndexTask
This PR has: