-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Web console: reflect the changes to interval requirement in the data loader flow #10647
Conversation
@@ -1546,17 +1557,41 @@ export function getPartitionRelatedTuningSpecFormFields( | |||
}, | |||
]; | |||
|
|||
if (oneOf(deepGet(spec, 'spec.tuningConfig.partitionsSpec.type'), 'hashed', 'single_dim')) { | |||
parallelFields.push({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to show time intervals in the partitioning step? Seems like we have a duplicate function in the "global filter" in the filtering step (probably here). I prefer the intervals filter in the filtering step because it's a filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct and I debated taking them out of the partition step completely but decided to leave them (redundantly) and have them folded under Show more
. My understanding it that setting intervals still has an effect on how many ingestion tasks are spawned - so it is still relevant (although unlikely) here. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. So, this is more for ingestion performance tuning rather than filtering. Performance tuning guide seems nice to have. Ideally, it would be great if it could be done in a user-friendly way in the web console, which I have no idea on how. The best I can think of is adding a documentation for ingestion performance tuning and linking them in the web console. Maybe we can do it as a follow-up.
Going back to this change, it seems sort of reasonable to me in a sense that it gives some information for performance tuning. One comment I have now is that I was confused about how to set intervals with dynamic partitioning because the Show more
button doesn't show up. Maybe we can show a warning or a hint text directly (currently the description shows up when you click the button) that says you can make your ingestion job faster by setting intervals in the filtering step? I know this could be a bad UX but at least consistent across partitioning types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vogievetsky and I talked offline. The intervals
is a sort of filter but it's actually more about partitioning. The filtering can happen as a side effect of partitioning. The below is a quote from our doc.
A list of intervals describing what time chunks of segments should be created. If type is set to uniform, this list will be broken up and rounded-off based on the segmentGranularity. If type is set to arbitrary, this list will be used as-is.
So, we decided to move it back to the partitioning step.
@vogievetsky this is a nice change! Would you please take a look at my comment? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
Thank you for the review |
…loader flow (apache#10647) * no need for intervals * don't set redundant fields * fix tests * better filter control * work with not * wrap callout with form group * update snapshot * add split hint * highlight issues with spec * fixes * fix default value * move intervals back to partition step * work with all sorts of chars * fix enabled view
The changes made in the #10592 PR make it so that intervals are no longer required for hashed / single_dim. This PR reflects that change in the data loader flow.
This PR also fixes:
This PR has:
(Remove this item if the PR doesn't have any relation to concurrency.)