-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix IncrementalIndex performance regression #12048
fix IncrementalIndex performance regression #12048
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
if (timeAndMetricsColumnCapabilities.containsKey(columnName)) { | ||
return timeAndMetricsColumnCapabilities.get(columnName); | ||
} | ||
if (dimensionDescs.containsKey(columnName)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking through the code in IncrementalIndex, sometimes dimensionDescs
is accessed with synchronization. Sometimes it is not. Assuming that it's okay to access it without synchronization, this looks good. Have we verified that we will already be in a synchronized block once we get to this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, will have a look. nevermind, it doesn't change after constructortimeAndMetricsColumnCapabilities
would also have this problem potentially?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, I didn't trust any of the users of dimensionDescs
that weren't obviously safe so adjusted a couple of other locations as well before I did the measurement in my previous comment
Description
This PR fixes a performance regression caused by #11853, where adding type information to the
RowBasedColumnSelectorFactory
ofIncrementalIndex
was done by using thegetColumnCapabilities
method the latter had, which built a hashmap of all the capabilities and then translated that into aRowSignature
. This turned out to be very dramatically expensive.To fix this,
IncrementalIndex
now just implementsColumnInspector
so that it can cut out the middle-man and serve as theColumnInspector
for theRowBasedColumnSelectorFactory
.Before:
After:
Zoomed into sink.add:
before:
after:
Task run times are shorter too:
before:
after:
This PR has: