-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-36092: [C++] Simplify concurrency in as-of-join node #36094
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me. Let me verify I understand:
Before, on InputReceived, we would update the memo for an input based on the new batch.
Now, the only thing that really happens on InputReceived, is the batch is placed into the appropriate input queue. The update to the memo table happens when teh processing thread runs, as it is processing batches for an input.
bool update = current_time_ < ts; | ||
if (update) current_time_ = ts; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool update = current_time_ < ts; | |
if (update) current_time_ = ts; | |
current_time_ = std::min(current_time_, ts); |
Minor nit: Maybe simpler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method still needs to return a Boolean, so we can't remove the bool update
line, but we can replace the if-statement.
Yes exactly |
@rtpsw Looks like you added another atomic variable here: Can you fix that one too? |
Nvm looks like you fixed this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
As @icexelloss said, this is correct. However, there is another difference to note. The pre-PR code invalidates the key hasher and updates the memo-store time given any batch, whereas the post-PR code does so given all but the first batch (after |
Conbench analyzed the 6 benchmark runs on commit There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
What changes are included in this PR?
The key hasher invalidation and memo-store time updating are moved to the processing thread.
Are these changes tested?
Yes, by existing tests.
Are there any user-facing changes?
No.