fix: support UDAFs with different intermediate schema #3412
Description
@vpapavas recently enhanced UDAFs to support a different intermediate type, e.g. the AVG UDAF might take a DOUBLE as input and produce a DOUBLE as output, but its intermediate type is STRUCT<COUNT BIGINT, SUM DOUBLE>. This enhancement did not include the parallel changes needed to support the same for static queries.
The rocksdb state stores store the intermediate type. Therefore, static queries, which read data from the rocksdb state stores, must now perform an additional step to convert the intermediate aggregator state to the output type of the aggregator.
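To make the extra step concrete, here is a minimal, self-contained sketch of an AVG-style aggregate whose stored state differs from its output. The names AvgSketch, AvgState and Aggregator are hypothetical and are not the ksqlDB UDAF API; returning 0.0 for an empty state is also an assumption made only for this example.

```java
import java.util.List;

class AvgSketch {
  // Hypothetical intermediate state, mirroring STRUCT<COUNT BIGINT, SUM DOUBLE>.
  static final class AvgState {
    final long count;
    final double sum;

    AvgState(long count, double sum) {
      this.count = count;
      this.sum = sum;
    }
  }

  // Hypothetical aggregate with three types: input I, intermediate A, output O.
  interface Aggregator<I, A, O> {
    A initialize();

    A aggregate(I value, A state);

    // Converts the stored intermediate state into the value a query should return.
    O map(A state);
  }

  static final Aggregator<Double, AvgState, Double> AVG =
      new Aggregator<Double, AvgState, Double>() {
        @Override
        public AvgState initialize() {
          return new AvgState(0L, 0.0);
        }

        @Override
        public AvgState aggregate(Double value, AvgState state) {
          return new AvgState(state.count + 1, state.sum + value);
        }

        @Override
        public Double map(AvgState state) {
          // Assumption for this sketch: an empty state maps to 0.0.
          return state.count == 0 ? 0.0 : state.sum / state.count;
        }
      };

  // Folds inputs into the intermediate state, i.e. what the state store would hold.
  static AvgState fold(List<Double> values) {
    AvgState state = AVG.initialize();
    for (Double v : values) {
      state = AVG.aggregate(v, state);
    }
    return state;
  }

  public static void main(String[] args) {
    AvgState stored = fold(List.of(1.0, 2.0, 6.0));
    // The store holds (count=3, sum=9.0); a reader must call map() to get 3.0.
    System.out.println(stored.count + ", " + stored.sum);
    System.out.println(AVG.map(stored));
  }
}
```

A reader that returned the stored state directly would hand back the count/sum pair rather than the average, which is the gap this PR closes for static queries.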
How to review
In the future, static queries will also generate physical query plans. But for now they are hacked together in StaticQueryExecutor. So please refrain from commenting on the architecture of this PR. I'm well aware it is less than ideal. Once @rodesai has finished decoupling the KS implementation from the physical plan I can start to address this.
KsqlMaterialization and its factory KsqlMaterializationFactory contain the main change, which is to add another step to convert aggregate internal state to final state.
AggregatesInfo was added to capture the data needed for this new step.
AggregateNode was enhanced to capture it and pass it to MaterializationInfo.
Also fixed some bad import order issues along the way (or rather my IDE did).
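As a rough illustration of the "convert aggregate internal state to final state" step, the sketch below wraps a read-only store view in a decorator that applies a conversion function on every read. StoreView and MappedView are hypothetical names invented for this example; they are not the actual KsqlMaterialization classes, and the real change may be structured differently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

class MaterializationSketch {
  // Hypothetical stand-in for a read-only view over a state store.
  interface StoreView<K, V> {
    Optional<V> get(K key);
  }

  // Decorator: reads intermediate state from the inner view and converts it
  // to the aggregate's output type before returning it to the caller.
  static final class MappedView<K, A, O> implements StoreView<K, O> {
    private final StoreView<K, A> inner;
    private final Function<A, O> toOutput;

    MappedView(StoreView<K, A> inner, Function<A, O> toOutput) {
      this.inner = inner;
      this.toOutput = toOutput;
    }

    @Override
    public Optional<O> get(K key) {
      // A missing key stays missing; a present value is converted.
      return inner.get(key).map(toOutput);
    }
  }

  public static void main(String[] args) {
    // Intermediate state stored as {count, sum}, purely for illustration.
    Map<String, double[]> raw = new HashMap<>();
    raw.put("a", new double[] {3, 9.0});

    StoreView<String, double[]> store = k -> Optional.ofNullable(raw.get(k));
    StoreView<String, Double> view = new MappedView<>(store, s -> s[1] / s[0]);

    System.out.println(view.get("a"));       // Optional[3.0]
    System.out.println(view.get("missing")); // Optional.empty
  }
}
```

Keeping the conversion in a wrapper means the underlying store access is unchanged, and only the data needed to build the conversion function has to be passed in from the plan.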
Testing done
materialized-aggregate-static-queries.json has new tests to cover this.
Reviewer checklist