-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Joining on ROWKEY #2735
Fix Joining on ROWKEY #2735
Conversation
…now have the correct keyfield - unnecessary repartition steps of streams now avoided in new queries, but maintained in existing queries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice change @big-andy-coates. Took me a while to get my head wrapped around exactly what was changing despite it being, in essence, a pretty simple change... The data flow has such high complexity around this part of the code I wish there was a simpler way to model it!
ksql-engine/src/main/java/io/confluent/ksql/structured/SchemaKStream.java
Show resolved
Hide resolved
ksql-engine/src/main/java/io/confluent/ksql/planner/plan/JoinNode.java
Outdated
Show resolved
Hide resolved
ksql-engine/src/main/java/io/confluent/ksql/planner/plan/JoinNode.java
Outdated
Show resolved
Hide resolved
ksql-engine/src/main/java/io/confluent/ksql/structured/SchemaKStream.java
Show resolved
Hide resolved
ksql-engine/src/main/java/io/confluent/ksql/structured/SchemaKStream.java
Outdated
Show resolved
Hide resolved
ksql-engine/src/main/java/io/confluent/ksql/planner/plan/JoinNode.java
Outdated
Show resolved
Hide resolved
ksql-engine/src/test/java/io/confluent/ksql/physical/PhysicalPlanBuilderTest.java
Show resolved
Hide resolved
ksql-engine/src/test/java/io/confluent/ksql/physical/PhysicalPlanBuilderTest.java
Show resolved
Hide resolved
…rgs, as this is what they are. They are not KeyFields. Clean up some naming to make the code easier to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @big-andy-coates
LGTM.
@@ -104,12 +107,38 @@ | |||
private static final String FILTER_MAPVALUES_NODE = "KSTREAM-MAPVALUES-0000000004"; | |||
private static final String FOREACH_NODE = "KSTREAM-FOREACH-0000000005"; | |||
|
|||
private static final String createStream = "CREATE STREAM TEST1 (COL0 BIGINT, COL1 VARCHAR, COL2 DOUBLE) WITH ( " | |||
+ "KAFKA_TOPIC = 'test1', VALUE_FORMAT = 'JSON' );"; | |||
private static final String CREATE_STREAM_TEST1 = "CREATE STREAM TEST1 " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For better readability you could include stream/table in the name. For instance, Create Stream Stream_test1 ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I've already done...
Constant CREATE_STREAM_TEST1
CREATEs a STEAM called TEST1...
Description
Fixes #2636
Fixes #2525
Fixes #2470
This continues the work to clean up key field handling in the existing code.
This PR addresses issues when joining by ROWKEY.
empty
on the source added to the metastore when joining on ROWKEY. This is also now resolved.Review notes:
joins.json
to see the tests that have been fixed and added.Analyzer
, which now builds theKeyField
s that theJoinNode
needs.JoinNode
for the wiring. Note:getKeyField
now returnsleftKeyField
, which may feel wrong, but this is refactored existing functionality. I'll be looking into this more soon, as there are issues around key fields where only the right join key is in the projection.SchemaKStream
, specificallyselectKey
as this is where the magic happens. This will, if using new key fields i.e. if a new query, early out and not repartition is selecting rowkey. It still repartitions if using legacy key fields. This method also builds a new key field, replacing ROWKEY in the new key, and uses this to create newSchemaKStream
, which fixes theempty
key issue.PhysicalPlanBuilderTest
as this adds functional tests around the different combinations on join keys.Testing done
Extensive JSON and Unit tests added.
Reviewer checklist