Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Union in decoupled mode #15870

Merged
merged 81 commits into from
Feb 21, 2024
Merged

Conversation

kgyrtkirk
Copy link
Member

@kgyrtkirk kgyrtkirk commented Feb 8, 2024

  • to support Union some further changes were needed - as the original DruidQueryGenerator didn’t supported Dag -s.
  • after making the changes for that it was relatively easy to add Union
  • added a test to ensure that unused NotYetSupported modes are removed
  • moved DruidQueryGenerator to a separate package (I think in the future related classes may come here - which could give better overview)

@kgyrtkirk kgyrtkirk marked this pull request as ready for review February 9, 2024 18:55
@@ -262,6 +262,7 @@ private Program buildBaseRuleSetProgram(PlannerContext plannerContext)
builder.addMatchLimit(CalciteRulesManager.HEP_DEFAULT_MATCH_LIMIT);
builder.addGroupBegin();
builder.addRuleCollection(baseRuleSet(plannerContext));
builder.addRuleInstance(CoreRules.UNION_MERGE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel the name of the method is no longer accurate (it's going beyond baseRuleSet). It appears this is only used in decoupled mode-- is that correct? If so some comments about that, or reflecting it in the name, would be useful.

Haven't read the rest of the patch as yet, so this is just a single comment rather than a review.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to separate basic optimiazation rules from the conversion rules - it will be usefull to be able to enable/tweak rules without altering the other planning rules.

It kinda already made one things easier: I was able to just throw in UNION_MERGE - without having to worry about disturbing existing plan differences.

there will be some more deeper tweaking will be needed as AGGREGATE_REMOVE and AGGREGATE_CASE_TO_FILTER doesn't play nicely together - which induces some plan differences

I can rename it to buildDecoupledLogicalOptimizationProgram - does that sound better?

Copy link
Member

@rohangarg rohangarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the changes @kgyrtkirk :)
I have left a few comments, please let me know your thoughts!

}
if (newInputs.size() == 1) {
Vertex inputVertex = newInputs.get(0);
Vertex newVertex = inputVertex.mergeNode(node, isRoot);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of mergeNode, does this seem more like a addParent/ buildParent?
Also, the method name can contain a maybe too since it can return null as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've used Optional instead maybe - this made me remove some apidoc as well ; as now the returntype tells the story (instead of me in an apidoc)

Since we have a Vertex here the use of the keyword parent might be misleading.

I've renamed it to extendWith ; new method signature:

Optional<Vertex> extendWith(RelNode parentNode, boolean isRoot)

@Nullable
public PartialDruidQuery mergeNode1(RelNode node, boolean isRoot)
{
if (accepts(node, Stage.WHERE_FILTER, Filter.class)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In future, should we move this code to respective DruidLogicalNode themselves? That interface can have an accept method and the input to this method could be a DruidLogicalNode instead of a RelNode?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now I think PartialDruidQuery is a way to build the Query - and I believe there might be other ways to build queries in the future - to keep that separation natural: I think the builder should hide away the conversion logic.

From another viewpoint: if there is another way to build Query (not thru PartialDruidQuery) ; then the nodes it uses why contain PartialDruidQuery related stuff?

Maybe the common denominator would be to have a target build related node classes like PDQFilter / PDQProject ... that might be another approach we could take later on....

Copy link
Member

@rohangarg rohangarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍
Request to address the comment about test runtime before merge

@Test
public void ensureAllModesUsed()
{
Set<Method> methodsAnnotatedWith = new Reflections("org.apache", new MethodAnnotationsScanner())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how much time does this test take? if it takes long enough, might be worth changing this prefix to org.apache.druid.sql

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it took 1.595s on the CI

on my system:

  • 1.3s without changes
  • 0.45s with org.apache.druid.sql

experimented a but with changing scan configuration - but it just made it more complicated without much benefit... changed it to org.apache.druid.sql :D

@rohangarg rohangarg merged commit bcce080 into apache:master Feb 21, 2024
82 of 83 checks passed
@rohangarg
Copy link
Member

Merged since the test failure was unrelated (in Nested Column ITs)

@adarshsanjeev adarshsanjeev added this to the 30.0.0 milestone May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants