[SPARK-14112] [SQL] [WIP] Unique Constraints over a Set of AttributeReferences #11930

Closed
wants to merge 1 commit into from

gatorsmile
Member

What changes were proposed in this pull request?

This PR introduces unique constraints over a set of AttributeReferences. Below are just two of the use cases:

We can infer the uniqueness of the current operator's output from the uniqueness of its child node's output. The bottom-up propagation rules for unique constraints are:

  • Distinct, Intersect, Except always return distinct values.
  • Aggregate has three cases:
    1. When its aggregate expressions are identical to its grouping expressions, it is equivalent to Distinct and always returns distinct values.
    2. When the child's outputSet is a subset of its own outputSet, it keeps the unique constraints of the child.
    3. Otherwise, it does not enforce unique constraints.
  • UnaryNode: Filter, BroadcastHint, Sort, Window, GlobalLimit, LocalLimit, Sample and SubqueryAlias keep the unique constraints of the child.
  • BinaryNode: a left-semi Join keeps the unique constraints of the left child.
  • Project keeps the unique constraints of the child if and only if the child's outputSet is a subset of its own outputSet.
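
The propagation rules above can be sketched as a small, self-contained Python model. This is an illustration only, not the code in this PR and not Catalyst's API: the `Plan` class, its fields, and the `unique_sets` function are hypothetical names, attributes are modelled as plain strings, and "always returns distinct values" is read as "the full output set is unique".

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple

Attrs = FrozenSet[str]  # a set of attribute names (stand-in for AttributeReferences)

# Unary operators that, per the rules above, keep the child's unique constraints.
PASS_THROUGH = {
    "Filter", "BroadcastHint", "Sort", "Window",
    "GlobalLimit", "LocalLimit", "Sample", "SubqueryAlias",
}


@dataclass(frozen=True)
class Plan:
    """Hypothetical toy plan node; not a Catalyst class."""
    kind: str
    output: Attrs
    children: Tuple["Plan", ...] = ()
    grouping: Attrs = frozenset()                    # used by Aggregate only
    declared_unique: FrozenSet[Attrs] = frozenset()  # used by leaf relations only
    join_type: str = ""                              # used by Join only


def unique_sets(plan: Plan) -> Set[Attrs]:
    """Return the attribute sets known to be unique in the output of `plan`."""
    kind = plan.kind
    if kind == "Relation":
        # Assumption: leaves declare their own unique attribute sets.
        return set(plan.declared_unique)
    if kind in ("Distinct", "Intersect", "Except"):
        # Rows are distinct, so the whole output set is unique.
        return {plan.output}
    if kind == "Aggregate":
        child = plan.children[0]
        if plan.output == plan.grouping:
            return {plan.output}           # case 1: equivalent to Distinct
        if child.output <= plan.output:
            return unique_sets(child)      # case 2: keep the child's constraints
        return set()                       # case 3: nothing is enforced
    if kind in PASS_THROUGH:
        return unique_sets(plan.children[0])
    if kind == "Join" and plan.join_type == "leftsemi":
        return unique_sets(plan.children[0])  # left child only
    if kind == "Project":
        child = plan.children[0]
        return unique_sets(child) if child.output <= plan.output else set()
    return set()  # any other operator: assume no constraints survive


# Usage: a relation whose `id` column is declared unique.
table = Plan(
    "Relation",
    frozenset({"id", "name"}),
    declared_unique=frozenset({frozenset({"id"})}),
)
filtered = Plan("Filter", table.output, (table,))
projected = Plan("Project", frozenset({"name"}), (table,))
print(unique_sets(filtered))   # the constraint on `id` survives the Filter
print(unique_sets(projected))  # dropping `id` loses the constraint
```

In this model the Project that drops `id` loses every constraint, which mirrors the "if and only if" wording of the last rule; a finer-grained rule that keeps constraints whose attributes all survive the projection would be a different design from the one described here.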

How was this patch tested?

TODO: add a set of test cases for verifying the propagation rules

@SparkQA

SparkQA commented Mar 24, 2016

Test build #54017 has finished for PR 11930 at commit aea6146.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Jun 29, 2016

@gatorsmile can you close this? looks like a WIP that hasn't been updated in a long time. You have 16 PRs open at the moment, actually.

@gatorsmile
Member Author

Sure, let me close it. I will slow down to make sure I have fewer than 20 open PRs. Thanks!

@gatorsmile gatorsmile closed this Jun 29, 2016