[SPARK-14796][SQL] Add spark.sql.optimizer.inSetConversionThreshold config option. #12562
Can we add a unit test in the appropriate optimizer suite? We also need to come up with a better name.
Thank you for the review, @rxin.
Test build #56492 has finished for PR 12562 at commit
Oh, sorry. There exists already |
Test build #56511 has finished for PR 12562 at commit
Test build #56514 has finished for PR 12562 at commit
Test build #56550 has finished for PR 12562 at commit
@@ -128,4 +131,21 @@ class OptimizeInSuite extends PlanTest {
comparePlans(optimized, correctAnswer)
}

test("OptimizedIn test: Use configuration.") {
I'd give this a more descriptive name, and explicitly say it sets the threshold for turning In into InSet.
Maybe inSetConversionThreshold?
@@ -17,11 +17,14 @@

package org.apache.spark.sql.catalyst.optimizer

import scala.collection.immutable.HashSet
Do you use this import?
Thank you so much, @rxin and @marmbrus!
By the way, @marmbrus, did you mean the duplication of value
Test build #56573 has finished for PR 12562 at commit
Merging in master. Thanks.
What changes were proposed in this pull request?
Currently, the OptimizeIn optimizer rule converts an In expression into an InSet expression if the size of the set is greater than a constant, 10. This issue aims to add a configuration option, spark.sql.optimizer.inSetConversionThreshold, for that threshold. After this PR, OptimizeIn is configurable.
How was this patch tested?
Pass the Jenkins tests (with a new testcase).
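To illustrate the threshold behavior summarized in the PR description, here is a minimal, self-contained sketch. It is a toy model and not the actual Catalyst code: the Predicate, In, and InSet types below, the OptimizeInSketch object, and its optimize method are all hypothetical stand-ins, and the default of 10 is taken from the constant mentioned in the description. It assumes the conversion fires when the number of listed values is strictly greater than the threshold.

```scala
// Toy model for illustration only; these are NOT Spark's Catalyst classes.
sealed trait Predicate
// "column IN (v1, v2, ...)" evaluated by scanning the list.
final case class In(column: String, values: Seq[Int]) extends Predicate
// The same membership test backed by a hash set.
final case class InSet(column: String, values: Set[Int]) extends Predicate

object OptimizeInSketch {
  // Hypothetical default mirroring the constant 10 from the PR description.
  val DefaultThreshold: Int = 10

  // Convert In to InSet only when the list is longer than the threshold
  // (assumed to be a strict "greater than" comparison).
  def optimize(p: Predicate, threshold: Int = DefaultThreshold): Predicate =
    p match {
      case In(column, values) if values.size > threshold =>
        InSet(column, values.toSet)
      case other => other
    }
}
```

With the default threshold, a list of 10 values stays an In while a list of 11 becomes an InSet; passing a smaller threshold makes the conversion happen earlier. In Spark itself the option would presumably be set like any other SQL configuration, for example with SET spark.sql.optimizer.inSetConversionThreshold=20.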