-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-23588][SQL] CatalystToExternalMap should support interpreted execution #20979
Conversation
private lazy val toScalaValue: Any => Any = { | ||
assert(inputData.dataType.isInstanceOf[MapType]) | ||
val mapType = inputData.dataType.asInstanceOf[MapType] | ||
CatalystTypeConverters.createToScalaConverter(mapType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, don't we need to support different types of resulting collection specified by collClass
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, you're right. I'll update soon.
I checked the related code; it seems the input type is always the same with the output type specified by collClass
? ex.)
scala> Seq(Map(1 -> 1, 2 -> 2)).toDS.as[Map[Long, String]].map(y => y).explain(true)
== Physical Plan ==
*(1) SerializeFromObject [externalmaptocatalyst(ExternalMapToCatalyst_key34, false, LongType, lambdavariable(ExternalMapToCatalyst_key34, false, LongType, false), ExternalMapToCatalyst_value34, ExternalMapToCatalyst_value_isNull34, ObjectType(class java.lang.String), staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, lambdavariable(ExternalMapToCatalyst_value34, ExternalMapToCatalyst_value_isNull34, ObjectType(class java.lang.String), true), true, false), input[0, scala.collection.immutable.Map, true]) AS value#137]
+- *(1) MapElements <function1>, obj#136: scala.collection.immutable.Map
+- *(1) DeserializeToObject catalysttoexternalmap(CatalystToExternalMap_keyLoopValue33, lambdavariable(CatalystToExternalMap_keyLoopValue33, , LongType, false), CatalystToExternalMap_valueLoopValue33, CatalystToExternalMap_valueLoopIsNull33, lambdavariable(CatalystToExternalMap_valueLoopValue33, CatalystToExternalMap_valueLoopIsNull33, StringType, true).toString, cast(value#130 as map<bigint,string>), interface scala.collection.immutable.Map, MapType(LongType,StringType,true)), obj#135: scala.collection.immutable.Map
+- LocalTableScan [value#130]
deserializerFor
adds casts implicitly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if I buy that argument. This expression can produce a map type other than scala.collection.immutable.Map
, so we should either implement this, or we should remove the ability to produce different map types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I'll consider again.
Test build #88925 has finished for PR 20979 at commit
|
retest this please |
Test build #88935 has finished for PR 20979 at commit
|
Test build #89201 has finished for PR 20979 at commit
|
Test build #89548 has finished for PR 20979 at commit
|
private lazy val valueConverter = | ||
CatalystTypeConverters.createToScalaConverter(inputMapType.valueType) | ||
|
||
private def newMapBuilder(): Builder[AnyRef, AnyRef] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make sure the builder only get's resolved once per expression instead of once per row. Can you address this in a follow-up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Disregard my earlier comment. Merging to master. Thanks! Can you address my last comment in a follow-up? |
Test build #89567 has finished for PR 20979 at commit
|
ok, thanks! |
…ion in CatalystToExternalMap ## What changes were proposed in this pull request? This pr is a follow-up pr of apache#20979 and fixes code to resolve a map builder method per execution instead of per row in `CatalystToExternalMap`. ## How was this patch tested? Existing tests. Author: Takeshi Yamamuro <[email protected]> Closes apache#21112 from maropu/SPARK-23588-FOLLOWUP.
What changes were proposed in this pull request?
This pr supported interpreted mode for
CatalystToExternalMap
.How was this patch tested?
Added tests in
ObjectExpressionsSuite
.