-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-22972] Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc #20165
Conversation
…provider org.apache.spark.sql.hive.orc.
ok to test |
@@ -72,7 +72,7 @@ object HiveSerDe { | |||
def sourceToSerDe(source: String): Option[HiveSerDe] = { | |||
val key = source.toLowerCase(Locale.ROOT) match { | |||
case s if s.startsWith("org.apache.spark.sql.parquet") => "parquet" | |||
case s if s.startsWith("org.apache.spark.sql.orc") => "orc" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to keep the original one. We do not want to introduce behavior breaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
| PATH '${new File(orcTableAsDir.getAbsolutePath).toURI}' | ||
|) | ||
""". | ||
stripMargin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
combine the line 73 and 74
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
""". | ||
stripMargin) | ||
spark.sql( | ||
"desc formatted normal_orc_as_source_hive").show() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change it to spark.sql("desc formatted normal_orc_as_source_hive").show(), is it ok?
How to get the warning and verify it in code?
"desc formatted normal_orc_as_source_hive").show() | ||
checkAnswer(sql("SELECT COUNT(*) FROM normal_orc_as_source_hive"), Row(10)) | ||
assert(HiveSerDe.sourceToSerDe("org.apache.spark.sql.hive.orc") | ||
.equals(HiveSerDe.sourceToSerDe("orc"))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also add the checks:
assert(HiveSerDe.sourceToSerDe("org.apache.spark.sql.orc")
.equals(HiveSerDe.sourceToSerDe("orc")))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
Test build #85724 has finished for PR 20165 at commit
|
| PATH '${new File(orcTableAsDir.getAbsolutePath).toURI}' | ||
|) | ||
""".stripMargin) | ||
spark.sql("desc formatted normal_orc_as_source_hive").show() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace it by
val tableMetadata = spark.sessionState.catalog.getTableMetadata(
TableIdentifier("normal_orc_as_source_hive"))
assert(tableMetadata.storage.inputFormat ==
Option("org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"))
assert(tableMetadata.storage.outputFormat ==
Option("org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat"))
assert(tableMetadata.storage.serde ==
Option("org.apache.hadoop.hive.ql.io.orc.OrcSerde"))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -62,6 +63,22 @@ class HiveOrcSourceSuite extends OrcSuite with TestHiveSingleton { | |||
""".stripMargin) | |||
} | |||
|
|||
test("SPARK-22972: hive orc source") { | |||
spark.sql( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WithTable("normal_orc_as_source_hive")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -62,6 +63,22 @@ class HiveOrcSourceSuite extends OrcSuite with TestHiveSingleton { | |||
""".stripMargin) | |||
} | |||
|
|||
test("SPARK-22972: hive orc source") { | |||
spark.sql( | |||
s"""CREATE TABLE normal_orc_as_source_hive |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put it to next line. You can refer the other test cases of CREATE TABLE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Test build #85723 has finished for PR 20165 at commit
|
Test build #85725 has finished for PR 20165 at commit
|
Test build #85778 has finished for PR 20165 at commit
|
Test build #85779 has finished for PR 20165 at commit
|
test("SPARK-22972: hive orc source") { | ||
val tableName = "normal_orc_as_source_hive" | ||
withTable(tableName) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
|OPTIONS ( | ||
| PATH '${new File(orcTableAsDir.getAbsolutePath).toURI}' | ||
|) | ||
""".stripMargin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
s"""
|CREATE TABLE $tableName
|USING org.apache.spark.sql.hive.orc
|OPTIONS (
| PATH '${new File(orcTableAsDir.getAbsolutePath).toURI}'
|)
""".stripMargin)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be careful about the indents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
LGTM except two trivial comments. |
Test build #85796 has finished for PR 20165 at commit
|
…provider org.apache.spark.sql.hive.orc ## What changes were proposed in this pull request? Fix the warning: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc. ## How was this patch tested? test("SPARK-22972: hive orc source") assert(HiveSerDe.sourceToSerDe("org.apache.spark.sql.hive.orc") .equals(HiveSerDe.sourceToSerDe("orc"))) Author: xubo245 <[email protected]> Closes #20165 from xubo245/HiveSerDe. (cherry picked from commit 68ce792) Signed-off-by: gatorsmile <[email protected]>
Thanks! Merged to master/2.3 Could you submit a separate PR for 2.2? |
Thank you too. |
…provider org.apache.spark.sql.hive.orc Fix the warning: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc. For branch-2.2, it cherry-pick from apache@8032cf8 test("SPARK-22972: hive orc source") assert(HiveSerDe.sourceToSerDe("org.apache.spark.sql.hive.orc") .equals(HiveSerDe.sourceToSerDe("orc"))) Author: xubo245 <[email protected]> Closes apache#20165 from xubo245/HiveSerDe.
…provider org.apache.spark.sql.hive.orc ## What changes were proposed in this pull request? Fix the warning: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc. This PR is for branch-2.2 and cherry-pick from 8032cf8 The old PR is #20165 ## How was this patch tested? Please see test("SPARK-22972: hive orc source") Author: xubo245 <[email protected]> Closes #20195 from xubo245/HiveSerDeForBranch2.2.
…provider org.apache.spark.sql.hive.orc ## What changes were proposed in this pull request? Fix the warning: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc. This PR is for branch-2.2 and cherry-pick from apache@8032cf8 The old PR is apache#20165 ## How was this patch tested? Please see test("SPARK-22972: hive orc source") Author: xubo245 <[email protected]> Closes apache#20195 from xubo245/HiveSerDeForBranch2.2.
What changes were proposed in this pull request?
Fix the warning: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc.
How was this patch tested?
test("SPARK-22972: hive orc source")
assert(HiveSerDe.sourceToSerDe("org.apache.spark.sql.hive.orc")
.equals(HiveSerDe.sourceToSerDe("orc")))