Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-22972] Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc #20165

Closed
wants to merge 6 commits into from

Conversation

xubo245
Copy link
Contributor

@xubo245 xubo245 commented Jan 5, 2018

What changes were proposed in this pull request?

Fix the warning: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc.

How was this patch tested?

test("SPARK-22972: hive orc source")
assert(HiveSerDe.sourceToSerDe("org.apache.spark.sql.hive.orc")
.equals(HiveSerDe.sourceToSerDe("orc")))

@gatorsmile
Copy link
Member

ok to test

@@ -72,7 +72,7 @@ object HiveSerDe {
def sourceToSerDe(source: String): Option[HiveSerDe] = {
val key = source.toLowerCase(Locale.ROOT) match {
case s if s.startsWith("org.apache.spark.sql.parquet") => "parquet"
case s if s.startsWith("org.apache.spark.sql.orc") => "orc"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to keep the original one. We do not want to introduce behavior breaks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

| PATH '${new File(orcTableAsDir.getAbsolutePath).toURI}'
|)
""".
stripMargin)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

combine the line 73 and 74

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

""".
stripMargin)
spark.sql(
"desc formatted normal_orc_as_source_hive").show()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change it to spark.sql("desc formatted normal_orc_as_source_hive").show(), is it ok?

How to get the warning and verify it in code?

"desc formatted normal_orc_as_source_hive").show()
checkAnswer(sql("SELECT COUNT(*) FROM normal_orc_as_source_hive"), Row(10))
assert(HiveSerDe.sourceToSerDe("org.apache.spark.sql.hive.orc")
.equals(HiveSerDe.sourceToSerDe("orc")))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add the checks:

assert(HiveSerDe.sourceToSerDe("org.apache.spark.sql.orc")
  .equals(HiveSerDe.sourceToSerDe("orc")))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@SparkQA
Copy link

SparkQA commented Jan 5, 2018

Test build #85724 has finished for PR 20165 at commit cf7cbce.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

| PATH '${new File(orcTableAsDir.getAbsolutePath).toURI}'
|)
""".stripMargin)
spark.sql("desc formatted normal_orc_as_source_hive").show()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace it by

    val tableMetadata = spark.sessionState.catalog.getTableMetadata(
      TableIdentifier("normal_orc_as_source_hive"))
    assert(tableMetadata.storage.inputFormat ==
      Option("org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"))
    assert(tableMetadata.storage.outputFormat ==
      Option("org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat"))
    assert(tableMetadata.storage.serde ==
      Option("org.apache.hadoop.hive.ql.io.orc.OrcSerde"))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -62,6 +63,22 @@ class HiveOrcSourceSuite extends OrcSuite with TestHiveSingleton {
""".stripMargin)
}

test("SPARK-22972: hive orc source") {
spark.sql(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WithTable("normal_orc_as_source_hive")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -62,6 +63,22 @@ class HiveOrcSourceSuite extends OrcSuite with TestHiveSingleton {
""".stripMargin)
}

test("SPARK-22972: hive orc source") {
spark.sql(
s"""CREATE TABLE normal_orc_as_source_hive
Copy link
Member

@gatorsmile gatorsmile Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put it to next line. You can refer the other test cases of CREATE TABLE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@SparkQA
Copy link

SparkQA commented Jan 5, 2018

Test build #85723 has finished for PR 20165 at commit fa902d6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 5, 2018

Test build #85725 has finished for PR 20165 at commit b64ce53.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 8, 2018

Test build #85778 has finished for PR 20165 at commit 1cfa72f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 8, 2018

Test build #85779 has finished for PR 20165 at commit cc4dd13.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

test("SPARK-22972: hive orc source") {
val tableName = "normal_orc_as_source_hive"
withTable(tableName) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

|OPTIONS (
| PATH '${new File(orcTableAsDir.getAbsolutePath).toURI}'
|)
""".stripMargin)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

s"""
  |CREATE TABLE $tableName
  |USING org.apache.spark.sql.hive.orc
  |OPTIONS (
  |  PATH '${new File(orcTableAsDir.getAbsolutePath).toURI}'
  |)
""".stripMargin)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful about the indents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@gatorsmile
Copy link
Member

LGTM except two trivial comments.

@SparkQA
Copy link

SparkQA commented Jan 8, 2018

Test build #85796 has finished for PR 20165 at commit 3eafdfb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Jan 9, 2018
…provider org.apache.spark.sql.hive.orc

## What changes were proposed in this pull request?
Fix the warning: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc.

## How was this patch tested?
 test("SPARK-22972: hive orc source")
    assert(HiveSerDe.sourceToSerDe("org.apache.spark.sql.hive.orc")
      .equals(HiveSerDe.sourceToSerDe("orc")))

Author: xubo245 <[email protected]>

Closes #20165 from xubo245/HiveSerDe.

(cherry picked from commit 68ce792)
Signed-off-by: gatorsmile <[email protected]>
@asfgit asfgit closed this in 68ce792 Jan 9, 2018
@gatorsmile
Copy link
Member

Thanks! Merged to master/2.3

Could you submit a separate PR for 2.2?

@xubo245
Copy link
Contributor Author

xubo245 commented Jan 9, 2018

Thank you too.
Ok, I will raise a PR for this later.

xubo245 added a commit to xubo245/spark that referenced this pull request Jan 9, 2018
…provider org.apache.spark.sql.hive.orc

Fix the warning: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc.
For branch-2.2, it cherry-pick from apache@8032cf8

 test("SPARK-22972: hive orc source")
    assert(HiveSerDe.sourceToSerDe("org.apache.spark.sql.hive.orc")
      .equals(HiveSerDe.sourceToSerDe("orc")))

Author: xubo245 <[email protected]>

Closes apache#20165 from xubo245/HiveSerDe.
asfgit pushed a commit that referenced this pull request Jan 10, 2018
…provider org.apache.spark.sql.hive.orc

## What changes were proposed in this pull request?

Fix the warning: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc.
This PR is for branch-2.2 and cherry-pick from 8032cf8

The old PR is #20165

## How was this patch tested?

 Please see test("SPARK-22972: hive orc source")

Author: xubo245 <[email protected]>

Closes #20195 from xubo245/HiveSerDeForBranch2.2.
MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018
…provider org.apache.spark.sql.hive.orc

## What changes were proposed in this pull request?

Fix the warning: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc.
This PR is for branch-2.2 and cherry-pick from apache@8032cf8

The old PR is apache#20165

## How was this patch tested?

 Please see test("SPARK-22972: hive orc source")

Author: xubo245 <[email protected]>

Closes apache#20195 from xubo245/HiveSerDeForBranch2.2.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants