[SPARK-12820][SQL]Resolve db.table.column #10753

Closed
wants to merge 2 commits into from

zhichao-li
Contributor

Currently Spark only supports column names of the form table.col or col in a projection, but users very commonly write db.table.col, especially when joining tables across databases.
Hive doesn't support this for now, but it is used in many other traditional databases such as MySQL.

@zhichao-li changed the title from "[SPARK-12820]Resolve db.table.column" to "[SPARK-12820][SQL]Resolve db.table.column" on Jan 14, 2016
@rxin
Contributor

rxin commented Jan 14, 2016

When we have a database named "a", and two tables:

  1. a table also named "a", with a struct column named "b" that has a field named "c".
  2. a table named "b", with a column named "c".

What happens if the user specifies "a.b.c"?
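
To make the collision concrete, here is a minimal toy model in plain Scala. It is only a sketch: `Column`, `Table`, and `interpretations` are hypothetical names invented for illustration and are not Spark's analyzer classes or the code in this patch. It enumerates every way a three-part name could be read against the two tables described above.

```scala
// Toy model of qualified-name resolution. All names here are hypothetical
// and unrelated to Spark's actual analyzer classes.
object QualifiedNameSketch {
  // A column is either a plain column or a struct with named fields.
  final case class Column(name: String, structFields: Set[String] = Set.empty)
  final case class Table(database: String, name: String, columns: Seq[Column])

  // Lists every way a three-part dotted name can be read against the tables in scope.
  def interpretations(parts: Seq[String], tables: Seq[Table]): Seq[String] =
    parts match {
      case Seq(first, second, third) =>
        // Reading 1: database.table.column
        val asDbTableCol = for {
          t <- tables if t.database == first && t.name == second
          c <- t.columns if c.name == third
        } yield s"column ${t.database}.${t.name}.${c.name}"
        // Reading 2: table.column.field (struct field access)
        val asTableColField = for {
          t <- tables if t.name == first
          c <- t.columns if c.name == second && c.structFields.contains(third)
        } yield s"field $third of struct column ${t.database}.${t.name}.${c.name}"
        asDbTableCol ++ asTableColField
      case _ => Seq.empty
    }

  def main(args: Array[String]): Unit = {
    val tables = Seq(
      Table("a", "a", Seq(Column("b", Set("c")))), // table a.a with struct column b { c }
      Table("a", "b", Seq(Column("c")))            // table a.b with plain column c
    )
    // Both readings match, so "a.b.c" is ambiguous in this toy model.
    interpretations(Seq("a", "b", "c"), tables).foreach(println)
  }
}
```

Running `main` prints two readings, `column a.b.c` and `field c of struct column a.a.b`. A resolver therefore has to either pick a precedence between them or report an ambiguity.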

@zhichao-li
Contributor Author

With this patch it would return attribute c in table b. Any suggestions? Should we throw an ambiguity exception in that case?

@rxin
Contributor

rxin commented Jan 14, 2016

What do we do if the user specifies "a.b" right now? I'd say we should follow that, and make sure we have a test case for it.

@zhichao-li
Contributor Author

That would return attribute b in table a; it is resolved by the existing table.col logic.

@SparkQA

SparkQA commented Jan 14, 2016

Test build #49381 has finished for PR 10753 at commit 1cafed7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Subquery(alias: String, child: LogicalPlan, databaseName: Option[String] = None)

@chenghao-intel
Contributor

I would say we'd better keep the same checking logic as MySQL/Hive for the ambiguous case. @zhichao-li, can you please check how MySQL/Hive handle it?

@zhichao-li
Contributor Author

@chenghao-intel, MySQL (5.5) doesn't support nested data types for now, and Hive (1.2.1) doesn't support "db.table.field" in the projection list at the moment. I guess most users work around this by using an alias.

@SparkQA

SparkQA commented Feb 25, 2016

Test build #51930 has finished for PR 10753 at commit 4155ffe.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 26, 2016

Test build #52031 has finished for PR 10753 at commit a3c4f20.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Jun 15, 2016

Thanks for the pull request. I'm going through a list of pull requests to cut them down since the sheer number is breaking some of the tooling we have. Due to lack of activity on this pull request, I'm going to push a commit to close it. Feel free to reopen it or create a new one. We can also continue the discussion on the JIRA ticket.

@asfgit asfgit closed this in 1a33f2e Jun 15, 2016