[SPARK-12820][SQL]Resolve db.table.column #10753

zhichao-li · 2016-01-14T05:15:47Z

Currently Spark only supports specifying a column name as table.col or col in a projection, but users very commonly write db.table.col, especially when joining tables across databases.
Hive doesn't support this for now, but it is widely used in other traditional databases such as MySQL.

rxin · 2016-01-14T05:36:46Z

Say we have a database named "a" with two tables:

- a table also named "a", with a struct column named "b" and a field in the struct named "c"
- a table named "b", with a column named "c"

What happens if the user specifies "a.b.c"?

zhichao-li · 2016-01-14T06:17:45Z

With this patch, it would return attribute c in table b. Any suggestions? Should we throw an ambiguity exception in this case?
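
To make the ambiguity concrete, here is a minimal Scala sketch of the scenario described above. It is a toy model only: `Column`, `Table`, and `interpretations` are hypothetical names invented for illustration and are not Spark's Catalyst classes or the code in this patch. It simply lists every way a three-part name could be read against a set of tables.

```scala
// Toy model only: these types are hypothetical, not Spark's Catalyst classes.
object NameResolutionSketch {
  // A column, optionally a struct with named fields.
  final case class Column(name: String, fields: Set[String] = Set.empty)
  final case class Table(database: String, name: String, columns: Seq[Column])

  // Lists every reading of a three-part dotted name against the given tables.
  def interpretations(parts: Seq[String], tables: Seq[Table]): Seq[String] = parts match {
    case Seq(p1, p2, p3) =>
      // Reading 1: database.table.column
      val asDbTableCol = for {
        t <- tables if t.database == p1 && t.name == p2
        c <- t.columns if c.name == p3
      } yield s"column '${c.name}' of table ${t.database}.${t.name}"
      // Reading 2: table.column.field (struct field access)
      val asTableColField = for {
        t <- tables if t.name == p1
        c <- t.columns if c.name == p2 && c.fields.contains(p3)
      } yield s"field '$p3' of struct column ${t.name}.${c.name}"
      // Database-qualified readings are listed first.
      asDbTableCol ++ asTableColField
    case _ => Seq.empty
  }

  // The scenario from the question: database "a" holds table "a"
  // (struct column "b" with field "c") and table "b" (plain column "c").
  val tables: Seq[Table] = Seq(
    Table("a", "a", Seq(Column("b", Set("c")))),
    Table("a", "b", Seq(Column("c")))
  )
}
```

Calling `NameResolutionSketch.interpretations(Seq("a", "b", "c"), NameResolutionSketch.tables)` yields two readings. Taking the first one corresponds to the behavior described for this patch (column c of table b), while failing whenever more than one reading exists would correspond to the ambiguity-exception alternative.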

rxin · 2016-01-14T06:18:27Z

What do we do if the user specifies "a.b" right now? I'd say we should follow that, and make sure we have a test case for it.

zhichao-li · 2016-01-14T06:23:08Z

It would return attribute b in table a, resolved by the existing table.col logic.

SparkQA · 2016-01-14T07:10:36Z

Test build #49381 has finished for PR 10753 at commit 1cafed7.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Subquery(alias: String, child: LogicalPlan, databaseName: Option[String] = None)

chenghao-intel · 2016-02-17T06:24:36Z

I would say we'd better keep the same checking logic as MySQL/Hive for the ambiguous case. @zhichao-li, can you please check that against MySQL/Hive?

zhichao-li · 2016-02-25T03:12:28Z

@chenghao-intel, MySQL (5.5) doesn't support nested data types for now, and Hive (1.2.1) doesn't support "db.table.field" in the projection list at the moment. I guess most users work around this by using aliases.

SparkQA · 2016-02-25T06:34:02Z

Test build #51930 has finished for PR 10753 at commit 4155ffe.

This patch fails Spark unit tests.
This patch does not merge cleanly.
This patch adds no public classes.

SparkQA · 2016-02-26T09:06:24Z

Test build #52031 has finished for PR 10753 at commit a3c4f20.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

rxin · 2016-06-15T22:27:38Z

Thanks for the pull request. I'm going through a list of pull requests to cut them down since the sheer number is breaking some of the tooling we have. Due to lack of activity on this pull request, I'm going to push a commit to close it. Feel free to reopen it or create a new one. We can also continue the discussion on the JIRA ticket.

zhichao-li changed the title ~~[SPARK-12820]Resolve db.table.column~~ [SPARK-12820][SQL]Resolve db.table.column Jan 14, 2016

zhichao-li force-pushed the dbname branch from 1ca4369 to 4155ffe Compare February 25, 2016 04:07

zhichao-li added 2 commits February 26, 2016 15:24

resolve db.table.column

989a8ec

add unit test

a3c4f20

zhichao-li force-pushed the dbname branch from 4155ffe to a3c4f20 Compare February 26, 2016 07:34

asfgit closed this in 1a33f2e Jun 15, 2016
