Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-12506][SPARK-12126][SQL]use CatalystScan for JDBCRelation #11005

Closed
wants to merge 1 commit into from

Conversation

huaxingao
Copy link
Contributor

As suggested here, I will change JDBCRelation to implement CatalystScan, and then directly access Catalyst expressions in JDBCRDD.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@HyukjinKwon
Copy link
Member

cc @liancheng

=== "((NOT (col0 != 'abc' OR col0 IS NULL OR 'abc' IS NULL) "
+ "OR (col0 IS NULL AND 'abc' IS NULL))) AND (col1 = 'def')")
}
// test("compile filters") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks you forgot to uncomment those..

@maropu
Copy link
Member

maropu commented Feb 2, 2016

Great work though, I think that we should consider lots of things to refactor these kinds of datasource push-down codes because related tickets (SPARK-12449, 12686, ...) describes non-filter logical operators push downs. Also, this pr could break binary compatibility...

@HyukjinKwon
Copy link
Member

As @maropu said, for me I also agree with him but some may think differently as mentioned here.

@viirya
Copy link
Member

viirya commented Feb 3, 2016

Is PrunedFilteredScan and CatalystScan conflicting? If not (looks they are not), can we keep original JDBCRelation and let it implement CatalystScan too? In order not to duplicate current codes, I am wondering if SQLBuilder can help converting filtering expressions to SQL string.

@maropu
Copy link
Member

maropu commented Feb 3, 2016

I agree that CatalystScan is used to support arithmetic operations in datasources though, this current CatalystScan trait only processes filter expressions. If we make this kind of interface changes, it'd be better to consider other requirements discussed in jira; otherwise, we get stuck with the same issue every time we add new features in datasources.

Anyway, the idea to share codes in SQLBuilder and JDBCRDD is good to me.

@rxin
Copy link
Contributor

rxin commented Jun 15, 2016

Thanks for the pull request. I'm going through a list of pull requests to cut them down since the sheer number is breaking some of the tooling we have. Due to lack of activity on this pull request, I'm going to push a commit to close it. Feel free to reopen it or create a new one. We can also continue the discussion on the JIRA ticket.

@asfgit asfgit closed this in 1a33f2e Jun 15, 2016
@kyprifog
Copy link

@rxin Why was this pull request closed? Can you direct me to a new one that could have replaced it? I was trying to track down the modification of push downs for basic operations like "limit".

@HyukjinKwon
Copy link
Member

Technical reason: It's kind of risky to rely on CatalystScan and completely replace the interface. I think I already see some tests were disabled here. Also, there look potential better suggestions above.

Practical reason: there are too many pending PRs as you see. If the author is not responsive and the PR is inactive to review comments, we better leave them closed for now - seems it's already stuck in few technical reasons. The author is welcome to reopen and other contributors are welcome to take over.

@HyukjinKwon
Copy link
Member

BTW, datasource v2 is in progress too to allow more push downs (see SPARK-22386). You might want to take a look

@kyprifog
Copy link

@HyukjinKwon Thanks this is what I was looking for. Glad the work is still being continued.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants