[FEA] Support "Fail on error" for parse_url #11193

Open
mythrocks opened this issue Jul 15, 2024 · 0 comments
Labels
feature request New feature or request Spark 4.0+ Spark 4.0+ issues

Comments

@mythrocks
Collaborator

mythrocks commented Jul 15, 2024

#6969 added support for parse_url, but it did not include support for fail on error, which applies when ANSI mode is enabled.

When fail on error is enabled on Apache Spark (i.e. in ANSI mode), the query should fail on parse errors instead of returning nulls for the affected rows. When parse_url is attempted on spark-rapids in this mode, the expression falls back off the GPU:

  @Expression <Alias> parse_url(a#7, QUERY, a, true) AS parse_url(a, QUERY, a)#10 could run on GPU
    !Expression <ParseUrl> parse_url(a#7, QUERY, a, true) cannot run on GPU because Fail on error is not supported on GPU when parsing urls.

This behaviour can be seen when the url_test.py::test_parse_url_query_with_key integration test is run in ANSI mode.

It would be good for Spark RAPIDS to support fail on error for parse_url when ANSI mode is enabled.
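
For illustration, the difference between the two modes can be sketched in plain Python. This is only a rough model of the requested semantics, not Spark or spark-rapids code: `parse_url_query` and its `fail_on_error` parameter are hypothetical names, and Python's `urlsplit` is more lenient than the URI parsing Spark uses, so the set of URLs treated as invalid here is an assumption.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def parse_url_query(url: str, key: str, fail_on_error: bool) -> Optional[str]:
    """Return the first value of `key` in the URL's query string.

    Hypothetical helper for illustration only; not the Spark implementation.
    """
    try:
        # urlsplit raises ValueError for some malformed URLs,
        # e.g. an unterminated IPv6 host such as "http://[::1/path".
        parts = urlsplit(url)
    except ValueError:
        if fail_on_error:
            # ANSI-style behaviour: surface the parse error to the caller.
            raise
        # Non-ANSI behaviour: produce a null for this row instead.
        return None
    values = parse_qs(parts.query).get(key)
    return values[0] if values else None
```

With `fail_on_error=False`, a malformed URL yields `None`, which corresponds to the null result that the GPU path supports today. With `fail_on_error=True`, the same input raises, which corresponds to the behaviour requested here.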

(Note: This error was observed during the investigation into #11017. The failing condition will be disabled for now.)

@mythrocks mythrocks added feature request New feature or request ? - Needs Triage Need team to review and classify labels Jul 15, 2024
mythrocks added a commit to mythrocks/spark-rapids that referenced this issue Jul 15, 2024
Fixes NVIDIA#11017.

This commit fixes the tests in url_test.py, so that they don't fail
when ANSI mode is enabled.

All the errant tests fail because `parse_url()` does not currently
support "fail on error" in spark-rapids.
See NVIDIA#11193.

The tests have been modified to explicitly run with ANSI mode disabled.
These tests can be enabled to run in ANSI mode after NVIDIA#11193 has been
addressed.

Signed-off-by: MithunR <[email protected]>
@mattahrens mattahrens added Spark 4.0+ Spark 4.0+ issues and removed ? - Needs Triage Need team to review and classify labels Jul 16, 2024
@sameerz sameerz mentioned this issue Jul 18, 2024
49 tasks