Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++][Python] Exception ignored in: 'pyarrow._substrait._create_named_table_provider' #37235

Open
danepitkin opened this issue Aug 17, 2023 · 1 comment

Comments

@danepitkin
Copy link
Member

danepitkin commented Aug 17, 2023

Describe the bug, including details regarding any error messages, version, and platform.

Refactor Substrait.run_query() in C++/PyArrow so that it does not suppress user-defined Exceptions.

In PyArrow, Substrait can take in a table_provider object, which can map a table name in the Substrait plan to a table object in memory. This table_provider is user-defined python code, which can throw an exception. The underlying C++ implementation currently suppresses user-defined expressions and will instead report a more generic error[1]:

        ARROW_ASSIGN_OR_RAISE(acero::Declaration source_decl,
                              named_table_provider(table_names, *base_schema));

        if (!source_decl.IsValid()) {
          return Status::Invalid("Invalid NamedTable Source");
        }

When upgrading to Cython 3, the user-defined exception is now raised by default because cdef functions that are not extern now safely propagate Python exceptions by default. To mask raising this exception, we have to explicitly add noexcept to this Cython function[2]:

cdef CDeclaration _create_named_table_provider(
    dict named_args, const std_vector[c_string]& names, const CSchema& schema
) noexcept:

This is the pytest output seen when suppressing the python user-defined Exception:

pyarrow/tests/test_substrait.py::test_named_table_invalid_table_name
  /opt/homebrew/Caskroom/mambaforge/base/envs/arrow-dev/lib/python3.11/site-packages/_pytest/unraisableexception.py:78: PytestUnraisableExceptionWarning: Exception ignored in: 'pyarrow._substrait._create_named_table_provider'

  Traceback (most recent call last):
    File "/Users/dane/code/arrow/python/pyarrow/tests/test_substrait.py", line 250, in table_provider
      raise Exception("Unrecognized table name")
  Exception: Unrecognized table name

    warnings.warn(pytest.PytestUnraisableExceptionWarning(msg))

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

[1] https://github.com/apache/arrow/blob/main/cpp/src/arrow/engine/substrait/relation_internal.cc#L422-L427
[2] https://github.com/apache/arrow/blob/main/python/pyarrow/_substrait.pyx#L29

Component(s)

Python

@pitrou
Copy link
Member

pitrou commented Aug 22, 2023

To mask raising this exception, we have to explicitly add noexcept to this Cython function[2]:

The proper solution would be for the Cython function to return CResult[CDeclaration]. This might need a bit of plumbing on the C++ side. Do you think you can give it a try?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants