Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logical Expr serde should be stack safe #5693

Open
thinkharderdev opened this issue Mar 22, 2023 · 3 comments
Open

Logical Expr serde should be stack safe #5693

thinkharderdev opened this issue Mar 22, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@thinkharderdev
Copy link
Contributor

Is your feature request related to a problem or challenge?

Noticed while working on #5691 I was getting stack overflows while running tests (well actually a segfault but after some investigation I think it was just a quirk of my ARM mac and really the underlying issue was a stack overflow). I was able to run the tests by compiling them in release mode but the current recursive expression parsing/translation is not stack safe and will likely eventually cause issues.

Otherwise deeply nested expressions can cause a system crash.

Describe the solution you'd like

Avoid recursion in datafusion_proto::logical_plan::from_proto::parse_expr

Describe alternatives you've considered

Leave it as is and document as a known limitation

Additional context

No response

@thinkharderdev thinkharderdev added the enhancement New feature or request label Mar 22, 2023
@comphead
Copy link
Contributor

thats cool @thinkharderdev, is it possible to get a valid reproduce scenario?

@jonahgao
Copy link
Member

jonahgao commented Jun 29, 2023

Hi, @thinkharderdev
I might have encountered the same problem.

Could you please try to increase the stack size here: https://github.com/apache/arrow-datafusion/blob/06e22a5386a65c0e6a29943cd03ab5e442732745/datafusion/proto/src/bytes/mod.rs#L458

I increased it to 12_000_000, and the problem disappeared.

@alamb
Copy link
Contributor

alamb commented Jun 30, 2023

I think it changing to use a worklist style algorithm (store state on the heap rather than the stack with recursive function calls) would be neat, though it may be a big change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants