COUNT(DISTINCT) on StringView panics: unreachable code: Utf8/Binary should use ArrowBytesSet #11767

Closed
Tracked by #11752
alamb opened this issue Aug 1, 2024 · 1 comment · Fixed by #11768

alamb commented Aug 1, 2024

Describe the bug
I found that one of the ClickBench extended queries panics when using StringView (see #11723).

To Reproduce

> create table foo (id int, x varchar, y varchar) as values (1, 'foo', 'bar'), (2, 'foo', 'baz');
0 row(s) fetched.
Elapsed 0.004 seconds.

> create view foov as select id, arrow_cast(x, 'Utf8View') as x, arrow_cast(y, 'Utf8View') as y from foo;
0 row(s) fetched.
Elapsed 0.002 seconds.

> select count(distinct x), count(distinct y) from foov group by id;
thread 'tokio-runtime-worker' panicked at /Users/andrewlamb/Software/datafusion2/datafusion/physical-expr-common/src/binary_view_map.rs:220:18:
internal error: entered unreachable code: Utf8/Binary should use `ArrowBytesSet`
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
External error: Join Error
caused by
External error: task 65 panicked

Expected behavior
The query should run without panicking.

Additional context
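
The panic message reads as if a view-specific code path was reached with a non-view type tag (Utf8/Binary rather than Utf8View/BinaryView). That is a guess from the message text alone, not a confirmed diagnosis. Below is a minimal standalone sketch of that failure shape. All names in it (`OutputKind`, `ViewSet`, `count_distinct`) are hypothetical and are not DataFusion APIs; only the message string is copied from the panic output above.

```rust
use std::collections::HashSet;

/// Hypothetical tag for the string layout a distinct-value set should emit.
#[derive(Debug, Clone, Copy, PartialEq)]
enum OutputKind {
    Utf8,
    Utf8View,
}

/// Hypothetical distinct-value set that only supports the view layout.
struct ViewSet {
    kind: OutputKind,
    values: HashSet<String>,
}

impl ViewSet {
    fn new(kind: OutputKind) -> Self {
        Self { kind, values: HashSet::new() }
    }

    fn insert(&mut self, v: &str) {
        self.values.insert(v.to_string());
    }

    /// Returns the number of distinct values.
    /// Panics if the set was built with a tag it does not support.
    fn distinct_count(&self) -> usize {
        match self.kind {
            OutputKind::Utf8View => self.values.len(),
            // Only the message text comes from the report; the dispatch
            // itself is an assumption made for illustration.
            OutputKind::Utf8 => unreachable!("Utf8/Binary should use `ArrowBytesSet`"),
        }
    }
}

/// Hypothetical helper: count distinct values using the given tag.
fn count_distinct(kind: OutputKind, values: &[&str]) -> usize {
    let mut set = ViewSet::new(kind);
    for v in values {
        set.insert(v);
    }
    set.distinct_count()
}

fn main() {
    // Correct tag: the duplicate "foo" collapses to one distinct value.
    println!("{}", count_distinct(OutputKind::Utf8View, &["foo", "foo"]));
    // Mismatched tag: this call hits the `unreachable!` arm and panics.
    println!("{}", count_distinct(OutputKind::Utf8, &["foo"]));
}
```

If the real cause has this shape, the fix would likely be on the caller side (passing the view tag for Utf8View input) rather than in the set itself.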

@alamb alamb added the bug Something isn't working label Aug 1, 2024
@alamb alamb transferred this issue from apache/arrow-rs Aug 1, 2024
@XiangpengHao

take
