uuid and random need return different value in different row #10247

Closed
liukun4515 opened this issue Apr 26, 2024 · 3 comments · Fixed by #10248 or #10193
Labels
good first issue Good for newcomers physical-expr Physical Expressions

Comments

@liukun4515
Contributor

I think the expectation here is that `rand` is invoked *once per row* rather than *once per batch*. And the only way it knew how many rows to make was to get a null array in 🤔

For example, when I run datafusion-cli from this PR and call random(), the same value is returned for each row:

> create table foo as values (1), (2), (3), (4), (5);
0 row(s) fetched.
Elapsed 0.018 seconds.

> select column1, random() from foo;
+---------+--------------------+
| column1 | random()           |
+---------+--------------------+
| 1       | 0.9594375709000513 |
| 2       | 0.9594375709000513 |
| 3       | 0.9594375709000513 |
| 4       | 0.9594375709000513 |
| 5       | 0.9594375709000513 |
+---------+--------------------+
5 row(s) fetched.
Elapsed 0.012 seconds.

But I expect each row to have a different value for random().

However, since none of the tests failed, clearly we have a gap in test coverage 🤔

Originally posted by @alamb in #10193 (comment)
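
To make the once-per-row versus once-per-batch distinction concrete, here is a toy sketch in Python. It is not DataFusion's code: `eval_once_per_batch` and `eval_once_per_row` are hypothetical helpers that only model the two behaviours, with the row count standing in for the information the function has to obtain from its input.

```python
import random
import uuid


def eval_once_per_batch(num_rows, fn):
    """Model of the unwanted behaviour: call fn once, repeat the result."""
    value = fn()
    return [value] * num_rows


def eval_once_per_row(num_rows, fn):
    """Model of the expected behaviour: call fn separately for every row."""
    return [fn() for _ in range(num_rows)]


num_rows = 5  # same size as the `foo` table in the example above

per_batch = eval_once_per_batch(num_rows, random.random)
per_row = eval_once_per_row(num_rows, random.random)
per_row_uuid = eval_once_per_row(num_rows, uuid.uuid4)

print(len(set(per_batch)))     # one distinct value for the whole batch
print(len(set(per_row)))       # distinct values, one per row
print(len(set(per_row_uuid)))  # distinct UUIDs, one per row
```

In both models the function needs the number of rows; the difference is only whether the generator is called once or once for each row.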

@jayzhan211
Contributor

@liukun4515 This should be solved in #10193

@alamb
Contributor

alamb commented Apr 26, 2024

To be clear, I think the correct thing already happens on main.

The example from #10193 (comment) was produced with intermediate changes when I reran on #10193.

Here is what happens on main (the correct thing):

DataFusion CLI v37.1.0
> create table t as values (1), (2);
0 row(s) fetched.
Elapsed 0.032 seconds.

> select random() from t;
+---------------------+
| random()            |
+---------------------+
| 0.02024777131575939 |
| 0.9330727106990677  |
+---------------------+
2 row(s) fetched.
Elapsed 0.012 seconds.

> select uuid() from t;
+--------------------------------------+
| uuid()                               |
+--------------------------------------+
| 630d1d50-1ed2-4d3c-bb04-89d338e3e59f |
| 594e03fb-b038-4a48-a6e6-e2f8f12746c1 |
+--------------------------------------+
2 row(s) fetched.
Elapsed 0.003 seconds.

@alamb
Contributor

alamb commented Apr 26, 2024

Here is a PR that adds a test for this case: #10248
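
As a rough illustration of the property such a test needs to check (the actual contents of #10248 are not shown in this thread), the sketch below applies a hypothetical `has_distinct_rows` helper to the outputs quoted above. The values are copied from the two datafusion-cli sessions; the helper name is an assumption made for this example.

```python
def has_distinct_rows(column):
    """Return True when no value in the column is repeated."""
    return len(set(column)) == len(column)


# Output from the intermediate changes on #10193: one value repeated 5 times.
intermediate_random = [0.9594375709000513] * 5

# Output from main (DataFusion CLI v37.1.0), as quoted in the thread.
main_random = [0.02024777131575939, 0.9330727106990677]
main_uuid = [
    "630d1d50-1ed2-4d3c-bb04-89d338e3e59f",
    "594e03fb-b038-4a48-a6e6-e2f8f12746c1",
]

print(has_distinct_rows(intermediate_random))  # False: the regression
print(has_distinct_rows(main_random))          # True
print(has_distinct_rows(main_uuid))            # True
```

A check of this shape would have failed on the intermediate output, which is the coverage gap noted earlier in the thread.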
