Support outer Unnest/Explode in BigQuery dialect #2941
Comments
can you give an actual example of BigQuery input and output and why it's wrong? we don't do left join because presto doesn't support left join unnest |
i tested this and it seems like the only case that isn't working as expected is explode_outer(array()) |
Thanks for the quick fix. |
I am also trying to implement it by creating a custom |
yea, it's easy to do this as a left join unnest, but i chose not to do this because presto can't handle it and sqlglot handles many dialects not just one. the complexity of the sql is not a goal for sqlglot, simply accuracy. |
do you have a solution for this? |
I couldn't find a way to correct the SQL with CROSS JOIN, but I sent a draft PR that defines a custom function for BQ. The draft PR currently fails some unit tests. If that approach works for you, I will fix the unit tests tomorrow for your official review. |
SELECT
  IF(pos = pos_2, col, NULL) AS col
FROM x
CROSS JOIN UNNEST(GENERATE_ARRAY(
  0,
  GREATEST(ARRAY_LENGTH(IF(ARRAY_LENGTH(COALESCE(a, [])) = 0, [a[SAFE_ORDINAL(0)]], a))) - 1
)) AS pos
CROSS JOIN UNNEST(IF(ARRAY_LENGTH(COALESCE(a, [])) = 0, [a[SAFE_ORDINAL(0)]], a)) AS col WITH OFFSET AS pos_2
WHERE
  pos = pos_2
  OR (
    pos > (
      ARRAY_LENGTH(IF(ARRAY_LENGTH(COALESCE(a, [])) = 0, [a[SAFE_ORDINAL(0)]], a)) - 1
    )
    AND pos_2 = (
      ARRAY_LENGTH(IF(ARRAY_LENGTH(COALESCE(a, [])) = 0, [a[SAFE_ORDINAL(0)]], a)) - 1
    )
  )
|
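For readers following along, here is a rough Python model of what the query above appears to do for a single array column `a`. It is only a sketch of my reading of the SQL, not sqlglot's implementation: the function name `explode_outer_model` is made up for illustration, and it assumes that `a[SAFE_ORDINAL(0)]` evaluates to NULL (ordinals are 1-based, so 0 is out of range), which turns an empty or NULL array into the one-element array `[NULL]`.

```python
def explode_outer_model(arrays):
    """Hypothetical model of the emitted BigQuery SQL; one input array per row of x."""
    out = []
    for a in arrays:
        # IF(ARRAY_LENGTH(COALESCE(a, [])) = 0, [a[SAFE_ORDINAL(0)]], a)
        # Assumption: the out-of-range lookup is NULL, so empty/NULL becomes [NULL].
        padded = [None] if not a else a
        last = len(padded) - 1
        # CROSS JOIN UNNEST(GENERATE_ARRAY(0, <length> - 1)) AS pos
        for pos in range(len(padded)):
            # CROSS JOIN UNNEST(padded) AS col WITH OFFSET AS pos_2
            for pos_2, col in enumerate(padded):
                # WHERE pos = pos_2 OR (pos > last AND pos_2 = last)
                if pos == pos_2 or (pos > last and pos_2 == last):
                    # SELECT IF(pos = pos_2, col, NULL) AS col
                    out.append(col if pos == pos_2 else None)
    return out


result = explode_outer_model([[1, 2], [], None, [3]])
```

Because the padded array always has at least one element, the CROSS JOIN keeps one row (with a NULL `col`) for empty and NULL inputs. With only one array, the second branch of the WHERE clause never fires; presumably it matters only when several arrays of different lengths are exploded together.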
Great to see that it works! Thanks for the quick fix! |
Is your feature request related to a problem? Please describe.
Cannot preserve empty arrays: SQLGlot's current explode_to_unnest transformation for BigQuery cannot keep rows with NULL/empty arrays or honor EXPLODE_OUTER semantics as expected, leading to potential data loss.
Suboptimal output: Even when EXPLODE is translated successfully, the generated BigQuery SQL is verbose and could be simplified.
Here is the sample API and outputs:
IIUC, for BigQuery, the above SQL is equivalent to:
Flattening arrays with a CROSS JOIN excludes rows that have an empty or NULL array. BigQuery supports LEFT JOIN to include these rows. However, sqlglot does not behave equivalently here:
Describe the solution you'd like
Create a custom explode_to_unnest function dedicated to BigQuery. This allows for:
EXPLODE_OUTER support: Introduce logic to accurately translate EXPLODE_OUTER into BigQuery SQL using LEFT JOIN.
Describe alternatives you've considered
Modifying the core explode_to_unnest might introduce incompatibilities with other database dialects that are currently handled correctly.
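To make the data-loss concern concrete, here is a small Python sketch of the two join behaviors discussed in this issue. It only models the row-level semantics (CROSS JOIN UNNEST drops rows whose array is empty or NULL, LEFT JOIN UNNEST keeps them with a NULL element); the function names and the `(row_id, array)` input shape are made up for illustration and are not part of sqlglot or BigQuery.

```python
def cross_join_unnest(rows):
    """Model of FROM t CROSS JOIN UNNEST(t.arr): empty/NULL arrays yield no rows."""
    return [(row_id, value) for row_id, arr in rows for value in (arr or [])]


def left_join_unnest(rows):
    """Model of FROM t LEFT JOIN UNNEST(t.arr): empty/NULL arrays yield one NULL row."""
    out = []
    for row_id, arr in rows:
        if arr:
            out.extend((row_id, value) for value in arr)
        else:
            out.append((row_id, None))
    return out


rows = [(1, [10, 20]), (2, []), (3, None)]
cross = cross_join_unnest(rows)  # rows 2 and 3 disappear
left = left_join_unnest(rows)    # rows 2 and 3 survive with a NULL value
```

The second function is the behavior EXPLODE_OUTER is expected to have, which is why a plain CROSS JOIN translation loses rows.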