
Prune projected nested columns through UNNEST #3925

Open
JamesRTaylor opened this issue Jun 4, 2020 · 11 comments

@JamesRTaylor

When CROSS JOIN UNNEST is used to access some columns within an array of rows, the entire row is projected as opposed to only the subset of columns based on the references. For example:

create table nest_test(id int, a array(row(x varchar, y varchar))) WITH (format='PARQUET')
select x from nest_test cross join unnest(a)

Only a.x should be projected, but currently all columns of a are projected. For very wide rows, this can be expensive.
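The semantics at stake can be sketched in a toy Python model (not Trino code; dicts stand in for ROWs, following the example table's column names). The query above only ever touches x, yet the engine materializes every field of each array element:

```python
# Toy model (not Trino internals) of: select x from nest_test cross join unnest(a)

def unnest_select_x(rows):
    """For each row, unnest the array 'a' of (x, y) structs and project only x."""
    out = []
    for row in rows:
        for element in row["a"]:      # CROSS JOIN UNNEST(a)
            out.append(element["x"])  # only x is referenced; y is dead weight
    return out

rows = [{"id": 1, "a": [{"x": "x1", "y": "wide-unused-1"},
                        {"x": "x2", "y": "wide-unused-2"}]}]

print(unnest_select_x(rows))  # prints ['x1', 'x2']
```

In an ideal plan, the y values (and any other unreferenced fields) would never be read off disk at all.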

@phd3
Member

phd3 commented Jun 5, 2020

@JamesRTaylor I think there are multiple steps to this one:

  1. Prune unnest fields in the UNNEST node. This is similar to pruning symbols for single-level nesting. For multi-level nesting, this is more like pushing down dereferences. For example, convert

	Project ( p1 := f(A.B.C), p2 := m(K) )
		unnest(arr -> {A, E, K})
			S(arr)

to

	Project ( p1 := f(A_B_C), p2 := m(K) )
		unnest(arr' -> {A_B_C, K})
			project(arr' -> transform(arr, el -> row(el.A.B.C, el.K)))
				S(arr)

Here E is completely pruned, and A.B.C is extracted from A.
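The rewrite above can be sketched in a toy Python model (not Trino code; field names A, B, C, E, K follow the plan sketch, and dicts stand in for ROWs). The key invariant is that the pruned array has the same length as the original, so the downstream UNNEST produces the same number of rows, while E disappears and A.B.C is flattened into A_B_C:

```python
# Toy model of: transform(arr, el -> row(el.A.B.C, el.K))

def prune(arr):
    # Keep only the referenced fields of each element; drop E entirely.
    return [{"A_B_C": el["A"]["B"]["C"], "K": el["K"]} for el in arr]

arr = [
    {"A": {"B": {"C": 1}}, "E": "pruned away", "K": 10},
    {"A": {"B": {"C": 2}}, "E": "pruned away", "K": 20},
]

pruned = prune(arr)
assert len(pruned) == len(arr)             # same cardinality after UNNEST
assert pruned[0] == {"A_B_C": 1, "K": 10}  # f(A_B_C) and m(K) see the same values
```

Because p1 and p2 read only A_B_C and K, they compute identical results on the pruned elements.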

  2. Push this array transformation as far down the plan as possible so that it can reach TableScan. We might be able to represent it through a new subclass of LambdaExpression to simplify things.

  3. Add support for generic function calls in ConnectorExpression. (If this is too involved, maybe we can start with creating a connector expression for the new subclass mentioned above.) Then the ORC/Parquet readers would be able to prune the fields from the array rows.

I feel 3) won't be impactful enough for all query shapes without 1) and 2). I have a WIP patch for 1), but it needs some more work. My hunch is that that patch, if checked in alone, may degrade performance, since the UNNEST operator already avoids copying data as much as possible.

@findepi findepi added the enhancement New feature or request label Jun 5, 2020
@phd3
Member

phd3 commented Jun 5, 2020

@martint what are your thoughts on this?

@JamesRTaylor
Author

Thanks for the write-up, @phd3. I was thinking of (1) and (2), plus using a ConnectorExpression that would contain pruned type definitions. I'd guess that the main improvement would come from the Parquet/ORC reader only reading A_B_C instead of all of A (very similar to the recent improvements for other nested-data situations). Would it work to use the Variable class with a type that includes only the referenced subfields?

I'd love to take a look at your patch. If it's far enough along, I could benchmark it on some production queries on our end so we get an idea of the potential impact.

@martint
Member

martint commented Jun 6, 2020

The current plan to support this is the transformation you described in (1), plus the ability to push down functions (e.g., transform) and lambda expressions to the connector. Before we can do that, we need to land some of the refactorings we've been making to the way functions are represented in the plan and add some APIs to be able to communicate and obtain information about them between the connector and the engine.

@phd3
Member

phd3 commented Jun 9, 2020

@JamesRTaylor Sure. I need to understand the case of NULLs in unnest, i.e., make sure that the number of rows output by unnest(a) is exactly the same after pruning a. I'll create a WIP PR by next week.
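The NULL concern can be made concrete with a toy Python model (not Trino code; None stands in for SQL NULL). The pruning transform must map a NULL array element to NULL rather than dropping it or failing on the dereference, so that unnest over the pruned array emits exactly as many rows as unnest over the original:

```python
# Toy model of transform-based pruning that preserves NULL elements.

def prune(arr):
    return [None if el is None
            else {"A_B_C": el["A"]["B"]["C"], "K": el["K"]}
            for el in arr]

arr = [{"A": {"B": {"C": 1}}, "E": "e", "K": 10}, None]

pruned = prune(arr)
assert len(pruned) == len(arr)  # row count preserved, NULLs included
assert pruned[1] is None        # NULL element survives as NULL
```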

@gjhkael

gjhkael commented Sep 13, 2021

Any update?

@mathfool

Any update on this one?

@mixermt

mixermt commented Apr 2, 2024

It's 2024, and even the most advanced query execution engines still can't read nested data efficiently 🤷‍♂️

@Desmeister

Quick update. I had a stable build, but I was continuously refactoring on top of Martin's recent AST/IR changes, so I opted to wait for those to be completed. I'm not expecting large changes after the latest major PR, so I'm fixing up the new classes and unit tests based on the new IR-friendly syntax.

@Desmeister Desmeister reopened this Apr 2, 2024
@rotem-ad
Member

rotem-ad commented Apr 8, 2024

@martint @Desmeister This feature can be a serious game changer for our main use-case.
We have a very large Iceberg table (~40TB of compressed ingested data per day).
The table's schema includes many structured columns, including multi-level nesting (e.g. ARRAY(ROW(col_a, col_b, .., ARRAY(ROW(col_1, col_2, col_3, ...))))).
Some of the ROW structs include hundreds of nested columns, and in total there are a few thousand nested columns in the table.

When running simple analytical queries with UNNEST operations against this table, referencing only a tiny subset of the nested columns, Trino simply can't handle it: queries always fail with a Query exceeded per-node memory limit.. exception.
Enabling fault-tolerant execution for those queries doesn't help either; they still fail after a while.

The only workaround we've come up with so far is 'tricking' Trino into thinking the table has only the subset of columns required by the query. We do this by creating a Hive external table with the modified schema on top of the Iceberg table's data folder.
Running the same queries against the external table is super fast! However, this workaround has too many caveats and obviously cannot be used in production.

Do you happen to know if this feature is planned to be merged any time soon?

@bentzimortaboola

@Desmeister Can you provide an ETA for the PR?
