Roadmap of being a "gateway" for other data sources #8386
Comments
If the parser finds a function, it is by definition a Trino function. Where possible, we can then try to push the execution of that function down to the remote data source. Today connectors support this for aggregation functions only.
Hi @kokosing, thanks for the reply.
I know that is the current situation, but from #8140 (comment) I think there is a future plan to support data source-specific functions? If so, how about adding function namespaces to make the parser's job easier?
I know that is the current situation, but from #7994 I think there is an effort to push other functions down?
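The namespace idea raised in this comment can be illustrated with a minimal sketch. It is not how trino-parser works; `KNOWN_CATALOGS`, `FunctionRef`, and `resolve_function` are hypothetical names invented for illustration, and the sketch only assumes that a function name may carry at most one catalog prefix.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical catalog names; a real engine would look these up at analysis time.
KNOWN_CATALOGS = {"druid", "mysql"}


@dataclass(frozen=True)
class FunctionRef:
    catalog: Optional[str]  # None means "engine built-in function"
    name: str


def resolve_function(qualified_name: str) -> FunctionRef:
    """Split an optionally namespaced function name into (catalog, name)."""
    parts = qualified_name.split(".")
    if len(parts) == 1:
        # No prefix: treat the function as belonging to the engine.
        return FunctionRef(None, parts[0])
    if len(parts) == 2 and parts[0] in KNOWN_CATALOGS:
        # Prefix names a known data source: keep the call for that source.
        return FunctionRef(parts[0], parts[1])
    raise ValueError(f"unknown function namespace: {qualified_name}")
```

With this shape, `resolve_function("druid.func1")` is attributed to the `druid` catalog, while an unprefixed name such as `sum` stays with the engine.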
Not exactly. It depends on the connector: if a Trino function has an equivalent function in the remote data source, we can try to push the function's execution down there, but that does not mean data source-specific functions will be available in Trino SQL.
A function is just one form of expression. We do not push down expressions as such; we push down operations such as filter, join, projection, aggregation, limit, and sort. Operations contain expressions, but not every expression form is supported everywhere. Currently, functions are supported in aggregation pushdown only. I am not aware of any other active work on pushing down functions in projections or filters. @hashhar @wendigo?
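To make the distinction concrete, here is a rough sketch of a connector deciding whether it can take over an operation that contains a function call. It is not Trino's SPI; `REMOTE_AGGREGATES` and `try_push_down` are hypothetical, and the mapping is only an assumed example of Trino functions that have a remote equivalent.

```python
from typing import Optional

# Hypothetical mapping from engine aggregation names to remote equivalents.
REMOTE_AGGREGATES = {"count": "COUNT", "sum": "SUM", "min": "MIN", "max": "MAX"}


def try_push_down(operation: str, function_name: str, column: str) -> Optional[str]:
    """Return the remote SQL fragment, or None if the engine must evaluate it."""
    if operation != "aggregation":
        # Mirrors the comment above: functions are pushed down only
        # as part of an aggregation, not in filters or projections.
        return None
    remote = REMOTE_AGGREGATES.get(function_name.lower())
    if remote is None:
        # No equivalent in the remote source, so the engine keeps the work.
        return None
    return f"{remote}({column})"
```

Under these assumptions, `try_push_down("aggregation", "sum", "price")` yields a remote fragment, while the same function inside a `"filter"` operation is left to the engine.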
cc @martint
Yes, that's the goal. Namespaces are not strictly needed, but the work done leading to it is needed to be able to do the handshake between the engine and connectors when reasoning about functions. Overall, this is the approach we're shooting for:
Hi @martint, we are interested in the "connectors can expose functions" work. Could you share more material with us if you have any? Thanks!
First of all, thanks to the community for all the hard work. Trino is known as "SQL on Everything", but I think some work remains before it can serve as a general query engine or "gateway" for other data sources. Since I could not find an overall plan for this, I am creating this issue as a roadmap to track progress, and all ideas are welcome.
In my opinion, the important features include:
- trino-parser can tell whether a function it encounters belongs to Trino or to a data source. The benefit is that we can use data source-specific functions that Trino does not support. I am not sure whether "Function and type namespaces #8" covers this; if not, a namespace idea could be considered: when we write `druid.func1(c1)`, Trino knows the function belongs to the Druid data source, can stop parsing and keep the expression as-is, and later push it down to Druid.
- Pushdown: "Allow connectors to participate in query optimization #18" covers general pushdown, and "ConnectorExpression pushdown #7994" is specific to function pushdown.
- Dynamic filtering: with dynamic filtering enabled, many full scans can be avoided. I have a PR for JDBC dynamic filtering, "Enable dynamic filtering in JDBC connector #8137" (I hope the community can give more feedback).
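As a rough illustration of why dynamic filtering avoids full scans, the sketch below turns join keys collected from the build side into an extra predicate on the remote query for the probe side. It is not the implementation in #8137; `build_dynamic_filter`, `probe_query`, and the `max_values` threshold are hypothetical, and keys are assumed to be integers to keep quoting out of the picture.

```python
from typing import Iterable, Optional


def build_dynamic_filter(column: str, build_side_keys: Iterable[int],
                         max_values: int = 1000) -> Optional[str]:
    """Turn the join keys seen on the build side into an IN predicate."""
    distinct = sorted(set(build_side_keys))
    if not distinct or len(distinct) > max_values:
        # No keys, or too many to be worth sending: fall back to no filter.
        return None
    values = ", ".join(str(int(v)) for v in distinct)
    return f"{column} IN ({values})"


def probe_query(table: str, column: str, build_side_keys: Iterable[int]) -> str:
    """Build the remote query for the probe side, narrowed when possible."""
    base = f"SELECT * FROM {table}"
    predicate = build_dynamic_filter(column, build_side_keys)
    return base if predicate is None else f"{base} WHERE {predicate}"
```

For example, `probe_query("orders", "customer_id", [3, 1, 3])` asks the remote source only for rows whose `customer_id` is 1 or 3, instead of scanning the whole table.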
Forgive me if I have used the wrong terms, misunderstood something, or missed some discussions; corrections are welcome!