-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Enh]: Add support for database tables, views and queries #1289
Comments
hi @MarcSkovMadsen - thanks for the request, and good to hear from you!
What operations would we need to support? Just selecting columns by name (via SELECT)? If so, then this is definitely in-scope, and we can add it to the interchange-only level of support https://narwhals-dev.github.io/narwhals/extending/#interchange-only-support happy to chat more about this if you like |
It was great talking to you at PyData Paris too! I want to share some thoughts on why I think adding database query support as a general tabular data source would be a valuable direction for Narwhals. Narwhals as an Abstraction for Tabular DataI see Narwhals as an abstraction layer for tabular data, enabling Python developers to:
Many libraries implement similar abstractions, like data backends in HoloViews. These abstractions allow libraries to focus on data interfaces rather than data loading or transforming details, freeing users to work with data seamlessly regardless of format or source. In the data visualization and app ecosystem, a clear trend is emerging: frameworks are increasingly abstracting over media types. For example, Streamlit allows users to provide a simple Gradio exemplifies this approach by organizing components based on media type, allowing users to input for example JSON as a With Narwhals providing a unified backend for tabular data, frameworks could adopt it to improve and simplify their tabular data handling. Others could build similar backend abstractions for different media types ( Goals and VisionMy primary goal is to support larger communities and make data handling easier. I would like to do this for:
I understand that interchange support could help achieve this. This would be a solid starting point, but ideally, I'd like the applications I work on to be as performant and scalable as possible. For a database-backed tabular data source, this means offloading computations directly to the database. For example:
Adding database query support would help Narwhals offer a seamless, efficient experience across many platforms and use cases. To be more specific I would have to analyze the above. But interchange is a good starting point. |
@MarcSkovMadsen, do I understand correctly that you're proposing that Narwhals directly take on the Ibis use case? |
Its a good question. I hope not. I can't really see through all of this. What I believe I can see is that the current goals of Narwhals is not enough for someone trying to make it easy for for users to utilize tabular data in their projects. I believe we want to enable users to use DataFrames, SELECT queries in databases, local parquet files, remote parquet files and probably more. I.e. it means that library maintainers probably still need to combine narwhals with other libraries (ibis?) and also create support for other sources them selves. Maybe the problem is already solved. Maybe I just need better how to integrate narwhals and ibis? Maybe improved docs explaining to library developers that support for tabular data from databases goes via Ibis and that narwhals has a long term goal of making that as performant as possible. The downside of ibis is that then you suddenly have to include both narwhals and ibis as dependencies. |
Having studied narwhals, ibis and the problem I'm trying to solve a bit more (for example) I think what I'm asking for is probably very much aligned with the road map:
Documenting better how to use tabular data sources that are not DataFrames in origin via Ibis would help me a lot. Including making it clear that its a design decision not to implement your own deep support for databases Defining a lazy-only layer of support for Ibis would help push computations to the underlying backend Then I can hope and ask library builders like pygwalker to support Narwhals and Ibis instead of building their own custom Then there is a detail of me wanting to enable users to just give file paths and urls instead of dataframes as input. Because that is where I see data app frameworks going. But that is a detail. I think I will close this one. Thanks so much for the discussion. |
We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?
https://panel.holoviz.org/index.html
Please describe the purpose of the new feature or describe the problem to solve.
I'm a contributor to and user of Panel and HoloViz ecosystem. We plan to extend Panel dataframe support from Pandas Dataframe to Dataframe in general. Similarly what other data app frameworks are or will be doing.
To add support we will probably add a
TabularData
parameter toparam
: https://param.holoviz.org/user_guide/Parameter_Types.html.We want start experimenting with this by adding the support to the package https://github.com/panel-extensions/panel-graphic-walker. I hope to do it via panel-extensions/panel-graphic-walker#22. Looking at what the library we wrap support we can see that a database Connector is also a tabular dataset. See https://github.com/Kanaries/pygwalker/tree/main/pygwalker/data_parsers and https://github.com/Kanaries/pygwalker/blob/d0d9acda42d4c11e5741ebb46b639afdd96d00e3/pygwalker/data_parsers/database_parser.py#L44. For them it consist of a sqlalchemy connectionstring and a
SELECT
query. To work with all databases (including azure synapse) it should be a sqlalchemy engine instead of a connectionstring.Suggest a solution if possible.
Support a Database Query:
If you have tried alternatives, please describe them below.
Maybe Ibis support will provide this? But it is a wrapper with overhead. And many more people are familiar with SQLAlchemy than Ibis. Why not support SQLAlchemy queries directly?
Additional information that may help us understand your needs.
No response
The text was updated successfully, but these errors were encountered: