Is your feature request related to a problem?
I want to start by acknowledging that what I'm doing is convoluted, but in short I'm saving files as binary blobs in a SQLite database. I want to be able to do this with Parquet files produced by Ibis via table.to_parquet, but I hit an error: Ibis writes to a file named after the string representation of the buffer object, instead of writing into the object itself. Using pyarrow.parquet to write the table directly works (and is my workaround until I stop doing this convoluted thing), but it'd be cool (?) if Ibis directly supported writing to io.BytesIO in its output methods.
What is the motivation behind your request?
as a simple example:
```{python}
import io
import ibis
import pyarrow as pa
import pyarrow.parquet as pq
ibis.options.interactive = True
```
```{python}
t = ibis.examples.penguins.fetch()
t
```
```{python}
b = io.BytesIO()
t.to_parquet(b)
b.getvalue()
# this outputs an empty buffer, and instead writes out to a file named <_io.BytesIO object at 0x11eb68a90>
```
```{python}
b = io.BytesIO()
pq.write_table(t.to_pyarrow(), b)
b.getvalue()
# this outputs the correct Parquet bytes
```
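for context, a rough sketch of what I then do with those bytes (the table and column names here are made up, but it's the gist of the blob-in-SQLite setup):
```{python}
import sqlite3

# hypothetical schema -- just to show the parquet-bytes-as-blob round trip
con = sqlite3.connect("blobs.db")
con.execute("CREATE TABLE IF NOT EXISTS files (name TEXT PRIMARY KEY, data BLOB)")
con.execute(
    "INSERT OR REPLACE INTO files (name, data) VALUES (?, ?)",
    ("penguins.parquet", b.getvalue()),
)
con.commit()

# reading it back into pyarrow works as expected
raw = con.execute(
    "SELECT data FROM files WHERE name = ?", ("penguins.parquet",)
).fetchone()[0]
pq.read_table(io.BytesIO(raw))
```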
Describe the solution you'd like
it seems like at some point the buffer object is being turned into a string and passed down into the writer. I took a cursory look but wasn't exactly sure where that happens
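as a rough illustration of my guess (I haven't traced the actual code path, so where the coercion happens is an assumption):
```{python}
buf = io.BytesIO()
str(buf)
# '<_io.BytesIO object at 0x11eb68a90>' -- if the buffer gets coerced to a
# string somewhere (an f-string, str(), etc.) before reaching the backend,
# the backend happily creates a file with that literal name
```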
my ideal solution is being able to pass io.BytesIO objects to Ibis table output methods
What version of ibis are you running?
9.5
What backend(s) are you using, if any?
sqlite, duckdb
Code of Conduct
I agree to follow this project's Code of Conduct
It should work as you have it above for the backends that don't have their own parquet writer -- but for DuckDB it will definitely break, and I don't think BigQuery and Snowflake will be very happy about it either.
Basically, for all of the backends where we're generating SQL to handle parquet writing (vs. via pyarrow), it won't work (with the possible exception of polars and datafusion)
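If it helps, one workaround for those backends is to let them write to a real temporary path and then pull the bytes back into a buffer -- just a sketch of the idea, not something Ibis handles for you:
```{python}
import tempfile
from pathlib import Path

# sketch: let the backend-native writer (e.g. DuckDB's COPY ... TO) target a
# real file, then load the resulting bytes into an in-memory buffer
with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "penguins.parquet"
    t.to_parquet(path)  # works because the backend gets an actual filesystem path
    buf = io.BytesIO(path.read_bytes())

buf.getvalue()[:4]  # b'PAR1', the parquet magic bytes
```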
that makes a lot of sense, thanks for the explanation. I'll just close this out as not planned for now; my workaround is fine and, as I said, I don't think this is something I genuinely need to do in the medium term