bug(trino): cannot create table with large size data in trino #10178
Comments
Did some checks: Trino and Impala insert data into the database one row at a time; for Trino, see https://github.com/ibis-project/ibis/blob/main/ibis/backends/trino/__init__.py#L592. It gets slower and slower during insertion, so the insert takes almost forever.

I tested inserting the data in chunks:

    data = list(op.data.to_frame().itertuples(index=False))
    insert_stmt = self._build_insert_template(name, schema=schema)
    with self.begin() as cur:
        cur.execute(create_stmt)
        chunk_size = 100  # Define the chunk size
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            cur.executemany(insert_stmt, chunk)

Not sure if we want to insert the data as a whole; it may run out of memory if the data size is too large.
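For anyone who wants to try the chunking idea outside of ibis, here is a minimal, self-contained sketch of the same pattern. It is not the ibis implementation: `iter_chunks` and `insert_in_chunks` are hypothetical helper names, and an in-memory SQLite database stands in for Trino only so the snippet runs anywhere. Whether chunked `executemany` actually reduces round trips depends on the driver, so treat this as an illustration of the control flow rather than a guaranteed speedup.

```python
import sqlite3
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield successive lists of at most chunk_size rows from any iterable."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    it = iter(rows)
    # islice pulls rows lazily, so the full dataset is never materialized here.
    while chunk := list(islice(it, chunk_size)):
        yield chunk


def insert_in_chunks(cur, insert_stmt, rows, chunk_size=100):
    """Call executemany once per chunk; return the number of chunks sent."""
    n_chunks = 0
    for chunk in iter_chunks(rows, chunk_size):
        cur.executemany(insert_stmt, chunk)
        n_chunks += 1
    return n_chunks


# Stand-in database (assumption: any DB-API 2.0 cursor would be used the same way).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER, name TEXT)")

rows = ((i, f"row-{i}") for i in range(250))  # generator, not a list
n_chunks = insert_in_chunks(cur, "INSERT INTO t VALUES (?, ?)", rows, chunk_size=100)
conn.commit()

row_count = cur.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

Feeding a generator instead of a fully built list keeps only one chunk in client memory at a time, which speaks to the out-of-memory concern, though it does nothing about memory limits on the server side.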
There's not much we can do here, both Impala and Trino have no way to efficiently insert data from Python without an unacceptable amount of client-side work. We've asked for this from the Trino folks, but nothing has materialized over the past few years and for whatever reason they're very reluctant or simply don't have the time to implement support for ingesting Arrow (or some other efficient in-memory format) from clients.
Got it, thanks for the explanation. This is blocking #9908 and #9744. Should we skip Trino and Impala in the unit tests there?
Closing this out as an upstream issue.
What happened?
code to reproduce the error:
It throws an exception because of MEMORY_LIMIT_EXCEEDED.
It is related to _in_memory_table_exists; I saw we recently changed the implementation in #10067. Smaller data runs OK.
I guess this could be the reason for the CI failures in #9908 and #9744.
What version of ibis are you using?
9.5.0
What backend(s) are you using, if any?
Trino
Relevant log output