Insert high throughput events using iceberg #23592
Unanswered
allanbatista asked this question in Q&A
Replies: 0 comments
I am analyzing how to do massive inserts using Trino (with Iceberg): about 1 million events per minute, each event around 1 KB.
I tried doing these inserts with SQL through the Python connector, but the throughput is very slow.
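For context on the single-row approach: batching many rows into one INSERT statement means each Trino query produces one Iceberg snapshot commit instead of one per row, which is usually the first thing to try before parallelizing. A minimal sketch of the batching logic (the `events` table name is a placeholder, and the literal rendering here is deliberately simplistic, not a general-purpose SQL escaper):

```python
from typing import Iterator, Sequence


def sql_literal(value) -> str:
    # Minimal literal rendering for this sketch: strings get single quotes
    # (with embedded quotes doubled), everything else passes through as-is.
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def build_batched_inserts(
    table: str,
    rows: Sequence[Sequence],
    batch_size: int = 10_000,
) -> Iterator[str]:
    # One multi-row INSERT per batch -> one Iceberg snapshot commit per
    # batch, instead of one commit per row.
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        values = ", ".join(
            "(" + ", ".join(sql_literal(v) for v in row) + ")"
            for row in chunk
        )
        yield f"INSERT INTO {table} VALUES {values}"
```

Each generated statement would then be sent through the trino Python client's cursor, one execute per batch.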
I tried to parallelize using multiple workers, but I get continuous errors from the Iceberg metadata.
My current pipeline in AWS is Kafka + Firehose + S3 + Athena.
Is it possible to update a partition in Trino like in Athena (ALTER TABLE ADD PARTITION), to add an already-existing file in a partition-structured path?

File path example:
account_id=account-id-1/service_name=service-name-1/year=2021/month=01/day=01/hour=00/1727426921807104378_N_acc88e31-eb5e-4eac-be1e-871703dedbda.parquet
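As a possible direction: recent Trino releases added an add_files table procedure to the Iceberg connector, which registers existing data files into a table without rewriting them. Whether it is available depends on your Trino version, so check the Iceberg connector docs for your release; the catalog, schema, table name, and bucket below are placeholders:

```sql
-- Hypothetical sketch: register existing Parquet files under a partition
-- directory into an Iceberg table (procedure availability and exact
-- parameters depend on the Trino version).
ALTER TABLE iceberg.analytics.events EXECUTE add_files(
    location => 's3://my-bucket/events/account_id=account-id-1/service_name=service-name-1/year=2021/month=01/day=01/hour=00/',
    format => 'PARQUET'
)
```

Note this is different from Athena's ALTER TABLE ADD PARTITION, which only updates the Hive-style metastore; Iceberg tracks individual files in its own metadata, so files must be registered through a procedure like this rather than by pointing the table at a directory.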