iceberg time travel unavailable with nessie catalog #23631
Replies: 2 comments
-
We can time travel to non-latest snapshots with Nessie in Trino Iceberg connector as far as I tested locally. Could you share the entire steps to reproduce (CREATE TABLE, INSERT and SELECT statements on Trino)? Note that we don't support backtick or double-quotes for time travel. Please use |
Beta Was this translation helpful? Give feedback.
-
@ebyhr As most basic example, have a dbt project, connected to Trino on the nessie iceberg catalog: profiles.yaml
Create a dummy model in dbt: execution_time.sql
Run dbt with I confirm that the data is available with
(we use DBeaver) My guess is that the culprit is the way dbt updates data, by dropping old tables and recreating new data, with the help of temp helper tables, instead of updating/truncating tables. As when we go into the S3 where the iceberg data is stored, we notice for the 2 dbt executions, 2 different folders for the same table are present, both containing a single parquet file in the data folder. This also happens for 'incremental' materialization in dbt, not only 'table'. Let me know if you need any more info to reproduce this. |
Beta Was this translation helpful? Give feedback.
-
When using Nessie as iceberg catalog (we are using nessie with REST), we are unable to use Nessie's time travelling.
According to Trino documentation, we should use:
SELECT * FROM "nessie"."myschema"."mytable" FOR TIMESTAMP AS OF TIMESTAMP '2024-09-28 09:59:29.803 Europe/Vienna'
However, this will always result in
SQL Error [76]: Query failed (#20241001_081626_00426_nmc3s): No version history table "myschema"."mytable" at or before 2024-09-28T07:59:29.803Z
This only works when selecting a timestamp later than the latest commit to nessie, probably because of how you have to specify an
iceberg.rest-catalog.prefix
in the iceberg connector and the inner workings of nessie.When looking at the Nessie documentation on how to time travel, it states some other syntax:
SELECT * FROM nessie.myschema.`mytable#2024-07-01T00:00:00Z`;
Specifically, you need to use backticks, which Trino does not support:
SQL Error [1]: Query failed (#20241001_082410_00427_nmc3s): line 1:35: backquoted identifiers are not supported; use double quotes to quote identifiers
But when using double quotes, we get:
SQL Error [84148237]: Query failed (#20241001_082542_00430_nmc3s): Failed to load table: myschema.mytable#2024-07-01t00:00:00z
This seems to just be a syntax compatibility issue between Trino and Nessie, so any way to fix this?
Let me know if there is anything else I can try to make this work.
In theory you can still time travel with trino+nessie by updating the iceberg.rest-catalog.prefix to an earlier commit, but this is not the desired way to do this.
Thanks!
Beta Was this translation helpful? Give feedback.
All reactions