[apache#3595] Improvement(spark-connector): update spark-iceberg connector document (apache#3596)

### What changes were proposed in this pull request?

Update the spark-iceberg connector document.

### Why are the changes needed?

Fix: apache#3595
caican00 authored and diqiu50 committed Jun 13, 2024
1 parent 877254c commit 6fdfc88
Showing 1 changed file with 59 additions and 16 deletions: docs/spark-connector/spark-catalog-iceberg.md
@@ -12,31 +12,35 @@ The Gravitino Spark connector offers the capability to read and write Iceberg ta

#### Supported DML and DDL operations:

- `CREATE TABLE`

  Supports the basic create table clause, including table schema, properties, and partitioning; distribution and sort orders are not supported.

- `DROP TABLE`
- `ALTER TABLE`
- `INSERT INTO` & `INSERT OVERWRITE`
- `SELECT`
- `MERGE INTO`
- `DELETE FROM`
- `UPDATE`
- `CALL`
- `TIME TRAVEL QUERY`
- `DESCRIBE TABLE`
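
For example, in addition to the timestamp-based queries in the SQL example below, time travel by snapshot ID uses the standard Spark SQL syntax; the snapshot ID `1` here is an illustrative value:

```sql
-- Time travel by snapshot ID (illustrative value), complementing the
-- TIMESTAMP AS OF queries in the SQL example below.
SELECT * FROM employee VERSION AS OF 1;
SELECT * FROM employee FOR SYSTEM_VERSION AS OF 1;
```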

#### Unsupported operations:

- View operations.
- Metadata tables, like:
  - `{iceberg_catalog}.{iceberg_database}.{iceberg_table}.snapshots`
- Other Iceberg extension SQL, like:
  - `ALTER TABLE prod.db.sample ADD PARTITION FIELD xx`
  - `ALTER TABLE ... WRITE ORDERED BY`
  - `ALTER TABLE prod.db.sample CREATE BRANCH branchName`
  - `ALTER TABLE prod.db.sample CREATE TAG tagName`
- `AtomicCreateTableAsSelect` & `AtomicReplaceTableAsSelect`
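
For instance, a metadata-table query of the form listed above is expected to fail when issued through the connector; the names below are illustrative:

```sql
-- Not supported: querying an Iceberg metadata table through the connector
-- (illustrative catalog, database, and table names).
SELECT * FROM iceberg_a.mydatabase.employee.snapshots;
```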

## SQL example

```sql
-- Suppose iceberg_a is the Iceberg catalog name managed by Gravitino
USE iceberg_a;

CREATE DATABASE IF NOT EXISTS mydatabase;
@@ -57,7 +61,7 @@ VALUES
(2, 'Bob', 'Marketing', TIMESTAMP '2021-02-01 10:30:00'),
(3, 'Charlie', 'Sales', TIMESTAMP '2021-03-01 08:45:00');

SELECT * FROM employee WHERE date(hire_date) = '2021-01-01';

UPDATE employee SET department = 'Jenny' WHERE id = 1;

@@ -74,18 +78,57 @@ USING (SELECT 4 as id, 'David' as name, 'Engineering' as department, TIMESTAMP '
ON employee.id = new_employee.id
WHEN MATCHED THEN DELETE
WHEN NOT MATCHED THEN INSERT *;

-- Suppose that the first snapshotId of employee is 1L and the second snapshotId is 2L
-- Rollback the snapshot for iceberg_a.mydatabase.employee to 1L
CALL iceberg_a.system.rollback_to_snapshot('iceberg_a.mydatabase.employee', 1);
-- Set the snapshot for iceberg_a.mydatabase.employee to 2L
CALL iceberg_a.system.set_current_snapshot('iceberg_a.mydatabase.employee', 2);

-- Suppose that the commit timestamp of the first snapshot is older than '2024-05-27 01:01:00'
-- Time travel to '2024-05-27 01:01:00'
SELECT * FROM employee TIMESTAMP AS OF '2024-05-27 01:01:00';
SELECT * FROM employee FOR SYSTEM_TIME AS OF '2024-05-27 01:01:00';

-- Show the details of employee, such as schema and reserved properties (like location, current-snapshot-id, provider, format, format-version, etc.)
DESC EXTENDED employee;
```

For more details about `CALL`, please refer to the [Spark Procedures description](https://iceberg.apache.org/docs/latest/spark-procedures/#spark-procedures) in the official Iceberg documentation.
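
As a further hedged illustration, other maintenance procedures such as `expire_snapshots` follow the same `CALL` pattern shown above; the table name and timestamp are placeholders:

```sql
-- Expire snapshots older than the given timestamp (illustrative arguments),
-- assuming the connector exposes this procedure like rollback_to_snapshot above.
CALL iceberg_a.system.expire_snapshots(
  table => 'mydatabase.employee',
  older_than => TIMESTAMP '2024-05-27 01:01:00'
);
```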

## Iceberg backend-catalog support
- HiveCatalog
- JdbcCatalog
- RESTCatalog

### Catalog properties

The Gravitino Spark connector transforms the following property names, defined in the catalog properties, into the Spark Iceberg connector configuration.

#### HiveCatalog

| Gravitino catalog property name | Spark Iceberg connector configuration | Default Value | Required | Description | Since Version |
|---------------------------------|---------------------------------------|---------------|----------|---------------------------|---------------|
| `catalog-backend` | `type` | `hive` | Yes | Catalog backend type | 0.5.0 |
| `uri`                           | `uri`                                 | (none)        | Yes      | Catalog backend URI       | 0.5.0         |
| `warehouse` | `warehouse` | (none) | Yes | Catalog backend warehouse | 0.5.0 |
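
As a hedged sketch of the mapping, a Hive-backed catalog named `iceberg_a` carrying the properties above would be expected to surface in Spark roughly as the following settings, assuming the standard `spark.sql.catalog.<name>.*` key layout of the Spark Iceberg runtime; all values are placeholders:

```sql
-- Illustrative only: these settings are established by the connector itself,
-- not run by hand; shown as SET statements purely to display the key layout.
SET spark.sql.catalog.iceberg_a.type = hive;
SET spark.sql.catalog.iceberg_a.uri = thrift://hive-metastore:9083;
SET spark.sql.catalog.iceberg_a.warehouse = hdfs://namenode:8020/user/hive/warehouse;
```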

#### JdbcCatalog

| Gravitino catalog property name | Spark Iceberg connector configuration | Default Value | Required | Description | Since Version |
|---------------------------------|---------------------------------------|---------------|----------|---------------------------|---------------|
| `catalog-backend` | `type` | `jdbc` | Yes | Catalog backend type | 0.5.0 |
| `uri`                           | `uri`                                 | (none)        | Yes      | Catalog backend URI       | 0.5.0         |
| `warehouse` | `warehouse` | (none) | Yes | Catalog backend warehouse | 0.5.0 |
| `jdbc-user` | `jdbc.user` | (none) | Yes | JDBC user name | 0.5.0 |
| `jdbc-password` | `jdbc.password` | (none) | Yes | JDBC password | 0.5.0 |

#### RESTCatalog

| Gravitino catalog property name | Spark Iceberg connector configuration | Default Value | Required | Description | Since Version |
|---------------------------------|---------------------------------------|---------------|----------|---------------------------|---------------|
| `catalog-backend` | `type` | `rest` | Yes | Catalog backend type | 0.5.1 |
| `uri`                           | `uri`                                 | (none)        | Yes      | Catalog backend URI       | 0.5.1         |
| `warehouse` | `warehouse` | (none) | No | Catalog backend warehouse | 0.5.1 |

Gravitino catalog property names with the prefix `spark.bypass.` are passed to the Spark Iceberg connector. For example, use `spark.bypass.io-impl` to pass `io-impl` to the Spark Iceberg connector.
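
A minimal sketch of the bypass, reusing the key layout assumed above; `org.apache.iceberg.aws.s3.S3FileIO` is Iceberg's stock S3 `FileIO` implementation:

```sql
-- Illustrative only: a Gravitino catalog property `spark.bypass.io-impl`
-- is expected to surface as the catalog-level `io-impl` setting below.
SET spark.sql.catalog.iceberg_a.io-impl = org.apache.iceberg.aws.s3.S3FileIO;
```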
