Expose table alterations under alter namespace #1909

Open
ion-elgreco opened this issue Nov 25, 2023 · 7 comments
Labels
binding/python Issues for the Python package enhancement New feature or request

Comments

@ion-elgreco
Collaborator

Description

Use Case
Eventually, multiple alterations will be possible on a table, such as setting and unsetting table properties, adding and removing columns, and so forth. We can cluster these together under a single namespace called alter. The API will look like this:

DeltaTable().alter.set_table_properties()
DeltaTable().alter.unset_table_properties()
DeltaTable().alter.add_columns()
DeltaTable().alter.change_columns()
DeltaTable().alter.replace_columns()
DeltaTable().alter.add_constraints()
DeltaTable().alter.drop_constraints()
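
One way such a namespace could be wired up is as a property that returns a small helper object bound to the table, much like the existing optimize grouping. The sketch below is illustrative only: `SketchTable`, `AlterNamespace`, and the in-memory `properties` dict are hypothetical stand-ins and not the deltalake implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SketchTable:
    """Hypothetical stand-in for DeltaTable; keeps properties in memory only."""

    properties: dict[str, str] = field(default_factory=dict)

    @property
    def alter(self) -> AlterNamespace:
        # Each access returns a helper bound to this table instance.
        return AlterNamespace(self)


class AlterNamespace:
    """Hypothetical helper that groups ALTER-style operations."""

    def __init__(self, table: SketchTable) -> None:
        self._table = table

    def set_table_properties(self, properties: dict[str, str]) -> None:
        # Add or overwrite the given properties.
        self._table.properties.update(properties)

    def unset_table_properties(self, keys: list[str]) -> None:
        # Remove the given keys, ignoring ones that are not set.
        for key in keys:
            self._table.properties.pop(key, None)


table = SketchTable()
table.alter.set_table_properties({"delta.appendOnly": "true"})
table.alter.unset_table_properties(["delta.appendOnly"])
```

Keeping the operations on a separate helper object would leave the main table class small, and further operations such as add_columns could be added to the helper without touching it.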

Related Issue(s)
#1663

@ion-elgreco ion-elgreco added the enhancement New feature or request label Nov 25, 2023
@roeap
Collaborator

roeap commented Nov 25, 2023

In principle I have no strong feelings about bundling some commands under a common property, much like we do for optimize.

Given the name alter though, I would suggest restricting it to things that can be done via the ALTER TABLE command, as we may want / need to implement that operation at some point.

The way things are going with the Delta Protocol, table features seem to be front and center when it comes to configuring tables. Along with that, some configuration is becoming more complex. As such, we may consider exposing set_table_properties as a low-level (discouraged) API only and instead model this around table features. Something along the lines of

def enable_feature(name: FeatureName, config: dict): ...

The advantage may be that it is easier for us to validate the configuration, as configuration for specific features may include multiple keys that need to be consistent and (as far as I understand) may even require setting domain metadata at some point.
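
A minimal sketch of what per-feature validation could look like follows. The `FeatureName` members, the `_REQUIRED_CONFIG` rule table, and the return value are assumptions made for illustration; the property keys are intended to match Delta table properties, but the rules themselves are not taken from the protocol or from delta-rs.

```python
from enum import Enum


class FeatureName(Enum):
    """Hypothetical subset of table features, for illustration only."""

    CHANGE_DATA_FEED = "changeDataFeed"
    COLUMN_MAPPING = "columnMapping"


# Assumed rule table: the keys each feature requires and the values allowed.
_REQUIRED_CONFIG = {
    FeatureName.CHANGE_DATA_FEED: {"delta.enableChangeDataFeed": {"true", "false"}},
    FeatureName.COLUMN_MAPPING: {"delta.columnMapping.mode": {"id", "name"}},
}


def enable_feature(name: FeatureName, config: dict) -> dict:
    """Validate `config` for one feature and return the properties to set."""
    rules = _REQUIRED_CONFIG[name]
    missing = sorted(set(rules) - set(config))
    if missing:
        raise ValueError(f"{name.value} requires keys: {missing}")
    unknown = sorted(set(config) - set(rules))
    if unknown:
        raise ValueError(f"{name.value} does not accept keys: {unknown}")
    for key, allowed in rules.items():
        if config[key] not in allowed:
            raise ValueError(f"invalid value {config[key]!r} for {key}")
    # A real implementation would commit these; the sketch just returns them.
    return dict(config)
```

Because all keys for a feature are checked in one call, inconsistent or partial configuration can be rejected before anything is written to the table.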

@ion-elgreco
Collaborator Author

ion-elgreco commented Nov 25, 2023

@roeap it's mostly inspired from the SQL alter operations: https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-alter-table.html

I am not entirely following you on why set_table_properties should be a low-level API. Not every table property belongs to a certain feature, right? See: https://books.japila.pl/delta-lake-internals/DeltaConfigs/#appendOnly

@roeap
Collaborator

roeap commented Nov 25, 2023

Yes, there is config that is unrelated to features. I am mainly saying that the config that is related to features should maybe be modeled as such.

@ion-elgreco
Collaborator Author

ion-elgreco commented Jan 3, 2024

@roeap I am going to start looking into this soon and just want to clarify one thing: for configs that are related to features, should we raise when someone tries to add or remove them via set_table_properties?
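
The behaviour being asked about could be sketched as below: a generic property setter that refuses keys tied to table features. `FEATURE_KEYS`, the function signature, and the `raise_on_feature_keys` flag are hypothetical; which keys count as feature-related, and whether to raise at all, is exactly the open question.

```python
# Assumed, non-exhaustive set of property keys tied to table features.
FEATURE_KEYS = frozenset(
    {
        "delta.enableChangeDataFeed",
        "delta.enableDeletionVectors",
        "delta.columnMapping.mode",
    }
)


def set_table_properties(
    current: dict, updates: dict, raise_on_feature_keys: bool = True
) -> dict:
    """Return a new property dict, optionally refusing feature-related keys."""
    blocked = sorted(FEATURE_KEYS & set(updates))
    if blocked and raise_on_feature_keys:
        raise ValueError(
            f"feature-related properties must be set via a feature API: {blocked}"
        )
    # Merge without mutating the caller's dict.
    return {**current, **updates}
```

Properties unrelated to features, such as delta.appendOnly, would pass through unchanged under this approach.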

@dtheodor

Does this issue cover adding support for DDL statements in general, such as CREATE TABLE and ALTER TABLE ...? These are currently only possible with Spark.

@ion-elgreco
Collaborator Author

> Does this issue cover adding support for DDL statements in general, such as CREATE TABLE and ALTER TABLE ...? Currently only possible with spark.

Create table is already covered by the create operation.

Some alter operations are already available; adding more, such as an add columns operation, is still a work in progress.

@dtheodor

Complete support for alter operations would make this project useful for lightweight migrations, removing the need for a Spark cluster to perform them.

@rtyler rtyler removed this from the python v0.20 milestone Oct 8, 2024