-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow restricting table features to auto-update protocol versions #1660
Closed
Closed
Changes from 7 commits
Commits
Show all changes
10 commits
Select commit
Hold shift + click to select a range
e12e2c9
implementation
xupefei ab6f68e
address most comments
xupefei 8f0bcd7
address comments
xupefei d61ba90
address comments
xupefei 6b1532a
even better test
xupefei ee2bf6e
Merge branch 'master' of github.com:delta-io/delta into tf-auto-updat…
xupefei b156872
address comments
xupefei 9b500a2
address comments
xupefei fe7361f
address comments
xupefei 74d39a9
address comments
xupefei File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -111,18 +111,32 @@ sealed trait LegacyFeatureType | |
* A trait indicating this feature can be automatically enabled via a change in a table's | ||
* metadata, e.g., through setting particular values of certain feature-specific table properties. | ||
* | ||
* When the feature's metadata requirements are satisfied during table creation | ||
* ([[actions.Protocol.forNewTable]]) or commit ([[OptimisticTransaction.updateMetadata]]), the | ||
* When the feature's metadata requirements are satisfied for <b>new tables</b>, or for | ||
* <b>existing tables when [[automaticallyUpdateProtocolOfExistingTables]] set to `true`</b>, the | ||
Comment on lines
+114
to
+115
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we allow html annotations in our doc comments? I thought it was just markdown? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe so... Someone could correct me if I'm wrong |
||
* client will silently add the feature to the protocol's `readerFeatures` and/or | ||
* `writerFeatures`. | ||
* `writerFeatures`. Otherwise, a proper protocol version bump must be present in the same | ||
* transaction. | ||
*/ | ||
sealed trait FeatureAutomaticallyEnabledByMetadata { | ||
sealed trait FeatureAutomaticallyEnabledByMetadata { this: TableFeature => | ||
|
||
/** | ||
* Whether the feature can automatically update the protocol of an existing table when the | ||
* metadata requirements are satisfied. As a rule of thumb, a table feature that requires | ||
* explicit operations (e.g., turning on a table property) should set this flag to `true`, while | ||
* features that are used implicitly (e.g., when using a new data type) should set this flag to | ||
* `false`. | ||
*/ | ||
def automaticallyUpdateProtocolOfExistingTables: Boolean = this.isLegacyFeature | ||
|
||
/** | ||
* Determine whether the feature must be supported and enabled because its metadata requirements | ||
* are satisfied. | ||
*/ | ||
def metadataRequiresFeatureToBeEnabled(metadata: Metadata, spark: SparkSession): Boolean | ||
|
||
require( | ||
!this.isLegacyFeature || automaticallyUpdateProtocolOfExistingTables, | ||
"Legacy feature must be auto-update capable.") | ||
} | ||
|
||
/** | ||
|
@@ -205,10 +219,12 @@ object TableFeature { | |
if (DeltaUtils.isTesting) { | ||
features ++= Set( | ||
TestLegacyWriterFeature, | ||
TestWriterFeature, | ||
TestLegacyReaderWriterFeature, | ||
TestWriterFeature, | ||
TestWriterMetadataNoAutoUpdateFeature, | ||
TestReaderWriterFeature, | ||
TestReaderWriterMetadataFeature) | ||
TestReaderWriterMetadataAutoUpdateFeature, | ||
TestReaderWriterMetadataNoAutoUpdateFeature) | ||
} | ||
val featureMap = features.map(f => f.name.toLowerCase(Locale.ROOT) -> f).toMap | ||
require(features.size == featureMap.size, "Lowercase feature names must not duplicate.") | ||
|
@@ -302,7 +318,6 @@ object IdentityColumnsTableFeature | |
|
||
object TimestampNTZTableFeature extends ReaderWriterFeature(name = "timestampNtz") | ||
with FeatureAutomaticallyEnabledByMetadata { | ||
|
||
override def metadataRequiresFeatureToBeEnabled( | ||
metadata: Metadata, spark: SparkSession): Boolean = { | ||
SchemaUtils.checkForTimestampNTZColumnsRecursively(metadata.schema) | ||
|
@@ -312,6 +327,8 @@ object TimestampNTZTableFeature extends ReaderWriterFeature(name = "timestampNtz | |
object DeletionVectorsTableFeature | ||
extends ReaderWriterFeature(name = "deletionVectors") | ||
with FeatureAutomaticallyEnabledByMetadata { | ||
override def automaticallyUpdateProtocolOfExistingTables: Boolean = true | ||
|
||
override def metadataRequiresFeatureToBeEnabled( | ||
metadata: Metadata, | ||
spark: SparkSession): Boolean = { | ||
|
@@ -330,6 +347,17 @@ object TestLegacyWriterFeature | |
|
||
object TestWriterFeature extends WriterFeature(name = "testWriter") | ||
|
||
object TestWriterMetadataNoAutoUpdateFeature | ||
extends WriterFeature(name = "testWriterMetadataNoAutoUpdate") | ||
with FeatureAutomaticallyEnabledByMetadata { | ||
val TABLE_PROP_KEY = "_123testWriterMetadataNoAutoUpdate321_" | ||
override def metadataRequiresFeatureToBeEnabled( | ||
metadata: Metadata, | ||
spark: SparkSession): Boolean = { | ||
metadata.configuration.get(TABLE_PROP_KEY).exists(_.toBoolean) | ||
} | ||
} | ||
|
||
object TestLegacyReaderWriterFeature | ||
extends LegacyReaderWriterFeature( | ||
name = "testLegacyReaderWriter", | ||
|
@@ -338,10 +366,24 @@ object TestLegacyReaderWriterFeature | |
|
||
object TestReaderWriterFeature extends ReaderWriterFeature(name = "testReaderWriter") | ||
|
||
object TestReaderWriterMetadataFeature | ||
extends ReaderWriterFeature(name = "testReaderWriterMetadata") | ||
object TestReaderWriterMetadataNoAutoUpdateFeature | ||
extends ReaderWriterFeature(name = "testReaderWriterMetadataNoAutoUpdate") | ||
with FeatureAutomaticallyEnabledByMetadata { | ||
val TABLE_PROP_KEY = "_123testReaderWriterMetadata321_" | ||
val TABLE_PROP_KEY = "_123testReaderWriterMetadataNoAutoUpdate321_" | ||
override def metadataRequiresFeatureToBeEnabled( | ||
metadata: Metadata, | ||
spark: SparkSession): Boolean = { | ||
metadata.configuration.get(TABLE_PROP_KEY).exists(_.toBoolean) | ||
} | ||
} | ||
|
||
object TestReaderWriterMetadataAutoUpdateFeature | ||
extends ReaderWriterFeature(name = "testReaderWriterMetadataAutoUpdate") | ||
with FeatureAutomaticallyEnabledByMetadata { | ||
val TABLE_PROP_KEY = "_123testReaderWriterMetadataAutoUpdate321_" | ||
|
||
override def automaticallyUpdateProtocolOfExistingTables: Boolean = true | ||
|
||
override def metadataRequiresFeatureToBeEnabled( | ||
metadata: Metadata, | ||
spark: SparkSession): Boolean = { | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Table name can be passed in to this error message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems not trivial because we have only the Delta Log instance which is based on a path. The name is already lost at this stage.