-
Notifications
You must be signed in to change notification settings - Fork 414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: handle protocol compatibility #1807
Conversation
9815fc6
to
0c72b20
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is just a drive-by fix as builds were failing for benches.
pub static INSTANCE: Lazy<ProtocolChecker> = Lazy::new(|| { | ||
let reader_features = HashSet::new(); | ||
// reader_features.insert(ReaderFeatures::ColumnMapping); | ||
|
||
let mut writer_features = HashSet::new(); | ||
writer_features.insert(WriterFeatures::AppendOnly); | ||
writer_features.insert(WriterFeatures::Invariants); | ||
// writer_features.insert(WriterFeatures::CheckConstraints); | ||
// writer_features.insert(WriterFeatures::ChangeDataFeed); | ||
// writer_features.insert(WriterFeatures::GeneratedColumns); | ||
// writer_features.insert(WriterFeatures::ColumnMapping); | ||
// writer_features.insert(WriterFeatures::IdentityColumns); | ||
|
||
ProtocolChecker::new(reader_features, writer_features) | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I like the API of ProtocolChecker
and I think it's a great addition to the library. The motivation for this global I am struggling with however. I thought it might be to avoid allocations, but as far as I can tell, the common paths through can_read
and can_write_to
will clone a number of the related HashSets anyways, which has me perplexed.
I think I'm just not understanding, would you mind putting some more commentary above INSTANCE
on the motivation for it so that I may be enlightened 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to add some comments, just really quick here as well.
The cloning bothered me as well, and I'll see that I can fix that. The main motivation though is that with table features protocol support is essentially a negotiation between the client and the table. Since the table features we support lock in at compile time, there really is exactly one correct way the instance should look like, so i felt its less error prone if we just create it globally and track support for features in one place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added some comments and avoided cloning the sets.
f8133b1
to
f914757
Compare
Ayyo @MrPowers |
use crate::kernel::{Action, ReaderFeatures, WriterFeatures}; | ||
use crate::table::state::DeltaTableState; | ||
|
||
lazy_static! { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this used for? The table reader and writers have Into<usize>
impls so you should just be able to cast them to find their associated reader / writer version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main intent is to have a uniform way of handling compatibility for tables on 3/7 and the ones below.
I.e. to check compatibility with a version we need all features leading up to that version, unless on 3/7 where we just need to check all features the table requires. So casting a specific features to a min version only helps for certain checks. So just enumerating all features per version allowed to have one consistent way of handling capability checks, that "should" keep on behaving correctly at we add more and more features to the supported feature list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I think I might not understand the check here. Are you saying these are for the chain of required features? Like check constraints for example requires append only and invariants? So maybe it's the name that throws me off if that's the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as usual, it depends 😆 - tables features as a concept are only available at reader/writer version 3/7. AT that version, clients can validate that they implement all features the table requires as stated in the Protocol
action. The respective fields must be available at these versions. For any reader/writer version below that, clients must assume that they need to implement all features up to the respective version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh I see, so this only matters when we are not dealing with actual table features and meant to enforce the "all or nothing" style of protocol versioning prior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly.
@rtyler @wjones127 - any chance you could approve this so we can land it on main? There seems to be quite some activity around table features right now, so i think it would be valualble to establish a baseline... |
1d65865
to
08ee8be
Compare
~~based on #1807~~ # Description In the effort to advance protocol support and move our internal APIs closer to the kernel library, it is advantageous to leverage the expression handling logic from kernel specifically for filtering actions etc. This PR just add the expression definitions and evaluation logic. Integrating it with our current codebase and basing the existing partition handling logic on this is left for follow up PRs to keep thigs review-able. related: #1894, #1776
Description
In preparation for further improvements to protocol support, this PR introduces a
ProtocolChecker
which validates that we can read form / write to / commit to a specific table. So far everything is expressed in table features in the hopes, that this keeps on giving correct behaviours as we add more table features.The existing support for append only is integrated and extended to handle enablement logic according to the protocol.