feat: implement transaction identifiers - continued #2539
After this PR, there is very little left that the `DeltaTableState` is actually doing. Most methods are just proxies to the methods on `EagerSnapshot`, which makes sense, since that also represents the state 😆.
After this, and getting the first kernel version in (#2495), I hope that we can finally drop the `DeltaTableState`.
...
- /// Convert actions to their json representation
- pub fn log_entry_from_actions<'a>(
-     actions: impl IntoIterator<Item = &'a Action>,
- ) -> Result<String, TransactionError> {
+ /// Obtain the byte representation of the commit.
+ pub fn get_bytes(&self) -> Result<bytes::Bytes, TransactionError> {
      let mut jsons = Vec::<String>::new();
-     for action in actions {
+     for action in &self.actions {
          let json = serde_json::to_string(action)
              .map_err(|e| TransactionError::SerializeLogJson { json_err: e })?;
          jsons.push(json);
      }
-     Ok(jsons.join("\n"))
- }
-
- /// Obtain the byte representation of the commit.
- pub fn get_bytes(&self) -> Result<bytes::Bytes, TransactionError> {
-     // Data MUST be read from the passed `CommitData`. Don't add data that is not sourced from there.
-     let actions = &self.actions;
-     Ok(bytes::Bytes::from(Self::log_entry_from_actions(actions)?))
+     Ok(bytes::Bytes::from(jsons.join("\n")))
Since I had to grok the commit logic a bit, it seemed like this was only separated for legacy reasons, and `log_entry_from_actions` was only ever supposed to be used with the data in `CommitData`.
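To illustrate the shape of the result, here is a minimal sketch of a commit body built only from the data held by the commit struct: one JSON object per action, joined by newlines. The `Action` enum and its hand-written `to_json` are hypothetical stand-ins so the example needs no dependencies; the code in this PR uses `serde_json::to_string` and returns `bytes::Bytes`, and real `txn` actions may carry more fields than shown here.

```rust
/// Toy action type (assumption for this sketch); the real crate has a much
/// richer `Action` enum that is serialized with `serde_json`.
pub enum Action {
    Txn { app_id: String, version: i64 },
    CommitInfo { operation: String },
}

impl Action {
    /// Hand-rolled JSON, for illustration only (no string escaping).
    fn to_json(&self) -> String {
        match self {
            Action::Txn { app_id, version } => {
                format!(r#"{{"txn":{{"appId":"{app_id}","version":{version}}}}}"#)
            }
            Action::CommitInfo { operation } => {
                format!(r#"{{"commitInfo":{{"operation":"{operation}"}}}}"#)
            }
        }
    }
}

/// Simplified stand-in for `CommitData`: it owns every action of the commit.
pub struct CommitData {
    pub actions: Vec<Action>,
}

impl CommitData {
    /// Serialize the commit as newline-delimited JSON.
    /// Everything written is sourced from `self.actions`, nothing else.
    pub fn get_bytes(&self) -> Vec<u8> {
        let jsons: Vec<String> = self.actions.iter().map(Action::to_json).collect();
        jsons.join("\n").into_bytes()
    }
}
```

Keeping the serialization on the struct itself means there is no free function that could be called with actions that never went through `CommitData`.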
- ) -> Result<PreCommit<'a>, CommitBuilderError> {
-     let data = CommitData::new(self.actions, operation, self.app_metadata)?;
-     Ok(PreCommit {
+ ) -> PreCommit<'a> {
+     let data = CommitData::new(
+         self.actions,
+         operation,
+         self.app_metadata,
+         self.app_transaction,
+     );
+     PreCommit {
          log_store,
          table_data,
          max_retries: self.max_retries,
          data,
          post_commit_hook: self.post_commit_hook,
-     })
+     }
This did not need to be fallible. A lot of the touched files are just updates that follow from this change.
if self.app_transaction_version.contains_key(app_id) {
    continue;
}
self.app_transaction_version.insert(
    app_id.to_owned(),
    Transaction {
        app_id: app_id.into(),
        version: ex::read_primitive(version, idx)?,
Sorry for a drive-by comment, but does this mean it will only keep the first version, since we `continue` if the hash map already has the key?
Sorry, `test_app_txn_visitor` answered this in the affirmative.
Is that the correct behavior, though? The Delta protocol says:
"Delta only ensures that the latest version for a given appId is available in the table snapshot."
The replay visits commits from latest to oldest. By keeping the first one we encounter, we are in fact only showing the latest.
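A minimal sketch of that reasoning, assuming the actions arrive newest first. `Transaction` here is a simplified stand-in and `latest_transactions` is a hypothetical helper, not the visitor from this PR:

```rust
use std::collections::HashMap;

/// Simplified stand-in for a Delta `txn` action (assumption for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub struct Transaction {
    pub app_id: String,
    pub version: i64,
}

/// Collect one transaction per `app_id`.
///
/// `newest_first` must yield actions starting from the most recent commit,
/// which is the order the log replay uses.
pub fn latest_transactions<I>(newest_first: I) -> HashMap<String, Transaction>
where
    I: IntoIterator<Item = Transaction>,
{
    let mut latest = HashMap::new();
    for txn in newest_first {
        // An id we have already seen came from a newer commit, so skip it.
        if latest.contains_key(&txn.app_id) {
            continue;
        }
        latest.insert(txn.app_id.clone(), txn);
    }
    latest
}
```

Feeding it `app-a` at version 3 followed by an older `app-a` at version 1 keeps version 3, which matches what the protocol asks for. If the input were ordered oldest first, the same loop would keep the stale version instead.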
Oh I see! Thanks!
/// actions.
pub fn app_transaction_version(&self) -> &HashMap<String, i64> {
    &self.app_transaction_version
/// HashMap containing the last transaction stored for every application.
Description
This is based on @Blajda's work in #2327 and aims to revive transaction identifiers.
This PR elaborates a bit on the `ReplayVisitor` concept introduced by David. Specifically, we moved things "one level down" to be tracked on the eager snapshot. This was mainly required to make it work with the commit flow and to properly handle updating the state after commits, without piping the visitors all the way through.
The nice thing about this is that we can isolate the mechanics of when and how we track additional actions in the `EagerSnapshot` and expose an interface that looks like what we might get from kernel, i.e. some opaque iterator over the respective actions.
Related Issue(s)
closes #2130
Documentation