Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize remove action apply with early iteration exit #424 #431

Merged
merged 2 commits into from
Sep 18, 2021
Merged

Optimize remove action apply with early iteration exit #424 #431

merged 2 commits into from
Sep 18, 2021

Conversation

akshay26031996
Copy link
Contributor

Description

Right now, we use Iterator::retain to remove files from active file list when applying remove actions from the transaction log. This can be optimized by performing an early exit of the iteration when we reach the file that needs to be removed.

  1. Find the position of the file.
  2. Remove the file from the vector.

Related Issue(s)

state.files.retain(|a| *a.path != v.path);
let index = { state.files.iter().position(|a| *a.path == v.path) };
if let Some(index) = index {
state.files.remove(index);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the order is not important (I believe) swap_remove could be used, which is O(1) instead of O(n)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that order does not matter (although I am new to this repo). I am pushing the suggested change and the required changes in test. In test, I am sorting the vectors before comparing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a nice optimization, good call @Dandandan :)

@houqp houqp merged commit a733b69 into delta-io:main Sep 18, 2021
@houqp
Copy link
Member

houqp commented Sep 18, 2021

Thanks @akshay26031996 for your contribution.

@@ -15,14 +15,15 @@ async fn read_simple_table() {
assert_eq!(table.get_min_writer_version(), 2);
assert_eq!(table.get_min_reader_version(), 1);
assert_eq!(
table.get_files(),
table.get_files().sort(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sort returns ()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, @akshay26031996 could you send a PR to correct the tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted, let me go through this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize remove action apply with early iteration exit
4 participants