feature: data-expire by partition info #3273
base: master
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the project's mailing list. Thank you for your contributions.
@lintingbin Thanks for the contribution. I left some comments; please take a look when you're free. Thanks.
...-ams/src/main/java/org/apache/amoro/server/optimizing/maintainer/IcebergTableMaintainer.java
amoro-ams/src/test/java/org/apache/amoro/server/optimizing/maintainer/TestDataExpire.java
@klion26 I have responded to the comments and made the modifications. Please review it again.
@lintingbin Thanks for rebasing the comments; the change LGTM. Let's see if there are any more comments from the community. Do we need to modify the corresponding documents?
There is no need to modify the documentation, since the parameters are unchanged.
@XBaith I have refactored the code by removing some unnecessary parts and optimizing the naming of variables to enhance readability. I also added a check for cases where transform is Void. Please review the code again.
I don't see any unit test for that case. Do you mean you test locally for
I wrote some unit tests for a case involving expiring partitions after dropping the partition field. Unfortunately, the current procedure cannot handle this scenario correctly.
Here’s the partition-level case:
- I set the partition field as `op_time` and inserted some records.
- Then, I removed the partition field before expiring the data.
Expected behavior: All records should be retained since the partition field has been removed.
Please fix this issue and add additional unit tests to cover similar scenarios. Thanks!
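The scenario above can be sketched in plain Java, without any Iceberg dependency. Everything here is hypothetical and for illustration only: the `Field` class, the `partitionLevelApplies` helper, and the modelling of a dropped field as either absent from the spec or kept with a `void` transform are assumptions, not the code under review. The point is only that partition-level expiration should be skipped, and all records retained, once the expiration column no longer backs a live partition field.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class DroppedPartitionFieldSketch {

  /** Hypothetical stand-in for a partition field: source column plus transform name. */
  public static final class Field {
    final String sourceColumn;
    final String transform; // e.g. "day", "identity", or "void" for a dropped field

    public Field(String sourceColumn, String transform) {
      this.sourceColumn = sourceColumn;
      this.transform = transform;
    }
  }

  /**
   * Returns true only if the expiration column still backs a live (non-void)
   * partition field in the current spec. When it returns false, a caller would
   * retain all records at the partition level (assumed behavior, per the review).
   */
  public static boolean partitionLevelApplies(List<Field> currentSpec, String expirationColumn) {
    for (Field f : currentSpec) {
      if (f.sourceColumn.equals(expirationColumn) && !"void".equals(f.transform)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<Field> beforeDrop = Arrays.asList(new Field("op_time", "day"));
    // Assumption: a dropped field is either kept with a void transform or removed entirely.
    List<Field> droppedAsVoid = Arrays.asList(new Field("op_time", "void"));
    List<Field> droppedEntirely = Collections.emptyList();

    System.out.println(partitionLevelApplies(beforeDrop, "op_time"));      // true
    System.out.println(partitionLevelApplies(droppedAsVoid, "op_time"));   // false
    System.out.println(partitionLevelApplies(droppedEntirely, "op_time")); // false
  }
}
```

A unit test for the reported case would then assert that no data file is expired once the check returns false.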
Sorry for the mistake; this bug existed prior to this PR. I will raise another PR to address it. Additionally, we should add more unit tests to cover vulnerabilities as much as possible. Thanks for your contribution!
Why are the changes needed?
Close #3272.
Brief change log
When the expiration field is the partition field and the expiration level is partition, prefer the partition information of each data file when deciding which data files to expire.
Modify the expected results of one test case, which now differ slightly from the previous implementation. Partition-level expiration compares dates, so data files whose partition date is the same as the expiration date are not expired.
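The date comparison described in the change log can be illustrated with a minimal sketch. It assumes a day-granularity partition value stored as days since 1970-01-01 (as Iceberg's `day` transform produces); the class name, method names, and the explicit time-zone parameter are hypothetical and are not taken from `IcebergTableMaintainer`.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class PartitionExpireSketch {

  /** Converts a day-partition ordinal (days since 1970-01-01) to a calendar date. */
  public static LocalDate partitionDate(int daysSinceEpoch) {
    return LocalDate.ofEpochDay(daysSinceEpoch);
  }

  /**
   * Decides from partition information alone whether a data file is expired.
   * The expiration instant is truncated to a date first, so a file whose
   * partition date equals the expiration date is kept.
   */
  public static boolean shouldExpire(int partitionDays, Instant expireBefore, ZoneId zone) {
    LocalDate cutoff = expireBefore.atZone(zone).toLocalDate();
    return partitionDate(partitionDays).isBefore(cutoff);
  }

  public static void main(String[] args) {
    Instant expireBefore = Instant.parse("2024-01-10T08:00:00Z");
    int jan9 = (int) LocalDate.of(2024, 1, 9).toEpochDay();
    int jan10 = (int) LocalDate.of(2024, 1, 10).toEpochDay();

    System.out.println(shouldExpire(jan9, expireBefore, ZoneOffset.UTC));  // true
    System.out.println(shouldExpire(jan10, expireBefore, ZoneOffset.UTC)); // false: same date
  }
}
```

Comparing at date granularity is what changes the test expectation: a row-level timestamp check could expire records written earlier on the cutoff day, whereas the partition-level check keeps the whole partition for that day.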
How was this patch tested?
Add some test cases that check the changes thoroughly including negative and positive cases if possible
Add screenshots for manual tests if appropriate
Run test locally before making a pull request
Documentation