Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMORO-2875] Added metric for orphan file action #3004

Merged
merged 5 commits into from
Aug 16, 2024

Conversation

huyuanfeng2018
Copy link
Contributor

Why are the changes needed?

Close #2875

Brief change log

  • added TableOrphanFilesCleaningMetrics used to collect metrics generated by cleaning orphan files

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@github-actions github-actions bot added type:docs Improvements or additions to documentation module:ams-server Ams server module labels Jul 4, 2024
Copy link
Contributor

@XBaith XBaith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can try to collect more detailed information, such as the size of deleted files

Comment on lines 84 to 85
| table_slated_orphan_content_file_cleaning_count | Counter | catalog, database, table | Slated Count of orphan content files cleaned in the table since ams started |
| table_slated_orphan_metadata_file_cleaning_count | Counter | catalog, database, table | Slated Count of orphan metadata files cleaned in the table since ams started |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does slated_orphan_content_file and slated_orpha_metadata_file means?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indicates the calculated number of files expected to be deleted. Failure may occur during the deletion process, so there will also be an metric of the actual number of deleted files.

@czy006 czy006 modified the milestone: Release 0.8.0 Jul 16, 2024
@huyuanfeng2018
Copy link
Contributor Author

https://iceberg.apache.org/docs/nightly/spark-procedures/#output_7
@XBaith
It seems that iceberg only lists the deleted files and does not calculate the size. I looked at the code in Ams and found that the statistical size changes code are a bit large. I think the current file number indicator is good enough. WDYT?

@zhoujinsong
Copy link
Contributor

Hi, @huyuanfeng2018
There seem to be some conflicts that should be fixed.

Copy link
Contributor

@zhoujinsong zhoujinsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some small suggestion about the metric name, PTAL.

}

public static final MetricDefine TABLE_ORPHAN_CONTENT_FILE_CLEANING_COUNT =
defineCounter("table_orphan_content_cleaning_count")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is table_orphan_content_file_cleaning_count proper here, being consistent with the table_orphan_metadata_file_cleanning_count ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, there was a mistake here, it's been fixed

.build();

public static final MetricDefine TABLE_SLATED_ORPHAN_CONTENT_FILE_CLEANING_COUNT =
defineCounter("table_slated_orphan_content_file_cleaning_count")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the name table_expected_orphan_content_file_cleaning_count easier to understand here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally inherited the style of the old code and named it slated, but now it seems expected is easier to understand.

@zhoujinsong
Copy link
Contributor

@huyuanfeng2018 Some conflicts seem to need to be fixed.

# Conflicts:
#	amoro-ams/amoro-ams-server/src/main/java/org/apache/amoro/server/table/TableRuntime.java
@huyuanfeng2018
Copy link
Contributor Author

@huyuanfeng2018 Some conflicts seem to need to be fixed.

fixed & PTAL.

Copy link
Contributor

@zhoujinsong zhoujinsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@zhoujinsong zhoujinsong merged commit 923d81b into apache:master Aug 16, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:ams-server Ams server module type:docs Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Improvement]: Action clean-orphan-files should report metric
5 participants