feat(storage): support non_pk_prefix_watermark state cleaning #19889

Li0k · 2024-12-23T06:06:15Z

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

related to #18802

This PR supports non_pk_prefix_watermark state cleaning for Hummock.

Because handling non_pk_prefix_watermark relies on catalogs, it introduces additional overhead. This PR therefore does not guarantee read filtering for non_pk_prefix_watermark; it only handles cleaning of expired data.

The changes are as follows:

Watermarks of type non_pk_prefix_watermark are not added to ReadWatermarkIndex.
The state table supports writing and serializing non_pk_prefix_watermark.
The compaction catalog agent supports getting the watermark serde.
The skip watermark iterator supports filtering by non_pk_prefix_watermark.
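
To illustrate the first item, here is a minimal sketch of a read-side index that only accepts pk-prefix watermarks. The types below are simplified stand-ins written for this example: the `NonPkPrefix` variant name, the `add` and `contains` methods, and the `u32` table id are assumptions, not the actual Hummock definitions.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the watermark type attached to a table's watermarks.
/// The `NonPkPrefix` variant name is an assumption made for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum WatermarkSerdeType {
    PkPrefix,
    NonPkPrefix,
}

/// Hypothetical, heavily simplified read-side index: table id -> watermark bytes.
#[derive(Default)]
pub struct ReadWatermarkIndex {
    by_table: BTreeMap<u32, Vec<u8>>,
}

impl ReadWatermarkIndex {
    /// Registers a watermark for read filtering.
    /// Returns `false` (and stores nothing) for non-pk-prefix watermarks,
    /// so reads are never filtered by them.
    pub fn add(&mut self, table_id: u32, ty: WatermarkSerdeType, watermark: Vec<u8>) -> bool {
        match ty {
            WatermarkSerdeType::PkPrefix => {
                self.by_table.insert(table_id, watermark);
                true
            }
            WatermarkSerdeType::NonPkPrefix => false,
        }
    }

    /// Whether reads on `table_id` would be filtered by a watermark.
    pub fn contains(&self, table_id: u32) -> bool {
        self.by_table.contains_key(&table_id)
    }
}
```

With this shape, a table that only has a non-pk-prefix watermark is simply absent from the index, which matches the statement that read filtering is not guaranteed for it.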

Checklist

I have written necessary rustdoc comments.
I have added necessary unit tests and integration tests.
I have added test labels as necessary.
I have added fuzzing tests or opened an issue to track them.
My PR contains breaking changes.
My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
My PR contains critical fixes that are necessary to be merged into the latest release.

Documentation

My PR needs documentation updates.

Release note


src/storage/src/hummock/store/version.rs

src/meta/src/hummock/manager/compaction/mod.rs

hzxa21 · 2025-01-03T08:16:31Z

src/meta/src/hummock/manager/compaction/mod.rs

+                .table_watermarks
+                .iter()
+                .filter_map(|(table_id, table_watermarks)| {
+                    if table_id_with_pk_prefix_watermark.contains(table_id) {


We already have a WaterMarkType defined in the version, so why don't we just use that to filter out tables with a non pk prefix watermark?

Also, if we filter out the non pk prefix watermark here, how can the compactor retrieve it? Based on the logic here, it seems that we rely on the non pk prefix watermark being present in the compact task.

Good catch, we should filter the watermarks by WaterMarkType directly.

Also, the filtered results are passed only to the picker, while all relevant watermarks (pk or non-pk) are still passed to the compactor.

src/storage/hummock_sdk/src/compact_task.rs

src/storage/hummock_sdk/src/table_watermark.rs

src/storage/hummock_sdk/Cargo.toml

hzxa21 · 2025-01-03T08:32:49Z

src/storage/src/hummock/iterator/skip_watermark.rs

@@ -42,10 +47,14 @@ pub struct SkipWatermarkIterator<I> {
 }

 impl<I: HummockIterator<Direction = Forward>> SkipWatermarkIterator<I> {


nit: since SkipWatermarkIterator is only used by the compactor, how about moving skip_watermark.rs into src/hummock/compactor?

Of course, I will propose a separate PR for it.

hzxa21 · 2025-01-03T08:41:45Z

src/storage/src/hummock/iterator/skip_watermark.rs

+                                            });
+                                    let watermark_col_in_pk =
+                                        row.datum_at(*watermark_col_idx_in_pk);
+                                    cmp_datum(


IIUC, if cmp_datum returns Equal | Greater, then based on the logic in L360 the watermark will be advanced. I think this is incorrect for a non pk prefix watermark, because the non pk prefix watermark and the pk don't have the same ordering.
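
The ordering concern can be shown with a toy example. This is only a sketch under stated assumptions: the `(user_id, event_ts)` key shape and both function names are invented for illustration and are not the SkipWatermarkIterator logic. The watermark is on `event_ts`, the second pk column, so keys sorted by pk are not sorted by the watermark column.

```rust
/// Hypothetical key: (user_id, event_ts). Keys are sorted by the whole tuple,
/// but the watermark applies to `event_ts`, which is not a pk prefix.
type Key = (u32, u64);

/// Scan that stops at the first key whose watermark column reaches the watermark.
/// This is only valid when key order and watermark-column order agree.
pub fn expired_assuming_same_order(sorted_keys: &[Key], watermark: u64) -> Vec<Key> {
    sorted_keys
        .iter()
        .copied()
        .take_while(|(_, ts)| *ts < watermark)
        .collect()
}

/// Scan that checks every key independently against the watermark.
pub fn expired_per_key(sorted_keys: &[Key], watermark: u64) -> Vec<Key> {
    sorted_keys
        .iter()
        .copied()
        .filter(|(_, ts)| *ts < watermark)
        .collect()
}
```

For keys `[(1, 5), (1, 20), (2, 3), (2, 30)]` and watermark `10`, the first function stops at `(1, 20)` and misses the expired key `(2, 3)`, while the per-key check finds both expired keys.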


hzxa21 · 2025-01-08T13:45:21Z

src/meta/src/hummock/manager/compaction/mod.rs

+            let table_watermarks = version
+                .latest_version()
+                .table_watermarks
+                .iter()
+                .filter_map(|(table_id, table_watermarks)| {
+                    if matches!(
+                        table_watermarks.watermark_type,
+                        WatermarkSerdeType::PkPrefix,
+                    ) {
+                        Some((*table_id, table_watermarks.clone()))
+                    } else {
+                        None
+                    }
+                })
+                .collect();


Actually, why don't we do the filtering inside the picker instead, as is done here, given that the watermark type is part of TableWatermarks:

risingwave/src/meta/src/hummock/compaction/selector/vnode_watermark_selector.rs

Line 53 in 5ed4920

let table_watermarks =

We can avoid cloning the table watermark, which can be large given that it stores bytes from user data, with no harm.
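
A rough sketch of what filtering by reference could look like. The struct layout, the `NonPkPrefix` variant, the `watermark_bytes` field, and the `pk_prefix_watermarks` helper are assumptions made for this example and do not come from the codebase.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in; the `NonPkPrefix` variant name is an assumption.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum WatermarkSerdeType {
    PkPrefix,
    NonPkPrefix,
}

/// Hypothetical, simplified table watermarks. `watermark_bytes` stands in for
/// the potentially large payload derived from user data.
pub struct TableWatermarks {
    pub watermark_type: WatermarkSerdeType,
    pub watermark_bytes: Vec<u8>,
}

/// Yields only the pk-prefix watermarks, borrowing from the map instead of cloning.
pub fn pk_prefix_watermarks<'a>(
    all: &'a BTreeMap<u32, TableWatermarks>,
) -> impl Iterator<Item = (u32, &'a TableWatermarks)> + 'a {
    all.iter().filter_map(|(table_id, watermarks)| {
        // Keep the entry only when its type is PkPrefix; no bytes are copied.
        matches!(watermarks.watermark_type, WatermarkSerdeType::PkPrefix)
            .then(|| (*table_id, watermarks))
    })
}
```

Because the iterator hands out references, the picker can skip non-pk-prefix entries while the full map stays available, unmodified, for the compactor.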

hzxa21 · 2025-01-08T13:56:15Z

src/storage/src/hummock/iterator/skip_watermark.rs

+                                }
+                                WatermarkSerdeType::Serde(_watermark) => {
+                                    // do not skip the non-pk prefix watermark when vnode is the same
+                                    return false;


I am afraid this is still incorrect based on the semantics of advance_watermark:

/// Return a flag indicating whether the current key will be filtered by the current watermark.

If we always return false here when the table and vnode are the same, none of the keys can be filtered by the watermark. Please carefully walk through the logic of advance_watermark, should_delete and advance_key_and_watermark. I am still concerned that the implementations of SkipWatermarkState and SkipWatermarkIterator rely on the assumption that the key ordering and the watermark ordering are the same, and that we may still miss some changes.
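
One way to read this concern is that, for a non pk prefix watermark, the delete decision has to be made per key rather than by advancing through sorted keys. The sketch below illustrates that idea only; it is not the fix adopted in this PR. `DecodedKey`, `NonPkPrefixWatermark`, their fields, and the assumption of an ascending `i64` watermark column are all hypothetical.

```rust
/// Hypothetical decoded view of a full key: the watermark column has already
/// been extracted from the pk (for example via a catalog-provided serde).
pub struct DecodedKey {
    pub table_id: u32,
    pub vnode: u16,
    pub watermark_col: i64,
}

/// Hypothetical non-pk-prefix watermark for one (table, vnode) pair.
/// Assumes an ascending watermark: values below `value` are expired.
pub struct NonPkPrefixWatermark {
    pub table_id: u32,
    pub vnode: u16,
    pub value: i64,
}

/// Per-key decision: the key is filtered only if it belongs to the same table
/// and vnode and its watermark column is strictly below the watermark.
pub fn should_delete(key: &DecodedKey, watermark: &NonPkPrefixWatermark) -> bool {
    key.table_id == watermark.table_id
        && key.vnode == watermark.vnode
        && key.watermark_col < watermark.value
}
```

Unlike an unconditional `return false`, this predicate can still filter expired keys, and it makes no assumption about how keys are ordered relative to the watermark column.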

feat(storage): basic of non_pk_watermark state clean

605f235

github-actions bot added type/feature ci/run-e2e-single-node-tests ci/run-e2e-test-other-backends labels Dec 23, 2024

Li0k added 2 commits December 23, 2024 15:27

feat(storage): ignore non_pk_prefix_watermark compaction

501d374

Merge branch 'main' of https://github.com/risingwavelabs/risingwave i…

3544c0e

…nto li0k/storage_non_pk_watermark_clean

Li0k changed the title ~~feat(storage): non_pk_watermark state clean~~ WIP: feat(storage): non_pk_watermark state clean Dec 23, 2024

Li0k marked this pull request as ready for review December 23, 2024 07:28

github-actions bot added the Invalid PR Title label Dec 23, 2024

fix ut

d1a39a8

graphite-app bot requested a review from a team December 23, 2024 08:18

Li0k added 4 commits December 23, 2024 17:01

fix panic

7c3f521

Merge branch 'main' of https://github.com/risingwavelabs/risingwave i…

e3dbc73

…nto li0k/storage_non_pk_watermark_clean

refactor(storage): refactor watermark type

b71eff9

Merge branch 'main' of https://github.com/risingwavelabs/risingwave i…

9e0af8e

…nto li0k/storage_non_pk_watermark_clean

Li0k requested a review from a team as a code owner December 25, 2024 12:20

Li0k requested a review from xxchan December 25, 2024 12:20

Li0k added 10 commits December 25, 2024 20:22

typo

74336d6

fix(storage): fix wateramrk_col_idx_in_pk

96de9ba

Merge branch 'main' of https://github.com/risingwavelabs/risingwave i…

3127678

…nto li0k/storage_non_pk_watermark_clean

fix check

49a48ad

Merge branch 'main' of https://github.com/risingwavelabs/risingwave i…

bb7a29b

…nto li0k/storage_non_pk_watermark_clean

refactor

6b0b295

typo

3c23aa3

Merge branch 'main' of https://github.com/risingwavelabs/risingwave i…

3113463

…nto li0k/storage_non_pk_watermark_clean

fix panic

b2e158e

Merge branch 'main' of https://github.com/risingwavelabs/risingwave i…

fd308de

…nto li0k/storage_non_pk_watermark_clean

Li0k changed the title ~~WIP: feat(storage): non_pk_watermark state clean~~ feat(storage): support non_pk_prefix_watermark state cleaning Dec 30, 2024

github-actions bot removed the Invalid PR Title label Dec 30, 2024

typo

bf28307

Li0k added 2 commits December 30, 2024 14:53

typo

369d718

Merge branch 'main' of https://github.com/risingwavelabs/risingwave i…

3500061

…nto li0k/storage_non_pk_watermark_clean

Li0k requested review from hzxa21, st1page and chenzl25 December 30, 2024 07:32

hzxa21 reviewed Jan 3, 2025


Li0k added 2 commits January 8, 2025 16:46

address comments

ef4c752

Merge branch 'main' of https://github.com/risingwavelabs/risingwave i…

5ed4920

…nto li0k/storage_non_pk_watermark_clean

hzxa21 reviewed Jan 8, 2025


refactor

8130c61
