How to customize key_properties for table clustering #104

Open
xiangshiyin opened this issue Oct 17, 2024 · 1 comment

Comments

@xiangshiyin

The documentation for the cluster_on_key_properties config option says:

Determines whether to cluster on the key properties from the tap. Defaults to false. When false, clustering will be based on _sdc_batched_at instead.

The code confirms that key_properties defines the clustering key set and is also used as the primary key in the merge operation.
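
To make that reading concrete, here is a minimal sketch of the behavior as we understand it from the doc. It is not the target's actual code: the function names pick_clustering_fields and merge_on_clause are hypothetical, and falling back to _sdc_batched_at when key_properties is empty is our assumption.

```python
from typing import List


def pick_clustering_fields(
    key_properties: List[str],
    cluster_on_key_properties: bool = False,
) -> List[str]:
    """Hypothetical helper mirroring the documented config behavior."""
    # Documented: when the flag is false, cluster on _sdc_batched_at.
    # Assumption: an empty key_properties list also falls back to it.
    if cluster_on_key_properties and key_properties:
        return list(key_properties)
    return ["_sdc_batched_at"]


def merge_on_clause(key_properties: List[str]) -> str:
    """Hypothetical helper: join condition for a MERGE keyed on key_properties."""
    return " AND ".join(
        f"target.`{key}` = source.`{key}`" for key in key_properties
    )


print(pick_clustering_fields(["id", "updated_at"], cluster_on_key_properties=True))
print(pick_clustering_fields(["id", "updated_at"]))
print(merge_on_clause(["id", "updated_at"]))
```

The point of the sketch is that the same list drives both the clustering columns and the merge condition, so the two cannot currently be chosen independently.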

Does anyone know how the default key_properties value is determined, and whether it can be customized? Thanks!

@xiangshiyin
Author

After some further digging, we believe the key_properties here are the primary keys defined in the stream. With the sink's current clustering key configuration, the BQ table clustering is determined directly by the order of the columns in the primary key defined in the incoming stream. It would be good to have a configurable parameter on the sink so that we have more flexibility.
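
As a rough illustration of the request, the sketch below reads key_properties from a Singer SCHEMA message and lets an optional override take precedence. The override argument stands in for a sink-level setting that does not exist today, and the stream and column names are made up for the example.

```python
import json
from typing import List, Optional

# BigQuery allows at most four clustering columns, and their order matters.
MAX_BQ_CLUSTERING_COLUMNS = 4

# Example Singer SCHEMA message; the stream and columns are hypothetical.
schema_line = json.dumps(
    {
        "type": "SCHEMA",
        "stream": "orders",
        "schema": {
            "properties": {
                "tenant_id": {"type": "string"},
                "order_id": {"type": "integer"},
            }
        },
        "key_properties": ["tenant_id", "order_id"],
    }
)


def resolve_clustering_fields(
    schema_message: dict,
    override: Optional[List[str]] = None,
) -> List[str]:
    """Hypothetical: prefer a configured override, else the stream's keys."""
    fields = override if override else schema_message.get("key_properties", [])
    # Keep the given order, since it decides the clustering order.
    return list(fields)[:MAX_BQ_CLUSTERING_COLUMNS]


message = json.loads(schema_line)
print(resolve_clustering_fields(message))
print(resolve_clustering_fields(message, override=["order_id", "tenant_id"]))
```

With no override, the clustering order is whatever the tap emits in key_properties; with an override, the sink could cluster on a different order or subset while the merge still uses the stream's primary key.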

cc. @epapineau
