[CDCSDK] "Snapshot too old" error when dynamic table addition is disabled #24781

Open
yugabyte-ci opened this issue Nov 5, 2024 · 0 comments
Labels: area/cdcsdk, jira-originated, kind/bug, priority/highest

yugabyte-ci commented Nov 5, 2024

Jira Link: DB-13877

The error is tied to the time as of which we read the publication details, i.e. last_pub_refresh_time. In a YugabyteDB cluster, if the dynamic table support flag cdcsdk_enable_dynamic_table_support is disabled, the last_pub_refresh_time value is never updated. It stays at the initial value assigned to it at the time of slot creation, i.e. the consistent snapshot time.

On every task restart, we read the sys catalog as of last_pub_refresh_time to get the publication details. With the flag disabled, that read happens at a timestamp equal to the consistent snapshot time (set at the time of slot creation). Once the system catalog history for that timestamp has been cleared, the read fails with the “Snapshot too old” error.
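
The failure mode can be illustrated with a toy model. This is only a sketch of the reasoning, not YugabyteDB code: the names SlotState, read_catalog_as_of, refresh_publication and restart_task are hypothetical, timestamps are plain integer seconds, and the retention check is a simplification of how history cleanup actually works.

```python
from dataclasses import dataclass


class SnapshotTooOld(Exception):
    """Raised when a read targets a time older than the retained history."""


@dataclass
class SlotState:
    # Hypothetical stand-in for the slot metadata discussed above.
    last_pub_refresh_time: int
    dynamic_tables_enabled: bool


def read_catalog_as_of(read_time: int, now: int, retention_sec: int) -> str:
    # Assumption: history older than (now - retention_sec) has been cleared.
    history_cutoff = now - retention_sec
    if read_time < history_cutoff:
        raise SnapshotTooOld(
            f"read time {read_time} is older than history cutoff {history_cutoff}"
        )
    return "publication details"


def refresh_publication(slot: SlotState, now: int) -> None:
    # last_pub_refresh_time only advances when dynamic table support is on.
    if slot.dynamic_tables_enabled:
        slot.last_pub_refresh_time = now


def restart_task(slot: SlotState, now: int, retention_sec: int) -> str:
    # On restart, the catalog is read as of last_pub_refresh_time.
    return read_catalog_as_of(slot.last_pub_refresh_time, now, retention_sec)
```

In this model, a slot created at time 0 with dynamic table support disabled keeps last_pub_refresh_time at 0, so any restart later than the retention interval raises SnapshotTooOld. With the flag enabled, each refresh moves the read time forward and the restart succeeds.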

Workaround:
The current slot is no longer usable and needs to be dropped. The identified and tested workaround is to enable the dynamic table addition preview flag and to increase the system catalog history retention. To apply the workaround, follow these steps:

  1. The other retention flags guarantee that the system can recover within x hours, so the system catalog must keep its history for the same period of time.
    a. Set the tserver GFlag timestamp_syscatalog_history_retention_interval_sec to match x hours
  2. Drop the existing slot.
  3. Add the tserver GFlag cdcsdk_enable_dynamic_table_support to allowed_preview_flags_csv
    a. Set the flag to true. This is required so that last_pub_refresh_time keeps being updated.
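
The steps above can be sketched as a small helper that derives the flag values from the recovery window. This is an illustration under assumptions, not an official procedure: the function names are hypothetical, the slot name is a placeholder, the `--name=value` rendering is just the usual gflags command-line form, and the drop statement assumes the standard PostgreSQL function pg_drop_replication_slot is used to drop the slot.

```python
from typing import Dict, List


def workaround_flags(recovery_window_hours: float) -> Dict[str, str]:
    """Return tserver GFlag values for the workaround (hypothetical helper)."""
    # Step 1a: catalog history retention must cover the same x hours.
    retention_sec = int(recovery_window_hours * 3600)
    return {
        "timestamp_syscatalog_history_retention_interval_sec": str(retention_sec),
        # Step 3: allow the preview flag. If other preview flags are already
        # allowed on the cluster, append to the existing list instead.
        "allowed_preview_flags_csv": "cdcsdk_enable_dynamic_table_support",
        # Step 3a: turn the flag on so last_pub_refresh_time keeps advancing.
        "cdcsdk_enable_dynamic_table_support": "true",
    }


def as_command_line_args(flags: Dict[str, str]) -> List[str]:
    """Render the flags in --name=value form."""
    return [f"--{name}={value}" for name, value in flags.items()]


def drop_slot_sql(slot_name: str) -> str:
    """Step 2: build the statement that drops the unusable slot."""
    escaped = slot_name.replace("'", "''")  # escape single quotes for SQL
    return f"SELECT pg_drop_replication_slot('{escaped}');"
```

For example, with x = 4 hours, workaround_flags(4) sets the retention interval to 14400 seconds. How the flags are actually applied (command line, flag file, or a management tool) depends on the deployment.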
yugabyte-ci added the area/cdcsdk, jira-originated, kind/bug and priority/low labels on Nov 5, 2024
yugabyte-ci added the priority/highest label and removed the priority/low label on Nov 5, 2024