Support a checkpointing mechanism for kafkasql storage #3194
Comments
Possibly use H2 online backup/restore to implement checkpointing (see http://www.h2database.com/html/tutorial.html#upgrade_backup_restore). We would need to back up the DB to a persistent volume and also store the location in the Kafka topic (journal) that corresponds to the checkpoint. Then, on pod restart, we can restore the DB from the saved backup, skip ahead in the topic to the correct message, and pick up consuming messages from there.
Note: this should hopefully be easier to do since we sequence writes to the H2 DB. We should probably use the existing consumer thread to perform the checkpointing task, perhaps by simply adding a message to the topic when the checkpoint should occur.
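To make the idea in the two comments above concrete, here is a minimal, language-neutral sketch in Python. It is not the Registry implementation: a plain list stands in for the Kafka journal topic, a dict stands in for the H2 DB, and `store` stands in for the persistent volume. The `CHECKPOINT` marker type and the names `apply_message`, `consume`, and `restart` are hypothetical and chosen only for illustration.

```python
import copy

CHECKPOINT = "checkpoint"  # hypothetical marker message type in the journal


def apply_message(state, msg):
    """Apply one journal message to the in-memory stand-in for the DB."""
    kind, key, value = msg
    if kind == "put":
        state[key] = value
    elif kind == "delete":
        state.pop(key, None)


def consume(journal, state, start, store):
    """Read journal[start:] in order and return how many messages were read.

    When a checkpoint marker is seen, save a snapshot plus the offset of the
    next message. The marker is handled by the same sequential consumer, so
    the snapshot reflects exactly the messages before that offset.
    """
    read = 0
    for offset in range(start, len(journal)):
        msg = journal[offset]
        read += 1
        if msg[0] == CHECKPOINT:
            store["snapshot"] = copy.deepcopy(state)
            store["next_offset"] = offset + 1
        else:
            apply_message(state, msg)
    return read


def restart(journal, store):
    """Restore the latest snapshot (if any), then skip ahead in the journal."""
    state = copy.deepcopy(store.get("snapshot", {}))
    read = consume(journal, state, store.get("next_offset", 0), store)
    return state, read


if __name__ == "__main__":
    journal = [
        ("put", "a", 1),
        ("put", "b", 2),
        (CHECKPOINT, None, None),
        ("delete", "a", None),
        ("put", "c", 3),
    ]
    store = {}
    first = {}
    print(consume(journal, first, 0, store), first)  # cold start reads everything
    print(restart(journal, store))                   # restart reads only the tail
```

In this sketch a restart replays only the messages after the marker, which is where the startup-time saving would come from. In a real deployment the snapshot would be the DB backup on the persistent volume rather than a deep copy.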
Suggested solution using Kafka to store the checkpoint data
Two new topics
The first topic will contain the checkpoint data itself (analogous to the export functionality we have currently, but dumping the data into a topic). The second topic will ensure that only a single Registry instance performs checkpointing at a time, and will provide metadata for loading checkpoints. An instance announces its intention to create a checkpoint, so that the other pods do not do so as well. This requires some additional recovery logic, in case the instance fails or is restarted during the process.
Creating a checkpoint
Each coordination message contains a unique instance ID that is regenerated after restart.
Reading a checkpoint
Failures and recovery
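As a rough illustration of the coordination topic and one possible recovery rule, here is a small Python sketch. It rests on several assumptions that are not stated in the proposal: the message fields (`type`, `instance`, `ts`), the `intent`/`done` message types, and in particular the timeout after which an unfinished claim is treated as abandoned are all hypothetical. A list stands in for the ordered coordination topic.

```python
import uuid


def new_instance_id():
    """Instance IDs are regenerated on every start, as in the proposal."""
    return uuid.uuid4().hex


def current_owner(log, now, timeout):
    """Walk the ordered log and return the instance holding the claim, or None.

    Assumption: a claim with no matching 'done' expires after `timeout`, which
    covers an instance that failed or restarted (and so has a new ID).
    """
    owner = None  # (instance_id, claimed_at)
    for msg in log:
        if owner is not None and msg["ts"] - owner[1] > timeout:
            owner = None  # the earlier claim had expired by this point
        if msg["type"] == "intent" and owner is None:
            owner = (msg["instance"], msg["ts"])
        elif msg["type"] == "done" and owner and owner[0] == msg["instance"]:
            owner = None
    if owner is not None and now - owner[1] > timeout:
        owner = None
    return owner[0] if owner else None


def try_claim(log, instance_id, now, timeout):
    """Announce the intention, then read the log back to see who won."""
    log.append({"type": "intent", "instance": instance_id, "ts": now})
    return current_owner(log, now, timeout) == instance_id


def finish(log, instance_id, now):
    """Announce that the checkpoint has been completed."""
    log.append({"type": "done", "instance": instance_id, "ts": now})
```

Because every instance evaluates the same ordered log, two instances that announce at nearly the same time agree on the winner: the first unexpired intent in the log. An instance that restarts gets a new ID, so it cannot resume its old claim and must wait for that claim to expire like any other pod.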
Very interesting proposal. I have some general comments that might inspire you to simplify/modify your ideas. Or maybe not. :)
Feature or Problem Description
Support a checkpointing mechanism for kafkasql storage. This will help with startup times.
Proposed Solution
TBD
Additional Context
Related Apicurio/apicurio-registry-operator#197