Feature Request: add horizon db reingest fill-gaps command #4010

Closed
tamirms opened this issue Oct 18, 2021 · 0 comments · Fixed by #4060

tamirms commented Oct 18, 2021

What problem does your feature solve?

When running the horizon db reingest range command with parallel workers, the entire range to be ingested is divided among several threads. If the command crashes (because the user aborts it or because of an operational error), the range is left partially ingested. Because the range was being ingested concurrently, there is no guarantee that Horizon's ingestion history will be left in a contiguous state.

For example, let's say we are reingesting from ledger 1 to ledger 1,000 with 10 workers. After some time, the command is aborted. It is possible that every worker finished its ingestion task except the one that was ingesting the subrange from ledger 500 to 600.

Thus, there is a gap in history from ledger 500 to 600.
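To make the scenario concrete, here is a minimal Go sketch of how a range might be divided among workers. It is an illustration only: `LedgerRange` and `splitRange` are hypothetical names, and the even one-subrange-per-worker split is an assumption, not necessarily how horizon db reingest range schedules its work. With this particular split, the sixth worker owns ledgers 501 to 600, which roughly matches the example above.

```go
package main

import "fmt"

// LedgerRange is a hypothetical inclusive range of ledger sequence numbers.
type LedgerRange struct {
	From, To uint32
}

// splitRange divides [from, to] into at most `workers` contiguous,
// inclusive subranges of near-equal size. This is an assumed scheduling
// strategy used for illustration, not Horizon's actual one.
func splitRange(from, to uint32, workers int) []LedgerRange {
	if workers < 1 || to < from {
		return nil
	}
	total := uint64(to-from) + 1
	if uint64(workers) > total {
		workers = int(total) // never create empty subranges
	}
	size := total / uint64(workers)
	extra := total % uint64(workers) // the first `extra` subranges get one more ledger

	ranges := make([]LedgerRange, 0, workers)
	next := uint64(from)
	for i := 0; i < workers; i++ {
		n := size
		if uint64(i) < extra {
			n++
		}
		ranges = append(ranges, LedgerRange{uint32(next), uint32(next + n - 1)})
		next += n
	}
	return ranges
}

func main() {
	// Simulate an abort in which only the sixth worker did not finish.
	const unfinished = 5
	for i, r := range splitRange(1, 1000, 10) {
		status := "ingested"
		if i == unfinished {
			status = "MISSING (gap)"
		}
		fmt.Printf("worker %d: ledgers %d-%d %s\n", i, r.From, r.To, status)
	}
}
```

Because the subranges on both sides of the unfinished one are complete, the missing ledgers sit in the middle of history rather than at its end, which is why simply resuming from the latest ingested ledger would not repair them.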

What would you like to see?

We should implement a horizon db reingest fill-gaps command.

horizon db reingest fill-gaps can be called with no parameters in which case it queries for any existing gaps in the horizon db and then proceeds to ingest history to fill the gaps.

The command can also be called with start and end ledger parameters, in which case it will only query for gaps within the provided range.

For example, horizon db reingest fill-gaps 1 1000 will fill any gaps occurring within the range 1-1000.
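The bounded gap query could behave roughly like the following Go sketch. It works on an in-memory list of already ingested ranges purely for illustration; the real command would query the Horizon database instead, and `LedgerRange` and `findGaps` are hypothetical names that do not come from the Horizon codebase.

```go
package main

import (
	"fmt"
	"sort"
)

// LedgerRange is a hypothetical inclusive range of ledger sequence numbers.
type LedgerRange struct {
	From, To uint32
}

// findGaps returns the subranges of [from, to] that are not covered by
// any of the ingested ranges. The ingested ranges may be unsorted and
// may overlap; the input slice is not modified.
func findGaps(ingested []LedgerRange, from, to uint32) []LedgerRange {
	if to < from {
		return nil
	}
	sorted := append([]LedgerRange(nil), ingested...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].From < sorted[j].From })

	var gaps []LedgerRange
	next := uint64(from) // first ledger not yet known to be covered
	for _, r := range sorted {
		if r.To < r.From || uint64(r.To) < next {
			continue // malformed, or entirely before the uncovered region
		}
		if r.From > to {
			break // everything from here on lies past the queried range
		}
		if uint64(r.From) > next {
			gaps = append(gaps, LedgerRange{uint32(next), r.From - 1})
		}
		next = uint64(r.To) + 1
		if next > uint64(to) {
			return gaps
		}
	}
	// Anything left between the last covered ledger and `to` is a gap.
	gaps = append(gaps, LedgerRange{uint32(next), to})
	return gaps
}

func main() {
	// History after the aborted reingest described above.
	ingested := []LedgerRange{{1, 499}, {601, 1000}}
	fmt.Println(findGaps(ingested, 1, 1000))
	fmt.Println(findGaps(ingested, 1, 550))
}
```

Note that the returned gaps are clamped to the requested bounds, so a query for 1-550 would only report the part of the gap that falls inside that range.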

We should also be able to configure the same parallel worker command line parameters that are used by horizon db reingest range so that fill-gaps can also take advantage of concurrency.
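One way concurrency could apply to gap filling is a small worker pool that pulls gaps from a shared queue. The Go sketch below is only a rough illustration under stated assumptions: `fillGaps`, `LedgerRange`, and the `ingest` callback are hypothetical stand-ins for whatever Horizon uses to reingest a range, and the error in `main` is simulated.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// LedgerRange is a hypothetical inclusive range of ledger sequence numbers.
type LedgerRange struct {
	From, To uint32
}

// errSimulated stands in for an operational error during ingestion.
var errSimulated = errors.New("simulated ingestion failure")

// fillGaps calls ingest for every gap using up to `workers` goroutines
// and returns the gaps whose ingestion failed, in no particular order.
// `ingest` is a hypothetical callback, not a Horizon API.
func fillGaps(gaps []LedgerRange, workers int, ingest func(LedgerRange) error) []LedgerRange {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan LedgerRange)
	var (
		mu     sync.Mutex // guards failed
		failed []LedgerRange
		wg     sync.WaitGroup
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for g := range jobs {
				if err := ingest(g); err != nil {
					mu.Lock()
					failed = append(failed, g)
					mu.Unlock()
				}
			}
		}()
	}
	for _, g := range gaps {
		jobs <- g
	}
	close(jobs)
	wg.Wait()
	return failed
}

func main() {
	gaps := []LedgerRange{{500, 600}, {1200, 1250}, {3000, 3100}}
	failed := fillGaps(gaps, 2, func(g LedgerRange) error {
		if g.From == 1200 {
			return errSimulated
		}
		return nil
	})
	fmt.Println("gaps still unfilled:", failed)
}
```

Returning the failed gaps instead of stopping at the first error is a design choice of this sketch: a later run would then only need to retry what is still missing.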

What alternatives are there?

There is a horizon db detect-gaps command which examines the horizon db to find any gaps in history. So, horizon operators could call horizon db detect-gaps and then fill the gaps manually with several horizon db reingest commands. This workflow is doable for operators who maintain a single horizon instance, but it becomes cumbersome and error-prone when you are managing many horizon instances.
