Handoff between real-time and historical processing #216

Closed
1 task done
Tracked by #418
Ishatt opened this issue Sep 26, 2023 · 2 comments

Ishatt commented Sep 26, 2023

Description

Problem:

Currently we do not coordinate between real-time and historical processing; indexers can process blocks out of order.

See the Confluence page Real-time to Historical communication for architectural concerns.

Proposed Solutions:

Append real-time blocks to be processed to the processing queue of the historical thread.

After the move to Redis, we could store a `historical_backfill_in_progress` flag in Redis for each function. The flag is set at the start of a historical backfill, and real-time processing skips blocks until both the flag is cleared and the historical queue is empty. Once the historical queue has been fully populated, the flag can be cleared.
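The flag-and-queue check described above could be sketched as follows. This is a minimal in-memory stand-in for the proposed Redis state, assuming one flag and one queue per function; the type and method names are illustrative, and a real implementation would use Redis commands (e.g. `SET`/`DEL` for the flag, a length check on the stream) instead of local maps.

```rust
use std::collections::{HashMap, VecDeque};

/// In-memory stand-in for the proposed per-function Redis state.
#[derive(Default)]
struct CoordinatorState {
    /// `historical_backfill_in_progress` flag per function.
    backfill_in_progress: HashMap<String, bool>,
    /// Queued historical block heights per function.
    historical_queue: HashMap<String, VecDeque<u64>>,
}

impl CoordinatorState {
    /// Set the flag at the start of a historical backfill.
    fn start_backfill(&mut self, function: &str) {
        self.backfill_in_progress.insert(function.to_string(), true);
    }

    fn enqueue_historical_block(&mut self, function: &str, height: u64) {
        self.historical_queue
            .entry(function.to_string())
            .or_default()
            .push_back(height);
    }

    /// Clear the flag once every historical block has been enqueued.
    fn finish_enqueueing(&mut self, function: &str) {
        self.backfill_in_progress.insert(function.to_string(), false);
    }

    /// Real-time processing proceeds only when the flag is cleared AND the
    /// historical queue has been drained.
    fn real_time_may_process(&self, function: &str) -> bool {
        let flag = self.backfill_in_progress.get(function).copied().unwrap_or(false);
        let queued = self.historical_queue.get(function).map_or(0, |q| q.len());
        !flag && queued == 0
    }
}
```

The key property is that clearing the flag alone is not enough: real-time stays paused until the already-enqueued historical blocks have also been processed.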

Linked Issues

- Epic (9 of 9 tasks): morgsmccauley

Relates to

No tasks being tracked yet.

roshaans commented Oct 9, 2023

@ailisp would like to create a SQL join query in QueryAPI, but is unable to do so without Hasura foreign-key relationships. Those cannot be created because QueryAPI currently processes the historical backfill in parallel with real-time blocks, and creating foreign-key relationships requires imposing an order on insertions:

If object X references object Y, X can be inserted before Y, resulting in a foreign-key constraint violation.
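As an illustrative sketch only (this is not QueryAPI's actual code), one way to avoid the violation without fully serialising the two pipelines would be to buffer a "child" row until its referenced "parent" row has been inserted. The `Row`/`OrderedInserter` names are hypothetical:

```rust
use std::collections::HashSet;

/// Hypothetical row with an optional foreign-key reference to another row.
#[derive(Clone, Debug, PartialEq)]
struct Row {
    id: u64,
    parent_ref: Option<u64>,
}

/// Defers rows whose referenced parent has not been inserted yet, instead of
/// letting the database reject them with a constraint violation.
struct OrderedInserter {
    inserted: HashSet<u64>, // ids already written to the database
    pending: Vec<Row>,      // rows waiting on their parent
}

impl OrderedInserter {
    fn new() -> Self {
        Self { inserted: HashSet::new(), pending: Vec::new() }
    }

    /// Insert a row if its parent (if any) exists; otherwise buffer it.
    /// Returns the ids actually flushed by this call, in insertion order.
    fn insert(&mut self, row: Row) -> Vec<u64> {
        let mut flushed = Vec::new();
        match row.parent_ref {
            Some(p) if !self.inserted.contains(&p) => self.pending.push(row),
            _ => {
                self.inserted.insert(row.id);
                flushed.push(row.id);
                // A newly inserted row may unblock buffered children.
                loop {
                    let ready: Vec<usize> = self.pending.iter().enumerate()
                        .filter(|(_, r)| {
                            r.parent_ref.map_or(true, |p| self.inserted.contains(&p))
                        })
                        .map(|(i, _)| i)
                        .collect();
                    if ready.is_empty() {
                        break;
                    }
                    for i in ready.into_iter().rev() {
                        let r = self.pending.remove(i);
                        self.inserted.insert(r.id);
                        flushed.push(r.id);
                    }
                }
            }
        }
        flushed
    }
}
```

Buffering trades memory for ordering, so the per-indexer handover discussed below is likely the cleaner long-term fix.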


morgsmccauley commented Oct 24, 2023

One way to solve this is for each indexer to have its own 'real-time' block process, rather than the shared one we have now. This would make the handover much simpler, as the process could simply start at the correct block height after the historical backfill has completed.
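The per-indexer handover could be sketched as a simple two-phase state machine: the process drains the historical range first, then switches to real-time at the next height. The `Phase`/`advance` names are illustrative, not the Coordinator's API:

```rust
/// Illustrative per-indexer block process: historical backfill first,
/// then real-time starting immediately after the backfilled range.
enum Phase {
    Historical { next: u64, end: u64 }, // backfill heights [next, end]
    RealTime { next: u64 },
}

/// Return the height to process now, plus the phase for the next step.
fn advance(phase: Phase) -> (u64, Phase) {
    match phase {
        Phase::Historical { next, end } if next < end => {
            (next, Phase::Historical { next: next + 1, end })
        }
        // Last historical block: hand over to real-time at end + 1.
        Phase::Historical { next, end: _ } => (next, Phase::RealTime { next: next + 1 }),
        Phase::RealTime { next } => (next, Phase::RealTime { next: next + 1 }),
    }
}
```

Because one process owns both phases, no block can be processed out of order, which also resolves the foreign-key ordering problem above.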

morgsmccauley added a commit that referenced this issue Nov 7, 2023
This PR updates Coordinator such that **only one** Historical Backfill
process can exist for a given Indexer Function. Currently, these
processes do not have awareness of each other, meaning a developer can
trigger multiple simultaneous backfills, with them all writing to the
same Redis Stream. The major changes are detailed below.

## Streamer/Historical backfill tasks
In this PR, I've introduced the `historical_block_processing::Streamer`
struct. This is used to encapsulate the async task which is spawned for
handling the Historical Backfill. I went with "Streamer", as I envision
this becoming the single process which handles both historical/real-time
messages for a given Indexer, as described in
#216 (comment).

`Streamer` is responsible for starting the `async` task, and provides a
basic API for interacting with it. Right now, that only includes
cancelling it. Cancelling a `Streamer` will drop the existing future, preventing any new messages from being added, and then delete the Redis Stream, removing all existing messages.
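The `Streamer` API shape described above could be sketched roughly as follows. Note the assumptions: the actual implementation spawns a tokio task and cancels it by dropping the future, then deletes the Redis Stream; this stdlib-only sketch uses a thread plus a shared flag for cancellation, and `delete_stream` is a stub.

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

/// Thread-based sketch of the `Streamer` API shape (not the real tokio code).
struct Streamer {
    cancelled: Arc<AtomicBool>,
    handle: Option<thread::JoinHandle<()>>,
}

impl Streamer {
    /// Spawn the backfill task.
    fn start() -> Self {
        let cancelled = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&cancelled);
        let handle = thread::spawn(move || {
            while !flag.load(Ordering::Relaxed) {
                // ... fetch the next block and publish it to the stream ...
                thread::sleep(Duration::from_millis(1));
            }
        });
        Streamer { cancelled, handle: Some(handle) }
    }

    /// Stop producing messages, then clear any that were already queued.
    fn cancel(mut self) {
        self.cancelled.store(true, Ordering::Relaxed);
        if let Some(h) = self.handle.take() {
            h.join().expect("streamer thread panicked");
        }
        Self::delete_stream();
    }

    fn delete_stream() {
        // Stand-in for deleting the Redis Stream (e.g. a DEL on the key).
    }
}
```

Stopping the producer *before* deleting the stream matters: otherwise the task could re-add messages after the delete.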

## Indexer Registry Mutex/Async
`indexer_registry` is the in-memory representation of the given Indexer
Registry. Currently, we pass around the `MutexGuard` to mutate this
data, resulting in the lock being held for much longer than is needed,
blocking all other consumers from taking the lock. There isn't much
contention for this lock, so it isn't a problem right now, but could end
up being one in future.

Adding in the `Streamer` `Mutex` made mitigating this problem trivial,
so I went ahead and made some changes. I've updated the code so that we
pass around the `Mutex` itself, rather than the guard, allowing us to hold the lock for the least amount of time possible. This required
updating most methods within `indexer_registry` to `async`, so that we
could take the `async` `lock`.

With the `Mutex` being used rather than the `MutexGuard`, it seemed
sensible to add `indexer_registry` to `QueryAPIContext`, rather than
passing it round individually.
Labels: none yet
Projects: none yet
Development: no branches or pull requests
5 participants