feat(relayer): add leader streamer #2351

Closed · wants to merge 2 commits

Conversation

@corverroos (Collaborator) commented Nov 1, 2024

Implements a leader-based attest stream cache that aims to avoid overlapping streams, thereby deduplicating the query load on the archive node.

issue: #2321
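
For context, here is a minimal sketch of the leader-based dedup idea as described above. This is not the PR code; the package, type, and field names are assumptions for illustration only.

package relayer // hypothetical package name

import "sync"

// leader owns a contiguous range of attest offsets being streamed from the
// archive node. Readers whose offset falls inside this range wait for the
// leader's results instead of opening an overlapping stream.
type leader struct {
    mu     sync.RWMutex
    from   uint64 // first offset claimed by this leader
    latest uint64 // highest offset claimed so far; bumped as heights are fetched
}

// streamerBackend dedups query load on the archive node: a read either hits
// the cache, falls inside an existing leader's range, or starts a new leader.
type streamerBackend struct {
    mu      sync.Mutex
    leaders map[*leader]struct{}
    cache   map[uint64][]byte // attest offset -> cached attestation (simplified)
}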

@corverroos changed the title from “feat(relayer): add leader streamer WIP” to “feat(relayer): add leader streamer” on Nov 3, 2024
@corverroos marked this pull request as ready for review on November 3, 2024, 12:02
}

// IncRange increases the range to the provided height plus one
// since the leader will move on the that height next.
Contributor

Suggested change
// since the leader will move on the that height next.
// since the leader will move on that height next.

Comment on lines +163 to +167
for s := range b.leaders {
    if s.contains(offset) {
        return nil, false
    }
}
Contributor

I think you planned to use containsStreamerUnsafe here.
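
For illustration, a sketch of what that refactor might look like, assuming containsStreamerUnsafe simply wraps the same loop and is called with the backend's mutex already held (the streamerBackend receiver is a placeholder, continuing the sketch above):

// containsStreamerUnsafe reports whether an existing leader already covers the
// offset. "Unsafe" because it assumes the caller already holds the lock.
func (b *streamerBackend) containsStreamerUnsafe(offset uint64) bool {
    for s := range b.leaders {
        if s.contains(offset) {
            return true
        }
    }
    return false
}

The quoted block would then reduce to:

if b.containsStreamerUnsafe(offset) {
    return nil, false
}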

Comment on lines +104 to +108
if attestOffset == 1 {
    // Block other streams
    <-ctx.Done()
    return ctx.Err()
}
Contributor

Not sure I understand the case attestOffset == 1. Maybe worth a comment?

type leader struct {
    mu     sync.RWMutex
    from   uint64
    latest uint64
Contributor

It feels like latest is not really needed because it's only used to guess whether the range was covered by somebody else or not (it does not directly prove that the range is consistently filled). So, in case of some invariant violation (for whatever reason), we could end up with a leader whose range coverage is inconsistent, and that would prevent others from starting on its range.

Instead, we could avoid relying on ranges at all and simply start a new leader on any height that is a cache miss; it will ultimately bump into somebody else's range (or time out).
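
A rough sketch of that alternative, continuing the hypothetical types above and assuming a per-offset cache and a single-offset fetch helper (cacheGet, cacheSet, and fetchAttestation are made up for illustration):

// streamFrom starts a new leader at a cache-miss offset and streams forward
// until it reaches an offset already populated by another leader, or the
// context is cancelled or times out.
func (b *streamerBackend) streamFrom(ctx context.Context, offset uint64) error {
    for {
        if err := ctx.Err(); err != nil {
            return err
        }
        if _, ok := b.cacheGet(offset); ok {
            return nil // bumped into somebody else's range; stop this leader
        }
        att, err := b.fetchAttestation(ctx, offset) // hypothetical single-offset fetch
        if err != nil {
            return err
        }
        b.cacheSet(offset, att)
        offset++
    }
}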

Collaborator Author

Nope, this is actually critical: latest is increased as soon as the previous height is fetched, which ensures that this leader remains the leader and that others don't also start streaming from the next height as soon as it isn't in the cache.
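
Roughly, the mechanics being described, as a sketch consistent with the leader struct quoted above (the exact bounds check is an assumption):

// IncRange bumps the claimed range to height+1 as soon as height is fetched,
// so a lookup for the next offset still maps to this leader and no competing
// stream is started for it.
func (l *leader) IncRange(height uint64) {
    l.mu.Lock()
    defer l.mu.Unlock()
    if height+1 > l.latest {
        l.latest = height + 1
    }
}

// contains reports whether offset falls inside this leader's claimed range.
func (l *leader) contains(offset uint64) bool {
    l.mu.RLock()
    defer l.mu.RUnlock()
    return offset >= l.from && offset <= l.latest
}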

@corverroos (Collaborator Author)

I do realise an issue with the design though: “blast radius”:

  • if the leader is slow in processing X, others are slowed down artificially.
  • this can happen when mempoolLimit is reached.
  • The risk of artificially slowing down xchain latency isn’t worth the optimisation.

Two options:

  • 1️⃣ Follow a more “orchestration” based pattern where workers request stuff and the orchestrator does the actual streaming. It populates the cache while workers only read from the cache. The orchestrator is fast, since it only fetches. A slow worker’s blast radius is contained (see the sketch after this list).
  • 2️⃣ Since this is the second design that isn’t good enough, I suggest we first do Cache historical height when streaming attestations #1860, which should solve the problem. We can come back to this if we see that caching historical height isn’t sufficient and we need more optimisations.
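
A sketch of what option 1️⃣ could look like, assuming a channel-based orchestrator and a sync.Map cache (all names and signatures here are hypothetical):

// orchestrator is the only component that talks to the archive node. It fills
// the shared cache; workers only read from the cache, so a slow worker never
// stalls the stream or the other workers.
type orchestrator struct {
    requests chan uint64 // offsets that workers ask for
    cache    sync.Map    // attest offset -> attestation
    fetch    func(ctx context.Context, offset uint64) ([]byte, error)
}

func (o *orchestrator) run(ctx context.Context) error {
    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        case offset := <-o.requests:
            if _, ok := o.cache.Load(offset); ok {
                continue // already fetched for another worker
            }
            att, err := o.fetch(ctx, offset)
            if err != nil {
                return err
            }
            o.cache.Store(offset, att)
        }
    }
}

Workers would send their offset on requests and read the result from the cache, so the archive-node query load stays with the single fast fetch loop.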

@corverroos closed this on Nov 4, 2024
@corverroos deleted the corver/leaderstream branch on November 22, 2024, 15:42