Unify subscribeRepos datastore queries? #30

snarfed · 2024-08-03T03:55:40Z

Right now, in subscribeRepos, we query the datastore separately for each connected subscriber (client). This is fine for historical blocks, but it's duplicative for new blocks. For ongoing subscribers, ideally we should only do a given datastore query once, and then fan out the results to all subscribers.

Current draft design:

- Events to emit are stored in a thread-safe ring buffer
  - Live iterator that returns new events as they happen, blocking, thread-safe
  - Rollback iterator that starts reading events from a given seq, then switches to live
- New singleton thread that runs the current code in xrpc_sync.subscribe_repos, reads blocks by seq from the datastore, assembles them into events, and stores those events in the ring buffer
  - Load the entire rollback window eagerly, on startup? Or lazily, on demand?
  - Phase two: start and stop this thread and its datastore query on demand
- New minimal subscribeRepos handler that reads from the ring buffer
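To make the ring buffer idea concrete, here is a minimal sketch of what it could look like. Everything in it (`EventBuffer`, `append`, `read`, the `timeout` parameter) is a hypothetical name for illustration, not existing arroba code. It assumes seqs are strictly increasing and that there is a single producer thread. One method covers both iterators: `after_seq=None` is the live iterator, and an explicit `after_seq` is the rollback iterator that replays buffered events and then switches to live.

```python
import threading
from collections import deque


class EventBuffer:
    """Hypothetical thread-safe ring buffer of (seq, event) pairs."""

    def __init__(self, capacity):
        # deque with maxlen drops the oldest entry when full
        self._events = deque(maxlen=capacity)
        self._cond = threading.Condition()

    def append(self, seq, event):
        """Called by the single producer thread; seqs must be increasing."""
        with self._cond:
            self._events.append((seq, event))
            self._cond.notify_all()  # wake every blocked reader

    def read(self, after_seq=None, timeout=None):
        """Return an iterator over events with seq > after_seq, then live ones.

        after_seq=None means live only: skip everything buffered right now.
        The iterator ends if no new event arrives within `timeout` seconds;
        timeout=None blocks forever.
        """
        if after_seq is None:
            with self._cond:
                after_seq = self._events[-1][0] if self._events else float('-inf')
        return self._iter(after_seq, timeout)

    def _iter(self, after_seq, timeout):
        while True:
            with self._cond:
                # block until something newer than our cursor is buffered
                ready = self._cond.wait_for(
                    lambda: bool(self._events) and self._events[-1][0] > after_seq,
                    timeout)
                if not ready:
                    return
                # linear scan for simplicity; a real buffer would index by seq
                batch = [(s, e) for s, e in self._events if s > after_seq]
            # yield outside the lock so a slow client can't stall the producer
            for seq, event in batch:
                after_seq = seq
                yield event
```

Note that in this sketch, a reader that asks for a seq older than the buffer's oldest entry silently starts from the oldest entry. A real implementation would probably need to detect that case and either report it or fall back to a datastore query.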

This would take some rearchitecting. Right now, we do all of this inside the request handler, per client:

arroba/arroba/xrpc_sync.py

Lines 199 to 206 in 351d43f

    # serve new events as they happen
    logger.info(f'serving new events')
    while True:
        with new_events:
            new_events.wait(NEW_EVENTS_TIMEOUT.total_seconds())
        for commit_data in server.storage.read_events_by_seq(start=last_seq + 1):
            yield handle(commit_data)

We'd need to start a separate, shared thread for the realtime datastore queries, collect the resulting blocks into events in memory, and have each client's request handler read and emit from there.
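As a rough illustration of that fan-out, the sketch below runs one shared poller thread that queries storage once per poll and pushes each result to every subscriber. It is a sketch under assumptions, not arroba's implementation: `FakeStorage` and `Broadcaster` are hypothetical, the per-subscriber `queue.Queue` is a simpler stand-in for the ring buffer, and fixed-interval polling replaces the `new_events` condition wait. Only the `read_events_by_seq(start=...)` call shape is taken from the snippet above.

```python
import queue
import threading


class FakeStorage:
    """Hypothetical stand-in for the datastore; counts queries."""

    def __init__(self):
        self.events = []  # list of (seq, payload)
        self.queries = 0
        self._lock = threading.Lock()

    def write(self, seq, payload):
        with self._lock:
            self.events.append((seq, payload))

    def read_events_by_seq(self, start):
        with self._lock:
            self.queries += 1
            return [(s, p) for s, p in self.events if s >= start]


class Broadcaster:
    """One shared poller thread; each subscriber gets its own queue."""

    def __init__(self, storage, last_seq=0, poll_interval=0.01):
        self.storage = storage
        self.last_seq = last_seq
        self.poll_interval = poll_interval
        self._subscribers = []
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def subscribe(self):
        """Called from a request handler; returns that client's queue."""
        q = queue.Queue()
        with self._lock:
            self._subscribers.append(q)
        return q

    def _run(self):
        while not self._stop.is_set():
            # one datastore query per poll, however many clients are connected
            for seq, payload in self.storage.read_events_by_seq(
                    start=self.last_seq + 1):
                self.last_seq = seq
                with self._lock:
                    for q in self._subscribers:
                        q.put((seq, payload))
            # sleep, but wake immediately if stop() is called
            self._stop.wait(self.poll_interval)
```

With something like this, each client's request handler would shrink to a loop that calls `q.get()` on its own queue and yields the result, and the datastore query count would no longer scale with the number of connected clients.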


snarfed · 2024-11-21T22:44:33Z

Added a draft design to the top description.

snarfed · 2024-12-19T19:32:06Z

This may be getting acute: Bridgy Fed's atproto-hub is capped out on CPU serving 8 subscribeRepos clients, and it's falling behind on processing Bluesky's own firehose. 😕


snarfed mentioned this issue Aug 3, 2024

Optimize costs snarfed/bridgy-fed#1149

Open

snarfed changed the title ~~Unify subscribeRepos datastore queries~~ Unify subscribeRepos datastore queries? Nov 6, 2024

snarfed mentioned this issue Nov 21, 2024

memory leak in subscribeRepos rollback window #39

Open

snarfed added the now label Dec 19, 2024

snarfed added a commit to snarfed/bridgy-fed that referenced this issue Dec 19, 2024

atproto-hub: drop subscribeRepos rollback window back down to 50k seqs

ca6e37b

roughly 12-18h. for snarfed/arroba#30, snarfed/arroba#39
