
soroban-rpc: Add in-memory events storage #355

Closed · wants to merge 4 commits
Conversation

Contributor

@tamirms tamirms commented Jan 24, 2023

Add in-memory events db with a retention window specified in number of ledgers. For example, the event store can be configured to have a retention window of 17280 ledgers, which corresponds to approximately 24 hours assuming an average ledger close time of 5 seconds.

The event store is implemented using a circular buffer.
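A minimal sketch of how such a circular-buffer store might look. All names here (`MemoryStore`, `bucket`, the `string` event placeholder) are illustrative, not the PR's actual types; the point is only that once the retention window is full, ingesting a new ledger evicts the oldest one.

```go
package main

import "fmt"

// bucket holds the events for a single ledger.
type bucket struct {
	ledgerSeq uint32
	events    []string // placeholder for the real event type
}

// MemoryStore is a hypothetical in-memory event store backed by a
// circular buffer with a fixed retention window (in ledgers).
type MemoryStore struct {
	buckets []bucket
	start   int // index of the oldest bucket
	length  int // number of buckets currently populated
}

func NewMemoryStore(retentionWindow uint32) *MemoryStore {
	return &MemoryStore{buckets: make([]bucket, int(retentionWindow))}
}

// IngestLedger appends the events for the next ledger, evicting the
// oldest ledger once the retention window is full.
func (m *MemoryStore) IngestLedger(seq uint32, events []string) {
	if m.length < len(m.buckets) {
		m.buckets[(m.start+m.length)%len(m.buckets)] = bucket{seq, events}
		m.length++
		return
	}
	// Buffer full: overwrite the oldest bucket and advance start.
	m.buckets[m.start] = bucket{seq, events}
	m.start = (m.start + 1) % len(m.buckets)
}

// OldestLedger returns the sequence of the oldest retained ledger.
func (m *MemoryStore) OldestLedger() uint32 {
	return m.buckets[m.start].ledgerSeq
}

func main() {
	store := NewMemoryStore(3)
	for seq := uint32(1); seq <= 5; seq++ {
		store.IngestLedger(seq, []string{"event"})
	}
	fmt.Println(store.OldestLedger()) // ledgers 1 and 2 have been evicted
}
```

With a retention window of 17280 this gives a fixed memory footprint regardless of how long the node runs.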

Close stellar/go#4718


CREATE INDEX ledger_entries_key ON ledger_entries(key);

CREATE TABLE ledger_close_meta (
Contributor


I don't see why the ledger entries and the events need to live in the same database; they have quite distinct and unrelated characteristics and are never accessed together.
I'm concerned that placing both in the same database would result in unnecessary database-level locking between the two.

Contributor Author


They are not accessed together, but they are updated together from the same stream of ledgers coming from captive core. It's better if the two tables are in the same database because we can ensure both are updated within the same transaction. Otherwise, if one table falls behind the other, the ingestion code becomes more complicated in order to ensure both are synced to the same ledger.
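For illustration, the single-transaction ingestion described above could be sketched as below. The `execer` interface, the `fakeTx`, and the table/column names are hypothetical stand-ins (in the real code this would be a `*sql.Tx` against the SQLite schema, whose `Exec` also returns a `sql.Result`):

```go
package main

import "fmt"

// execer abstracts the subset of a SQL transaction this sketch needs,
// so it can run without a real database driver.
type execer interface {
	Exec(query string, args ...interface{}) error
}

// fakeTx records executed statements; it stands in for a SQLite transaction.
type fakeTx struct{ statements []string }

func (t *fakeTx) Exec(query string, args ...interface{}) error {
	t.statements = append(t.statements, query)
	return nil
}

// ingestLedger writes both the ledger close meta and the derived ledger
// entries inside the same transaction, so the two tables can never be
// observed at different ledger sequences.
func ingestLedger(tx execer, seq uint32, closeMeta string, entries map[string]string) error {
	if err := tx.Exec("INSERT INTO ledger_close_meta (sequence, data) VALUES (?, ?)", seq, closeMeta); err != nil {
		return err
	}
	for key, entry := range entries {
		if err := tx.Exec("INSERT OR REPLACE INTO ledger_entries (key, entry) VALUES (?, ?)", key, entry); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	tx := &fakeTx{}
	_ = ingestLedger(tx, 42, "<xdr blob>", map[string]string{"account/G...": "<entry>"})
	fmt.Println(len(tx.statements)) // both tables written under one transaction
}
```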

Contributor


I do like the fact that we can commit all of a ledger's changes in a single transaction. In this particular case, though, I don't believe it really gives us much of an advantage. The events are only ever appended and don't require any synchronization with the rest of the data. You have already noticed that the ledger entry data tends to become very large; that's because we have lots of writes and deletes. To avoid that, we need to perform periodic full vacuuming.
Unfortunately, performing a full vacuum on a large database takes a long time, which is why it's advisable to break out tables with a different insert/delete profile into separate database files.

Contributor Author

@tamirms tamirms Jan 26, 2023


You're right that the ledgers (from which we derive the events) don't need to be synchronized with the ledger entries for any of our SQL queries. My point was that it would require more code to manage the ingestion.

If we cannot assume that the ledger entries and ledgers tables are synchronized to the same ledger sequence, then we need to catch up whichever table is behind and then ingest into both tables ledger by ledger. I was thinking that at this stage of soroban-rpc we could avoid that complexity. Would you be ok if I created an issue for this concern so we can decide whether to implement it at a later point?

Contributor


That sounds like the right thing to do. It would allow us to release this as a working prototype while we improve on the implementation.


reader.Rewind()
Contributor


It seems wrong that we need to rewind and re-process, but maybe I'm not understanding the code flow here correctly.

Contributor

@tsachiherman tsachiherman left a comment


I've made several comments. I'd be happy to review them together if any questions come up!

cmd/soroban-rpc/internal/events/events.go (outdated, resolved)
continue
}
for eventIndex, opEvent := range op.Events {
events = append(events, event{
Contributor


Is it worthwhile to offer an external config to be used here as event topic filters? Since this is crunching events for the whole network, maybe RPC hosting use cases will only be interested in events for their own contract or known topics, etc.
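Such a filter could be a very small piece of config. A hypothetical sketch, where the filter shape and the string topic representation are made up purely for illustration:

```go
package main

import "fmt"

// topicFilter is a hypothetical config knob: if non-empty, only events
// whose first topic matches one of the allowed values are ingested.
type topicFilter map[string]bool

func (f topicFilter) keep(firstTopic string) bool {
	if len(f) == 0 {
		return true // no filter configured: ingest everything
	}
	return f[firstTopic]
}

func main() {
	filter := topicFilter{"transfer": true}
	kept := 0
	for _, topic := range []string{"transfer", "mint", "transfer"} {
		if filter.keep(topic) {
			kept++
		}
	}
	fmt.Println(kept) // only the "transfer" events survive the filter
}
```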

Comment on lines +68 to +79
events *events.MemoryStore
retentionWindow uint32
Contributor

@2opremio 2opremio Jan 25, 2023


Adding events processing to ledgerentry_storage.go doesn't feel right, both because of the naming and because it complicates the logic.

If possible, I would separate the event ingestion from ledgerentry_storage.go. We can have a common daemon consuming CloseMeta entries, which are passed to ledgerentry_storage/ledgerentry_storage.go and, say events_storage/events_storage.go.

Something similar to (but probably more lightweight than) the Horizon processors would do.
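A rough sketch of that processor shape, with stand-in types (none of these names come from the PR; `LedgerCloseMeta` here is a placeholder for `xdr.LedgerCloseMeta`):

```go
package main

import "fmt"

// LedgerCloseMeta stands in for xdr.LedgerCloseMeta.
type LedgerCloseMeta struct{ Sequence uint32 }

// processor mirrors the Horizon-style pattern: each consumer of ledger
// close meta implements the same small interface.
type processor interface {
	ProcessLedger(meta LedgerCloseMeta) error
}

type ledgerEntryStorage struct{ lastSeq uint32 }

func (s *ledgerEntryStorage) ProcessLedger(meta LedgerCloseMeta) error {
	s.lastSeq = meta.Sequence // would apply ledger entry changes here
	return nil
}

type eventsStorage struct{ lastSeq uint32 }

func (s *eventsStorage) ProcessLedger(meta LedgerCloseMeta) error {
	s.lastSeq = meta.Sequence // would ingest contract events here
	return nil
}

// runDaemon fans each ledger out to every processor in order, playing
// the role of the common captive-core-consuming orchestrator.
func runDaemon(ledgers []LedgerCloseMeta, processors ...processor) error {
	for _, meta := range ledgers {
		for _, p := range processors {
			if err := p.ProcessLedger(meta); err != nil {
				return err
			}
		}
	}
	return nil
}

func main() {
	entries, events := &ledgerEntryStorage{}, &eventsStorage{}
	_ = runDaemon([]LedgerCloseMeta{{1}, {2}, {3}}, entries, events)
	fmt.Println(entries.lastSeq, events.lastSeq) // both processors see every ledger
}
```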

TrimLedgers(retentionWindow uint32) error
InsertLedger(ledger xdr.LedgerCloseMeta) error
Contributor


I would move this into a separate interface, since this is for ledger entries.

Contributor


I have a similar request: can we avoid calling these "Ledgers"?
i.e.

ApplyLedgerEventsRetentionWindow(retentionWindow uint32) error
InsertLedgerEvents(ledger xdr.LedgerCloseMeta) error

Contributor Author


@tsachiherman we are actually storing the entire xdr.LedgerCloseMeta in the db. We don't store events in the db; we only keep events in memory. The reason is that I remember some discussion of eventually exposing an HTTP endpoint to serve ledgers from soroban-rpc, and that endpoint could be used by other services / clients instead of captive core.

Contributor

2opremio commented Jan 25, 2023

I think that mixing the Ledger Entry and Events logic into the same files makes the code complicated. I would change the file hierarchy to decouple them and do the processing of each separately.

  1. I would have a common module for the db stuff (like what you are suggesting, but making sure we get the naming right, e.g. renaming LedgerEntry to something else when it's not used exclusively for ledger entries). It probably makes sense to separate this further into two files, one for the ledgerentry stuff and another for the meta stuff.
  2. I would have two other modules for the events and ledger_entry stuff, which can work like the Horizon processors (that requires some orchestrator for captive-core which passes the ledger information to each processor).

}

// GetLedger fetches a single ledger from the db.
func (s *sqlDB) GetLedger(sequence uint32) (xdr.LedgerCloseMeta, bool, error) {
Contributor


Before the SQL gets more verbose, is it worth considering a type-safe compile-time binding like go-jet?

@paulbellamy
Contributor

Is this closed in favor of #361, or still relevant?

@tsachiherman
Contributor

> Is this closed in favor of #361, or still relevant?

Yes; @tamirms created #361 in order to break this one into a smaller set of reviewable components.
So he will need to rebase this PR and keep working on completing it.

Contributor Author

tamirms commented Jan 30, 2023

@paulbellamy @tsachiherman I will close this PR to minimize confusion and spin up new PRs for the remaining work.

@tamirms tamirms closed this Jan 30, 2023
Successfully merging this pull request may close these issues.

soroban-rpc: Implement event storage
5 participants