Events source of truth: db or receipts #11830
Comments
I just discovered a case where this is important: I made a boo boo on my mainnet node and decided to start again from a snapshot. The snapshot will reset the splitstore but not the database, which should be fine, except that when my node got messed up it may not have been on the canonical chain. So even though a snapshot resync may give me new events (tbh I'm not convinced that it's doing this properly, but that's another story) and won't duplicate existing events because of the duplicate checking on inserts, it won't give me any reverts that I should have.

I've been thinking about this in the context of #11770 (comment): if we make the APIs "give me events since this tipset", and walk from that tipset to the current one, we should be able to see where we go backward and call those reverts regardless of what the database says.

It doesn't help with the case of "give me all events from height X", however, because we're currently just going to query the database and give them whatever shows up, possibly including some events from the same height but a different tipset because they haven't been marked as reverted. In that case, we may want to walk the tipsets ourselves, collect events per-tipset, and collate them for the user rather than collecting them as a whole batch with a single query. Then we just have to ask whether it'd be nearly as efficient to read from the AMT instead of the database (probably not, but maybe it's close).
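A minimal sketch of the two walks described above, written in Python for brevity rather than Lotus's Go. Everything here is hypothetical and for illustration only: `TipSet`, `reorg_path`, `events_from_height`, and the in-memory `store` and `events_by_tipset` dicts are stand-ins, not Lotus types or APIs. It assumes both tipsets share an ancestor that is present in the store.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TipSet:
    """Hypothetical, simplified tipset: a key, a height, and one parent key."""
    key: str
    height: int
    parent: Optional[str]


def reorg_path(store: Dict[str, TipSet], old: str, new: str) -> Tuple[List[str], List[str]]:
    """Return (reverts, applies) needed to move from `old` to `new`.

    Reverts are listed newest-first, applies oldest-first. Heights may skip
    (null rounds), so the side that is higher is always stepped back first.
    """
    reverts: List[str] = []
    applies: List[str] = []
    a, b = store[old], store[new]
    while a.key != b.key:
        if a.height > b.height:
            reverts.append(a.key)
            a = store[a.parent]
        elif b.height > a.height:
            applies.append(b.key)
            b = store[b.parent]
        else:
            # Same height, different tipsets: both sides are off the common chain.
            reverts.append(a.key)
            applies.append(b.key)
            a = store[a.parent]
            b = store[b.parent]
    applies.reverse()
    return reverts, applies


def events_from_height(store: Dict[str, TipSet],
                       events_by_tipset: Dict[str, List[str]],
                       head: str, min_height: int) -> List[str]:
    """Collect events per tipset along the chain ending at `head` only.

    Events stored for tipsets at the same height on another fork are never
    visited, so they cannot leak into the result.
    """
    chain: List[str] = []
    cur: Optional[TipSet] = store[head]
    while cur is not None and cur.height >= min_height:
        chain.append(cur.key)
        cur = store[cur.parent] if cur.parent is not None else None
    out: List[str] = []
    for key in reversed(chain):  # oldest first
        out.extend(events_by_tipset.get(key, []))
    return out


# Toy chain: g -> a1 -> a2 -> a3, with a fork a1 -> b2 -> b3 (null round at height 3).
store = {t.key: t for t in [
    TipSet("g", 0, None), TipSet("a1", 1, "g"),
    TipSet("a2", 2, "a1"), TipSet("a3", 3, "a2"),
    TipSet("b2", 2, "a1"), TipSet("b3", 4, "b2"),
]}
events_by_tipset = {"a1": ["ea1"], "a2": ["ea2"], "b2": ["eb2"], "b3": ["eb3"]}
```

With this toy data, a client that last saw `a3` while the node's head is `b3` would be told to revert `a3` and `a2` and apply `b2` and `b3`, whatever the database has flagged. The per-tipset collection trades a single batch query for one lookup per tipset, which is the efficiency question raised above.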
Oh... IMO, that's closer to #11640. I.e., when we restart, we need to process all applies/reverts between the last tipset we processed and the current tipset. Handling snapshots may be a bit tricky...
Chiming in with a birds' eye view, hoping to provide some perspective from recent battles on the field. I think it's helpful to break down the access patterns and query operations to move this convo forward.
Clients expect consistency across methods in terms of data availability. For example, taking a transaction hash returned by

I agree with @Stebalien that we do have some rough edges that stem from an unclear/blurry technical design. Here's what I propose concretely. Bear with me if some of these points are already implemented. I have been somewhat disconnected from Lotus development, and I'm trying to reason from first principles.
The main problems I'm interested in solving are:
I'm not seeing the need for all this complexity. What we need to ensure is:
Once we have that, we should be good to go. We can get the consistency by:
Once we have that, we can fix our current API endpoints such that, instead of making event handling "best effort", we always return events if we should be indexing/storing them and never return events otherwise. That should cover everything, IMO.
To be clear, there are other good ideas in here (querying the node for the supported state-range, etc.). But this issue is trying to address the specific problem where different commands have different "views" of the state.
As I use the events for builtin actors I've been bumping into the consistency problem in odd ways, so this is something I'd really like to address properly. I've managed to get my node into an inconsistent state a couple of times, mainly because of "advanced" tinkering, but it's the kind of thing that I can easily imagine others replicating. We also have the problem of the events db ballooning in size (although we have some WAL-related tuning to do, which I've been playing with but it's being stubborn). Here's my current proposal for how we might re-engineer some of it to at least make it nicer from where I stand:
I'd like to resolve a quick design question with respect to event indexing. Should we care about ordering in the database? Or should we be using the actual events as the source of truth?
I'm asking because the original idea was to index keys and values flagged for indexing, not all keys and values. We ended up using the database as the source of truth, but this also means that we ended up inserting a bunch of fields into the database that technically aren't supposed to be indexed and technically maybe should not even be queryable.
The alternative would be to find events via the index, but then actually look up the real events from the receipts tree and return those to the user. Unfortunately, that's almost certainly going to have a performance impact and will increase complexity.
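To make that alternative concrete, here is a rough sketch in Python (not Lotus's Go, and not its actual schema). The index stores only entries flagged for indexing and maps them to locators; the full event is then read back from a receipts store. All names (`Entry`, `EventIndex`, `query`, the `receipts` dict) are hypothetical, and the nested dict/list merely stands in for the receipts tree.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Entry:
    """Hypothetical event entry; `indexed` mirrors the 'flagged for indexing' bit."""
    key: str
    value: bytes
    indexed: bool


Event = Tuple[Entry, ...]
Locator = Tuple[str, int, int]  # (tipset key, message index, event index)


class EventIndex:
    """Index that only ever sees entries flagged for indexing."""

    def __init__(self) -> None:
        self._by_kv: Dict[Tuple[str, bytes], List[Locator]] = {}

    def add(self, loc: Locator, event: Event) -> None:
        for entry in event:
            if entry.indexed:
                self._by_kv.setdefault((entry.key, entry.value), []).append(loc)

    def find(self, key: str, value: bytes) -> List[Locator]:
        return list(self._by_kv.get((key, value), []))


def query(index: EventIndex,
          receipts: Dict[str, List[List[Event]]],
          key: str, value: bytes) -> List[Event]:
    """Find locators via the index, then return the real events from receipts."""
    return [receipts[ts][msg][ev] for (ts, msg, ev) in index.find(key, value)]


# Toy data: one event with an indexed "type" entry and a non-indexed "memo" entry.
transfer: Event = (
    Entry("type", b"transfer", indexed=True),
    Entry("memo", b"hi", indexed=False),
)
receipts = {"ts1": [[transfer]]}  # tipset -> messages -> events
index = EventIndex()
index.add(("ts1", 0, 0), transfer)
```

In this sketch, querying on `type` returns the whole event, including the non-indexed `memo` entry, because the payload comes from the receipts rather than the index. Querying on `memo` returns nothing, since it was never indexed. The extra lookup per match is where the performance cost mentioned above would show up.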