A new ChainIndexer that subsumes the existing MsgIndex, EventIndex and TransactionIndex #12453

Closed
9 of 12 tasks
Tracked by #12344
aarshkshah1992 opened this issue Sep 12, 2024 · 16 comments

@aarshkshah1992
Contributor

aarshkshah1992 commented Sep 12, 2024

Summary

This issue is for the implementation of a new ChainIndexer in Lotus that will replace and subsume the existing MsgIndex, EventsIndex, and EthTxHashIndex, which are currently fragmented across multiple databases and have several known issues documented in filecoin-project/lotus#12293.

Key Features

The ChainIndexer offers the following key features:

  • Indexes all necessary state in a single database.
  • Indexes each tipset together with the state changes it causes, which makes the indexed state easy to reason about and more aligned with how state is persisted in the Chainstore.
  • Implements snapshot hydration.
  • Implements automated, config-driven index garbage collection (GC).
  • Provides automated backfilling.
  • Offers simplified configuration.
  • Wraps the asynchronous indexing in a synchronous read API for RPC endpoints to consume, avoiding off-by-N errors caused by asynchronous indexing.

Note: While the ChainIndexer is primarily focused on events and ETH RPC use cases, it also benefits pre-FEVM use cases. For example, StateSearchMsg and its various dependents will now have a shortcut to find the message.

Implementation Items

Tasks


Switch RPC APIs to use the Chain Index

  • The Filecoin and ETH RPC APIs will switch to using the ChainIndexer instead of the MsgIndex, EthTxHashIndex and EventsIndex.
  • The EventFilterManager will read events from the ChainIndexer and prefill all registered filters rather than depending on the Indexer to do the pre-filling of filters.
  • The ChainIndexer will listen to Mpool message addition updates to index the corresponding ETH Tx Hash. The EthTxHashManager will no longer be used for this.

Read APIs Should Account for the Async Nature of Indexing

  • All APIs that read from the index will wait for the current head in the chainstore to be indexed if their first read attempt fails and then retry the read before returning a response to the user.
    • Note that we're only waiting for an already produced tipset in our chainstore to be indexed here; we're not waiting for a new tipset to be produced. This should work well in practice, but if it times out, it points to another underlying problem in our chainstore <-> indexing pathway that should be investigated.
  • This is a workaround to handle asynchronous indexing in Filecoin.
  • For events, it should be noted that indexing the current head T only indexes events in T-1 because of deferred execution.
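
The read path described above can be sketched roughly as follows. This is only an illustration in Python, not the Lotus implementation: `ChainIndexSketch`, `read_with_wait`, and the timeout values are hypothetical names and numbers, and an in-memory dict stands in for the index database.

```python
import threading

class ChainIndexSketch:
    """Hypothetical in-memory stand-in for the index (not the Lotus API)."""

    def __init__(self):
        self._rows = {}
        self._indexed_height = -1
        self._cond = threading.Condition()

    def index_tipset(self, height, rows):
        # Called by the (asynchronous) indexer once a tipset has been indexed.
        with self._cond:
            self._rows.update(rows)
            self._indexed_height = max(self._indexed_height, height)
            self._cond.notify_all()

    def wait_for_height(self, height, timeout):
        # Block until the given height is indexed; False if the timeout expires.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._indexed_height >= height, timeout
            )

    def lookup(self, key):
        with self._cond:
            return self._rows.get(key)

def read_with_wait(index, key, head_height, timeout=5.0):
    """Try the read once; on a miss, wait for the head to be indexed and retry."""
    result = index.lookup(key)
    if result is not None:
        return result
    if not index.wait_for_height(head_height, timeout):
        # A timeout hints at a problem in the chainstore <-> indexing pathway.
        raise TimeoutError(f"head at height {head_height} was not indexed in time")
    return index.lookup(key)
```

A caller passes the height of the current head in its chainstore. A miss after the retry then means the key really is not in the index, rather than the indexer lagging behind.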

ETH RPC APIs Should Only Expose Executed Tipsets and Messages

  • As part of this work, we should review all ETH RPC APIs to ensure they only expose and respond to requests for executed tipsets and transactions. This is because Filecoin uses a deferred execution model, unlike Ethereum:
  • In Filecoin, messages included in tipset T are executed in tipset T + 1.
  • In Ethereum, transactions included in block T are executed in that same block T.
  • Exposing messages and tipsets that have not been executed yet via ETH RPC APIs in Lotus causes errors when users ask for the corresponding execution state, receipts, or events for those tipsets/messages because they do not yet exist.
  • There have been proposals for an even more conservative implementation where ETH RPC APIs only expose "finalized" tipsets and messages post-F3. However, it remains to be determined how well this would work in practice, given that clients might end up waiting for up to 3 epochs (90 seconds) for already included messages in the worst case.
  • For now, we should go ahead with exposing all (even non-finalised) executed tipsets/messages and revisit the implementation after F3 ships.
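
As a toy illustration of the deferred execution rule, the Python sketch below treats the canonical chain as an ascending list of tipset heights (null rounds simply do not appear) and returns the tipsets whose messages have already been executed. The function name and the list representation are assumptions for illustration only.

```python
def executed_tipsets(chain_heights):
    """Return the tipsets whose messages have been executed.

    chain_heights: ascending heights of the tipsets on the canonical chain,
    ending at the current head. Messages in a tipset are executed by the next
    tipset, so the head itself has no receipts or events yet.
    """
    return chain_heights[:-1]

chain = [100, 101, 103]          # height 102 was a null round
safe_to_expose = executed_tipsets(chain)
```

Here `safe_to_expose` contains 100 and 101 but not the head at 103, which is the set an ETH RPC API could answer receipt and event queries for.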

Removing Re-orged Tipsets That Are No Longer Part of the Canonical Chain

  • The ChainIndexer will periodically prune all permanently re-orged/reverted tipsets from the index. It can do this by simply pruning all tipsets at a height less than (current head - finality policy - some buffer).
  • The use of foreign key-based cascading deletes in the DDL will greatly simplify this implementation. By simply deleting a tipset from the index, all associated indexed state will be deleted from the DB. See SQLite Foreign Keys for more information.
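
A minimal sketch of the pruning rule, written in Python against SQLite. The `tipset` table, its columns, and the default `finality` and `buffer` values are assumptions for illustration and are not the actual ChainIndexer DDL.

```python
import sqlite3

# Hypothetical schema: one row per indexed tipset, with a reverted flag.
SCHEMA = """
CREATE TABLE tipset (
    tipset_key TEXT PRIMARY KEY,
    height     INTEGER NOT NULL,
    reverted   INTEGER NOT NULL DEFAULT 0
);
"""

def prune_reverted(conn, head_height, finality=900, buffer=100):
    """Delete reverted tipsets that are too old to become canonical again."""
    cutoff = head_height - finality - buffer
    cur = conn.execute(
        "DELETE FROM tipset WHERE reverted = 1 AND height < ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO tipset (tipset_key, height, reverted) VALUES (?, ?, ?)",
    [("old-fork", 1000, 1), ("old-canonical", 1000, 0), ("recent-fork", 4990, 1)],
)
removed = prune_reverted(conn, head_height=5000)
```

With a head at height 5000 the cutoff is 4000, so only `old-fork` is removed. `recent-fork` stays because it is still within the finality window and could in principle be re-applied.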

Garbage Collection

  • Garbage collection (GC) will be configuration-driven. Users can specify how much history they want to retain, and the ChainIndexer can perform periodic GC based on this configuration.
  • GC should be straightforward in the ChainIndexer because of the use of foreign keys with ON DELETE CASCADE, as described in SQLite Foreign Keys.
  • When a tipset is deleted, all associated indexed state will be automatically deleted from the DB due to the cascading delete behavior.
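
To make the cascading behaviour concrete, here is a small Python/SQLite sketch. The table and column names and the `retain_epochs` parameter are hypothetical and do not reflect the real schema or config. One detail that is easy to miss: SQLite only enforces foreign keys when `PRAGMA foreign_keys = ON` has been set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Foreign key enforcement is off by default in SQLite; enable it per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE tipset (
    id     INTEGER PRIMARY KEY,
    height INTEGER NOT NULL
);
CREATE TABLE message (
    id        INTEGER PRIMARY KEY,
    tipset_id INTEGER NOT NULL REFERENCES tipset(id) ON DELETE CASCADE,
    cid       TEXT NOT NULL
);
CREATE TABLE event (
    id         INTEGER PRIMARY KEY,
    message_id INTEGER NOT NULL REFERENCES message(id) ON DELETE CASCADE,
    topic      TEXT NOT NULL
);
""")

def gc(conn, head_height, retain_epochs):
    """Delete tipsets older than the retention window; children cascade."""
    cutoff = head_height - retain_epochs
    conn.execute("DELETE FROM tipset WHERE height < ?", (cutoff,))
    conn.commit()

conn.execute("INSERT INTO tipset (id, height) VALUES (1, 100), (2, 5000)")
conn.execute(
    "INSERT INTO message (id, tipset_id, cid) VALUES (1, 1, 'msg-old'), (2, 2, 'msg-new')"
)
conn.execute(
    "INSERT INTO event (id, message_id, topic) VALUES (1, 1, 'evt-old'), (2, 2, 'evt-new')"
)

gc(conn, head_height=5000, retain_epochs=2880)
remaining_messages = [r[0] for r in conn.execute("SELECT cid FROM message")]
remaining_events = [r[0] for r in conn.execute("SELECT topic FROM event")]
```

The GC code only ever touches the `tipset` table. The message and event rows for the old tipset disappear through the cascade.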

Snapshot Hydration

  • When a node is synced from a snapshot, the index should be completely deleted, and a new index should be hydrated from the snapshot.
  • It's important to note that snapshots don't contain events. In order to hydrate events in the index, messages in the tipset will have to be re-executed.

Automated Backfilling

  • When a Lotus node starts up, it performs the following steps:
    • Looks up the latest non-reverted tipset in the ChainIndex for which the corresponding state exists in the statestore.
    • Instantiates the Observer with that tipset as the current head.
    • Starts the Observer.
  • This process ensures that the ChainIndexer will observe the (Apply, Revert) path between its last non-reverted indexed tipset and the current heaviest tipset in the chainstore before processing real-time updates, effectively performing automated backfilling.
  • One challenge arises for a niche use case when an RPC provider toggles the Indexing flag to ON after keeping it OFF for an extended period or for the first time. In such cases, the backfilling backlog could interfere with indexing real-time tipset changes, potentially impacting RPC queries that primarily target state at or near the head.
  • To address this issue, a configuration option can be exposed that allows such users to disable automated backfilling if their primary focus is serving RPC queries for new tipsets after enabling indexing.
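
The (Apply, Revert) path mentioned above can be illustrated with a short Python sketch. It assumes each tipset is described by a hypothetical `(parent, height)` pair and that both heads share a common ancestor. The real Observer works on Lotus tipset types, so this only shows the idea.

```python
def reorg_path(tipsets, old_head, new_head):
    """Return (revert, apply) lists that move the index from old_head to new_head.

    tipsets maps a tipset name to (parent_name, height). Both heads are walked
    back towards their common ancestor, always stepping from the higher side
    (or from the old side when heights are equal).
    """
    revert, apply = [], []
    a, b = old_head, new_head
    while a != b:
        if tipsets[a][1] >= tipsets[b][1]:
            revert.append(a)          # leaving the old chain, newest first
            a = tipsets[a][0]
        else:
            apply.append(b)
            b = tipsets[b][0]
    apply.reverse()                   # apply oldest first
    return revert, apply

# Hypothetical chain: G <- A <- B <- C, with a fork A <- B2 <- C2 <- D2.
tipsets = {
    "G": (None, 0), "A": ("G", 1),
    "B": ("A", 2), "C": ("B", 3),
    "B2": ("A", 2), "C2": ("B2", 3), "D2": ("C2", 4),
}
backfill = reorg_path(tipsets, "A", "C")    # index is behind, no reorg
reorg = reorg_path(tipsets, "C", "D2")      # index is on a stale fork
```

In the plain backfill case the revert list is empty and the apply list is just the missing tipsets in order. In the fork case the stale tipsets are reverted before the new branch is applied.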

Simplify Indexing Config

  • The current indexing configuration in Lotus is extremely complex, with partial indexing options that make it difficult for node operators to understand the state they are indexing or should be indexing.
  • To improve the user experience and simplify the implementation, the current config will be replaced with a simple "Indexing ON/Indexing OFF" switch. Users either index everything that Lotus needs to provide fast RPC responses or index nothing.
  • The niche use case of "Index X but do not Index Y" will no longer be supported.

Migration from Old Indices to the New ChainIndex

  • Develop a lotus-shed utility that allows users to migrate existing indices to the new ChainIndexer database. This command should only be executed when the Lotus node is offline to ensure data consistency and avoid potential conflicts.
  • When a Lotus node starts up, it should bypass any migration or backfilling processes and directly begin indexing new tipsets in the ChainIndexer. This approach offers several benefits:
  1. Users can migrate the historical index at their own pace without incurring a performance penalty during node startup.
  2. The node can quickly respond to queries for new tipsets since indexing for these tipsets commences as soon as the node is operational.
  • By decoupling the migration process from the node startup, users gain flexibility in managing the transition to the new indexing system while maintaining optimal performance for real-time tipset indexing.
@aarshkshah1992 aarshkshah1992 added the kind/feature Kind: Feature label Sep 12, 2024
@github-project-automation github-project-automation bot moved this to 📌 Triage in FilOz Sep 12, 2024
@aarshkshah1992 aarshkshah1992 self-assigned this Sep 12, 2024
@BigLep
Member

BigLep commented Oct 9, 2024

🧵 From Slack conversations: "It will take 9-10 days to backfill the ChainIndexer all the way back to FEVM, but it is a one time cost, and you can copy the index over to other nodes, so you only need to run the backfill operation on one node."

A few questions/thoughts on this:

  1. What users do we need to proactively talk about this with? I know Glif is aware. Who else should we bring into this conversation?
  2. Do these users have multiple nodes so they can do a rolling upgrade?
  3. Related to number 2, is this a showstopper for any of these archival users?
  4. If this is a showstopper, what are our options?
  • Bootstrap chainindex.db from the 3 existing sqlite dbs (and then quickly identify areas that are missing data and backfill them)?
  • Have someone in the community generate chainindex.db and share with others (and include the accompanying verify commands)?
  • ???

@BigLep
Member

BigLep commented Oct 10, 2024

Some notes from 2024-10-09 Lotus standup focused on the "~9 days to backfill a FEVM-archival node" topic:

  • This whole conversation is starting from the perspective that asking an infra provider to dedicate days to backfilling is a big ask.
  • @jennijuju is going to get a list from @eshon during 2024-10-10 of the infra operators we should reach out to, to discuss how much of a show-stopper these days of backfilling are.
    • Once she has that, she'll create a table in filecoindevs Notion for tracking user communication and their needs.
  • Ideally @aarshkshah1992 will document more as to why there is no migration from the old dbs to the new one. For example, is it technically not possible? Will the wall-clock time to migrate from 3 to 1 be the same as just backfilling? Or even if it's not the same amount of time, is it still on the order of days? (Reducing upgrade time from 9 days to 2 days is fine, but in practice that isn't very helpful since operators still need to plan for multiple days which is still a problem.) Is the engineering time involved to do this too expensive?
    • This is effectively what @rvagg is asking about here: feat: migration("re-indexing"), backfilling and diasgnostics tooling for the ChainIndexer #12450 (comment)
    • On the topic of the legacy 3 database missing data, one could make the argument that "yeah, that's true, but it's what I've been using. Is it possible to use that incomplete data now while a background backfill process runs to fill it in over the next X days? I'm no worse off than I was before and don't have to have a node with production downtime for days on end. I understand that until the backfill process runs I won't necessarily be serving 100% correct info, but it's no worse than my previous situation."
  • A lot of good/valid thought has gone into the decisions not to improve the existing system we had and to make the ChainIndexer a hard cutover rather than keep both systems running in parallel. We have communicated the problems with the old system (e.g., "all of the above indices suffer from some or all of the following problems that need to be fixed" section of Meta Issue: Fixing high impact correctness and performance problems in ETH RPC API for snapshot synced nodes #12293), but given the big upgrade ask for archival nodes, it's probably worth being clear about why we had to make the decisions that are now causing this large upgrade time.
    • Avoiding missing data holes is obviously the killer reason as is a simpler system for maintainers so bugs can be debugged and fixed quicker.
    • An area I don't think we have called out is the space savings we get from normalizing the data across tables in one chainindex db rather than the duplication of data that happened across 3 separate dbs before.
      • Having a good clean data model with a good abstraction sets us up to rely on indices more for further performance gains and to leverage other database solutions.
  • There are good documentation items in chain/index/chain-indexing-overview-for-rpc-providers.MD that need to be done per the comments. Unless someone gets to it sooner, @BigLep will plan on doing a first pass on 2024-10-10.

@aarshkshah1992
Contributor Author

aarshkshah1992 commented Oct 10, 2024

@BigLep

🧵 From Slack conversations: "It will take 9-10 days to backfill the ChainIndexer all the way back to FEVM, but it is a one time cost, and you can copy the index over to other nodes, so you only need to run the backfill operation on one node."

A few questions/thoughts on this:

  1. What users do we need to proactively talk about this with? I know Glif is aware. Who else should we bring into this conversation?
  2. Do these users have multiple nodes so they can do a rolling upgrade?
  3. Related to number 2, is this a showstopper for any of these archival users?
  4. If this is a showstopper, what are our options?
  • Bootstrap chainindex.db from the 3 existing sqlite dbs (and then quickly identify areas that are missing data and backfill them)?
  • Have someone in the community generate chainindex.db and share with others (and include the accompanying verify commands)?
  • ???
  1. The long backfilling time is primarily a concern for archival nodes, not snapshot synced nodes. Protofire (Glif), Vulcanise and Blockscout are the three archival node operators I am aware of. Would love @eshon and @jennijuju to chime in if there are more. We've already proactively initiated conversations with Protofire/Glif about what's coming up. Ideally, we would deploy the ChainIndexer on their archival node first to serve a portion of their traffic and once we get a green-light from them -> onboard other RPC providers.

  2. Do these users have multiple nodes so they can do a rolling upgrade? I know Protofire and Vulcanise do. I am unsure about the others.

  3. If a user is only running one archival node, here are the options:

    • Either they spin up an additional archival node to be able to do a rolling upgrade OR
    • Get a copy of the new Index from other RPC providers/community where there is a trusted relationship.

I would like to strongly push back against the idea of using the old Index (which suffers from multiple problems which prompted this workstream in the first place) to build the new one.
There is a non-trivial amount of engineering effort involved in doing this right and I am fairly confident that if we go down this path -> we will end up having to spend engineering cycles down the road on debugging correctness problems with the ChainIndexer which are really happening because of the missing/inconsistent data in the old Index.

Also replied at:
#12450 (comment).

@eshon
Contributor

eshon commented Oct 10, 2024

Another archival node provider is Zondax, let me share details later today with Jenni.

@eshon
Contributor

eshon commented Oct 10, 2024

When you say "backfilling" do you specifically mean backfilling the FEVM indexes only would take 9 days?

Does this assume the node has already loaded all FEVM archival data since FEVM launch and is fully synced?

@aarshkshah1992
Contributor Author

@eshon Yes this assumes that the node has already loaded all FEVM archival data since FEVM launch and is fully synced.
"Backfilling" here refers to reading the chain state and indexing data that we need for faster RPC responses in the Index Database.

@aarshkshah1992
Contributor Author

aarshkshah1992 commented Oct 10, 2024

Results from testing on a dedicated Protofire FEVM Archival node. This node is doing nothing other than syncing the chain.

1) Backfilling 1 month of epochs backwards from the current chain head. Takes ~12 hours.

2024-10-08 18:06:43.525 starting chainindex validation; from epoch: 4336809; to epoch: 4250409; backfill: true; log-good: false
2024-10-08 18:15:49.508 -------- Chain index validation progress: 3.33%; Time elapsed: 9m5.98274048s
2024-10-08 18:27:27.114 -------- Chain index validation progress: 6.67%; Time elapsed: 20m43.58922645s
2024-10-08 18:42:42.489 -------- Chain index validation progress: 10.00%; Time elapsed: 35m58.963728548s
2024-10-08 19:01:34.272 -------- Chain index validation progress: 13.33%; Time elapsed: 54m50.747261985s
2024-10-08 19:27:53.144 -------- Chain index validation progress: 16.67%; Time elapsed: 1h21m9.618754411s
2024-10-08 20:06:49.629 -------- Chain index validation progress: 20.00%; Time elapsed: 2h0m6.103717312s
2024-10-08 21:10:58.370 -------- Chain index validation progress: 23.33%; Time elapsed: 3h4m14.844417783s
2024-10-08 22:17:20.862 -------- Chain index validation progress: 26.67%; Time elapsed: 4h10m37.337324591s
2024-10-08 23:26:31.600 -------- Chain index validation progress: 30.00%; Time elapsed: 5h19m48.07516203s
2024-10-09 00:31:51.979 -------- Chain index validation progress: 33.33%; Time elapsed: 6h25m8.453541436s
2024-10-09 01:58:04.654 -------- Chain index validation progress: 36.67%; Time elapsed: 7h51m21.128442883s
2024-10-09 03:06:59.404 -------- Chain index validation progress: 40.00%; Time elapsed: 9h0m15.878883989s
2024-10-09 03:19:06.227 -------- Chain index validation progress: 43.33%; Time elapsed: 9h12m22.702241843s
2024-10-09 03:29:00.946 -------- Chain index validation progress: 46.67%; Time elapsed: 9h22m17.420597166s
2024-10-09 03:38:47.714 -------- Chain index validation progress: 50.00%; Time elapsed: 9h32m4.189265746s
2024-10-09 03:48:33.692 -------- Chain index validation progress: 53.33%; Time elapsed: 9h41m50.167261601s
2024-10-09 03:58:44.708 -------- Chain index validation progress: 56.67%; Time elapsed: 9h52m1.183098448s
2024-10-09 04:09:45.871 -------- Chain index validation progress: 60.00%; Time elapsed: 10h3m2.346345951s
2024-10-09 04:21:08.180 -------- Chain index validation progress: 63.33%; Time elapsed: 10h14m24.654708182s
2024-10-09 04:32:44.268 -------- Chain index validation progress: 66.67%; Time elapsed: 10h26m0.742834532s
2024-10-09 04:43:09.888 -------- Chain index validation progress: 70.00%; Time elapsed: 10h36m26.36274386s
2024-10-09 04:51:30.369 -------- Chain index validation progress: 73.33%; Time elapsed: 10h44m46.843732873s
2024-10-09 05:02:44.664 -------- Chain index validation progress: 76.67%; Time elapsed: 10h56m1.138670683s
2024-10-09 05:14:33.169 -------- Chain index validation progress: 80.00%; Time elapsed: 11h7m49.644118179s
2024-10-09 05:26:52.491 -------- Chain index validation progress: 83.33%; Time elapsed: 11h20m8.965545335s
2024-10-09 05:39:28.663 -------- Chain index validation progress: 86.67%; Time elapsed: 11h32m45.138303295s
2024-10-09 05:51:50.451 -------- Chain index validation progress: 90.00%; Time elapsed: 11h45m6.925924816s
2024-10-09 06:03:02.344 -------- Chain index validation progress: 93.33%; Time elapsed: 11h56m18.819100394s
2024-10-09 06:15:01.300 -------- Chain index validation progress: 96.67%; Time elapsed: 12h8m17.774766296s
2024-10-09 06:26:34.288 -------- Chain index validation progress: 100.00%; Time elapsed: 12h19m50.762641635s
2024-10-09 06:26:34.305 -------- Chain index validation progress: 100.00%; Time elapsed: 12h19m50.779804039s

2) Backfilling 1 month of epochs post FEVM launch. Takes ~10 hours.

2024-10-09 06:34:36.198 starting chainindex validation; from epoch: 2769848; to epoch: 2683448; backfill: true; log-good: false
2024-10-09 06:54:32.847 -------- Chain index validation progress: 3.33%; Time elapsed: 19m56.648777171s
2024-10-09 07:13:29.590 -------- Chain index validation progress: 6.67%; Time elapsed: 38m53.391578991s
2024-10-09 07:31:37.937 -------- Chain index validation progress: 10.00%; Time elapsed: 57m1.738433863s
2024-10-09 07:53:53.763 -------- Chain index validation progress: 13.33%; Time elapsed: 1h19m17.564622641s
2024-10-09 08:17:20.598 -------- Chain index validation progress: 16.67%; Time elapsed: 1h42m44.400170981s
2024-10-09 08:38:23.602 -------- Chain index validation progress: 20.00%; Time elapsed: 2h3m47.403992297s
2024-10-09 08:59:40.515 -------- Chain index validation progress: 23.33%; Time elapsed: 2h25m4.31638391s
2024-10-09 09:22:41.837 -------- Chain index validation progress: 26.67%; Time elapsed: 2h48m5.638957169s
2024-10-09 09:46:41.586 -------- Chain index validation progress: 30.00%; Time elapsed: 3h12m5.387221278s
2024-10-09 10:09:15.496 -------- Chain index validation progress: 33.33%; Time elapsed: 3h34m39.29731905s
2024-10-09 10:30:27.827 -------- Chain index validation progress: 36.67%; Time elapsed: 3h55m51.628606445s
2024-10-09 10:51:02.016 -------- Chain index validation progress: 40.00%; Time elapsed: 4h16m25.817962431s
2024-10-09 11:13:19.400 -------- Chain index validation progress: 43.33%; Time elapsed: 4h38m43.201847276s
2024-10-09 11:35:17.255 -------- Chain index validation progress: 46.67%; Time elapsed: 5h0m41.0564808s
2024-10-09 11:58:17.438 -------- Chain index validation progress: 50.00%; Time elapsed: 5h23m41.240064043s
2024-10-09 12:19:09.401 -------- Chain index validation progress: 53.33%; Time elapsed: 5h44m33.202230962s
2024-10-09 12:39:43.318 -------- Chain index validation progress: 56.67%; Time elapsed: 6h5m7.120162996s
2024-10-09 13:00:36.205 -------- Chain index validation progress: 60.00%; Time elapsed: 6h26m0.007156519s
2024-10-09 13:22:07.533 -------- Chain index validation progress: 63.33%; Time elapsed: 6h47m31.334230385s
2024-10-09 13:42:22.805 -------- Chain index validation progress: 66.67%; Time elapsed: 7h7m46.606813157s
2024-10-09 14:02:50.702 -------- Chain index validation progress: 70.00%; Time elapsed: 7h28m14.503955704s
2024-10-09 14:23:17.452 -------- Chain index validation progress: 73.33%; Time elapsed: 7h48m41.253678763s
2024-10-09 14:42:55.491 -------- Chain index validation progress: 76.67%; Time elapsed: 8h8m19.292820409s
2024-10-09 15:05:11.490 -------- Chain index validation progress: 80.00%; Time elapsed: 8h30m35.292191527s
2024-10-09 15:27:14.396 -------- Chain index validation progress: 83.33%; Time elapsed: 8h52m38.197724796s
2024-10-09 15:49:58.772 -------- Chain index validation progress: 86.67%; Time elapsed: 9h15m22.573845885s
2024-10-09 16:12:19.897 -------- Chain index validation progress: 90.00%; Time elapsed: 9h37m43.698457415s
2024-10-09 16:33:45.127 -------- Chain index validation progress: 93.33%; Time elapsed: 9h59m8.929105029s
2024-10-09 16:56:38.008 -------- Chain index validation progress: 96.67%; Time elapsed: 10h22m1.809325232s
2024-10-09 17:19:30.228 -------- Chain index validation progress: 100.00%; Time elapsed: 10h44m54.030102146s
2024-10-09 17:19:30.354 -------- Chain index validation progress: 100.00%; Time elapsed: 10h44m54.155308084s

3) Backfilling 1 month of epochs mid-way between FEVM launch and the current chain head. Takes ~13 hours.

2024-10-09 18:06:50.812 starting chainindex validation; from epoch: 3511567; to epoch: 3425167; backfill: true; log-good: false
2024-10-09 18:22:00.482 -------- Chain index validation progress: 3.33%; Time elapsed: 15m9.670482824s
2024-10-09 18:35:25.365 -------- Chain index validation progress: 6.67%; Time elapsed: 28m34.553606048s
2024-10-09 18:48:19.165 -------- Chain index validation progress: 10.00%; Time elapsed: 41m28.353796507s
2024-10-09 19:01:29.618 -------- Chain index validation progress: 13.33%; Time elapsed: 54m38.806024773s
2024-10-09 19:15:12.071 -------- Chain index validation progress: 16.67%; Time elapsed: 1h8m21.259877238s
2024-10-09 19:30:44.968 -------- Chain index validation progress: 20.00%; Time elapsed: 1h23m54.15652168s
2024-10-09 19:50:59.944 -------- Chain index validation progress: 23.33%; Time elapsed: 1h44m9.132300745s
2024-10-09 20:19:22.942 -------- Chain index validation progress: 26.67%; Time elapsed: 2h12m32.130043369s
2024-10-09 20:52:27.399 -------- Chain index validation progress: 30.00%; Time elapsed: 2h45m36.587897912s
2024-10-09 21:20:40.064 -------- Chain index validation progress: 33.33%; Time elapsed: 3h13m49.25204028s
2024-10-09 21:50:49.975 -------- Chain index validation progress: 36.67%; Time elapsed: 3h43m59.162984189s
2024-10-09 22:18:22.220 -------- Chain index validation progress: 40.00%; Time elapsed: 4h11m31.408482377s
2024-10-09 22:45:28.032 -------- Chain index validation progress: 43.33%; Time elapsed: 4h38m37.22010544s
2024-10-09 23:12:16.162 -------- Chain index validation progress: 46.67%; Time elapsed: 5h5m25.350077042s
2024-10-09 23:39:37.234 -------- Chain index validation progress: 50.00%; Time elapsed: 5h32m46.422173688s
2024-10-10 00:10:51.416 -------- Chain index validation progress: 53.33%; Time elapsed: 6h4m0.604601922s
2024-10-10 00:46:44.348 -------- Chain index validation progress: 56.67%; Time elapsed: 6h39m53.536003528s
2024-10-10 01:31:14.595 -------- Chain index validation progress: 60.00%; Time elapsed: 7h24m23.783330796s
2024-10-10 04:05:18.792 -------- Chain index validation progress: 63.33%; Time elapsed: 9h58m27.980538058s
2024-10-10 04:25:13.568 -------- Chain index validation progress: 66.67%; Time elapsed: 10h18m22.756382023s
2024-10-10 04:45:33.326 -------- Chain index validation progress: 70.00%; Time elapsed: 10h38m42.514054977s
2024-10-10 05:05:31.425 -------- Chain index validation progress: 73.33%; Time elapsed: 10h58m40.613381271s
2024-10-10 05:25:52.663 -------- Chain index validation progress: 76.67%; Time elapsed: 11h19m1.850925885s
2024-10-10 05:45:20.150 -------- Chain index validation progress: 80.00%; Time elapsed: 11h38m29.338635039s
2024-10-10 06:05:00.198 -------- Chain index validation progress: 83.33%; Time elapsed: 11h58m9.386637783s
2024-10-10 06:24:40.726 -------- Chain index validation progress: 86.67%; Time elapsed: 12h17m49.91455223s
2024-10-10 06:43:07.535 -------- Chain index validation progress: 90.00%; Time elapsed: 12h36m16.723633668s
2024-10-10 07:00:46.843 -------- Chain index validation progress: 93.33%; Time elapsed: 12h53m56.031377256s
2024-10-10 07:20:33.779 -------- Chain index validation progress: 96.67%; Time elapsed: 13h13m42.967073078s
2024-10-10 07:38:29.397 -------- Chain index validation progress: 100.00%; Time elapsed: 13h31m38.585839486s
2024-10-10 07:38:29.963 -------- Chain index validation progress: 100.00%; Time elapsed: 13h31m39.151568536s

I am now running the index "doctor"/validation on these to sanity check that the backfilled data is in line with the chain state.
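
For reference, a rough extrapolation from the three runs above to a full backfill can be sketched as below. The assumptions here are mine and not measured: the three one-month samples are treated as representative of the whole range, the backfill is assumed to run sequentially, and the range is taken from the lowest epoch in run 2 (2683448) to the head epoch in run 1 (4336809).

```python
EPOCHS_PER_MONTH = 86_400        # 30 days * 2880 epochs/day, as in each run above
HEAD_EPOCH = 4_336_809           # "from epoch" of run 1
FEVM_START_EPOCH = 2_683_448     # "to epoch" of run 2

def to_seconds(hours, minutes, seconds):
    return hours * 3600 + minutes * 60 + seconds

# Wall-clock time of each one-month run, rounded to whole seconds.
samples = [
    to_seconds(12, 19, 50),      # run 1: month ending at the chain head
    to_seconds(10, 44, 54),      # run 2: month right after FEVM launch
    to_seconds(13, 31, 38),      # run 3: month mid-way
]

mean_seconds_per_month = sum(samples) / len(samples)
months = (HEAD_EPOCH - FEVM_START_EPOCH) / EPOCHS_PER_MONTH
estimated_days = months * mean_seconds_per_month / 86_400

print(f"{months:.1f} months of epochs, ~{estimated_days:.1f} days to backfill")
```

Under these assumptions the estimate lands between 9 and 10 days, which is in line with the 9-10 day figure quoted from Slack earlier in this thread.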

@BigLep
Member

BigLep commented Oct 15, 2024

@aarshkshah1992 : can we get final numbers on chainindex.db size for the full archival node? I know there were some numbers here, but I'm not sure how many tipsets that is and I'd also like to get a larger time range. I want to be able to make a statement like "As of 202410, ChainIndexer will accumulate approximately XMiB per day of data, or XGiB per month" in #12600

@BigLep
Member

BigLep commented Oct 16, 2024

@aarshkshah1992 : can we get final numbers on chainindex.db size for the full archival node? I know there were some numbers here, but I'm not sure how many tipsets that is and I'd also like to get a larger time range. I want to be able to make a statement like "As of 202410, ChainIndexer will accumulate approximately XMiB per day of data, or XGiB per month" in #12600

I'm seeing our docs already had a statement that "The ChainIndex will consume ~10GB of storage per month of tipsets (e.g., ~86400 epochs)". I guess that's all I need but it would be good to have an official record of it in here like you have with backfill times in #12453 (comment)

@jennijuju
Member

Would love @eshon and @jennijuju to chime in if there are more.

Talked with Eva, and the summary (in Notion) has been shared with the team.

@aarshkshah1992
Contributor Author

@BigLep We have yet to index the entire history all the way up to FEVM launch. We were waiting on the reviews to land/get addressed so we can be sure that we're using the same indexing code as users.

Looks like the PR will be ready tomorrow (all reviews will have been addressed) -> will then kick-off an indexing of the entire state and also get all the numbers you need here.

@aarshkshah1992
Contributor Author

@BigLep

The ChainIndex will consume ~10GB of storage per month of tipsets (e.g., ~86400 epochs)

That does not sound correct. Where did you get it from? Can we please wait on the next round of archival node testing to get the final numbers? I'll make sure to document them here once we have them.

@BigLep
Member

BigLep commented Oct 16, 2024

@aarshkshah1992

The ChainIndex will consume ~10GB of storage per month of tipsets (e.g., ~86400 epochs)

That does not sound correct. Where did you get it from ? Please can we wait on the next round of archival node testing to get the final numbers ? I'll make sure to document them here once we have them.

Ack, good to know. I can't recall / find where I got these numbers from. I was surprised to see them, so maybe I put them in as fillers. I don't remember. Anyways, I will put X placeholders for now and we'll update once official results have been published here.

@aarshkshah1992
Contributor Author

@BigLep

Please see https://filecoinproject.slack.com/archives/CP50PPW2X/p1729413621133599.

~10G growth in the Index DB size per month is actually correct.
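
To turn that into the per-day figure asked for above, a quick back-of-the-envelope conversion follows. It assumes "~10G" means roughly 10 GiB per 86400 epochs and that growth is spread evenly across epochs, which is a simplification.

```python
EPOCHS_PER_DAY = 2_880           # one epoch every 30 seconds
EPOCHS_PER_MONTH = 86_400        # 30 days of epochs
GIB_PER_MONTH = 10               # assumption: "~10G" read as 10 GiB

mib_per_day = GIB_PER_MONTH * 1024 * EPOCHS_PER_DAY / EPOCHS_PER_MONTH
kib_per_epoch = GIB_PER_MONTH * 1024 * 1024 / EPOCHS_PER_MONTH

print(f"~{mib_per_day:.0f} MiB per day, ~{kib_per_epoch:.0f} KiB per epoch")
```

That works out to roughly 341 MiB per day, or about 121 KiB per epoch, under the stated assumptions.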

@aarshkshah1992
Contributor Author

The ChainIndexer PR is now merged. Keeping this issue open till RPC providers upgrade and finish backfilling the Index.

@rjan90
Contributor

rjan90 commented Jan 6, 2025

The ChainIndexer PR is now merged. Keeping this issue open till RPC providers upgrade and finish backfilling the Index.

Will close this issue now that we have confirmation from Protofire that they upgraded to the new ChainIndexer, ran the backfilling process without encountering any issues, and have had it running stably over the holidays.
