exp/services/ledgerexporter: Research Spike - Alternative storage methods #4473
Comments
Some initial notes I made regarding using BitTorrent: There are a few benefits:
There are some downsides:
Thinking about it more, there is one more downside. S3 (or any other cloud file store with an optional CDN) can give everyone very quick access (around 100 milliseconds or less) to any ledger meta. That probably won't be possible with BitTorrent. BitTorrent will, however, make replication easier for orgs/people who would like to host fast-access archives.
That's very true @bartekn: there's an initial startup delay while connecting to the swarm before you can download. If you want more than a handful of ledgers, though, that startup time should be amortized and hardly affect the overall time.
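To put rough numbers on the amortization argument, here is a minimal sketch. The 30-second swarm startup and 0.1-second per-ledger transfer time are made-up placeholders, not measurements, and `avg_seconds_per_ledger` is a hypothetical helper name.

```python
def avg_seconds_per_ledger(startup_s: float, per_ledger_s: float, n_ledgers: int) -> float:
    """Average wall-clock time per ledger when a one-off startup cost is spread over n ledgers."""
    if n_ledgers <= 0:
        raise ValueError("n_ledgers must be positive")
    return (startup_s + per_ledger_s * n_ledgers) / n_ledgers


# Placeholder figures (assumptions, not benchmarks):
STARTUP_S = 30.0     # time to join the swarm before any data arrives
PER_LEDGER_S = 0.1   # steady-state transfer time per ledger

if __name__ == "__main__":
    for n in (1, 100, 10_000):
        print(n, round(avg_seconds_per_ledger(STARTUP_S, PER_LEDGER_S, n), 3))
```

With these placeholder values a single-ledger fetch is dominated by startup, while a bulk fetch approaches the steady-state per-ledger time. Real figures would have to be measured.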
This research spike entails exploring alternative ways to store and distribute unpacked ledger metadata (txmeta).
Currently, we've uploaded unpacked txmeta for pubnet through July 2022 to S3. Here are some stats:
With that in mind, there are two avenues to this spike:
Is there a better way to structure these files? For example, we could have a folder for each checkpoint; we could combine all ledgers in a checkpoint into a single file; etc. The task is to come up with some strategies and analyse the pros/cons of each (things like storage/bandwidth costs, etc.)
Is there a better way to distribute these files? A back-of-the-envelope calculation tells us that the majority of the cost of distributing these files comes from egress bandwidth. We also want to let people build & store these files themselves, yet minimize the risk of people using rogue/corrupt/malformed txmeta. The task is to come up with alternative transport layers and analyse their tradeoffs. For example, BitTorrent gives us bandwidth decentralization and integrity, but it's harder to do incremental updates. We can batch torrents by some ledger range, but is that too hard? What happens when we upgrade the meta format? What about IPFS? Others?
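As a starting point for discussion, here is a rough, transport-agnostic sketch of one option touching both avenues: group ledgers into one file per checkpoint and publish a manifest of SHA-256 hashes so downloaders can detect corrupt or tampered batches. It assumes the usual 64-ledger checkpoint spacing. The `txmeta/<checkpoint>.xdr` key scheme, the function names, and the naive byte concatenation are all hypothetical and only for illustration; a real batch would need proper framing.

```python
import hashlib

CHECKPOINT_FREQUENCY = 64  # assumption: standard history-archive checkpoint spacing


def checkpoint_for(seq: int) -> int:
    """Return the checkpoint ledger (last ledger of the batch) that covers seq."""
    if seq < 1:
        raise ValueError("ledger sequences start at 1")
    return (seq // CHECKPOINT_FREQUENCY) * CHECKPOINT_FREQUENCY + CHECKPOINT_FREQUENCY - 1


def batch_key(seq: int) -> str:
    """Hypothetical object key for the single file holding seq's whole checkpoint."""
    return f"txmeta/{checkpoint_for(seq)}.xdr"


def build_manifest(ledgers: dict[int, bytes]) -> dict[str, str]:
    """Concatenate each checkpoint's ledgers in sequence order and hash each batch."""
    batches: dict[str, bytes] = {}
    for seq in sorted(ledgers):
        key = batch_key(seq)
        batches[key] = batches.get(key, b"") + ledgers[seq]
    return {key: hashlib.sha256(data).hexdigest() for key, data in batches.items()}


def verify_batch(manifest: dict[str, str], key: str, data: bytes) -> bool:
    """True only if the key is listed in the manifest and the data matches its hash."""
    expected = manifest.get(key)
    return expected is not None and hashlib.sha256(data).hexdigest() == expected
```

The manifest itself would still have to come from a trusted source (or be signed), and a meta format upgrade would change every batch hash, so this only addresses corruption in transit, not the incremental-update or versioning questions.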