Design backend for serving TxMeta #4910

tamirms · 2023-06-15T08:57:30Z

The results from https://docs.google.com/document/d/1YETNALx5EzqZDNSVWzTfaK5Ogw84PsBlrt64nOr-Njg/edit?usp=sharing show that precomputing TxMeta is very effective in speeding up ingestion. We need to design a solution for distributing TxMeta so Horizon operators (along with Hubble) can ingest precomputed TxMeta instead of relying on captive core.

Using a blobstore like S3 or GCS (Google Cloud Storage) is appealing because we would not need to build any infrastructure to serve requests for TxMeta. However, there are concerns that download latency would be too high. Perhaps these concerns can be mitigated by using Cloudflare for caching and by batching several ledgers together in one file.
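To make the batching idea concrete, here is a minimal sketch of how a client could derive the object name for the batch that holds a given ledger. The batch size of 64, the `ledgers/` prefix, and the `.xdr.gz` suffix are all assumptions made up for illustration; nothing in this issue fixes a layout.

```go
package main

import "fmt"

// batchSize is a hypothetical number of ledgers stored in one object.
const batchSize = 64

// objectKey returns the (hypothetical) blobstore key of the batch file
// that would contain ledgerSeq. Keys are zero-padded so that they sort
// lexicographically in ledger order.
func objectKey(ledgerSeq uint32) string {
	start := ledgerSeq - ledgerSeq%batchSize // first sequence in the batch
	end := start + batchSize - 1             // last sequence in the batch
	return fmt.Sprintf("ledgers/%010d-%010d.xdr.gz", start, end)
}

func main() {
	// Ledger 130 falls in the batch covering sequences 128..191.
	fmt.Println(objectKey(130))
}
```

Because the key is a pure function of the sequence number, a client can fetch any ledger with a single GET and no index lookup, and a CDN can cache whole batches.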

To complete this issue we need a design document which proposes a solution for distributing TxMeta and analyzes the cost and performance of the solution.

sreuland · 2023-06-22T18:02:25Z

I'd throw a Kafka broker in for consideration as a 'blobstore' candidate and an alternative to proprietary cloud options. It is available on most cloud infrastructure providers as a 'managed' deployment, or it can be deployed internally (but that requires substantial ops support). Kafka has been used in terabyte-scale situations and has a client-broker transport that handles throughput, high availability, and message delivery. One idea for the model would be message_id: <ledger_id> and message_payload: <base64_ledger_txmeta>. Additional Kafka message headers could be added to let clients do proactive filtering/routing based on other attributes of the ledger.
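A rough sketch of that message model, using plain Go types rather than any particular Kafka client library. The `LedgerMessage` and `Header` types, the `newLedgerMessage` helper, and the `protocol_version` header are hypothetical names chosen for illustration only.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strconv"
)

// Header is a key/value pair attached to a message for client-side filtering.
type Header struct {
	Key, Value string
}

// LedgerMessage mirrors the proposed model: the message id is the ledger
// sequence and the payload is the base64-encoded TxMeta for that ledger.
type LedgerMessage struct {
	Key     string
	Payload string
	Headers []Header
}

// newLedgerMessage builds a message for one ledger. The protocol_version
// header is just an example of a filterable ledger attribute (an assumption).
func newLedgerMessage(seq uint32, txMeta []byte, protocolVersion uint32) LedgerMessage {
	return LedgerMessage{
		Key:     strconv.FormatUint(uint64(seq), 10),
		Payload: base64.StdEncoding.EncodeToString(txMeta),
		Headers: []Header{
			{Key: "protocol_version", Value: strconv.FormatUint(uint64(protocolVersion), 10)},
		},
	}
}

func main() {
	msg := newLedgerMessage(42, []byte("meta"), 3)
	fmt.Printf("%+v\n", msg)
}
```

A consumer could inspect the headers and skip the base64 decode for ledgers it does not care about.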

The notion of a consumer offset in the protocol could be an interesting way to enable random access to ledgers by sequence number, thereby allowing clients to consume historical ledgers in a custom ranged-replay use case, analogous to reingest range <from> <to>.

Using the Kafka message offset for random access to ledgers would entail using a single-partition topic strategy and initially publishing messages to the topic starting with the genesis ledger at offset=0. That way the insertion order of messages is preserved and the offset mirrors the ledger sequence number: ledger N will be at offset N-1.
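The offset arithmetic is small enough to sketch directly. This assumes a single-partition topic whose first message (offset 0) is the genesis ledger with sequence 1, as described above; the function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// genesisLedgerSeq is the sequence number assumed to sit at offset 0.
const genesisLedgerSeq = 1

// offsetForLedger maps a ledger sequence to its offset: ledger N -> N-1.
func offsetForLedger(seq uint32) (int64, error) {
	if seq < genesisLedgerSeq {
		return 0, errors.New("ledger sequence precedes genesis")
	}
	return int64(seq) - genesisLedgerSeq, nil
}

// ledgerForOffset is the inverse mapping: offset K -> ledger K+1.
func ledgerForOffset(offset int64) (uint32, error) {
	if offset < 0 || offset >= math.MaxUint32 {
		return 0, errors.New("offset out of range")
	}
	return uint32(offset + genesisLedgerSeq), nil
}

// offsetRange converts a replay range of ledgers (inclusive on both ends)
// into the first and last offsets a consumer would need to read.
func offsetRange(from, to uint32) (first, last int64, err error) {
	if from > to {
		return 0, 0, errors.New("from must not exceed to")
	}
	if first, err = offsetForLedger(from); err != nil {
		return 0, 0, err
	}
	if last, err = offsetForLedger(to); err != nil {
		return 0, 0, err
	}
	return first, last, nil
}

func main() {
	first, last, err := offsetRange(100, 200)
	fmt.Println(first, last, err) // offsets 99 through 199
}
```

The mapping only holds if every ledger is published exactly once and in order; a gap or duplicate in the topic would shift all later offsets.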

sreuland · 2023-10-16T18:00:56Z

The HLD (high-level design) may want to cover if/how the existing monorepo ingestion SDK will enable access to the new TxMeta source: would it be a new ledgerbackend? It should also identify which programming languages are highly desired for an ingestion SDK with this new TxMeta capability but are not supported yet. Is the Go monorepo the only one currently, and are any other languages minimally required as part of the new TxMeta solution?

Also, have the ingestion SDK provide an outbound path from the ledgerbackend interface, so apps can build publishers to remote TxMeta sources as well; the same SDK can then serve both the pub and sub sides.

mollykarcher · 2023-10-19T18:34:38Z

> The HLD (high-level design) may want to cover if/how the existing monorepo ingestion SDK will enable access to the new TxMeta source: would it be a new ledgerbackend? It should also identify which programming languages are highly desired for an ingestion SDK with this new TxMeta capability but are not supported yet. Is the Go monorepo the only one currently, and are any other languages minimally required as part of the new TxMeta solution?
>
> Also, have the ingestion SDK provide an outbound path from the ledgerbackend interface, so apps can build publishers to remote TxMeta sources as well; the same SDK can then serve both the pub and sub sides.

+1 to this. I think it makes sense to productize/productionalize both the publish and consumption side. That is, there is an SDK/package that produces this new TxMeta ledger backend, and there is also one that consumes from it. I'm not particularly opinionated on whether those should be the same package or different packages, but I think keeping them both in the ingest package could definitely make sense.
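One way to picture a package that covers both sides is a pair of small interfaces with a shared payload type. This is only a sketch: `LedgerMeta`, `MetaPublisher`, `MetaSource`, and `memoryStore` are hypothetical names, not the existing ingest or ledgerbackend API, and the in-memory store stands in for whatever remote storage the design ends up choosing.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// LedgerMeta is a stand-in for the encoded TxMeta of a single ledger.
type LedgerMeta struct {
	Sequence uint32
	XDR      []byte
}

// MetaPublisher is the outbound (pub) side: it writes TxMeta to a remote source.
type MetaPublisher interface {
	Publish(m LedgerMeta) error
}

// MetaSource is the inbound (sub) side: it reads TxMeta by ledger sequence.
type MetaSource interface {
	GetLedger(seq uint32) (LedgerMeta, error)
}

// memoryStore implements both interfaces in memory, for illustration only.
type memoryStore struct {
	mu      sync.Mutex
	ledgers map[uint32]LedgerMeta
}

func newMemoryStore() *memoryStore {
	return &memoryStore{ledgers: make(map[uint32]LedgerMeta)}
}

// Publish stores the ledger, overwriting any earlier copy with the same sequence.
func (s *memoryStore) Publish(m LedgerMeta) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.ledgers[m.Sequence] = m
	return nil
}

// GetLedger returns the stored ledger or an error if it was never published.
func (s *memoryStore) GetLedger(seq uint32) (LedgerMeta, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	m, ok := s.ledgers[seq]
	if !ok {
		return LedgerMeta{}, errors.New("ledger not found")
	}
	return m, nil
}

func main() {
	store := newMemoryStore()
	var pub MetaPublisher = store
	var src MetaSource = store

	_ = pub.Publish(LedgerMeta{Sequence: 7, XDR: []byte{0x01}})
	m, err := src.GetLedger(7)
	fmt.Println(m.Sequence, len(m.XDR), err)
}
```

Keeping the two interfaces separate lets an application depend on only the side it needs, whether or not both end up living in the same package.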

sreuland · 2023-11-01T20:43:18Z

@chowbao, just wanted to reference this design ticket earmarked for remote TxMeta storage. As your design proposal overlaps with it, we'll want to incorporate your summary here. Thanks!

tamirms added horizon snapshots labels Jun 15, 2023

tamirms added this to Platform Scrum Jun 15, 2023

github-project-automation bot moved this to Backlog in Platform Scrum Jun 15, 2023

tamirms moved this from Backlog to Next Sprint Proposal in Platform Scrum Jun 15, 2023

tamirms mentioned this issue Jun 15, 2023

Add support for ingesting via precomputed TxMeta in Horizon #4911

Closed

5 tasks

mollykarcher added the performance label and removed the snapshots label Jun 15, 2023

mollykarcher moved this from Next Sprint Proposal to Current Sprint in Platform Scrum Jun 20, 2023

Shaptic moved this from Current Sprint to Next Sprint Proposal in Platform Scrum Jun 20, 2023

mollykarcher moved this from Next Sprint Proposal to Current Sprint in Platform Scrum Jul 19, 2023

mollykarcher moved this from Current Sprint to Next Sprint Proposal in Platform Scrum Aug 29, 2023

mollykarcher moved this from Next Sprint Proposal to Current Sprint in Platform Scrum Aug 29, 2023

mollykarcher closed this as completed Nov 21, 2023

github-project-automation bot moved this from Current Sprint to Done in Platform Scrum Nov 21, 2023
