Make a crate of grpc_stream #397

rtso · 2024-06-10T21:43:17Z

Description

Clean up transaction_stream.rs' (renamed from grpc_stream.rs`) so we can make a crate of it, which will be used in the SDK.

TransactionStream

transaction_stream.rs defines a struct TransactionStream that has these methods:

init_stream establishes a connection with the grpc stream
get_next_transaction_batch fetches the next transaction response from grpc
is_end_of_stream signals if you've reached the end of the stream
reconnect_to_grpc will attempt to reconnect with grpc if there's an error with the stream
reconnect_to_grpc_with_retries is a wrapper around reconnect_to_grpc that adds retry logic

How processors use TransactionStream

In worker.rs, fetch_transactions_from_transaction_stream will initialize and manage the TransactionStream, filtering transactions, sending the transactions to channel, handle reconnection if necessary.

Testing

cargo run --release -- -c config.yaml

banool

Apologies if I left comments for things that are just copy pasted. If this isn't the right place to address them, please open issues / a tracker item for those rather than fixing them here.

banool · 2024-06-13T09:20:14Z

rust/aptos-indexer-transaction-stream/src/utils/counters.rs

+    }
+}
+
+/// These metrics are temporary (suffixed with _temp to avoid conflict with metrics in processor crate)


Does it matter if they clash, don't they still mean the same thing?

Prometheus will try to register the metric with the same name and will error. I can't import the metrics from processor crate because of cyclic dep.

I suggest renaming them and changing them- ideally each crate is responsible for emitting its own metrics; if anything, we can allow for a "metrics_prefix" and then allow dynamically creating them (instead of a single global lazy static) per instance of the streamer.

We can then create nice and specific metrics- maybe an "items received", and then a "filtered" either book or separate series, etc

Ie there should not be any overlap between crates and the SDK because the SDK will theoretically never have the same context on what's happening inside a given step as the step does (especially for complicated things like this)

Ie we should move away from thinking "this is the processor GRPC client" and think of it as "this is the way anyone in any application, even python, should use our GRPC APIs, and will put it on crates.io and it'll go viral and the world will love us"- I find that generalized approach helps create nice, hermetic, batteries-included crates, vs crates that rely on specific behavior from other implementing crates for our own purposes (within reason ofc)

Also duplicate metrics- ie the GRPC emitting channel size and also processor with the instrumented channel- it's fine :-)

Got it, didn't realize you'd be using this crate alongside the old code and the registration itself would clash.

These clients support setting a prefix. If we don't do that today, I agree we should. Example from API Gateway: https://github.com/aptos-labs/api-gateway/blob/fa335677d1c80de9dc103396d563d4e47006a661/src/metrics/labels.rs#L37.

banool · 2024-06-13T09:24:55Z

rust/aptos-indexer-transaction-stream/src/utils/counters.rs

+    ReceivedTxnsFromGrpc,
+    // Received transactions from GRPC. Sending transactions to channel.
+    ProcessedBatch,
+    // Processor finished processing one batch of transaction
+    ProcessedMultipleBatches, // Processor finished processing multiple batches of transactions


Can we do this instead:

pub enum ProcessorStep { /// Received transactions from GRPC. Sending transactions to channel. ReceivedTxnsFromGrpc, /// Processor finished processing one batch of transaction ProcessedBatch, /// Processor finished processing multiple batches of transactions ProcessedMultipleBatches, }

Also what do we mean by "Sending transactions to channel.". Did that happen as part of this step or is that what happens next?

Sending transactions to channel is what happens next. This is copy/pasted from the processors crate. We'll redo all these metrics as part of SDK, so this is just throwaway for backwards compatibility for our dashboards.

Can we kill this enum and giant, single overloaded logging method as a pattern? It's quite awful and confusing hahaha

giant, single overloaded logging method as a pattern

We can kill it in this crate. For more context, "giant, single overloaded logging method as a pattern" got introduced in the indexer grpc stack to reduce duplicated code and make sure we are logging all the same metrics in every relevant place. I'm open to other suggestions on how to do this :)

I actually don't mind this enum ¯_(ツ)_/¯. I like it for the reason Renee mentions, it's easier to do things consistently. If anything I'd go further and wrap this in a struct that also contains all the relevant labels, and then some methods on that struct bump the relevant counts (so the counters become private to this file). You can see the API Gateway code for some inspiration on that: https://github.com/aptos-labs/api-gateway/blob/fa335677d1c80de9dc103396d563d4e47006a661/src/metrics/labels.rs#L134-L135. In this case we only bump the relevant metrics when the struct is dropped, but you could also have explicit functions for it (which makes more sense in this case).

rust/processor/src/worker.rs

rust/aptos-indexer-transaction-stream/src/transaction_stream.rs

banool · 2024-06-13T09:41:17Z

rust/aptos-indexer-transaction-stream/src/transaction_stream.rs

+    );
+
+    transaction_stream.init_stream().await;
+    let mut send_ma = MovingAverage::new(3000);


Should this 3k value be configurable? (with a default).

rtso added 2 commits June 10, 2024 17:42

Make a crate

89de746

Clean up const

cfe52cd

rtso requested review from CapCap and a team June 10, 2024 21:49

rtso added 4 commits June 10, 2024 17:59

Fix lint

6751097

Actually move it to its own crate

4e5e1cb

Move txn_sender out of TransactionStream struct

fe52037

make methods pub

d15c17d

banool requested changes Jun 13, 2024

View reviewed changes

rtso added 5 commits June 13, 2024 16:37

TransactionStreamConfig

93294ff

Move chunking & filtering to processor crate

63fbfb7

Result everywhere

6185d59

Remove processor stuff in txn stream crate

2c04656

remove rest of panics

682703f

rtso requested a review from banool June 16, 2024 21:17

rtso added 2 commits June 16, 2024 17:31

Remove metrics

2699eee

Rename logs

3c8896b

rtso mentioned this pull request Jul 16, 2024

Create txn stream crate in this repo aptos-labs/aptos-indexer-processor-sdk#21

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make a crate of grpc_stream #397

Make a crate of grpc_stream #397

rtso commented Jun 10, 2024 •

edited

Loading

banool left a comment

banool Jun 13, 2024

rtso Jun 13, 2024 •

edited

Loading

CapCap Jun 13, 2024

CapCap Jun 13, 2024 •

edited

Loading

CapCap Jun 13, 2024 •

edited

Loading

banool Jun 13, 2024

banool Jun 13, 2024

banool Jun 13, 2024

rtso Jun 13, 2024

CapCap Jun 13, 2024

rtso Jun 13, 2024 •

edited

Loading

banool Jun 13, 2024

banool Jun 13, 2024

Make a crate of grpc_stream #397

Are you sure you want to change the base?

Make a crate of grpc_stream #397

Conversation

rtso commented Jun 10, 2024 • edited Loading

Description

TransactionStream

How processors use TransactionStream

Testing

banool left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rtso Jun 13, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

CapCap Jun 13, 2024 • edited Loading

Choose a reason for hiding this comment

CapCap Jun 13, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rtso Jun 13, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rtso commented Jun 10, 2024 •

edited

Loading

rtso Jun 13, 2024 •

edited

Loading

CapCap Jun 13, 2024 •

edited

Loading

CapCap Jun 13, 2024 •

edited

Loading

rtso Jun 13, 2024 •

edited

Loading