Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use trace_replayBlockTransactions API for faster tracing #1543

Merged
merged 6 commits into from
Mar 14, 2019

Conversation

goodsoft
Copy link
Contributor

@goodsoft goodsoft commented Mar 9, 2019

Motivation

Currently indexer is using trace_replayTransaction JSONRPC method for fetching internal transaction.
This caused major slowdown of indexing and forced implementation of ad-hoc elimination rules (e.g. dropping simple token transfers).

Parity supports another API method: trace_replayBlockTransactions, which fetches internal transactions for the entire block.
Some utterly unscientific tests showed up to 20% speed boost in comparison with current approach, with the added bonus of fetching all transactions, including previously eliminated ones.

Also, previously realtime block fetcher needed to check, whether any particular transaction was a contract, which involved sending multiple eth_getCode API requests, which further slowed down new block arrival.
Some more unscientific benchmarks showed that employing trace_replayBlockTransactions method the average time of a new block arrival into database (inserted_at - timestamp) went down from ~2m15s to ~0m50s.

Changelog

  • Add internal_transactions_indexed_at field to blocks and update InternalTransactions import runner to handle it.
  • Replace Chain.stream_transactions_with_unfetched_internal_transactions with Chain.stream_blocks_with_unfetched_internal_transactions.
  • Change EthereumJSONRPC.fetch_internal_transactions to accept list of block numbers and use trace_replayBlockTransactions API method
  • Replace this method with a stub for Geth, as it has a corresponding method, but can't recover association between internal transaction and normal transaction, and has 128 block limit anyway.
  • Increase timeout for JSONRPC API requests to 10 minutes, as some blocks from Ethereum DDoS times caused requests to take 1m30s and more.
  • Update realtime, catchup and internal transactions fetchers to use the aforementioned methods, remove unneeded checks for simple token transfers.
  • Update tests for these changes, and also fix some inconsistencies in test data, which old code was agnostic of.

Upgrading

Migration is added for adding internal_transactions_indexed_at to blocks table.
Theoretically, it is possible to update this field with some temporary value for blocks, which only have simple token transfer transactions and already fetched transactions to speed up indexing of interesting transactions, and then reset it back to nil for a second pass to index all remaining simple transactions.

@ghost ghost assigned goodsoft Mar 9, 2019
@ghost ghost added the in progress label Mar 9, 2019
@coveralls
Copy link

coveralls commented Mar 9, 2019

Pull Request Test Coverage Report for Build fd3cbba8-187f-4b4d-97a1-cb68011e2249

  • 37 of 47 (78.72%) changed or added relevant lines in 7 files are covered.
  • 26 unchanged lines in 5 files lost coverage.
  • Overall coverage remained the same at 82.44%

Changes Missing Coverage Covered Lines Changed/Added Lines %
apps/ethereum_jsonrpc/lib/ethereum_jsonrpc/parity.ex 10 11 90.91%
apps/explorer/lib/explorer/chain/import/runner/internal_transactions.ex 9 10 90.0%
apps/indexer/lib/indexer/block/realtime/fetcher.ex 3 4 75.0%
apps/indexer/lib/indexer/block/catchup/fetcher.ex 3 6 50.0%
apps/indexer/lib/indexer/internal_transaction/fetcher.ex 9 13 69.23%
Files with Coverage Reduction New Missed Lines %
apps/ethereum_jsonrpc/lib/ethereum_jsonrpc/http.ex 1 73.47%
apps/ethereum_jsonrpc/lib/ethereum_jsonrpc.ex 1 87.5%
apps/ethereum_jsonrpc/lib/ethereum_jsonrpc/parity.ex 2 83.33%
apps/indexer/lib/indexer/block/realtime/fetcher.ex 5 30.93%
apps/indexer/lib/indexer/internal_transaction/fetcher.ex 17 57.14%
Totals Coverage Status
Change from base Build 8b2c4541-5b86-4250-856b-723ea36d43fb: 0.0%
Covered Lines: 4216
Relevant Lines: 5114

💛 - Coveralls

@goodsoft goodsoft force-pushed the gs-block-internal-transactions branch from baef3f2 to 5184994 Compare March 10, 2019 18:18
@goodsoft goodsoft added ready for review This PR is ready for reviews. and removed in progress labels Mar 10, 2019
@ghost ghost assigned vbaranov Mar 11, 2019
@ghost ghost added the in progress label Mar 11, 2019
@ayrat555
Copy link
Contributor

@goodsoft some chains are exclusively using only geth (for example, SafeChain) because they leverage features that only geth has. SafeChain are using geth because it has clique consensus algorithm. is there any way that we can fall back to trace_replayTransaction if geth is used?

@goodsoft
Copy link
Contributor Author

goodsoft commented Mar 11, 2019

We can do it like this:

  • Split EthereumJSONRPC.fetch_transactions into two methods: the old one accepting list of transactions and the new one accepting list of blocks.
  • Implement only the old one for geth and only new one for parity
  • Bring back Explorer.Chain.stream_transactions_with_unfetched_internal_transactions for old API
  • Switch between those combinations of APIs in InternalTransactions.Fetcher and Block.Realtime.Fetcher

Though, it'll probably be better to split InternalTransactions.Fetcher into two different modules as well, as it will need to handle these APIs differently.

Does this sound adequate to you?

@ayrat555
Copy link
Contributor

@goodsoft sounds reasonable

@vbaranov vbaranov removed the ready for review This PR is ready for reviews. label Mar 13, 2019
@goodsoft goodsoft force-pushed the gs-block-internal-transactions branch 8 times, most recently from b9ebdf7 to 1ce1c7d Compare March 13, 2019 21:47
Internal transaction fetcher is split into two APIs:
* `fetch_block_internal_transactions` is used for Parity
* `fetch_internal_transactions` is used for the rest of variants

`Indexer.InternalTransaction.Fetcher` is updated to take care of that.
We refetch all internal transactions anyway.
@goodsoft goodsoft force-pushed the gs-block-internal-transactions branch 3 times, most recently from 32b8472 to 7f43ad5 Compare March 14, 2019 00:29
@goodsoft
Copy link
Contributor Author

The decrease in coverage is mostly caused by the internal transaction fetching tests being disabled for Geth. While the same code was being used for both Parity and Geth, it was properly exercised, but now Parity has a separate logic.

I'll leave it as-is, as figuring out what's up with internal transactions in Geth gets out of the scope of this issue.

@goodsoft goodsoft added ready for review This PR is ready for reviews. and removed in progress labels Mar 14, 2019
],
http_options: [recv_timeout: :timer.minutes(1), timeout: :timer.minutes(1), hackney: [pool: :ethereum_jsonrpc]]
http_options: [recv_timeout: :timer.minutes(10), timeout: :timer.minutes(10), hackney: [pool: :ethereum_jsonrpc]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@goodsoft could you change it also for prod?

@ghost ghost added the in progress label Mar 14, 2019
@vbaranov vbaranov self-requested a review March 14, 2019 12:25
json_rpc_named_arguments
|> Keyword.fetch!(:variant)
|> case do
EthereumJSONRPC.Parity ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switching on the variant here seems a little fragile. It might be a good idea to make this an application configuration, so that we could say if Application.get_env(...) then it becomes a bit more clear. Then the configuration could be placed in the adapter specific configurations like config/parity.exs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make sense if both variants supported both APIs, so we could choose one in configuration.
Otherwise, there's exactly one valid value for this setting for each variant and it doesn't make sense to make it configurable at this point.

@ayrat555
Copy link
Contributor

@goodsoft looks good. but what about my previous comment?

@goodsoft
Copy link
Contributor Author

@ayrat555 if you mean the one about having fallback for Geth, I have implemented it in the second commit.

@vbaranov vbaranov merged commit 807f7c3 into master Mar 14, 2019
@ghost ghost removed the in progress label Mar 14, 2019
@vbaranov vbaranov deleted the gs-block-internal-transactions branch March 14, 2019 17:31
@sieniven sieniven mentioned this pull request Mar 2, 2024
21 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready for review This PR is ready for reviews.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants