Test ethrelay catchup and test challenge invalid block #254

Merged
ailisp merged 24 commits into master from test-ethrelay-catchup on Aug 13, 2020

Conversation

@ailisp ailisp commented Aug 4, 2020

Implement #238

@ailisp ailisp linked an issue Aug 4, 2020 that may be closed by this pull request
@ailisp ailisp changed the title Test ethrelay catchup Test ethrelay catchup and test challenge invalid block Aug 6, 2020
@ailisp ailisp linked an issue Aug 6, 2020 that may be closed by this pull request

@MaksymZavershynskyi MaksymZavershynskyi left a comment

This looks great, thank you!

environment/lib/near2eth-relay.js (outdated, resolved)
ci/test_challenge.sh (resolved)
environment/index.js (outdated, resolved)
@ailisp ailisp mentioned this pull request Aug 7, 2020
6 tasks
@ailisp

ailisp commented Aug 7, 2020

@nearmax I got this error in the e2e challenge test, and it also happens every time locally (there seems to be a problem with either the watchdog or the contract):

[2020-08-07T22:36:14Z] (node:5160) UnhandledPromiseRejectionWarning: Error: [-32000] Server error: 5GkUH4hzeZs17nrL8wSrsNLt7ZHJ8ApgXxrtRk6pD79v does not exist
...
[2020-08-07T22:36:14Z]     at async Command.execute (/var/lib/buildkite-agent/builds/buildkite-i-01dd6e880ebe52ccf-1/nearprotocol/rainbow-bridge/environment/commands/transfer-eth-erc20-from-near.js:148:24)

The underlying code is https://github.com/near/rainbow-bridge/blob/master/environment/commands/transfer-eth-erc20-from-near.js#L148

Do you have any idea why light_client_proof responds "does not exist" for the requested receiptId? The receiptId is returned by the RPC for the successful burn transaction, so it should exist. Also @bowenwang1996

@MaksymZavershynskyi

Is it 100% reproducible, or does it happen randomly? There might be a race condition where ViewClient hasn't yet learned that this receipt exists. I've seen a similar issue with the light client header: RPC would tell me that the header exists, but if I immediately tried to fetch it through the light client endpoint, it would report the header as unknown; after waiting, it became known.

@bowenwang1996

Do you have any idea why light_client_proof responds "does not exist" for the requested receiptId? The receiptId is returned by the RPC for the successful burn transaction, so it should exist. Also @bowenwang1996

You need to wait for the next block for the outcome root.
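
A minimal sketch of that waiting step, assuming a hypothetical `fetchProof` callback that wraps the light client proof RPC call (the helper name, its options, and the error-message check are illustrative assumptions, not code from this repository):

```javascript
// Hypothetical helper: retries an async call while it fails with a
// "does not exist" error, sleeping between attempts so that a later
// block has a chance to be produced.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function retryUntilExists(fetchProof, { attempts = 10, delayMs = 1000 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fetchProof();
    } catch (err) {
      // Only retry the error quoted in this thread; rethrow anything else.
      if (!String(err && err.message).includes('does not exist')) throw err;
      lastError = err;
      // Do not sleep after the final attempt.
      if (i < attempts - 1) await sleep(delayMs);
    }
  }
  throw lastError;
}
```

A bounded retry like this only covers the timing case. As the following comments show, waiting alone did not resolve the failure reported here.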

@ailisp

ailisp commented Aug 11, 2020

You need to wait for the next block for the outcome root.

By "next block" you mean the next NEAR block, right? I waited a few minutes and retried the RPC manually, and it still returns "does not exist". No NEAR node configuration was changed, and the e2e test without the challenge step passes without any wait.

@ailisp

ailisp commented Aug 11, 2020

Is it 100% reproducible, or does it happen randomly? There might be a race condition where ViewClient hasn't yet learned that this receipt exists. I've seen a similar issue with the light client header: RPC would tell me that the header exists, but if I immediately tried to fetch it through the light client endpoint, it would report the header as unknown; after waiting, it became known.

It seems always reproducible locally, and waiting doesn't help: I printed the params and retried the RPC a few minutes later and again just now (10+ hours later), and the receipt still does not exist.

@bowenwang1996

What does the setup look like? How many near nodes are there and what is the epoch length?

@ailisp

ailisp commented Aug 11, 2020

What does the setup look like? How many near nodes are there and what is the epoch length?

It's the default near init setup: one node, and the epoch length is 60.

@bowenwang1996

I wonder whether it is because block production is too fast and garbage collection kicks in very quickly. Does it help if you change epoch length to 600 or make the node archival?
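
For anyone trying this, a minimal sketch of the two changes. It assumes the node's parsed `genesis.json` carries an `epoch_length` field and its `config.json` an `archive` flag; both field names and the helper itself are assumptions to check against the nearcore version in use:

```javascript
// Sketch: returns copies of the parsed genesis.json and config.json with a
// longer epoch and archival mode enabled. The field names `epoch_length`
// and `archive` are assumptions, and this helper is hypothetical.
function relaxGarbageCollection(genesis, config, epochLength = 600) {
  return {
    genesis: { ...genesis, epoch_length: epochLength },
    config: { ...config, archive: true },
  };
}
```

The returned objects would still have to be written back to the node's home directory and the node restarted for either change to take effect. The inputs are left unmodified so the original settings remain available for comparison.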

@ailisp

ailisp commented Aug 12, 2020

I wonder whether it is because block production is too fast and garbage collection kicks in very quickly. Does it help if you change epoch length to 600 or make the node archival?

Good point!
Update: that's probably the reason! It passed on the first try. The restart with the new config.json can't be automated yet, but the test passes. In the meantime, I've added logging and manually called the RPC to verify that every hash calculation in NearBridge.sol is correct. So we're good now. Thanks for the help!

@MaksymZavershynskyi

I've seen this issue before with the bridge e2e tests; too bad it did not occur to me that this was the cause. Thank you @bowenwang1996!

@ailisp ailisp merged commit 37f1d12 into master Aug 13, 2020
@ailisp ailisp deleted the test-ethrelay-catchup branch August 13, 2020 01:01
karim-en pushed a commit that referenced this pull request Dec 20, 2021
* e2e to test ethrelay catchup

* refactor and run on ci

* workaround buildkite

* manually test challenge works

* fix nonce by a robust callcontract wrapper

* improve nonce handling, reduce lock time to 60s to speed up on ci

* fix collect log on ci

* robustness fix: near query retry and retry eth lock when fail of nonce

* fix event parse of sendTransaction by steal from web3
Linked issues that this pull request may close: Watchdog challenge e2e test; e2e test for eth-relay catchup