Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resume tee-worker #1050

Merged
merged 3 commits into from
Dec 21, 2022
Merged

Resume tee-worker #1050

merged 3 commits into from
Dec 21, 2022

Conversation

ziming-zung
Copy link
Contributor

@ziming-zung ziming-zung commented Dec 1, 2022

resolves: #999
resolves: #997

This PR:

  • After stopping, the tee-worker can resumes syncing
  • tee-worker attempt to reconnect the parachain once disconnected

For #999

  1. We should change the read_or_init_parachain_validator function in core/parentchain/light-client/src/io.rs.
    In this function, the state is cleared every time, so we should rewrite the state after init.
    https://github.com/litentry/litentry-parachain/blob/09784f1d74f3882f22062579178ea3041e8daeed/tee-worker/core/parentchain/light-client/src/io.rs#L79-L84

  2. Enclave will be registered each time when tee-worker run. So we shoud check if enclave is already registered each time.
    https://github.com/litentry/litentry-parachain/blob/09784f1d74f3882f22062579178ea3041e8daeed/tee-worker/service/src/main.rs#L444-L446

Strongly recommended to manually execute the command to confirm

For #997
substrate-api-client uses library ws, which is difficult to implement reconnection mechanism. So I decided to use tungstenite instead of ws.
To check implementation, see at: https://github.com/litentry/substrate-api-client/tree/new-tungstenite-polkadot-v0.9.29

@ziming-zung ziming-zung requested a review from a team December 1, 2022 14:26
@BillyWooo BillyWooo linked an issue Dec 1, 2022 that may be closed by this pull request
@Kailai-Wang Kailai-Wang linked an issue Dec 1, 2022 that may be closed by this pull request
Copy link
Collaborator

@Kailai-Wang Kailai-Wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job locating the problem.

While there might still be some small problems with reconnection, I'm thinking if we can add tests for it? These tests are actually very important.

We'd need to test:

  • parentchain disconnected and reconnected, the workers (both primary and non-primary validateers can still continue
  • primary validateers can stop and resume
  • non-primary validateers can stop and resume

After we have the tests we can submit a PR to upstream

@Kailai-Wang
Copy link
Collaborator

Looks like still some CI issues?

None,
) {
Ok(Some(value)) => {
if value.mr_enclave.to_vec() == mrenclave && value.url == trusted_url {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's almost guaranteed that we'll fall into the !found branch if we are worker-2, as the trusted_url is different. I don't understand why we still need to register the enclave in this case, but it's the same logic as the original code

@Kailai-Wang
Copy link
Collaborator

Update: we might have some problems in the tungstenite-client version of substrate-api-client -- let's try to update it and re-test it

@ziming-zung
Copy link
Contributor Author

Update: we might have some problems in the tungstenite-client version of substrate-api-client -- let's try to update it and re-test it

I had updated the tungstenite-client

@Kailai-Wang
Copy link
Collaborator

CI looks good!

I think we can merge this one and submit a PR to substrate-api-client

@Kailai-Wang
Copy link
Collaborator

Maybe let's merge this PR and write tests in another PR? @zhizming-zhong

@ziming-zung
Copy link
Contributor Author

Maybe let's merge this PR and write tests in another PR? @zhizming-zhong

Of course, after conflicts are resolved and the CI is passed, this PR will be merged into dev branch

@ziming-zung ziming-zung changed the base branch from tee-dev to dev December 21, 2022 06:49
@ziming-zung
Copy link
Contributor Author

ziming-zung commented Dec 21, 2022

notice !!!
I reset the branch, and this PR only resolves #999.
may be need to wait for the PR Make api no_std compatible to be completed before solving issue #997

@codecov-commenter
Copy link

codecov-commenter commented Dec 21, 2022

Codecov Report

Merging #1050 (dbc959f) into dev (c628c9a) will decrease coverage by 0.06%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##              dev    #1050      +/-   ##
==========================================
- Coverage   20.69%   20.62%   -0.07%     
==========================================
  Files         217      217              
  Lines        9031     9060      +29     
==========================================
  Hits         1869     1869              
- Misses       7162     7191      +29     
Impacted Files Coverage Δ
tee-worker/core/parentchain/light-client/src/io.rs 0.00% <0.00%> (ø)
tee-worker/service/src/main.rs 0.00% <0.00%> (ø)
tee-worker/service/src/sidechain_setup.rs 0.00% <0.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Copy link
Collaborator

@Kailai-Wang Kailai-Wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can not resume worker node Can't reconnect to the parentchain once disconnected
3 participants