Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go node: unite compute, merge, and transaction scheduler roles #2514

Merged
merged 10 commits into from
Jan 17, 2020

Conversation

pro-wh
Copy link
Contributor

@pro-wh pro-wh commented Jan 3, 2020

We're removing the separation among registering nodes for the compute, merge, and transaction scheduler roles. You now have to register for and enable all or none of these roles, under a new, broadened, and confusing--you're welcome--term "compute."

non-changelog data:

  • the .enabled flag is moved to its own package, computeenable. this is because there are dependencies leading toward the merge worker package, but that would be a weird place to put the compute enable flag.
  • a backwards incompatible change is made to the on-chain representation of registered nodes, and no migration is provided. you're welcome.

closes #2107

@pro-wh pro-wh added c:runtime/compute Category: runtime compute worker c:registry Category: entity/node/runtime registry service c:breaking/consensus Category: breaking consensus changes c:worker-merge c:breaking/cfg Category: breaks configuration labels Jan 3, 2020
@pro-wh
Copy link
Contributor Author

pro-wh commented Jan 3, 2020

aw this is going to break the byzantine tests isn't it

go/consensus/tendermint/apps/scheduler/scheduler.go Outdated Show resolved Hide resolved
.changelog/2107.breaking.md Show resolved Hide resolved
go/worker/computeenable/init.go Show resolved Hide resolved
@pro-wh pro-wh force-pushed the pro-wh/feature/onerole branch 2 times, most recently from 815eac4 to e50b619 Compare January 7, 2020 22:01
@pro-wh
Copy link
Contributor Author

pro-wh commented Jan 8, 2020

remaining weirdness:

  • registry node runtimes are shared across multiple roles
  • some tests might not be up to date on per-runtime merge and storage change

@pro-wh pro-wh force-pushed the pro-wh/feature/onerole branch 2 times, most recently from a9702e8 to dd81398 Compare January 8, 2020 23:03
@pro-wh pro-wh requested a review from kostko January 9, 2020 00:30
@codecov
Copy link

codecov bot commented Jan 9, 2020

Codecov Report

Merging #2514 into master will decrease coverage by 3.97%.
The diff coverage is 51.26%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2514      +/-   ##
==========================================
- Coverage   67.37%   63.39%   -3.98%     
==========================================
  Files         330      347      +17     
  Lines       30566    32324    +1758     
==========================================
- Hits        20594    20492     -102     
- Misses       7461     9247    +1786     
- Partials     2511     2585      +74
Impacted Files Coverage Δ
go/worker/merge/worker.go 87.17% <ø> (+1.99%) ⬆️
go/oasis-node/cmd/registry/node/node.go 51.21% <ø> (+0.98%) ⬆️
go/oasis-test-runner/oasis/compute.go 0% <ø> (ø)
go/common/node/node.go 63.52% <ø> (-1.64%) ⬇️
go/oasis-test-runner/oasis/args.go 0% <0%> (ø)
go/oasis-node/cmd/debug/byzantine/merge.go 77.21% <0%> (+2.53%) ⬆️
go/registry/api/api.go 40.15% <0%> (+0.25%) ⬆️
go/worker/txnscheduler/init.go 100% <100%> (ø) ⬆️
go/consensus/tendermint/apps/beacon/beacon.go 73.07% <100%> (ø) ⬆️
go/worker/computeenable/init.go 100% <100%> (ø)
... and 56 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a1deea6...db985d3. Read the comment docs.

Copy link
Member

@kostko kostko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the huge coverage difference?

One minor comment, otherwise, this looks good now (except for the coverage thing).

I'll need to tweak all the seeds again in #2504 as the Byzantine node must now wait until epoch 2 due to a change where compute nodes now wait for the key manager runtime to be available first before registering.

go/consensus/tendermint/apps/beacon/beacon.go Outdated Show resolved Hide resolved
@pro-wh pro-wh force-pushed the pro-wh/feature/onerole branch from dd81398 to 48f1d9e Compare January 10, 2020 22:00
@pro-wh
Copy link
Contributor Author

pro-wh commented Jan 13, 2020

alright, I'd better take inventory of what went down in coverage

  1. StateProcessingBatch.cancel
  2. Namespace.UnmarshalBinary ErrMalformedNamespace
  3. Namespace.UnmarshalText error from base64 DecodeString
  4. Namespace.UnmarshalHex error from hex.DecodeString
  5. Namespace.Equal cmp == nil
  6. cmd debug storage check-roots Args error from nrFn
  7. ibid, error from ValidateRuntimeIDStr
  8. cmd debug storage force-finalize Args
  9. cmd debug storage ValidateRuntimeIDStr error from hex.DecodeString
  10. cmd debug storage checkDiff error from storageClient.GetDiff
  11. ibid, error from write log iterator Next
  12. ibid, error from write log iterator Value
  13. cmd debug storage doCheckRoots error from storageClient.New
  14. ibid, error from client.GetBlock
  15. ibid, error from storageWorkerClient.GetLastSyncedRound
  16. ibid, retry when storage worker is not synced
  17. ibid, error from client.GetBlock
  18. cmd debug storage doForceFinalize failed branch

I'm stopping here. Observations:

  • covered -> not covered transitions are almost all error handling branches, which I suspect are ones that we don't explicitly exercise in tests
  • item 18 https://codecov.io/gh/oasislabs/oasis-core/pull/2514/changes#L282 ought to be impossible, because you would have to go through line 247 (never covered) before you could get to line 282 (previously covered, now not covered)
  • what is force-finalize? do we exercise it?

@pro-wh pro-wh requested a review from kostko January 13, 2020 22:38
@pro-wh pro-wh force-pushed the pro-wh/feature/onerole branch 2 times, most recently from 6ca9723 to e33b02a Compare January 15, 2020 23:03
@pro-wh pro-wh force-pushed the pro-wh/feature/onerole branch from e33b02a to d4de0e4 Compare January 16, 2020 22:28
@pro-wh
Copy link
Contributor Author

pro-wh commented Jan 16, 2020

now the runtime-dynamic test doesn't work, because you wouldn't be able to register per-runtime storage nodes until the runtimes are registered

This is so that we have more usable testing schedules with a compute-only
role and a merge-only role.
@pro-wh pro-wh force-pushed the pro-wh/feature/onerole branch from d4de0e4 to db985d3 Compare January 16, 2020 23:58
@pro-wh
Copy link
Contributor Author

pro-wh commented Jan 17, 2020

alright, adjusted as the comment there described

@pro-wh pro-wh merged commit 194461b into master Jan 17, 2020
@pro-wh pro-wh deleted the pro-wh/feature/onerole branch January 17, 2020 01:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c:breaking/cfg Category: breaks configuration c:breaking/consensus Category: breaking consensus changes c:registry Category: entity/node/runtime registry service c:runtime/compute Category: runtime compute worker
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove granular role registration
2 participants