
random beacon cache changes #1278

Merged: 12 commits merged into master from kevjue/random_beacon_fix on Jan 7, 2021

Conversation

@kevjue (Contributor) commented Dec 18, 2020

Description

The changes in this PR make all validators have their randomness cache set before starting mining. It does this in two ways:

  1. Whenever a block is synced or finalized via the consensus engine, the node checks whether it authored that block, and if so caches the block's parent hash (which is needed to recover the block's randomness in the future).
  2. Before a node starts syncing, it checks whether its last randomness commitment (saved on the random smart contract) has a cached entry. If not, it does a reverse search of the blockchain to find the block that produced the last commitment and then caches that block's parent hash (see the sketch below).
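
For illustration, a minimal sketch of that restore path in Go. rawdb.WriteRandomCommitmentCache is the accessor discussed later in this PR; ReadRandomCommitmentCache, computeCommitment, and the exact signatures are assumptions made for the sketch, not the PR's actual API:

// Sketch only: ensure the cache entry for the validator's last on-contract
// commitment exists before mining starts.
// Assumed imports: errors, core, core/rawdb, common, ethdb.
func restoreCommitmentCache(bc *core.BlockChain, db ethdb.Database, lastCommitment common.Hash,
	computeCommitment func(parentHash common.Hash) common.Hash) error {
	if (lastCommitment == common.Hash{}) {
		return nil // no commitment on the random contract yet
	}
	if cached := rawdb.ReadRandomCommitmentCache(db, lastCommitment); (cached != common.Hash{}) {
		return nil // cache entry already saved
	}
	// Reverse search: walk the chain backwards until we find the block whose
	// parent hash produces the last commitment, then cache that parent hash.
	for header := bc.CurrentHeader(); header != nil && header.Number.Uint64() > 0; header = bc.GetHeader(header.ParentHash, header.Number.Uint64()-1) {
		if computeCommitment(header.ParentHash) == lastCommitment {
			rawdb.WriteRandomCommitmentCache(db, lastCommitment, header.ParentHash)
			return nil
		}
	}
	return errors.New("block for the last randomness commitment not found")
}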

Other changes

Refactored some of the randomness logic. Specifically:

  1. Moved the randomness generation logic from package contract_comm/random to consensus/istanbul.
  2. Moved the specification of the randomness cache's leveldb location from package contract_comm/random to consensus/istanbul.
  3. Moved the logic for saving to and retrieving from the randomness cache from contract_comm/random to core/rawdb/accessors_chain.go.
  4. Moved the randomness seed logic from miner/worker.go to the consensus/istanbul package.

Tested

  1. e2e tests (note that the "when a validator's data directory is deleted" test case in the e2e sync test verifies that restoring the randomness cache works)
  2. unit tests

Related issues

  • Fixes #[issue number here]

Backwards compatibility

Yes

@trianglesphere (Contributor) left a comment:

I think that this fixes the backwards iteration issue because we bail out in commit new work, but this kinda looks like a validator will not propose a block until it rolls back its head block and re-syncs, or it's swapped out for a replica that was able to do that rollback and save the last randomness.

My understanding is that the validator needs to sync the randomness with the block state, but is there a way to do a backwards iteration and save on startup? Would that break something with atomic state?

miner/worker.go (outdated, resolved)
core/blockchain.go (resolved)
miner/worker.go (resolved)
@trianglesphere (Contributor):

Separately, this breaks all of the e2e tests, with the following as the last output (it seems to time out):

   Deploying 'Random'
   ------------------
   > transaction hash:    0xc62a41b1569a16224df46bf749d66ccc0794f47c0239797e96f8528f7865a716
   > Blocks: 0            Seconds: 0
   > contract address:    0xdDe36eF5b705E05fF72851E93Ca5Cc6f593A8461
   > block number:        300
   > block timestamp:     1608568122
   > account:             0x47e172F6CfB6c7D01C1574fa3E2Be7CC73269D95
   > balance:             3010022.399200419
   > gas used:            2729453
   > gas price:           100 gwei
   > value sent:          0 ETH
   > total cost:          0.2729453 ETH

  Setting initial Random implementation on proxy

- Saving migration to chain.

Error: Transaction was not mined within 750 seconds, please make sure your transaction was properly sent. Be aware that it might still be mined!
    at PromiEvent (/home/circleci/repos/celo-monorepo/node_modules/truffle/build/webpack:/packages/contract/lib/promievent.js:9:1)
    at TruffleContract.setCompleted (/home/circleci/repos/celo-monorepo/node_modules/truffle/build/webpack:/packages/contract/lib/execute.js:169:1)
    at Migration._deploy (/home/circleci/repos/celo-monorepo/node_modules/truffle/build/webpack:/packages/migrate/migration.js:93:1)
    at process._tickCallback (internal/process/next_tick.js:68:7)

@kevjue (author) commented Dec 21, 2020:

I think that this fixes the backwards iteration issue because we bail out in commit new work, but this kinda looks like a validator will not propose a block until it rolls back its head block and re-syncs, or it's swapped out for a replica that was able to do that rollback and save the last randomness.

My understanding is that the validator needs to sync the randomness with the block state, but is there a way to do a backwards iteration and save on startup? Would that break something with atomic state?

Totally agree with your points. I've been adding to this PR so that it will do the backwards iteration (if the randomness cache is not already saved) right before starting mining.

@kevjue kevjue marked this pull request as draft December 22, 2020 00:06
@kevjue kevjue marked this pull request as ready for review December 24, 2020 01:44
miner/miner.go Outdated
for {
	blockHeader := bc.GetHeaderByHash(blockHashIter)

	// We got to the genisis block, so this goroutine didn't find the latest
Contributor:

typo "genesis"

miner/miner.go Outdated
return true
}

// goroutine communication channels and waitgroup
Contributor:

This second part of the function looks weird...

It reads like:

firstBlock = race(
  getMostRecentFromCoinbaseInHistory(),
  getFirstBlockFromCoinbaseInFuture()
)
generateAndSaveCommitment(firstBlock)

Now, because of the race, this doesn't look deterministic, hence weird.

@kevjue (author):

How would you suggest changing this?

Both getters in the race are necessary.

@kevjue (author):

nvm, I see more of your comments below.

Contributor:

Do we need to wait for new blocks at this point? During sync start, mining gets stopped with "Mining aborted due to sync", but as the node reads in the new blocks it doesn't have, it will check whether it should save the randomness. Once it's done syncing, or fails to sync, it runs this function prior to starting mining again.

Contributor:

I have the same doubt as @trianglesphere. I understood that when this function is called we are "already synced", since that's the reason for the update() function (but I remember it's a tricky one, since it only triggers once).

And it would seem that even if that's not the case, the code in blockchain would still handle it, since it reacts to new blocks.

@kevjue (author) commented Jan 4, 2021:

This handles the case where the partition's aggregate head block (the partition being made up of the node and its peers when the sync downloader.DoneEvent is emitted) is behind a block that the node authored (let's call this the "future authored block"). Note that this is probably a very rare scenario.

When this scenario happens, two things can happen:

  1. The node's reverse iteration completes before the "future authored block" is synced. The node will then start istanbul. Even if this partition's aggregate head block is behind other validators, that is already handled by the istanbul engine.
  2. The "future authored block" is synced before the reverse iteration completes. In this case, syncing that block will stop the reverse iteration, since it has become irrelevant, and then istanbul will be started.

So waiting for new blocks is basically to handle case 2. Having said all that, that case will probably happen very infrequently, and the added benefit (not having to wait for the reverse iteration to complete) seems marginal. So it seems like it's not worth the added complexity. WDYT?

Contributor:

So thinking about what we store: it's a lookup of commitment -> block hash, where the block hash is mangled to create the commitment. I think we have multiple of these commitments in the eth db. When looking up the last commitment made, we check a smart contract, so I think order does not matter here, only that we do have the last commitment saved in the leveldb.

It's true that we want to bail out of the reverse iteration as an optimization if we find a "future authored block", but shouldn't the future block record its commitment when we process it?
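
For concreteness, a sketch of what such commitment -> block hash accessors in core/rawdb/accessors_chain.go could look like; the key prefix and signatures are assumptions, not the PR's actual code:

// randomCommitmentPrefix is an illustrative key prefix for the cache.
var randomCommitmentPrefix = []byte("random-commitment-")

// WriteRandomCommitmentCache stores a commitment -> parent block hash entry.
func WriteRandomCommitmentCache(db ethdb.KeyValueWriter, commitment common.Hash, parentHash common.Hash) {
	if err := db.Put(append(randomCommitmentPrefix, commitment.Bytes()...), parentHash.Bytes()); err != nil {
		log.Crit("Failed to store random commitment cache entry", "err", err)
	}
}

// ReadRandomCommitmentCache returns the cached block hash for a commitment,
// or the zero hash if no entry exists.
func ReadRandomCommitmentCache(db ethdb.KeyValueReader, commitment common.Hash) common.Hash {
	data, _ := db.Get(append(randomCommitmentPrefix, commitment.Bytes()...))
	if len(data) == 0 {
		return common.Hash{}
	}
	return common.BytesToHash(data)
}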

@kevjue (author) commented Jan 4, 2021:

@trianglesphere - Ya, you're correct. It's unnecessarily recalculating the cache entry when it gets the "future authored block".

WDYT of the proposal to remove that new-block handling logic altogether? It adds a lot of complex logic (which means a higher chance of bugs) for minor benefit.

Contributor:

That sounds good to me. Seems like it's rare enough and doesn't make anything incorrect.

Contributor:

+1

miner/miner.go Outdated
blockHashIter := currentHeader.Hash()

wg.Add(1)
go func() {
Contributor:

I would extract these two goroutines into their own functions, getMostRecentBlockFromAuthor(errgroup) and waitForNextBlockFromAuthor(errgroup), and use an errgroup to handle coordination (https://godoc.org/golang.org/x/sync/errgroup); see the sketch below.

Also, try moving everything to the random.go file, so as to confine all code related to randomness to a single place. We could even make it not part of the miner object, and just pass everything as parameters to a standalone function.
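
A rough sketch of the suggested errgroup coordination. Both helpers are hypothetical and are assumed to return nil promptly once their context is cancelled; assumed imports are context, errors, and golang.org/x/sync/errgroup:

// findCommitmentBlock races a backwards scan of history against waiting for
// a future block authored by this validator; whichever side finds a block
// first cancels the other.
func findCommitmentBlock(ctx context.Context, bc *core.BlockChain, author common.Address) (*types.Header, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()
	g, ctx := errgroup.WithContext(ctx)

	found := make(chan *types.Header, 2) // buffered so neither sender can block

	g.Go(func() error {
		if h := getMostRecentBlockFromAuthor(ctx, bc, author); h != nil { // hypothetical helper
			found <- h
			cancel() // the future-block wait is now irrelevant
		}
		return nil
	})
	g.Go(func() error {
		if h := waitForNextBlockFromAuthor(ctx, bc, author); h != nil { // hypothetical helper
			found <- h
			cancel() // the history scan is now irrelevant
		}
		return nil
	})

	if err := g.Wait(); err != nil {
		return nil, err
	}
	select {
	case h := <-found:
		return h, nil
	default:
		return nil, errors.New("no block from this author was found")
	}
}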

@kevjue (author):

I actually was considering putting some of this logic within istanbul, but decided against it. The reason is that I believe istanbul should only be responsible for defining the randomness beacon protocol (e.g. when/how the randomness is generated and verified).

As for the randomness cache, its only user is the miner package, since the cache is not an essential part of the randomness protocol but more of an optimization that the miner can use. With that in mind, restoration of the cache seems to fit within the miner package. I'm happy to create a new random.go file within the miner package, though.

miner/miner.go Outdated

// commitmentCacheSaved will check to see if the last commitment cache entry is saved.
// If not, it will search for it and save it.
func (miner *Miner) commitmentCacheSaved(istEngine consensus.Istanbul) bool {
Contributor:

The function name suggests it's just a query, with no side effects or writes. I would change the name to make that more apparent.

@@ -1376,6 +1392,11 @@ func (bc *BlockChain) writeBlockWithState(block *types.Block, receipts []*types.
rawdb.WriteBlock(blockBatch, block)
rawdb.WriteReceipts(blockBatch, block.Hash(), block.NumberU64(), receipts)
rawdb.WritePreimages(blockBatch, state.Preimages())
if (randomCommitment != common.Hash{}) {
Contributor:

I'm hesitant to have this logic inside blockchain. Would it make sense to have a blockchain listener that computes/writes the commitment for every block?

In fact, we could have a more cohesive type with all random-related functions:

  • Listen to new blocks to generate/write the commitment
  • Reconstruct the commitment cache
  • Get the current commitment / generate the current commitment

@kevjue (author):

What's your hesitancy about putting this in blockchain?

The changes follow much of the same design that blockchain already implements. E.g. totalDifficulty is also a "cache", in that it's possible to reconstruct it from the past chain, but it's computed at block-insert time and saved in its own leveldb location (see the snippet below).
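
For reference, a rough rendering of that existing precedent, along the lines of upstream go-ethereum's writeBlockWithState (details simplified):

// totalDifficulty is reconstructible from past headers, but blockchain
// computes it at insert time and persists it under its own key, in the
// same batch as the rest of the block data.
ptd := bc.GetTd(block.ParentHash(), block.NumberU64()-1)
externTd := new(big.Int).Add(block.Difficulty(), ptd)
rawdb.WriteTd(blockBatch, block.Hash(), block.NumberU64(), externTd)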

Contributor:

+1 for keeping this in blockchain. I think keeping this code block here is much easier to understand than having a dedicated listener.

Contributor:

My hesitation is about coupling... or, using other maxims: Separation of Concerns / the Single Responsibility Principle.

Now blockchain "knows" about the randomness cache (it's coupled to it). Same with miner.

And then the "responsibility" of keeping the cache sane is split across a few places. There's the logic of the cache (read/write), there's the check & recovery logic in miner, and there's the generation logic on each new block in blockchain.

Now, whenever a future maintainer makes a change to those modules, they need to have the same logical context and awareness about this. And that's tricky, since it's just one of the many responsibilities these modules have.

That's why my suggestion is to have "something" that represents the randomness cache and is responsible for keeping it sane, recovering it, and providing access to it. That "something" can even be an interface, so it's easy to mock and test things in isolation (a sketch follows below).

@trianglesphere @kevjue what do you think?
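
For example, that "something" could be a small interface along these lines (all names illustrative, not part of this PR):

// RandomnessCache bundles the responsibilities discussed above behind one
// mockable interface: keeping the cache sane, recovering it, and reading it.
type RandomnessCache interface {
	// Commitment returns the cached parent block hash for a commitment,
	// or the zero hash if no entry exists.
	Commitment(commitment common.Hash) common.Hash

	// OnNewBlock computes and persists the cache entry for a block this
	// validator authored, as part of block insertion.
	OnNewBlock(header *types.Header) error

	// Recover rebuilds the entry for the last on-contract commitment by
	// scanning the chain backwards.
	Recover(commitment common.Hash) error
}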

@kevjue (author):

WDYT of keeping the randomness cache within the blockchain object (basically it would be a "primitive" of the block) and moving the restoration logic from miner into blockchain? (miner would still invoke the restoration logic when needed.)

As for the actual insertion into the cache (at least in the normal case, as opposed to the restoration case), I still think it should be done within the block insertion function (basically how it's implemented now), because it's inserted as part of a leveldb batch along with all of the other parts of the block; inserting all block-related data atomically is a high-level goal that we wanted to attain.

Contributor:

I'm fine with bundling lines approx 1360 to 1370 into some manager object/primitive, but I agree with Kevin that rawdb.WriteRandomCommitmentCache(...) should stay here.

Contributor:

I like it!

miner/miner.go Outdated
return true
}

// goroutine communication channels and waitgroup
Contributor:

I think here I would have a new function, something like "regenerateCache"... and a good log message stating that we are doing so (and that it might take a while). For example:
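
Something like this hypothetical shape, reusing the RecoverRandomnessCache function added elsewhere in this PR (starting the scan from the current head, and the exact wording, are assumptions of the sketch):

// regenerateCache rebuilds the missing commitment cache entry; it logs at
// INFO level up front because the reverse scan can take a while.
func (miner *Miner) regenerateCache(bc *core.BlockChain, lastCommitment common.Hash) error {
	log.Info("Randomness commitment cache entry missing, regenerating; this may take a while",
		"commitment", lastCommitment)
	return bc.RecoverRandomnessCache(lastCommitment, bc.CurrentHeader().Hash())
}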

@trianglesphere (Contributor):

I deployed 64494cc9ce6a1717de45b1a61d618d1f17de394c to baklava, and the fix got the validator to start signing again.

@@ -2336,3 +2337,45 @@ func (bc *BlockChain) SubscribeLogsEvent(ch chan<- []*types.Log) event.Subscript
func (bc *BlockChain) SubscribeBlockProcessingEvent(ch chan<- bool) event.Subscription {
return bc.scope.Track(bc.blockProcFeed.Subscribe(ch))
}

func (bc *BlockChain) RecoverRandomnessCache(commitment common.Hash, commitmentBlockHash common.Hash) error {
if istEngine, isIstanbul := bc.engine.(consensus.Istanbul); isIstanbul {
Contributor:

very personal nit :P

I prefer exiting fast, and avoiding nesting here:

istEngine, isIstanbul := bc.engine.(consensus.Istanbul)
if !isIstanbul {
	// Not an istanbul engine, so there is nothing to recover.
	return nil
}
...

for {
	blockHeader := bc.GetHeaderByHash(blockHashIter)

	// We got to the genisis block, so this goroutine didn't find the latest
Contributor:

typo: genesis

Contributor:

Also, this is not a goroutine anymore (the comment is stale).

@@ -2336,3 +2337,45 @@ func (bc *BlockChain) SubscribeLogsEvent(ch chan<- []*types.Log) event.Subscript
func (bc *BlockChain) SubscribeBlockProcessingEvent(ch chan<- bool) event.Subscription {
return bc.scope.Track(bc.blockProcFeed.Subscribe(ch))
}

func (bc *BlockChain) RecoverRandomnessCache(commitment common.Hash, commitmentBlockHash common.Hash) error {
Contributor:

Needs a go doc comment.
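
For instance, based on the behavior described in this PR (suggested wording only):

// RecoverRandomnessCache searches the chain backwards, starting from
// commitmentBlockHash, for the block whose parent hash produced the given
// commitment, and saves the recovered entry to the randomness commitment
// cache.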

func (bc *BlockChain) RecoverRandomnessCache(commitment common.Hash, commitmentBlockHash common.Hash) error {
if istEngine, isIstanbul := bc.engine.(consensus.Istanbul); isIstanbul {

blockHashIter := commitmentBlockHash
Contributor:

I would add a log message here. This is quite a rare thing and can take a decent amount of time, so an INFO msg would be nice, to inform the user of what's happening. For example:
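
For example, using the codebase's standard log package (message wording and keys are placeholders):

// This path is rare and the reverse iteration can take a while, so log at
// INFO level before starting.
log.Info("Recovering the randomness commitment cache, this may take a while",
	"commitment", commitment, "startBlock", commitmentBlockHash)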

@kevjue kevjue merged commit 8ec69a8 into master Jan 7, 2021
@kevjue kevjue deleted the kevjue/random_beacon_fix branch January 7, 2021 19:46
trianglesphere pushed a commit that referenced this pull request Jan 8, 2021
* update the randomness commitment cache when inserting blocks into the blockchain

* added comments

* reverse iteration search for the commitment cache entry

* unsubscribe from the newblock subscription

* fixed bug and added comments

* handled randomness beacon case when smart contracts are being migrated

* fixed bug

* fixed a bug

* fixed a bug and added comments to non intuitive code

* addressed from PR comments

* addressed PR comments

Co-authored-by: Kevin Jue <[email protected]>