Skip to content
This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

Update leader slot in poh recorder if we skipped it #3451

Merged
merged 2 commits into from
Mar 23, 2019

Conversation

pgarg66
Copy link
Contributor

@pgarg66 pgarg66 commented Mar 22, 2019

Problem

In a single node setup with existing ledger, the leader skip its leader slot without voting. The Poh recorder doesn't know when the node should become the leader again.

Summary of Changes

Reset the poh recorder if the node is skipping its leader slot.

@pgarg66 pgarg66 requested review from carllin and mvines March 22, 2019 22:23
@codecov
Copy link

codecov bot commented Mar 22, 2019

Codecov Report

Merging #3451 into v0.12 will decrease coverage by <.1%.
The diff coverage is 9%.

@@           Coverage Diff           @@
##           v0.12   #3451     +/-   ##
=======================================
- Coverage   81.2%   81.1%   -0.1%     
=======================================
  Files        132     132             
  Lines      20737   20748     +11     
=======================================
+ Hits       16843   16844      +1     
- Misses      3894    3904     +10

@codecov
Copy link

codecov bot commented Mar 22, 2019

Codecov Report

Merging #3451 into v0.12 will decrease coverage by <.1%.
The diff coverage is 8.3%.

@@           Coverage Diff           @@
##           v0.12   #3451     +/-   ##
=======================================
- Coverage   81.2%   81.1%   -0.1%     
=======================================
  Files        132     132             
  Lines      20737   20749     +12     
=======================================
+ Hits       16843   16844      +1     
- Misses      3894    3905     +11

Copy link
Contributor

@mvines mvines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep this prevents the bug I was seeing earlier. Some kind of test would be super nice, please cherrypick to v0.12

@mvines
Copy link
Contributor

mvines commented Mar 22, 2019

Oh when I apply this to v0.12, I'm hitting this assert:

assert!(frozen.contains_key(&parent_slot));

STR:

  1. ./multinode-demo/setup.sh
  2. ./multinode-demo/drone.sh
  3. ./multinode-demo/bootstrap-leader.sh
  4. Wait for the staking account airdrop/transactions to complete
  5. ^C
  6. ./multinode-demo/bootstrap-leader.sh
  7. 💥

@pgarg66
Copy link
Contributor Author

pgarg66 commented Mar 22, 2019

@mvines , so it doesn't assert on master? It has a similar check there.

@mvines
Copy link
Contributor

mvines commented Mar 23, 2019

On master without this patch I got

thread 'solana-replay-stage' panicked at 'epoch_stakes cannot move backwards', core/src/locktower.rs:148:13

when I tried the STR at #3451 (comment) a couple times, wasn't 100% reproducible but I also didn't try too hard to reproduce. (But be sure to pick up #3456 or the STR will fail much earlier due to staking account setup failures.)

On master with this patch I got the assert in the STR right away:

thread 'solana-replay-stage' panicked at 'assertion failed: frozen.contains_key(&parent_slot)', core/src/replay_stage.rs:217:13

@pgarg66
Copy link
Contributor Author

pgarg66 commented Mar 23, 2019

@mvines I just pushed an updated patch. It's getting a bit hacky now. Maybe if it works for v0.12, we can use as a short term solution.

@pgarg66
Copy link
Contributor Author

pgarg66 commented Mar 23, 2019

When I run it locally, I get this error the second time:

2019-03-23T00:04:36.713097274Z INFO  solana::rpc] get_signature_status rpc request status: SignatureNotFound
[2019-03-23T00:04:36.713397374Z INFO  solana_runtime::system_program] CreateAccount: invalid argument; account 32NF3VFd4xpgoxpWVxUVLgwiTXYCvFwjWd5SkzbhSwhp already in use
[2019-03-23T00:04:36.713518874Z INFO  solana_runtime::bank] tx error: Err(ProgramError(0, CustomError([0, 0, 0, 0]))) Transaction { signatures: [26YTxr2fLuyVT8PwbPevxrmG7bPEEq3ihYk9kM9jmWRTw9EGe5iWMxfidUPmK6ZZsUtBjVAQWrbRk8HMJvuhAWNC], account_keys: [8ewKzaMiA6XLfsq8TKbUkUX3X7hFU6A9uwNmQK4dViMJ, 32NF3VFd4xpgoxpWVxUVLgwiTXYCvFwjWd5SkzbhSwhp], recent_blockhash: P6BW4BNrkZMrg5tXzPXtvf8XAnWNoGycHGUviUAEH8V, fee: 0, program_ids: [11111111111111111111111111111111, 9tGpLNn8yNMEvJsc9c9wBKPBAHakCQSg5ViDSjxN9CGX], instructions: [Instruction { program_ids_index: 0, accounts: [0, 1], data: [0, 0, 0, 0, 42, 0, 0, 0, 0, 0, 0, 0, 205, 1, 0, 0, 0, 0, 0, 0, 132, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] }, Instruction { program_ids_index: 1, accounts: [1], data: [0, 0, 0, 0] }] }
[2019-03-23T00:04:36.714050674Z INFO  solana_runtime::bank] 1 errors of 1 txs
[2019-03-23T00:04:36.714364274Z INFO  solana::banking_stage] program error 0, CustomError([0, 0, 0, 0])
[2019-03-23T00:04:36.714936974Z INFO  solana::banking_stage] @1553299476714 done processing transaction batches: 1 time: 5ms reqs: 1 reqs/s: 168.69383
[2019-03-23T00:04:37.115533498Z INFO  solana::rpc] get_signature_status rpc request received: "26YTxr2fLuyVT8PwbPevxrmG7bPEEq3ihYk9kM9jmWRTw9EGe5iWMxfidUPmK6ZZsUtBjVAQWrbRk8HMJvuhAWNC"
[2019-03-23T00:04:37.115777098Z INFO  solana::rpc] get_signature_status rpc request status: ProgramRuntimeError
Error: RpcRequestError("Transaction \"26YTxr2fLuyVT8PwbPevxrmG7bPEEq3ihYk9kM9jmWRTw9EGe5iWMxfidUPmK6ZZsUtBjVAQWrbRk8HMJvuhAWNC\" failed: ProgramRuntimeError")
./multinode-demo/bootstrap-leader.sh: line 92: kill: (30953) - No such process

@mvines
Copy link
Contributor

mvines commented Mar 23, 2019

#3456 fixed that error, rebase :)

@pgarg66
Copy link
Contributor Author

pgarg66 commented Mar 23, 2019

oh cool

@pgarg66
Copy link
Contributor Author

pgarg66 commented Mar 23, 2019

It seems to work for me

@mvines
Copy link
Contributor

mvines commented Mar 23, 2019

Ah, your latest patch works for me on master too.

I still get thread 'solana-replay-stage' panicked at 'epoch_stakes cannot move backwards', core/src/locktower.rs:148:13 when restart ./multinode-demo/bootstrap-leader.sh a couple times but that's a separate issue (that doesn't occur on v0.12)

@pgarg66
Copy link
Contributor Author

pgarg66 commented Mar 23, 2019

cool. so we merge it for v12?

@mvines
Copy link
Contributor

mvines commented Mar 23, 2019

cool. so we merge it for v12?

yes please

@pgarg66 pgarg66 merged commit f479021 into solana-labs:v0.12 Mar 23, 2019
@pgarg66 pgarg66 deleted the v0.12 branch March 23, 2019 00:35
@carllin carllin mentioned this pull request Mar 23, 2019
behzadnouri pushed a commit to behzadnouri/solana that referenced this pull request Nov 4, 2024
…s#3451)

The use of fifo compaction in rocksdb is deprecated (as of v2.0). The
current behavior emits a warning if fifo is set. This change removes
"fifo" as an option, leaving "level" as the only valid value for
--rocksdb-shred-compaction
alessandrod pushed a commit to alessandrod/solana that referenced this pull request Nov 5, 2024
…port of solana-labs#3451) (solana-labs#3464)

Disallow --rocksdb-shred-compaction fifo in the validator (solana-labs#3451)

The use of fifo compaction in rocksdb is deprecated (as of v2.0). The
current behavior emits a warning if fifo is set. This change removes
"fifo" as an option, leaving "level" as the only valid value for
--rocksdb-shred-compaction

(cherry picked from commit fddc554)

Co-authored-by: steviez <[email protected]>
Lichtso pushed a commit to Lichtso/solana that referenced this pull request Nov 7, 2024
…s#3451)

The use of fifo compaction in rocksdb is deprecated (as of v2.0). The
current behavior emits a warning if fifo is set. This change removes
"fifo" as an option, leaving "level" as the only valid value for
--rocksdb-shred-compaction
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants