Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Override the timeout_commit and timeout_propose when switching to v3 #3859

Closed
evan-forbes opened this issue Sep 11, 2024 · 8 comments · Fixed by #3882
Closed

Override the timeout_commit and timeout_propose when switching to v3 #3859

evan-forbes opened this issue Sep 11, 2024 · 8 comments · Fixed by #3882
Assignees
Labels
required issue is required to be closed before workstream can be closed warn:blocked item is not currently being worked on but is still blocked WS: Maintenance 🔧 includes bugs, refactors, flakes, and tech debt etc WS: V3 3️⃣ item is directly relevant to the v3 hardfork
Milestone

Comments

@evan-forbes
Copy link
Member

evan-forbes commented Sep 11, 2024

We should override the timeouts upon the switch to app version v3

@evan-forbes evan-forbes added WS: Maintenance 🔧 includes bugs, refactors, flakes, and tech debt etc WS: V3 3️⃣ item is directly relevant to the v3 hardfork labels Sep 11, 2024
@evan-forbes evan-forbes added this to the v3 milestone Sep 11, 2024
@rootulp
Copy link
Collaborator

rootulp commented Sep 11, 2024

Context

Validator's locally configure these timeouts in config.toml:

# How long we wait for a proposal block before prevoting nil
timeout_propose = "10s"
# How much timeout_propose increases with each round
timeout_propose_delta = "500ms"
# How long we wait after receiving +2/3 prevotes for “anything” (ie. not a single block or nil)
timeout_prevote = "1s"
# How much the timeout_prevote increases with each round
timeout_prevote_delta = "500ms"
# How long we wait after receiving +2/3 precommits for “anything” (ie. not a single block or nil)
timeout_precommit = "1s"
# How much the timeout_precommit increases with each round
timeout_precommit_delta = "500ms"
# How long we wait after committing a block, before starting on the new
# height (this gives us a chance to receive some more precommits, even
# though we already have +2/3).
timeout_commit = "11s"

Problem

We'd like to decrease the block time from ~11.75 seconds to ~6 seconds at the same time that celestia-app v3 gets deployed to networks.

We don't want to rely on validator's manually configuring timeouts in their config.toml because we can't guarantee that they will be modified at the same time. Therefore, we'd like to add ADR-115's next_block_delay as a field in EndBlockResponse. If it exists, use that instead of timeout commit.

Proposal

Backport ADR-115 to celestia-core

Related

@evan-forbes evan-forbes added the required issue is required to be closed before workstream can be closed label Sep 12, 2024
@rootulp
Copy link
Collaborator

rootulp commented Sep 13, 2024

Notes from discussion w/ @evan-forbes :

  • This issue diverges from ADR-115 because we want to override two params not just one (as proposed in ADR-115).
  • This is blocked on experiments for determining what the values are.
    • @evan-forbes hypothesizes that TimeoutCommit = ~4.7 seconds and TimeoutProposal = ~3.5 seconds.
    • @evan-forbes teach @rootulp how to run experiment to determine the correct timeouts. Blocked on @evan-forbes syncing with @sysrex because he's currently using his personal accounts for Congest.

@rootulp rootulp added the warn:blocked item is not currently being worked on but is still blocked label Sep 13, 2024
@rootulp rootulp changed the title Override the timeout commit and proposal timeouts when switching to v3 Override the timeout_commit and timeout_propose when switching to v3 Sep 13, 2024
@rootulp rootulp assigned staheri14 and unassigned rootulp Sep 16, 2024
@staheri14
Copy link
Contributor

Question:
At height X-1 (before upgrade height X), should we suggest timeouts for the current app version (v2) or the upcoming version (v3)?

Option 1: Use Timeouts for Next Version (v3)

  • Pros:
    • Fast block production starts immediately at upgrade height (X).
  • Cons:
    • Inconsistency: Suggested timeouts at height X-1 don’t match the app version in the current block header (v2 vs. v3).

Option 2: Use Timeouts for Current Version (v2)

  • Pros:
    • Consistency: Timeouts always align with the app version in the current block header.
  • Cons:
    • Delayed effect: Timeouts based on v3 will only apply at height X+1 (one height after the upgrade).

@evan-forbes
Copy link
Member Author

good question, my understanding of this is that we should only change the timeouts when the app version is changed, so we should do option 2

@evan-forbes
Copy link
Member Author

to add more reasoning to the above, we don't know if the upgrade is successful until we produce a block that has that version, therefore that is the soonest we possibly could change.

@staheri14
Copy link
Contributor

Thanks, @evan-forbes! That makes sense—I'll go ahead with the second option.

@staheri14
Copy link
Contributor

Design Update/Decision:

For the genesis state, the start time of block height 1 is calculated using cs.CommitTime + timeoutCommit (also see here), where timeoutCommit (in the new desing), depends on the timeouts of the state after applying the previous block. Since there is no block before the genesis block (hence no state), two options can be considered:

  1. Calculating the cs.StartTime of the next height i.e., H=1, by using timeoutCommit of zero.
  2. Using the timeoutCommit from the genesis state itself. This will be somewhat an exception where the cs.StartTime calculated at height 0 uses timeouts meant for height 1.

Currently, I went with the second option in the implementation [link], but please let me know if you think the first option makes more sense.

@evan-forbes
Copy link
Member Author

either or should be okay if I'm understanding correctly, especially for our existing testnets

we could also just use a default timeouts for the first block, as it shouldn't be super important for us atm

evan-forbes added a commit to celestiaorg/celestia-core that referenced this issue Oct 15, 2024
…ight according to the app version (#1494)

The celestia-core portion of
[#3859](celestiaorg/celestia-app#3859)

---------

Co-authored-by: evan-forbes <[email protected]>
evan-forbes added a commit that referenced this issue Oct 16, 2024
Closes #3859

manually tested in
#3882 (comment)

also tested in knuu

---------

Co-authored-by: evan-forbes <[email protected]>
Co-authored-by: Evan Forbes <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Rootul P <[email protected]>
rach-id pushed a commit to celestiaorg/celestia-core that referenced this issue Nov 18, 2024
…ight according to the app version (#1494)

The celestia-core portion of
[#3859](celestiaorg/celestia-app#3859)

---------

Co-authored-by: evan-forbes <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
required issue is required to be closed before workstream can be closed warn:blocked item is not currently being worked on but is still blocked WS: Maintenance 🔧 includes bugs, refactors, flakes, and tech debt etc WS: V3 3️⃣ item is directly relevant to the v3 hardfork
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants