Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add configurable BN service request timeouts #2371

Conversation

hansieodendaal
Copy link
Contributor

@hansieodendaal hansieodendaal commented Oct 22, 2020

Description

  • This change adds configurable base node service request timeouts, notably differentiating between general service requests (default at 180s), requests to fetch blocks for block sync (default at 30s) and requests for a complete current UTXO set included in the blockchain (default at 600s).
  • Removed a mempool service request timeout setting in the sample config files that was not used.

Motivation and Context

Different aspects of the base node operation require different timeouts to work optimally.

How Has This Been Tested?

Tested in a base node on Windows

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Feature refactor (No new feature or functional changes, but performance or technical debt improvements)
  • New Tests
  • Documentation

Checklist:

  • I'm merging against the development branch.
  • I ran cargo-fmt --all before pushing.
  • I ran cargo test successfully before submitting my PR.
  • I have squashed my commits into a single commit.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.

Copy link
Collaborator

@stringhandler stringhandler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this. I think let's remove the constants entirely and just use the defaults in the config file (or the actual values in the config file)

@@ -23,6 +23,12 @@
use std::time::Duration;

/// The allocated waiting time for a request waiting for service responses from remote base nodes.
pub const BASE_NODE_SERVICE_REQUEST_TIMEOUT: Duration = Duration::from_secs(180);
pub const GENERAL_SERVICE_REQUEST_TIMEOUT: Duration = Duration::from_secs(180);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These can be removed since they are defaulted in the config

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constants removed in favour of moving the values directly into the default struct method.

/// The allocated waiting time for a fetch UTXOs request waiting for service responses from remote base nodes.
pub const FETCH_UTXOS_SERVICE_REQUEST_TIMEOUT: Duration = Duration::from_secs(600);
/// The minimum allocated waiting time for any request waiting for service responses from remote base nodes.
pub const SERVICE_REQUEST_MINIMUM_TIMEOUT: Duration = Duration::from_secs(10);
/// The fraction of responses that need to be received for a corresponding service request to be finalize.
pub const BASE_NODE_SERVICE_DESIRED_RESPONSE_FRACTION: f32 = 0.6;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should also be removed and added to the config file with a default

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constants removed in favour of moving the values directly into the default struct method, however, desired_response_fraction bit added to the config file, as we do not have a requirement for that to be user-configurable at the moment.

@hansieodendaal hansieodendaal force-pushed the ho_configurable_bn_service_request_timeouts branch 3 times, most recently from 7d920b9 to 0dadf07 Compare October 22, 2020 19:41
@hansieodendaal hansieodendaal force-pushed the ho_configurable_bn_service_request_timeouts branch from 0dadf07 to 6aa341f Compare October 26, 2020 04:55
Copy link
Collaborator

@stringhandler stringhandler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with it in general, but I think the removal of unrelated constants (e.g. MAX_METADATA_REQUEST_RETRY_ATTEMPTS) without adding config values for them is a step backwards. If you revert those or add config values for them, I'm happy to merge

@hansieodendaal hansieodendaal force-pushed the ho_configurable_bn_service_request_timeouts branch 2 times, most recently from 15cf763 to 1eabd46 Compare October 27, 2020 15:39
@hansieodendaal
Copy link
Contributor Author

hansieodendaal commented Oct 28, 2020

@mikethetike, I have tested out the 30s default timeout to fetch blocks; this is not adequate for our current block sync strategy. Three out of eight nodes started banning sync peers because blocks were not returned in 30s, and could not sync to the tip anymore. I propose we change this default setting to 60s or 120s, do you agree?

2020-10-28 09:55:22.522676500 [c::bn::base_node_service::service] WARN  Request (request key 11880102790285126520) timed out after 30007ms
2020-10-28 09:55:22.522715000 [c::bn::base_node_service::service] ERROR Failed to send outbound request (request key: 11880102790285126520): RequestTimedOut
2020-10-28 09:55:22.522747100 [c::bn::state_machine_service::states::block_sync] DEBUG Failed to fetch blocks from peer: RequestTimedOut. Retrying.
2020-10-28 09:55:22.522755300 [c::bn::state_machine_service::states::block_sync] INFO  Banning peer Node ID: 1cc148eeaa18107edfe665a06b

Edit:
It can take 30s for one block. So should be 5*30.
Changed to 150

base_layer/core/src/base_node/service/service.rs Outdated Show resolved Hide resolved
common/config/presets/windows.toml Outdated Show resolved Hide resolved
common/config/presets/windows.toml Outdated Show resolved Hide resolved
common/src/configuration/utils.rs Outdated Show resolved Hide resolved
Copy link
Contributor Author

@hansieodendaal hansieodendaal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If agreed, please commit suggestions before merge. Thanks
Will rebase

This change adds configurable base node service request timeouts, notably
differentiating between general service requests, requests to fetch blocks
for block sync and requests for a complete current UTXO set included in the
blockchain.
@hansieodendaal hansieodendaal force-pushed the ho_configurable_bn_service_request_timeouts branch from 1eabd46 to 94e67a1 Compare October 28, 2020 12:58
@stringhandler stringhandler merged commit 5a36b76 into tari-project:development Oct 28, 2020
@hansieodendaal hansieodendaal deleted the ho_configurable_bn_service_request_timeouts branch November 2, 2020 12:46
stringhandler added a commit that referenced this pull request Nov 24, 2020
Major changes since v0.6.0
---
- [#2431](#2431) Database refactor to remove checkpoints and more efficient calculation of MMR roots
- [#2352](#2352) Add DNS seed support in base node

Minor changes since v0.6.0
---
- [#2448](#2448) Add OpenSSL to Windows install, runtime
- [#2434](#2434) Add supervisord setup notes
- [#2439](#2439) Provide initial sync status to merge mining proxy
- [#2377](#2377) Message malleability detect and ban
- [#2447](#2447) Fix for xmrig powershell script
- [#2440](#2440) Stagenet Setup Guide Corrections
- [#2444](#2444) DHT connectivity waits for comms connectivity before starting
- [#2427](#2427) Update merge mining runtime, README, Win install
- [#2420](#2420) Show base node chain tip and sync status in the console wallet
- [#2419](#2419) Add optional fee_per_gram field to console wallet Send Tx form
- [#2421](#2421) Fix QR code rendering in the console wallet
- [#2423](#2423) Plumb in the balance in the console wallet
- [#2415](#2415) Prevent loop in peer sync by storing all peers attempted
- [#2383](#2383) Implement daemon-mode in tari_base_node
- [#2407](#2407) Simplify automated stress test
- [#2397](#2397) Fix preset config files
- [#2400](#2400) Implement wallet base node service
- [#2403](#2403) Add exclusive file locks to Wallet, Chain and Peer db’s
- [#2408](#2408) Fix wallet conversion error for a valid tx status
- [#2356](#2356) Combine validation code to use same function in pruned and archive mode.
- [#2371](#2371) Add configurable BN service request timeouts
- [#2430](#2430) Implement entry and persistence of custom base node in console wallet
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants