client/cli/src/config: Warn on low file descriptor limit #6956
Merged
We should prevent starting the node at all when the file descriptor limit is below a certain threshold. Maybe below 5k?
Otherwise the warning will not be seen by anyone.
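A minimal sketch of what such a startup check could look like, assuming a Unix target and the `libc` crate; the 5_000 threshold, the function name and the error text are illustrative and not taken from this PR:

```rust
/// Hypothetical startup check: refuse to start when the soft open-file
/// limit is below an assumed minimum, as suggested above.
fn check_fd_limit() -> Result<(), String> {
    const MIN_OPEN_FILE_DESCRIPTORS: u64 = 5_000; // assumed threshold

    let mut limit = libc::rlimit { rlim_cur: 0, rlim_max: 0 };
    // Safety: `getrlimit` only writes into the struct we pass in.
    if unsafe { libc::getrlimit(libc::RLIMIT_NOFILE, &mut limit) } != 0 {
        // Could not determine the limit; fall back to a warning only.
        eprintln!("Warning: unable to query the open file descriptor limit.");
        return Ok(());
    }

    if (limit.rlim_cur as u64) < MIN_OPEN_FILE_DESCRIPTORS {
        return Err(format!(
            "Open file descriptor limit is {}, below the required {}. \
             Raise it (e.g. with `ulimit -n`) before starting the node.",
            limit.rlim_cur, MIN_OPEN_FILE_DESCRIPTORS
        ));
    }
    Ok(())
}
```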
1024 descriptors should really be enough for all our purposes under default settings for most users. Blocking startup on this does not seem reasonable. The number of file descriptors required really depends on the settings: if the user wishes to serve thousands of peers and RPC clients at the same time, they might need an even higher number.
I'd set the warning limit at 1024. If substrate or polkadot currently requires more with default settings, that is a bug which should be fixed not with a warning, but by reducing the number of open files and sockets.
In the current plan for the Polkadot networking, all validators would be permanently connected to all other validators. So if we have 1000 validators on the network, each validator would maintain 999 TCP connections all the time. 1024 doesn't seem nearly enough.
How is it a bug to have more than 1024 descriptors? It's akin to saying "if we use more than 2 GiB of memory, that is a bug". These limits are arbitrary. The Linux kernel is capable of serving far more than 1024 sockets, and I don't see what limiting ourselves to 1024 brings other than making our life more difficult.
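For context, a process can also raise its own soft limit up to the hard limit at startup instead of relying on the operator to do it; a rough sketch, again assuming a Unix target and the `libc` crate (crates such as `fdlimit` wrap the same idea):

```rust
/// Raise this process's soft RLIMIT_NOFILE to the hard limit.
/// Sketch only; error handling and platform quirks (e.g. macOS caps)
/// are glossed over.
fn raise_fd_limit() -> std::io::Result<u64> {
    let mut limit = libc::rlimit { rlim_cur: 0, rlim_max: 0 };
    if unsafe { libc::getrlimit(libc::RLIMIT_NOFILE, &mut limit) } != 0 {
        return Err(std::io::Error::last_os_error());
    }

    // The soft limit may be raised up to the hard limit without extra privileges.
    limit.rlim_cur = limit.rlim_max;
    if unsafe { libc::setrlimit(libc::RLIMIT_NOFILE, &limit) } != 0 {
        return Err(std::io::Error::last_os_error());
    }

    Ok(limit.rlim_cur as u64)
}
```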
As I was saying, under default settings we should not be using that many. The default peer limit is 50, the database requires a roughly constant number of handles of about 100, and RPC connections are limited to 500 IIRC. If we are still using more than 1024 handles, I'd like to know where and for what purpose. Any unaccounted leaking of file descriptors I'd consider a bug.
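Taking the figures above at face value (they are estimates from this discussion, not measured numbers), the default-settings budget works out roughly as:

```rust
// Rough file descriptor budget under default settings, using the
// estimates quoted above: 50 peers, ~100 database handles, 500 RPC clients.
const PEER_CONNECTIONS: u64 = 50;
const DATABASE_HANDLES: u64 = 100;
const RPC_CONNECTIONS: u64 = 500;

// 650 in total, comfortably below a 1024 soft limit; exceeding 1024 under
// defaults would therefore point at leaked descriptors somewhere.
const ESTIMATED_FDS: u64 = PEER_CONNECTIONS + DATABASE_HANDLES + RPC_CONNECTIONS;
```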
I was clearly saying "under default settings for most users". Validators are not most users, and even for them we won't be maintaining 999 connections. This is simply not scalable. We would need to select a subset and invent some kind of routing once we get to such numbers.
As a simple user who only wishes to send a transaction, I don't expect the client to open 1000 connections and run out of file handles. If it does that, I'd consider it a bug.
We (by "we", I mean me, but mostly the Web3 Foundation) have known about that problem for probably two years now, but nobody has ever come up with a concrete idea.
@arkpar @tomaka I am having difficulty understanding why 1_000 connections per validator is problematic in itself. Can you expand on what you are basing the assumption on that 1_000 TCP connections per validator is not scalable? My question refers to the TCP connections themselves; I am not suggesting sending every message to everyone.
I was referring to the validator messaging protocol, which currently sends every message to every validator. If we switch that to routing at some point, there won't be any need to keep connections that are not used.