additional solana-sys-tuner limits #22566

Closed
wants to merge 5 commits into from


jbiseda
Contributor

@jbiseda jbiseda commented Jan 18, 2022

Problem

Add additional OS network tuning for burst traffic

Summary of Changes

Add updated solana-sys-tuner limits for:
net.core.optmem_max=4194304
net.core.netdev_max_backlog=250000

per: https://community.mellanox.com/s/article/linux-sysctl-tuning
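As a rough illustration of what applying these two limits involves, the sketch below writes each value to its file under `/proc/sys`. It is not the code from this PR: `sysctl_path`, `apply_proposed_limits`, and the body of `sysctl_write` are hypothetical stand-ins, and only the `sysctl_write(name, value)` call shape is taken from the diff. Writing these files requires root on Linux.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// The two limits proposed in this PR, as (sysctl key, value) pairs.
const PROPOSED_LIMITS: [(&str, &str); 2] = [
    ("net.core.optmem_max", "4194304"),
    ("net.core.netdev_max_backlog", "250000"),
];

/// Map a dotted sysctl key to its file under /proc/sys.
/// Hypothetical helper; assumes no path component itself contains a dot.
fn sysctl_path(name: &str) -> PathBuf {
    PathBuf::from("/proc/sys").join(name.replace('.', "/"))
}

/// Write one sysctl value (needs root).
/// Hypothetical stand-in for the `sysctl_write` call visible in the diff.
fn sysctl_write(name: &str, value: &str) -> io::Result<()> {
    fs::write(sysctl_path(name), value)
}

/// Apply every proposed limit, stopping at the first failure.
fn apply_proposed_limits() -> io::Result<()> {
    for &(name, value) in PROPOSED_LIMITS.iter() {
        sysctl_write(name, value)?;
    }
    Ok(())
}
```

Calling `apply_proposed_limits()` from a privileged process would set both values until the next reboot; persisting them would still need a sysctl configuration file.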

Fixes #

@mvines
Member

mvines commented Jan 18, 2022

If these values stick, please add them to the manual tuning section in the docs as well: https://docs.solana.com/running-validator/validator-start#manual

@codecov

codecov bot commented Jan 18, 2022

Codecov Report

Merging #22566 (7494d71) into master (6edeed8) will increase coverage by 0.0%.
The diff coverage is 93.8%.

❗ Current head 7494d71 differs from pull request most recent head 0ed81c8. Consider uploading reports for the commit 0ed81c8 to get more accurate results

@@           Coverage Diff           @@
##           master   #22566   +/-   ##
=======================================
  Coverage    81.1%    81.1%           
=======================================
  Files         560      561    +1     
  Lines      151206   151286   +80     
=======================================
+ Hits       122633   122719   +86     
+ Misses      28573    28567    -6     

@@ -93,6 +93,10 @@ fn tune_kernel_udp_buffers_and_vmmap() {

// increase mmap counts for many append_vecs
sysctl_write("vm.max_map_count", "1000000");

// Reference: https://community.mellanox.com/s/article/linux-sysctl-tuning
Contributor


This document is about TCP tuning, so I poked around for the same vars WRT UDP and found https://indico.cern.ch/event/212228/contributions/1507212/attachments/333941/466017/10GE_network_tests_with_UDP.pdf. Admittedly it's tuning for 10Gbit and using a larger MTU, but at least it's the right protocol. The suggestions are quite a bit different and also touch a couple of other knobs that we haven't investigated.

Additionally, it got me wondering whether we might be pinning PoH to the NIC's IRQ affinity core, which would probably be bad.

WDYT about rolling new values out to our nodes for a week and monitoring before we change the code?

Contributor Author


Yeah, let's try on our nodes first.
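For the monitoring period discussed above, one possible signal (an assumption, not something the thread specifies) is the kernel's UDP error counters in `/proc/net/snmp`, for example `RcvbufErrors`. The sketch below parses one counter out of that file's text; `udp_counter` is a hypothetical helper, not part of the Solana codebase.

```rust
/// Extract one UDP counter (e.g. "RcvbufErrors") from the text of /proc/net/snmp.
/// That file lists each protocol as two lines with the same prefix:
/// a line of field names followed by a line of values.
/// Hypothetical helper for illustration only.
fn udp_counter(snmp: &str, field: &str) -> Option<u64> {
    // "UdpLite:" lines do not match the "Udp:" prefix, so they are skipped.
    let mut udp_lines = snmp.lines().filter(|line| line.starts_with("Udp:"));
    let header = udp_lines.next()?;
    let values = udp_lines.next()?;
    header
        .split_whitespace()
        .zip(values.split_whitespace())
        .find(|(name, _)| *name == field)
        .and_then(|(_, value)| value.parse().ok())
}
```

In use, the input would come from `std::fs::read_to_string("/proc/net/snmp")`, sampled before and after the new values are rolled out, so the change in the counter can be compared across the two periods.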

@jbiseda jbiseda marked this pull request as draft January 20, 2022 05:07
@stale

stale bot commented Mar 2, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Mar 2, 2022
@stale

stale bot commented Apr 16, 2022

This stale pull request has been automatically closed. Thank you for your contributions.

@stale stale bot closed this Apr 16, 2022
@McSim85

McSim85 commented May 17, 2023

Hey @jbiseda @t-nelson
How was your testing going?

I am asking whether we should change the zero value to something sensible.
In practice, setting net.core.optmem_max to an infinite value could cause problems, since it would allow a single socket to allocate a huge amount of memory for options.

@jbiseda
Contributor Author

jbiseda commented May 17, 2023

> Hey @jbiseda @t-nelson How was your testing going?

sys-tuner has been removed, see: #31682

Labels
stale [bot only] Added to stale content; results in auto-close after a week.

4 participants