Node 2.0.15 keeps crashing with "Failed to map the data file (size: 192): Cannot allocate memory (os error 12)" #3559

Open
davaymne opened this issue Nov 9, 2024 · 2 comments

Comments

davaymne commented Nov 9, 2024

version = 2.0.15

[2024-11-09T12:14:04.028489030Z ERROR solana_gossip::cluster_info] Insert self failed: InsertFailed
[2024-11-09T12:14:14.528584026Z ERROR solana_gossip::cluster_info] Insert self failed: InsertFailed
[2024-11-09T12:14:14.528601962Z ERROR solana_gossip::cluster_info] Insert self failed: InsertFailed
[2024-11-09T12:14:22.028709734Z ERROR solana_gossip::cluster_info] Insert self failed: InsertFailed
[2024-11-09T12:14:22.028715212Z ERROR solana_gossip::cluster_info] Insert self failed: InsertFailed
[2024-11-09T12:17:41.633100625Z ERROR solana_accounts_db::accounts_hash] Failed to map the data file (size: 314400): Cannot allocate memory (os error 12).
[2024-11-09T12:17:41.633327424Z ERROR solana_accounts_db::accounts_hash] Failed to map the data file (size: 1088): Cannot allocate memory (os error 12).
[2024-11-09T12:17:41.633328125Z ERROR solana_accounts_db::accounts_hash] Failed to map the data file (size: 1056): Cannot allocate memory (os error 12).
[2024-11-09T12:17:41.633326924Z ERROR solana_accounts_db::accounts_hash] Failed to map the data file (size: 800): Cannot allocate memory (os error 12).
[2024-11-09T12:17:41.633328125Z ERROR solana_accounts_db::accounts_hash] Failed to map the data file (size: 352): Cannot allocate memory (os error 12).
[2024-11-09T12:17:41.633328496Z ERROR solana_accounts_db::accounts_hash] Failed to map the data file (size: 308256): Cannot allocate memory (os error 12).
[2024-11-09T12:17:41.633400704Z ERROR solana_accounts_db::accounts_hash] Failed to map the data file (size: 309216): Cannot allocate memory (os error 12).
[2024-11-09T12:17:41.633426953Z ERROR solana_accounts_db::accounts_hash] Failed to map the data file (size: 310464): Cannot allocate memory (os error 12).
[2024-11-09T12:17:41.633648224Z ERROR solana_accounts_db::accounts_hash] Failed to map the data file (size: 311040): Cannot allocate memory (os error 12).
[2024-11-09T12:17:41.633998458Z ERROR solana_accounts_db::accounts_hash] Failed to map the data file (size: 314400): Cannot allocate memory (os error 12).
[2024-11-09T12:17:41.634060020Z ERROR solana_accounts_db::accounts_hash] Failed to map the data file (size: 308544): Cannot allocate memory (os error 12).
[2024-11-09T12:17:41.634099149Z ERROR solana_accounts_db::accounts_hash] Failed to map the data file (size: 311200): Cannot allocate memory (os error 12).

Memory itself is fine.

Screenshot 2024-11-09 at 12 27 29

/etc/sysctl.d/21-solana-validator.conf

net.core.rmem_default = 134217728
net.core.rmem_max = 134217728
net.core.wmem_default = 134217728
net.core.wmem_max = 134217728
vm.max_map_count = 1000000
fs.nr_open = 2048000
vm.swappiness = 1
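
A file under /etc/sysctl.d only takes effect once it has been loaded, so it can be worth confirming that the running kernel actually reports these values. The following is a minimal sketch, assuming the usual mapping of a sysctl key such as vm.max_map_count to the path /proc/sys/vm/max_map_count. The function names are hypothetical and are not part of any Solana or Agave tooling.

```python
from __future__ import annotations

from pathlib import Path


def parse_sysctl_conf(text: str) -> dict[str, str]:
    """Parse 'key = value' lines, skipping blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def live_value(key: str, proc_sys: Path = Path("/proc/sys")) -> str | None:
    """Read the running kernel's value for a sysctl key, or None if unreadable."""
    # Assumption: dots in the key correspond to directory separators.
    path = proc_sys.joinpath(*key.split("."))
    try:
        # Collapse whitespace so multi-field values compare cleanly.
        return " ".join(path.read_text().split())
    except OSError:
        return None


def find_mismatches(
    conf_text: str, proc_sys: Path = Path("/proc/sys")
) -> dict[str, tuple[str, str | None]]:
    """Return {key: (configured, live)} for every key whose live value differs."""
    mismatches = {}
    for key, wanted in parse_sysctl_conf(conf_text).items():
        actual = live_value(key, proc_sys)
        if actual != wanted:
            mismatches[key] = (wanted, actual)
    return mismatches
```

Calling `find_mismatches(Path("/etc/sysctl.d/21-solana-validator.conf").read_text())` on the validator host would return an empty dict if every setting above is live.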

@davaymne (Author)

The culprit is that old files are not being deleted from the accounts/run and accounts/snapshot folders:

/accounts/snapshot/301584906/ | wc -l
1412921

accounts/snapshot/301584906/ | head
total 413226660
-rw-r--r-- 2 solana solana       272 Jan  1  1970 296940793.992700
-rw-r--r-- 2 solana solana       136 Jan  1  1970 296940835.683178
-rw-r--r-- 2 solana solana       408 Jan  1  1970 296940837.815933
-rw-r--r-- 2 solana solana       136 Jan  1  1970 296940890.731088
-rw-r--r-- 2 solana solana       408 Jan  1  1970 296940975.84647
-rw-r--r-- 2 solana solana       136 Jan  1  1970 296941149.153122
-rw-r--r-- 2 solana solana       136 Jan  1  1970 296941257.419268
-rw-r--r-- 2 solana solana       408 Jan  1  1970 296941260.348094
-rw-r--r-- 2 solana solana       544 Jan  1  1970 296941276.289218
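
To see how far back the leftover files reach, the directory can be summarized by the numeric prefix of each file name. This is a rough sketch, assuming the names follow a `<slot>.<id>` pattern, which is inferred from the listing above rather than confirmed. The function name `summarize_by_slot` is hypothetical.

```python
from __future__ import annotations

import os


def summarize_by_slot(directory: str) -> tuple[int, int | None, int | None]:
    """Return (file_count, lowest_slot, highest_slot) for names shaped '<slot>.<id>'.

    Entries whose name does not start with a numeric prefix are ignored.
    """
    count = 0
    lowest = highest = None
    # os.scandir streams entries, which avoids building a list of over a
    # million names in memory.
    with os.scandir(directory) as entries:
        for entry in entries:
            slot_text, _, _ = entry.name.partition(".")
            if not slot_text.isdigit():
                continue
            slot = int(slot_text)
            count += 1
            lowest = slot if lowest is None else min(lowest, slot)
            highest = slot if highest is None else max(highest, slot)
    return count, lowest, highest
```

A large gap between the lowest prefix and the snapshot directory's own number would be consistent with old files not being cleaned up.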

antonya86 commented Nov 21, 2024

We face the same problem. The agave process crashes once it reaches the vm.max_map_count limit (1 million). Raising the limit to 2 million doesn't help. It looks like a bug in memory allocation in the agave client.
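
On Linux, mmap can fail with ENOMEM (os error 12) when a process would exceed its maximum number of mappings, even if plenty of RAM is free. One way to check whether that is what is happening is to compare the number of lines in /proc/<pid>/maps, where each line describes one mapping, against vm.max_map_count. The sketch below assumes a Linux host. The helper names are hypothetical and the validator's PID has to be supplied by the operator.

```python
from __future__ import annotations


def count_mappings(maps_path: str) -> int:
    """Count memory mappings; each line of a /proc/<pid>/maps file is one mapping."""
    with open(maps_path) as f:
        return sum(1 for _ in f)


def read_map_limit(limit_path: str = "/proc/sys/vm/max_map_count") -> int:
    """Read the per-process mapping limit currently enforced by the kernel."""
    with open(limit_path) as f:
        return int(f.read().strip())


def mapping_usage(pid: int | str = "self") -> tuple[int, int, float]:
    """Return (mappings in use, limit, fraction of the limit used) for a process."""
    used = count_mappings(f"/proc/{pid}/maps")
    limit = read_map_limit()
    return used, limit, used / limit
```

Sampling `mapping_usage(pid)` periodically for the validator process would show whether the count climbs steadily toward the limit, which would match the reports in this thread.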
