88346: roachtest: use n1-standard for 16-core GCE machines r=srosenberg a=erikgrinaker
Roachtest used `n1-highcpu` machines at 16 cores and beyond. However, this causes a memory cliff, because an `n1-standard-8` machine has ~30 GB memory (3.75 GB per core), but an `n1-highcpu-16` machine only has 14 GB memory (0.9 GB per core).
This patch makes 16-core machines use `n1-standard` as well, with 60 GB memory, and only switches to `n1-highcpu` at 32 cores (with 29 GB memory).
Touches #87809.
Release note: None
Co-authored-by: Erik Grinaker <[email protected]>
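The machine-type selection described in the patch above can be sketched roughly as follows. This is an illustrative sketch only, not roachtest's actual code: the function names `machine_type` and `memory_gb` are hypothetical, and the per-core memory figures are the approximate ones quoted in the commit message.

```python
# Hypothetical sketch of the selection rule described in the commit message.
# Names and structure are assumptions, not roachtest's real implementation.

# Approximate memory per core (GB) for each n1 family, as quoted above.
GB_PER_CORE = {"standard": 3.75, "highcpu": 0.9}

# After the patch, n1-highcpu is only used from 32 cores upward
# (previously the switch happened at 16 cores).
HIGHCPU_THRESHOLD = 32


def family_for(cpus: int) -> str:
    """Return the n1 machine family for the given core count."""
    return "highcpu" if cpus >= HIGHCPU_THRESHOLD else "standard"


def machine_type(cpus: int) -> str:
    """Return the GCE machine type name, e.g. 'n1-standard-16'."""
    return f"n1-{family_for(cpus)}-{cpus}"


def memory_gb(cpus: int) -> float:
    """Return the approximate total memory (GB) for the chosen machine type."""
    return GB_PER_CORE[family_for(cpus)] * cpus


if __name__ == "__main__":
    for cpus in (8, 16, 32):
        print(machine_type(cpus), memory_gb(cpus))
```

With the threshold at 32 cores, total memory no longer drops when going from 8 to 16 cores; the drop moves to the 16-to-32 step instead, where the absolute amount is still close to that of an `n1-standard-8`.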
The Raft layer does not have memory budgeting -- in particular, it needs a global memory budget shared across all Raft groups. This makes it vulnerable to OOMs if many Raft groups are seeing concurrent memory-intensive operations, typically large messages like SST ingestions. The work to resolve this is tracked in:
Until this work gets prioritized, we should increase the memory of the failing roachtests to avoid test flakes. This includes, but is not limited to:
Jira issue: CRDB-19547