Refine size-your-shards wording (#89081) (#89159)
Clarify that the limits in the docs are absolute maxima that will avoid
things just breaking but won't necessarily give great performance.
DaveCTurner authored Aug 8, 2022
1 parent e9ea4cb commit 03db466
Showing 1 changed file with 22 additions and 7 deletions.
29 changes: 22 additions & 7 deletions docs/reference/how-to/size-your-shards.asciidoc
@@ -175,17 +175,25 @@ index prirep shard store

[discrete]
[[shard-count-recommendation]]
-==== Aim for 3000 indices or fewer per GB of heap memory on each master node
+==== Master-eligible nodes should have at least 1GB of heap per 3000 indices

The number of indices a master node can manage is proportional to its heap
size. The exact amount of heap memory needed for each index depends on various
factors such as the size of the mapping and the number of shards per index.

-As a general rule of thumb, you should aim for 3000 indices or fewer per GB of
-heap on master nodes. For example, if your cluster contains 12000 indices then
-each dedicated master node should have at least 4GB of heap. For non-dedicated
-master nodes, the same rule holds and should be added to the heap requirements
-of the other roles of each node.
+As a general rule of thumb, you should have fewer than 3000 indices per GB of
+heap on master nodes. For example, if your cluster has dedicated master nodes
+with 4GB of heap each then you should have fewer than 12000 indices. If your
+master nodes are not dedicated master nodes then the same sizing guidance
+applies: you should reserve at least 1GB of heap on each master-eligible node
+for every 3000 indices in your cluster.
+
+Note that this rule defines the absolute maximum number of indices that a
+master node can manage, but does not guarantee the performance of searches or
+indexing involving this many indices. You must also ensure that your data nodes
+have adequate resources for your workload and that your overall sharding
+strategy meets all your performance requirements. See also
+<<single-thread-per-shard>> and <<each-shard-has-overhead>>.

To check the configured size of each node's heap, use the <<cat-nodes,cat nodes
API>>.
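
The master-node rule in this hunk is simple arithmetic, so a minimal sketch may help when reviewing the wording. The function names below are hypothetical and not part of Elasticsearch, and rounding up to a whole GB is an assumption made here for illustration. The rule describes a ceiling rather than a performance target, so treat the results as bounds.

```python
# Rule of thumb from the hunk above: at least 1GB of heap per 3000 indices
# on each master-eligible node.
INDICES_PER_GB = 3000


def min_master_heap_gb(index_count: int) -> int:
    """Smallest whole number of GB of heap that covers index_count indices.

    Hypothetical helper: rounding up to a whole GB is an assumption.
    """
    if index_count < 0:
        raise ValueError("index_count must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-index_count // INDICES_PER_GB)


def max_indices_for_heap(heap_gb: int) -> int:
    """Upper bound on the index count for a master node with heap_gb of heap."""
    if heap_gb < 0:
        raise ValueError("heap_gb must be non-negative")
    return heap_gb * INDICES_PER_GB


if __name__ == "__main__":
    print(min_master_heap_gb(12000))
    print(max_indices_for_heap(4))
```

For a node that also holds other roles, the text says this figure is reserved in addition to whatever heap those roles need, so the result would be added to that requirement rather than used alone.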
@@ -207,7 +215,7 @@ GET _cat/shards?v=true

[discrete]
[[field-count-recommendation]]
-==== Allow 1kB of heap per field per index on data nodes, plus overheads
+==== Data nodes should have at least 1kB of heap per field per index, plus overheads

The exact resource usage of each mapped field depends on its type, but a rule
of thumb is to allow for approximately 1kB of heap overhead per mapped field
@@ -222,6 +230,13 @@ For example, if a data node holds shards from 1000 indices, each containing
of heap for the fields and another 0.5GB of heap for its workload and other
overheads, and therefore this node will need a heap size of at least 4.5GB.

+Note that this rule defines the absolute maximum number of indices that a data
+node can manage, but does not guarantee the performance of searches or indexing
+involving this many indices. You must also ensure that your data nodes have
+adequate resources for your workload and that your overall sharding strategy
+meets all your performance requirements. See also <<single-thread-per-shard>>
+and <<each-shard-has-overhead>>.
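
The data-node example in this hunk (1000 indices, 0.5GB of overhead, at least 4.5GB of heap) can be reproduced with a short sketch. The number of fields per index is cut off by the hunk boundary in this view, so the value 4000 used below is an assumption, chosen only because it is consistent with the 4.5GB total. The function name is hypothetical, and decimal units (1GB = 1,000,000kB) are assumed.

```python
# Rule of thumb from the text: roughly 1kB of heap per mapped field per index,
# plus a separate allowance for the node's workload and other overheads.
KB_PER_FIELD = 1
KB_PER_GB = 1_000_000  # assumption: decimal units


def min_data_node_heap_gb(index_count: int,
                          fields_per_index: int,
                          overhead_gb: float = 0.5) -> float:
    """Estimate the minimum heap in GB for a data node (hypothetical helper).

    overhead_gb defaults to the 0.5GB used in the documentation example.
    """
    if index_count < 0 or fields_per_index < 0 or overhead_gb < 0:
        raise ValueError("inputs must be non-negative")
    field_heap_gb = index_count * fields_per_index * KB_PER_FIELD / KB_PER_GB
    return field_heap_gb + overhead_gb


if __name__ == "__main__":
    # 1000 indices comes from the hunk header; 4000 fields is an assumed value.
    print(min_data_node_heap_gb(1000, 4000))
```

As the added note says, a figure like this is a floor for keeping the node functional, not an estimate of the heap needed for good search or indexing performance.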

[discrete]
[[avoid-node-hotspots]]
==== Avoid node hotspots