Document how to size the direct memory under expected event size #476

Open · wants to merge 3 commits into main
25 changes: 25 additions & 0 deletions docs/index.asciidoc
@@ -100,6 +100,31 @@ Setting direct memory too low decreases the performance of ingestion.

NOTE: Be sure that heap and direct memory combined does not exceed the total memory available on the server to avoid an OutOfDirectMemoryError

[id="plugins-{type}s-{plugin}-memory-sizing"]
===== How to size the direct memory used

To correctly size the direct memory needed to sustain the flow of incoming Beats connections, you need to know the average size
of the transmitted log lines and the batch size used by Beats (2048 by default). For each connected client, a batch of events
is read and, due to the way the decompressing and decoding parts work, two copies of the batch are kept in memory.
Note for reviewer:
The Beats decoding part keeps two copies of the buffer it is processing in memory; this is the source of the factor of 2 in the expression below.

The expression used to calculate the maximum direct memory is:
["source","text"]
-----
event size * batch size * 2 * number of Beats clients
-----

Supposing a 1 KB event size, plus an overhead of roughly 500 bytes of metadata transferred with each event, and 1000 connected clients,
the maximum direct memory needed can be estimated as:
["source","text"]
-----
1.5 KB * 2048 * 2 * 1000
-----
This totals about 6 GB. So, if you have some data about the average size of the events to process, you can size
the direct memory accordingly without risking an Out-Of-Memory error on the direct memory space in a production environment.
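
Once you have an estimate, you can apply it through the JVM flag `-XX:MaxDirectMemorySize`, which can be set in Logstash's `config/jvm.options`. The snippet below is only an illustrative sketch: the `6g` value comes from the example estimate above and should be replaced with your own figure.
["source","text"]
-----
# config/jvm.options - illustrative value derived from the estimate above
-XX:MaxDirectMemorySize=6g
-----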

NOTE: This calculation represents the worst-case scenario, in which every Beats client has sent an almost full batch but none of
them has completed yet. Under normal circumstances this is unlikely to happen, because different Beats send data at different rates:
while one client is sending, another is idle. However, this situation can occur after a Logstash crash, because on restart
all clients reconnect and send data to the Logstash process at the same time.

//Content for Beats
ifeval::["{plugin}"=="beats"]