SIGSEGV on startup with large value of --max-stats if total stats size is greater than 2^32 #3463
Various bits of code in BlockMemoryHashSet use uint32_t for the size, so this is the result of an undetected overflow.
cc @jmarantz
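For illustration, a minimal standalone sketch of that failure mode (the flag value comes from the repro below and the per-entry size from the discussion further down; this is a demo, not Envoy's code):

```cpp
#include <cstdint>
#include <iostream>

int main() {
  const uint64_t num_stats = 10'000'000'000ULL; // the repro's --max-stats
  const uint64_t bytes_per_stat = 160;          // per-entry size discussed below
  const uint64_t actual = num_stats * bytes_per_stat;     // 1,600,000,000,000
  // If the size math is done in uint32_t, the total silently wraps mod 2^32.
  const uint32_t wrapped = static_cast<uint32_t>(actual); // 2,272,165,888
  std::cout << "actual size:   " << actual << " bytes\n"
            << "uint32_t sees: " << wrapped << " bytes\n";
}
```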
I assume the action item is just to check for that and print a reasonable error... or do you think we should use larger ints for the offsets in the structure?
I think checking/printing is good enough for now.
Can you assign this to me?
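A hedged sketch of what the check-and-error approach could look like; `checkedNumBytes` is a hypothetical helper, not Envoy's actual API:

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical validation: do the size math in 64 bits and reject any
// configuration whose total the structure's 32-bit offsets cannot address.
uint64_t checkedNumBytes(uint64_t num_cells, uint64_t bytes_per_cell) {
  // Divide rather than multiply so the check itself cannot overflow.
  if (bytes_per_cell != 0 &&
      num_cells > std::numeric_limits<uint32_t>::max() / bytes_per_cell) {
    throw std::invalid_argument(
        "--max-stats too large: total stats size exceeds 2^32 bytes");
  }
  return num_cells * bytes_per_cell;
}
```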
Actually, the system can't handle 1G cells and it should be able to, even though the number of bytes may be >4G, so I think there are two parts to this bug:
In my workspace, I've changed some uint32_t values to size_t that were artificially limiting the number of stats to <4G, but I don't think that matters in practice, at least on my machine, because I run out of memory before I run out of addressable stats. I can handle about 200M stats, but this bug is about 10G stats, which requires 1.6T of shared memory. My workstation doesn't have that much, yet creating the shared memory and mmap-ing it both succeed; it only fails once you access enough pages, and the failure indication (for me) is SIGBUS. That failure happens when I ask for 250M stats, which can easily be indexed in 32 bits. I think we should just declare a maximum number of stats of at most 100M, or maybe some smaller reasonable number, independent of available memory. What should that number be?
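To illustrate that lazy-allocation behavior, a sketch using POSIX shared memory (the name and size are made up, and build/OS details vary):

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>

// Demo of the lazy-allocation behavior: shm_open, ftruncate, and mmap all
// succeed for a region far larger than available shared memory, because
// pages are only backed on first touch; touching them is what raises SIGBUS.
int main() {
  const off_t size = 1600LL * 1024 * 1024 * 1024; // ~1.6T, illustrative
  const size_t len = static_cast<size_t>(size);
  int fd = shm_open("/stats_demo", O_CREAT | O_RDWR, 0600);
  if (fd < 0 || ftruncate(fd, size) != 0) { perror("setup"); return 1; }
  void* base = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (base == MAP_FAILED) { perror("mmap"); return 1; }
  std::printf("mapped %lld bytes without error\n",
              static_cast<long long>(size));
  // static_cast<char*>(base)[0] = 1; // first touch past backing => SIGBUS
  munmap(base, len);
  close(fd);
  shm_unlink("/stats_demo");
}
```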
Fun fact: if you are debugging boundary conditions with an egregious amount of shared memory, make sure you get a clean run with a sane size afterwards; otherwise your workstation will be left with zero bytes of shared memory available, and a lot of things will not work, in and out of Envoy...
In my case, I have much larger-than-default names. I think I'm setting max-obj-name-len to 384, and the crashing value for max-stats was 10M. These are much smaller than the numbers you're talking about. Was your testing done after converting uint32_t to size_t? If so, that may be right. I think the original bug was just an overflow of a uint32_t.
Yeah, there's a real limitation imposed by the implementation, which I have fixed: some of the byte-size values were represented as uint32_t. Without my change I segv on 100M stats (the limit I just set), and with it I don't. There should probably also be a limit on max-stats * max-obj-name-len, so the total is likely to fit into shared memory.
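A sketch of that combined limit; the budget and per-entry overhead here are assumptions for illustration, not values from the code:

```cpp
#include <cstdint>

// Hypothetical combined check: validate --max-stats and --max-obj-name-len
// together against a total shared-memory budget, since the per-entry size
// grows with the name length. All constants here are illustrative.
constexpr uint64_t kShmBudgetBytes = 16ULL << 30; // assume a 16G budget
constexpr uint64_t kPerStatOverhead = 160;        // per-entry size from below

bool fitsInSharedMemory(uint64_t max_stats, uint64_t max_obj_name_len) {
  const uint64_t per_entry = kPerStatOverhead + max_obj_name_len;
  // Divide instead of multiplying so the check itself cannot overflow.
  return max_stats <= kShmBudgetBytes / per_entry;
}
```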
With the existing system, 25M stats works:
and 30M stats fails:
because it's 160 bytes per entry, and 30M * 160 = 4.8e9 overflows 32 bits, whereas 25M * 160 = 4.0e9 fits. Fixing the offsets resolves this, and I have that change pending.
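The boundary is easy to verify with a standalone check (not Envoy code):

```cpp
#include <cstdint>
#include <iostream>

int main() {
  const uint64_t kBytesPerEntry = 160;
  const uint64_t at_25m = 25'000'000 * kBytesPerEntry; // 4,000,000,000
  const uint64_t at_30m = 30'000'000 * kBytesPerEntry; // 4,800,000,000
  std::cout << at_25m << " <= " << UINT32_MAX << " -> fits in 32 bits\n"
            << at_30m << " >  " << UINT32_MAX << " -> wraps to "
            << static_cast<uint32_t>(at_30m) << "\n";        // 505,032,704
}
```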
Repro steps:
envoy --max-stats 10000000000
Call Stack: