Question about per-cpu cache interaction with hot added vCPUs to VMs. #267
Comments
I think it'd be important to try to reproduce with a build at or after 5823a86 (from July 2023). e33c7bc is from October 2022. Prior to that, we used That patch switches us to reading |
Thanks for the information, @ckennelly.
and
It might be easier to start with the latest version at head. A lot has changed about the per-CPU cache in the last 12 months. Is the crash consistently in the deallocation parts of the per-CPU cache, though? If you can compile with file+line debugging information, that might give some indication as to what's being accessed.
I could not reproduce this with the version at a more recent Thanks for the pointers @ckennelly. While I still don't know what caused this issue or how it exactly got fixed, it does look like the issue does not exist anymore. |
Hello,
This is more of a question but could be a bug report if it stands.
Background:
We recently observed crashes while running Envoyproxy (v1.25.9 + custom filters, tcmalloc:59400332b9cff9920b6a1da203ac1575272a9f44) in our environment, and the crashes were determined to coincide with the addition of new vCPUs to the running VM. The process was running natively on the VM, not in a Docker container.
This was 100% reproducible, but the stack traces differed each time. However, in most cases the common denominator in the stack traces was a set of tcmalloc functions.
Compiling with gperftools made the crash go away so we had it narrowed down to tcmalloc.
We tried writing a small producer/consumer program in which the producer allocates heap memory and multiple consumers write to this memory and then free it. Sure enough, we see similar crashes with this small test program as well when vCPUs are hot plugged.
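For readers who want to try something similar, here is a minimal sketch of that kind of test. It is not the reporter's actual program: the function name `RunStress`, its parameters, and the mutex-protected queue are all assumptions made for illustration. The producer allocates buffers on one thread, and each consumer writes to a buffer and frees it on another thread, so allocations and deallocations tend to land on different CPUs.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical helper, not taken from the original report.
// The calling thread acts as the producer and allocates `num_buffers`
// buffers of `buffer_size` bytes. Each consumer thread pops a buffer,
// writes to it, and frees it. Returns the number of buffers freed.
// Assumes num_consumers >= 1; with zero consumers the buffers are leaked.
std::size_t RunStress(int num_consumers, std::size_t num_buffers,
                      std::size_t buffer_size) {
  std::mutex mu;
  std::condition_variable cv;
  std::queue<char*> pending;  // guarded by mu
  bool done = false;          // guarded by mu
  std::size_t freed = 0;      // guarded by mu

  auto consumer = [&] {
    std::size_t local = 0;
    for (;;) {
      char* buf = nullptr;
      {
        std::unique_lock<std::mutex> lock(mu);
        cv.wait(lock, [&] { return done || !pending.empty(); });
        if (pending.empty()) break;  // producer finished and queue drained
        buf = pending.front();
        pending.pop();
      }
      std::memset(buf, 0xAB, buffer_size);  // write to producer's memory
      delete[] buf;                         // free on the consumer thread
      ++local;
    }
    std::lock_guard<std::mutex> lock(mu);
    freed += local;
  };

  std::vector<std::thread> consumers;
  for (int i = 0; i < num_consumers; ++i) consumers.emplace_back(consumer);

  // Producer: allocate on this thread and hand each buffer to a consumer.
  for (std::size_t i = 0; i < num_buffers; ++i) {
    char* buf = new char[buffer_size];
    {
      std::lock_guard<std::mutex> lock(mu);
      pending.push(buf);
    }
    cv.notify_one();
  }
  {
    std::lock_guard<std::mutex> lock(mu);
    done = true;
  }
  cv.notify_all();

  for (auto& t : consumers) t.join();
  return freed;
}
```

Calling something like `RunStress(4, 1000000, 256)` in a loop from `main`, in a binary linked against tcmalloc, while vCPUs are hot added to the VM would be one way to exercise this path. The buffer count and size here are arbitrary, and whether this triggers the crash depends on the tcmalloc version and environment.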
Admittedly, we have not been able to test this with a version of tcmalloc newer than e33c7bc60415127c104006d3301c96902f98d42a, which is the latest version that Envoyproxy depends on.
Question:
How does tcmalloc handle hot plugging vCPUs to VMs with per-cpu caches? I'm very interested in knowing how this is or could be done.
Is this a bug, an unsupported scenario, or something that has been fixed since e33c7bc60415127c104006d3301c96902f98d42a?
Any pointers are much appreciated.
-Aditya
Example traces