Segmentation fault in v8::internal::compiler::(anonymous namespace)::MayAlias (10.15.2 from Debian Buster) #31484
Is there any chance you could share a reproduction? Can you try other Node.js versions and compare the result (including, ideally, verifying that the 10.15.2 binary from https://nodejs.org/en/ or e.g. |
No. As I said, it's a fairly complex application containing a bunch of internal business logic. I started trying to rip out JavaScript code that I assume is not executed within the staging system to reduce the footprint of the application, but so far without any real success. The biggest issue is that I am not yet able to reliably trigger the segmentation fault. With an unmodified application it takes roughly 30 minutes to 1 hour to crash with the current traffic on that system. I can offer to poke around within the core file. I might be able to provide the core file to a nodejs core developer in private. I would need to check with my manager and clean up the configuration so as not to leak any private information, such as TLS certificates, though.
Yes, I can try that. I just downloaded |
That was quick. I started a traffic generator in parallel and the application crashed at 23:59:14 UTC. Find below the stack trace with the node binary downloaded from nodejs.org:
Edit: After saving the core file and restarting the traffic generator the application crashed at 00:03:13 UTC. |
Can you open the core dump with gdb and post the output of |
Sure. Disassembly cut at some location after the current instruction that still fit on my terminal. I can give more disassembly if it is helpful. From the Debian binary
from the nodejs.org binary
Not sure why the disassembly shows mangled function names. |
Thanks. So in both cases it crashes because the second function argument is a nullptr. What happens when you start node with Is checking with the latest v12.x or v13.x an option for you? v10.x is still at V8 6.8 and that's positively ancient by now. The bug may have been fixed in a newer release. v8/v8@b28637b seems like a good candidate. |
Started the unmodified application at 13:40:45 UTC using
I can check within the staging system with arbitrary node versions (I already downloaded some random precompiled binaries from the Internet upon request from addaleax). For the production system I would have to check with my manager; not using the Debian packages comes with quite a bit of operational complexity I'd like to avoid. |
It's 21:44:50 UTC now. The process has now been running for roughly 8 hours without the crash by specifying the While researching the command line flag I also stumbled upon |
As a quick sanity check, let's see if the regression test from v8/v8@b28637b triggers a crash. |
It does so I think we've found our culprit. I've opened #31507 to back-port the fix to v10.x. |
FWIW: The CI appears to be behind an OAuth login that I did not attempt to authorize. Not sure whether it was meant for me or just for tracking it yourself. In any case I trust your judgement here. Thanks for the quick turnaround. If you'd like me to test some special build to verify, I'm happy to do so if you give me instructions. |
For us both, really. :-) If you could build from source and test with the patch, that'd of course be great. Second-best is if you can try with the next v10.x release because I don't know how often Debian updates their node package. |
I compiled my last node.js from source back in the 0.10.x days. I'll have a look.
Not at all. Debian Buster has not received an updated nodejs package since 2019-04-17 (https://tracker.debian.org/pkg/nodejs), not even for backported security fixes, which is odd. I plan to file a bug with them to get the patch backported to Debian stable once it is verified as fixed, though. |
I downloaded the 10.15.2 source code and compiled it without modifications to verify that the bug is present in the self-compiled version. It crashed in ~16 minutes with a bit of traffic generation. I'm currently recompiling with the patch applied and will report back. |
@bnoordhuis Unfortunately I have to report that 10.15.2 from source with the patch applied crashed as well (after 43 minutes). Meanwhile I've adjusted production to run with I compiled with (g++ from Debian Buster):
FWIW: I ran the regression test from the pull request with Debian's node, self compiled node w/o patch and self compiled w/ patch. None of those crashed. |
Hm, that's sad news. If you're up for it, a debug build (
Yes, that's expected. Node does a lot of JS bootstrapping that d8, the V8 shell, doesn't - and perturbs the environment the test runs in. Many of V8's regression tests rely on a pristine environment.
After adding a 3G swap file to that VM with 1G of memory, it compiled without being killed due to an out-of-memory condition and spat out 4.7G of artifacts, filling up the disk to ~100%. That said: The application is now running on a debug binary … I hope. |
Okay, it took 34 minutes for the Debug binary to crash. Compiled as
Find attached the output of the Debug node binary running with |
Thanks, I can see the general shape of the bug now. The fourth stack frame is a If you're up for it, I can write a patch for you to try out. |
Sure, I'm happy to test the patches you throw at me. It's not too much effort for me to test them, most of the time is spent waiting to find whether it crashes or not. |
https://github.com/bnoordhuis/io.js/commit/473868ecc2.patch Passes |
I've applied that patch and recompiled node.js. The process was started with the new binary at 14:31:34 UTC. Let's wait and see. |
It's 23:00:00 UTC now. The process has been running fine for 8.5 hours. I guess it's safe to say that the patch indeed fixes my issue. Thanks. |
This commit back-ports the implementations of IsRename() and MayAlias() from the upstream 8.0 branch wholesale. Fixes several bugs where V8's load elimination pass considered values to be alive when they weren't. Fixes: nodejs#31484
Thanks for testing. I've opened #31613. |
This commit back-ports the implementations of IsRename() and MayAlias() from the upstream 8.0 branch wholesale. Fixes several bugs where V8's load elimination pass considered values to be alive when they weren't.
Fixes: #31484
PR-URL: #31613
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Beth Griggs <[email protected]>
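For readers unfamiliar with the terms in the commit message, here is a minimal toy sketch of what a rename-aware "may alias" check looks like in general. It is not V8's code: `ToyNode`, `ToyResolveRenames` and `ToyMayAlias` are hypothetical names invented for illustration, and answering "may alias" for dead or missing inputs is this sketch's own choice, not something taken from the patch. The only point it tries to make is that an alias check which follows rename-style wrappers has to guard against a wrapper whose input is gone, rather than dereferencing it.

```cpp
// Toy IR node, purely illustrative. Not V8's Node class.
struct ToyNode {
  enum Kind { kAllocate, kParameter, kRename };
  Kind kind;
  ToyNode* input;  // Only meaningful for kRename; may be null if the input went away.
  bool dead;       // True once the node has been killed by an earlier pass.
};

// Follow rename-style wrappers down to the underlying value.
// Returns nullptr when the chain ends in a dead or missing node.
const ToyNode* ToyResolveRenames(const ToyNode* node) {
  while (node != nullptr && !node->dead && node->kind == ToyNode::kRename) {
    node = node->input;
  }
  if (node == nullptr || node->dead) return nullptr;
  return node;
}

// Conservative alias query: answers true ("may alias") unless the two
// values can be shown to be distinct.
bool ToyMayAlias(const ToyNode* a, const ToyNode* b) {
  a = ToyResolveRenames(a);
  b = ToyResolveRenames(b);
  // Assumption of this sketch: if either side cannot be resolved, give the
  // conservative answer instead of touching a null pointer.
  if (a == nullptr || b == nullptr) return true;
  if (a == b) return true;
  // A fresh allocation cannot be the same object as another fresh
  // allocation, nor as a value that was passed in as a parameter.
  if (a->kind == ToyNode::kAllocate &&
      (b->kind == ToyNode::kAllocate || b->kind == ToyNode::kParameter)) {
    return false;
  }
  if (b->kind == ToyNode::kAllocate && a->kind == ToyNode::kParameter) {
    return false;
  }
  return true;
}
```

In this toy version, two distinct allocations do not alias, a rename of an allocation aliases that allocation, and a rename whose input is null is treated as "may alias" instead of crashing.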
Debian Buster on amd64.
I'm seeing a more or less regular crash of a non-trivial application within libnode.so.64. It's running on a low-traffic staging system. Find below an excerpt from the syslog of the affected machine. Times are in UTC.
After becoming aware of the issue I made sure that the process could dump core and I also installed the relevant debug symbols. My understanding is that node crashes within v8's JIT (?) compiler:
I have the core dump on record and can provide additional information from the core file on request.
The application is launched using systemd. Find below the unit file for your reference:
Unit File
Find below a list of open files from the process after it was restarted automatically by systemd:
lsof -p $pid