-
-
Notifications
You must be signed in to change notification settings - Fork 627
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Node process does not end and but CPU utilization at 100% #2090
Comments
Sorry, I just noticed #2080 and I believe they could be related, my reproduction seems to work correctly with 2.3.0 (which is the one that @rogerho1224 was using and worked for them). |
thanks @nachogiljaldo I can reproduce the issue |
No worries @sidorares if it helps, it might be intimately related to #2080 since I can confirm that using 2.3.0 fixed the issue for me. |
What is strange is that process hangs with 100% cpu just before exit. I suspect some combination of GC and large number of objects with large number of fields. Profiler does not pick that and shows a very small number of ticks in the main code and in the GC, the ticks during the "frozen" part are in "unaccounted" section The difference with 2.3.0 is highlighted in #2055, I'll try to see if changing that part only helps. |
@nachogiljaldo what is your os / architecture? Not sure if that matters, I'm seeing the issue on m1 mac |
I'm not sure if it's GC related, (not a node internals expert,but) I ran my reproduction with --trace-gc and the gc logs stop right with the "Done" message. |
Mine is Ubuntu x86_64. But a colleague of mine has the same issue with similar effects with M1. |
typically GC related performance problems are visible in the profiler log, maybe we see a bug in GC |
@nachogiljaldo I reduced script to a version with no external dependencies, could you try to run first file from https://gist.github.com/sidorares/128160e6b3dea1da3ad45cd672651d2d ? Still not sure what's going on, but switching to setting fields in the constructor hides the issue ( repro2.js ) cc @testn |
Actually it does finish, eventually. First script from gist finishes in about 5 minutes time |
I think I'll try to file this to the core node |
Actually, I think the better solution would be to cap the number of fields in dynamically compiled parser, if the number is over the limit use "interpreting" parser instead ( e.g in each row iterate over every column and read the code based on the column type ). We used to have "interpreting" parser a while ago - node-mysql2/lib/commands/query.js Lines 59 to 69 in 04f07f8
|
Not sure if this related. We thought that the problem came from our side since few issues were open on the repository and one of the purpose of this lib was a performance gain. But we finally reverted the change of the driver and the problem is gone. Indeed maybe it's when we manage many objects with many relations. We use RDS mysql with ECS on fargate, v16.20 node version, [email protected] |
@sebdec do you know roughly the number of columns and rows you are reading in an average query? |
@sebdec @nachogiljaldo @rogerho1224 could you try with |
Just tested with --noopt and it finished successfully almost immediately. |
Also @sidorares I tested the first file as you suggested (repro1) and it reproduces. |
we are currently using this module and our session inits into apps spent a lot of cpu, sometimes >1 core, but after auth is everything fine. We already done cpu profiling and seeing that most of ticks it's /usr/local/bin/node and mysql2 library How to deal with that broken but important module? Any hotfix? Any patches? Any workarounds? |
@stefkkkk some possible options: 1) What is your node version? The root cause is in node optimiser so hopefully its fixed eventually on a node side |
|
Mmm... in my environment (node20, linux) this was fixed on 3.9.4 (i.e. I was able to upgrade from the previous version I was running 2.7.0). I tried to upgrade to 3.9.7 today and the problem started to happen again, traced it to 3.9.5 so there's something in that PR that broke the issue that had been inadvertently fixed? EDIT: tried the noopt and max-opts alternatives but that's not viable as it makes our build absurdly slow (and some tests even time out). |
@nachogiljaldo I'm pretty sure inheriting result from null fixes/hides the problem, unfortunately this comes with it's own issues ( #2585 ) I'm still keen to review the way resulting object is constructed - see #2055 |
Also a possibility could be a configurable / pluggable result construction, e.i some users might prefer |
That seems potentially as a good idea, but not sure how feasible it would be in practice for cases when one doesn't connect directly to mysql but rather use an ORM. |
I have been investigating a problem that is causing trouble on our test suite and I managed to trace it to
mysql2
. This script allows reproducing it:Reproduction instructions
It happens with multiple node versions (v16.14.0, v18.16.0 are some of those I tested). For node 20.3.1 it happens, but you need to raise the 157 iterations.
In our tests we run thousands of DB queries which seems to be reaching this threshold and causing the process to "hung" (it eventually exits, but it takes a few minutes). As you can imagine this is a huge pain on CI because the "exit" takes as much as the test suite itself.
2 other interesting things to note:
Netstat shows the DB connection is closed while the process is waiting:
shows no output.
The text was updated successfully, but these errors were encountered: