Node 4.2 large memory spikes and timeouts on Heroku #3370
Comments
@rvagg @trevnorris @Fishrock123 ... ideas?
What happens when you start with --max_old_space_size set lower (for example)?
Okay, just tried running with that flag.
Yes.
@bnoordhuis ... looking at this, we definitely want to make sure this is documented better.
Yes, very eye-opening for me at least. Fingers crossed, but I think you guys just saved us from having to revert everything again. Really appreciate the quick response. Going to leave this open through the end of the day while we do a few more tests on our end.
@jasnell I don't disagree, but where would you put it? Maybe it's time we added a FAQ.
FWIW, I just added a FAQ to the wiki here, and it is linked from the main wiki page. Perhaps it could also be linked somewhere on nodejs.org and in other places?
/cc @hunterloftis
Perhaps dig through a few old issues that shared the same characteristics, so Google/SEO/etc. can drive people in the right direction?
@friism I recommend it. This is probably documentation we should provide as well, in a 'Memory-management with Node on Heroku' article.
@hunterloftis From what we've seen, the second we hit a memory limit, everything pretty much starts timing out. I'd actually rather have a hard kill at that point so we can get a fresh node up that can respond to requests again. You mention that if we specify that flag...
@AndrewBarba Yeah, it looks like your app is particularly hard-hit by lazy collection, and in that case the lesser evil is shutting down a process. I'd be interested in hearing from @bnoordhuis on this, but from what I've seen, if your app requires more space than the limit you set, it's going to fall over one way or another. Also keep in mind that 'old space' is just one part of the application's memory footprint, so you can expect the whole app to take more than whatever limit you give V8.
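As an illustration of the "hard kill" idea (nothing node or Heroku does automatically), here is a minimal watchdog sketch: poll the resident set size and exit once it crosses a threshold, so the platform can start a fresh process. The 460 MB threshold is an assumed value for a 512 MB dyno.

```js
// Sketch only: exit when RSS crosses a threshold so the dyno restarts with a
// fresh process. The 460 MB limit is an assumption for a 512 MB dyno.
var LIMIT_BYTES = 460 * 1024 * 1024;

setInterval(function () {
  if (process.memoryUsage().rss > LIMIT_BYTES) {
    console.error('RSS above limit; exiting so a fresh process can be started');
    process.exit(1);
  }
}, 10000).unref(); // unref so the timer alone does not keep the process alive
```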
Might try playing with the V8 heap flags. For a kitchen-sink example with arbitrary values, something along the lines of the sketch below could be a starting point.
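The values here are placeholders and `server.js` stands in for the app's entry point; the flags existed around the Node 4.x era, but `node --v8-options` lists what a given version actually supports.

```sh
# Illustrative values only; tune each flag for your own app and dyno size.
node --optimize_for_size \
     --max_old_space_size=460 \
     --max_semi_space_size=2 \
     --max_executable_size=192 \
     --gc_interval=100 \
     server.js
```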
@hunterloftis Yes.
Just weighing in to share my experience: I had this happen to me with 0.12.x and 4.0.x. Before I migrated everything to 4.0, Heroku Support recommended --max_old_space_size, and it worked like a charm. I still get some instances of the error every now and then, but the affected dyno shuts down and no great harm is done, just a couple dozen timed-out requests once a week or so (out of thousands per minute).
@fansworld-claudio That's great to hear. What size dyno are you running, and what did you end up setting max_old_space_size to?
@AndrewBarba At first I had the Standard-2X (1GB) dynos and max_old at 960, then after a couple of weeks of stability I scaled down to Standard-1X (512MB) and max_old at 480; it has been stable for a couple of months. YMMV though, as this is a REST API that does mostly I/O with Redis and Mongo.
It's kind of a pain to set this for all your Heroku apps (and to remember to vary it based on dyno size), so I made heroku-node, which wraps node and sets the flag for you.
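For illustration only, a hypothetical Procfile entry treating such a wrapper as a drop-in replacement for node; the package's actual CLI may differ, so check its README.

```
# Hypothetical usage; server.js is a placeholder for the real entry point.
web: heroku-node server.js
```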
@ApeChimp I just gave that a go, and the console immediately showed node starting with an old-space size of 51.2 MB. Looks like you are dividing the available memory down too far; 51.2 MB of RAM is not enough to even start up. I think you can just simplify the calculation.
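Something in this spirit, for example; WEB_MEMORY (in MB), the 0.8 factor, and the 512 fallback are assumptions for illustration, not values from this thread.

```js
// Derive an old-space cap from the memory reported for the dyno, leaving
// headroom for the parts of the process that live outside V8's old space.
// WEB_MEMORY and the 0.8 factor are assumptions here.
var availableMb = parseInt(process.env.WEB_MEMORY, 10) || 512;
var maxOldSpaceMb = Math.floor(availableMb * 0.8);
console.log('--max_old_space_size=' + maxOldSpaceMb);
```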
@AndrewBarba, mea culpa. That's fixed as of the latest heroku-node release.
Couldn't the garbage collector also be executed when getting an out-of-memory exception? This would help in a more generic way on systems with really low memory constraints; for example, when executing npm on NodeOS it starts killing processes on QEMU with the default memory settings (128 MB), and the same would happen on a Raspberry Pi without swap...
@piranna What exact exception do you mean? If you have allocation failures in mind, catching all of those in some generic way is a terrible idea; many things could go wrong from that. If you are speaking about the system OOM killer, intercepting it requires specific settings at the system level.
The error I get is an out-of-memory error, raised at the process level (the attached screenshot shows the kernel's OOM killer output).
@piranna That screenshot you have posted shows the OOM killer being triggered. It's not interceptable by default. Also, it's not necessarily caused by the process that ends up being killed.
What a shame, I thought it would be a good feature :-(
Yeah, I know it kills somewhat random processes. In fact, in that screenshot the error was caused by slap, but nsh stopped working too :-/
For what it's worth, here's the solution we're trying now (I sure wish I had found this conversation before!!). I'm getting the values for those flags from the V8 defaults, and it seems to be working great so far (Heroku / 512 MB).
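For a rough idea of the shape of such an invocation on a 512 MB dyno; the specific numbers are illustrative guesses rather than the poster's exact flags, and `server.js` stands in for the real entry point.

```sh
# Illustrative values for a 512 MB dyno; confirm flags with `node --v8-options`.
node --optimize_for_size --max_old_space_size=400 --max_semi_space_size=2 --gc_interval=100 server.js
```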
Would it be possible to lower it when the sum of physical memory + swap is smaller than this? It doesn't make sense to have such a big limit if it's impossible to reach... This would help on memory-constrained systems.
@piranna see #3370 (comment) and #3370 (comment) |
Or even a single flag could be a nice option.
I also added custom settings for modulus.io's 396 MB servos, along the same lines.
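Roughly like this, with the values scaled down to fit about 396 MB; again these are illustrative numbers, not the poster's actual settings, and `server.js` is a placeholder.

```sh
# Illustrative values for a ~396 MB server.
node --optimize_for_size --max_old_space_size=300 --gc_interval=100 server.js
```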
Is not ...?
So because setting ...
Follow-up: this is pure win. I'm running several 2X dynos in production for a very leaky app with these flags. You do not want to just set ...
Thank you for the detailed analysis. I just spent all day debugging a critical infrastructure performance issue, and finally found that one of the services had been moved to v4.2.8. After rolling back to v0.12, everything returned to normal.
@TylerBrock, you mentioned that ...
I've been experimenting and noticed a similar effect. If anyone's interested, I've written more about the experiment here. If you have any ideas on how I can improve the experiment and my understanding, I'd really appreciate it.
I am testing this now.
Hello, the info in this issue was very useful. I made a module based on @joanniclaborde's suggestion. The good thing is that the node process never reaches its maximum available memory, so it never crashes on Heroku. We have been using it in a real project with thousands of requests per minute and have had no crashes whatsoever. I encourage you to try it out and report any problems you find.
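As a sketch of the general idea only (not the module's actual code or API): a tiny wrapper that re-launches the app under node with GC flags sized from an environment variable. WEB_MEMORY, the 0.8 factor, and `server.js` are assumptions here.

```js
// Sketch: re-spawn the app with GC flags derived from the dyno's memory.
// WEB_MEMORY (in MB), the 0.8 factor, and 'server.js' are assumptions.
var spawn = require('child_process').spawn;

var availableMb = parseInt(process.env.WEB_MEMORY, 10) || 512;
var flags = [
  '--optimize_for_size',
  '--max_old_space_size=' + Math.floor(availableMb * 0.8),
  '--gc_interval=100'
];

// Start the real app with those flags and mirror its exit code.
var child = spawn(process.execPath, flags.concat(['server.js']), { stdio: 'inherit' });
child.on('exit', function (code) {
  process.exit(code === null ? 1 : code);
});
```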
Nice!! We've been using those same settings for a few months now, and it's running fine. Good idea to turn that code into a module, @damianmr!
🔥 I had a terrible memory leak somewhere: it turned out to be another node module that my system depends on. I took it out and my memory usage went from an average of 8 GB down to 4 GB. That still meant I had to pay for the $500-a-month plan to keep my app from exceeding the memory limit of the lower plans. I looked everywhere and hadn't been able to find why my app was still consuming so much memory. 💯 🙌 Then I came across this today, and after using @damianmr's module, strangely enough, my app's memory usage is showing as roughly constant now rather than the exponential rise from before.
Below is a screenshot of the past 12 hours, which have been a total disaster for us, to say the least. After updating to Node 4.2 (from 0.10) in production, we immediately exceeded all memory quotas and experienced a high volume of timeouts (even with no load and memory under the 1 GB limit).
First, I apologize if this is not the place for this. I am happy to move the discussion somewhere else, and we will help diagnose whatever you guys need. We went through this same parade with Node 0.12 and had to downgrade to 0.10.
Second, and I guess this is the real question here: is Heroku's 512 MB of RAM simply not enough to run Node 4.x? If that is the case, cool, but the memory constraints definitely need to be made clearer.
Timeline: in general you can see that memory usage is all over the place; maybe that is expected with newer versions of V8...
Although I don't have a screenshot, you can see in the first part of the graph, running Node 0.10, that memory stays almost perfectly flat at 256 MB of RAM. Under any load, that was consistent.
For reference, here is a load test we did in a dev environment running Node 4.2.1, with cluster forked to 4 processes and about 5k requests per minute. It also immediately hit the higher 1 GB memory limit. We then dropped this down to 2 forked processes with the same result.
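For context, the kind of cluster setup being described might look roughly like the sketch below; the worker count, the 230 MB per-worker old-space cap, and `./server` are assumptions rather than the original configuration.

```js
// Sketch of a cluster setup with a per-worker old-space cap.
// Two workers and the 230 MB cap are assumptions for a 1 GB dyno.
var cluster = require('cluster');

if (cluster.isMaster) {
  cluster.setupMaster({ execArgv: ['--max_old_space_size=230'] });
  for (var i = 0; i < 2; i++) cluster.fork();
  cluster.on('exit', function () { cluster.fork(); }); // replace dead workers
} else {
  require('./server'); // hypothetical app entry point
}
```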