Significantly incorrect memory usage reported #6

Closed
travisp opened this issue Aug 15, 2014 · 24 comments

Comments

@travisp

travisp commented Aug 15, 2014

I was having issues with frequent cycling on a Heroku project and noticed numbers like this in the logs (shortly after I restarted all of the web dynos):

2014-08-15T14:21:17.222622+00:00 app[web.3]: [2] PumaWorkerKiller: Consuming 429.48046875 mb with master and 2 workers
2014-08-15T14:21:18.443592+00:00 heroku[web.3]: source=web.3 dyno=heroku.15698018.ca54adb5-3f63-4006-ae9f-c0f235c53288 sample#load_avg_1m=0.00 sample#load_avg_5m=0.00
2014-08-15T14:21:18.443937+00:00 heroku[web.3]: source=web.3 dyno=heroku.15698018.ca54adb5-3f63-4006-ae9f-c0f235c53288 sample#memory_total=313.75MB sample#memory_rss=313.75MB sample#memory_cache=0.00MB sa

Basically, it seems to be vastly overestimating the amount of memory actually used. This is with the latest code from master, including get_process_mem 0.2.0.
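
A likely source of the over-reporting: get_process_mem measures per-process RSS, and in a preforking server the master and workers still share copy-on-write pages, so summing each process's RSS counts those shared pages more than once. A minimal sketch of that accounting, assuming get_process_mem 0.2.0 (the worker pids below are illustrative):

require 'get_process_mem'

# Per-process RSS of the current (master) process.
master_mb   = GetProcessMem.new(Process.pid).mb

# Illustrative worker pids; in Puma these would come from the cluster.
worker_pids = [1234, 1235]
worker_mb   = worker_pids.sum { |pid| GetProcessMem.new(pid).mb }

# Each worker's RSS still includes pages shared with the master (copy-on-write),
# so this sum is an upper bound, not the container's actual footprint.
puts "naive total: #{master_mb + worker_mb} mb"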

@schneems
Member

This is why zombocom/get_process_mem#7 exists.

@samnang

samnang commented Feb 2, 2015

@schneems Since Puma is now the recommended web server on Heroku, is this gem reliable enough to use now, or is it better to wait until this issue is solved?

@schneems
Member

schneems commented Feb 2, 2015

You can use this now, but it's at your own risk. Unicorn Worker Killer has the exact same bug and people have been using it on Heroku for years. I'll remove the warning from the readme when this gets resolved in a sane way. Feel free to experiment until then.

@samnang

samnang commented Feb 3, 2015

Thanks @schneems, I will try it out. I'm a bit confused about puma_worker_killer vs https://github.com/schneems/puma_auto_tune. We shouldn't use both at the same time, right? Which one do you recommend?

@schneems
Member

schneems commented Feb 3, 2015

Puma Auto Tune does everything that PWK does, plus some. It will also, guaranteed, cause your app to swap memory if you use it on Heroku. Don't use Puma Auto Tune on Heroku right now. PWK is fine, but understand that it's not perfect, i.e. if you set your app to 512mb of RAM, it will start killing workers at about ~350mb of RAM. PWK outputs how much RAM it thinks your system is using to the logs in a Librato-compatible format; you can manually compare that against actual RAM usage from Heroku's log-runtime-metrics and adjust.

@samnang

samnang commented Feb 3, 2015

if you set your app to 512mb of RAM, it will start killing workers at about ~350mb of RAM.

That's what I'm seeing here, and it keeps killing workers and restarting them. Does it make sense to set percent_usage greater than 100% here because of PWK's incorrect memory reporting?

@schneems
Member

schneems commented Feb 3, 2015

That's one angle. Maybe shoot for 120%. Don't try to get too close: if you go over, then PWK won't kill workers when you need it to. Also realize that PWK is a band-aid for larger memory problems; it doesn't solve them, just covers them up.

@samnang

samnang commented Feb 5, 2015

Pretty unstable. Right now Heroku reports memory exceeding 1 GB, but PWK reports only about ~600MB, and it didn't kill the workers either.

PumaWorkerKiller.config do |config|
  config.ram           = 512  # mb
  config.frequency     = 5    # seconds
  config.percent_usage = 1.20
end

PumaWorkerKiller.start
» 10:18:59.685  2015-02-05 03:18:59.625773+00:00 heroku web.1  - - Process running mem=1767M(345.2%)
» 10:18:59.761  2015-02-05 03:18:59.625820+00:00 heroku web.1  - - Error R14 (Memory quota exceeded) Critical
» 10:18:59.869  2015-02-05 03:18:59.625257+00:00 heroku web.1  - - source=web.1 dyno=heroku.21274089.e2b6196c-6736-47c5-bcfd-8cd6393289ae sample#load_avg_1m=0.00 sample#load_avg_5m=0.02 sample#load_avg_15m=0.04
» 10:18:59.945  2015-02-05 03:18:59.625357+00:00 heroku web.1  - - source=web.1 dyno=heroku.21274089.e2b6196c-6736-47c5-bcfd-8cd6393289ae sample#memory_total=1767.64MB sample#memory_rss=501.53MB sample#memory_cache=0.00MB sample#memory_swap=1266.11MB sample#memory_pgpgin=1217595pages sample#memory_pgpgout=1089204pages
» 10:19:02.802  2015-02-05 03:19:02.516087+00:00 app web.1     - - [3] PumaWorkerKiller: Consuming 594.34765625 mb with master and 2 workers

@schneems
Member

schneems commented Feb 5, 2015

Make sure you're using version 0.0.3 or master.

Consuming 594.34765625 mb with master and 2 workers

This should have triggered a kill cycle.

if (total = get_total_memory) > @max_ram
  @cluster.master.log "PumaWorkerKiller: Out of memory. #{@cluster.workers.count} workers consuming total: #{total} mb out of max: #{@max_ram} mb. Sending TERM to #{@cluster.largest_worker.inspect} consuming #{@cluster.largest_worker_memory} mb."
  @cluster.term_largest_worker
else
  @cluster.master.log "PumaWorkerKiller: Consuming #{total} mb with master and #{@cluster.workers.count} workers"
end

where @max_ram = ram * percent_usage, which should be 614 mb here.
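
For concreteness, the arithmetic behind that threshold, using the config posted above:

ram           = 512    # mb, from the config above
percent_usage = 1.20
max_ram       = ram * percent_usage   # => 614.4 mb

# The reported usage was below the threshold, so the kill branch never fired:
puts 594.34765625 > max_ram           # => false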

@samnang

samnang commented Feb 5, 2015

Yep, I was using 0.0.3. You're right that 120% is about 614 mb, but Heroku's memory is already at 1767M while PWK still reports 594.34765625 mb; that's why it hasn't killed the workers yet.

@schneems
Member

schneems commented Feb 5, 2015

Check your version of get_process_mem; it should be 0.2.0.

@samnang

samnang commented Feb 5, 2015

Yes, it is.

$ bundle show puma
.gems/gems/puma-2.11.0

$ bundle show puma_worker_killer
.gems/gems/puma_worker_killer-0.0.3

$ bundle show get_process_mem
.gems/gems/get_process_mem-0.2.0

@schneems
Member

schneems commented Feb 5, 2015

Weird. This is how we get the memory usage:

def get_total(workers = set_workers)
  master_memory = GetProcessMem.new(Process.pid).mb
  worker_memory = workers.map {|_, mem| mem }.inject(&:+) || 0
  worker_memory + master_memory
end

My best bet is that you have something else running, maybe a separate binary or program, that is using up memory in a different process, so PWK can't see it. If you're shelling out a lot using backticks or Process.spawn, PWK won't see that either. Again, this is just yet another reason why it's "use at your own risk"-ware for now. Thanks for giving it a shot. Unfortunately, the introspection tools on containers are just so limited.
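
A minimal sketch of that blind spot, assuming get_process_mem 0.2.0; the child command below is just a hypothetical memory hog started the way a backtick or Process.spawn call would start it:

require 'get_process_mem'

# Hypothetical child process, started the same way a shell-out would be.
child_pid = Process.spawn("ruby", "-e", "x = 'a' * 50_000_000; sleep 2")
sleep 1  # give the child a moment to allocate

parent_mb = GetProcessMem.new(Process.pid).mb  # what a master+workers sum sees
child_mb  = GetProcessMem.new(child_pid).mb    # memory that sum never includes

puts "parent: #{parent_mb} mb"
puts "child:  #{child_mb} mb (invisible to a master+workers sum)"

Process.wait(child_pid)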

@samnang

samnang commented Feb 5, 2015

In my code, I don't start any subprocesses, but I'm not sure about the third-party libraries I'm using; some of them are pubnub, newrelic, and sidekiq. As far as I can tell from the response times, Puma is faster, though I haven't done any benchmarks myself yet.

Another thing: when I was using Unicorn, memory didn't keep growing that much. I'm not sure if that's because of puma/puma#342.

I think Heroku recommending Puma is the right call and definitely the direction to go. Thanks, and I hope you guys find a fix for this memory issue soon 😄

@chetan-wwindia

Whatever value I set for config.ram, the gem treats it as 512 and restricts usage to about 335 mb of RAM. I checked the value in the rails console:
PumaWorkerKiller.ram
=> 4096
Still, the cut-off works from 512.
I have done the default configuration, which works; it's just not picking up the new config from config/puma.rb or config/initializers/puma_worker_killer.rb.

@schneems
Member

schneems commented Sep 8, 2016

@chetan-wwindia are you on Heroku? Make sure that ram is set before the worker killer is "started". If this reproduces locally can you give me an example app that shows the problem?
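
One way to guarantee that ordering is to do the whole thing in a single before_fork block in config/puma.rb, along these lines (a sketch, with example values):

# config/puma.rb
before_fork do
  require 'puma_worker_killer'

  PumaWorkerKiller.config do |config|
    config.ram           = 4096  # mb
    config.frequency     = 10    # seconds
    config.percent_usage = 0.80
  end

  # start is only called after ram/frequency/percent_usage are set
  PumaWorkerKiller.start
end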

@chetan-wwindia

@schneems I'm using this on an AWS Ubuntu server.

@schneems
Member

Can you give me the code you're using to set the values? Does it work locally?

@chetan-wwindia

On the server it's nginx + Puma with 4 Puma workers. It keeps piling up RAM, up to 6 GB. I'm also dealing with a memory leak; this is my temporary solution until I find a fix for the leak.

It doesn't work locally.

PumaWorkerKiller.config do |config|
  config.ram = 4096 # mb
  config.frequency = 10 # seconds
  config.percent_usage = 0.80
  config.rolling_restart_frequency = 3 * 3600 # 3 hours in seconds
end
PumaWorkerKiller.start

@kevinelliott

Any update to report here? I'm curious whether PWK can report correct memory consumption on Heroku dynos yet, or if there has been some mild success using the memory definitions.

@schneems
Member

schneems commented Apr 5, 2017

Check the readme. Will not work on Heroku until LXC exposes memory use inside of the container. So likely never. Use rolling restarts or performance dynos.
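
For reference, the rolling-restart path doesn't depend on memory readings at all, so it sidesteps the container problem; a sketch (the 12-hour frequency is just an example value):

# config/puma.rb
before_fork do
  require 'puma_worker_killer'

  # Restart workers on a fixed schedule instead of reacting to measured memory.
  PumaWorkerKiller.enable_rolling_restart(12 * 3600)  # every 12 hours
end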

@kevinelliott

Thanks, yeah I went with rolling restarts. So then it sounds like this issue should be re-closed.

@jrimmer-healthiq

This isn't a problem on Performance Dynos, then? How about Shield Dynos?

@schneems
Member

Perf, private, and shield dynos all run on their own VPC, so the numbers should be correct.

@schneems schneems closed this as completed Nov 5, 2019