Memory usage not accurate on Heroku #7
Comments
I attempted to include swap from smaps and it doesn't look like it does anything:
Here's the commit on my branch to add in swap: 17e6b9a |
As an experiment I tried putting in a task that looped through all
Here's the file: https://gist.github.com/schneems/10025798 |
I guess there is no way to get this information directly from heroku, other than using log drains - right? I took a look at their API and don't see any memory reporting. I wish heroku offered their "runtime-metrics" via their API for each dyno. |
Have you tried reaching out to heroku support to see if there is a command you could run to get the same value they are using? |
Have you looked at how oink is doing it? Do any of these approaches match heroku? https://github.com/noahd1/oink/blob/master/lib/oink/instrumentation/memory_snapshot.rb |
I work for Heroku 😀 So this is actually a Linux concern: http://fabiokung.com/2014/03/13/memory-inside-linux-containers/ Basically, Linux is not container-aware when it comes to containers like LXC (which Heroku uses). While you can get the memory from outside the container easily, getting it accurately from within is really, really hard (perhaps impossible?). As you mentioned, we do drain the information via log-runtime-metrics; unfortunately, that gets its info from outside of the container. Another suggestion someone gave me at RailsConf was to take a look at how Passenger measures memory, as they do something similar to puma_auto_tune (so I've been told). |
Ah - I see. Well what I think would be nice is if there was a way to use the heroku gem to query for the latest memory check. For example, if the values that log-runtime-metrics is displaying in the log could be stored somewhere and then retrieved via the API (i.e. the heroku gem) Make sense? Obviously this isn't a tiny change, but would really help with things like puma_auto_tune |
I briefly checked the Passenger code, and while they seem to have several memory measurements, this looks interesting: https://github.com/phusion/passenger/blob/master/ext/common/Utils/ProcessMetricsCollector.h#L495 They're using PSS + private dirty RSS + swap. I've been quite busy lately, but will try to get some time to test and see how that would work with puma_auto_tune/puma_worker_killer. |
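To make the combination described above concrete, here is a minimal Ruby sketch that sums the `Pss`, `Private_Dirty` and `Swap` fields of an smaps dump. The method names and the sample dump are hypothetical and only for illustration; this is not Passenger's code or this gem's implementation. The text is passed in as a string so the sketch also runs on systems without `/proc`.

```ruby
# Sum selected per-mapping fields of an smaps dump (values are in kB).
# `smaps_text` is assumed to be the content of /proc/<pid>/smaps.
def smaps_totals(smaps_text)
  totals = Hash.new(0)
  smaps_text.each_line do |line|
    # Lines of interest look like "Pss:                 12 kB"
    if (m = line.match(/\A(Rss|Pss|Private_Dirty|Swap):\s+(\d+)\s+kB/))
      totals[m[1]] += m[2].to_i
    end
  end
  totals
end

# Hypothetical helper: PSS + private dirty + swap, as described in the comment.
def pss_dirty_swap_kb(smaps_text)
  t = smaps_totals(smaps_text)
  t["Pss"] + t["Private_Dirty"] + t["Swap"]
end

# Made-up two-mapping sample, only to exercise the parser.
sample = <<~SMAPS
  00400000-004ef000 r-xp 00000000 08:01 131 /bin/bash
  Rss:                 800 kB
  Pss:                 400 kB
  Private_Dirty:         0 kB
  Swap:                  0 kB
  006ef000-006f8000 rw-p 000ef000 08:01 131 /bin/bash
  Rss:                  36 kB
  Pss:                  36 kB
  Private_Dirty:        36 kB
  Swap:                  4 kB
SMAPS

puts pss_dirty_swap_kb(sample)
```

Note that private pages are already counted in full inside `Pss`, so adding `Private_Dirty` on top counts those pages twice. On a real process the input would come from something like `File.read("/proc/#{Process.pid}/smaps")`.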
Thanks for the tip. I'm running on a mac, if you want an example snap check http://stackoverflow.com/questions/17594183/what-does-private-dirty-memory-mean-in-smaps On Wed, Apr 30, 2014 at 11:58 AM, Karl Söderström
|
I tested with PSS + dirty + swap, but the results are far from accurate.
I've also tested a bit with different calculations for master and child processes. I believe that RSS + dirty + swap for the puma master process, and dirty + swap for the workers should give a pretty accurate result, but so far no luck... |
I played with this for a bit. It looks like they're using When i try the same thing https://gist.github.com/schneems/4ac2983f0abf3e200ec9
Neither |
But only using |
I did some more testing, and found something interesting while checking the memory of an instance only running bash:
RSS should be My guess would be that this instance would use about 2mb of memory. Sounds reasonable since the RSS for bash is 2 mb, right?
Turns out that If memory can be shared between containers and/or host, it will be even harder to figure out the memory use for a specific process... |
@schneems Would I hit this nasty bug on something other than Heroku? Let's say a basic DigitalOcean VPS running Ubuntu 12.04/14.04? |
Basically measuring effective memory on any linux system seems to be impossible :( If you know you have the whole machine you could watch swap usage and minimize that, but it's not ideal. |
Any update on this lately? I switched back to puma_worker_killer from puma_auto_tune (not that that is all that relevant) and am seeing PWK report memory usage of 819mb when heroku reports 307mb. |
I changed the calculation back to use RSS which is what Unicorn Worker Killer uses. RSS is a better stop gap measure as it reports that more memory is used as opposed to less. It's still not perfect. We're working on rolling out some tools internally to better expose memory. It's a surprisingly hard problem. It will likely take awhile. On Wed, Jun 25, 2014 at 4:40 PM, Brian McManus [email protected]
|
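For reference, one way to read RSS for a process on Linux is the `VmRSS` line of `/proc/<pid>/status`. The sketch below is an assumption about how such a reading could look, not necessarily how this gem or Unicorn Worker Killer does it; the method names are hypothetical.

```ruby
# Extract the resident set size (in kB) from the text of /proc/<pid>/status.
# Returns nil when no VmRSS line is present.
def rss_kb_from_status(status_text)
  m = status_text.match(/^VmRSS:\s+(\d+)\s+kB/)
  m && m[1].to_i
end

# Hypothetical convenience wrapper; returns nil on systems without /proc.
def current_rss_kb(pid = Process.pid)
  path = "/proc/#{pid}/status"
  return nil unless File.readable?(path)
  rss_kb_from_status(File.read(path))
end

# Made-up status excerpt, only to exercise the parser.
status_sample = "Name:\tbash\nVmPeak:\t   21000 kB\nVmRSS:\t    2048 kB\n"
puts rss_kb_from_status(status_sample)
```

As the comment above says, RSS tends to overstate usage because shared pages are counted in full for every process that maps them, which is the safer direction for a worker killer.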
Somewhat strange that it's so tough when Heroku reports on this in the logs (via log_runtime_metrics) and obviously knows when to issue R12 warnings as well. As a potentially terrible solution could you just periodically parse it out of the logs haha? Another idea would be to have whatever is monitoring memory write the current memory usage to an ENV var whenever it outputs to the log? |
The way memory is exposed outside the container is very different than how it's exposed inside. More info http://fabiokung.com/2014/03/13/memory-inside-linux-containers/ |
@schneems Would this gem still be useful for relative memory usage? Had memory jump by 369MB on one dyno today. I'm finding it surprisingly difficult to determine which action was the culprit. A lot can happen in the 20-seconds worth of requests between log-runtime-metrics output. Thinking of adding memory_total to our lograge logs. |
Could you folks clarify whether this issue is irrelevant if hosting without containers (simple VPS like DigitalOcean/Linode)? |
Would also be interested in knowing if this affects PX dynos on heroku since they are not multitenant?~ Brian On Thu, Jul 17, 2014 at 6:26 PM, Maxim Chernyak aka hakunin
|
@maxim YMMV. RSS or PSS are both approximate measures. Honestly it doesn't matter where you run they're only going to be indicators of if you're using more or less memory, but not ever a byte-by-byte measure of exact memory. @bdmac a PX dyno is a container that takes up the ENTIRE instance, so it will be better...but not perfect. This buys you the ability to use existing OS tools like The issue comes really from the nebulous nature of memory rather than if it's run in a container or not. What number does DigitalOcean/Linode use to start throttling your app, i.e. how do you know when your app has started swapping to disk because no more physical memory is available? Hard as it is to believe, memory measurements when it comes to an OS is almost always an approximation at best. Trying to match our approximation to the host's approximation is the hard part. I've never run this anywhere in production other than Heroku, would be interested in experiences. |
@igorbernstein btw was looking into oink, and found this, so probably not going to work well for puma. @schneems I believe newrelic also uses rss for memory measurements, and their graphs always correlate perfectly with me getting "cannot allocate memory" errors, so perhaps my mileage will be pretty spot on? It's not like the function of "restart puma riiight about nnnow" can ever be precise, false positives are not really an issue as long as all the true positives are prevented. So considering the above, is this usable in production? |
@schneems Heh, I forgot that this isn't the puma_auto_tune repo, but hopefully it was clear that that's what I was referring to. |
Has this been fixed yet? |
It can't really be "fixed" from inside of a container until containers expose that information. |
And about XEN? Any issues? |
@schneems The first line on your readme for puma auto-tune, says don't use until this is fixed. Should we just ignore this and use it? |
Use at your own risk. I recommend puma worker killer rolling restarts instead. |
Just as an update, I was playing around with this some and it appears as though memory usage is accurate until you start swapping. After that, it fails to account for the swap and constantly shows up as being below the max memory limit because everything over that is getting swapped out. As such, if you have a very regular memory leak that happens over the course of several minutes, it should be possible to catch it and restart with a fairly decent level of accuracy so long as you leave enough headroom in the percentage (e.g., 0.9 rather than 0.98) and a short enough frequency. It's also worth noting that up until Oct 28, 2015, the file system cache was being included in the memory total reported by the platform (see here) which could have thrown off the calculation before as well. |
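The headroom idea above can be sketched as a simple threshold check. This is an illustration only: `over_threshold?` and its parameters are hypothetical names, not part of puma_worker_killer or puma_auto_tune.

```ruby
# Returns true when measured usage has reached the given fraction of the limit.
# A lower `percent` leaves more headroom before the process starts swapping,
# which matters because usage past the limit is no longer visible once swapped.
def over_threshold?(used_mb, limit_mb, percent: 0.9)
  used_mb >= limit_mb * percent
end

limit_mb = 512
puts over_threshold?(470, limit_mb)                 # compared against 460.8
puts over_threshold?(470, limit_mb, percent: 0.98)  # compared against 501.76
```

With a 512 MB limit, a reading of 470 MB trips the check at 0.9 but not at 0.98, so the tighter setting would let the process grow closer to the point where swapping hides further growth. The check only helps if it runs often enough to catch the leak before that point.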
What about an approach which has an app process its own logs using a middleware? simplest approach: set up a log drain to send the app's logs back to itself. the middleware picks it up, looks at the memory info, discards everything. drawbacks:
more complicated approach to deal with problem 2: make standalone app which accepts log drain and nothing else. it filters out the memory info and sends it back to the production app. now the production app just needs to deal with a couple POSTs/minute. even more complicated approach to deal with problems 2 and 3: same standalone app above, but instead of sending logs back to production app, the production app can query the standalone app on a schedule. this might offer other interesting opportunities for coordination. although it's getting into territory that is probably anathema to 12-factor. |
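Whichever variant receives the drain, the filtering step amounts to pulling the memory samples out of each log line. A minimal sketch, assuming the runtime metrics lines carry `sample#memory_*=<number>MB` pairs; the method name and the sample line are made up for illustration, with only `memory_total=516.06` taken from this issue.

```ruby
# Extract memory samples (in MB) from one runtime-metrics log line.
# Returns an empty hash for lines that carry no memory samples.
def parse_memory_samples(line)
  line.scan(/sample#(memory_\w+)=([\d.]+)MB/)
      .map { |name, value| [name, value.to_f] }
      .to_h
end

# Illustrative line; the rss and swap values are invented.
log_line = "source=web.1 dyno=heroku.123 sample#memory_total=516.06MB " \
           "sample#memory_rss=488.53MB sample#memory_swap=27.53MB"

samples = parse_memory_samples(log_line)
puts samples["memory_total"]
```

Everything that does not match is discarded, which keeps the work per drained line small.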
Check out https://github.com/noahd1/oink it sounds similar to that proposal |
@schneems any reason why PSS was totally deleted? Before, the gem had an option to choose which type to use. There are a lot of libraries which use PSS as well: https://github.com/propella/multimem (Ruby, it does the same thing but slower than this one), https://www.selenic.com/smem/ (Python), https://github.com/ColinIanKing/smemstat (C). There is no problem with PSS. It is not inaccurate. It's just PSS. For some processes it's better to use RSS, but for others, PSS. I really don't understand why PSS was deleted. |
@vfreely this would be a better question for a new issue, it’s not related to the core issue here other than I was using pss to measure. The short answer is that I don’t remember. I think maybe it was because it wasn’t supported in all OSes but I’m not 100% positive. |
@schneems got it, thanks for answer |
@schneems I was wondering whether the recent addition (Docker 1.13, January 2017) of the statistics |
Heroku runs on LXC containers do you know if that cgroup interface was implemented there as well? |
@schneems No, sorry, I don't know, as we don't use it. However, it would be useful as it seems to be becoming a de facto standard. In general, this problem also applies to Docker and all other cgroup-based containers, doesn't it? |
The somewhat canonical issue I've been watching is moby/moby#8427 (comment). If that is indeed a de facto standard then I would post there to see what other people think of its use. |
Kinda, but get_process_mem's job is to find the memory of a specific process while that file tells us the memory of the whole system. We could maybe add helper methods like |
@schneems How does Heroku determine the space used by a container? |
It's reported externally to the host OS. We don't document the exact mechanism anywhere. I want to close this issue in the interest of cleaning up issues. It's a much bigger problem than can be solved in just this thread. |
From Heroku:
Note that puma.resource_ram_mb=488.5341796875 while memory_total=516.06. I can't find docs on the behavior of Pss and whether swap memory gets reported.
For PumaAutoTune, I'm thinking we'll need to add a fudge factor similar to that in PumaWorkerKiller, though I would like to get this library as accurate as possible.
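As a rough illustration of what such a fudge factor would be for the two numbers quoted above, the arithmetic can be sketched as follows. `fudge_factor` is a hypothetical helper, and a single sample is not enough to choose a value for production.

```ruby
# Ratio between what the platform reports and what the library measures.
def fudge_factor(platform_mb, measured_mb)
  platform_mb / measured_mb
end

measured_mb = 488.5341796875 # puma.resource_ram_mb from the issue
platform_mb = 516.06         # memory_total from the issue

factor = fudge_factor(platform_mb, measured_mb)
puts factor.round(4)
```

For this one sample the platform figure is roughly 5.6% higher than the library's measurement.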
This may help a bit: https://www.kernel.org/doc/Documentation/filesystems/proc.txt
Related: schneems/puma_auto_tune#9