Production thumbnails memory leak #3353
Comments
Update on thumbnails memory usage: it looks like there is a leak, but a very slow one. Memory jumps in cliffs a little less than once a day (roughly every 1.2 days), but is more or less stable the rest of the time. If it spikes, it comes back down to the percentage it last jumped to. The last three big jumps in maximum usage are all almost exactly 2% (~20 MB). Outside those, there are also some smaller 1% jumps, but not more than one a day in the four days since things really stabilised after the last deployment. At this rate, if we deploy the thumbnails once a week, it would be safe to lower it to half its current memory allocation.

Another interesting observation: in the last 18 hours the maximum memory of the thumbnails has actually been dipping and then climbing back up within 0.5% of the highest it's reached. That contrasts with the stable periods before, where the maximum was pretty much pinned at one value with a few outlier spikes up and then back down to the stable usage. Maybe that's a sign the service has reached its actual stable point? Or maybe there's another explanation. It is due for another 2% jump within the next 8 hours or so; maybe it'll happen before I log off for the day after the team meeting in my evening. Curious to see if the pattern continues.

For now, though, nothing bad is happening, so it's fine to let it keep rolling to see how it goes until we have changes to deploy again. The regular API, which handles all other requests, looks stable as well, but with some interesting spikes. I think that is fine though; it is also not causing any issues.
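A back-of-the-envelope projection of that growth rate, assuming each weekly deployment resets memory and that the task allocation is roughly 1 GB (implied by 2% ≈ 20 MB); the variable names and the 29% post-deploy baseline from the description below are just for illustration:

```python
# Rough projection of the observed growth: ~2% (~20 MB) jumps roughly every
# 1.2 days on an assumed ~1 GB allocation, reset by a weekly deployment.
allocation_mb = 1000          # implied by 2% ~= 20 MB; adjust to the real task size
baseline_pct = 29             # maximum usage shortly after a deployment
jump_pct = 2                  # size of each big jump
jump_interval_days = 1.2      # rough cadence of the big jumps
deploy_interval_days = 7      # assumed weekly deployment cadence

jumps_per_cycle = deploy_interval_days / jump_interval_days   # ~5.8 jumps per week
peak_pct = baseline_pct + jumps_per_cycle * jump_pct          # ~41% before the next deploy
print(f"Projected peak before the next deploy: ~{peak_pct:.0f}% of {allocation_mb} MB")

# The same memory footprint on a halved allocation doubles the percentages.
halved_peak_pct = peak_pct * 2                                # ~81%
print(f"Same footprint on a halved allocation: ~{halved_peak_pct:.0f}%")
```

The projected ~41% peak lines up with the observed "doesn't get much beyond 40%" below, which is what makes halving the allocation look tolerable on a weekly deploy cadence.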
I'm setting this to blocked because there's not much we can or want to do about this right now. The memory does increase steadily, but so far it's never caused an issue and doesn't get much beyond 40% maximum usage, which is a perfectly comfortable place to be (we won't suddenly run out). I think it would actually be better to close this, and to open a new issue if we identify the leak again after we merge the two API services back together.
Description
We are seeing a memory leak pattern in the production thumbnails service since deploying the ASGI worker to thumbnails.
After we reduced task resources, which caused a redeployment, memory usage has climbed to 29% maximum usage and appears to be continuing to climb.
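If we want to pin down where the growth happens, something like the following minimal sketch could log each worker's resident memory so the jumps show up alongside request activity in the logs. This is a generic illustration, not the project's actual monitoring code; it assumes psutil is installed, and the interval, logger name, and function name are arbitrary.

```python
# Minimal sketch: periodically log this process's resident memory (RSS)
# so slow growth like the pattern described above is visible in worker logs.
import asyncio
import logging
import os

import psutil

logger = logging.getLogger("memory-watch")


async def log_rss(interval_seconds: float = 60.0) -> None:
    """Log this worker's RSS every `interval_seconds`."""
    process = psutil.Process(os.getpid())
    while True:
        rss_mb = process.memory_info().rss / (1024 * 1024)
        logger.info("worker %s RSS: %.1f MB", os.getpid(), rss_mb)
        await asyncio.sleep(interval_seconds)


# In an ASGI app this could be started from the lifespan/startup hook, e.g.:
#     asyncio.create_task(log_rss())
```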