Proposal to palliate: Memory leak following rendering #611
FYI: rendering the same PDF, our memory increased by ~330 MB with WeasyPrint==0.36 and by ~415 MB with WeasyPrint==0.42.2.
Thank you for this issue, and for taking the time to read #220 and #70!
Putting rendering steps into separate tasks looks like a good idea. More than memory consumption, it may be really helpful for parallelization. It's the way to go, but I'm scared 😄. Your problem is a good pretext to start working on that, but I'll feel ready to merge when we have a solid solution that can be used elsewhere in WeasyPrint and works perfectly on various operating systems. As we now target Python 3 only, we have many useful tools to achieve this in a clean way, but we have to choose them well.
About your memory problem only: some well-known memory leaks have been fixed in the current master branch (mainly caches that are not global to the process anymore), and there's no known memory leak left. If you could find some time to test it, that would be really helpful! You can try to play with the garbage collector as well. Forking is just an efficient solution to a memory problem that shouldn't exist 😉.
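As a rough illustration of "playing with the garbage collector", a collection can be forced after each document is rendered. This is only a sketch: `render_and_collect` and the `render` callable are hypothetical names, and forcing a collection only helps when memory is held by unreachable reference cycles.

```python
import gc


def render_and_collect(render, *args, **kwargs):
    """Call render(*args, **kwargs), then force a full garbage collection."""
    try:
        return render(*args, **kwargs)
    finally:
        # Collect unreachable reference cycles left behind by the call.
        gc.collect()
```

With WeasyPrint, `render` could be any function that wraps a call such as `HTML(string=html).write_pdf(target)`.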
That's bad news. Is it possible to get samples that show this problem? Memory decreased by at least ~10% between 0.36 and 0.42.x with my informal testing suite. One last thing: if you can, use Python 3.6.
This PDF contains sensitive information, but I think I can replace it with "lorem ipsum" text.
Currently, we use Python 2.7, but Python 3.x (surely 3.6) is coming soon. I am very busy, but in 1-2 months I think I will be able to find several hours to create an HTML example. I will test it with Python 3.6 and different WeasyPrint versions. Please give me some time, and I will be back ;-)
Hi all! I am a colleague of Pablo's and I have a bit more time to prepare the HTML and test the PDF creation with Python 2.7 and Python 3.6. I have run tests with this HTML and different Python and WeasyPrint versions, with the following results:
To reproduce this data, you can use the following test HTML: Regards!
@rubgombar1 Thank you for this table. I'm trying hard to keep both memory consumption and execution time as low as possible. I only launch my memory/speed tests with the latest Python version; that's why I missed the growing memory between 0.36 and 0.42.3 (we have the ~-10% I was talking about in my previous comment, but only for Python 3.6). A lot of work has been done to use pure dicts wherever possible, to get the most out of the huge improvements made to dicts in Python 3.5 and 3.6.
You now know how good Python 3.6 is for you! The memory gap between Python 3.5 and Python 3.6 is a big one.
:-) Here is a new memory graph, using the fork solution (first comment). It is similar to the previous graph, but in the previous one I did a manual restart, and in this one the memory went down by itself.
Small update (results are different from previous table as we have different computers and systems).
I'd like to know if there's really a memory leak (and change the issue title accordingly). @goinnn @rubgombar1 Do you have news about your problem?
@liZe I'm sorry, we still use Python 2.7 :-( We will be able to say something in several months.
A dedicated website has now been created to follow performance and memory use. It makes me feel that I can close this issue for now 😄. Version 49 is dedicated to performance and bug fixes, the current Feel free to open a new ticket if you find a memory leak that needs to be fixed!
Hi @liZe, we are finally migrating from Python 2.7 to Python 3.7. We will launch it in our production environment this month or next month, and then we will be able to comment on it.
This is our memory leak during the billing process (on the 1st of every month). We generate approximately 1800 PDFs + 2400 Excel files, depending on the month.
As you can see, our memory graph is uglier than two years ago, because we now generate many more bills (PDFs and Excel files) and because two years ago we forked a process when generating PDFs with many pages. We think this was risky, so we no longer fork.
We use a Celery task for this process, and we have configured it with this configuration:
So Celery restarts the worker when its memory reaches 2 GB. @liZe, thanks again for this project. We will talk again in a few weeks.
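The configuration referred to above is not preserved in this thread. As an assumption, a restart threshold like the one described can be expressed with Celery's `worker_max_memory_per_child` setting, which takes a value in kilobytes and replaces a pool worker after it has finished the task during which its resident memory went over the limit. A minimal sketch of a configuration module (the module name and the exact value are illustrative, not the ones used here):

```python
# celeryconfig.py (hypothetical module name; not the configuration used above)

# Resident-memory limit per pool worker process, in kilobytes.
# When a worker exceeds it, Celery replaces that worker once its
# current task has finished. Roughly 2 GB here.
worker_max_memory_per_child = 2 * 1024 * 1024
```

The same limit can also be passed on the command line with the worker's `--max-memory-per-child` option.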
Hi @liZe, here is our new metric. Currently, with Python 3.6.10, memory (almost) does not increase. But these data are not conclusive: our billing this month is smaller than last month because of COVID-19. We will talk again in a few weeks, when COVID-19 is out of our lives. Take care! Thanks!!
That’s good news.
Of course.
😄 Take care…
Hi,
We generate bills (PDFs) from a Celery task, and this month we generated a PDF with a 50-page breakdown. Until now every PDF had only one page, and we didn't have any problem.
You can see the problem in the following image. This Sunday I did a manual restart and memory returned to a normal level.
I read the following issues: #220 and #70. We think we have a nice solution to palliate this problem.
With this code, we create a fork of the main process. There is a memory peak, but when the child process finishes, memory returns to its regular value.
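The code originally attached here is not preserved in this thread. A minimal sketch of the idea, assuming a POSIX system and using hypothetical names (`render_pdf`, `render_in_fork`), could look like this:

```python
import os


def render_pdf(html, target):
    """Stand-in for the real rendering step (hypothetical).

    With WeasyPrint this would be something like:
        HTML(string=html).write_pdf(target)
    """
    with open(target, "wb") as output:
        output.write(html.encode("utf-8"))


def render_in_fork(html, target):
    """Render in a forked child so its memory is freed when it exits."""
    pid = os.fork()  # POSIX only
    if pid == 0:
        # Child process: render, then exit without returning to the caller.
        exit_code = 1
        try:
            render_pdf(html, target)
            exit_code = 0
        finally:
            os._exit(exit_code)
    # Parent process: wait for the child and check how it ended.
    _, status = os.waitpid(pid, 0)
    if status != 0:
        raise RuntimeError(f"rendering child failed (wait status {status})")
```

Whatever the child allocates while rendering is returned to the operating system when it exits, so the long-lived parent only sees a temporary peak.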
What do you think about it?
PS: Thanks for this great project!