WeasyPrint consuming a lot of memory when rendering tables with 5000 rows #1104
Comments
Here the data is from the large table, and you will see:
Note: I queried 10,000 rows out of the 50,000 rows in this data set. There is another file, Boys0604_2212.pdf, which shows the output before adding
Thanks.
#70 is probably worth reading and could give an idea of the memory needed to render long tables. I’ll check your example as soon as possible.
Yes, I think I read that issue. You mentioned StyleDict and the deduplication of some rules. I am not familiar with those, but they are probably not in my code. Can you point out what I might need to improve in my code based on that issue? Also, an update: I think the CSS is applied to some extent. If you look at Boys.pdf and Test.pdf, the borders are there, but the cell highlighting seems to be what's missing. Once I applied
Have no effect. Thanks for looking into this, really appreciate it.
Any updates on the issue?
I’m back, sorry for the delay…
They’re 5000, the first one is 0 😉.
There’s no reason why it shouldn’t work. Maybe there’s a problem in the CSS you generate? Could you please provide the generated HTML file?
It’s not normal to have such a difference. If the variable holding the first document is deleted (by using
Hey lize, thanks for getting back. It seems the row count problem was my own fault, I get it now 👍, and the same goes for the styling. About performance, I will get back to you. I'm really embarrassed about the typo that broke my styling. When I revisited the code after a long time to prepare samples for you, I spotted my mistake, thanks to you :)
The code I am using is:
I am modifying this code to add export features to the apache/incubator-superset project, in the file viz.py. When I downloaded a chart with 6000 rows as a PDF, I got the response, but the process initially consumed 1.6 GB of RAM. When I launched a second request after the first one finished, the number jumped to 2.3 GB, and later, when I launched two requests at the same time, it jumped further to 3.9 GB. I'm not sure why this is happening, and it is of course a problem when multiple people use the web app and print charts. I will post the CSV data and the PDF that gets printed. So the styling works and I get all the rows, but performance is a huge bottleneck. Thanks for taking a look. I will be happy to provide a modified Superset branch if you want to test this yourself on apache/superset.
With recent versions of WeasyPrint, there’s a difference of less than 20% between rendering long tables or the same amount of divs. WeasyPrint still uses too much memory, but tables are now not that much worse than other boxes. Rendering times have been improved with 50456df too.
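The tables-versus-divs comparison mentioned above can be reproduced with two documents that hold the same number of boxes. The following is only a sketch of how such test inputs might be generated; the function names and the cell text are hypothetical, and this is not the benchmark the maintainers actually ran.

```python
def table_document(rows, cols):
    """Build an HTML table with rows x cols cells."""
    body = "".join(
        "<tr>" + "".join("<td>r%dc%d</td>" % (r, c) for c in range(cols)) + "</tr>"
        for r in range(rows)
    )
    return "<table>" + body + "</table>"


def div_document(rows, cols):
    """Build the same content as nested divs: one div per row, one per cell."""
    body = "".join(
        "<div>" + "".join("<div>r%dc%d</div>" % (r, c) for c in range(cols)) + "</div>"
        for r in range(rows)
    )
    return "<section>" + body + "</section>"
```

Rendering both strings with the same WeasyPrint version and comparing time and memory gives a rough idea of how much of the cost is specific to table layout.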
Hi, WeasyPrint has made my life a lot easier, but I recently noticed that it consumes a lot of memory, and on top of that the memory from each previous print call adds up. I am running the latest version of WeasyPrint, 51.
Python: 3.7
Distro: Fedora Workstation 31
Note: I commented out the last lines of the code for testing purposes.
I am trying to print a pandas DataFrame. Everything works well except memory usage and CSS.
Version 1.
I simply printed the page without specifying the page size, and the styling on tables worked.
Version 2.
I introduced size_css because my content was large and I needed A3 paper. After that, the styling on tables stopped working, and I am not sure why. I also noticed performance issues when I ran this on 1000+ rows: it eats up a lot of memory. I read issue #220 about this and tried the @font-face approach, but it's not helping.
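For reference, a page size rule and table styling can live in the same stylesheet. The sketch below is an assumption about how that might look, not the code from this report (which is not shown here): `build_stylesheet`, `render_pdf`, the `highlight` class and the colours are all hypothetical names. Only `HTML(string=...)`, `CSS(string=...)` and `write_pdf(stylesheets=...)` are WeasyPrint calls.

```python
def build_stylesheet(page_size="A3"):
    """Return one CSS string holding both the page size and the table styling."""
    return (
        "@page { size: %s; }\n"
        "table { border-collapse: collapse; }\n"
        "th, td { border: 1px solid #444; padding: 2px 6px; }\n"
        "td.highlight { background: #ffef9f; }\n"
    ) % page_size


def render_pdf(html, css_text):
    """Render html to PDF bytes; requires WeasyPrint to be installed."""
    # Imported lazily so build_stylesheet() can be used without WeasyPrint.
    from weasyprint import CSS, HTML

    # write_pdf() without a target returns the PDF as bytes.
    return HTML(string=html).write_pdf(stylesheets=[CSS(string=css_text)])
```

Keeping the `@page` rule and the table rules in a single string makes it easy to print the generated CSS and check it for typos before rendering.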
The first time I ran this, it used 1.4 GB of RAM. The second run, right after the first one, added to that and the process reached 2.1 GB.
I thought I might need to call gc.collect() manually, but it has no effect.
Hence it's commented out in the code.
I also thought that the HTML string might be getting very large, so I measured it without rendering any PDF, but it turns out to be less than 10 MB.
When I limit the dataset to something small, 50 to 100 rows, it behaves quite well, and on subsequent prints the memory does not add up the way it does with large datasets.
I will attach the table's CSV for your testing, along with the rendered PDF, where you will be able to see the table styling difference I mentioned.
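To check whether memory really accumulates between calls, the retained memory can be recorded after each render, once the result has been deleted and `gc.collect()` has run. This is a minimal sketch with hypothetical names (`measure_render`, `fake_render`); it uses `tracemalloc`, which only sees allocations made through Python's allocator, so memory held by native libraries will not show up and the process RSS should be checked separately.

```python
import gc
import tracemalloc


def measure_render(render, repeats=3):
    """Call render() several times and return the Python-level bytes
    still allocated after each call, once the result is dropped."""
    tracemalloc.start()
    retained = []
    for _ in range(repeats):
        result = render()
        del result      # drop the only reference to the rendered output
        gc.collect()    # also free anything kept alive by reference cycles
        current, _peak = tracemalloc.get_traced_memory()
        retained.append(current)
    tracemalloc.stop()
    return retained


def fake_render():
    """Stand-in for a real render call; builds a 5000-row table as bytes."""
    rows = ["<tr><td>%d</td></tr>" % i for i in range(5000)]
    return "".join(rows).encode("utf-8")
```

With WeasyPrint installed, `fake_render` could be swapped for a function that calls `HTML(string=...).write_pdf()`. A list that keeps growing from one call to the next would suggest that something still holds references to earlier documents.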
Thanks!