[BUG] Huge memory consumption when writing images to PDF #542

Open
zenyui opened this issue Dec 19, 2023 · 8 comments

@zenyui

zenyui commented Dec 19, 2023

Description

I am trying to create a PDF from an array of Go image.Image objects. The images total about 30MB, and when I write them to the PDF, I observe the Docker container's memory usage spike to 1.4GB.

In production, this is causing my container to OOM and exit.

See implementation below.

Expected Behavior

I would expect the memory usage to be close to (or 2x, 3x) the size of the images, not 1.4GB! I also don't see a way to incrementally build/finalize the PDF, so I don't see a way to decrease the memory usage.

Actual Behavior

Memory usage is 1.4GB, and I don't see an avenue to accomplish what I'm hoping to do.

Attachments

// pdfFromGoImages creates a pdf from an array of images, each on a separate page
func pdfFromGoImages(ctx context.Context, images ...image.Image) (io.ReadSeeker, error) {
	c := creator.New()

	margins := float64(10)

	for _, img := range images {
		pImg, err := c.NewImageFromGoImage(img)
		if err != nil {
			return nil, err
		}
		_ = c.NewPage()

		// scale to page width
		pImg.ScaleToWidth(c.Width() - margins*2)
		pImg.SetPos(margins, margins)
		if pImg.Height() >= c.Height() {
			pImg.ScaleToHeight(c.Height() - margins*2)
			pImg.SetPos(margins, margins)
		}
		b := creator.NewBlock(1, 1)
		if err := b.Draw(pImg); err != nil {
			return nil, err
		}
		if err := c.Draw(b); err != nil {
			return nil, err
		}

	}

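	// Write straight into the buffer; a bufio.Writer would need a Flush
	// before outBytes is read, or the tail of the PDF could be lost.
	var outBytes bytes.Buffer
	if err := c.Write(&outBytes); err != nil {
		return nil, err
	}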

	return bytes.NewReader(outBytes.Bytes()), nil
}

Welcome! Thanks for posting your first issue. The way things work here is that while customer issues are prioritized, other issues go into our backlog where they are assessed and fitted into the roadmap when suitable. If you need to get this done, consider buying a license which also enables you to use it in your commercial products. More information can be found on https://unidoc.io/

@zenyui
Author

zenyui commented Dec 19, 2023

FYI, I am a licensed enterprise customer

@sampila
Collaborator

sampila commented Dec 19, 2023

Hi @zenyui,

Could you share the images that you load into the Go image.Image objects, so that we can reproduce the issue on our end?

@zenyui
Author

zenyui commented Dec 19, 2023

Here is a google drive folder with a few pprof dumps and the source PDF.
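
Editorial note: for readers who want to capture comparable dumps, the following is a minimal sketch of taking a heap profile with the standard runtime/pprof package. The helper name heapProfile is hypothetical, and this is not necessarily how the dumps linked above were produced.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// heapProfile forces a garbage collection so the profile reflects up-to-date
// allocation data, then returns the heap profile as written by pprof.
func heapProfile() ([]byte, error) {
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := heapProfile()
	if err != nil {
		panic(err)
	}
	// In a real service the bytes would be written to a file and inspected
	// with `go tool pprof`.
	fmt.Printf("captured %d bytes of heap profile\n", len(data))
}
```

Taking one profile before and one after the PDF write makes it easier to see which allocations account for the spike.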

The larger algorithm is:

  1. extract the images from the source pdf
  2. convert to golang image.Image and compress it to 75% quality (attempt to make it smaller)
  3. pass into above function to write images to a new PDF

@sampila
Collaborator

sampila commented Dec 19, 2023

Thanks for the information, we will investigate this issue.

@zenyui
Author

zenyui commented Feb 22, 2024

Still waiting on a solution.

@ipod4g

ipod4g commented Aug 7, 2024

@zenyui
We have already partly improved PDF creation from images and introduced a lazy mode that reduces memory consumption.
You can check it here:
https://github.com/unidoc/unipdf-examples/blob/master/image/pdf_images_to_pdf_lazy.go

As for image extraction, we are actively working on it and will keep you updated on our progress.

@sampila
Collaborator

sampila commented Nov 12, 2024

Hi @zenyui,

We wrote a guide to handling the OOM situation with GOMEMLIMIT, which requires Go 1.19 or later. Here is the article: https://unidoc.io/post/unipdf-with-go-memlimit-on-memory-intensive-application. We hope it helps you resolve the OOM situation, and it would be great to get your feedback on it.

Best regards,
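
Editorial note: the GOMEMLIMIT setting mentioned above can be supplied as an environment variable (for example GOMEMLIMIT=1GiB) or set from code through runtime/debug. The sketch below shows the programmatic form. The 1 GiB value and the helper names are assumptions for illustration, and the linked article may recommend different settings.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// applyMemoryLimit sets the Go runtime's soft memory limit in bytes and
// returns the previously configured limit.
func applyMemoryLimit(limit int64) int64 {
	return debug.SetMemoryLimit(limit)
}

// currentMemoryLimit reads the limit without changing it; a negative
// argument to SetMemoryLimit leaves the setting untouched.
func currentMemoryLimit() int64 {
	return debug.SetMemoryLimit(-1)
}

func main() {
	// Assumed value: 1 GiB, chosen to sit below a hypothetical container limit.
	const limit = 1 << 30
	prev := applyMemoryLimit(limit)
	fmt.Printf("previous limit: %d bytes, current limit: %d bytes\n", prev, currentMemoryLimit())
}
```

The limit is soft: it makes the garbage collector work harder as the heap approaches the threshold, but it cannot free memory that is still referenced, so a workload whose live data exceeds the limit can still be killed by the container.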
