Circular references on Page Tree causes PDF::Reader to crash with SystemStackError #530

Open
tomascco opened this issue Dec 30, 2023 · 1 comment

tomascco commented Dec 30, 2023

Pages-tree-refs.pdf (source)
Running the following script against the attached PDF raises this error:

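require "bundler/inline"

# Install and load pdf-reader on the fly (2.12.0 at the time of this report).
gemfile do
  source "https://rubygems.org"
  gem "pdf-reader"
end

# Enumerating the pages triggers the crash shown below.
PDF::Reader.new("Pages-tree-refs.pdf").pages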
# /usr/local/bundle/gems/pdf-reader-2.12.0/lib/pdf/reader/reference.rb:65:in `hash': stack level too deep (SystemStackError)

The crash is caused by a circular reference among the page tree objects:

% ...
1 0 obj
  << /Type /Catalog
     /Pages 2 0 R
  >>
endobj

2 0 obj
  << /Type /Pages
     /Kids [6 0 R 3 0 R]
     /Count 2
     /MediaBox [0 0 595 842]
  >>
endobj

3 0 obj
  << /Type /Pages
     /Kids [4 0 R]
     /Count 1
     /MediaBox [0 0 595 842]
  >>
endobj

4 0 obj
  << /Type /Pages
     /Kids [5 0 R]
     /Count 1
     /MediaBox [0 0 595 842]
  >>
endobj

5 0 obj
  << /Type /Pages
     /Kids [3 0 R]
     /Count 1
     /MediaBox [0 0 595 842]
  >>
endobj
% ...

Here, 2 0 R is the root of the page tree and has two children: 6 0 R and the problematic 3 0 R, which is the start of a cycle:

3 0 R --> 4 0 R --> 5 0 R --> 3 0 R <-- the cycle restarts here.
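
One way to guard against this kind of loop is to track the ancestors of the node being visited and stop as soon as a kid points back at one of them. The sketch below is only an illustration, not PDF::Reader's actual code: it models the objects above as a plain Hash keyed by object number, `find_page_tree_cycle` is a hypothetical helper, and object 6 (not shown in the excerpt) is assumed to be a leaf page.

```ruby
# Hypothetical stand-in for the parsed objects: object number => dictionary.
# Object 6 is not shown in the excerpt; it is assumed here to be a leaf page.
OBJECTS = {
  2 => { Type: :Pages, Kids: [6, 3] },
  3 => { Type: :Pages, Kids: [4] },
  4 => { Type: :Pages, Kids: [5] },
  5 => { Type: :Pages, Kids: [3] },
  6 => { Type: :Page },
}.freeze

# Depth-first walk that returns the object numbers forming the first cycle
# found below `num`, or nil when the subtree is acyclic.
# `path` holds the ancestors of `num`, ordered from the root down.
def find_page_tree_cycle(objects, num, path = [])
  # Reaching a node that is already one of our ancestors means we looped.
  if (start = path.index(num))
    return path.drop(start) + [num]
  end

  kids = objects.fetch(num, {}).fetch(:Kids, [])
  kids.each do |kid|
    cycle = find_page_tree_cycle(objects, kid, path + [num])
    return cycle if cycle
  end
  nil
end

p find_page_tree_cycle(OBJECTS, 2) # prints [3, 4, 5, 3]
```

Because the walk stops at the first repeated ancestor, its recursion depth is bounded by the number of distinct objects, so it cannot overflow the stack the way an unguarded traversal does. Only ancestors are checked, so a node that is merely referenced from two different parents is not reported as a cycle.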

I would like to take a shot at solving this. May I?

Context: I've been using PDF::Reader as a dependency of a gem I created for my undergraduate thesis (https://github.com/tomascco/rubrik). As part of my research, I tested PDF::Reader against some of the PDFs in the pdf.js repository (https://github.com/mozilla/pdf.js/tree/master/test/pdfs) and found some cases like this one.

I'd also like to give some feedback as someone who has used PDF::Reader as a dependency for a higher-level PDF interface.

Would these patches and suggestions be welcome? @yob

Owner

yob commented Dec 30, 2023

Would these patches and suggestions be welcome?

Absolutely!
