PDF::Reader::MalformedPDFError - after update to v2.10.0 #492

Open
kserhiyus opened this issue May 31, 2022 · 1 comment

Comments

kserhiyus commented May 31, 2022

Hello,
The gem is great and I'm quite a power user of it.

However, after updating from v2.9.2 to v2.10.0, some of my PDFs fail to be processed.
I checked the failing PDFs in several online validators, and no issues were found.

The error I get comes from here: https://github.com/yob/pdf-reader/blob/main/lib/pdf/reader/cid_widths.rb#L55

Full error trace:

/opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/cid_widths.rb:55:in `parse_second_form': CidWidths: 3 must be less than 3 (PDF::Reader::MalformedPDFError)
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/cid_widths.rb:37:in `parse_array'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/cid_widths.rb:22:in `initialize'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/width_calculator/composite.rb:17:in `new'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/width_calculator/composite.rb:17:in `initialize'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:146:in `new'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:146:in `build_width_calculator'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:49:in `initialize'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:214:in `new'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:214:in `block in extract_descendants'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:213:in `map'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:213:in `extract_descendants'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:48:in `initialize'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page_state.rb:393:in `new'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page_state.rb:393:in `block in build_fonts'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page_state.rb:392:in `each'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page_state.rb:392:in `map'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page_state.rb:392:in `build_fonts'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page_state.rb:30:in `initialize'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/lib/pdf/item_receiver.rb:21:in `new'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/lib/pdf/item_receiver.rb:21:in `page='
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/validating_receiver.rb:258:in `call_wrapped'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/validating_receiver.rb:24:in `page='
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page.rb:268:in `block in callback'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page.rb:267:in `each'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page.rb:267:in `callback'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page.rb:158:in `walk'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/lib/pdf/processor.rb:37:in `block in extract_analyze_merge'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/lib/pdf/processor.rb:34:in `collect'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/lib/pdf/processor.rb:34:in `extract_analyze_merge'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/lib/pdf/processor.rb:27:in `block in <class:Processor>'
	from (eval):34:in `instance_exec'
	from (eval):34:in `__dry_initializer_initialize__'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/dry-initializer-3.0.4/lib/dry/initializer/mixin/root.rb:7:in `initialize'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/bin/pdfcb:81:in `new'
	from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/bin/pdfcb:81:in `<top (required)>'
	from /opt/bitnami/ruby/bin/pdfcb:25:in `load'
	from /opt/bitnami/ruby/bin/pdfcb:25:in `<main>'

In fact, to keep up with the update, I tweaked this line

raise MalformedPDFError, "CidWidths: #{first} must be less than #{final}" unless first < final

to be

raise MalformedPDFError, "CidWidths: #{first} must be less than #{final}" unless first <= final

and for me everything works as it did before the update.
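
For context, here is a minimal sketch of why equal bounds look legitimate. In the range form of a CID font's W array (`c_first c_last w`), an entry whose `c_first` equals `c_last` simply assigns a width to a single CID, which matches the `3` and `3` in the error message above. The `expand_range` helper and the local `MalformedPDFError` class are hypothetical stand-ins so the example runs without the gem; they are not pdf-reader's actual code.

```ruby
# Hypothetical stand-in for PDF::Reader::MalformedPDFError,
# defined locally so this sketch runs without pdf-reader installed.
class MalformedPDFError < StandardError; end

# Hypothetical helper: expand one "c_first c_last w" entry of a W array
# into a { cid => width } hash. Equal bounds are accepted and describe
# a single CID; only a reversed range is rejected.
def expand_range(first, final, width)
  unless first <= final
    raise MalformedPDFError, "CidWidths: #{first} must not be greater than #{final}"
  end

  (first..final).each_with_object({}) { |cid, widths| widths[cid] = width }
end

single = expand_range(3, 3, 500) # one CID, the case from the error above
range  = expand_range(3, 5, 500) # three consecutive CIDs sharing one width

puts "single entry covers #{single.size} CID(s)"
puts "range entry covers #{range.size} CID(s)"
```

With `<=`, only a reversed range such as `expand_range(4, 3, 500)` raises, so a one-element range no longer aborts font loading.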

Here are the failing PDFs:
https://assets.publishing.service.gov.uk/media/5c640a8ded915d04148c31b0/Mr_J_Szymaniak_v_Jason_Hunt_and_Mardi_Hunt_trading_as_Crazy_Bear_Farm_and_Farm_Shop_-_3304471-2018.pdf
https://assets.publishing.service.gov.uk/media/5de917a2e5274a06d71f0413/Mr_J_Szymaniak_v_Jason_Hunt___Mardi_Hunt_TA_Crazy_Bear_Farm_and_Farm_Shop_-_3304471-2018_Judgment.pdf

Regards,
Serhii

yob commented May 31, 2022

Thanks for the clear bug report.

That particular raise was added between v2.9.2 and v2.10.0, so this sounds like a bug, and I suspect your fix is what we need. Are you up for opening a PR? I'll get it merged.
