Use a proper MessageHandler for PartialEvaluator.getTextContent to avoid errors for fonts relying on built-in CMap files (PR 8064 follow-up) #8194

Merged: 2 commits, Mar 25, 2017


@Snuffleupagus (Collaborator) commented Mar 24, 2017

My apologies for inadvertently breaking this in PR #8064; apparently we don't have any tests that cover this use-case :(

Without this patch `getTextContent` will fail if called before `getOperatorList`, since loading fonts during text extraction may require fetching built-in CMap files.

Please note: The `text` test added here, which uses an already existing PDF file, fails without this patch.
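
The failure mode described above can be sketched in isolation. This is only a rough illustration, not the pdf.js code: `stubHandler`, `createRequestHandler`, `requestBuiltInCMap` and the `'FetchCMap'` message name are hypothetical stand-ins, and the sketch assumes the handler used for text extraction previously lacked a `sendWithPromise`-style method for answering requests.

```javascript
// Illustrative stand-ins only; these are NOT the actual pdf.js objects.

// A placeholder handler that can emit messages but cannot answer requests.
const stubHandler = {
  on() {},
  send() {},
};

// A handler that can also resolve requests, here from an in-memory map
// (a real handler would ask the main thread to fetch the file).
function createRequestHandler(cmapFiles) {
  return {
    on() {},
    send() {},
    sendWithPromise(action, data) {
      if (action === 'FetchCMap' && cmapFiles.has(data.name)) {
        return Promise.resolve(cmapFiles.get(data.name));
      }
      return Promise.reject(new Error('Cannot handle: ' + action));
    },
  };
}

// Hypothetical font-loading step: a font that relies on a built-in CMap
// has to request the CMap data through the handler it was given.
function requestBuiltInCMap(handler, name) {
  return handler.sendWithPromise('FetchCMap', { name });
}

const cmapFiles = new Map([['Adobe-Japan1-UCS2', new Uint8Array([1, 2, 3])]]);
const properHandler = createRequestHandler(cmapFiles);

// With the stub, the request fails before any data can be fetched.
let stubError = null;
try {
  requestBuiltInCMap(stubHandler, 'Adobe-Japan1-UCS2');
} catch (e) {
  stubError = e; // a TypeError, since the stub has no sendWithPromise
}

// With a handler that can answer requests, a promise for the data comes back.
const pending = requestBuiltInCMap(properHandler, 'Adobe-Japan1-UCS2');
pending.then((data) => console.log('CMap bytes received:', data.length));
```

In this sketch the stub only breaks for fonts that actually need a CMap fetch, which would be consistent with the regression going unnoticed until a test exercised that path.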

Given that this patch fixes a bad regression, I'm flagging a bunch of people for review (for whoever has time to look at it first).


Fixes #8193, courtesy of the second commit.

Three cheers for all the idiosyncrasies in IE that made the second patch necessary :-P

@takahiroyoshi

This could potentially be a solution for issue #8193, but please note that I've not checked yet.
Provided that the way I tested this patch is right, it does not solve issue #8193.

… the `responseType` in the `DOMCMapReaderFactory`, since IE fails otherwise (issue 8193)

I really cannot understand why this change is necessary, since modern browsers such as Firefox and Chrome work just fine with the old code.
Hence this patch is yet another "hack" that's needed just because IE apparently cannot work the way you'd expect.

For consistency, the Node factory used in the CMap unit-tests is changed as well.

Fixes 8193.
@pdfjsbot

From: Bot.io (Linux)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/767a7afa6a98489/output.txt

@pdfjsbot

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.215.176.217:8877/5c640bf6c539948/output.txt

@pdfjsbot

From: Bot.io (Windows)


Success

Full output at http://54.215.176.217:8877/5c640bf6c539948/output.txt

Total script time: 22.80 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/767a7afa6a98489/output.txt

Total script time: 28.89 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@yurydelendik yurydelendik merged commit b7ba44b into mozilla:master Mar 25, 2017
@yurydelendik (Contributor)

Thank you for the patch.

@Snuffleupagus Snuffleupagus deleted the getTextContent-use-proper-handler branch March 25, 2017 22:15
movsb pushed a commit to movsb/pdf.js that referenced this pull request Jul 14, 2018
…-proper-handler

Use a proper `MessageHandler` for `PartialEvaluator.getTextContent` to avoid errors for fonts relying on built-in CMap files (PR 8064 follow-up)
Issue linked to this pull request: Internet Explorer: "Unhandled rejection: DataCloneError"