Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve support for viewing PDFs from file://-URLs in the Chrome extension #6233

Merged
merged 5 commits into from
Aug 15, 2015

Conversation

Rob--W
Copy link
Member

@Rob--W Rob--W commented Jul 18, 2015

Up to now, if a user did not check the "Allow access to file URLs" checkbox at the extension settings page, then they could not view local PDFs in Chromium.

With this patch, such navigations are intercepted, and the extension will ask the user to enable the setting when it detects that it has no access to file:-URLs.

When the user selects "Allow access to file URLs", the extension reloads and Chrome closes all existing extension tabs (https://crbug.com/511670). Since this is disruptive, I've added a work-around (save tab state before page unload, restore tab state when extension loads).

To test:

  1. node make chromium
  2. chromium --load-extension=build/chromium
  3. Visit chrome://extensions and uncheck the 'Allow access to file URLs" at the PDF.js extension (this was automatically checked because you've loaded the extension in unpacked mode).
  4. Open a local PDF in the browser.
  5. Verify that the PDF.js extension prompts for file access permissions.
  6. Follow the instructions:
    • Click on the chrome://extensions link.
    • Click on the "Allow access to file URLs" checkbox at the extensions page.
  7. Navigate back
  8. Verify that the local PDF is displayed by the extension.

  9. Go to chrome://extensions in a new tab (so the PDF viewer is still open).
  10. Click on "Allow access to file URLs" (to untick the checkbox).
  11. Observe that Chrome closes the PDF viewer tab, but verify that the PDF Viewer extension re-opens the tab (and prompts to re-enable file access).

Screenshot:

Chrome extension asking for access to local files

The displayed string adapts itself to the browser. Start Chrome/Chromium with the LANG=nl_NL.UTF8 environment variable, and you will see a Dutch browser interface and the Dutch translation of "Allow access to file URLs":

Chrome extension asking for access to local files (LANG=nl_NL.UTF8)

Ordinarily, local files cannot be embedded in a non-local website. Until
this commit, the extension allowed websites to embed local PDF files on
non-local (e.g. http(s)) websites. This unintended feature is now
disabled, to align better with Chrome's existing security policies
(=local file:-URLs cannot be loaded in a tab unless expicitly allowed).
@timvandermeij
Copy link
Contributor

I like the idea here, but I think we should use the current localization facilities to localize the entire message. It seems confusing to get an English message in a Dutch interface with a Dutch part in it. It would be better to just have the entire string in the current localization files such that it will be translated to all other languages upstream. If the "Allow access to file URLs" is a concern (i.e., if the translation should be exactly the same as in Chrome), we should reword the sentence to make it more general, i.e. something like "Allow access to file URLs at chrome://extensions to view this file.". That way we do not mention the exact name of the option, but rather guide the user towards the option, which allows upstream to translate this message more easily and also for languages that Chrome does not even support. What do you think?

@Snuffleupagus
Copy link
Collaborator

In this case, I think that it makes a certain amount of sense to handle the l10n separately.
My main motivation is that it seems slightly strange to essentially require that the Firefox l10n teams translate a Chrome-specific string (since it doesn't appear to be necessary for other builds).

@timvandermeij
Copy link
Contributor

Okay, that seems reasonable to me. What about the string not being translated entirely, but only partially? Would you agree that that is confusing or do you think it is fine like that?

@Rob--W
Copy link
Member Author

Rob--W commented Jul 19, 2015

it would be very nice if the whole string can be translated, but I did not know that it's possible to use Firefox's localization teams to translate the string.

When constructing that sentence, I had the following requirements:

  • It should be short and understandable, even to non-technical users.
  • It should guide the user to perform the desired action (i.e. enabling file access). This is achieved by inserting a link to chrome://extensions and by showing the extension icon + exact label of checkbox. PDF.js's language is not necessarily equal to Chrome's UI language, so I did not use webL10n to get the correct translation, but some Chrome-specific i18n API to get the UI language.
  • It should give context to the action, so that the user knows why he should perform that action: That's why the link to the PDF file is inserted.

That resulted in the final string:

Click on "[label]" at [link] line break
to view [path to PDF file.]

The line break is inserted to make sure that the top line is always at the same location, regardless of the file path.

Here's an example for Dutch:

Druk op "[label]" bij [link] om line break
[path to PDF file.] te weergeven.

Is it really possible to ask Firefox's i18n team to provide such translations?

@Rob--W Rob--W force-pushed the crx-local-files branch 2 times, most recently from 9104e5a to bf588f0 Compare July 20, 2015 11:52
@Rob--W
Copy link
Member Author

Rob--W commented Jul 20, 2015

I have updated the PR with two more commits, to minimize the disruption caused by reloading the extension, and updated my PR message with 3 more steps-to-verify. Without these two PRs, existing PDF tabs would disappear when the user follows the instructions to enable file access.

@Rob--W
Copy link
Member Author

Rob--W commented Aug 8, 2015

@timvandermeij / @Snuffleupagus What prevents this PR from being merged?

@timvandermeij
Copy link
Contributor

I was wondering if there was a way to translate the English parts of the string, but then I saw that https://github.com/mozilla/pdf.js/blob/master/web/viewer-snippet-mozPrintCallback-polyfill.html#L70 is untranslated as well. Let's add such a TODO to this code too to make it easier to find if we find a way to localize that later on. Personally I'm fine with keeping it this way for the current PR and following up on that later (as it is a TODO just like the other one, someone will find and address it eventually, but it is not a high priority).

Other than that, this just has to be reviewed properly by someone. I'm a bit busy with other PRs and on vacation soon, so could @Snuffleupagus or @yurydelendik review this perhaps?

@Rob--W
Copy link
Member Author

Rob--W commented Aug 8, 2015

@timvandermeij I've opened a new ticket (#6333) for the translation issue. Have fun on your well-deserved holiday trip!

@@ -1347,6 +1347,7 @@ html[dir='rtl'] .attachmentsItem > button {

.dialog > .row {
display: table-row;
word-break: break-all;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unconditionally adding this rule leads, in my opinion, to decreased readability of the text in the DocumentProperties dialog in some cases; see examples below.
Can you please add this rule only for the chromeFileAccessOverlay?


  • intelisa.pdf from the test-suite

intelisa

biblatex

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move the declaration.

( The main reason for adding word-break: break-all is to prevent a long sequence of characters (such as a file path) from overflowing. This would be useful for the main dialogs as well, so I put it there. But after testing, I notice that the text doesn't overflow in the document properties dialog. This is because the p (containing the text) is not inline (but table-cell), has word-wrap break-word AND an explicit (maximum) width. )

@Snuffleupagus
Copy link
Collaborator

I've tested this and it appears to work exactly as described!
Since this PR is specific to the Chrome extension, I'm treating it somewhat like external code, hence I've not reviewed it line by line.
Once the comment I added is addressed, I think this can be merged.

The JSON file is generated as follows.

1. Go to the src/chrome/app/resources directory of Chromium's source.

2. Find the translation ID of the "Allow access to file URLs" string:

grep 'Allow access to file URLs' generated_resources_en-GB.xtb

3. With the ID that you've found, locate the other translations.

grep 3341703758641437857 generated_resources_*.xtb

4. If the result looks OK, serialize the result as JSON and save it.

> path/to/pdf.js/web/chrome-i18n-allow-access-to-file-urls.json \
python -c "import json;print(json.dumps({ \
$(grep 3341703758641437857 generated_resources_*.xtb | \
 sed "s@generated_resources_\([^.]\+\)\.xtb:<translation[^>]\+>\(.\+\)</translation>@'\1':'''\2''',@" \
)}, sort_keys=True, indent=2))"

(Strings are taken from Chromium 45.0.2448.0 (ccrev.com/337313).
Snuffleupagus added a commit that referenced this pull request Aug 15, 2015
Improve support for viewing PDFs from file://-URLs in the Chrome extension
@Snuffleupagus Snuffleupagus merged commit 421289c into mozilla:master Aug 15, 2015
@Snuffleupagus
Copy link
Collaborator

Thank you for the patch!

Rob--W added a commit to Rob--W/pdf.js that referenced this pull request Sep 2, 2024
restoretab.js was added in mozilla#6233
with the purpose of restoring lost tabs when Chrome closes all extension
tabs when it reloads the extension. This forced reload can happen when
the user toggles the "Allow access to file URLs" option.

This logic does not work any more, and since the use of localStorage is
a blocker in migrating to MV3, this patch just drops the logic.
Rob--W added a commit to Rob--W/pdf.js that referenced this pull request Sep 8, 2024
restoretab.js was added in mozilla#6233
with the purpose of restoring lost tabs when Chrome closes all extension
tabs when it reloads the extension. This forced reload can happen when
the user toggles the "Allow access to file URLs" option.

This logic does not work any more, and since the use of localStorage is
a blocker in migrating to MV3, this patch just drops the logic.
Rob--W added a commit to Rob--W/pdf.js that referenced this pull request Sep 8, 2024
restoretab.js was added in mozilla#6233
with the purpose of restoring lost tabs when Chrome closes all extension
tabs when it reloads the extension. This forced reload can happen when
the user toggles the "Allow access to file URLs" option.

This logic does not work any more, and since the use of localStorage is
a blocker in migrating to MV3, this patch just drops the logic.
JohnAdamsy added a commit to Buymore/pdf.js that referenced this pull request Oct 1, 2024
* [Editor] Change the background color of the image preview in the new alt text dialog

* [Editor] Remove event listeners with `AbortSignal.any()`

There's a fair number of event listeners in the editor-code that we're currently removing "manually", by keeping references to their event handler functions.
This was necessary since we have a "global" `AbortController` that applies to all event listeners used in the editor-code, however it's now possible to combine multiple `AbortSignal`s; please see https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal/any_static

Since this functionality is [fairly new](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal/any_static#browser_compatibility) the viewer will check that `AbortSignal.any()` is available before enabling the editing-functionality.
(It should hopefully be fairly straightforward, famous last words, for users to implement a polyfill to allow editing in older browsers.)

Finally, this patch also adds checks and test-only asserts to ensure that we don't add duplicate event listeners in various editor-code.

* Check that `AbortSignal.any()` is supported in `PDFViewer` too (PR 18586 follow-up)

Without this patch the viewer may break on load, since the check added in PR 18586 only applies to the toolbar.

* Revert "[Editor] Pass a buffer instead of a typed array when passing image data to the model"

* [Firefox] Remove the "loadaiengineprogress" listener with `AbortSignal.any()`

* Use a few local variables in `PDFSidebar.#addEventListeners`

This, ever so slightly, shortens the code for a couple of repeatedly accessed class fields.

* Stop sidebar resizing on "blur" events

Because of an old oversight (by me) we don't stop sidebar resizing when the browser window loses focus, which seems generally wrong and can also lead to duplicate mouse-related event listeners being registered.

* Remove the sidebar resizing event listeners with an `AbortController`

* [Editor] Add a first test to test the new alt text flow

* Remove the `enableHighlightEditor` preference

This was enabled by default in Firefox 126, see [bug 1867513](https://bugzilla.mozilla.org/show_bug.cgi?id=1867513), so hopefully we should be able to remove the option/preference now.

* [Editor] Add the telemetry for the new alt text (bug 1912500)

* Group and scope the secondary toolbar button container/icon rules using CSS nesting

The secondary toolbar CSS rules predate the general availability of CSS
nesting, which makes them more difficult to understand and change
safely. The primary issues are that the rules are spread over the
`viewer.css` file, they share blocks with other elements and the scope
of the rules is sometimes bigger than necessary.

This refactoring groups all CSS rules for the secondary toolbar button
container/icons together, scoped to the top-level `#secondaryToolbar`
element, for improved overview and isolation. Note that this patch only
intends to move the existing rules around and not change any behavior.
Moreover, this patch does not move the rules for the secondary toolbar
itself and the secondary toolbar buttons; those will be part of a
follow-up patch and will be easier once this is in place first.

Co-authored-by: Calixte Denizet <[email protected]>

* Remove the `secondaryToolbarButton` CSS class

Secondary toolbar buttons are toolbar buttons with some extra rules,
mainly to make them wider and have visible labels. However, this
similarity is currently not clearly reflected in the implementation
because the secondary toolbar buttons use a different CSS class,
`secondaryToolbarButton`, compared to the other toolbar buttons that
use the `toolbarButton` CSS class. This also causes some duplication
in the rules and requires extra care to keep the common bits for the
`secondaryToolbarButton` class in sync with the `toolbarButton` class.

Fortunately, now that we have a dedicated CSS scope for the secondary
toolbar, we can simplify this by giving all secondary toolbar buttons
the `toolbarButton` class and explicitly listing the required overrides
in the `#secondaryToolbar` scope instead. Doing so removes most of the
special-casing for secondary toolbar buttons while explicitly listing
the differences in a single place for a better overview. It also lays
the foundation for making all toolbar buttons respect the
`browser.uidensity` Firefox preference later by reducing differences.

Co-authored-by: Calixte Denizet <[email protected]>

* Group and scope the secondary toolbar rules using CSS nesting

The secondary toolbar CSS rules predate the general availability of CSS
nesting, which makes them more difficult to understand and change
safely. The primary issues are that the rules are spread over the
`viewer.css` file, they share blocks with other elements and the scope
of the rules is sometimes bigger than necessary.

This refactoring groups all remaining secondary toolbar rules together,
scoped to the top-level `#secondaryToolbar` element, for improved
overview and isolation. Note that this patch only intends to move the
existing rules around and not change any behavior.

Co-authored-by: Calixte Denizet <[email protected]>

* Limit base-class initialization checks to development and TESTING modes

We have a number of base-classes that are only intended to be extended, but never to be used directly. To help enforce this during development these base-class constructors will check for direct usage, however that code is obviously not needed in the actual builds.

*Note:* This patch reduces the size of the `gulp mozcentral` output by `~2.7` kilo-bytes, which isn't a lot but still cannot hurt.

* Merge the duplicate `.editorParamsToolbar` CSS blocks

Now that we have dedicated CSS scopes for the findbar and the secondary
toolbar we have ended up with two CSS blocks with identical selectors,
so now we can combine them for simplicity and to remove some rules in
the first block that were always overridden by the second block.

Co-authored-by: Calixte Denizet <[email protected]>

* Improve grouping of the secondary toolbar button CSS rules

Using the `:is(a)` selector we can move the previously separate rules
into the main `.toolbarButton` override rules.

Co-authored-by: Calixte Denizet <[email protected]>

* Generalize the CSS rules for labeled toolbar buttons

This commit fixes a regression from mozilla#18596 where the "Add image" button
styles broke because it's a labeled toolbar button but the labeled
toolbar button CSS rules were only scoped to the secondary toolbar.
However, nowadays labeled toolbar buttons are also used in e.g. the
editor parameters toolbar, and this highlighted that we didn't clearly
encode the concept of labeled toolbar buttons in the CSS so far.

This patch extracts the CSS rules for labeled toolbar buttons that were
previously limited to the secondary toolbar into a dedicated generic
class that can be applied on top of any unlabeled toolbar button to
convert it to a labeled toolbar button, regardless of its position in
the DOM. This also makes the distinction clearer in the HTML, and the
new CSS scope for the toolbar buttons lays the foundation for combining
the other toolbar buttons rules in there later on.

Co-authored-by: Calixte Denizet <[email protected]>

* Allow specifying custom match logic in PDFFindController

This patch allows embedders of PDF.js to provide custom match
logic for seaching in PDFs. This is done by subclassing the
PDFFindController class and overriding the `match` method.

`match` is called once per PDF page, receives as parameters the
search query, the page contents, and the page index, and returns
an array of { index, length } objects representing the search
results.

* Bump library version to `4.6`

* Use the local `eventBus` in the `AnnotationEditorUIManager` constructor

This shortens the code ever so slightly, which cannot hurt.

* Handle the "switchannotationeditorparams" event in the editor-code (issue 18196)

The problem seems to be caused by the browser trying to "restore" editing input-elements, in the various toolbars, to their previous values when the tab is re-opened.

Hence the simplest solution appears to be to move the event handling into the editor-code, which is also less code overall, since the listener thus won't be registered early enough for the problem to appear.

* Link to official releases and the demo viewer in the bug report template

Hopefully this might lead to *more* users actually testing the latest version before reporting a bug.

* Fix the telemetry for the new alt-text flow

* Shorten the `PDFViewerApplication._parseHashParams` method

The way that the debugging hash-parameter parsing is implemented leads to a lot of boilerplate code in this method, since *most* of the cases are very similar.[1]
With just a few exceptions most of the options can be handled automatically, by defining an appropriate checker for each option.

---

[1] With the recent introduction of TESTING-only options the size of this method increased a fair bit.

* Enable disabled integration tests for Firefox

* Update dependencies to the most recent versions

* Fix vulnerability in the `axios` dependency

This patch is generated automatically using `npm audit fix` and fixes
CVE-2024-39338 (see GHSA-8hc4-vh64-cxmj),
bringing the vulnerability count back to zero.

* Fix the repository URLs in the `importl10n` script

We introduced quite a few new strings recently, but during the last few
rounds of updates we didn't see new translations updates becoming
available. I only just now recalled the announcement that Mozilla is
moving from Mercurial to Git, and indeed the Mercurial page at
https://hg.mozilla.org/l10n-central hasn't been updated since July
anymore and that's were we used to pull our translations from.

This commit fixes the issue by changing the URLs to the Mozilla Git
repositories hosted on GitHub instead.

* Update translations to the most recent versions

* Set the event handlers in the integration tests before any event is triggered

The function evaluateOnNewDocument in Puppeteer allow us to execute some js before the pdf.js one
is loaded.
It allows us to stub some setters before there are used and then set some event handlers very soon.

* [Editor] Move setting `window.uiManager` back to the test code

In PR mozilla#18574 setting `window.uiManager` was moved into the `src` folder
to avoid intermittent integration test failures because at the time we
lacked a way to register event listeners early (before PDF.js loads).
However, in PR mozilla#18617 this functionality got introduced, so we can now
use the new way of setting up the event bus in the tests to move this
back to the `test` folder again and to reduce the amount of test-only
code in the main codebase as discussed in PR mozilla#18574.

Partially reverts e037c57.

* Fix the "must check that a value is correctly updated on a field and its siblings" scripting integration test

This integration test fails intermittently because we cache the initial
total value to be able to compare it to the new total value at the end
of the test to check that it's different before doing the assertions.
However, this doesn't work as expected because the second `clearInput`
call triggers an intermediate total value calculation because it clicks
on another input field and that triggers a sandbox event.

This results in the `waitForFunction` calls always resolving immediately
and since we don't use other means of waiting until the calculation is
done (using e.g. `waitForSandboxTrip`) we basically rely on the time
between the final click and the assertions to be enough for the sandbox
to do its work. If it's is not done in that time, we do the assertions
with older values and that makes the test fail.

This commit fixes the issue by simply waiting for the total value to be
what we expect it to be. This requires less code, is more consistent
with the other integration tests and removes the possibility of doing
assertions against older values.

* Use standard glyph mapping for non-embedded and non-composite Calibri fonts (issue 18208)

Given that we handle non-embedded Calibri fonts as "mapped to standard font", we really ought to be able to use the same glyph mapping as for an actual standard font.
Note that this actually improves consistency in the code, given how we already handle such fonts if they happen to be of the `CIDFontType2` type; see https://github.com/mozilla/pdf.js/blob/b47c7eca83c35b8f9ea170aa3742fc70359726c2/src/core/fonts.js#L1186-L1190

* Don't show the print dialog when printing in some integration tests

* Send fetch requests for all page dict lookups in parallel
- When adding page dict candidates to the lookup tree, also initiate fetching them from xref, so if they are not yet loaded at all, the XHR will be sent
 - Only at the top level - assume that if there is a /Pages tree, it is sensibly structured and the number of requests won't be too bad
- We can then await on the cached Promise without making the requests pipeline
- This has a significant performance improvement for load-on-demand (i.e. with auto-fetch turned off) when a PDF has a large number of pages in the top level /Pages collection, and those pages are spread through a file, so every candidate needs to be fetched separately
 - PDFs with many pages where each page is a big image and all the pages are at the top level are quite a common output for digitisation programmes
- I would have liked to do something like "if it's the top level collection and page count = number of kids, then just fetch that page without traversing the tree" but unfortunately I agree with comments on mozilla#8088 that there is no good general solution to allow for /Pages nodes with empty /Kids arrays

* Introduce a helper method for fetching l10n-data in `PDFDocumentProperties`

Given the length of the l10n-strings we can slightly reduce verbosity, and thus overall code-size, by introducing a helper method for fetching l10n-data.

While testing this I stumbled upon an issue in the `L10n`-class, where an optional chaining operator was placed incorrectly since the underlying method always return an Array; see https://github.com/projectfluent/fluent.js/blob/48e2a62ed45ff2a62a231b2e83cfd8b332d27acb/fluent-dom/src/localization.js#L38-L77

* Introduce a `L10n`-method to translate an element once, and use that in `PDFLayerViewer`

Currently we *manually* fetch the "pdfjs-additional-layers" string and update the DOM-element, which was needed since we want to avoid triggering a bunch of otherwise unnecessary translation when appending the entire layer-tree to the DOM.
By introducing a new helper method in the `L10n`-class we can avoid this, and instead use a "data-l10n-id" attribute on the element (as most other viewer code does nowadays).

* Support an odd number of digits in hexadecimal strings (issue 18645)

See https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#G6.1840792

* Use `HTMLCanvasElement.toBlob()` unconditionally in `PDFPrintService`

The fallback is very old code, and according to the MDN compatibility data `HTMLCanvasElement.toBlob()` should be available in all browsers that we support now: https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/toBlob#browser_compatibility

* Revoke the blob-URLs used during printing in `PDFPrintService`

* Update dependencies to the most recent versions

* Fix vulnerability in the `micromatch` dependency

This patch is generated automatically using `npm audit fix` and fixes
CVE-2024-4067 (see GHSA-952p-6rrq-rcjv),
bringing the vulnerability count back to zero.

* Update translations to the most recent versions

* Utilize Fluent to format numbers and dates in `PDFDocumentProperties`

The `PDFDocumentProperties` dialog may not display correctly formatted data, especially in the GENERIC viewer, since it's using native methods[1] that depend on the *browser* locale instead of the viewer locale as intended.
At the time when this dialog was introduced that was probably all we could easily do, but with Fluent we're able to improve things since it's got built-in support for formatting numbers and dates. Not only does this simplify the JavaScript code, but it also gives the localizer more fine-grained control of the desired output.

Please find additional information here:
 - https://projectfluent.org/fluent/guide/builtins.html
 - https://projectfluent.org/fluent/guide/functions.html

---

[1] `toLocaleString`, `toLocaleDateString`, and `toLocaleTimeString`.

* Utilize Fluent to format dates in the `AnnotationLayer`

The `AnnotationLayer` may not display correctly formatted data in PopupAnnotations, especially in the GENERIC viewer, since it's using native methods[1] that depend on the *browser* locale instead of the viewer locale as intended.
With Fluent we're able to improve things since it's got built-in support for formatting dates. Not only does this simplify the JavaScript code slightly, but it also gives the localizer more fine-grained control of the desired output.

Please find additional information here:
 - https://projectfluent.org/fluent/guide/builtins.html
 - https://projectfluent.org/fluent/guide/functions.html

---

[1] `toLocaleDateString`, and `toLocaleTimeString`.

* Upgrade Puppeteer to version 23.1.1

This major version contains three breaking changes that impact us:

- The `product` option has been renamed to the more suitable `browser`.
- The `page.screenshot()` API returns a `Uint8Array` instead of a
  `Buffer`, but since `pngjs` requires a `Buffer` object we need to do
  the conversion using `Buffer.from()` before passing data to `pngjs`.
- The browser configuration should be set using a configuration file
  instead of environment variables. Note that as a bonus this allows us
  to remove the `cross-env` dependency since that was only used to set
  the Puppeteer environment variable equally for all operating systems.

For more information about the changes between the old and new Puppeteer
versions refer to https://github.com/puppeteer/puppeteer/releases.

* [Editor] Add a missing parameter in the telemetry for the new alt text flow (bug 1914480)

* [CRX] Remove obsolete manifest features

In preparation for migrating the Chrome extension to Manifest Version 3,
this patch removes parts of the manifest that are obsolete and/or
unsupported in MV3.

Remove ftp mentions: ftp was dropped from 6 years ago, in Chrome 59.

Remove file_browser_handlers: does not work on Chrome OS according to
mozilla#14161 . MV3 replacement needs
a different manifest key and logic, which will be added later.

Remove content_security_policy: MV3 does not support unsafe-eval CSP,
and PDF.js does not require it. This may result in a mild performance
degradation in PDFs that contain PostScript.

Remove page_action logic: When this logic was originally introduced,
Chrome showed page action buttons in the address bar, which enabled
the extension to display the button on specific PDF viewer tabs only.
In Chrome 49 (2016), pageActions were dropped from the address bar and
put in an UI area that is hidden by default. The user can pin the button
to be unconditionally visible on the toolbar, which is distracting.
Because the UX is no longer serving the original needs, this patch
removes page_action from the manifest.

* [CRX] Remove obsolete extension API calls

These work arounds are no longer relevant to the latest Chrome versions.

* [Editor] Utilize Fluent "better" when localizing the resizer DOM-elements

Currently we manually localize and update the DOM-elements of the editor-resizers, and it seems nicer to utilize Fluent for that task.
This can be achieved by updating the l10n-strings to directly target the `aria-label` and then just setting the `data-l10n-id` on the DOM-elements.

* [Editor] Define the "pdfjs-editor-new-alt-text-generated-alt-text-with-disclaimer" string once

This l10n-string is being re-defined once for every editor, i.e. currently four times, which seems unnecessary.
To avoid having to check if this l10n-string exists first, we can utilize rest parameters to move it into the `AnnotationEditor._l10nPromise` Map-definition instead.

* Use the URL global instead of the deprecated url.parse

The Node.js url.parse API (https://nodejs.org/api/url.html#urlparseurlstring-parsequerystring-slashesdenotehost)
is deprecated because it's prone to security issues (to the point that Node.js doesn't even publish CVEs for it anymore).

The official reccomendation is to instead use the global URL constructor, available both in Node.js and in browsers.
Node.js filesystem APIs accept URL objects as parameter, so this also avoids a few URL->filepath conversions.

* [Editor] Fix few telemetry issues with the new alt text flow (bug 1915434)

* Simplify the `PDFDocumentProperties.#updateUI` method

We can remove the `reset`-parameter, since it's redundant, given that it's only used after `PDFDocumentProperties.#reset` has been invoked which means that `this.#fieldData === null` which is equivalent to resetting.
Also, we don't need to have two separate loops in order to update the UI in this method.

Finally, inline the `DEFAULT_FIELD_CONTENT` constant now that it's only used once.

* Use "full" localization ids in the `PDFDocumentProperties` class

It was recently brought to my attention that using partial or generated localization ids is bad for maintainability, which means that PR 18636 wasn't the correct thing to do.
However, just reverting that one doesn't really fix the problems which is why this patch updates *every* l10n-id in the `PDFDocumentProperties` class (but doesn't touch any `viewer.ftl`-files). Obviously this leads to more verbose code, but that cannot really be helped.

* Move the metric-locale check into `PDFDocumentProperties.#parsePageSize`

With the introduction of Fluent the `getLanguage`-method became synchronous, hence it no longer seems necessary to do the metric-locale check eagerly in the constructor and it can instead be "delayed" until actually needed.

* Add a helper function for http/https requests in `src/display/node_stream.js`

Currently we repeat virtually the same http/https request code in two different classes in this file, which seems unnecessary and it leads to more code.

* [Editor] Update the loading icon when wait for ML to take into account prefered-reduced-motion setting

 * The icon has been updated in https://bugzilla.mozilla.org/show_bug.cgi?id=1908920;
 * Add a linter to check that a svg element doesn't have fill="context-fill" attribute.

* Update l10n files

Given the recent l10n-id changes, let's do one more update before the next release to avoid "broken" translations.

* Shorten the code that inits `AnnotationEditorLayerBuilder` in the `web/pdf_page_view.js` file

This code can now utilize logical OR assignment, which is ever so slightly shorter.

* Use "full" localization ids throughout the code-base

It was recently brought to my attention that using partial or generated localization ids is bad for maintainability, hence this patch goes through the code-base and replaces any such occurrences.

* Add an option (i.e. --noFirefox) to only use Chrome when running tests

* Bump the stable version in `pdfjs.config`

* Update dependencies to the most recent versions

* Update translations to the most recent versions

* Use `Headers` consistently in the different `IPDFStream` implementations

The `Headers` functionality is now available in all browsers/environments that we support, which allows us to consolidate and simplify how the `httpHeaders` API-option is handled; see https://developer.mozilla.org/en-US/docs/Web/API/Headers#browser_compatibility

Also, simplifies the old `NetworkManager`-constructor a little bit.

* [Editor] Make highlight annotations editable (bug 1883884)

The goal of this patch is to be able to edit existing highlight annotations.

* [Editor] Make the focused stamp annotation more clear from a screen reader point of view (bug 1911994)

It has been tested with Voice Over (mac) and with NVDA (windows).

When an added stamp annotation is focused, the screen reader will announce
that it's a figure containing a graphic with the added alt-text.

* [Editor] Make the stamp annotations alt text readable by either VO or NVDA (bug 1912001)

* [Editor] Remove the disclaimer when the user is editing the alt-text in the new alt-text modal (bug 1911764)

* Improve the `StructTreeLayerBuilder.render` method

In hindsight it occurred to me that there's a couple of smaller issues with this method after it's made asynchronous (in PR 18658).

 - If the `render`-method is invoked back-to-back the existing caching doesn't guarantee that re-parsing won't occur, which we can address by introducing a new (private) promise.

 - If there's any errors fetching and/or parsing the structTree-data, we'd attempt to parse it again on re-rendering despite that being pointless.

* In the autoprint integration test, resolve the promise on 'afterprint' event

* [Editor] Avoid to throw when an highlight annotation is resetted

* Make tagged images visible for screen readers (bug 1708040)

The idea is to insert a span in the text layer with an aria-role set to img
and use the bounding box provided by the attribute field in the tag dict in
order to have non-null dimensions for the image to make it "visible".

* Use response-`Headers` in the different `IPDFStream` implementations

Given that the `Headers` functionality is now available in all browsers/environments that we support, [see MDN](https://developer.mozilla.org/en-US/docs/Web/API/Headers#browser_compatibility), we can utilize "proper" `Headers` in the helper functions that are used to parse the response.

* Use the `_headersCapability` name in `PDFNetworkStreamFullRequestReader`

This improves consistency in the code-base since the implementations with the Fetch API respectively Node.js uses that name.

* Use "full" localization ids in the `ColorPicker` class (PR 18674 follow-up)

Apparently I missed these in PR 18674.

* Use "full" localization ids in the `AltText` class (PR 18674 follow-up)

Apparently I missed these in PR 18674.

* Avoid to have a white line around the canvas

The canvas must have the same dims as the page in order to avoid to see the page
background.

* Prevent `.visibleMediumView` from overriding already hidden elements (issue 18704, PR 18596 follow-up)

*Please note:* As a general rule we probably don't need to fix things affecting *custom* implementations of the default viewer, but in this case a "targeted" fix seem possible that shouldn't (famous last words) break anything else.

Ensure that the `.visibleMediumView` class, which is used when the viewer becomes narrow, cannot override the `display`-value for DOM elements that are being *explicitly* hidden either through the associated attribute or class.

* [CRX] Use DNR instead of webRequest in preserve-referer

webRequestBlocking is unavailable in MV3. Non-blocking webRequest can
still be used to detect the Referer, but we have to use
declarativeNetRequest to change the Referer header as needed.

* [CRX] Drop chrome_style from manifest.json

MV3 does not support chrome_style in options_ui. Remove it and replace
it with the minimal amount of styles that still has some spacing around
the individual settings for readability.

* [CRX] Replace deprecated extension.getURL with runtime.getURL

* [CRX] Remove restoretab.js logic

restoretab.js was added in mozilla#6233
with the purpose of restoring lost tabs when Chrome closes all extension
tabs when it reloads the extension. This forced reload can happen when
the user toggles the "Allow access to file URLs" option.

This logic does not work any more, and since the use of localStorage is
a blocker in migrating to MV3, this patch just drops the logic.

* [CRX] Replace localStorage in telemetry logic

In MV3, the background script is a service worker. localStorage is not
supported in service workers. Switch to storage.local instead.

Data migration is not implemented because it is not needed due to the
privacy-friendly design of the telemetry: In practice the data is reset
about every 4 weeks, when the major version of Chrome is updated.

* [CRX] Set manifest_version to 3

- Replace DOM-based pdfHandler.html (background page) with background.js
  (extension service worker).

- Adjust logic of background scripts to account for the fact that the
  scripts can execute repeatedly during a browser session. Primarily,
  register relevant extension event handlers at the top level and use
  in-memory storage.session API to keep track of initialization state.

- Extension URL router: replace blocking webRequest with the service
  worker-specific "fetch" event.

- PDF detection: replace blocking webRequest with declarativeNetRequest.
  This requires Chrome 128+. The next commit will add a fallback for
  earlier Chrome versions.

* [CRX] Add fallback PDF detection for Chrome 127-

* [CRX] Add work-around for Chrome crash

* [CRX] Bump minimum_chrome_version from 88 to 103

The minimum required version is Chrome 103 because wildcard support for
web_accessible_resources[].extension_id was introduced in 103, in
https://chromium.googlesource.com/chromium/src/+/c9caeb1a080f165f48d2a90559aa35d22965b440

A way to broaden compatibility is to drop that key. By doing so, the
minimum required Chrome version is then 96, because of the
chrome.storage.session API (and declarativeNetRequestWithHostAccess).

Since gulpfile.js already defines "Chrome >= 98" and there is no real
point in expanding support to older versions, I'm just setting the
minimum version to 103.

* [CRX] Detect availability of DNR responseHeaders before use

Critical fix for old but recent Chrome versions; all requests are
otherwise redirected to the viewer.

* Update dependencies to the most recent versions

* Update translations to the most recent versions

* [JS] Let AFSpecial_KeystrokeEx match a format without 'decoration' (bug 1916714)

It'll let the user enter 1234567 instead of 123-4567 for example.
It works like this in other pdf viewers.

* [Editor] Avoid to have a stamp editor resizing itself

Fixes mozilla#18715.

* [Editor] Avoid to have the ML disclaimer when the ML engine isn't ready (bug 1917543)

* Remove ununsed static `HighlightEditor._l10nPromise` field

* [CRX] Fix feature detect of DNR responseHeaders option

Fix regression from mozilla#18711. `urlFilter` is a key of `condition`, not of
the `rule`. Consequently, the feature detection method failed to detect
the availability of the feature in Chrome 128+.

* Consume any pending path before drawing an annotation

Fixes mozilla#18058.

* Consider foo-\nBar as a compound word

Fixes mozilla#18693.

* Ensure that textLayers can be rendered in parallel, without interfering with each other

Note that the textContent is returned in "chunks" from the API, through the use of `ReadableStream`s, and on the main-thread we're (normally) using just one temporary canvas in order to measure the size of the textLayer `span`s; see the [`#layout`](https://github.com/mozilla/pdf.js/blob/5b4c2fe1a845169ac2b4f8f6335337c434077637/src/display/text_layer.js#L396-L428) method.

*Order of events, for parallel textLayer rendering:*
 1. Call [`render`](https://github.com/mozilla/pdf.js/blob/5b4c2fe1a845169ac2b4f8f6335337c434077637/src/display/text_layer.js#L155-L177) of the textLayer for page A.
 2. Immediately call `render` of the textLayer for page B.
 3. The first text-chunk for pageA arrives, and it's parsed/layout which means updating the cached [fontSize/fontFamily](https://github.com/mozilla/pdf.js/blob/5b4c2fe1a845169ac2b4f8f6335337c434077637/src/display/text_layer.js#L409-L413) for the textLayer of page A.
 4. The first text-chunk for pageB arrives, which means updating the cached fontSize/fontFamily *for the textLayer of page B* since this data is unique to each `TextLayer`-instance.
 5. The second text-chunk for pageA arrives, and we don't update the canvas-font since the cached fontSize/fontFamily still apply from step 3 above.

Where this potentially breaks down is between the last steps, since we're using just one temporary canvas for all measurements but have *individual* fontSize/fontFamily caches for each textLayer.
Hence it's possible that the canvas-font has actually changed, despite the cached values suggesting otherwise, and to address this we instead cache the fontSize/fontFamily globally through a new (static) helper method.

*Note:* Includes a basic unit-test, using dummy text-content, which fails on `master` and passes with this patch.

Finally, pun intended, ensure that temporary textLayer-data is cleared *before* the `render`-promise resolves to avoid any intermittent problems in the unit-tests.

* [JS] Correctly format floating numbers when they're close to an integer (bug 1918115)

* Bump dset from 3.1.3 to 3.1.4

Bumps [dset](https://github.com/lukeed/dset) from 3.1.3 to 3.1.4.
- [Release notes](https://github.com/lukeed/dset/releases)
- [Commits](lukeed/dset@v3.1.3...v3.1.4)

---
updated-dependencies:
- dependency-name: dset
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* [Editor] Avoid an extra new line when serializing a FreeText annotation (bug 1897909)

The extra new line is added because of using shift+enter to add a new line
in the text editor.

* Fluent: use explicit NUMBER() in plural variants

* Use the "pageColorsBackground" option as background-color for non-loaded pages (issue 18680)

This should prevent non-loaded pages from flashing white in e.g. high contrast mode (HCM).

* [CRX] Enable WebAssembly in Chrome extension

After the removal of 'unsafe-eval' CSP in mozilla#18651, WebAssembly fails to
load, resulting in issues such as seen in mozilla#18457.

Manifest Version 3 does not allow 'unsafe-eval', does accept the more
specific 'wasm-unsafe-eval' as of Chrome 103. Note that manifest.json
already sets minimum_chrome_version to 103.

This patch also adds `object-src 'self'` because it was required until
Chrome 110. As of Chrome 111, the default is `object-src 'self'` and
`object-src` is no longer required. We could drop `object-src` in the
future, but for now we need to include it to support Chrome 103 - 110.

* [Editor] Take into account the device pixel ratio when drawing an added image

Fixes mozilla#18626.

* Simplify the code that picks the appropriate NetworkStream-implementation

This code is quite old and has been moved/re-factored a few times over the years, however we can simplify this even further since we don't actually need a function to determine what NetworkStream-implementation to use.

* Rename the toolbar buttons in order to free their current names

which can then be used for their future parent container.
This patch aims to simplify a bit the patch in mozilla#18385.

* Read a signed integer when using PUSHW in sanitizing a font (bug 1919513)

* Fix the rendering of the different separators we've in the UI

Currently, the css for a separator is something like { height: 1px; background-color: ... }.
But its rendering depends on its position on the screen.
So instead of setting the height to 1px, we just set something like { border-top: 1px solid ...; },
this way the final rendering is exactly the same for all the separators.

* Link to the new issue templates from the README (PR 18308 follow-up)

* Ignore non-existing /Shading resources during parsing (issue 18765)

* Use `fs/promises` in the Node.js unit-tests (PR 17714 follow-up)

This is available in all Node.js versions that we currently support, and using it allows us to remove callback-functions; please see https://nodejs.org/docs/latest-v18.x/api/fs.html#promises-api

* Add more optional chaining in the `test/` directory

* Update dependencies to the most recent versions

* Update translations to the most recent versions

* [api-minor] Pass `CanvasFactory`/`FilterFactory`, rather than instances, to `getDocument`

This unifies the various factory-options, since it's consistent with `CMapReaderFactory`/`StandardFontDataFactory`, and ensures that any needed parameters will always be consistently provided when creating `CanvasFactory`/`FilterFactory`-instances.

As shown in the modified example this may simplify some custom implementations, since we now provide the ability to access the `CanvasFactory`-instance used with a particular `getDocument`-invocation.

* Bump library version to `4.7`

* [Editor] Don't show the ml toggle button when the ml is disabled (bug 1920515)

* Update `typescript` to version 5.6.2

This is unblocked because in commit bb302dd the default value for the
constructor got removed, which apparently confused TypeScript before.

Fixes mozilla#18770.

* Refactor the toolbar html & css to improve its overall accessibility (bug 1171799, bug 1855695)

The first goal of this patch was to remove the tabindex because it helps
to improve overall a11y. That led to move some html elements associated
with the buttons which helped to position these elements relatively to their
buttons.
Consequently it was easy to change the toolbar height (configurable in Firefox
with the pref browser.uidensity): it's the second goal of this patch.
For a11y reasons we want to be able to change the height of the toolbar to make
the buttons larger.

* Remove useless css variable --editor-toolbar-base-offset

* Remove duplicated --toolbar-height definition in the css

* Correctly compute the font size when printing a text field with an auto font size (bug 1917734)

* Increase the size of the toolbar depending on the uidensity (bug 1171799)

* Add Calixte to the list of authors

* Remove the unused `splitToolbarButton` CSS class (PR 18385 follow-up)

* Unify separate `#toolbarContainer`-blocks in the CSS (PR 18385 follow-up)

* Slightly re-factor the `transportFactory` initialization in `getDocument`

Given that the `WorkerTransport`-constructor will access all possible factory-instances, let's ensure that the `transportFactory`-object always has a consistent shape regardless of other options.

Also, since `useWorkerFetch` is always true in MOZCENTRAL builds we never need to create `CMapReaderFactory`/`StandardFontDataFactory`-instances there.

* Fix the rendering of tiling pattern when the steps are lower than the tile dimensions (bug 1837738)

It fixes mozilla#16038.

The idea is to create a pattern having the steps for dimensions and then draw
the base tile and the different overlapping parts on it.

* Remove `trackTransform` arguments from `CachedCanvases.getCanvas`-calls (PR 15281 follow-up)

This became unused in PR 15281, however that patch clearly missed some occurrences; sorry about that!

* Add basic support for non-embedded GillSansMT fonts (issue 18801)

Given the following excerpt from the [Wikipedia article](https://en.wikipedia.org/wiki/Gill_Sans), mapping this to Helvetica should hopefully be fine:

> It has been described as "the British Helvetica" because of its lasting popularity in British design.

* [Editor] Avoid to have a selected stamp annotation on top of the secondary toolbar (bug 1911980)

* [Editor] When deleting an annotation with popup, then delete the popup too

* Ensure that the CursorTools-buttons are disabled e.g. during editing (PR 15522 follow-up)

We disable any non-default CursorTool in PresentationMode and during Editing, since they don't make sense there and to prevent problems such as e.g. [bug 1792422](https://bugzilla.mozilla.org/show_bug.cgi?id=1792422).
Hence it seems like a good idea to *also* disable the relevant SecondaryToolbar-buttons, to avoid the user being surprised that the CursorTools-buttons do nothing if clicked.

* [api-minor] Update the minimum supported Google Chrome version to 103

This patch updates the minimum supported browsers as follows:
 - Google Chrome 103[1], which was released on 2022-06-21; see https://chromereleases.googleblog.com/2022/06/stable-channel-update-for-desktop_21.html

Note that nowadays we usually try, where feasible and possible, to support browsers that are about two years old. By limiting support to only "recent" browsers we reduce the risk of holding back improvements of the *built-in* Firefox PDF Viewer, and also (significantly) reduce the maintenance/support burden for the PDF.js contributors.

*Please note:* As always, the minimum supported browser version assumes that a `legacy`-build of the PDF.js library is being used; see https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#faq-support

---

[1] This is consistent with the minimum supported version in the recently updated Google Chrome addon.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Tim van der Meij <[email protected]>
Co-authored-by: Calixte Denizet <[email protected]>
Co-authored-by: Jonas Jenwald <[email protected]>
Co-authored-by: calixteman <[email protected]>
Co-authored-by: Nicolò Ribaudo <[email protected]>
Co-authored-by: Richard Smith (smir) <[email protected]>
Co-authored-by: Rob Wu <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Francesco Lodolo <[email protected]>
Co-authored-by: Sylvestre Ledru <[email protected]>
Co-authored-by: Marco Castelluccio <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants