This improves performance of `issue19319.pdf` even more, and locally the rendering time of the second page goes from ~300 ms to ~250 ms, since we avoid sorting a bunch of duplicate entries.
Render `/Subtype /RichMedia` annotations so embedded video and audio can
be played in the viewer.
The core layer parses the `RichMediaContent` dictionary to locate the
primary playable asset and its MIME type. The display layer overlays a
play button on the annotation's poster; clicking it swaps in a
`<video>`/`<audio>` element backed by a `blob:` URL. Presentation mode
lets events reach the media controls instead of advancing the page.
It fixes#2787.
Some producers wrongly set the crypt filter CFM to AESV2 for V=5 documents;
per the spec these must be decrypted with AES256 using the file encryption key directly.
- Use `FileSpec.pickPlatformItem` when getting the fileStream, to ensure that /EF-entries are handled in a consistent way across the code-base.
- Combine a couple of the data-validation steps, to reduce a tiny bit of duplication. Also, use the `isDict` helper a little more.
- Finally, avoid using a temporary variable when returning data in the `Page.prototype.getStructTree` method.
**Steps to reproduce:**
1. Open the viewer.
2. Show the sidebar, and switch to the "Pages" view if necessary.
3. Click on the "Add file" button.
4. Choose a password-protected PDF, e.g. the `issue6010_1.pdf` file, via the "File Upload" dialog opened by the browser.
5. Enter the password, i.e. `abc`, and press the <kbd>Enter</kbd> key.
**Expected result:**
That the new PDF document is merged into the existing one, without UI side-effects.
**Actual result:**
Merging works, *however* the "File Upload" dialog is re-opened.
---
It seems that when the passwordPrompt dialog closes, the <kbd>Enter</kbd> key press (from the input) is forwarded to the previously focused element which naturally is the "Add file" button.
*Note:* This doesn't seem (easily) possible to test, since the integration-tests directly populate the `viewsManagerAddFilePicker` and doesn't actually "click" on the `viewsManagerAddFileButton` first.
Unless the entire document has been loaded, the dictionary lookups in `parseMarkedContentProps` may throw `MissingDataException`s and in that case we need to re-parse the current Annotation rather than ignoring the optionalContent.
According to the coverage data this method is unused, see e20c810dd4/blob/src/display/editor/tools.js (L2552), and searching through the entire code-base reveals no call-site invoking an `isSelected` method.
This branch isn't covered by any tests, and looking at the two existing call-sites we only ever pass in a `CanvasRenderingContext2D` interface to this function.
Based on the git history this branch was added in PR 3312, however as far as I can tell it doesn't actually appear to have been necessary even back then!?
Annotation-local attachments (those not in the catalog `/Names` tree) were
resolved through a dictionary cache that `Catalog.cleanup` clears, so their
content became unreachable once the idle cleanup had run.
Encode the reference of the embedded content in the attachment id and re-fetch
it from the xref on demand instead of caching the dictionary, so the content
stays reachable without anything having to survive cleanup.
It fixes a regression introduced by #21351.
Introduces a 'supportsDownloading' browser option (defaulting to false)
that lets embedders disable the save/download paths entirely. When
disabled:
- the toolbar and secondary-toolbar download buttons are hidden;
- PDFViewerApplication.{download,save,downloadOrSave} and the
"beforeunload" save prompt bail out early;
- the BaseDownloadManager helpers (download, downloadData,
openOrDownloadData) and the Firefox/generic _triggerDownload
implementations no-op.
Currently we only check LinkAnnotations with URLs, but completely ignore e.g. internal destinations, named actions, attachments, SetOCGState actions, JS actions, and ResetForm actions when testing if inferred links overlap any existing annotation.
This seems conceptually wrong, since it may easily break intended functionality by overlaying the *correct* DOM element with an inferred link (as was the case in issue 21458).
Firefox 154 no longer walks the prototype chain in the `Error.stack`
getter, so `BaseException`-derived instances return an empty string
rather than the prototype `Error`'s stack (see bug 1946559).
The lower BlueScale clamp from #21343 guarded foundry fonts via
`blueScale < DEFAULT_BLUE_SCALE`, but that lets a near-default value
(e.g. 0.037) with small zones get raised to `0.5 / maxZoneHeight`. On
macOS' Core Text rasterizer this collapses the overshooting glyphs, so
most text disappears (not reproducible on Linux/Windows).
A non-isolated transparency group must blend with its backdrop, but a group
containing a blend mode was forced onto a transparent intermediate canvas;
e.g. a /Multiply highlight then painted opaquely over the text behind it,
making that text invisible.
This makes setting a few Dictionary entries a little bit shorter, which shouldn't hurt.
(Part of this code isn't fully covered by tests, so it improves overall code coverage as well.)
For checkbox and radio button fields, the export value can differ from the
appearance-state name: the field's inheritable `Opt` array holds the real
export values (used for non-Latin text, or values shared between buttons).
We previously exposed the appearance-state name as the export value.
Given that the Promise returned by the `PDFDocument.prototype.getPage` method *always* resolves with a `Page` instance, checking that the page is defined isn't necessary; note 3a09329113/src/core/document.js (L1702-L1723)
Furthermore the `Page.prototype.collectAnnotationsByType` method is asynchronous, and thus it always returns a Promise, hence it's "pointless" to fallback to return an empty Array.
This is a major version bump, but the changelog at
https://github.com/sindresorhus/eslint-plugin-unicorn/releases/tag/v66.0.0
doesn't indicate any breaking changes that should impact us.
However, improved rules do require a small number of changes here:
- The `prefer-array-some` rule no longer reports a false positive after
https://github.com/sindresorhus/eslint-plugin-unicorn/issues/3198 got
fixed, so the ignore line that was added in commit 68a5ec1 is removed.
- The `prefer-ternary` rule triggers on more cases now, in particular
`let` declarations with `if` reassignments, so a number of changes are
made to make it pass again.
- The `prefer-at` rule triggers on more cases now, in particular
`substring` calls that just extract a single character, so one change
is made to make it pass again.
We currently download Chrome/Firefox immediately on `npm install`
invocations because Puppeteer's postinstall script does that by default.
However, this is wasteful if the user/workflow doesn't actually need to
run Puppeteer or its browsers, for example in GitHub Actions workflows
that do linting, static analysis or other tasks like updating locales or
publishing artifacts.
This commit therefore makes sure no browser binaries get pulled in by
default anymore, and defers doing that until it's actually necessary,
which is when we want to start the browsers in the `startBrowsers`
function of `test.mjs`.
Locally this brings the `npm install` runtime down from 8.998 to 1.800
seconds, and as a bonus it results in better log output too because it
now shows which browser versions were used in the run (whereas
previously with `npm install` this information was not sent to stdout).
We currently use the pinned version of Chrome as hardcoded in the
Puppeteer release, which is based on the version of Chrome that was
deemed stable at the time of the Puppeteer release.
However, this is not ideal because it means that Chrome updates are
strongly tied to Puppeteer releases, so if Puppeteer releases are slow
we could be missing out on e.g. (security) patches being applied on the
stable channel. It's also not consistent with Firefox where we don't
use a hardcoded pinned version either.
This commit therefore configures Puppeteer to use (resolve) the most
recent stable version of Chrome at the time of the installation so that
determining the browser version to use is fully decoupled from the
Puppeteer release we're running.
The effect of this change can be seen in the output of running
`npx puppeteer browsers list`:
Before:
`chrome@149.0.7827.22 (linux) <path>`
After (note the slightly newer version):
`chrome@149.0.7827.115 (linux)` <path>`
Nowadays Chrome has a built-in (new) headless mode in the regular
binary, but before that time there was an old headless mode that was
essentially a separate binary [1]. We don't use the latter, but it turns
out that Puppeteer downloads it automatically if it's not explicitly
skipped, which is wasteful because it costs extra time and resources for
each `npm install` invocation.
This commit therefore skips downloading Chrome headless shell explictly,
which results in the local runtime of `npm install` going from 10.125
seconds to 8.998 seconds (which can't hurt in e.g. GitHub Actions).
[1] https://developer.chrome.com/blog/chrome-headless-shell.
The "Merge PDF" integration-tests will (indirectly) invoke `PDFViewerApplication.open` as part of loading the new PDF document, which will end up creating a new `PDFWorker` instance.
Currently worker coverage is only collected at the end of each integration-test, which means that in these cases we miss the coverage data from any "previous" workers.
Currently when opening a PDF document the following code is used, where `checkFirstPage`/`checkLastPage` helps detect XRef corruption; note 86a18bd5fe/src/core/worker.js (L167-L176)
However when merging a PDF into an existing document the parsing is only "partial"; note 86a18bd5fe/src/core/worker.js (L632-L634)
It seems a little strange to not support corrupt PDFs in a consistent manner in the code-base, hence this patch adds a new `BasePdfManager` helper that handles all the relevant parsing/checking and re-uses that when merging PDFs.
We use the generic `page.mouse.move(x, y, { steps }` API, but that purely
performs the mouse move steps without having knowledge about if/how the
application handles any events caused by it, so it doesn't wait for the
sidebar to render before moving on. This causes intermittent failures if
the sidebar didn't get enough time to render before the next mouse move
is initiated (which can happen in slower environments).
This commit fixes the issue by doing the mouse move steps ourselves and
by waiting for a browser trip between each of them to make sure that the
sidebar got a chance to render.
Fixes#21447.
Relates to #21044 / #21045 / 24e5377.
This is a major version bump, but the changelog at
https://github.com/sindresorhus/eslint-plugin-unicorn/releases/tag/v65.0.0
doesn't indicate any breaking changes that should impact us.
However, there is one false positive, possibly introduced by patch
https://github.com/sindresorhus/eslint-plugin-unicorn/pull/3028:
```
src/core/xfa/factory.js
104:54 error Prefer `.some(…)` over `.find(…)` unicorn/prefer-array-some
```
This is incorrect because on this line we're not dealing with an array
but with a `FontFinder` instance instead (and that doesn't have a
`.some()` method), so we ignore the rule for this line.
This removes a little bit of code duplication, which only exist since the `src/display/canvas.js` code pre-dates the helper function by many years.
Note: Given that `OffscreenCanvas` is enabled by default there's currently not a lot of test coverage for this code-path, hence the added browser-test.
This functionality was added specifically for the standalone image-decoders, and by utilizing the pre-processor we can reduce the amount of "unnecessary" code in the regular builds.
Also, shorten a few loop variables a little bit since less code is always good.
The idea is to generate two operator lists for the Yes/Off states and render them on a separate canvas.
These canvases are then attached the annotation and we modify their display depending on the input state.
It fixes#18021.
This basically extends PR 9549 to the fallback `getAllPageDicts` method, which didn't exist at the time, in order to support more cases of corrupt PDF documents.
Currently the same code, for requesting the password from the main-thread, is now duplicated three times.
Let's avoid that by moving the new `getPassword` helper function, added in the previous commit, and re-use that everywhere instead.
These unit-tests used a PDF that prompted for password on document load, which meant that the on-demand password handling wasn't actually being tested as intended.
Updating the unit-tests also caused the "re-prompts for encrypted attachments after incorrect passwords" test to fail, since the `INCORRECT_PASSWORD` password reason was being accidentally "swallowed" in the worker-thread.
This removes a little bit of code duplication, which only exist since the `src/display/canvas.js` code pre-dates the helper function by many years.
*Note:* Given that `OffscreenCanvas` is enabled by default there's currently not a lot of test coverage for this code-path, hence the added browser-test.
This method has been completely unused for many years, possibly as far back as PR 8030, hence we can avoid shipping a little bit of dead code.
*Note:* If there's any third-party code depending on it, updating it ought to be as simple as changing
```javascript
const r = viewport.convertToViewportRectangle(rect);
```
into
```javascript
const r = [
...viewport.convertToViewportPoint(rect[0], rect[1]),
...viewport.convertToViewportPoint(rect[2], rect[3])
];
```
Center each glyph within its comb cell instead of left-aligning it,
both in the HTML annotation layer and in the printed/saved appearance,
to match Acrobat. Cell width is now the single source of truth via the
--comb-width CSS variable, and field text-alignment (center/right) is
applied as a whole-cell --comb-offset that stays in sync on input,
blur, resetform and updatefromsandbox. The field no longer grows on
focus; trailing letter-spacing is clipped and cell dividers are drawn
on focus.
The print service injected the per-PDF `@page { size }` rule as an inline
<style> element, which required 'unsafe-inline' on style-src-elem.
Inject it through a constructable CSSStyleSheet attached to
document.adoptedStyleSheets instead. Constructable stylesheets aren't
subject to style-src's inline restrictions in browsers.
Compared to regular `Object`s there's a number of advantages to using `Map`s:
- They support "proper" iteration.
- They have a simple way to check for the existence of data.
- They have a simple/efficient way to check the number of elements.
If this functionality was added today, I cannot imagine that we'd choose an `Object` for this sort of data.
Furthermore, in PR 21351 the data returned by `getAttachments` changed slightly and third-party users will need to update their code anyway (hence why `[api-minor]` should be fine here).
When a read-only field (which has its own canvas) is updated by the
sandbox while its page isn't rendered, showElementAndHideCanvas
isn't called, so once the page is finally rendered the field still
shows its outdated canvas instead of the new value.
Replace the imperative canvas/element toggling with a `sandboxModified`
class, set from the annotation storage both at render time and on
sandbox updates, and let the CSS show the element and hide the canvas.
This forces `color: transparent` on selections.
In latest Firefox Nightly, this is no longer inherited on
`::selection` from the normal element.
References: <753827d749>
Given that the `external/` folder contains various imported code/resources, all of which could affect functionality and/or rendering, it seems safest to simply run browser/font/integration/unit tests whenever any part of that folder is touched.
Looking at the coverage data there are cases where we attempt to insert inferred link-annotations *before* the annotationLayer has rendered, see [here](2348365874/blob/web/annotation_layer_builder.js (L246)), which shouldn't happen and why that's treated as an Error.
This is most likely caused by the asynchronicity of all the relevant code, since the `Autolinker` functionality can only be invoked after both the annotationLayer *and* the textLayer have finished rendering.
Given that those operations are asynchronous, by the time that they complete it's possible that the annotationLayer (and also the textLayer) has been replaced by a new instance. In that case we might thus attempt to inject inferred link-annotations before the "new" annotationLayer has rendered.
To avoid this intermittent issue, we now ensure that the annotationLayer and textLayer haven't changed between those layers rendering and the `Autolinker` functionality being invoked. (If they did change, then a future `render` call will trigger the inferred link-annotations handling).
By replacing the early return with optional chaining, a pattern that we already use in lots of places, the code becomes a tiny bit shorter and more importantly the code coverage for this file becomes 100 percent.
Given that the cursor tools are managed via the `PDFCursorTools` class, of which the `GrabToPan` instance is essentially a (semi) private implementation detail, the `GrabToPan.prototype.toggle` method is completely unused and can thus be removed.
This format is obviously not very efficient however it's been supported since "forever" and there's even examples using, hence it seems like a good idea to actually test this.
Given that any incoming data is already being ignored after loading has been aborted, it seems reasonable to reject the stream-capability to avoid it remaining in a pending state indefinitely.
*Note:* This is something that I noticed while looking at the coverage data, since the `ChunkedStreamManager.prototype.onError` method is not used and from a brief look at the history of the code it never appears to have been used either.
Obviously it's not yet possible to just migrate `gulp browsertest` to GitHub Actions, however it's already possible to at least run the browser tests there which allows collection of more code coverage data.
This should thus give us more realistic coverage numbers, since currently there's many `src/` files that have very low code coverage.
By taking advantage of the fact that the GitHub Actions runners provide multiple cores, these tests are also fairly fast:
- The ubuntu-latest/firefox job complete in ~9 minutes.
Normally entire PDFs are encrypted (or not).
But it is also possible to only encrypt attachments.
It is then also possible to *only* prompt for a password when the user opens
them.
In the existing flow, prompting for passwords happens because things are decrypted.
A specific error is thrown, caught, and the user is prompted.
To keep this flow working, this PR changes to decrypting attachments on demand,
instead of eagerly.
This sounds logical: to not read attachments on startup.
I’ve extensively tested this, not only with regular attachments, but also with outline items
and attachments in annotations.
This PR builds on GH-21234.
It’s an alternative to the naïve GH-20732.
Closes GH-20049.
Prior to PR 20321 the annotationLayer was hidden when there was no regular annotations on the page, which meant that if there were any inferred links (from the textLayer) the annotationLayer needed to be made visible but in such a way that it wouldn't override an explicit `hide`-call from the `PDFPageView` class.
With the changes in the aforementioned PR the annotationLayer is now always "visible", and this code can thus be simplified a little bit.
So that Ctrl+A, Ctrl+Z, etc. still fire on non-US keyboard layouts where
the physical "A" key produces a non-Latin character (Cyrillic, Greek,
some AZERTY combinations, ...). KeyboardManager now tries event.key first
and falls back to a US-layout translation of event.code (KeyA => a,
Digit1 => 1, Numpad1 => 1) when no shortcut is bound on event.key.
Also refactors KeyboardManager to store modifiers as a bitmask instead
of a serialized string, and treats a shortcut array without any
"mac+"-prefixed entry as applying on all platforms, letting us drop the
redundant "mac+X" duplicates of bare "X" entries across the editor code.
Identical embedded fonts and images across the merged documents are now
written once and shared, instead of being copied per source file.
And avoid to compress already compressed stream with Brotli.
- The `dict` field is optional, hence avoid an Error if trying to clone a non-existent dictionary.
- Use the `length` getter in the `Stream` class, to avoid duplication.
- Fix the `DecodeStream` implementation, since it has a couple of bugs:
- The `clone` method currently uses `start`/`end` fields, despite these only existing on `Stream` instances.
- Given the previous point, we ended up creating the cloned `Stream` instance using the *entire* underlying `buffer`. This is problematic since the length of a `DecodeStream` cannot be accurately estimated before decoding, and the `buffer`-length is simply a multiple of two.
Unless the size of the decoded-data just happens to also be a multiple of two, this causes the cloned `Stream` instance to be "padded" with zeros at the end.
There's currently a few EventBus listeners that aren't marked as "internal", however I'm assuming that they probably should be (e.g. to reduce the risk of intermittent failures in the integration-tests).
The problem is that we screenshot the page itself rather than the
canvas, even though we specifically care about the latter according to
the comment, which means that we manually have to take care of hiding and
showing the annotation editor. This is problematic because even though
we signal that the annotation editor should be hidden, we don't wait
until that is actually done, which leads to a situation where we can
take the screenshot before the annotation editor is actually invisible
in the view.
This commit fixes the issue by screenshotting the canvas instead, which
avoids the need for manually hiding/showing the annotation editor. This
makes the test less fragile, and matches other tests better.
There's currently some amount of `StringStream` usage where the `dict`-parameter is manually assigned, and by updating the signature of the constructor this can be avoided.
The GitHub Actions workflow for the integration tests on Windows logs
the following line for every test:
`JavaScript warning: http://127.0.0.1:62313/build/generic/build/pdf.mjs,
line 134934: WebGPU is disabled by blocklist.`
On Linux WebGPU is disabled by default because of missing support, but on
Windows it's enabled by default since bug 1972486, so we try to obtain a
GPU adapter which fails (and logs) if there is no actual GPU like on
GitHub Actions. Coverage data confirms that our own WebGPU code is
already uncovered because of the lack of a GPU, so having WebGPU enabled
or disabled doesn't change that, but if it causes log spam it seems
better to disable it, which this commit does.
Note that Chrome doesn't seem to have a matching flag, but Chrome already
doesn't log anything about this (which is the primary driver for this
change), so that's not a problem.
Currently the viewer uses semi-private `EventBus.prototype.{_on, _off}` methods, to try and ensure that all internal viewer state is updated *before* any "external" listeners are invoked.
For all use-cases outside of the viewer, e.g in the integration-tests, the `EventBus.prototype.{on, off}` methods are supposed to be used instead.
Unfortunately this isn't currently enforced in any way, except (hopefully) during review, and generally speaking it's not really possible to prevent the semi-private methods being used (e.g. by third-party users).
Hence this patch adds a new `INTERNAL_EVT` property which is *not* exposed anywhere (neither in the API nor globally), and whose value is generated at build-time, that the viewer uses to mark its `EventBus` listeners are internal.
This allows us to remove the semi-private `EventBus` methods, which helps to simplify that class a little bit.
This is a major version bump, but the changelog at
https://github.com/puppeteer/puppeteer/releases/tag/puppeteer-core-v25.0.0
doesn't indicate any breaking changes that should impact us.
Moreover, this release contains the fix for the memory leak from
puppeteer/puppeteer#14876, so we can remove
the workaround related to that now.
Finally, we rename `.puppeteerrc` to `.puppeteerrc.json` because of
https://github.com/puppeteer/puppeteer/issues/15076, but in general it's
a good idea to be explicit about the file format via its extension, so
even if this upstream bug is fixed we don't need to revert this.
The two affected code paths caught and logged errors, but that wasn't
reflected in the exit code of the process, and that is what GitHub
Actions (and other tools) to determine if process execution was
successful or not. This commit fixes the issue by making sure we
consistently exit with code 1 in case of errors so that GitHub Actions
pipelines correctly reflect the outcome of the test run.
Image files dropped on or selected via the thumbnail viewer's
"add file" picker are now accepted alongside PDFs and inserted
as synthetic pages sized to the document's modal page dimensions.
The image-encoding helper previously embedded in StampAnnotation has
moved to src/core/editor/pdf_images.js so it can be shared between
stamp annotations and page synthesis.
This PR is related to GH-20732, which is about `AuthEvent` (to delay
promting for a password), but instead adds the actual support for
encrypted attachments.
“Encrypted attachments” means that the main things are plain text.
Note that some PDF viewers, like Preview/QuickLook/Safari or Chrome,
do not support attachments at all.
Note that the file checked into the tests is the same as
`output-no-auth-event.pdf` referenced in
<https://github.com/mozilla/pdf.js/issues/20139#issuecomment-3952462166>.
Closes GH-20139.
I happened to notice that the way the `OptionalContentConfig`-data handled in the PR that implements worker-rendering leaves a lot to be desired:
- The way that the optional content state is handled is not correct, since that PR collects the "effective visibility" of the optional content groups rather than their *actual* internal state.
- The necessary `OptionalContentConfig`-data is collected piecemeal in the API, which leads to quite frankly very messy code that's hard to read and will be even harder to maintain.
The solution to all of these issues seem really simple though, just add a couple of `OptionalContentConfig` methods that serialize/de-serialize the necessary data.
In the API calling `optionalContentConfig.serializable` will get *all* of the needed data for transferring to the worker-renderer, and once received there calling `OptionalContentConfig.fromSerializable(/* transferred data here */)` will create an `OptionalContentConfig` instance with the correct internal state.
As part of this patch, to avoid increasing bundle-size unnecessarily, a couple of existing methods are stubbed out when the `OptionalContentConfig` class ends up in a worker-file (since they're unused there).
This part assumes that the new worker-renderer is built correctly, note how the existing `pdf.worker.mjs` is handled in 03eda70d7e/gulpfile.mjs (L539-L545)
(*Note:* Submitting a PR was a lot faster than trying to provide review comments, since writing this commit message took longer than writing the patch.)
Fonts that ship a BlueScale outside the range AFDKO considers valid
for their zone heights (0.5/maxZoneHeight <= BlueScale <= 1/maxZoneHeight)
cause Firefox's CFF rasterizer to misalign overshooting glyphs against
flat-topped ones at body sizes.
Clamp into that window, only apply the lower clamp when BlueScale is
also smaller than the default, so foundry fonts that pair the default
0.039625 with small zones are untouched.
Fixes#9437.
It fixes#15292.
PDFs can embed a CID-keyed Type 1 program (Adobe TechNote 5014,
CIDFontType 0) under /Subtype /CIDFontType0 + /FontFile. Its binary
CIDMap/SubrMap layout has no eexec block, so Type1Font's eexec-only
parser used to fall through and trigger the work-around added in
PR #15397.
Split the constructor and parse the binary CIDMap, SubrMap
and charstrings (encrypted with the standard Type 1 charstring cipher)
through the existing Type1CharString.convert + CFF wrap pipeline.
Only single-FDArray fonts are supported; the StartData length is
clamped to the stream's remaining bytes before allocating.
- functions were already removed
- variables can be removed when their initializer does not have side effects
- classes can be removed when they have no static blocks
Fixes#21337
Given that changes to either the Babel plugin or the "old" builder could accidentally cause issues in the built files, it seems like a good idea to run all test-suites when the `external/builder/` folder is changed.
When a reftest hangs and trips the per-browser timeout, the session was
closed, which left every remaining task in the per-browserType queue with
no consumer and effectively skipped the rest of the manifest. Instead,
mark the in-flight task(s) as failed, reload the page so the driver can
reconnect and request the next task, and only fall back to closing the
session if the reload itself fails.
Adds a `postMessageAfterPrintCallback` browser option (off by default).
When enabled by Firefox, the print service posts "ready" or "error" to
the window after each page's mozPrintCallback resolves, so embedders
(e.g. print-preview test harnesses) can observe when rendering is done.
Upstreams the viewer-side portion of Phabricator D297837.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Daniel Holbert <dholbert@cs.stanford.edu>
Enable the recommended preset and fix or per-line-disable the 78
findings it surfaces. Most are equivalent rewrites, intentional
patterns (control chars, the whatwg email regex, autolinker URL regex)
keep their behavior via targeted disables.
Fixes#21259.
Reset letter-spacing and word-spacing on the text layer and hidden measurement canvas so inherited page styles do not affect text layer alignment. Add an integration regression test for inherited spacing.
This patch is generated automatically using `npm audit fix`, and brings
the number of reported vulnerabilities back to zero by patching:
- GHSA-jxxr-4gwj-5jf2: "brace-expansion: Large numeric range defeats
documented `max` DoS protection"
- GHSA-58qx-3vcg-4xpx: "ws: Uninitialized memory disclosure"
It fixes#7625.
If the Top DICT's Private DICT extends past the end of the font data,
the Local Subrs INDEX is unreachable and every CharString that calls
a subr ends up as a blank glyph. Throw from parsePrivateDict so the
existing catch in translateFont triggers fallbackToSystemFont, then
run getFontSubstitution post-construction so we pick a close local
match instead of the generic fallbackName.
Some Type1 fonts (the embedded Optima variants in orw1972.pdf) ship
two /Subrs and /CharStrings blocks wrapped in save/restore frames
gated on an Adobe hires/lores runtime switch.
In such cases, we just use the first /Subrs and /CharStrings block,
which is the one that is actually used by the font renderer in Acrobat.
It fixes#18548.
In many/most PDF documents every glyph will require that the character BBox has scaling/offset applied, which can be made a tiny bit more efficient. In particular:
- Avoid creating one additional temporary Array for every glyph.
- Simplify the helper function, since there's no skew-components.
When comparing this code with the full `XRef` class it doesn't seem to be entirely correctly implemented, since the `fetch` method is basically doing what the `fetchIfRef` method is intended to do.
Prune the back-edge components from cyclic composite glyphs in
sanitizeGlyphLocations (leaving non-cyclic siblings intact), reject OS/2
tables whose length is too short for the declared version so a clean
table gets regenerated, and upgrade a version 0.5 maxp table to 1.0 for
TrueType fonts to silence OTS' "wrong maxp version for glyph data".
It fixes#21298.
Drop an external PDF anywhere in the views-manager thumbnail
sidebar to merge it at the cursor, rather than always inserting
after the current page via the "Add file" button.
The drop reuses the blue separator from page-move drag so the
user can see exactly where the inserted pages will land, and the
merge path is shared with the existing picker so post-merge
selection/current-page behavior stays consistent.
References <https://bugzilla.mozilla.org/show_bug.cgi?id=1879559>
(“In HCM, the text selection is barely visible”).
Continues work from @calixteman who had a partial patch.
This PR improves viewer text-selection highlighting by rendering
selection shapes in the draw layer.
* add selection overlay rendering in the draw layer
* significant code relates to selections spanning multiple text
layers/pages, and edges/end-of-content boundaries
* clear selection on rotate/scale/scroll/spread changes
My main question is: how should it appear?
I don’t have access to the Figma file linked on bugzilla.
In the CSS (`draw_layer-builder.css`) there are 3 blocks:
* default
* `@supports` for browsers supporting `backdrop-filter`
* `forced-colors` mode
So it’s possible to design for those (or more).
Personally, the `backdrop-filter: invert(1)` is the most contrast,
so perhaps it’s better to use something else as the default,
and to use `invert(1)` if high contrast mode is used (maybe with a
`prefers-contrast` media query instead)?
This commit fixes the rendering issue that makes the "must update an
existing annotation" ink editor integration test permafail locally in
Chrome. Note that we already do this for Firefox tests, so this also
improves consistency between the two browsers.
Moreover, improve how we define Chrome options to (similar to their
Firefox counterparts) provide them in a single array, and document the
reasoning for why these options are being set more explicitly.
Fixes#21272.
If the active page is corrupt that currently results in the entire dialog being "blank", thus providing no information, which seems unfortunate and it's easy enough to only skip `pageSizeField` in that rare case.
Improve test coverage for multi-page documents, to ensure that:
- Unnecessary re-parsing is avoided where possible.
- Rotation, in the viewer, is handled correctly.
- Different page sizes are handled correctly.
This method is completely unused in the worker-thread, and it only has a single call-site in the main-thread.
By moving this helper into the `src/display/canvas_dependency_tracker.js` file, the size of the `gulp mozcentral` bundle is reduced by `1220` bytes.
It looks like this test passes consistently again, most likely after a
combination of browser/Puppeteer/configuration updates and the completed
switch to the WebDriver BiDi protocol.
Fixes#20136.
Originally we introduced a small delay for Puppeteer operations in
Chrome to avoid intermittent failures where protocol calls were
happening too quickly in succession. However, since then a number of
improvements were made, both locally and upstream, that reduce the need
for this delay:
- the integration tests have been hardened to remove (potential) sources
of intermittent failures in many places;
- the browsers and Puppeteer have been updated to improve performance
and support for testing infrastructure;
- the conversion to WebDriver BiDi has been completed, which replaced
the Chrome-specific CDP protocol with a formalized protocol that could
provide more safety guarantees.
This commit therefore reduces the Chrome-specific delay from 5 to 3
milliseconds, which should nowadays be a better value to speed up the
Chrome tests and bring them closer to Firefox in terms of runtime.
The custom solution for obtaining the bounding box of a given element
that we have now was necessary during the original introduction of the
integration tests because at the time the `ElementHandle.boundingBox()`
API in Puppeteer didn't work correctly in Chrome.
However, `getRect`, where this is used, is a hot utility function
because most tests call it multiple times, either directly or indirectly
via other utility functions, and it turns out that the approach we use
is slower than the native `ElementHandle.boundingBox()` API.
Fortunately, most likely after a combination of Chrome/Puppeteer updates
and the conversion to the formalized WebDriver BiDi protocol the custom
solution is no longer necessary because all tests pass without it too,
so this commit converts `getRect` to use `ElementHandle.boundingBox()`
instead to speed up the tests.
These tests could obviously be improved/extended, but it's at least a start to ensure that `ColorConverters` is tested since it's used in both the annotation-layer and the scripting-implementation.
When a CIDFontType0 descendant has its program in a FontFile3 stream with
/Subtype /OpenType, the OTF wrapper sometimes lacks a usable cmap and the
CID→GID mapping only exists inside the embedded CFF itself. In that case
the OpenType-table path produces wrong glyphs, so route the font through
CFFFont and let it consume the inner CFF directly.
The file has been found in https://issues.chromium.org/issues/471404119.
This small helper function only exists to support printing of XFA documents, in the viewer, hence it seems like a good idea to (ever so slightly) reduce the official API surface a little bit.
This is necessary to prevent import cycles with the next patch.
It also shouldn't hurt to reduce the size of `src/display/display_utils.js` a little bit, since utility-files have a tendency to increase in size over time.
This method is unused in the worker-thread and has only *a single* call-site in the main-thread, which can be trivially replaced with the `getCurrentTransform` helper function.
Given that this function is only ever used during *parsing* of the PDF document, which happens in the worker-thread, this has always added (a little bit of) dead code in the built `pdf.mjs` file.
Given that the various utility-files naturally increase in size over time, it shouldn't hurt to shorten `src/core/core_utils.js` a little bit by moving a few of its string helper functions to their own file.
PR #21101 narrowed `compose()`'s `clearRect` from full canvas to the
caller-supplied dirty box. That leaves pixels outside the current
dirty box on the SMask scratch canvas between `compose()` calls;
subsequent draws into scratch are then source-over-blended on top of
those leftovers, so the output depends on the cumulative draw history
rather than just the current draw.
This is a new feature in PDF documents, hence it shouldn't hurt to complement the existing ref-test with a simple unit-test as well.
This should also improve test coverage for the `external/` folder, which can't hurt since the other external decoders are already fairly well covered.
Set layout-neutral styles at the creation sites for the hidden TextLayer
canvas and PDFViewer copy element rather than relying on the shared
web/pdf_viewer.css rule.
This keeps the helper elements invisible and out of layout when viewer CSS
selectors are scoped or omitted, and removes the obsolete hiddenCanvasElement
class and shared CSS rule.
Note that for the integration tests the coverage information ends up
being processed in the Node.js context where `window` is not available,
so we use `globalThis` instead for the function that merges individual
test's coverage information into the global object because that is
available in all contexts we support. For clarity we also rename said
function since we're not exclusively dealing with `window` nor worker
data anymore.
Currently the `isCanvasFilterSupported`/`isAlphaColorInputSupported` getters, on the `FeatureTest` class, contains code that cannot run in the worker-thread since it relies on the DOM being available.
To avoid that a new DEFINE is added, in `gulpfile.mjs`, to allow skipping this sort of dead code in the built `pdf.worker.mjs` file.
It fixes#18032.
Only use the special inner-backdrop compositing path for nested non-isolated groups that actually need isolation.
This preserves the parent/page backdrop for simple nested groups inside knockout groups, preventing later group
compositing from erasing existing backdrop content.
Test PDF files should never be executable because we only read their
contents, so this commit makes sure that all test PDFs have the same
permissions, namely 0644 (read-only for all groups, and writable for the
owner), to limit their permissions for a least-privilege approach.
In the `DrawingEditor.prototype.#createDrawOutlines` method the l10n-id was being "generated", which is bad for maintainability since searching for l10n-id becomes more difficult and the `gulp check_l10n` script actually warns about this.
Hence, similar to e.g. the resizer localization, let's define the `a11yAlert` l10n-ids *once* and provide shorthands for accessing them.
In a knockout (KO) group each painting operator ("element") composites against
the group's initial backdrop instead of accumulating onto prior elements
of the same group. The backend renders each element to a per-group pooled
temp canvas (keyed off `#groupStackMeta`), builds a binary alpha mask via
a new `feFuncA` filter (`addKnockoutFilter`), `destination-out`s the
group canvas through that mask, restores the initial backdrop into the
cleared footprint for non-isolated groups (cropped to the same mask so
sparse groups don't bleed the whole rectangle), and finally paints the
element on top with the parent's blend mode. Path / clip / transform ops
are mirrored back to the group canvas via `mirrorContextOperations` so
graphics state stays in sync between elements; only the raster pixels
land on the temp canvas.
The temp canvas is forced to source-over for the element raster (`multiply`
on a transparent backdrop would zero the color) and the original GCO is
restored before `copyCtxState` writes back, so the parent's blend mode
survives for the final composite.
Also handled:
- Nested KO groups (the level is incremented for KO, reset to 0 for
non-KO subgroups so an ancestor KO doesn't leak in).
- Non-isolated non-KO subgroups inside a KO parent (`hasInnerBackdrop`
path: blend the elements against the subgroup's running backdrop for
color, mask with the elements-only canvas).
- Soft masks installed inside a KO element (`applySMaskInPlace` in
`compose`, which runs the SMask destination-in directly on the temp
canvas; the existing blit-to-suspended step is gated by `if (!ctx)`).
- Type-3 text, shading fills, image-mask groups, inline images and the
solid-color mask path: each is wrapped in `#begin/#endKnockoutElement`.
- `endDrawing` cleanup so cancelled rendering doesn't leak pooled
canvases or stale knockout state.
This is a major version bump containing two breaking changes for us:
- the `baseUrl` option is removed;
- the `moduleResolution` option doesn't support `node10` (or the `node`
alias) anymore.
The migration guide at https://github.com/microsoft/TypeScript/issues/62508
indicates that we can remove `baseUrl` and change `moduleResolution` to
`bundler` (the latter is consistent with what other projects do that are
linked to the issue, and more details on that configuration option can
be found at https://www.typescriptlang.org/tsconfig/#moduleResolution).
Note that this is enough to get `npx gulp typestest` green and that is
all validation we can do on our side, so as usual if any follow-up fixes
for types are necessary we rely on the community to provide patches and
extend the types test where possible to improve validation.
Given that this function is only ever used in `src/core/` code, let's avoid a little bit of dead code in the *built* `pdf.mjs` file.
Also, place the `AnnotationPrefix` and `AnnotationEditorPrefix` constants together in `src/shared/util.js` since that should aid readability.
Note that `Ref`s and `Name`s are cached globally[1], since that helps reduce object creation (a lot) during parsing.
That cache will be cleared after a period of inactivity in the viewer[2], which is why those primitives cannot *safely* be compared with just `===`/`!==` and also (partially) why abstractions such as `RefSet`/`RefSetCache` are necessary.
Currently `deepCompare` doesn't handle `Ref`s and `Name`s correctly, which may lead to future *intermittent* bugs in any code using the `deepCompare` helper function.
---
[1] This applies to `Cmd` as well, however that doesn't matter in the context of this patch.
[2] Currently, and for more than a decade, set to 30 seconds.
Previously this was only supported in Firefox, however when merging PDFs the `PDFViewerApplication.onSaveAndLoad` method will provide a `filename` unconditionally.
This is a left-over from very old code, which pre-dates the introduction of the `PDFDocumentLoadingTask` and it's nothing more than an alias for its `destroy` method.
Given that `PDFDocumentProxy` already provides a way to access the underlying `PDFDocumentLoadingTask` instance, it shouldn't be necessary to have an alias for one of its methods.
*Please note:* For any existing code relying on the removed method, updating it should be as simple as replacing `pdfDocument.destroy()` with `pdfDocument.loadingTask.destroy()`.
---
[1] If the `PDFDocumentProxy` class was added today, there's no chance that it'd include a `destroy` method.
This is a left-over from very old code[1], before there were a lot of `getDocument` options and when most of the library configuration was done via the (since removed) `PDFJS` global.
Given all the functionality added through the years, which require configuration[2], in practice it's now unlikely that calling `getDocument` without additional options will work except for the most trivial PDFs.
---
[1] If the `getDocument` function was added today, there's no chance that it'd support anything other than a parameter object.
[2] Note things such as CMaps, standard fonts, wasm-based image decoders, and ICC-based colour spaces.
The delay between chunks when testing streaming is necessary to avoid the entire PDF document arriving all at once, since that would render those unit-tests somewhat pointless.
However, the delay is unnecessarily large which causes these unit-tests to be slower than necessary.
Also, update the range unit-tests to check the expected number of fetches *exactly* since those values are not supposed to vary.
For now OffscreenCanvas in worker threads doesn't support ctx.filter,
so we need to fall back to a more expensive pixel-buffer SMask filtering in that case.
As a side effect, this also allows to support correctly smask in Safari.
Prepare reusable soft-mask canvases for filtered and backdrop-dependent masks,
and use a faster destination-in composition path where possible.
Handle Alpha SMask /BC correctly, preserve OOB alpha behavior, and mirror canvas path
operations needed while rendering inside soft-mask mode (mirrored clip was buggy).
Add reftest PDFs covering Alpha masks, transfer functions, backdrop/OOB
alpha, and the optimized composition paths.
This fixes two things that I overlooked in PR 21225, more specifically:
- Use proper, rather than semi, private class fields in `WasmImage`.
- Make tracking of `WasmImage` instances optional, to avoid keeping data alive permanently in the `IMAGE_DECODERS` build.
Using Array.prototype.shift() to drain the traversal queue makes each
visited node move the remaining queued entries. For large name/number
trees this can make getAll() spend quadratic time in queue management.
Iterate over the queue with for...of instead. Children pushed while
iterating are still visited, and the queue no longer needs repeated
front removals.
Given that the (browser) `gulp unittest` coverage includes the `external/` folder, it seems reasonable to include that folder in the coverage report when running unit-tests in Node.js as well.
Given that these classes are, with the exception of their `decode` methods, virtually identical this helps reduce code duplication and simplifies maintenance.
These changes reduce the size of the `gulp mozcentral` build-target by `1292` bytes, which obviously isn't a lot but still cannot hurt.
This was just released, see https://nodejs.org/en/blog/release/v26.0.0, hence it seems like a good idea to start running unit-tests in that version.
Also, stop running the unit-tests in Node.js version 25 since it'll soon reach EOL anyway and testing in three separate Node.js versions ought to suffice.
- Shorten the `getURL` function slightly, by re-factoring the try-catch blocks.
- Change how the `decode` function looks for a decoded ".pdf" name, to skip the regular expression matching when it's not needed and to allow re-using the already defined `pdfRegex`.
This was necessary before charset compilation was implemented, however that's been supported for many years and this is just dead code now.
- PR 9340, back in 2018, stopped using the `raw` field.
- PR 10591, back in 2019, implemented proper charset compilation.
The `CFFFont.prototype.data` should contain a `Uint8Array`, however if compilation failed it was being set to a `Stream` instance which will thus fail elsewhere in the font-code.
*Please note:* This was found by code inspection, since I don't have a PDF document that's fixed by this change.
Return the data as-is from the `CFFCompiler.prototype.compile` method, rather than making a copy of it first.
The reason that it was implemented this way in PR 21053 was to avoid keeping a potentially large `ArrayBuffer` alive, see https://github.com/mozilla/pdf.js/pull/21053#discussion_r3045402988
Having traced all the call-sites in the font-code that directly or indirectly invoke that code, I've now managed to conclude that the compiled CFF-data is never stored on the `Font` instance and using the data as-is thus shouldn't increase permanent memory usage.
The find controller tests consistently show up in the list of slowest
tests reported by Jasmine. Profiling shows that most of the time is
spent waiting for the find results to arrive, even though the find
command itself is quite fast.
It turns out that the slowdown occurs between receiving the `find` event
and actually triggering the search. The find controller has a hardcoded
delay of 250 milliseconds built in, which was introduced for viewer
performance many years ago because otherwise every keystroke would
trigger a search even though the user's query was not complete yet.
For the unit tests we don't need this delay because, contrary to the
viewer use case, we don't have to account for user interaction and
instead dispatch complete `find` events on the event bus ourselves.
However, since the unit tests were introduced well over a year after
the delay was introduced, due to an oversight it was never made
configurable so we could skip it for the unit tests.
This commit fixes the issue, which locally results in the runtime of
`npx gulp unittest --noChrome` dropping from 39.991 seconds before this
patch to 29.116 seconds afterwards, which is a 27% speedup.
In a previous commit the time-based checks, which were based on
statistics provided by the `pdfBug: true` flag, got replaced by
test-only property checks that don't use said statistics anymore.
Fixes b01eeaf8.
This commit fixes a number of missing cleanup steps in the unit tests
that kept state alive for longer than necessary:
- the loading tasks were not all being destroyed;
- the find controllers were not being reset;
- the state set in `beforeAll`/`beforeEach` was not all being nulled in
the correspoding `afterAll`/`afterEach` blocks.
Combined this resulted in a steady increase in memory usage of the test
process as the tests ran, climbing up to ~1.5 GB. After this patch the
memory usage remains stable at ~800 MB.
Given that all of the relevant API methods are asynchronous it's possible, although quite unlikely, that the existing "is the document active" check won't catch all situations where the document was closed in the middle of searching.
This handles *all* errors correctly, if e.g. the document is closed in the middle of searching.
Also, replacing the "promise chains" with more `await` helps simplify the code a little bit.
disable_search == true, will avoid useless search especially because we already provide the path to the info file.
disable_telem == true, disable sending telemetry to codecov.
Originally we introduced a separate `coverage.yml` workflow to
test-drive coverage collection without immediately introducing it inline
in the `ci.yml` workflow. However, now that coverage collection works
nicely and is performant enough we can simplify the workflow definitions
by removing `coverage.yml` entirely in favor of simply inlining coverage
collection into the existing `unittestcli` run in `ci.yml`. Doing so
also avoids having to a full extra `unittestcli` run just to collect
coverage information.
Moreover, this commit aligns coverage collection flags with the ones in
the other workflows to be more explicit and consistent.
The thumbnails are only available if the first page is ready, and not
awaiting that causes the drag-and-drop action to be performed using
incorrect thumbnail viewer state (see the analysis in #21184 for more
details).
Fixes#21184.
Supersedes #20902.
Unblocks #21173.
This avoids having to "duplicate" dummy `readBlock` methods in a couple of image-stream classes.
Also, move a few `DecodeStream` field definitions to (ever so slightly) shorten the code.
This commit mirrors the approach from e656b833 to the other workflows
that run multiple OS/browser combinations. This approach has multiple
advantages:
- it improves performance because each job is run in its own environment
so we don't have two browsers competing for resources in the same
environment anymore;
- it improves monitoring because each job is shown separately, with its
own runtime, in e.g. the pull request checks and actions overviews,
which makes it easier to spot bottlenecks that are specific to a
certain OS/browser combination and enable follow-up optimizations.
The `textLayerImagePlaceholder` canvas added in #20626 covers scanned
pages and was not recognized as a valid pointerdown target by
`#textLayerPointerDown`, so free highlights couldn't start.
If `PDFDocumentLoadingTask.destroy` ran while `workerIdPromise` was
pending, the inner `.then` in `getDocument` threw "Loading aborted"
before `WorkerTransport` was constructed, so `_transport` was never set
and the "Terminate" message was never posted.
Remove the "Float32Array" mention in the comment, given that the
implementation usesa Float64Array.
Actually using a Float32Array passes all the tests we currently have and
reduces memory usage (by 16 bytes per op), however to be sure that it
does not introduce rounding bugs we'd need to `Math.fround` all
operations we do on the clipBox and pendingBBox. It reduces the
readibilty of the code, but we can revisit if this memory usage becomes
a problem.
When pages carry explicit pageIndices (e.g. after a delete),
resolve insertAfter against that layout instead of the empty
base sequence. Also reject partial pageIndices combined with
insertAfter, which would race against the extraction's auto-fill.
The job's environment must be `code-coverage` before it has access to
the `CODECOV_TOKEN` secret that's required for pushing coverage data to
Codecov. Without it we get an HTTP 400 response with a `Token required
because branch is protected` message if it's run from `master`.
Fixes f932a58d.
The font tests have run with coverage reporting enabled for quite some
time now and the coverage information has proven to work and be stable,
so we don't have to also run the font tests without coverage reporting
anymore, thereby reducing the total runtime of the workflow.
In practice this probably hasn't caused any bugs, however given that `DOMElement.style.zIndex` returns a string the existing code is subtly wrong.
This is also consistent with pre-existing `zIndex` code in the `PopupElement` class, see the `#show` and `#hide` methods.
Currently the hash-property is just a stringified Array, which means that the hash-property can become arbitrarily long. That's not a good idea since it's used to compute a cache-key, in the API, which is then sent to the worker-thread. Hence the hash-property should be reasonably short, and its length should *not* depend on the number of modified editors, which can be achieved by using `MurmurHash3_64` here as well.
Many `parseInt` call-sites already provide the `radix` argument, and this rule helps improve consistency in the code-base; see https://eslint.org/docs/latest/rules/radix
*Please note:* The rule is disabled in `src/scripting_api/util.js` for now, since it's not obvious at a glance (at least to me) what the correct `radix` argument should be there.
Screen readers like NVDA fire synthetic click events on the <img> child of the button rather than on the
thumbnailImageContainer element itself, so the strict classList.contains check silently failed.
The alpha feature is available in Firefox nightly (with the pref `dom.forms.html_color_picker.enabled` set to `true`).
It's available in Safari but not in Chrome.
Note that we no longer create an empty file if the download fails because
we don't want to leave traces of incorrect file contents on disk. This
has never been particularly useful because it'd require a manual user
action to remove the empty file to be able to retry in case of e.g. a
timed out connection, but especially in the context of GitHub Actions
where we cache the PDFs directory we don't want to cache invalid files
to make sure that a next run will automatically retry to fetch any
previously missed PDF files and update the cache.
*Note:* This is similar to PR 19525, which did the same thing for the OpenJPEG decoder.
The advantages of doing this are:
- The same JBig2 decoder is used regardless of WASM being supported or not, which means consistent rendering.
- The old `Jbig2Image` implementation has various bugs and missing features.
- Less code that needs to be maintained in the PDF.js project, since both the CCITT and the JBig2 decoder is replaced.
The disadvantage of doing this is:
- Slightly larger bundle size, however the effect is limited since a fair amount of PDF.js code can be removed. For the `gulp mozcentral` target the size increase is approximately 54 kilo-bytes (which is small compared to the 452 kilo-bytes for the JS version of the OpenJPEG decoder).
Rather than relying on the time it takes to parse/render the pages, which leads to intermittent failures, add a test-only property and use it to check if the "CopyLocalImage" code-path was exercised.
Instrument JS files on-the-fly via babel-plugin-istanbul when --coverage
or --coverage-per-test is passed, producing an aggregate lcov/HTML report
at the end of the run. A persistent PDFWorker accumulates worker-thread
coverage alongside the main-thread coverage, collected via a new
GetWorkerCoverage message handler.
With --coverage-per-test, an inverted index
(build/coverage/per-test-index.json) is also built as tests run, mapping
each hit source line and function name to the numeric IDs of the tests
that exercised it, keeping the output compact. The new
`gulp test_search --code=file::line_or_function` tool queries the index,
and passing --code to browsertest pre-filters the test run to only those
tests.
Coverage output formats are selectable via --coverage-formats (default:
info; also accepts html, json, text, cobertura, clover).
This unit test failed consistently in Firefox both locally and on GitHub
Actions (but not in Chrome or on the bots), which suggests a timing issue.
Since all other unit tests that rely on `commonObjs` actually render the
page, most likely to make sure that `commonObjs` is fully populated at
the time of the check, this commit mirrors that approach to this test,
which indeed fixes the issue.
Replace manifest slicing with a dynamic task queue: drivers request tasks
on demand via WebSocket so parallel sessions self-balance naturally.
Start reading reference PNGs from disk as soon as a task is dispatched
(prefetchRefPngs) to overlap I/O with rendering. Release snapshot
buffers and task entries immediately after comparison, and copy WS
frame slices via Buffer.from() so the original frame buffer can be
GC'd.
Bump --max-old-space-size to 8192 MB as a workaround for a
Puppeteer/BiDi memory leak where data: URL strings accumulate in
BrowsingContext.#requests indefinitely:
https://github.com/puppeteer/puppeteer/issues/14876
Moreover, we indicate the exact version that belongs to each commit
hash. This not only makes it easier to compare the hash against the
release tags in the actions repositories, but hopefully also makes it
easier for e.g. Dependabot to keep the comments up-to-date since not all
of them were correct and varying comment styles were in use. This commit
aligns all of them to a single `v{major}.{minor}.{patch}` style.
- Replace base64/JSON POST image submission with binary WebSocket frames,
avoiding base64 overhead and per-request HTTP costs; quit is also sent
over the same WS channel to guarantee ordering
- Prefetch the next task's PDF in the worker while the current task is
still rendering
- Use `getImageData` instead of `toBlob` for partial-test baseline
comparison (synchronous, no encoding); only encode to PNG in master mode
- Disable bounce tracking protection in Firefox to prevent EBUSY errors
from Puppeteer's profile cleanup on Windows
With the exception of `glyphsIds` the length of the other segments can be trivially determined upfront, which is obvious in hindsight. This way unnecessary allocations can be avoided when building the "cmap" table.
One browser per job is opened and will run a subset of the tests.
The goal is to make the tests significantly faster on machines with a good number of cpus.
After the previous patches the `string32` helper function is now only used in the `FontLoader.prototype._prepareFontLoadEvent` method, which is stubbed out in the Firefox PDF Viewer, hence move it there instead to avoid bundling dead code.
This helps reduce the amount of boilerplate code needed in multiple spots throughout the font code, and more importantly it'll help when building TrueType tables whose final size is non-trivial to compute upfront.
remove as much as possible some intermediate canvases and avoid to use SVG filter
at each composition.
Rendering all the pages of issue17784.pdf takes 2x less time now.
1. Record `fill` dependencies even if we early return due to `isPatternFill``
2. Isolate the `drawPattern` inner `executeOperationList` in a
`CanvasNestedDependencyTracker` so that it does not consume pending
dependencies from the outer list.
Compared to the other TrueType table building functions, see previous patches, these ones are not trivial to convert to use TypedArrays properly.
However, in order to simplify the `OpenTypeFileBuilder` implementation a little bit we can at least have these functions return TypedArray data.
This is a major version bump, but the changelog at
https://github.com/istanbuljs/babel-plugin-istanbul/releases/tag/v8.0.0
doesn't indicate any breaking changes that should impact us (the
dependency update meant a bump of the minimal required Node.js version
to 18, but our minimal supported version is already higher than that).
This helper function only had a single call-site, and it's easily replaced with a `DataView` method.
Additionally, to hopefully make future re-factoring easier, create a `DataView` for each TrueType table.
With the changes in PR 21072 the `string16` helper is no longer being used when building the "hmtx" table, which accidentally removed the development mode assert.
Currently the code only updates the position when the length is defined, and it seems that this has "always" been wrong. Originally I believe that the `ChunkedStream` class was essentially a copy of the `Stream` class, and that implementation had the same problem until PR 20593.
Hopefully there's no code that relies on the current incorrect behaviour[1], since testing every aspect of the `ChunkedStream` implementation can be tricky given that these things are timing dependant.
---
[1] If there are, fixing those call-sites may be as easy calling `ChunkedStream.prototype.reset`.
Nowadays there's a lot of places in the code-base where we need to initialize or reset bounding boxes. Rather than spelling this out repeatedly, this patch adds new `Array`/`Float32Array` constants that can be copied or used as-is where appropriate.
We try to detect in the worker if some patterns or groups need to be drawn or not in isolation.
When they don't, we just draw them on the main canvas instead of drawing on a new canvas.
A pattern or a group is considered as being in isolation if it has some compositing rules or some transparency.
It improves the rendering performance of the pdf in bug 1731514.
It's possible to compute the final index-data size upfront, thus avoiding a bunch of intermediate allocations during index compilation.
This also means that a TypedArray can be used, rather than a plain Array, making it more efficient to insert the `objects` data.
This helps PDFs with large and complex CFF fonts the most, for example the PDFs in https://bugs.ghostscript.com/show_bug.cgi?id=706451 render ~40 percent faster (based on quick measurements in the viewer with `#pdfBug=Stats`).
Currently the `CFFCompiler.prototype.compile` implementation seem a bit inefficient, since the data is stored in a plain Array that needs to grow (a lot) during compilation. Additionally, adding a lot of entries isn't very efficient either and requires special handling of the "too many elements" case.
Some of the "helper" methods that use TypedArrays internally currently need to convert their return data to plain Arrays, via the `compileTypedArray` method, which adds even more intermediate allocations.
Note also that the `OpenTypeFileBuilder` has a special-case for writing plain Array data, which is only needed because of how the CFF compilation is implemented.
To improve this situation the `CFFCompiler.prototype.compile` method is re-factored to store its data in a TypedArray, whose initial size is estimated from the "raw" file size.
This removes the need for most intermediate allocations, and it also handles adding of "many elements" more efficiently.
Steps to reproduce:
1. Open http://localhost:8888/web/viewer.html
2. Open the console, and run: `PDFViewerApplication.close();`
3. Note the warning: `[fluent] Missing translations in en-us: pdfjs-toggle-views-manager-button`
This avoids effectively re-implementing an existing helper function, and the code is also simplified a tiny bit by building the final TypedArray header directly.
One drawback of the current implementation is that the GPU device can be
unavailable at the time of the first pattern fill, which causes the
GPU-accelerated canvas to be move on the main thread because of putImageData.
Most of the shading patterns stuff will be moved to the GPU and in order
to avoid creating some useless data we've to know if the GPU is available or not.
So in this patch we create the GPU device during the worker initialization
and pass a flag to the evaluator to know if the GPU is available or not.
This looks like a leftover from much older code, since all colors are now parsed with the [`getRgbColor` helper](ca85d73335/src/core/annotation.js (L558-L583)) which returns `Uint8ClampedArray` data when the color is valid.
Also, use spread syntax when calling `Util.makeHexColor` in a few more spots.
Firefox 148 shipped with a single killswitch to disable current and
future AI/ML functionality. This can be controlled in the GUI via
`Settings -> AI Controls -> Block AI enhancements`, but can also be
controlled programmatically via the `browser.ai.control.default`
preference.
This commit uses this single preference to replace the multiple
preferences we had to use for this purpose before.
Extends 844681a.
In PR #20887 the issue got fixed partly, but two issues remained that
unintendedly left room for intermittent failres:
- in the "Ctrl+ArrowLeft/Right" loop the `waitForBrowserTrip` call was
placed _before_ the actual keypresses, which means that for the final
iteration of the loop the last key presses were not properly awaited;
- in the "ArrowLeft/ArrowRight" loop the `waitForBrowserTrip` call was
absent, which means that we didn't wait for key presses at all.
This commit fixes the issues by consistently waiting to a browser trip
_after_ each key press to make sure the sidebar got a chance to render.
There is generally a small amount of time between the find call being
reported as finished and the find count results text being updated with
the correct number in the DOM, so the integration tests will fail if we
check the find results count text too soon.
This commit fixes the issue by using the `waitForTextToBe` helper
function to wait until the find count results text is what we expect it
to be. Note that we already use this helper function for this exact
purpose in other integration tests (related to reorganizing pages), and
it's also a little bit shorter/easier to read which cannot hurt.
This improves readability by removing "magic" numbers, and matches what
we already have for e.g. annotation and shading types.
Note that function type 1 does not exist in the specification, but that
also applies to everything higher than 4, so we can also remove the
specific handling of function type 1 and instead just let it fall
through to throwing an exception for unknown function types, in which we
now also log the provided function type to aid debugging.
It seems just as easy to lookup the needed data in the original arrays, rather than having to first create (and allocate) nested arrays for that purpose.
This test-case is an especially "bad" one performance wise given its PostScript function use, when falling back to the (now removed) `PostScriptEvaluator` code.
These classes, and various related code, became unused after PR 21023 with only unit-tests actually running that code now.
Also removes the `isEvalSupported` API option, since the `PostScriptCompiler` was the only remaining code where `eval` was used.
- Use the same `PartialEvaluator` instance for all annotations on the page, to reduce unnecessary object creation.
- Use `Object.hasOwn` to check if the annotations were already parsed, to avoid having to keep a separate boolean variable in-sync with the actual code.
The `PDFDataTransportStream` constructor has always registered exactly one listener for each type of data that an `PDFDataRangeTransport` instance can receive.
Given that an end-user of the `PDFDataRangeTransport` class will supply data through its `onData...` methods, it's also somewhat difficult to understand why additional end-user registered listeners would be needed (since the data is already, by definition, available to the user).
Furthermore, since TypedArray data is being transferred nowadays it's not even clear that multiple listeners (of the same kind) would generally work.
All in all, let's simplify this old code a little bit by using *a single* (internal) listener in the `PDFDataRangeTransport` class.
The idea with this helper function is that once https://github.com/tc39/proposal-math-clamp/ becomes stable, all its call-sites should then be replaced by the native functionality.
This patch updates the minimum supported environments as follows:
- Node.js 22, which was initially released on 2024-04-24 and has now entered the "Maintenance"-phase; see https://github.com/nodejs/release#release-schedule
Furthermore, note also that Node.js 20 will reach end-of-life on 2026-04-30 which coincides (approximately) with the next PDF.js release.
It fixes#5046.
We just generate a mesh for the pattern rectangle where the color of each vertex is computed from the function.
Since the mesh is generated in the worker we don't really take into account the current transform when it's drawn.
That being said, there are maybe some possible improvements in using directly the gpu for the shading creation
which could then take into account the current transform, but it could only work with ps function we can convert
ino wgsl language and simple enough color spaces (gray and rgb).
While `Readable.toWeb` wasn't marked as stable until more recently, the functionality itself has existed since Node.js version `17.0.0`; note https://nodejs.org/api/stream.html#streamreadabletowebstreamreadable-options
Hence the polyfill shouldn't actually be necessary, which is confirmed by the unit-tests passing in Node.js version `20` in GitHub Actions.
The main goal is to remove the eval-based interpreter.
In order to have some good performances, the new parser performs some optimizations
on the AST (similar to the ones in the previous implementation),
and the Wasm compiler generates code for the optimized AST.
For now, in case of errors or unsupported features, the Wasm compiler returns null
and the old interpreter is used as a fallback.
Few things are still missing:
- a wasm-based interpreter using a stack (in case the ps code isn't stack-free);
- a better js implementation in case of disabled wasm.
but they will be added in follow-up patches.
This function only has a single call-site (if we ignore the unit-tests), where the colors are split into separate parameters.
Given that all the color components are modified in the exact same way, it seems easier (and shorter) to pass the colors as-is to `applyOpacity` and have it use `Array.prototype.map()` instead.
This code already has an integration-test, however also having a unit-test shouldn't hurt since those are often easier to run and debug (and it nicely complements the existing `outline` unit-tests).
The patch also makes the following smaller changes to the method itself:
- Avoid creating and parsing an empty Array, when doing the `pageRef` search.
- Use `XRef.prototype.fetch` directly, when walking the parent chain, since the check just above ensures that the value is a Reference.
- Use the `lookupRect` helper when parsing the /BBox entry.
This obviously won't matter in practice, however it seems more "correct" to only extract the necessary number of color components rather than slicing off excess ones at the end.
Since this code is quite old parts of it can also be simplified a little bit by using modern string methods, which removes the need for the `pad` helper function.
Looking at the very similar `CssFontInfo.prototype.#readString` and `SystemFontInfo.prototype.#readString` methods they decode using the data as-is, but the `FontInfo.prototype.#readString` method for some reason copies the data into a new `Uint8Array` first; fixes yet another bug/inefficiency in PR 20197.
* add support for directly writing masks into `rgbaBuf` if their size matches
* add support for writing into `rgbaBuf` to `resizeImageMask` to avoid extra
allocs/copies
* respect `actualHeight` to avoid unnecessary work on non-emitted rows
* mark more operations as `internal`
This changes the path for what I believe is the common case for masks:
a mask to add transparency to the accompanying opaque image, both being
equal in size.
The other paths are not meaninfully unchanged.
That increases my confidence as these new paths can be easily tested
with a PNG with transparency.
Some tests were failing and has been fixed:
- "Hello" + Alef + "(" + Bet: the "(" (neutral) was not considered as a part of the group Alef(Bet and the group wasn't reverted;
- some intermediate neutrals were considered as strong.
This is a little bit shorter, and it should be fine considering that in the API there are no `typeof` checks when invoking user-provided callbacks (see e.g. `onPassword` and `onProgress`).
It's somewhat common for multiple test-cases to use the same PDF, for example tests for both `eq` and `text` with one PDF.
Currently the logic in `downloadManifestFiles` doesn't handle duplicate test PDFs, which leads to wasted time/resources by downloading the same PDF more than once. Naturally the effect of this is especially bad when downloading all linked PDFs.
Total number of test PDFs downloaded:
- 507, with `master`.
- 447, with this patch.
The cache has been added in #3312 in 2013 and a lot of things changed since.
Having too many cached accelerated canvases can lead to have to move their data from the GPU to the RAM
which is costly.
So this patch:
- removes all the cached canvases;
- destroys the useless canvases in order to free their associated memory asap;
- slightly rewrite canvas.js::_scaleImage to avoid too much canvas creation.
Using the Fetch API simplifies and shortens the `downloadFile` function considerably, since among other things it handles redirects[1] by default.
Also, the regular expression in `downloadManifestFiles` can be replaced with a simple string function now.
---
[1] Implementations of the Fetch API should already prevent e.g. redirect loops and limit the total number of redirects allowed.
This code already isn't used (or even bundled) in the Firefox PDF Viewer, and it also slightly reduces the number of import maps that need to be maintained.
For the Python-based workflows we were already using `pip` caching [1],
but sadly this isn't fully functional at the moment because the caching
functionality uses `requirements.txt` to determine when to create or
invalidate the cache. However, we have two different `pip` install
commands but only a `requirements.txt` for one of them (the Fluent
linter), which means that the other job (the font tests) will not
populate the cache with its dependencies.
This can be seen by opening any font tests or Fluent linting build and
noticing that they report the exact same cache key even though their
dependencies are different. In the installation step the dependencies
are reported as "Downloading [package].whl" instead of the expected
"Using cached [package].whl".
This commit fixes the issue by explicitly defining a `requirements.txt`
file for both jobs and pointing the caching functionality to the
specific file paths to make sure that unique caches with the correct
package data are used. While we're here we also align the syntax and
step titles in the files for consistency.
[1] https://github.com/actions/setup-python?tab=readme-ov-file#caching-packages-dependencies
The `setup-node` action contains built-in support for caching [1], so
this commit makes sure we use it for all Node.js-based workflows to
reduce workflow execution time.
Note that, contrary what one might expect [2], the `node_modules`
directory is deliberately not cached because it can conflict with
differing Node.js versions and because it's not useful in combination
with `npm ci` usage which wipes the `node_modules` folder
unconditionally. Therefore, the action instead caches the global `npm`
cache directory instead which does not suffer from these problems and
still provides a speed-up at installation time.
[1] https://github.com/actions/setup-node?tab=readme-ov-file#caching-global-packages-data
[2] https://github.com/actions/setup-node/issues/416
[3] https://github.com/actions/cache/issues/67
Currently we have no less than three different, but very similar, factories for reading built-in CMap files, standard font files, and wasm files on the main-thread.[1]
These factories were added at different points in time, since I cannot imagine that we'd add essentially three copies of the same code otherwise.
Nowadays these factories are often not even used[2], since worker-thread fetching is used whenever possible to improve performance. In particular, they will *only* be used when either:
- The PDF.js library runs in Node.js environments.
- The user manually sets `useWorkerFetch = false` when calling `getDocument`.
- The user provides custom `CMapReaderFactory`, `StandardFontDataFactory`, and/or `WasmFactory` instances when calling `getDocument`.
By replacing these factories with *a single* new `BinaryDataFactory` factory/option the number of `getDocument` options are thus reduced, which cannot hurt.
This also reduces the total bundle-size of the Firefox PDF Viewer a little bit, and it slightly reduces the number of import maps that need to be maintained.
*Please note:* For users that provide custom `CMapReaderFactory`, `StandardFontDataFactory`, and `WasmFactory` instances when calling `getDocument` this will be a breaking change, however it's unlikely that (many) such users exist.
(The *internal* format data-format of `CMapReaderFactory` was changed in PR 18951, and there hasn't been a single question/complaint about it in well over a year.)
---
[1] Any new functionality could easily lead to more such factories being added in the future, which wouldn't be great.
[2] Note that the Firefox PDF Viewer no longer use these factories, since it "forcibly" sets `useWorkerFetch = true` during building.
This reverts commit d618a2bc7ebe550cfcef31df8ddd0c8a12cf6bf1.
Unfortunately it did not fix the hanging actions for the locale update
job; fixing the issue is tracked in #20813.
Trying to resolve the same `objId` more than once would be a bug elsewhere in the code-base, since that should never happen, hence update the `resolve` method to prevent that.
The `BaseCMapReaderFactory`, `BaseStandardFontDataFactory`, and `BaseWasmFactory` classes are all very similar, and the only difference is really in their respective `fetch` methods.
By have the worker-thread "compute" the complete `filename` it's possible to simplify the `BaseCMapReaderFactory.prototype.fetch` method, which will allow future improvements to all of these classes.
A couple of things to note:
- This code is unused, and it's not even bundled, in the Firefox PDF Viewer.
- In browsers it's unused by default, and worker-thread fetching will always be used when possible since that's more efficient.
*Please note:* For users that provide a custom `CMapReaderFactory` instance when calling `getDocument` this could be a breaking change, however it's unlikely that any such users exist.
(The *internal* format of this data was changed previously in PR 18951, and there hasn't been a single question/complaint about it in well over a year.)
The WebGPU feature hasn't been released yet but it's interesting to see how
we can use it in order to speed up the rendering of some objects.
This patch allows to render mesh patterns using WebGPU.
I didn't see any significant performance improvement on my machine (mac M2)
but it may be different on other platforms.
Currently it's only possible to trigger page-render debugging through the page number, but when looking at the /Pages tree it's often not immediately obvious what the page number actually is.
However, the /Ref of the page is directly available and it's thus handy to be able to use that one instead to enable page-render debugging.
The goal of this patch is to remove the noice we've in the logs:
```
0:09.36 INFO Console message: [JavaScript Warning: "Script terminated by timeout at:
reset@resource://pdf.js/web/viewer.mjs:11773:7
EventListener.handleEvent*#enableGlobalSelectionListener@resource://pdf.js/web/viewer.mjs:11787:12
render@resource://pdf.js/web/viewer.mjs:11716:20
async*#renderTextLayer@resource://pdf.js/web/viewer.mjs:12108:28
draw/resultPromise<@resource://pdf.js/web/viewer.mjs:12575:53
promise callback*draw@resource://pdf.js/web/viewer.mjs:12570:8
renderView@resource://pdf.js/web/viewer.mjs:7872:14
forceRendering/<@resource://pdf.js/web/viewer.mjs:13963:29
" {file: "resource://pdf.js/web/viewer.mjs" line: 11773}]
```
Given that we "forcibly" set `useWorkerFetch = true` for the MOZCENTRAL build-target there's a small amount of dead code as a result, which we can thus remove during building.
Given that these classes are only used from the "FetchBinaryData" message handler, the `name`/`filename` parameters should never actually be missing and if they are that's a bug elsewhere in the code-base.
Furthermore a missing `name`/`filename` parameter would result in a "nonsense" URL and the actual data fetching would then fail instead, hence keeping this old validation code just doesn't seem necessary.
In PR 20367 the `CompiledFont.prototype.getPathJs` method was changed to return TypedArray data, however the `NOOP` fallback was (likely accidentally) left an empty string.
The compilation of font-paths in PR 20346 was then implemented such that an empty string just happened to be ignored silently, however the assert added in PR 20894 allowed me to spot this return value inconsistency.
*Please note:* Since this only applies to missing or broken glyphs, that wouldn't be rendered anyway, this doesn't show up in reference tests.
Providing one of these parameters is necessary when calling `getDocument`, since otherwise there's nothing to actually load. However, we currently don't enforce that properly and if there's more than one of these parameters provided the behaviour isn't well defined.[1]
The new behaviour is thus, in order:
1. Use the `data` parameter, since the PDF is already available and no additional loading is necessary.
2. Use the `range` parameter, for custom PDF loading (e.g. the Firefox PDF Viewer).
3. Use the `url` parameter, and have the PDF.js library load the PDF with a suitable `networkStream`.
4. Throw an error, since there's no way to load the PDF.
---
[1] E.g. if both `data` and `range` is provided, we'd load the document directly (since it's available) and also initialize a pointless `PDFDataTransportStream` instance.
Given that `CompiledFont.prototype.getPathJs` already returns data in the desired TypedArray format, we should be able to directly copy the font-path data which helps shorten the code a little bit (rather than the "manual" handling in PR 20346).
To ensure that this keeps working as expected, a non-production `assert` is added to prevent any future surprises.
After the changes in PR 20197 the code in the `TranslatedFont.prototype.send` method is not all that readable[1] given how it handles e.g. the `charProcOperatorList` data used with Type3 fonts.
Since this is the only spot where `Font.prototype.exportData` is used, it seems much simpler to move the `compileFontInfo` call there and *directly* return the intended data rather than messing with it after the fact.
Finally, while it doesn't really matter, the patch flips the order of the `charProcOperatorList` and `extra` properties throughout the code-base since the former is used with Type3 fonts while the latter (effectively) requires that debugging is enabled.
---
[1] I had to re-read it twice, also looking at all the involved methods, in order to convince myself that it's actually correct.
The commit message for the patch in PR 20427 is pretty non-descriptive, being only a single line, however there's a bit more context in https://github.com/mozilla/pdf.js/pull/20427#issue-3597370951 but unfortunately the details there don't really make sense.
Note that the PR only changed main-thread code, but all the links are to worker-thread code!?
The `FontFaceObject` class is only used on the main-thread, and when encountering a broken font we fallback to the built-in font renderer; see 820b70eb25/src/display/font_loader.js (L135-L143)
Hence the `FontFaceObject` class *only* needs a way to set the `disableFontFace` property, however nowhere on the main-thread do we ever update the `bbox` of a font.
Unfortunately, eslint-plugin-import depends on eslint 9. This plugin doesn't seem to be
actively maintained (lot of open issues and PRs).
Fortunately there's a fork of the plugin that doesn't support eslint 10 yet but is actively maintained.
So this PR changes the eslint version to 10 and replaces eslint-plugin-import with eslint-plugin-import-x.
When getting the extracted document, the url for the wasms wasn't passed
and consequently the icc-based color wasn't rendered as expected because of the wasm for qcms.
The bots currently detect if the tests failed or not by checking the
output text, but this is error-prone as it would break if the text gets
changed (which is rather unexpected) and it's non-standard as usually
processes report success/failure via their exit code.
This commit makes sure that the test process fails with a non-zero exit
code if the tests failed, and a zero exit code otherwise. Note that we
keep the same output text as before, so this change should not impact
the bots, but it does form another step to unlock usage in GitHub
Actions (which uses the exit code of commands to determine the status of
the workflow run).
Instead of having all the code for the new debugger in a single file,
split it into multiple files.
This makes it easier to navigate and maintain the codebase.
It'll be make hacking and fixing bugs in the debugger easier.
This is an old API-parameter that is now unused within the PDF.js project itself, and its description says that it's (partly) being used for "range requests operations".
Note that the `length` API-parameter is used to set the *initial* `contentLength` in various `BasePDFStreamReader` implementations, however it's always overridden by the "Content-Length" header (sent by the server) when that one exists *and* is a valid number. While we currently fallback to the keep the initial `contentLength` otherwise, note however how in that case range requests will always be *disabled* and thus the only spot in the code-base [where `fullReader.contentLength` is necessary](873378b718/src/core/worker.js (L230-L236)) cannot actually be reached.
Hence the only possible reason to use the `length` API-parameter would be for improved progress reporting[1] during streaming of PDF data in rare cases where the "Content-Length" header is missing/invalid, but the user *somehow* has information from another source about the correct `length` of the PDF document.
That situation feels very much like an edge-case, but it's obviously impossible to know if someone is depending on it. However, please note that there's a work-around available for users affected by this removal:
- Implement a `PDFDataRangeTransport` instance together with custom data-fetching[2], since in that case its `length`-parameter will always be used as-is.
Finally, updates various `BasePDFStreamReader` implementations to only set the `_isRangeSupported` field once the headers are available (since previously we'd just overwrite the "initial" value anyway).
---
[1] I.e. to avoid the "indeterminate" loadingBar being displayed in the viewer.
[2] This is what e.g. the Firefox PDF Viewer uses.
Currently the `bbox`, `fontMatrix`, and `defaultVMetrics` getters duplicate almost the same code, which we can avoid by adding a new helper method (similar to existing ones for reading numbers and strings).
The added `assert` in the new helper method also caught a bug in how the `defaultVMetrics` length was compiled.
On the worker-thread only the static `write` methods are actually used, and on the main-thread only class instances are being created.
Hence this, after PR 20197, leads to a bunch of dead code in both of the *built* `pdf.mjs` and `pdf.worker.js` files.
This patch reduces the size of the `gulp mozcentral` output by `21 419` bytes, i.e. `21` kilo-bytes, which I believe is way too large of a saving to not do this.
(I can't even remember the last time we managed to reduce build-size this much with a single patch.)
It has few cool features:
- all the canvas used during the rendering can be viewed;
- the different properties in the graphics state can be viewed;
- the different paths can be viewed.
The purpose of PR 11844 was to reduce memory usage once fonts have been attached to the DOM, since the font-data can be quite large in many cases.
Unfortunately the new `clearData` method added in PR 20197 doesn't actually remove *anything*, it just replaces the font-data with zeros which doesn't help when the underlying `ArrayBuffer` itself isn't modified.
The method does include a commented-out `resize` call[1], but uncommenting that just breaks rendering completely.
To address this regression, without having to make large or possibly complex changes, this patch simply changes the `clearData` method to replace the internal buffer/view with its contents *before* the font-data.
While this does lead to a data copy, the size of this data is usually orders of magnitude smaller than the font-data that we're removing.
---
[1] Slightly off-topic, but I don't think that patches should include commented-out code since there's a very real risk that those things never get found/fixed.
At the very least such cases should be clearly marked with `// TODO: ...` comments, and should possibly also have an issue filed about fixing the TODO.
In PR 20016 the actual uses of the `enableHWA` option was removed from the viewer, but for some reason it's still being provided when initializing `PDFViewer` and `PDFThumbnailViewer` despite the fact that it's now dead code.
The `PagesMapper` class currently makes up one third of the `src/display/display_utils.js` file size, and since its introduction it's grown (a fair bit) in size.
Note that the intention with files such as `src/display/display_utils.js` was to have somewhere to place functionality too small/simple to deserve its own file.
- Mention the internal viewer in the README, such that it's easier to find.
- Implement a new `INTERNAL_VIEWER` define, such that it's easier to limit code to only the "internal-viewer" gulp target.
- Only include the "GetRawData" message-handler when needed. Note that the `MessageHandler` [already throws](eb159abd6a/src/shared/message_handler.js (L121-L123)) for any missing handler.
- Move the various new helper functions from `src/core/document.js` and into their own file. The reasons for doing this are:
- That file is already quite large and complex as-is, and these helper functions are slightly orthogonal to its main functionality.
- Babel isn't able to remove all of the new code, and by moving this into a separate file we can guarantee that no extra code ends up in e.g. Firefox.
Similar to excluded tests pending tests should not count towards runs or
result in console logging because they are effectively not (fully) run.
This reduces visual noise and helps to get the tests running on GitHub
Actions where non-passed tests will count towards a non-zero exit code.
This makes the order of checks consistent with the one in
`test/reporter.js` and improves safety because now any status other
than passed will be treated as a failure (also if Jasmine adds more
statuses later on).
This is consistent with a bunch of other viewer code, since an `AbortSignal` can only be aborted once any "abort" listeners can thus be removed when invoked.
When recording bboxes for images, it's enough to record their
clip box / bounding box without needing to run the full bbox
tracking of the image's dependencies.
This patch adds right-click support for images in the PDF, allowing
users to download them. To minimize memory consumption, we:
- Do not store the images separately, and instead crop them out of the
PDF page canvas
- Only extract the images when needed (i.e. when the user right-clicks
on them), rather than eagery having all of them available.
To do so, we layer one empty 0x0 canvas per image, stretched to cover
the whole image, and only populate its contents on right click.
These images need to be inside the text layer: they cannot be _behind_
it, otherwise they would be covered by the text layer's container and
not be clickable, and they cannot be in front of it, otherwise they
would make the text spans unselectable.
This feature is managed by a new preference, `imagesRightClickMinSize`:
- when it's set to `-1`, right-click support is disabled
- when set to `0`, all images are available for right click
- when set to a positive integer, only images whose width and height are
greater than or equal to that value (in the PDF page frame of
reference) are available for right click.
This features is disabled by default outside of MOZCENTRAL, as it
significantly degrades the text selection experience in non-Firefox
browsers.
In PR 19548 these checks were added to ensure that the font-data sent from the worker-thread *always* include correct `disableFontFace` and `fontExtraProperties` data.
For some reason PR 20197 then changed the code such that these checks became effectively pointless, since these properties are now checked after the fact *and* the new getters provide fallback values.
This method is only used with loops, and it should be a tiny bit more efficient to use an iterator directly rather than first iterating through the underlying data to create a temporary `Array` that we finally iterate through at the call-site.
*Please note:* As port of these changes the chars/glyph caches, on the `Font` instances, are changed to use `Map`s rather than Objects.
This method is only used with loops, and it should be a tiny bit more efficient to use an iterator directly rather than first iterating through the underlying `Map` to create a temporary `Array` that we finally iterate through at the call-site.
The one from pdf.js.utils is a bit too old: a lot of bugs have been fixed
in the code that parses PDF files since then.
It's just an internal development tool, so it doesn't need to be perfect,
but it should be good enough to be useful.
Using `0.0.0.0` instead of `localhost` allows connecting from other
devices, significantly simplifying testing on mobile devices.
This is controlled by the `--host` CLI flag and not enabled by
default, since allowing external connections comes with security
implications (e.g. when on a public network without a properly
configured firewall).
There might be reasons to want to listen on custom hostnames, but as
the most common usage will probably be `--host 0.0.0.0`, there is a
shorter alias `--host 0` for it.
A number of these unit-tests didn't actually cover the intended code-paths, since many of them *accidentally* matched the "file size is smaller than two range requests"-check.
The patch also updates `validateRangeRequestCapabilities` to use return-value names that are consistent with the class fields used in the various stream implementations.
Falling back to use the `loaded` byteLength if the server `contentLength` is unknown doesn't make a lot of sense, since it'd lead to the `onProgress` callback reporting `percent === 100` repeatedly while the document is loading despite that being obviously wrong.
Instead we'll now report `percent === NaN` in that case, thus showing the indeterminate progressBar, which seems more correct if the `contentLength` is unknown.
Please note that this code-path is normally not even reached, since streaming is enabled by default (applies e.g. to the Firefox PDF Viewer).
With these changes `0`, `NaN`, `null`, and `undefined` in the `total`-property all result in `percent === NaN` being reported by the callback, since previously e.g. `0` would result in `percent === 100` being reported unconditionally which doesn't make a lot of sense.
Also, remove the "indeterminate" loadingBar (in the viewer) if the `PDFDocumentLoadingTask` fails since there won't be any more data arriving and displaying the animation thus seems wrong.
This adds a basic non-MOZCENTRAL polyfill for now, which we should be able to remove once the next QuickJS version is released; note the pending changelog at f1139494d1/Changelog (L7)
This adds a *very basic* non-MOZCENTRAL polyfill for now, which we should be able to remove once the next QuickJS version is released; note the pending changelog at f1139494d1/Changelog (L8)
This is not only shorter, but (in my opinion) it also simplifies the code.
*Note:* In order to keep the *five* different `BasePDFStreamReader` implementations consistent, we purposely don't re-factor the `PDFWorkerStreamReader` class to support `for await...of` iteration.
Looking at the `BinaryCMapStream` implementation, it's basically a "regular" `Stream` but with added functionality for reading compressed CMap data.
Hence, by letting `BinaryCMapStream` extend `Stream`, we can remove an effectively duplicate method and simplify/shorten the code a tiny bit.
Currently the `customNames` are read one byte at a time, in a loop, and at every iteration converted to a string.
This can be replaced with the `BaseStream.prototype.getString` method, which didn't exist back when this function was written.
The original issue can only be reproduce when the thumbnails are on two lines.
The fix is just a matter of computing the number of elements on the last line once we've
finished the first line.
This method is usually used with loops, and it should be a tiny bit more efficient to use an iterator directly rather than first iterating through ` Map`-values to create a temporary `Array` that we finally iterate through at the call-site.
Note that the `getRawValues` method is old code, and originally the `Dict` class stored its data in a regular `Object`, hence why the old code was written that way.
This method is usually used with loops, and it should be a tiny bit more efficient to use an iterator directly rather than first iterating through ` Map`-keys to create a temporary `Array` that we finally iterate through at the call-site.
Note that the `getKeys` method is old code, and originally the `Dict` class stored its data in a regular `Object`, hence why the old code was written that way.
This changes a number of loops currently using `Dict.prototype.{getKeys, getRaw}`, since it should be a tiny bit more efficient to use an iterator directly rather than first iterating through `Map`-keys to create a temporary `Array` that we finally iterate through at the call-site.
Note that the `getKeys` method is much older than `getRawEntries`, and originally the `Dict` class stored its data in a regular `Object`, hence why the old code was written that way.
In practice we've not accepted PRs with "manual" changes of any non `en-US` locales for many years, however some (older) README/FAQ entries can perhaps be seen as actually inviting such PRs.
Now that l10n updates are running automatically via GitHub Actions, see PR 20749, we're definitely not going to accept any "manual" translation patches so it cannot hurt to de-emphasize localization in various README files.
(Also, since all relevant Fluent files have license headers it doesn't seem meaningful to mention that in the l10n README.)
The user has to select some pages and then click on the "Save As" menu item in the Manage menu.
If they modify the structure of the pdf (deleted, moved, copied pages), they have to use the usual
save button.
Working on PR 20784, I couldn't help noticing that this code can be improved a little bit.
- Only initialize `PdfTextExtractor` in development mode and MOZCENTRAL builds, since it's unused elsewhere.
- Re-factor how `PdfTextExtractor` waits for the viewer to be available/ready, by using existing (internal) events.
This simplifies the `PdfTextExtractor` class and removes its `setViewer` method, which improves general consistency since normally the viewer-components don't use such a method in that way (here it was effectively used as a stand-in for a `setDocument` method).
- Finally, while slightly unrelated, rename the `#getAllTextInProgress` field in the `PDFViewer` class to `#copyAllInProgress` to clearly indicate what it's for since the `getAllText` method is used more generally now.
Given that extracting all text can take a while to complete, especially in long PDF documents[1], it's technically possible for the user to "select all", copy, and finally abort copying (with the `Esc` key) while `PdfTextExtractor.prototype.extractTextContent` is still running.
If that happens the `PDFViewer.prototype.getAllText` code would be interrupted, and text-extraction would thus fail in an intermittent and (likely) hard to debug way. To avoid this we replace the current "global" interrupt handling with an optional `AbortSignal` instead.
---
[1] See e.g. `pdf.pdf` in the test-suite.
This solves [bug 1989406](https://bugzilla.mozilla.org/show_bug.cgi?id=1989406).
(“The user should be able to dismiss the in-content message displayed by clicking somewhere else in the PDF”)
There’s a good gif there that shows the problematic behavior.
In the thread, there are also mentions of 2 similar but slightly separate problems:
* clicking on another highlight should also dismiss
* the mention that hitting the escape key does not dismiss
I found the last point, the escape key, to work already (first test case here).
But this PR solves the main bug (second test case) and the adjacent one
(third test case).
It works by using the existing `unselectAll` handling.
Hopefully I'm not misunderstanding why it was written like that, however using an existing method should be a tiny bit more efficient than querying the DOM.
This will ignore filenames that become effectively empty, i.e. ones that are only ".pdf" and nothing more.
*Please note:* While this passes all existing unit-tests, I don't know if this is necessarily the "correct" solution here.
- Add a new test using only streaming, since that was missing and the lack of which most likely contributed to previous bugs in the `PDFDataRangeTransport` implementation (see PR 10675 and 20634).
- Improve the "ranges and streaming" test, to utilize both ranges *and* streaming properly, since the way it was written seemed somewhat unrealistic given how data will normally arrive when `PDFDataRangeTransport` is being used.
- Provide more `initialData`, in relevant tests, since a length smaller than `rangeChunkSize` seem pretty pointless.
- Test the `contentDispositionFilename`, and `contentLength`, handling in the `PDFDataRangeTransport` implementation.
It's not particularly common for locales to be removed from Firefox, but it has happened occasionally in the past.
Currently the `downloadL10n` function logs a message about any unmaintained locales, but after PR 20749 that code is running in GitHub Actions (rather than locally by a contributor) which makes it much less likely for the message to be seen.
This behaviour comes from the initial pdf.js commit but is wrong and
doesn't match other PDF readers like muPDF or pdfium.
From PDF Spec 7.3.3:
A PDF writer shall not use the PostScript language syntax for numbers with non-decimal radices (such
as 16#FFFE) or in exponential format (such as 6.02E23).
Clicking on the "Find Current Outline Item" button is (obviously) supposed to scroll that outline item into view, however that seems to have broken accidentally in PR 20495.
Currently the transfers aren't actually being used with the "GetOperatorList" message, since the placement of the parameter is wrong; note the method signature: 909a700afa/src/shared/message_handler.js (L219-L229)
This goes back to PR 16588, which added the transfers parameter, and unfortunately we all missed that :-(
Simply fixing the parameter isn't enough however, since that broke printing of Stamp-editors (and possibly others). The solution here is to *not* transfer data during printing, given that a single `PrintAnnotationStorage` instance is being used for all pages.
It fixes#20715.
`failedExpectations` was removed from `suiteStarted` and `specStarted` events.
HtmlReporter and HtmlSpecFilter have been deprecated and removed.
Change all these cases to use `Map.prototype.getOrInsertComputed()` instead, in combination with a helper function for creating the `Array`s (similar to the previous patch).
With the exception of the first invocation the callback function is unused, which means that a lot of pointless functions may be created.
To avoid this we introduce helper functions for simple cases, such as creating `Map`s and `Objects`s.
Currently we essentially "duplicate" the same code for parsing the `values` and `keys` of the `searchParams`, which seems a little unnecessary.
To be able to parse the `searchParams` from the end, we currently create an Array (from the Iterator) and then reverse it before finally looping through it. Here the latter two steps can be replaced with the `Array.prototype.findLast()` method instead.
*Please note:* I completely understand if this patch is rejected, on account of being less readable than the current code.
There's a number of classes where the constructors can be removed completely by instead using class fields, which help to slightly shorten the code.
It seems that `unicorn/prefer-class-fields` ESLint plugin, see PR 20657, unfortunately isn't able to detect all of these cases.
Rather than assigning it manually in every extending class, we can utilize the fact that the `AnnotationType`-entries are simply the upper-case version of the `/Subtype` (when it exists) in the Annotation dictionary.
When checking the code coverage report, it was noticed that the line numbers were off.
It was due to the fact that the files used for coverage were the transpiled ones,
when the ones used by Codecov were the original ones.
So this patches adds the source maps to the transpiled files, and also updates
the license header in the original files in using a babel plugin in order
to make sure the line numbers are correct.
As a side effect of this work, it's now possible to have the correct line
numbers in the stack traces when running tests with the transpiled files.
This is a major version bump, but the changelog at
https://github.com/stylelint/stylelint/releases/tag/17.0.0
doesn't indicate any breaking changes that should impact us.
We only have to list `stylelint` in `eslint.config.js` similar to e.g.
commit a2909f9b.
This is a major version bump, but the changelog at
https://github.com/sindresorhus/eslint-plugin-unicorn/releases/tag/v63.0.0
doesn't indicate any breaking changes that should impact us.
Note that this version supports ESLint 10, which is a prerequisite for
being able to upgrade ESLint itself in a follow-up later.
This is not only shorter, but (in my opinion) it also simplifies the code.
*Note:* In order to keep the *five* different `BasePDFStreamRangeReader` implementations consistent, we purposely don't re-factor the `PDFWorkerStreamRangeReader` class to support `for await...of` iteration.
It's a first step to add code coverage.
In order to get the code coverage report locally, you can run the following command:
```bash
npx gulp unittestcli --coverage
```
The code coverage report will be generated in the `./build/coverage` directory.
And the report can be consulted by opening:
http://localhost:8888/build/coverage/index.html
A GitHub workflow has also been added to run the unit tests with code coverage
on each push and pull request. The report will be uploaded to Codecov.
This leads to slightly shorter code[1] when initializing classes, and in some cases we can even remove the constructors, which shouldn't hurt; see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-class-fields.md
It's probably possible to also change a lot of these class fields to private ones[2], however it's often difficult to tell at a glance if that's safe hence this patch only does this for the `PDFRenderingQueue`.
---
[1] This reduces the size of the `gulp mozcentral` output by 999 bytes, for a mostly mechanical code change.
[2] That sort of re-factoring should generally be done separately, on a class-by-class basis, to reduce the risk of regressions.
While we don't dispatch the actual range request after PR 10694 we still parse the returned data, which ends up being an *empty* `ArrayBuffer` and thus cannot affect the `ChunkedStream.prototype._loadedChunks` property.
Given that no actual data arrived, it's thus pointless[1] to invoke the `ChunkedStreamManager.prototype.onReceiveData` method in this case (and it also avoids sending effectively duplicate "DocProgress" messages).
---
[1] With the *possible* exception of `disableAutoFetch === false` being set, see f24768d7b4/src/core/chunked_stream.js (L499-L517) however that never happens when streaming is being used; note f24768d7b4/src/core/worker.js (L237-L238)
In all cases where we currently use `Response.prototype.arrayBuffer()` the result is immediately wrapped in a `Uint8Array`, which can be avoided by instead using the newer `Response.prototype.bytes()` method; see https://developer.mozilla.org/en-US/docs/Web/API/Response/bytes
This patch updates the minimum supported browsers as follows:
- Google Chrome 118, which was released on 2023-10-10; see https://chromereleases.googleblog.com/2023/10/stable-channel-update-for-desktop_10.html
We haven't made any changes to the supported Google Chrome version for a year, and this change allows us to remove "hacks" needed to support `float: inline-start/inline-end` in old browsers; see https://developer.mozilla.org/en-US/docs/Web/CSS/Reference/Properties/float#browser_compatibility.
Note that nowadays we usually try, where feasible and possible, to support browsers that are about two years old. By limiting support to only "recent" browsers we reduce the risk of holding back improvements of the *built-in* Firefox PDF Viewer, and also (significantly) reduce the maintenance/support burden for the PDF.js contributors.
*Please note:* As always, the minimum supported browser version assumes that a `legacy`-build of the PDF.js library is being used; see https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#faq-support
We tried to lookup the font metrics using the font name as-is, which didn't work since the PDF file in question has non-embedded fonts with names that include commas.
Hence the font names need to be normalized here as well, similar to elsewhere in the font code.
Given that the Node.js code uses standard `ReadableStream`s now, see PR 20594, it can use the same `getArrayBuffer` as the Fetch API implementation.
Also, change the `getArrayBuffer` fallback case to an Error (rather than a warning) since that should never actually happen.
Currently the same identical code is duplicated four times per file, which seems completely unnecessary.
Note that the function isn't placed in `src/display/network_utils.js`, since that file isn't included in MOZCENTRAL builds.
Doing skip-cache reloading of https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#disableRange=true in the latest Firefox Nightly version I noticed an *intermittent* bug, where the loadingBar would fill up but without the PDF ever rendering.
Initially I was quite worried that the changes in PR 20602 had somehow caused this, however after a bunch of testing in a slightly older Nightly I was able to reproduce the problem there as well.[1]
Instead I believe that this problem goes all the back to PR 10675, so still my fault, since the `progressiveDone` handling there was incomplete.
Note how we only set the `this._done` property, but don't actually resolve any still pending requests. For reference, compare with the handling in the `PDFDataTransportStreamRangeReader.prototype._enqueue` method.
Depending on the exact order in which the `PDFDataTransportStreamReader.prototype.{_enqueue, read, progressiveDone}` methods are invoked, which depends on how/when the PDF data arrives, the bug might occur.
The reason that this bug hasn't been caught before now is likely that `disableRange=true` isn't used by default and Firefox users are unlikely to manually set that in e.g. `about:config`.
---
[1] Possibly some timings changed to make it slightly more common, but given the intermittent nature of this it's difficult to tell.
This method is only invoked via `ChunkedStreamManager.prototype.sendRequest`, which currently returns data in `Uint8Array` format (since it potentially combines multiple `ArrayBuffer`s).
Hence we end up doing a short-lived, but still completely unnecessary, data copy[1] in `ChunkedStream.prototype.onReceiveData` when handling range requests. In practice this is unlikely to be a big problem by default, given that streaming is used and the (low) value of the `rangeChunkSize` API-option. (However, in custom PDF.js deployments it might affect things more.)
Given that no data copy is better than a short lived one, let's fix this small oversight and add non-production `assert`s to keep it working as intended.
This way we also improve consistency, since all other streaming and range request methods (see e.g. `BasePDFStream` and related code) only return `ArrayBuffer` data.
---
[1] Remember that `new Uint8Array(arrayBuffer)` only creates a view of the underlying `arrayBuffer`, whereas `new Uint8Array(typedArray)` actually creates a copy of the `typedArray`.
Currently there's two small bugs, which have existed around a decade, in the `loaded` property that's sent via the "DocProgress" message from the `ChunkedStreamManager.prototype.onReceiveData` method.
- When the entire PDF has loaded the `loaded` property can become larger than the `total` property, which obviously doesn't make sense.
This happens whenever the size of the PDF is *not* a multiple of the `rangeChunkSize` API-option, which is a very common situation.
- When streaming is being used, the `loaded` property can become smaller than the actually loaded amount of data.
This happens whenever the size of a streamed chunk is *not* a multiple of the `rangeChunkSize` API-option, which is a common situation.
The decoder is a dependency of the jbig2 one and is already
included in pdf.js, so we just need to wire it up.
It improves the performance of documents using ccittfax images.
Report loading progress "automatically" when using the `PDFDataTransportStream` class, and remove the `PDFDataRangeTransport.prototype.onDataProgress` method
Currently this code expects a "url string", rather than a proper `URL` instance, which seems completely unnecessary now. The explanation for this is, as so often is the case, "historical reasons" since a lot of this code predates the general availability of `URL`.
This is consistent with the other `BasePDFStream` implementations, and simplifies the API surface of the `PDFDataRangeTransport` class (note the changes in the viewer).
Given that the `onDataProgress` method was changed to a no-op this won't affect third-party users, assuming there even are any since this code was written specifically for the Firefox PDF Viewer.
This should help reduce the maintenance burden of the code, since you no longer need to remember to update separate code when touching the different page/thumbnail classes.
This should help reduce the maintenance burden of the code, since you no longer need to remember to update separate code when touching the different `DownloadManager` classes.
This should help reduce the maintenance burden of the code, since you no longer need to remember to update separate code when touching the different `PDFPrintServiceFactory` classes.
Given that we either use the `L10n` class directly or extend it via `GenericL10n`, it should no longer be necessary to keep the interface-definition.
This should help reduce the maintenance burden of the code, since you no longer need to remember to update separate code when touching the `L10n` class.
Given that `SimpleLinkService` now extends the regular `PDFLinkService` class, see PR 18013, it should no longer be necessary to keep the interface-definition.
This should help reduce the maintenance burden of the code, since you no longer need to remember to update separate code when touching the `PDFLinkService` class.
The percentage calculation is currently "spread out" across various viewer functionality, which we can avoid by having the API handle that instead.
Also, remove the `this.#lastProgress` special-case[1] and just register a "normal" `fullReader.onProgress` callback unconditionally. Once `headersReady` is resolved the callback can simply be removed when not needed, since the "worst" thing that could theoretically happen is that the loadingBar (in the viewer) updates sooner this way. In practice though, since `fullReader.read` cannot return data until `headersReady` is resolved, this change is not actually observable in the API.
---
[1] This was added in PR 8617, close to a decade ago, but it's not obvious to me that it was ever necessary to implement it that way.
For now, `BrotliDecode` hasn't been specified but it should be in a
close future.
So when it's possible we use the native `DecompressionStream` API
with "brotli" as argument.
If that fails or if we've to decompress in a sync context, we fallback
to `BrotliStream` which a pure js implementation (see README in external/brotli).
Given that there's no less than *five* different, but very similar, implementations this helps reduce code duplication and simplifies maintenance.
Also, remove the `rangeChunkSize` not defined checks in all the relevant stream-constructor implementations.
Note how the API, since some time, always validates *and* provides that parameter when creating a `BasePDFStreamReader`-instance.
Given that there's no less than *five* different, but very similar, implementations this helps reduce code duplication and simplifies maintenance.
Also, spotted during rebasing, pass the `enableHWA` option "correctly" (i.e. as part of the existing `transportParams`) to the `WorkerTransport`-class to keep the constructor simpler.
Note how there's *nowhere* in the code-base where the `IPDFStreamRangeReader.prototype.onProgress` callback is actually being set and used, however the loadingBar (in the viewer) still works just fine since loading progress is already reported via:
- The `ChunkedStreamManager` instance respectively the `getPdfManager` function, through the use of a "DocProgress" message, on the worker-thread.
- A `IPDFStreamReader.prototype.onProgress` callback, on the main-thread.
Furthermore, it would definitely *not* be a good idea to add any `IPDFStreamRangeReader.prototype.onProgress` callbacks since they only include the `loaded`-property which would trigger the "indeterminate" loadingBar (in the viewer).
Looking briefly at the history of this code it's not clear, at least to me, when this became unused however it's probably close to a decade ago.
Given that nothing in the `PDFWorkerStreamRangeReader` class attempts to invoke the `onProgress` callback, this is effectively dead code now.
Looking briefly at the history of this code it's not clear, at least to me, when this became unused however it's probably close to a decade ago.
Finally, note also how progress is already being reported through the `ChunkedStreamManager.prototype.onReceiveData` method.
This getter was only invoked from `src/display/network.js` and `src/core/chunked_stream.js`, however in both cases it's hardcoded to `false` and thus isn't actually needed.
This originated in PR 6879, close to a decade ago, for a potential TODO which was never implemented and it ought to be OK to just simplify this now.
This really isn't necessary, and it's just a left-over from before the code was moved into the current file.
Also, spotted during rebasing, use the existing "locale" hash-parameter in integration-tests rather than adding a duplicate one for testing.
This avoids the hassle of having to manually update that file when adding/modifying preferences in the viewer.
Updating the preferences-metadata should now only be something that the Chromium addon maintainer has to do.
It's width was a bit wrong because of its box-sizing property and it was causing some issues when resizing it with the keyboard.
And for the thumbnails sidebar, the tabindex was missing and some aria properties too.
Modified the regex in web/autolinker.js to explicitly allow hyphens (-) in
the domain part of email addresses, while maintaining the exclusion of
other punctuation. This fixes mailto links like user@uni-city.tld being
truncated at the hyphen.
Fixes#20557
Usually, content stream or fonts are compressed using FlateDecode.
So use the DecompressionStream API to decompress those streams
in the async code path.
Given that only the `FileSpec.prototype.serializable` getter is ever invoked from "outside" of the class, and only once per `FileSpec`-instance, the caching/shadowing isn't actually necessary.
Furthermore the `_contentRef`-caching wasn't actually correct, since it ended up storing a `BaseStream`-instance and those should *generally* never be cached.
(Since calling `BaseStream.prototype.getBytes()` more than once, without resetting the stream in between, will return an empty TypedArray after the first time.)
The `NetworkManager` is very old code at this point, and it predates the introduction of the streaming functionality by many years.
To simplify things, especially with upcoming re-factoring patches, let's move this functionality into private (and semi-private) methods in the `PDFNetworkStream` class to avoid having to deal with too many different scopes.
Store pending requests in a `WeakMap`, which allows working directly with the `XMLHttpRequest`-data and removes the need for a couple of methods.
Simplify the `PDFNetworkStreamRangeRequestReader.prototype._onDone` method, since after removing `moz-chunked-arraybuffer` support (see PR 10678) it's never called without an argument.
Stop sending the `begin`-property to the `onDone`-callbacks since it's unused. Originally this code was shared with the Firefox PDF Viewer, many years ago, and then that property mattered.
Re-factor the `status` validation, to avoid checking for a "bad" range-request response unconditionally.
The bug was supposed to be fixed by #20471 but here there are some annotations in the pdf.
When those annotations are added to the DOM, the struct tree has to be rendered but without
the text layer (because of asynchronicity).
So this patch is making sure that the modifications in the text layer are done once the
layer is rendered.
I think it should have been removed with #2527 so it should be useless now.
Because of that stuff, some commands with a wrong number of arguments
weren't stripped out (see the pdf in #13850).
Some gradients are represented as patterns in PDF.js, because they
mustn't be affected by the current transform. But in most of the cases,
the gradient is attached to the origin and the current transform is very
basic (dilatation + orthogonal + translation). In those cases, we can
avoid creating the pattern because the gradient is transformed into
another gradient when the inverse transform is applied.
This commit extracts the logic to draw a line from one coordinate to
another to both remove code duplication (8% of the total number of lines
in the file are removed) and clarify the intent of the individual tests.
Update the styles and HTML to reflect the new views manager concept.
For now, nothing about split/merge functionality is implemented or visible.
The new styles for the outline, attachments, and layers will be added later.
The thumbnail view is now accessible with the keyboard.
Fixes https://github.com/mozilla/pdf.js/issues/11595, where cancelling loading with `loadingTask.destroy()` before it finishes throws a `Worker was terminated` error that CANNOT be caught.
When worker is terminated, an error is thrown here:
6c746260a9/src/core/worker.js (L374)
Then `onFailure` runs, in which we throw again via `ensureNotTerminated()`. However, this second error is never caught (and cannot be), resulting in console spam.
There is no need to throw any additional errors since the termination is already reported [here](6c746260a9/src/core/worker.js (L371-L373)), and `onFailure` is supposed to handle errors, not throw them.
By setting `display: contents` on `.markedContent` containers, they stop
affecting the layout of their children. This means that we can always
position text layer `<span>` elements using percentages relative to the
page dimensions, rather than having two separate code paths.
For some reason this breaks the workaround for text selection flickering
in Chrome/Safari, which can be fixed by setting `user-select: text` on
the `.endOfContent` div (only in Chrome/Safari, as it would break
selection in Firefox).
This commit moves all the logic to scale up&down `<span>`s in the text
layer, introduced in #18283, to CSS.
The motivation for this change is that #18283 is still not enough for
all cases. That PR fixed the problem in Chrome&Firefox desktop, which
allow users to set an actual minimum font size in the browser settings.
However, other browsers (e.g. the Chrome-based WebView on Android) have
more complex logic and they scale up small text rather than simply
applying a minimum.
A workaround for that behavior is probably out of scope for PDF.js
itself as it only affects not officially supported platforms. However,
having access to the actual expected font height (through
`--font-height`) allows embedders of PDF.js to implement a workaround by
themselves.
The goal is to be able to catch the errors before making a release.
And fix some css issues (especially the missing css code for the newly added menu.css)
This way, the screen readers can read the math content properly.
The elements in the text layer will also have aria-hidden="true"
to avoid duplication.
This commit updates the release pipeline to use OIDC trusted publishing
now that we have configured it between GitHub Actions and NPM. This
solution allows us to remove the token variable (because there is no
longer a fixed token) and provenance flag (because provenance
attestations are generated by default with this approach); refer to
https://docs.npmjs.com/trusted-publishers for more information.
For now its use is limited to the comment sidebar but it'll be used for the new one
containing the thumbnails.
And make the sidebar more accessible with the keyboard or a screen reader.
This dependency got introduced during the move to Fluent for
localization (bug 1858715), but it wasn't added to `package.json`
at the time. This worked because other dependencies already
installed it, but we shouldn't rely on that, so this commit
explicitly includes it in `package.json` instead.
Fixes 66982a2a.
The linter found some issues in viewer.html with </input> which isn't required
and a missing closing div in test/resources/reftest-analyzer.html.
The HTML can now be nicely formatted. In order to not break the build for
mozilla-central, the preprocessor has been fixed in order to take into account
the white spaces at the beginning of a comment line.
And finally, make .prettierrc (which is supposed to be either json or yaml)
itself lintable.
Doing so has a number of advantages:
- it removes code duplication, thereby improving readability;
- it removes hardcoded editor IDs, by using the `getNextEditorId` helper
function that was previously introduced for the highlight editor
integration tests, thereby improving readability and reusability;
- it removes potential for intermittent failures by not proceeding until
the freetext editor is fully created and all assertions pass, which
didn't happen consistently before because the code wasn't centralized.
For now it's just possible to create a single pdf in selecting some pages in different pdf sources.
The merge is for now pretty basic (it's why it's still a WIP) none of these data are merged for now:
- the struct trees
- the page labels
- the outlines
- named destinations
For there are 2 new ref tests where some new pdfs are created: one with some extracted pages and an other
one (encrypted) which is just rewritten.
The ref images are generated from the original pdfs in selecting the page we want and the new images are
taken from the generated pdfs.
Clear the pointer type of CurrentPointers outside of the DrawingEditor to always keep the same, until the user changes the mode.
One of the goal is to keep the same pointer type in case of a refresh of the page or if the user changes the pdf document.
Metric-compatible font with Times New Roman created by ParaType, based on
their serif font PT Serif, released under OFL-1.1 license.
https://www.paratype.com/fonts/pt/pt-astra-serif
Signed-off-by: Coelacanthus <uwu@coelacanthus.name>
It'll help to make math equations "visible" for screen readers.
MS Office has a specific way to add some MathML code to struc tree leaf
and this patch handles it.
Move current pointer field of DrawingEditor to CurrentPointer class in tools.js: The pointer types fields have been moved to a CurrentPointer object in tools.js. This object is used by eraser.js and ink.js.
Only reset pointer type when user select a new mode: Clear the pointer type when changing mode, instead of at the end of the session. It seems more stable, as the method is not called this way when the user changes pages. Also, clear the pointer type when the mode is changed by an event (the user changes the editor type), otherwise, the same pointer type is kept (the document is changed for example)
On GitHub Actions this test could fail with `Expected 1.3125 to be less
than 1` due to slight differences in the decimals of the `rect.y` value.
However, for this check we're only really interested in full pixels, so
this commit rounds the value up to the next integer to not be affected
by small decimal differences so the test passes on GitHub Actions too.
For the integration tests we prefer non-linked test cases because those
PDF files are directly checked into the Git repository and thus don't
need a separate download step that linked test cases do.
However, for the freetext and ink integration tests we currently use
`aboutstacks.pdf` which is a linked test case. Fortunately we don't need
to use it because for most tests we don't actually use any properties of
it: we only create editors on top of the canvas, but for that any PDF
file works, so we can simply use the non-linked `empty.pdf` file instead.
The only exception is the "aria-owns" test that needs a line of text from
the PDF file, so we move that particular test to a dedicated `describe`
block and adapt it to use the non-linked `attachment.pdf` file that just
contains a single line of text that can be used for this purpose.
The changes combined make 12 more integration tests run out-of-the-box
after a Git clone, which also simplifies running on GitHub Actions.
We used a SVG string which can be pass to the Path2D ctor but it's a bit slower than
building the path step by step.
Having numerical data instead of a string will help the font data serialization.
The position of the text rendered by `showText` is affected
incrementally by the preceding `showText` operations "on the same line".
For this reason, we keep track of all of them (with their dependencies)
in `sameLineText`.
`sameLineText` can be reset whenever we explicitly position the text
somewhere else. We were previously only doing it for `moveText`, and
this patch updates the code to also do it on `setTextMatrix` (which
resets `this.current.x/y` in `CanvasGraphics` to 0).
The complexity of subsequent `sameLineText` operations dependency
tracking grows quadratically with the number of operations on the same
line, so this patch fixes the performance problem when there are whole
pages of text that only use `setTextMatrix` and not `moveText`.
Python 3.14 is the current stable version, released on October 7th. The
dependencies we use also support Python 3.14 now, most importantly
`fonttools` for which the OS-specific builds have been published (see
the `cp314` wheels on https://pypi.org/project/fonttools/#files).
Follow up on https://github.com/mozilla/pdf.js/pull/20197,
This serializes pattern data into an ArrayBuffer which is
then transferred from the worker to the main thread.
It sets up the stage for us to eventually switch to a
SharedArrayBuffer in the future.
Instead of looking at every bbox, we use a grid (64x64) where each cell of the grid is associated with the bboxes
touching it.
In order to get the potential bboxes containing a point, we just have to compute the number of the cell containing
it and in using the associated described above, we can quickly know if the point is contained.
With the pdf in the mentioned bug, it's ~20 times faster.
The `width` and `height` arguments for `setDims` have been removed in
PR #20285, but for two calls in the highlight code they remained. This
commit removes them as they are no longer used in the method itself.
Fixes 0722faa9.
But keep a lower quality when enableOptimizedPartialRendering is true because we need to compensate the time
used to compute the bboxes and since subsequent rendering are faster it's more acceptable to see
a lower quality image for few tenths of seconds.
[Annotation] In reading mode with new commment stuff enabled, use the comment popup for annotations without a popup but with some contents (bug 1991029)
This PR serializes font data into an ArrayBuffer
that is then transfered from the worker to the
main thread. It's more efficient than the current
solution which clones the "export data" object
which includes the font data as a Uint8Array.
It prepares us to switch to a SharedArrayBuffer
in the future, which would allow us to share
the font data with multiple agents, which would be
crucial for the upcoming "renderer" worker.
Change info to use console.info and warn function
to use console.warn, this not only makes sense semantically
but also in practice server side runtimes like deno
write console.log to stdout, and console.warn
to stderr (info goes to stdout, unfortunately?)
this is important because logging to stdout
can break some cli apps.
Before this patch, when `enableOptimizedPartialRendering`
is enabled we would record the bounding boxes of the
various operations on the first render.
This patches change it to happen on the first render that we
know will also need a detail view, so that the performance
cost is not paid for the case when the detail view is not used.
The size of the canvas has significant impact on the rendering
performance. If we are going to render a high-res detail
view on top of the full-page canvas, we can further
reduce the full-page canvas resolution to improve
rendering time without affecting the resolution seen by
the user.
Users will se the lower resolution when quickly scrolling around the
page, but it will then be replaced with the high-res
detail view.
It can happen with a pdf having a large text layer.
Instead of waiting for the first rendered page to enable the buttons we wait for
a rendered annotation editor layer.
In order to see the issue this patch is fixing:
- open a pdf with some highlights and a comment on page 1, at page 7
- open the comment sidebar
- click on the comment on page 1
Opening at page 7 lets a not fully rendered page which means that when jumping to it
with the sidebar, we re-use what we've instead of redrawing it.
This PR changes the way we store bounding boxes so that they use less
memory and can be more easily shared across threads in the future.
Instead of storing the bounding box and list of dependencies for each
operation that renders _something_, we now only store the bounding box
of _every_ operation and no dependencies list. The bounding box of
each operation covers the bounding box of all the operations affected
by it that render something. For example, the bounding box of a
`setFont` operation will be the bounding box of all the `showText`
operations that use that font.
This affects the debugging experience in pdfBug, since now the bounding
box of an operation may be larger than what it renders itself. To help
with this, now when hovering on an operation we also highlight (in red)
all its dependents. We highlight with white stripes operations that do
not affect any part of the page (i.e. with an empty bbox).
To save memory, we now save bounding box x/y coordinates as uint8
rather than float64. This effectively gives us a 256x256 uniform grid
that covers the page, which is high enough resolution for the usecase.
Bug 1978027 has been fixed upstream 10 days ago, so this integration
test can be enabled for Firefox too now that it passed with recent
Nightly versions.
Before the introduction of the `renderRichText` helper function we
exclusively used `this.#html` for XFA rich text and exclusively used
`this.#contentsObj` for plain text. However, after the refactoring we
tried to access `this.#contentsObj.dir` in both cases, which fails for
XFA rich text because `this.#contentsObj` is `null` in that case.
This commit fixes the issue by using optional chaining to make sure we
don't try to access non-existent `this.#contentsObj` properties, which
makes the `must update an existing annotation and show the right popup`
freetext integration pass again.
Fixes#20237.
Fixes 35c90984.
Most places have a newline before/after `before{Each,All}`,
`after{Each,All}` and `it` to visually separate the blocks for clarity,
but in a handful of places this wasn't done. This commit removes the
inconsistencies so that the test code is formatted consistently.
The helper function was used in a number of places, but also a lot of
places contained the annotation selector string inline. This commit
makes sure that all places use `getAnnotationSelector` consistently to
make sure the annotation selector string is only defined in a single
place and to improve readability of the test code.
This test called `closeSinglePage` manually at the end of the test,
which is inconsistent with all other tests that call `closePages` in an
`afterEach` block. This commit fixes the difference for consistency.
If the document changes the comment state from the old document should
be replaced with that of the new document. To do this the comment
manager is destroyed, but the corresponding comment sidebar wasn't
destroyed yet, which resulted in the comment state from the old document
still being visible for the new document.
This commit fixes the issue by hiding the comment sidebar if the comment
manager is destroyed. Note that hiding the comment sidebar effectively
destroys all its state, and we already set the annotation mode to "none"
on document change so we don't want to keep showing the comment sidebar
anyway.
This function is required for the commenting feature: the user will be able to click
on a comment in the sidebar and the document will be scrolled accordingly.
Locally, on Arch Linux, this integration test permafails:
```
1) Text layer Text selection using selection carets doesn't jump when moving selection
Message:
second selection:
Expected '(frequently executed) bytecode sequences, records
them, and compiles them to fast native code. We call such a s' to roughly match /frequently .* We call such a se/s.
Stack:
at <Jasmine>
at UserContext.<anonymous> (file:///home/timvandermeij/Documenten/Ontwikkeling/pdf.js/Code/test/integration/text_layer_spec.mjs:521:12)
Message:
third selection:
Expected '(frequently executed) bytecode sequences, records
them, and compiles them to fast native code. We call such a s' to roughly match /frequently .* We call such a se/s.
Stack:
at <Jasmine>
at UserContext.<anonymous> (file:///home/timvandermeij/Documenten/Ontwikkeling/pdf.js/Code/test/integration/text_layer_spec.mjs:529:12
```
The exact selection can differ a bit per OS/browser. In this case the
last character was consistently not selected while on other platforms it
is, so this commit fixes the issue by relaxing the regex to not consider
the final character so that the test passes if the rest matches.
This removes the following error from the integration test logs:
```
JavaScript error: http://127.0.0.1:59283/build/generic/build/pdf.mjs, line 3879:
TypeError: EventTarget.addEventListener: 'signal' member of AddEventListenerOptions is not an object.
```
Fixes 636ff50.
Extends 63651885.
This commit is a first step towards #6419, and it can also help with
first compute which ops can affect what is visible in that part of
the page.
This commit adds logic to track operations with their respective
bounding boxes. Only operations that actually cause something to
be rendered have a bounding box and dependencies.
Consider the following example:
```
0. setFillRGBColor
1. beginText
2. showText "Hello"
3. endText
4. constructPath [...] -> eoFill
```
here we have three rendering operations: the showText op (2) and the
path (4). (2) depends on (0), (1) and (3), while (4) only depends on
(0). Both (2) and (4) have a bounding box.
This tracking happens when first rendering a PDF: we then use the
recorded information to optimize future partial renderings of a PDF, so
that we can skip operations that do not affected the PDF area on the
canvas.
All this logic only runs when the new `enableOptimizedPartialRendering`
preference, disabled by default, is enabled.
The bounding boxes and dependencies are also shown in the pdfBug
stepper. When hovering over a step now:
- it highlights the steps that they depend on
- it highlights on the PDF itself the bounding box
We don't need AI/ML features in the tests, so this should reduce CPU
usage by not having the inference process running. Moreover, it prevents
the following lines from being logged in the test output:
```
JavaScript error: resource://gre/actors/MLEngineParent.sys.mjs, line 509: Error: Unable to get the ML engine from Remote Settings.
JavaScript error: resource://gre/actors/MLEngineParent.sys.mjs, line 1279: TypeError: can't access property "postMessage", this[#port] is null
```
Implement a delay for Chrome protocol calls in the integration tests, and skip the "must check that an existing highlight is ignored on hovering" integration test on Windows
This is a temporary measure to reduce noise until #20136 is fixed. Note
that this shouldn't be an issue in terms of coverage because we still
run the test on Linux.
In Chrome protocol calls are faster than in Firefox and thus trigger in
quicker succession. This can cause intermittent failures because new
protocol calls can run before events triggered by the previous protocol
calls had a chance to be processed (essentially causing events to get
lost).
This commit fixes the issue by configuring Chrome with a protocol call
delay value that gives it a more similar execution speed as Firefox
(which also gives us more consistency between the two browser runs).
Note that this doesn't negatively impact the overall runtime of the
integration tests because Puppeteer already waits for a test to complete
in both browsers before continuing to the next one and Chrome
consistently was, and with this patch still slightly is, faster in
completing the tests.
It should fix the error:
```
JavaScript error: http://127.0.0.1:43303/build/generic/build/pdf.mjs, line 1445:
TypeError: EventTarget.addEventListener: 'signal' member of AddEventListenerOptions is not an object.
```
we've when running integration tests on the Linux bot.
In #20016, the `canvasContext` property of `RenderParameters` was deprecated in favor of the new `canvas` property.
The JSDoc was updated to include the new parameter along with the old one.
I think the old one should be enclosed in `[]` to mark it as optional (and to allow usage from TypeScript with just the `canvas` parameter provided).
I also reordered the properties so that all required properties come first, follow by optional ones.
The fixed -400px horizontal offset used by
scrollIntoView led to horizontal scroll only moving
part-way right on narrow screens. The highlights near
the right-edge remained party or completely off
screen.
This centres the highlighted match on any viewport width while
clamping the left margin to 20-400px. On very narrow screens
the scrollbar now moves all the way to the right instead of
stopping midway.
It's useful for users highlighting with NVDA.
They've to enable native selection and then selection some text.
In this case only a selectionchange is triggered once the selection is done.
Typing in the text field causes a sandbox event to trigger, which we
should await to avoid continuing to the next part of the test before
the sandbox event is fully processed.
Ensure that negative “LW” entries in an ExtGState dictionary are converted to their absolute values when the “gs” operator is processed.
See PR https://github.com/mozilla/pdf.js/pull/19639
It takes some time for the viewer alert to be updated after the editor
is committed, but the current tests don't await that and proceed too
fast to the viewer alert string assertion. This commit fixes the issue
by waiting for the expected viewer alert string to appear instead.
It fixes#20065.
The only to get a path (from the path generator) is when the font is embedded.
So when we need a path (disableFontFace: true or when we want to use a pattern for stroking/filling), it's impossible
to fulfil.
In order to screenshot the page and assert that it's monochrome,
providing a regression test for #18694, the viewer background is
configured to match the page background because screenshotting the page
always captures a small part of the viewer background as well, and this
way we can easily go over all pixels and check that they are all equal.
However, in addition to configuring the viewer background the test also
hides the toolbar and removes the page border. Especially the latter
makes `scrollIntoView` fail in both Chrome and Firefox with recent
Puppeteer versions, for reasons which remain a bit unclear.
Fortunately both hiding the toolbar and removing the page border is not
actually necessary (anymore) for the test to work, so we can simply
remove those actions to fix the issue and reduce the amount of code. To
make sure that the test still covers the original issue correctly we've
reverted the changes from #18698 and then test still fails as expected.
Fixes#19811.
Fixes 68332ec2.
This is a precursor to moving the call into a
worker thread to let us use `OffscreenCanvas`. The
current position wouldn't work since we make
transformations to the canvas object after the
getContext call, which isn't allowed for
OffscreenCanvas. Also it isn't allowed to clone or
`transferControlToOffscreen` the canvas after the
`getContext` call.
Necessary because when there is no Popup annotation created along
with a Text annotation, the Popup annotation created by pdf.js
does not receive the noRotate flag
The shadow was taken into account when computing the bounding box of the section
containing the link and it was making the clip path wrong.
Since the shadow is almost invisible because of the opacity, the yellow color and the clip
we can remove it without causing any visual regressions (and as a side effect it'll avoid
to use resources to compute it when displayed).
The fixed -400px horizontal offset used by
scrollIntoView led to horizontal scroll only moving
part-way right on narrow screens. The highlights near
the right-edge remained party or completely off
screen.
This centres the highlighted match on any viewport width while
clamping the left margin to 20-400px. On very narrow screens
the scrollbar now moves all the way to the right instead of
stopping midway.
After clicking the find button we need to wait for the input field to
appear before we try to type in it, but instead we were waiting for the
button to appear. However, the button is always present, so the current
`waitForSelector` call is basically a no-op, and this can cause the
integration tests to fail intermittently if we continue before the
input field is actually visible.
This appears to be due a typo first introduced in commit
d10da907dac86c103b787127110a59aed82c7aaa that has been copy/pasted.
This commit fixes the issue by waiting for the input field to be visible
before we continue typing in it.
On mac, the pdf backend used when printing is using the cid from the font,
so if a char has null cid then it's equivalent to .notdef and some viewers
don't display it.
The problem in the original code is that `getTextAt` is called too
early and therefore returns unexpected text content. This can happen
because we call `increaseScale` and then wait for the page's text layer
to be visible, but it can take some time before the zoom actually
occurs/completes in the viewer and in the meantime the old (pre-zoom)
text layer may still be visible, causing us to continue too soon because
we don't validate that we're dealing with the post-zoom text layer.
This commit fixes the issue by simply waiting for the expected text to
show up at the given origin coordinates, which makes the test work
independent of viewer actions/timing.
This isn't directly part of the official API, and having this class in its own file could help avoid future changes (e.g. issue 18148) affecting the size of the `src/display/api.js` file unnecessarily.
Given that this file represents the official API, it's difficult to avoid it becoming fairly large as we add new functionality. However, it also contains a couple of smaller (and internal) helpers that we can move into a new utils-file.
Also, we inline the `DEFAULT_RANGE_CHUNK_SIZE` constant since it's only used *once* and its value has never been changed in over a decade.
The `constructPath` op receives as arguments not only the
information to construct the path, but also the op to apply
the path to (such as "fill", or "stroke").
This commit updates the Stepper tool in the debugger to decode
that op, showing its name rather than just its numeric ID.
To make it easier to tell which PDF.js version/commit that the *built* files correspond to, they have (since many years) included `pdfjsVersion` and `pdfjsBuild` constants with that information.
As currently implemented this has a few shortcomings:
- It requires manually adding the code, with its preprocessor statements, in all relevant files.
- It requires ESLint disable statements, since it's obviously unused code.
- Being unused, this code is removed in the minified builds.
- This information would be more appropriate as comments, however Babel discards all comments during building.
- It would be helpful to have this information at the top of the *built* files, however it's being moved during building.
To address all of these issues, we'll instead utilize Webpack to insert the version/commit information as a comment placed just after the license header.
We have a fallback for the common case of Type3 fonts without a /FontDescriptor dictionary, however we also need to handle the case where it's present but lacking the required /FontName entry.
The `Page.evaluate()` and `Mouse.click()` APIs in Puppeteer both return
a promise; see https://pptr.dev/api/puppeteer.page.evaluate and
https://pptr.dev/api/puppeteer.mouse.click, and should therefore be
awaited before proceeding in tests to make sure that the test's behavior
is deterministic and avoid intermittent failures.
The following command was used to find potential places to fix:
`grep -nEr "[^await|return] page\." test/integration/*`
The clipboard, used via the `copyImage` helper function, is a shared
resource, so access to it cannot happen concurrently because it could
result in tests overwriting each other's contents. Most tests using
the clipboard are therefore run sequentially, but only the stamp
editor's undo-related tests weren't, so this commit fixes the issue
by running those tests sequentially too.
Given that Node.js has full support for the Fetch API since version 21, see the "History" data at https://nodejs.org/api/globals.html#fetch, it seems unnecessary for us to manually check for various globals before using it.
Since our primary development target is browsers in general, and Firefox in particular, being able to remove Node.js-specific compatibility code is always helpful.
Note that we still, for now, support Node.js version 20 and if the relevant globals are not available then Errors will instead be thrown from within the `PDFFetchStream` class.
This allows us to simply invoke `PDFWorker.create` unconditionally from the `getDocument` function, without having to manually check if a global `workerPort` is available first.
Without this "fake" workers may be ignored in the API, which isn't really what you want when manually providing the `disableWorker=true` hash parameter. (Note that this requires the `pdfBugEnabled` option/preference to be set as well.)
Also, after the changes in PR 19810 we can just load the "fake" worker directly in development mode and don't need to manually assign it to the global scope.
Emscripten generates code that allows the caller to provide the Wasm
module (thorugh Module.instantiateWasm), with a fallback in case
.instantiateWasm is not provided. We always define instantiateWasm, so
we can hard-code the check and let our dead code elimination logic
remove the unused fallback.
This commit also improved the dead code elimination logic so that if
a function declaration becomes unused as a result of removing dead
code, the function itself is removed.
They were removed in the minified build, but the code that made the comments necessary was still there (just minified). This commit updates the Terser config to preserve them.
The default value of Terser's `comments` option is [`/@preserve|@copyright|@lic|@cc_on|^\**!/i`](d528103b7c/lib/output.js (L178C12-L178C53)), however the only type of comment it was actually matching in our case is `@lic`, for the license header in minified files. Thus the new regexp is `/@lic|webpackIgnore|@vite-ignore/i`.
*This is something that occurred to me when reviewing the latest PDF.js update in mozilla-central.*
Currently we duplicate essentially the same code in both the `OutputScale.prototype.limitCanvas` and `PDFPageDetailView.prototype.update` methods, which seems unnecessary, and to avoid that we introduce a new `OutputScale.capPixels` method that is used to compute the maximum canvas pixels.
When the /OpenAction data is an Array we're currently using it as-is which could theoretically cause problems in corrupt PDF documents, hence we ensure that a "raw" destination is actually valid. (This change is covered by existing unit-tests.)
*Note:* In the Dictionary case we're using the `Catalog.parseDestDictionary` method, which already handles all of the necessary validation.
This way it helps to reduce the overall canvas dimensions and make the rendering faster.
The drawback is that when scrolling, the page can be blurry in waiting for the rendering.
The default value is 200% on desktop and will be 100% for GeckoView.
The effect is probably not even measurable, however this patch ever so slightly reduces the asynchronicity in the `fieldObjects` getter. These changes should be safe since:
- We're inside of the `PDFDocument`-class and the `annotationGlobals`-getter, which will always return a (shadowed) Promise and won't throw `MissingDataException`s, can be accessed directly without going through the `BasePdfManager`-instance.
- The `acroForm`-dictionary can be accessed through the `annotationGlobals`-data, removing the need to "manually" look it up and thus the need for using `Promise.all` here.
- We can also lookup the /Fields-data, in the `acroForm`-dictionary, synchronously since the initial `formInfo.hasFields` check guarantees that it's available.
Currently we repeat the same identical code five times in the `Page`-class when creating a `PartialEvaluator`-instance, which given the number of parameters it needs seems like unnecessary duplication.
Node.js version 24 was just released, see https://github.com/nodejs/release#release-schedule, hence we should run tests in that version in order to help catch any possible issues as soon as possible.
Also, since version 23 will reach EOL (end-of-life) in less than a month we stop running tests in that version.
The `ObjectLoader.prototype.load` method has a fast-path, which avoids any lookup/parsing if the entire PDF document is already loaded.
However, we still need to create an `ObjectLoader`-instance which seems unnecessary in that case.
Hence we introduce a *static* `ObjectLoader.load` method, which will help avoid creating `ObjectLoader`-instances needlessly and also (slightly) shortens the call-sites.
To ensure that the new method will be used, we extend the `no-restricted-syntax` ESLint rule to "forbid" direct usage of `new ObjectLoader()`.
Given that all the methods are already asynchronous we can just use `await` more throughout this code, rather than having to explicitly return function-calls and `undefined`.
Note also how none of the `ObjectLoader.prototype.load` call-sites use the return value.
Rather than "manually" invoking the methods from the `src/core/worker.js` file we introduce a single `PDFDocument`-method that handles this for us, and make the current methods private.
Since this code is only invoked at most *once* per document, and only for XFA documents, we can use `BasePdfManager.prototype.ensureDoc` directly rather than needing a stand-alone method.
Currently we repeat virtually the same code when calling the `PartialEvaluator.prototype.handleSetFont` method, which we can avoid by introducing an inline helper function.
Considering the name of the method, and how it's actually being used, you'd expect it to return a boolean value.
Given how it's currently being used this inconsistency doesn't cause any issues, however we should still fix this.
Rather than having a dedicated `BasePdfManager`-method for this one call-site we can instead change `PDFDocument.prototype.serializeXfaData` to a non-async method, that we invoke via `BasePdfManager.prototype.ensureDoc`.
Currently we create an intermediate `Dict` during parsing, however that seems unnecessary since (note especially the second point):
- The `NameOrNumberTree.prototype.getAll` method will already resolve any references, as needed, during parsing.
- The `Catalog.prototype.xfaImages` getter is invoked, via the `BasePdfManager`-instance, such that any `MissingDataException`s are already handled correctly.
Currently *some* of the links[1] on page three of the `issue19835.pdf` test-case aren't clickable, since the destination (of the LinkAnnotation) becomes empty.
The reason is that these destinations include the character `\x1b`, which is interpreted as the start of a Unicode escape sequence specifying the language of the string; please refer to section [7.9.2.2 Text String Type](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#G6.1957385) in the PDF specification.
Hence it seems that we need a way to optionally disable that behaviour, to avoid a "badly" formatted string from becoming empty (or truncated), at least for cases where we are:
- Parsing named destinations[2] and URLs.
- Handling "strings" that are actually /Name-instances.
- Building a lookup Object/Map based on some PDF data-structure.
*NOTE:* The issue that prompted this patch is obviously related to destinations, however I've gone through the `src/core/` folder and updated various other `stringToPDFString` call-sites that (directly or indirectly) fit the categories listed above.
---
[1] Try clicking on anything on the line containing "Item 7A. Quantitative and Qualitative Disclosures About Market Risk 27".
[2] Unfortunately just skipping `stringToPDFString` in this case would cause other issues, such as the named destination becoming "unusable" in the viewer; see e.g. issues 14847 and 14864.
Whenever we cannot find a destination we'll fallback to checking all destinations, to account for e.g. out-of-order NameTrees, and in those cases any subsequent destination-lookups can be made a tiny bit more efficient by immediately checking the already cached destinations.
Instead, we update the visible canvas every 500ms.
With large canvas, updating at 60fps lead to a lot gfx transactions and it can take a lot of time.
For example, with wuppertal_2012.pdf on Windows, displaying it at 150% takes around 14 min !!! without
this patch when it takes only around 14 sec with. Even at 30% it helps to improve the performance
by around 20%.
Modern Node.js versions now include a `navigator` implementation, with a few basic properties, that's actually enough for the PDF.js use-cases; please see https://nodejs.org/api/globals.html#navigator
Unfortunately we still support Node.js version `20`, hence we add a basic polyfill since that allows simplifying the code slightly.
@ -8,8 +8,6 @@ The issue tracking system is designed to record a single technical problem. A bu
If you are developing a custom solution, first check the examples at https://github.com/mozilla/pdf.js#learning and search existing issues. If this does not help, please prepare a short well-documented example that demonstrates the problem and make it accessible online on your website, JS Bin, GitHub, etc. before opening a new issue or contacting us in the Matrix room -- keep in mind that just code snippets won't help us troubleshoot the problem.
If you are developing a custom solution, first check the examples at https://github.com/mozilla/pdf.js#learning and search existing issues. If this does not help, please prepare a short well-documented example that demonstrates the problem and make it accessible online on your website, JS Bin, GitHub, etc. before opening a new issue or contacting us in the Matrix room -- keep in mind that just code snippets won't help us troubleshoot the problem.
Note that the translations for PDF.js in the `l10n` folder are imported from the Nightly channel of Mozilla Firefox, such that we don't have to maintain them ourselves. This means that we will not accept pull requests that add new languages and/or modify existing translations, unless the corresponding changes have been made in Mozilla Firefox first.
PDF.js is a Portable Document Format (PDF) viewer built with JavaScript, HTML5 Canvas, and CSS. It's a Mozilla project that provides a general-purpose, web standards-based platform for parsing and rendering PDFs without requiring native code or plugins.
## Common Commands
### Development Server
```bash
npx gulp server
```
Then open http://localhost:8888/web/viewer.html to view the PDF viewer. Test PDFs are available at http://localhost:8888/test/pdfs/?frame
### Building
Build for modern browsers:
```bash
npx gulp generic
```
This generates `pdf.js` and `pdf.worker.js` in `build/generic/build/`.
Build for distribution (creates pdfjs-dist package):
```bash
npx gulp dist
npx gulp dist-install # Build and install locally
```
### Testing
Run all tests:
```bash
npx gulp test
```
Run unit tests only:
```bash
npx gulp unittest
```
Run integration tests (browser-based tests using Puppeteer):
```bash
npx gulp integrationtest
```
Run font tests:
```bash
npx gulp fonttest
```
Run a single test file by modifying test/test_manifest.json or using test runner options.
### Linting and Formatting
Lint JavaScript:
```bash
npx gulp lint
```
Format code (uses Prettier and ESLint):
```bash
npx eslint --fix <file>
```
### Type Checking
Run TypeScript type checking:
```bash
npx gulp typestest
```
## Architecture
### High-Level Structure
PDF.js has a multi-layer architecture that separates concerns between PDF parsing, rendering, and UI:
#### 1. Core Layer (`src/core/`)
The core layer handles PDF parsing and interpretation. Key responsibilities:
- **PDF parsing**: Parsing PDF structure, cross-reference tables, streams
- **Font handling**: CFF, TrueType, Type1 font parsing and conversion (`font.js`, `fonts.js`, `cff_*.js`, `type1_*.js`)
@ -18,13 +18,13 @@ This tutorial shows how PDF.js can be used as a library in a web browser.
The object structure of PDF.js loosely follows the structure of an actual PDF. At the top level there is a document object. From the document, more information and individual pages can be fetched. To get the document:
The object structure of PDF.js loosely follows the structure of an actual PDF. At the top level there is a document object. From the document, more information and individual pages can be fetched. To get the document:
```js
```js
pdfjsLib.getDocument('helloworld.pdf')
pdfjsLib.getDocument({ url: "helloworld.pdf" })
```
```
Remember though that PDF.js uses promises, and the above will return a `PDFDocumentLoadingTask` instance that has a `promise` property which is resolved with the document object.
Remember though that PDF.js uses promises, and the above will return a `PDFDocumentLoadingTask` instance that has a `promise` property which is resolved with the document object.
```js
```js
var loadingTask = pdfjsLib.getDocument('helloworld.pdf');
var loadingTask = pdfjsLib.getDocument({ url: "helloworld.pdf" });
"description":"The theme to use.\n0 = Use system theme.\n1 = Light theme.\n2 = Dark theme.",
"type":"integer",
"enum":[0,1,2],
"default":2
},
"showPreviousViewOnLoad":{
"description":"DEPRECATED. Set viewOnLoad to 1 to disable showing the last page/position on load.",
"type":"boolean",
"default":true
},
"viewOnLoad":{
"title":"View position on load",
"description":"The position in the document upon load.\n -1 = Default (uses OpenAction if available, otherwise equal to `viewOnLoad = 0`).\n 0 = The last viewed page/position.\n 1 = The initial page/position.",
"type":"integer",
"enum":[-1,0,1],
"default":0
},
"defaultZoomDelay":{
"title":"Default zoom delay",
"description":"Delay (in ms) to wait before redrawing the canvas.",
"type":"integer",
"default":400
},
"defaultZoomValue":{
"title":"Default zoom level",
"description":"Default zoom level of the viewer. Accepted values: 'auto', 'page-actual', 'page-width', 'page-height', 'page-fit', or a zoom level in percents.",
"description":"Controls the state of the sidebar upon load.\n -1 = Default (uses PageMode if available, otherwise the last position if available/enabled).\n 0 = Do not show sidebar.\n 1 = Show thumbnails in sidebar.\n 2 = Show document outline in sidebar.\n 3 = Show attachments in sidebar.",
"type":"integer",
"enum":[-1,0,1,2,3],
"default":-1
},
"enableHandToolOnLoad":{
"description":"DEPRECATED. Set cursorToolOnLoad to 1 to enable the hand tool by default.",
"type":"boolean",
"default":false
},
"enableHWA":{
"title":"Enable hardware acceleration",
"description":"Whether to enable hardware acceleration.",
"type":"boolean",
"default":true
},
"enableAltText":{
"type":"boolean",
"default":false
},
"enableGuessAltText":{
"type":"boolean",
"default":true
},
"enableAltTextModelDownload":{
"type":"boolean",
"default":true
},
"enableNewAltTextWhenAddingImage":{
"type":"boolean",
"default":true
},
"altTextLearnMoreUrl":{
"type":"string",
"default":""
},
"enableSignatureEditor":{
"type":"boolean",
"default":false
},
"enableUpdatedAddImage":{
"type":"boolean",
"default":false
},
"cursorToolOnLoad":{
"title":"Cursor tool on load",
"description":"The cursor tool that is enabled upon load.\n 0 = Text selection tool.\n 1 = Hand tool.",
"type":"integer",
"enum":[0,1],
"default":0
},
"pdfBugEnabled":{
"title":"Enable debugging tools",
"description":"Whether to enable debugging tools.",
"type":"boolean",
"default":false
},
"enableScripting":{
"title":"Enable active content (JavaScript) in PDFs",
"type":"boolean",
"description":"Whether to allow execution of active content (JavaScript) by PDF files.",
"description":"Whether to disable range requests (not recommended).",
"type":"boolean",
"default":false
},
"disableStream":{
"title":"Disable streaming for requests",
"description":"Whether to disable streaming for requests (not recommended).",
"type":"boolean",
"default":false
},
"disableAutoFetch":{
"type":"boolean",
"default":false
},
"disableFontFace":{
"title":"Disable @font-face",
"description":"Whether to disable @font-face and fall back to canvas rendering (this is more resource-intensive).",
"type":"boolean",
"default":false
},
"disableTextLayer":{
"description":"DEPRECATED. Set textLayerMode to 0 to disable the text selection layer by default.",
"type":"boolean",
"default":false
},
"textLayerMode":{
"title":"Text layer mode",
"description":"Controls if the text layer is enabled, and the selection mode that is used.\n 0 = Disabled.\n 1 = Enabled.",
"type":"integer",
"enum":[0,1],
"default":1
},
"externalLinkTarget":{
"title":"External links target window",
"description":"Controls how external links will be opened.\n 0 = default.\n 1 = replaces current window.\n 2 = new window/tab.\n 3 = parent.\n 4 = in top window.",
"type":"integer",
"enum":[0,1,2,3,4],
"default":0
},
"disablePageLabels":{
"type":"boolean",
"default":false
},
"disablePageMode":{
"description":"DEPRECATED.",
"type":"boolean",
"default":false
},
"disableTelemetry":{
"title":"Disable telemetry",
"type":"boolean",
"description":"Whether to prevent the extension from reporting the extension and browser version to the extension developers.",
"default":false
},
"annotationMode":{
"type":"integer",
"enum":[0,1,2,3],
"default":2
},
"annotationEditorMode":{
"type":"integer",
"enum":[-1,0,3,15],
"default":0
},
"enablePermissions":{
"type":"boolean",
"default":false
},
"enableXfa":{
"type":"boolean",
"default":true
},
"historyUpdateUrl":{
"type":"boolean",
"default":false
},
"ignoreDestinationZoom":{
"title":"Ignore the zoom argument in destinations",
"description":"When enabled it will maintain the currently active zoom level, rather than letting the PDF document modify it, when navigating to internal destinations.",
"type":"boolean",
"default":false
},
"enablePrintAutoRotate":{
"title":"Automatically rotate printed pages",
"description":"When enabled, landscape pages are rotated when printed.",
"type":"boolean",
"default":true
},
"scrollModeOnLoad":{
"title":"Scroll mode on load",
"description":"Controls how the viewer scrolls upon load.\n -1 = Default (uses the last position if available/enabled).\n 3 = Page scrolling.\n 0 = Vertical scrolling.\n 1 = Horizontal scrolling.\n 2 = Wrapped scrolling.",
"type":"integer",
"enum":[-1,0,1,2,3],
"default":-1
},
"spreadModeOnLoad":{
"title":"Spread mode on load",
"description":"Whether the viewer should join pages into spreads upon load.\n -1 = Default (uses the last position if available/enabled).\n 0 = No spreads.\n 1 = Odd spreads.\n 2 = Even spreads.",
"type":"integer",
"enum":[-1,0,1,2],
"default":-1
},
"forcePageColors":{
"description":"When enabled, the pdf rendering will use the high contrast mode colors",
"type":"boolean",
"default":false
},
"pageColorsBackground":{
"description":"The color is a string as defined in CSS. Its goal is to help improve readability in high contrast mode",
"type":"string",
"default":"Canvas"
},
"pageColorsForeground":{
"description":"The color is a string as defined in CSS. Its goal is to help improve readability in high contrast mode",
"type":"string",
"default":"CanvasText"
},
"enableAutoLinking":{
"description":"Enable creation of hyperlinks from text that look like URLs.",
"Whether to disable @font-face and fall back to canvas rendering (this is more resource-intensive).",
},
disableRange:{
title:"Disable range requests",
description:"Whether to disable range requests (not recommended).",
},
disableStream:{
title:"Disable streaming for requests",
description:"Whether to disable streaming for requests (not recommended).",
},
disableTelemetry:{
title:"Disable telemetry",
description:
"Whether to prevent the extension from reporting the extension and browser version to the extension developers.",
},
enableAutoLinking:{
description:"Enable creation of hyperlinks from text that look like URLs.",
},
enableComment:{
description:"Enable creation of comment annotations.",
},
enableHWA:{
title:"Enable hardware acceleration",
description:"Whether to enable hardware acceleration.",
},
enableOptimizedPartialRendering:{
description:
"Enable tracking of PDF operations to optimize partial rendering.",
},
enablePrintAutoRotate:{
title:"Automatically rotate printed pages",
description:"When enabled, landscape pages are rotated when printed.",
},
enableScripting:{
title:"Enable active content (JavaScript) in PDFs",
description:
"Whether to allow execution of active content (JavaScript) by PDF files.",
},
externalLinkTarget:{
title:"External links target window",
description:
"Controls how external links will be opened.\n 0 = default.\n 1 = replaces current window.\n 2 = new window/tab.\n 3 = parent.\n 4 = in top window.",
enum:[0,1,2,3,4],
},
forcePageColors:{
description:
"When enabled, the pdf rendering will use the high contrast mode colors",
},
ignoreDestinationZoom:{
title:"Ignore the zoom argument in destinations",
description:
"When enabled it will maintain the currently active zoom level, rather than letting the PDF document modify it, when navigating to internal destinations.",
},
pageColorsBackground:{
description:
"The color is a string as defined in CSS. Its goal is to help improve readability in high contrast mode",
},
pageColorsForeground:{
description:
"The color is a string as defined in CSS. Its goal is to help improve readability in high contrast mode",
},
pdfBugEnabled:{
title:"Enable debugging tools",
description:"Whether to enable debugging tools.",
},
scrollModeOnLoad:{
title:"Scroll mode on load",
description:
"Controls how the viewer scrolls upon load.\n -1 = Default (uses the last position if available/enabled).\n 0 = Vertical scrolling.\n 1 = Horizontal scrolling.\n 2 = Wrapped scrolling.\n 3 = Page scrolling.",
enum:[-1,0,1,2,3],
},
sidebarViewOnLoad:{
title:"Sidebar state on load",
description:
"Controls the state of the sidebar upon load.\n -1 = Default (uses PageMode if available, otherwise the last position if available/enabled).\n 0 = Do not show sidebar.\n 1 = Show thumbnails in sidebar.\n 2 = Show document outline in sidebar.\n 3 = Show attachments in sidebar.",
enum:[-1,0,1,2,3],
},
spreadModeOnLoad:{
title:"Spread mode on load",
description:
"Whether the viewer should join pages into spreads upon load.\n -1 = Default (uses the last position if available/enabled).\n 0 = No spreads.\n 1 = Odd spreads.\n 2 = Even spreads.",
enum:[-1,0,1,2],
},
textLayerMode:{
title:"Text layer mode",
description:
"Controls if the text layer is enabled, and the selection mode that is used.\n 0 = Disabled.\n 1 = Enabled.",
enum:[0,1],
},
viewerCssTheme:{
title:"Theme",
description:
"The theme to use.\n0 = Use system theme.\n1 = Light theme.\n2 = Dark theme.",
enum:[0,1,2],
},
viewOnLoad:{
title:"View position on load",
description:
"The position in the document upon load.\n -1 = Default (uses OpenAction if available, otherwise equal to `viewOnLoad = 0`).\n 0 = The last viewed page/position.\n 1 = The initial page/position.",
enum:[-1,0,1],
},
};
// Deprecated keys are allowed in the managed preferences file.
// The code maintainer is responsible for adding migration logic to
// extensions/chromium/options/migration.js and web/chromecom.js .
constdeprecatedPrefs={
disablePageMode:{
description:"DEPRECATED.",
type:"boolean",
default:false,
},
disableTextLayer:{
description:
"DEPRECATED. Set textLayerMode to 0 to disable the text selection layer by default.",
type:"boolean",
default:false,
},
enableHandToolOnLoad:{
description:
"DEPRECATED. Set cursorToolOnLoad to 1 to enable the hand tool by default.",
type:"boolean",
default:false,
},
showPreviousViewOnLoad:{
description:
"DEPRECATED. Set viewOnLoad to 1 to disable showing the last page/position on load.",
type:"boolean",
default:true,
},
};
functionbuildPrefsSchema(prefs){
constproperties=Object.create(null);
for(constnameinprefs){
constpref=prefs[name];
lettype=typeofpref;
switch(type){
case"boolean":
case"string":
break;
case"number":
type="integer";
break;
default:
thrownewError(`Invalid type (${type}) for "${name}"-preference.`);
}
constmetadata=prefsMetadata[name];
if(metadata){
letnumMetadataKeys=0;
// Do some (very basic) validation of the metadata.
for(constkeyinmetadata){
constentry=metadata[key];
switch(key){
case"default":
case"type":
thrownewError(
`Invalid key (${key}) in metadata for "${name}"-preference.`
);
case"description":
if(entry.startsWith("DEPRECATED.")){
thrownewError(
`The \`description\` of the "${name}"-preference cannot begin with "DEPRECATED."`
);
}
break;
}
numMetadataKeys++;
}
if(numMetadataKeys===0){
thrownewError(
`No metadata for "${name}"-preference, remove the entry.`
);
}
}
properties[name]={
type,
default:pref,
...metadata,
};
}
for(constnameinprefsMetadata){
if(!properties[name]){
// Do *not* throw here, since keeping the metadata up-to-date should be
// the responsibility of the CHROMIUM-addon maintainer.
console.error(
`The "${name}"-preference was removed, add it to \`deprecatedPrefs\` instead.\n`
);
}
}
for(constnameindeprecatedPrefs){
constentry=deprecatedPrefs[name];
if(properties[name]){
thrownewError(
`The "${name}"-preference should not be listed as deprecated.`
* the build requires to have a [Docker](https://www.docker.com/) setup and then:
* `node build.js -C` to build the Docker image
* `node build.js -co /pdf.js/external/jbig2/` to compile the decoder
## Licensing
[PDFium](https://pdfium.googlesource.com/pdfium/) is under [Apache-2.0](https://pdfium.googlesource.com/pdfium/+/main/LICENSE)
and [pdf.js.jbig2](https://github.com/mozilla/pdf.js.jbig2/) is released under [Apache-2.0](https://github.com/mozilla/pdf.js.jbig2/blob/main/LICENSE) license so `jbig2.js` is released under [Apache-2.0](https://github.com/mozilla/pdf.js.jbig2/blob/main/LICENSE) license too.
constcachedTextDecoder=(typeofTextDecoder!=='undefined'?newTextDecoder('utf-8',{ignoreBOM:true,fatal:true}):{decode:()=>{throwError('TextDecoder not available')}});
console.warn("`WebAssembly.instantiateStreaming` failed because your server does not serve Wasm with `application/wasm` MIME type. Falling back to `WebAssembly.instantiate` which is slower. Original error:\n",e);
console.warn("`WebAssembly.instantiateStreaming` failed because your server does not serve Wasm with `application/wasm` MIME type. Falling back to `WebAssembly.instantiate` which is slower. Original error:\n",e);
Most of the files in this folder (except for the `en-US` folder) have been
+ Only the translations for the `en-US` locale are maintained here.
imported from the Firefox Nightly branch;
please see https://hg.mozilla.org/l10n-central. Some of the files are
+ The translations for all other locales are imported automatically from Firefox. Please see https://mozilla-l10n.github.io/introduction/ for information about contributing.
licensed under the MPL license. You can obtain a copy of the license at
@ -213,63 +198,3 @@ pdfjs-password-invalid = Mung me donyo pe atir. Tim ber i tem doki.
pdfjs-password-ok-button = OK
pdfjs-password-ok-button = OK
pdfjs-password-cancel-button = Juki
pdfjs-password-cancel-button = Juki
pdfjs-web-fonts-disabled = Kijuko dit pa coc me kakube woko: pe romo tic ki dit pa coc me PDF ma kiketo i kine.
pdfjs-web-fonts-disabled = Kijuko dit pa coc me kakube woko: pe romo tic ki dit pa coc me PDF ma kiketo i kine.
## Editing
## Default editor aria labels
## Remove button for the various kind of editor.
##
## Alt-text dialog
## Editor resizers
## This is used in an aria label to help to understand the role of the resizer.
## Color picker
## Show all highlights
## This is a toggle button to show/hide all the highlights.
## New alt-text dialog
## Group note for entire feature: Alternative text (alt text) helps when people can't see the image. This feature includes a tool to create alt text automatically using an AI model that works locally on the user's device to preserve privacy.
pdfjs-web-fonts-disabled = Webfonte is gedeaktiveer: kan nie PDF-fonte wat ingebed is, gebruik nie.
pdfjs-web-fonts-disabled = Webfonte is gedeaktiveer: kan nie PDF-fonte wat ingebed is, gebruik nie.
## Editing
## Default editor aria labels
## Remove button for the various kind of editor.
##
## Alt-text dialog
## Editor resizers
## This is used in an aria label to help to understand the role of the resizer.
## Color picker
## Show all highlights
## This is a toggle button to show/hide all the highlights.
## New alt-text dialog
## Group note for entire feature: Alternative text (alt text) helps when people can't see the image. This feature includes a tool to create alt text automatically using an AI model that works locally on the user's device to preserve privacy.
# $type (String) - an annotation type from a list defined in the PDF spec
# $type (String) - an annotation type from a list defined in the PDF spec
@ -245,63 +226,3 @@ pdfjs-password-invalid = Clau invalida. Torna a intentar-lo.
pdfjs-password-ok-button = Acceptar
pdfjs-password-ok-button = Acceptar
pdfjs-password-cancel-button = Cancelar
pdfjs-password-cancel-button = Cancelar
pdfjs-web-fonts-disabled = As fuents web son desactivadas: no se puet incrustar fichers PDF.
pdfjs-web-fonts-disabled = As fuents web son desactivadas: no se puet incrustar fichers PDF.
## Editing
## Default editor aria labels
## Remove button for the various kind of editor.
##
## Alt-text dialog
## Editor resizers
## This is used in an aria label to help to understand the role of the resizer.
## Color picker
## Show all highlights
## This is a toggle button to show/hide all the highlights.
## New alt-text dialog
## Group note for entire feature: Alternative text (alt text) helps when people can't see the image. This feature includes a tool to create alt text automatically using an AI model that works locally on the user's device to preserve privacy.
pdfjs-editor-add-signature-dialog-label = يتيح هذا النموذج للمستخدم إنشاء توقيع لإضافته إلى مستند PDF. ويمكن للمستخدم تحرير الاسم (الذي يعمل أيضًا كنص بديل)، وحفظ التوقيع بشكل اختياري للاستخدام المتكرر.
pdfjs-editor-add-signature-dialog-label = يتيح هذا النموذج للمستخدم إنشاء توقيع لإضافته إلى مستند PDF. ويمكن للمستخدم تحرير الاسم (الذي يعمل أيضًا كنص بديل)، وحفظ التوقيع بشكل اختياري للاستخدام المتكرر.
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.