3437 Commits

Author SHA1 Message Date
Jonas Jenwald
31b4612ac0 Truncate too long /Decode map entries (issue 20668) 2026-02-16 16:22:00 +01:00
Tim van der Meij
1d6307f5d4
Merge pull request #20657 from Snuffleupagus/unicorn-prefer-class-fields
Enable the `unicorn/prefer-class-fields` ESLint plugin rule
2026-02-14 15:26:28 +01:00
Tim van der Meij
f22fb6bbfb
Merge pull request #20652 from Snuffleupagus/ChunkedStream-sendRequest-skip-empty
Avoid parsing skipped range requests in `ChunkedStreamManager` (PR 10694 follow-up)
2026-02-14 13:55:31 +01:00
Jonas Jenwald
170599f1e7 Enable the unicorn/prefer-class-fields ESLint plugin rule
This leads to slightly shorter code[1] when initializing classes, and in some cases we can even remove the constructors, which shouldn't hurt; see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-class-fields.md

It's probably possible to also change a lot of these class fields to private ones[2], however it's often difficult to tell at a glance if that's safe hence this patch only does this for the `PDFRenderingQueue`.

---

[1] This reduces the size of the `gulp mozcentral` output by 999 bytes, for a mostly mechanical code change.

[2] That sort of re-factoring should generally be done separately, on a class-by-class basis, to reduce the risk of regressions.
2026-02-14 12:33:34 +01:00
Jonas Jenwald
520928719c Move and re-use the stripPath helper function more
There's a couple of spots that essentially re-implement that function.
2026-02-13 17:38:21 +01:00
Jonas Jenwald
b3c07f4b3d Avoid parsing skipped range requests in ChunkedStreamManager (PR 10694 follow-up)
While we don't dispatch the actual range request after PR 10694 we still parse the returned data, which ends up being an *empty* `ArrayBuffer` and thus cannot affect the `ChunkedStream.prototype._loadedChunks` property.
Given that no actual data arrived, it's thus pointless[1] to invoke the `ChunkedStreamManager.prototype.onReceiveData` method in this case (and it also avoids sending effectively duplicate "DocProgress" messages).

---
[1] With the *possible* exception of `disableAutoFetch === false` being set, see f24768d7b4/src/core/chunked_stream.js (L499-L517) however that never happens when streaming is being used; note f24768d7b4/src/core/worker.js (L237-L238)
2026-02-12 18:01:54 +01:00
Jonas Jenwald
8ba83e73fa Start using Response.prototype.bytes() in the code-base
In all cases where we currently use `Response.prototype.arrayBuffer()` the result is immediately wrapped in a `Uint8Array`, which can be avoided by instead using the newer `Response.prototype.bytes()` method; see https://developer.mozilla.org/en-US/docs/Web/API/Response/bytes
2026-02-12 11:20:05 +01:00
Jonas Jenwald
6a3d5fea6c Replace a few cases of "manual" font name normalization with the normalizeFontName helper function 2026-02-08 16:56:50 +01:00
Jonas Jenwald
e9c509aca9 Normalize the font name in getBaseFontMetrics (issue 20246)
We tried to lookup the font metrics using the font name as-is, which didn't work since the PDF file in question has non-embedded fonts with names that include commas.
Hence the font names need to be normalized here as well, similar to elsewhere in the font code.
2026-02-08 16:56:15 +01:00
calixteman
58ac273f1f
Merge pull request #20503 from andriivitiv/Fix-Worker-was-terminated-error
Fix `Worker was terminated` error when loading is cancelled
2026-02-06 09:59:05 +01:00
calixteman
b92bdf80a2
Merge pull request #20628 from calixteman/bug2014399
Cap the max canvas dimensions in order to avoid to downscale large images in the worker (bug 2014399)
2026-02-06 09:42:04 +01:00
Tim van der Meij
f302323c7e
Merge pull request #20627 from Snuffleupagus/ChunkedStream-onReceiveData-rm-copy
Improve progress reporting in `ChunkedStreamManager`, and prevent unnecessary data copy in `ChunkedStream.prototype.onReceiveData`
2026-02-05 21:27:42 +01:00
Calixte Denizet
ff42c0bd50
Cap the max canvas dimensions in order to avoid to downscale large images in the worker (bug 2014399) 2026-02-05 20:25:36 +01:00
Jonas Jenwald
b3cd042ded Prevent unnecessary data copy in ChunkedStream.prototype.onReceiveData
This method is only invoked via `ChunkedStreamManager.prototype.sendRequest`, which currently returns data in `Uint8Array` format (since it potentially combines multiple `ArrayBuffer`s).
Hence we end up doing a short-lived, but still completely unnecessary, data copy[1] in `ChunkedStream.prototype.onReceiveData` when handling range requests. In practice this is unlikely to be a big problem by default, given that streaming is used and the (low) value of the `rangeChunkSize` API-option. (However, in custom PDF.js deployments it might affect things more.)

Given that no data copy is better than a short lived one, let's fix this small oversight and add non-production `assert`s to keep it working as intended.
This way we also improve consistency, since all other streaming and range request methods (see e.g. `BasePDFStream` and related code) only return `ArrayBuffer` data.

---
[1] Remember that `new Uint8Array(arrayBuffer)` only creates a view of the underlying `arrayBuffer`, whereas `new Uint8Array(typedArray)` actually creates a copy of the `typedArray`.
2026-02-05 16:16:36 +01:00
Jonas Jenwald
01deb085f8 Improve progress reporting in the ChunkedStreamManager
Currently there's two small bugs, which have existed around a decade, in the `loaded` property that's sent via the "DocProgress" message from the `ChunkedStreamManager.prototype.onReceiveData` method.

 - When the entire PDF has loaded the `loaded` property can become larger than the `total` property, which obviously doesn't make sense.
   This happens whenever the size of the PDF is *not* a multiple of the `rangeChunkSize` API-option, which is a very common situation.

 - When streaming is being used, the `loaded` property can become smaller than the actually loaded amount of data.
   This happens whenever the size of a streamed chunk is *not* a multiple of the `rangeChunkSize` API-option, which is a common situation.
2026-02-05 16:04:45 +01:00
calixteman
22b97d1741
Flush the text content chunk only on real font changes (bug 2013793) 2026-02-03 23:11:31 +01:00
calixteman
1c12b07726
Merge pull request #20613 from calixteman/ccittfax_pdfium
Use the ccittfax decoder from pdfium
2026-02-02 15:07:53 +01:00
calixteman
88c2051698
Use the ccittfax decoder from pdfium
The decoder is a dependency of the jbig2 one and is already
included in pdf.js, so we just need to wire it up.
It improves the performance of documents using ccittfax images.
2026-02-02 11:10:32 +01:00
Tim van der Meij
f4326e17c4
Merge pull request #20610 from calixteman/brotli
Add support for Brotli decompression
2026-02-01 20:41:06 +01:00
calixteman
43273fde27
Add support for Brotli decompression
For now, `BrotliDecode` hasn't been specified but it should be in a
close future.
So when it's possible we use the native `DecompressionStream` API
with "brotli" as argument.
If that fails or if we've to decompress in a sync context, we fallback
to `BrotliStream` which a pure js implementation (see README in external/brotli).
2026-01-31 16:25:53 +01:00
Jonas Jenwald
4ca205bac3 Add an abstract BasePDFStreamRangeReader class, that all the old IPDFStreamRangeReader implementations inherit from
Given that there's no less than *five* different, but very similar, implementations this helps reduce code duplication and simplifies maintenance.
2026-01-30 14:15:39 +01:00
Jonas Jenwald
54d8c5e7b4 Add an abstract BasePDFStreamReader class, that all the old IPDFStreamReader implementations inherit from
Given that there's no less than *five* different, but very similar, implementations this helps reduce code duplication and simplifies maintenance.

Also, remove the `rangeChunkSize` not defined checks in all the relevant stream-constructor implementations.
Note how the API, since some time, always validates *and* provides that parameter when creating a `BasePDFStreamReader`-instance.
2026-01-30 14:15:39 +01:00
Jonas Jenwald
4a8fb4dde1 Add an abstract BasePDFStream class, that all the old IPDFStream implementations inherit from
Given that there's no less than *five* different, but very similar, implementations this helps reduce code duplication and simplifies maintenance.

Also, spotted during rebasing, pass the `enableHWA` option "correctly" (i.e. as part of the existing `transportParams`) to the `WorkerTransport`-class to keep the constructor simpler.
2026-01-30 14:15:39 +01:00
Jonas Jenwald
a80f10ff1a Remove the onProgress callback from the IPDFStreamRangeReader interface
Note how there's *nowhere* in the code-base where the `IPDFStreamRangeReader.prototype.onProgress` callback is actually being set and used, however the loadingBar (in the viewer) still works just fine since loading progress is already reported via:
 - The `ChunkedStreamManager` instance respectively the `getPdfManager` function, through the use of a "DocProgress" message, on the worker-thread.
 - A `IPDFStreamReader.prototype.onProgress` callback, on the main-thread.

Furthermore, it would definitely *not* be a good idea to add any `IPDFStreamRangeReader.prototype.onProgress` callbacks since they only include the `loaded`-property which would trigger the "indeterminate" loadingBar (in the viewer).

Looking briefly at the history of this code it's not clear, at least to me, when this became unused however it's probably close to a decade ago.
2026-01-30 14:15:39 +01:00
Jonas Jenwald
05b78ce03c Stop registering an onProgress callback on the PDFWorkerStreamRangeReader-instance, in the ChunkedStreamManager class
Given that nothing in the `PDFWorkerStreamRangeReader` class attempts to invoke the `onProgress` callback, this is effectively dead code now.
Looking briefly at the history of this code it's not clear, at least to me, when this became unused however it's probably close to a decade ago.

Finally, note also how progress is already being reported through the `ChunkedStreamManager.prototype.onReceiveData` method.
2026-01-30 14:15:38 +01:00
Jonas Jenwald
987265720e Remove the unused IPDFStreamRangeReader.prototype.isStreamingSupported getter
This getter was only invoked from `src/display/network.js` and `src/core/chunked_stream.js`, however in both cases it's hardcoded to `false` and thus isn't actually needed.
This originated in PR 6879, close to a decade ago, for a potential TODO which was never implemented and it ought to be OK to just simplify this now.
2026-01-30 14:15:38 +01:00
Jonas Jenwald
62d5408cf0 Stop tracking progressiveDataLength in the ChunkedStreamManager class
Currently this property is essentially "duplicated", so let's instead use the identical one that's availble on the `ChunkedStream` instance.
2026-01-30 14:15:38 +01:00
Tim van der Meij
471adfd023
Merge pull request #20596 from Snuffleupagus/FileSpec-fixes
Simplify the `FileSpec` class, and remove no longer needed polyfills
2026-01-29 22:03:38 +01:00
Jonas Jenwald
5b368dd58a Remove the Uint8Array.prototype.toHex(), Uint8Array.prototype.toBase64(), and Uint8Array.fromBase64() polyfills
(During rebasing of the previous patches I happened to look at the polyfills and noticed that this one could be removed now.)

See:
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint8Array/toHex#browser_compatibility
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint8Array/toBase64#browser_compatibility
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint8Array/fromBase64#browser_compatibility

Note that technically this functionality can still be disabled via a preference in Firefox, however that's slated for removal in [bug 1985120](https://bugzilla.mozilla.org/show_bug.cgi?id=1985120).
Looking at the Firefox source-code, see https://searchfox.org/firefox-main/search?q=array.tobase64%28%29&path=&case=false&regexp=false, you can see that it's already being used *unconditionally* elsewhere in the browser hence removing the polyfills ought to be fine (since toggling the preference would break other parts of the browser).
2026-01-29 17:27:43 +01:00
Jonas Jenwald
b3f35b6007 Return the rawFilename as-is even if it's empty, from FileSpec.prototype.serializable
It's more correct to return the `rawFilename` as-is, and limit the fallback for empty filenames to only the `filename` property.
2026-01-29 17:25:44 +01:00
Calixte Denizet
806133379e
Refactor a bit page mapping stuff in order to be able to support delete/copy pages 2026-01-26 16:53:52 +01:00
calixteman
9f660be8a2
Use DecompressionStream in async code
Usually, content stream or fonts are compressed using FlateDecode.
So use the DecompressionStream API to decompress those streams
in the async code path.
2026-01-25 14:22:19 +01:00
Jonas Jenwald
640a3106d5 Remove caching/shadowing from the FileSpec getters, and simplify the code
Given that only the `FileSpec.prototype.serializable` getter is ever invoked from "outside" of the class, and only once per `FileSpec`-instance, the caching/shadowing isn't actually necessary.

Furthermore the `_contentRef`-caching wasn't actually correct, since it ended up storing a `BaseStream`-instance and those should *generally* never be cached.
(Since calling `BaseStream.prototype.getBytes()` more than once, without resetting the stream in between, will return an empty TypedArray after the first time.)
2026-01-25 13:16:29 +01:00
Jonas Jenwald
84b5866853 Reduce duplication in the pickPlatformItem helper function
Also, tweak code/comment used when handling "GoToR" destinations.
2026-01-25 13:16:06 +01:00
calixteman
ce296d8d42
Add the possibility to order the pages in an extracted pdf (bug 1997379)
or in a merged one.
2026-01-19 18:58:23 +01:00
Calixte Denizet
b5ed988267
Don't use contents stream which have an image format
The original bug has been filled in mupdf bug tracker:
https://bugs.ghostscript.com/show_bug.cgi?id=709033

The attached pdf can be open in Chrome but not in Acrobat.
2026-01-13 18:39:17 +01:00
calixteman
eab33828a9
Fix wasm url issue for the jbig2 decoder
and add a test for jbig2 decoding with the js decoder.
2026-01-04 00:08:59 +01:00
calixteman
98c1955bd4
Use the PDFium JBig2 decoder compiled into wasm
The decoder is ~4x faster than the JS decoder on large images.
2026-01-03 22:05:14 +01:00
calixteman
424c7989aa
Get glyph contours when stroking using a pattern
Fix issue #20513 (second part).
2025-12-28 22:55:59 +01:00
calixteman
5518c8a544
Use CIDToGIDMap when the font is a type 2 with an OpenType font
It fixes #18062.
2025-12-28 14:51:06 +01:00
Tim van der Meij
1990fa7cd0
Merge pull request #20538 from calixteman/issue13425
Fix the loca table length when there is enough space for it
2025-12-28 13:52:32 +01:00
calixteman
22932f7b68
Fix the loca table length when there is enough space for it
It fixes #13425.
2025-12-28 11:21:40 +01:00
calixteman
1dffcf7f25
Remove undefStack stuff in the cff parser
I think it should have been removed with #2527 so it should be useless now.
Because of that stuff, some commands with a wrong number of arguments
weren't stripped out (see the pdf in #13850).
2025-12-27 16:59:29 +01:00
calixteman
91033c2199
Fix the encoding for some missing chinese fonts
It fixes #20489.
2025-12-23 14:05:27 +01:00
Andrii Vitiv
9677798ba0
Fix Worker was terminated error when loading is cancelled
Fixes https://github.com/mozilla/pdf.js/issues/11595, where cancelling loading with `loadingTask.destroy()` before it finishes throws a `Worker was terminated` error that CANNOT be caught.

When worker is terminated, an error is thrown here:

6c746260a9/src/core/worker.js (L374)

Then `onFailure` runs, in which we throw again via `ensureNotTerminated()`. However, this second error is never caught (and cannot be), resulting in console spam.

There is no need to throw any additional errors since the termination is already reported [here](6c746260a9/src/core/worker.js (L371-L373)), and `onFailure` is supposed to handle errors, not throw them.
2025-12-14 18:15:10 +02:00
Tim van der Meij
d946f05841
Merge pull request #20440 from Gaurang-5/master
Fix infinite loop in JBIG2 decoder with >4 referred-to segments
2025-12-09 20:42:51 +01:00
calixteman
f75812b0af
Merge pull request #20346 from ryzokuken/binary-fontpath
Encode FontPath data into an ArrayBuffer
2025-12-08 13:59:23 +01:00
Tim van der Meij
de5709a7cd
Merge pull request #20454 from xiaobai2017666/russian-char
Extend getGlyphMapForStandardFonts with some Russian entries (issue 20453)
2025-12-07 18:28:41 +01:00
Gaurang Bhatia
ac8d80a8e4 Fix infinite loop in JBIG2 decoder with >4 referred-to segments and add regression test 2025-12-07 06:46:16 +05:30
Ujjwal Sharma
3a85770af1 Encode FontPath data into an ArrayBuffer
Serialize FontPath commands into a binary format
and store it in an ArrayBuffer so that it can
eventually be stored in a SharedArrayBuffer.
2025-12-06 03:00:48 +05:30