3464 Commits

Author SHA1 Message Date
Jonas Jenwald
40bd73551c
Merge pull request #20793 from Snuffleupagus/more-getRawEntries
Use the `Dict.prototype.getRawEntries` method more
2026-03-04 16:05:57 +01:00
Jonas Jenwald
50d66d7d34 Use the Dict.prototype.getRawEntries method more
This changes a number of loops currently using `Dict.prototype.{getKeys, getRaw}`, since it should be a tiny bit more efficient to use an iterator directly rather than first iterating through `Map`-keys to create a temporary `Array` that we finally iterate through at the call-site.

Note that the `getKeys` method is much older than `getRawEntries`, and originally the `Dict` class stored its data in a regular `Object`, hence why the old code was written that way.
2026-03-04 12:46:25 +01:00
Nicolò Ribaudo
2f2d5c9e27
Add script to check license headers 2026-03-04 10:40:39 +01:00
Tim van der Meij
f32b9d2677
Merge pull request #20738 from Snuffleupagus/function-shorten
Slightly shorten some code in the `src/core/function.js` file
2026-03-01 20:06:29 +01:00
Jeff Muizelaar
8fa6ef36e4 Remove scientific notation parsing.
This behaviour comes from the initial pdf.js commit but is wrong and
doesn't match other PDF readers like muPDF or pdfium.

From PDF Spec 7.3.3:

A PDF writer shall not use the PostScript language syntax for numbers with non-decimal radices (such
as 16#FFFE) or in exponential format (such as 6.02E23).
2026-02-26 20:22:34 -05:00
Jonas Jenwald
8ced999803 Slightly shorten some code in the src/core/function.js file 2026-02-26 10:42:56 +01:00
Jonas Jenwald
80db3609f4 Remove unused lastCode property from the LZWStream class (PR 324 follow-up)
This appear to have been unused already in PR 324 all the way back in 2011.
2026-02-25 13:31:44 +01:00
Tim van der Meij
4ecbd0cbe2
Merge pull request #20726 from Snuffleupagus/getOrInsertComputed-fewer-functions
Reduce allocations and function creation when using `getOrInsert` and `getOrInsertComputed`
2026-02-24 23:32:36 +01:00
Tim van der Meij
b43c8eab73
Merge pull request #20725 from calixteman/bug2018162
After cut & paste, the thumbnail must be correctly rendered (bug 2018162)
2026-02-24 23:27:07 +01:00
Jonas Jenwald
0d4e587a5f Reduce allocations when using Map.prototype.getOrInsert() with Arrays
Change all these cases to use `Map.prototype.getOrInsertComputed()` instead, in combination with a helper function for creating the `Array`s (similar to the previous patch).
2026-02-24 09:03:32 +01:00
Jonas Jenwald
2e07715c9d Reduce function creation when using Map.prototype.getOrInsertComputed()
With the exception of the first invocation the callback function is unused, which means that a lot of pointless functions may be created.
To avoid this we introduce helper functions for simple cases, such as creating `Map`s and `Objects`s.
2026-02-24 08:58:28 +01:00
calixteman
15e7a551ab
Reset transfer functions when entering in a new group
It fixes #20722.
2026-02-23 22:37:20 +01:00
Calixte Denizet
97d973ce09
After cut & paste, the thumbnail must be correctly rendered (bug 2018162) 2026-02-23 18:38:33 +01:00
Jonas Jenwald
c2f5e19eb0 Use Map.prototype.getOrInsertComputed() in the src/core/xfa/ folder 2026-02-22 22:57:50 +01:00
Tim van der Meij
a5a27a5ca7
Merge pull request #20705 from Snuffleupagus/#collectParents-getOrInsert
Use `Map.prototype.getOrInsert()` in the `#collectParents` method
2026-02-22 12:55:41 +01:00
Tim van der Meij
8189ca358c
Merge pull request #20703 from Snuffleupagus/#collectFieldObjects-getOrInsert
Use `Map.prototype.getOrInsert()` in the `#collectFieldObjects` method
2026-02-22 12:39:54 +01:00
Jonas Jenwald
3e7ad8d6bf Use Map.prototype.getOrInsert() in the #collectParents method 2026-02-21 11:42:42 +01:00
Jonas Jenwald
210c969c4c Use Map.prototype.getOrInsert() in the #collectFieldObjects method 2026-02-21 11:23:32 +01:00
Jonas Jenwald
76a5aed05f Use Map.prototype.getOrInsert() in the getNewAnnotationsMap helper 2026-02-21 11:03:00 +01:00
Tim van der Meij
82de22428a
Merge pull request #20660 from Snuffleupagus/ChunkedStream-async-sendRequest
Convert `ChunkedStreamManager.prototype.sendRequest` to an asynchronous method
2026-02-20 21:39:26 +01:00
calixteman
a5c62b7489
Merge pull request #20691 from Snuffleupagus/rm-unnecessary-Map-entries
Remove unnecessary `Map.prototype.entries()` usage
2026-02-20 17:44:19 +01:00
Jonas Jenwald
374f524c29 Remove unnecessary Map.prototype.entries() usage
A `Map` instance can be iterated directly with a `for...of` loop, hence using its `entries` method is not actually necessary.
2026-02-20 13:44:00 +01:00
Jonas Jenwald
7fd939763e Remove unnecessary class constructors in the src folder
There's a number of classes where the constructors can be removed completely by instead using class fields, which help to slightly shorten the code.

It seems that `unicorn/prefer-class-fields` ESLint plugin, see PR 20657, unfortunately isn't able to detect all of these cases.
2026-02-19 00:08:57 +01:00
Jonas Jenwald
e1cc24c595 Set the annotationType automatically in the Annotation constructor
Rather than assigning it manually in every extending class, we can utilize the fact that the `AnnotationType`-entries are simply the upper-case version of the `/Subtype` (when it exists) in the Annotation dictionary.
2026-02-18 14:47:42 +01:00
Jonas Jenwald
62ac1b844a
Merge pull request #20669 from Snuffleupagus/decode-truncate
Truncate too long /Decode map entries (issue 20668)
2026-02-16 20:39:12 +01:00
Jonas Jenwald
31b4612ac0 Truncate too long /Decode map entries (issue 20668) 2026-02-16 16:22:00 +01:00
Jonas Jenwald
0a9176422e Remove Object.hasOwn usage from the src/core/xref.js file
This should not be necessary, given the following checks done early during the worker initialization: c5746949ac/src/core/worker.js (L124-L141)
2026-02-15 16:39:39 +01:00
Jonas Jenwald
59fbad617b Convert ChunkedStreamManager.prototype.sendRequest to an asynchronous method
This is not only shorter, but (in my opinion) it also simplifies the code.

*Note:* In order to keep the *five* different `BasePDFStreamRangeReader` implementations consistent, we purposely don't re-factor the `PDFWorkerStreamRangeReader` class to support `for await...of` iteration.
2026-02-14 15:49:31 +01:00
Tim van der Meij
1d6307f5d4
Merge pull request #20657 from Snuffleupagus/unicorn-prefer-class-fields
Enable the `unicorn/prefer-class-fields` ESLint plugin rule
2026-02-14 15:26:28 +01:00
Tim van der Meij
f22fb6bbfb
Merge pull request #20652 from Snuffleupagus/ChunkedStream-sendRequest-skip-empty
Avoid parsing skipped range requests in `ChunkedStreamManager` (PR 10694 follow-up)
2026-02-14 13:55:31 +01:00
Jonas Jenwald
170599f1e7 Enable the unicorn/prefer-class-fields ESLint plugin rule
This leads to slightly shorter code[1] when initializing classes, and in some cases we can even remove the constructors, which shouldn't hurt; see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-class-fields.md

It's probably possible to also change a lot of these class fields to private ones[2], however it's often difficult to tell at a glance if that's safe hence this patch only does this for the `PDFRenderingQueue`.

---

[1] This reduces the size of the `gulp mozcentral` output by 999 bytes, for a mostly mechanical code change.

[2] That sort of re-factoring should generally be done separately, on a class-by-class basis, to reduce the risk of regressions.
2026-02-14 12:33:34 +01:00
Jonas Jenwald
520928719c Move and re-use the stripPath helper function more
There's a couple of spots that essentially re-implement that function.
2026-02-13 17:38:21 +01:00
Jonas Jenwald
b3c07f4b3d Avoid parsing skipped range requests in ChunkedStreamManager (PR 10694 follow-up)
While we don't dispatch the actual range request after PR 10694 we still parse the returned data, which ends up being an *empty* `ArrayBuffer` and thus cannot affect the `ChunkedStream.prototype._loadedChunks` property.
Given that no actual data arrived, it's thus pointless[1] to invoke the `ChunkedStreamManager.prototype.onReceiveData` method in this case (and it also avoids sending effectively duplicate "DocProgress" messages).

---
[1] With the *possible* exception of `disableAutoFetch === false` being set, see f24768d7b4/src/core/chunked_stream.js (L499-L517) however that never happens when streaming is being used; note f24768d7b4/src/core/worker.js (L237-L238)
2026-02-12 18:01:54 +01:00
Jonas Jenwald
8ba83e73fa Start using Response.prototype.bytes() in the code-base
In all cases where we currently use `Response.prototype.arrayBuffer()` the result is immediately wrapped in a `Uint8Array`, which can be avoided by instead using the newer `Response.prototype.bytes()` method; see https://developer.mozilla.org/en-US/docs/Web/API/Response/bytes
2026-02-12 11:20:05 +01:00
Jonas Jenwald
6a3d5fea6c Replace a few cases of "manual" font name normalization with the normalizeFontName helper function 2026-02-08 16:56:50 +01:00
Jonas Jenwald
e9c509aca9 Normalize the font name in getBaseFontMetrics (issue 20246)
We tried to lookup the font metrics using the font name as-is, which didn't work since the PDF file in question has non-embedded fonts with names that include commas.
Hence the font names need to be normalized here as well, similar to elsewhere in the font code.
2026-02-08 16:56:15 +01:00
calixteman
58ac273f1f
Merge pull request #20503 from andriivitiv/Fix-Worker-was-terminated-error
Fix `Worker was terminated` error when loading is cancelled
2026-02-06 09:59:05 +01:00
calixteman
b92bdf80a2
Merge pull request #20628 from calixteman/bug2014399
Cap the max canvas dimensions in order to avoid to downscale large images in the worker (bug 2014399)
2026-02-06 09:42:04 +01:00
Tim van der Meij
f302323c7e
Merge pull request #20627 from Snuffleupagus/ChunkedStream-onReceiveData-rm-copy
Improve progress reporting in `ChunkedStreamManager`, and prevent unnecessary data copy in `ChunkedStream.prototype.onReceiveData`
2026-02-05 21:27:42 +01:00
Calixte Denizet
ff42c0bd50
Cap the max canvas dimensions in order to avoid to downscale large images in the worker (bug 2014399) 2026-02-05 20:25:36 +01:00
Jonas Jenwald
b3cd042ded Prevent unnecessary data copy in ChunkedStream.prototype.onReceiveData
This method is only invoked via `ChunkedStreamManager.prototype.sendRequest`, which currently returns data in `Uint8Array` format (since it potentially combines multiple `ArrayBuffer`s).
Hence we end up doing a short-lived, but still completely unnecessary, data copy[1] in `ChunkedStream.prototype.onReceiveData` when handling range requests. In practice this is unlikely to be a big problem by default, given that streaming is used and the (low) value of the `rangeChunkSize` API-option. (However, in custom PDF.js deployments it might affect things more.)

Given that no data copy is better than a short lived one, let's fix this small oversight and add non-production `assert`s to keep it working as intended.
This way we also improve consistency, since all other streaming and range request methods (see e.g. `BasePDFStream` and related code) only return `ArrayBuffer` data.

---
[1] Remember that `new Uint8Array(arrayBuffer)` only creates a view of the underlying `arrayBuffer`, whereas `new Uint8Array(typedArray)` actually creates a copy of the `typedArray`.
2026-02-05 16:16:36 +01:00
Jonas Jenwald
01deb085f8 Improve progress reporting in the ChunkedStreamManager
Currently there's two small bugs, which have existed around a decade, in the `loaded` property that's sent via the "DocProgress" message from the `ChunkedStreamManager.prototype.onReceiveData` method.

 - When the entire PDF has loaded the `loaded` property can become larger than the `total` property, which obviously doesn't make sense.
   This happens whenever the size of the PDF is *not* a multiple of the `rangeChunkSize` API-option, which is a very common situation.

 - When streaming is being used, the `loaded` property can become smaller than the actually loaded amount of data.
   This happens whenever the size of a streamed chunk is *not* a multiple of the `rangeChunkSize` API-option, which is a common situation.
2026-02-05 16:04:45 +01:00
calixteman
22b97d1741
Flush the text content chunk only on real font changes (bug 2013793) 2026-02-03 23:11:31 +01:00
calixteman
1c12b07726
Merge pull request #20613 from calixteman/ccittfax_pdfium
Use the ccittfax decoder from pdfium
2026-02-02 15:07:53 +01:00
calixteman
88c2051698
Use the ccittfax decoder from pdfium
The decoder is a dependency of the jbig2 one and is already
included in pdf.js, so we just need to wire it up.
It improves the performance of documents using ccittfax images.
2026-02-02 11:10:32 +01:00
Tim van der Meij
f4326e17c4
Merge pull request #20610 from calixteman/brotli
Add support for Brotli decompression
2026-02-01 20:41:06 +01:00
calixteman
43273fde27
Add support for Brotli decompression
For now, `BrotliDecode` hasn't been specified but it should be in a
close future.
So when it's possible we use the native `DecompressionStream` API
with "brotli" as argument.
If that fails or if we've to decompress in a sync context, we fallback
to `BrotliStream` which a pure js implementation (see README in external/brotli).
2026-01-31 16:25:53 +01:00
Jonas Jenwald
4ca205bac3 Add an abstract BasePDFStreamRangeReader class, that all the old IPDFStreamRangeReader implementations inherit from
Given that there's no less than *five* different, but very similar, implementations this helps reduce code duplication and simplifies maintenance.
2026-01-30 14:15:39 +01:00
Jonas Jenwald
54d8c5e7b4 Add an abstract BasePDFStreamReader class, that all the old IPDFStreamReader implementations inherit from
Given that there's no less than *five* different, but very similar, implementations this helps reduce code duplication and simplifies maintenance.

Also, remove the `rangeChunkSize` not defined checks in all the relevant stream-constructor implementations.
Note how the API, since some time, always validates *and* provides that parameter when creating a `BasePDFStreamReader`-instance.
2026-01-30 14:15:39 +01:00
Jonas Jenwald
4a8fb4dde1 Add an abstract BasePDFStream class, that all the old IPDFStream implementations inherit from
Given that there's no less than *five* different, but very similar, implementations this helps reduce code duplication and simplifies maintenance.

Also, spotted during rebasing, pass the `enableHWA` option "correctly" (i.e. as part of the existing `transportParams`) to the `WorkerTransport`-class to keep the constructor simpler.
2026-01-30 14:15:39 +01:00