3591 Commits

Author SHA1 Message Date
calixteman
2381ac6b16
Update the internal viewer to use a new debugger.
It has a few cool features:
 - all the canvas used during the rendering can be viewed;
 - the different properties in the graphics state can be viewed;
 - the different paths can be viewed.
2026-03-12 22:38:08 +01:00
Jonas Jenwald
60d6abdf4f A couple of small improvements of the new internal viewer
 - Mention the internal viewer in the README, such that it's easier to find.

 - Implement a new `INTERNAL_VIEWER` define, such that it's easier to limit code to only the "internal-viewer" gulp target.

 - Only include the "GetRawData" message-handler when needed. Note that the `MessageHandler` [already throws](eb159abd6a/src/shared/message_handler.js (L121-L123)) for any missing handler.

 - Move the various new helper functions from `src/core/document.js` and into their own file. The reasons for doing this are:
    - That file is already quite large and complex as-is, and these helper functions are slightly orthogonal to its main functionality.
    - Babel isn't able to remove all of the new code, and by moving this into a separate file we can guarantee that no extra code ends up in e.g. Firefox.
2026-03-10 23:41:35 +01:00
Tim van der Meij
44a63549b0
Merge pull request #20831 from calixteman/internal_viewer
Add a new internal viewer to explore the structure of PDF files.
2026-03-10 20:48:40 +01:00
Tim van der Meij
3f75c4e511
Merge pull request #20829 from Snuffleupagus/Blob-bytes
Start using `Blob.prototype.bytes()` in the code-base
2026-03-10 20:14:49 +01:00
Jonas Jenwald
dbb6ffb8d5 Change the Font.prototype.glyphCacheValues method to return an iterator
This method is only used with loops, and it should be a tiny bit more efficient to use an iterator directly rather than first iterating through the underlying data to create a temporary `Array` that we finally iterate through at the call-site.

*Please note:* As part of these changes the chars/glyph caches, on the `Font` instances, are changed to use `Map`s rather than Objects.
2026-03-09 16:18:48 +01:00
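The iterator change described in dbb6ffb8d5 can be sketched roughly as follows. This is only an illustration, not the actual `Font` code: `FontSketch`, `cacheGlyph` and the `#glyphCache` field are hypothetical names chosen for the example.

```javascript
// Hypothetical stand-in for a class holding a glyph cache; not the real `Font`.
class FontSketch {
  #glyphCache = new Map();

  cacheGlyph(charCode, glyph) {
    this.#glyphCache.set(charCode, glyph);
  }

  // Before: copy every cached value into a temporary Array.
  glyphCacheValuesAsArray() {
    return Array.from(this.#glyphCache.values());
  }

  // After: hand back the Map's own iterator, with no intermediate Array.
  glyphCacheValues() {
    return this.#glyphCache.values();
  }
}

const font = new FontSketch();
font.cacheGlyph(65, { unicode: "A" });
font.cacheGlyph(66, { unicode: "B" });

// Call-sites that only loop over the result work unchanged.
const seen = [];
for (const glyph of font.glyphCacheValues()) {
  seen.push(glyph.unicode);
}
```

A caller that needs `length` or index access would have to spread the iterator into an `Array` itself, which is why this kind of change only fits methods that are used purely with loops.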
calixteman
9d81fafa8c
Add a new internal viewer to explore the structure of PDF files.
The one from pdf.js.utils is a bit too old: a lot of bugs have been fixed
in the code that parses PDF files since then.
It's just an internal development tool, so it doesn't need to be perfect,
but it should be good enough to be useful.
2026-03-09 14:16:12 +01:00
Jonas Jenwald
2598b0dcdd Start using Blob.prototype.bytes() in the code-base
Note that this isn't motivated by the minuscule reduction in code-size, but rather by wanting to unblock using this newer feature; see https://developer.mozilla.org/en-US/docs/Web/API/Blob/bytes
2026-03-08 14:06:03 +01:00
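For readers unfamiliar with `Blob.prototype.bytes()`, mentioned in 2598b0dcdd, here is a minimal sketch of what it replaces. The `blobToBytes` helper and its `arrayBuffer()` fallback are assumptions made for this example (so that it also runs on older runtimes); they are not part of the commit.

```javascript
// Hypothetical helper: reads a Blob into a Uint8Array.
async function blobToBytes(blob) {
  if (typeof blob.bytes === "function") {
    // Newer API: resolves directly with a Uint8Array.
    return blob.bytes();
  }
  // Older pattern: fetch an ArrayBuffer, then wrap it manually.
  return new Uint8Array(await blob.arrayBuffer());
}
```

Usage would look like `const data = await blobToBytes(someBlob);` inside an `async` function.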
calixteman
253ce6e323
Handle outline with Structure Element (SE) destination 2026-03-08 12:28:24 +01:00
Jonas Jenwald
ddd69ce4e0 Remove the "DocProgress" loaded fallback from the getPdfManager function
Falling back to use the `loaded` byteLength if the server `contentLength` is unknown doesn't make a lot of sense, since it'd lead to the `onProgress` callback reporting `percent === 100` repeatedly while the document is loading despite that being obviously wrong.
Instead we'll now report `percent === NaN` in that case, thus showing the indeterminate progressBar, which seems more correct if the `contentLength` is unknown.

Please note that this code-path is normally not even reached, since streaming is enabled by default (applies e.g. to the Firefox PDF Viewer).
2026-03-08 10:22:01 +01:00
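The progress behaviour described in ddd69ce4e0 can be illustrated with a small sketch. The `computePercent` function below is hypothetical and is not the code in `getPdfManager`; it only shows why an unknown `contentLength` maps naturally to `NaN` rather than to `100`.

```javascript
// Hypothetical helper: `loaded` is the number of bytes received so far and
// `contentLength` is the total reported by the server (possibly missing).
function computePercent(loaded, contentLength) {
  if (!contentLength) {
    // Unknown total: report NaN so a UI can show an indeterminate bar.
    // Falling back to `loaded` here would always yield 100.
    return NaN;
  }
  return Math.min(100, Math.round((loaded / contentLength) * 100));
}

const known = computePercent(512, 2048); // 25
const unknown = computePercent(512, undefined); // NaN
```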
Tim van der Meij
98dc351cfa
Merge pull request #20824 from calixteman/bug2015853
Add the possibility to merge/update acroforms when merging/extracting (bug 2015853)
2026-03-07 20:12:02 +01:00
calixteman
baf8647b1f
Add the possibility to merge/update acroforms when merging/extracting (bug 2015853) 2026-03-07 19:03:02 +01:00
Jonas Jenwald
0c514b008b Use Response.prototype.bytes() more in the code-base (PR 20651 follow-up) 2026-03-07 15:50:36 +01:00
Tim van der Meij
d34a15e03f
Merge pull request #20662 from Snuffleupagus/getPdfManager-async-read
Convert the data reading in `getPdfManager` to be asynchronous
2026-03-07 13:16:22 +01:00
Jonas Jenwald
efa13c5e2a Don't duplicate the Jbig2Error exception
Let `src/core/jbig2_ccittFax_wasm.js` import the existing exception, rather than duplicate its code.
2026-03-06 12:04:08 +01:00
Jonas Jenwald
29362e6afb Remove the JBig2CCITTFaxWasmImage instance when running clean-up
This follows the same pattern as the existing handling for the `JpxImage` instance.
2026-03-06 12:04:03 +01:00
Jonas Jenwald
7f4e29ed22 Change the "Terminate" worker-thread handler to an asynchronous function
This is a tiny bit shorter, which cannot hurt.
2026-03-06 11:24:12 +01:00
Jonas Jenwald
e8ab3cb335 Convert the data reading in getPdfManager to be asynchronous
This is not only shorter, but (in my opinion) it also simplifies the code.

*Note:* In order to keep the *five* different `BasePDFStreamReader` implementations consistent, we purposely don't re-factor the `PDFWorkerStreamReader` class to support `for await...of` iteration.
2026-03-05 22:50:26 +01:00
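As a rough illustration of the asynchronous reading style mentioned in e8ab3cb335, the sketch below collects chunks with a plain `await` loop around `read()`. Both `makeReader` and `readAll` are hypothetical; the reader is a toy object that resolves with `{ value, done }` and is not one of the `BasePDFStreamReader` implementations.

```javascript
// Hypothetical reader: `read()` resolves with `{ value, done }`.
function makeReader(chunks) {
  let i = 0;
  return {
    async read() {
      return i < chunks.length
        ? { value: chunks[i++], done: false }
        : { value: undefined, done: true };
    },
  };
}

// Reads until `done`, then concatenates all chunks into one Uint8Array.
async function readAll(reader) {
  const parts = [];
  let length = 0;
  while (true) {
    const { value, done } = await reader.read();
    if (done) {
      break;
    }
    parts.push(value);
    length += value.byteLength;
  }
  const data = new Uint8Array(length);
  let pos = 0;
  for (const part of parts) {
    data.set(part, pos);
    pos += part.byteLength;
  }
  return data;
}
```

Because the loop only relies on `read()`, it works with any reader exposing that method, without requiring `for await...of` support.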
Tim van der Meij
688ae9b3e5
Merge pull request #20811 from calixteman/fix_xref
Add fetch** functions in the XRefWrapper
2026-03-05 22:02:08 +01:00
Tim van der Meij
01bc76e681
Merge pull request #20806 from Snuffleupagus/BinaryCMapStream-extends-Stream
Let `BinaryCMapStream` extend the `Stream` class
2026-03-05 20:43:37 +01:00
Calixte Denizet
150c1e80c2
Add fetch** functions in the XRefWrapper
Not having them could cause failures if they're used during writing.
2026-03-05 19:21:12 +01:00
Jonas Jenwald
fccee4bffd Let BinaryCMapStream extend the Stream class
Looking at the `BinaryCMapStream` implementation, it's basically a "regular" `Stream` but with added functionality for reading compressed CMap data.
Hence, by letting `BinaryCMapStream` extend `Stream`, we can remove an effectively duplicate method and simplify/shorten the code a tiny bit.
2026-03-05 11:45:29 +01:00
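The idea in fccee4bffd, a specialised stream inheriting the generic byte-reading methods instead of duplicating them, can be sketched as follows. All names here (`ByteStream`, `VarIntStream`, `readNumber`) are hypothetical, and the variable-length number encoding is an illustrative assumption rather than the actual compressed CMap format.

```javascript
// Hypothetical minimal byte stream; the real `Stream` class has a larger API.
class ByteStream {
  constructor(bytes) {
    this.bytes = bytes;
    this.pos = 0;
  }

  // Returns the next byte, or -1 at the end of the data.
  getByte() {
    return this.pos < this.bytes.length ? this.bytes[this.pos++] : -1;
  }
}

// The subclass inherits `getByte` and only adds the format-specific reader.
class VarIntStream extends ByteStream {
  // Reads a big-endian base-128 number: 7 payload bits per byte, with the
  // high bit set on every byte except the last one.
  readNumber() {
    let n = 0;
    let b;
    do {
      b = this.getByte();
      if (b < 0) {
        throw new Error("Unexpected end of data");
      }
      n = (n << 7) | (b & 0x7f);
    } while (b & 0x80);
    return n;
  }
}

const stream = new VarIntStream(Uint8Array.of(0x81, 0x00, 0x05));
const first = stream.readNumber(); // two bytes: (1 << 7) | 0
const second = stream.readNumber(); // one byte
```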
Jonas Jenwald
aa445877a9 Use BaseStream.prototype.getString in the readPostScriptTable function
Currently the `customNames` are read one byte at a time, in a loop, and at every iteration converted to a string.
This can be replaced with the `BaseStream.prototype.getString` method, which didn't exist back when this function was written.
2026-03-04 18:34:07 +01:00
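The difference described in aa445877a9 can be sketched with two standalone functions. Neither is the pdf.js code: `readNameByteByByte` mimics the old per-byte loop, and `readNameAtOnce` is a hypothetical stand-in for a `getString`-style helper that converts a whole byte range in one call.

```javascript
// Before: one conversion and one string concatenation per byte.
function readNameByteByByte(bytes, start, length) {
  let name = "";
  for (let i = 0; i < length; i++) {
    name += String.fromCharCode(bytes[start + i]);
  }
  return name;
}

// After: convert the whole range at once. Spreading is fine for short
// names; very long ranges would need to be converted in slices.
function readNameAtOnce(bytes, start, length) {
  return String.fromCharCode(...bytes.subarray(start, start + length));
}

// A length-prefixed name: the first byte (5) is the length of "space".
const table = Uint8Array.of(5, 0x73, 0x70, 0x61, 0x63, 0x65);
const oldName = readNameByteByByte(table, 1, table[0]);
const newName = readNameAtOnce(table, 1, table[0]);
```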
Jonas Jenwald
4d0709c174
Merge pull request #20795 from Snuffleupagus/Dict-more-iterators
Change the `Dict.prototype.{getKeys, getRawValues}` methods to return iterators
2026-03-04 18:26:42 +01:00
calixteman
7384359a41
Merge pull request #20781 from pengkunbin/fix/chinese-font-names-gbk
Fix missing Chinese font name variants (SimFang and XiaoBiaoSong) in GBK encoding detection
2026-03-04 16:49:44 +01:00
Jonas Jenwald
229e3642be Change the Dict.prototype.getRawValues method to return an iterator
This method is usually used with loops, and it should be a tiny bit more efficient to use an iterator directly rather than first iterating through `Map`-values to create a temporary `Array` that we finally iterate through at the call-site.

Note that the `getRawValues` method is old code, and originally the `Dict` class stored its data in a regular `Object`, hence why the old code was written that way.
2026-03-04 16:07:49 +01:00
Jonas Jenwald
58996f21b2 Change the Dict.prototype.getKeys method to return an iterator
This method is usually used with loops, and it should be a tiny bit more efficient to use an iterator directly rather than first iterating through `Map`-keys to create a temporary `Array` that we finally iterate through at the call-site.

Note that the `getKeys` method is old code, and originally the `Dict` class stored its data in a regular `Object`, hence why the old code was written that way.
2026-03-04 16:07:49 +01:00
Jonas Jenwald
40bd73551c
Merge pull request #20793 from Snuffleupagus/more-getRawEntries
Use the `Dict.prototype.getRawEntries` method more
2026-03-04 16:05:57 +01:00
Jonas Jenwald
50d66d7d34 Use the Dict.prototype.getRawEntries method more
This changes a number of loops currently using `Dict.prototype.{getKeys, getRaw}`, since it should be a tiny bit more efficient to use an iterator directly rather than first iterating through `Map`-keys to create a temporary `Array` that we finally iterate through at the call-site.

Note that the `getKeys` method is much older than `getRawEntries`, and originally the `Dict` class stored its data in a regular `Object`, hence why the old code was written that way.
2026-03-04 12:46:25 +01:00
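The loop rewrite described in 50d66d7d34 might look roughly like the sketch below. `DictSketch` is a hypothetical `Map`-backed class written for this example; it borrows the method names from the commit message but is not the real `Dict` implementation.

```javascript
// Hypothetical Map-backed dictionary; not the real `Dict`.
class DictSketch {
  #map = new Map();

  set(key, value) {
    this.#map.set(key, value);
  }

  getRaw(key) {
    return this.#map.get(key);
  }

  // Old style: materialise all keys into a temporary Array.
  getKeysArray() {
    return [...this.#map.keys()];
  }

  // Iterator over [key, value] pairs, without a temporary Array.
  getRawEntries() {
    return this.#map.entries();
  }
}

const dict = new DictSketch();
dict.set("Type", "Font");
dict.set("Subtype", "TrueType");

// Before: build an Array of keys, then do one lookup per key.
const before = [];
for (const key of dict.getKeysArray()) {
  before.push(`${key}=${dict.getRaw(key)}`);
}

// After: a single pass over the entries iterator.
const after = [];
for (const [key, value] of dict.getRawEntries()) {
  after.push(`${key}=${value}`);
}
```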
Nicolò Ribaudo
2f2d5c9e27
Add script to check license headers 2026-03-04 10:40:39 +01:00
jizou
0e1b5cd7bb Fix missing Chinese font name variants (SimFang and XiaoBiaoSong) in GBK encoding detection 2026-03-03 17:04:59 +08:00
Tim van der Meij
f32b9d2677
Merge pull request #20738 from Snuffleupagus/function-shorten
Slightly shorten some code in the `src/core/function.js` file
2026-03-01 20:06:29 +01:00
Jeff Muizelaar
8fa6ef36e4 Remove scientific notation parsing.
This behaviour comes from the initial pdf.js commit but is wrong and
doesn't match other PDF readers like muPDF or pdfium.

From PDF Spec 7.3.3:

A PDF writer shall not use the PostScript language syntax for numbers with non-decimal radices (such
as 16#FFFE) or in exponential format (such as 6.02E23).
2026-02-26 20:22:34 -05:00
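To make the rule quoted in 8fa6ef36e4 concrete, here is a small sketch of a checker that accepts only plain decimal integers and reals. `parsePdfNumber` and its regular expression are hypothetical and are not the pdf.js lexer; in particular, returning `null` for a rejected token is an assumption made for the example, since the commit message does not say how such input is handled afterwards.

```javascript
// Optional sign, then digits with an optional decimal point (leading,
// embedded or trailing). No exponent and no radix prefix are allowed.
const PDF_NUMBER = /^[+-]?(?:\d+\.?\d*|\.\d+)$/;

// Hypothetical helper: returns a number, or null for non-decimal syntax.
function parsePdfNumber(token) {
  return PDF_NUMBER.test(token) ? Number(token) : null;
}

const plain = parsePdfNumber("-.002"); // accepted
const exponent = parsePdfNumber("6.02E23"); // rejected
const radix = parsePdfNumber("16#FFFE"); // rejected
```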
Jonas Jenwald
8ced999803 Slightly shorten some code in the src/core/function.js file 2026-02-26 10:42:56 +01:00
Jonas Jenwald
80db3609f4 Remove unused lastCode property from the LZWStream class (PR 324 follow-up)
This appears to have been unused already in PR 324 all the way back in 2011.
2026-02-25 13:31:44 +01:00
Tim van der Meij
4ecbd0cbe2
Merge pull request #20726 from Snuffleupagus/getOrInsertComputed-fewer-functions
Reduce allocations and function creation when using `getOrInsert` and `getOrInsertComputed`
2026-02-24 23:32:36 +01:00
Tim van der Meij
b43c8eab73
Merge pull request #20725 from calixteman/bug2018162
After cut & paste, the thumbnail must be correctly rendered (bug 2018162)
2026-02-24 23:27:07 +01:00
Jonas Jenwald
0d4e587a5f Reduce allocations when using Map.prototype.getOrInsert() with Arrays
Change all these cases to use `Map.prototype.getOrInsertComputed()` instead, in combination with a helper function for creating the `Array`s (similar to the previous patch).
2026-02-24 09:03:32 +01:00
Jonas Jenwald
2e07715c9d Reduce function creation when using Map.prototype.getOrInsertComputed()
With the exception of the first invocation the callback function is unused, which means that a lot of pointless functions may be created.
To avoid this we introduce helper functions for simple cases, such as creating `Map`s and `Object`s.
2026-02-24 08:58:28 +01:00
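The pattern from 2e07715c9d (and the related `Array` case in 0d4e587a5f) can be sketched as below. The names `makeArr` and `makeMap` are hypothetical, and the standalone `getOrInsertComputed` wrapper with its fallback is an assumption added so that the example also runs where `Map.prototype.getOrInsertComputed()` is not yet available.

```javascript
// Uses the native method when present, otherwise a simple fallback.
function getOrInsertComputed(map, key, callback) {
  if (typeof map.getOrInsertComputed === "function") {
    return map.getOrInsertComputed(key, callback);
  }
  if (!map.has(key)) {
    map.set(key, callback(key));
  }
  return map.get(key);
}

// Shared factories, created once instead of once per call-site invocation.
const makeArr = () => [];
const makeMap = () => new Map();

const itemsByPage = new Map();

// Wasteful alternative: `() => []` written inline would create a new function
// on every call, although it only runs when the key is missing.
getOrInsertComputed(itemsByPage, 1, makeArr).push("a");
getOrInsertComputed(itemsByPage, 1, makeArr).push("b");
getOrInsertComputed(itemsByPage, 2, makeArr).push("c");

const nested = new Map();
getOrInsertComputed(nested, "fonts", makeMap).set("F1", true);
```

The same reasoning applies to `getOrInsert(key, [])`: the `Array` literal is allocated on every call, even when the key already exists, whereas the computed variant only allocates on a miss.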
calixteman
15e7a551ab
Reset transfer functions when entering a new group
It fixes #20722.
2026-02-23 22:37:20 +01:00
Calixte Denizet
97d973ce09
After cut & paste, the thumbnail must be correctly rendered (bug 2018162) 2026-02-23 18:38:33 +01:00
Jonas Jenwald
c2f5e19eb0 Use Map.prototype.getOrInsertComputed() in the src/core/xfa/ folder 2026-02-22 22:57:50 +01:00
Tim van der Meij
a5a27a5ca7
Merge pull request #20705 from Snuffleupagus/#collectParents-getOrInsert
Use `Map.prototype.getOrInsert()` in the `#collectParents` method
2026-02-22 12:55:41 +01:00
Tim van der Meij
8189ca358c
Merge pull request #20703 from Snuffleupagus/#collectFieldObjects-getOrInsert
Use `Map.prototype.getOrInsert()` in the `#collectFieldObjects` method
2026-02-22 12:39:54 +01:00
Jonas Jenwald
3e7ad8d6bf Use Map.prototype.getOrInsert() in the #collectParents method 2026-02-21 11:42:42 +01:00
Jonas Jenwald
210c969c4c Use Map.prototype.getOrInsert() in the #collectFieldObjects method 2026-02-21 11:23:32 +01:00
Jonas Jenwald
76a5aed05f Use Map.prototype.getOrInsert() in the getNewAnnotationsMap helper 2026-02-21 11:03:00 +01:00
Tim van der Meij
82de22428a
Merge pull request #20660 from Snuffleupagus/ChunkedStream-async-sendRequest
Convert `ChunkedStreamManager.prototype.sendRequest` to an asynchronous method
2026-02-20 21:39:26 +01:00
calixteman
a5c62b7489
Merge pull request #20691 from Snuffleupagus/rm-unnecessary-Map-entries
Remove unnecessary `Map.prototype.entries()` usage
2026-02-20 17:44:19 +01:00
Jonas Jenwald
374f524c29 Remove unnecessary Map.prototype.entries() usage
A `Map` instance can be iterated directly with a `for...of` loop, hence using its `entries` method is not actually necessary.
2026-02-20 13:44:00 +01:00
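A short sketch of the equivalence that 374f524c29 relies on; the variable names are made up for the example.

```javascript
const sizes = new Map([
  ["a", 1],
  ["b", 2],
]);

// With an explicit `entries()` call.
const viaEntries = [];
for (const [key, value] of sizes.entries()) {
  viaEntries.push(`${key}:${value}`);
}

// Iterating the Map directly yields the same [key, value] pairs.
const direct = [];
for (const [key, value] of sizes) {
  direct.push(`${key}:${value}`);
}

// The default iterator of a Map is its `entries` method.
const sameFunction = Map.prototype[Symbol.iterator] === Map.prototype.entries;
```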
Jonas Jenwald
7fd939763e Remove unnecessary class constructors in the src folder
There are a number of classes where the constructors can be removed completely by instead using class fields, which helps to slightly shorten the code.

It seems that the `unicorn/prefer-class-fields` ESLint rule, see PR 20657, unfortunately isn't able to detect all of these cases.
2026-02-19 00:08:57 +01:00
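The kind of simplification described in 7fd939763e looks roughly like this. Both classes are hypothetical examples and are not taken from the `src` folder.

```javascript
// Before: a constructor that only assigns constant initial values.
class QueueWithConstructor {
  constructor() {
    this.count = 0;
    this.items = [];
  }
}

// After: class fields express the same initial state without a constructor.
class QueueWithFields {
  count = 0;
  items = [];
}

const q1 = new QueueWithFields();
const q2 = new QueueWithFields();
q1.items.push("x"); // Field initialisers run per instance, so q2 is unaffected.
```

This only applies when the constructor takes no arguments and does nothing beyond simple assignments; constructors with parameters or other logic still need to stay.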