1485 Commits

Author SHA1 Message Date
calixteman
f61e00f2fa
Merge pull request #21054 from calixteman/fix_writing_numbers
Fix the way to write numbers when saving a pdf
2026-04-07 16:55:36 +02:00
Jonas Jenwald
6f0431456c Reduce allocations when compiling CFF fonts
Currently the `CFFCompiler.prototype.compile` implementation seems a bit inefficient, since the data is stored in a plain Array that needs to grow (a lot) during compilation. Additionally, adding a lot of entries isn't very efficient either and requires special handling of the "too many elements" case.
Some of the "helper" methods that use TypedArrays internally currently need to convert their return data to plain Arrays, via the `compileTypedArray` method, which adds even more intermediate allocations.
Note also that the `OpenTypeFileBuilder` has a special-case for writing plain Array data, which is only needed because of how the CFF compilation is implemented.

To improve this situation the `CFFCompiler.prototype.compile` method is re-factored to store its data in a TypedArray, whose initial size is estimated from the "raw" file size.
This removes the need for most intermediate allocations, and it also handles adding of "many elements" more efficiently.
2026-04-07 14:27:55 +02:00
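To illustrate the idea behind 6f0431456c, here is a minimal sketch of a byte buffer that starts at an estimated size and grows only when needed. It is not the actual `CFFCompiler` code: the `GrowableBuffer` name, its methods, and the doubling strategy are assumptions made for illustration.

```javascript
// Hypothetical helper (not part of PDF.js): a byte buffer backed by a
// Uint8Array that starts at an estimated size and doubles when it runs out.
class GrowableBuffer {
  constructor(estimatedSize) {
    this.data = new Uint8Array(Math.max(estimatedSize, 16));
    this.length = 0;
  }

  // Make sure there is room for `extra` more bytes.
  ensure(extra) {
    const needed = this.length + extra;
    if (needed <= this.data.length) {
      return;
    }
    let capacity = this.data.length;
    while (capacity < needed) {
      capacity *= 2;
    }
    const grown = new Uint8Array(capacity);
    grown.set(this.data.subarray(0, this.length));
    this.data = grown;
  }

  // Append bytes from a plain Array or a TypedArray.
  add(bytes) {
    this.ensure(bytes.length);
    this.data.set(bytes, this.length);
    this.length += bytes.length;
  }

  // Return a view (no copy) of only the bytes written so far.
  finish() {
    return this.data.subarray(0, this.length);
  }
}
```

With a good size estimate the buffer is rarely (ideally never) reallocated, and helpers that already produce TypedArrays can be appended without converting them to plain Arrays first.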
Calixte Denizet
3d95aab8d7
Fix the way to write numbers when saving a pdf
This avoids writing numbers such as 1e-23.
2026-04-07 10:52:06 +02:00
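The problem addressed by 3d95aab8d7 is that PDF syntax has no exponent notation, so a value serialized as `1e-23` is invalid. The sketch below shows one way to avoid that; the `formatPdfNumber` name and the `maxDecimals` default are assumptions, and the actual patch may work differently.

```javascript
// Hypothetical helper (not the PDF.js implementation): serialize a number
// without ever producing exponent notation such as "1e-23".
function formatPdfNumber(value, maxDecimals = 6) {
  if (!Number.isFinite(value)) {
    throw new Error("Cannot serialize a non-finite number.");
  }
  if (Number.isInteger(value)) {
    // Number.prototype.toString switches to exponent notation at 1e21,
    // so very large integers go through BigInt instead.
    return Math.abs(value) < 1e21
      ? value.toString()
      : BigInt(value).toString();
  }
  // toFixed never uses exponent notation for magnitudes below 1e21.
  const str = value.toFixed(maxDecimals).replace(/\.?0+$/, "");
  // Tiny negative values round to "-0"; write them as plain "0".
  return str === "-0" ? "0" : str;
}
```

Values too small to matter at the chosen precision collapse to `0` instead of being written in scientific notation.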
calixteman
a9f142c796
Unconditionally create a gpu device
One drawback of the current implementation is that the GPU device can be
unavailable at the time of the first pattern fill, which causes the
GPU-accelerated canvas to be moved to the main thread because of putImageData.

Most of the shading-pattern code will be moved to the GPU, and in order
to avoid creating useless data we need to know whether the GPU is available.

So in this patch we create the GPU device during the worker initialization
and pass a flag to the evaluator to know if the GPU is available or not.
2026-04-06 13:23:29 +02:00
Jonas Jenwald
ccab310a39 Add an optional parameter in buildPostScriptJsFunction to force use of the PSStackBasedInterpreter code
This way the test-only function `buildPostScriptProgramFunction` can be removed.
2026-04-05 13:52:09 +02:00
Tim van der Meij
68da778329
Introduce a function type enumeration
This improves readability by removing "magic" numbers, and matches what
we already have for e.g. annotation and shading types.

Note that function type 1 does not exist in the specification, but that
also applies to everything higher than 4, so we can also remove the
specific handling of function type 1 and instead just let it fall
through to throwing an exception for unknown function types, in which we
now also log the provided function type to aid debugging.
2026-04-04 14:57:59 +02:00
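A sketch of what such an enumeration could look like for 68da778329. The numeric values are the function types defined by the PDF specification; the constant names and the `describeFunctionType` helper are illustrative assumptions, not the identifiers used in the patch.

```javascript
// Function types from the PDF specification. Type 1 does not exist, so it
// is simply absent here and handled like any other unknown type.
const FunctionType = Object.freeze({
  SAMPLED: 0,
  EXPONENTIAL_INTERPOLATION: 2,
  STITCHING: 3,
  POSTSCRIPT_CALCULATOR: 4,
});

// Hypothetical dispatcher: unknown types fall through to an exception
// that includes the offending value to aid debugging.
function describeFunctionType(type) {
  switch (type) {
    case FunctionType.SAMPLED:
      return "sampled";
    case FunctionType.EXPONENTIAL_INTERPOLATION:
      return "exponential interpolation";
    case FunctionType.STITCHING:
      return "stitching";
    case FunctionType.POSTSCRIPT_CALCULATOR:
      return "PostScript calculator";
    default:
      throw new Error(`Unknown function type: ${type}`);
  }
}
```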
Jonas Jenwald
f6bac014ea [api-minor] Remove PostScriptCompiler and PostScriptEvaluator, since it's now dead code (PR 21023 follow-up)
These classes, and various related code, became unused after PR 21023 with only unit-tests actually running that code now.

Also removes the `isEvalSupported` API option, since the `PostScriptCompiler` was the only remaining code where `eval` was used.
2026-04-03 22:14:14 +02:00
Tim van der Meij
d1a711bca3
Merge pull request #21023 from calixteman/wasm_stack_js
Add a js fallback for interpreting ps code
2026-04-03 20:09:29 +02:00
Jonas Jenwald
68366e31e4 Move the MathClamp helper function to its own file
This allows using it in the `src/scripting_api/` folder, without increasing the size of the scripting-bundle by also importing a bunch of unused code.
2026-04-02 11:22:28 +02:00
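For reference, a clamp helper of this kind is typically a one-liner, which is why importing it on its own (68366e31e4) avoids pulling in unrelated code. The signature below is an assumption and has not been checked against the moved file.

```javascript
// Sketch of a clamp helper: restrict `v` to the range [min, max].
function MathClamp(v, min, max) {
  return Math.min(Math.max(v, min), max);
}
```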
calixteman
8c7a5f3500
Add a js fallback for interpreting ps code
It's a basic stack based interpreter.
A wasm version will come soon.
2026-04-01 21:40:45 +02:00
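To give a rough idea of what "a basic stack based interpreter" (8c7a5f3500) means, here is a heavily simplified sketch. It handles only a handful of operators and takes a pre-tokenized program; the `runPostScript` name and the token format are assumptions, and the real interpreter covers far more of the PostScript calculator language.

```javascript
// Pop two operands, failing loudly on stack underflow.
function pop2(stack) {
  if (stack.length < 2) {
    throw new Error("Stack underflow.");
  }
  const b = stack.pop();
  const a = stack.pop();
  return [a, b];
}

// A few illustrative operators, keyed by name.
const OPERATORS = new Map([
  ["add", s => { const [a, b] = pop2(s); s.push(a + b); }],
  ["sub", s => { const [a, b] = pop2(s); s.push(a - b); }],
  ["mul", s => { const [a, b] = pop2(s); s.push(a * b); }],
  ["exch", s => { const [a, b] = pop2(s); s.push(b, a); }],
  ["dup", s => {
    if (s.length < 1) {
      throw new Error("Stack underflow.");
    }
    s.push(s[s.length - 1]);
  }],
]);

// `program` is an array of tokens (numbers or operator names) and
// `inputs` seeds the operand stack; the final stack is the output.
function runPostScript(program, inputs) {
  const stack = [...inputs];
  for (const token of program) {
    if (typeof token === "number") {
      stack.push(token);
      continue;
    }
    const op = OPERATORS.get(token);
    if (!op) {
      throw new Error(`Unsupported operator: ${token}`);
    }
    op(stack);
  }
  return stack;
}
```

For example, the program `{ dup mul 2 add }` applied to the input `3` would be run as `runPostScript(["dup", "mul", 2, "add"], [3])`, leaving `11` on the stack.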
Calixte Denizet
f373923170 Encrypt pdf data when merging the same pdf (bug 2028369) 2026-04-01 19:01:11 +02:00
calixteman
399fce6471
Merge pull request #21010 from calixteman/ps_js
Add an interpreter for optimized ps code
2026-03-31 22:21:00 +02:00
Calixte Denizet
9f3de1edf6
Add an interpreter for optimized ps code
It'll be used as a fallback when wasm is disabled.
Also add debugger views for the generated js code and for the ps code.
2026-03-31 21:00:22 +02:00
Calixte Denizet
3727b7095a Add support for function-based shadings (bug 1254066)
It fixes #5046.
We just generate a mesh for the pattern rectangle where the color of each vertex is computed from the function.
Since the mesh is generated in the worker we don't really take into account the current transform when it's drawn.
That being said, there may be possible improvements from using the gpu directly for the shading creation,
which could then take into account the current transform, but that would only work with ps functions we can convert
into the wgsl language and simple enough color spaces (gray and rgb).
2026-03-31 20:46:01 +02:00
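The mesh approach described in 3727b7095a can be sketched as follows: sample the shading function on a regular grid over the pattern rectangle and store one color per vertex. The `buildShadingMesh` name, the grid resolution, and the returned layout are all assumptions for illustration; the actual patch may structure the mesh differently.

```javascript
// Hypothetical sketch: evaluate `colorFn(x, y)` (returning [r, g, b] in
// 0..255) at each vertex of a (cols + 1) x (rows + 1) grid that covers the
// rectangle [x0, y0, x1, y1].
function buildShadingMesh([x0, y0, x1, y1], colorFn, cols = 8, rows = 8) {
  const count = (cols + 1) * (rows + 1);
  const coords = new Float32Array(count * 2);
  const colors = new Uint8ClampedArray(count * 3);
  let k = 0;
  for (let j = 0; j <= rows; j++) {
    const y = y0 + ((y1 - y0) * j) / rows;
    for (let i = 0; i <= cols; i++) {
      const x = x0 + ((x1 - x0) * i) / cols;
      coords[2 * k] = x;
      coords[2 * k + 1] = y;
      colors.set(colorFn(x, y), 3 * k);
      k++;
    }
  }
  return { coords, colors, verticesPerRow: cols + 1 };
}
```

Because the grid is fixed when the mesh is built, its resolution does not adapt to the transform in effect at draw time, which is the limitation the commit message points out.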
Tim van der Meij
58b807d8e8
Merge pull request #21008 from calixteman/ast_cse
Avoid expressions duplication in the ps AST and use a local instead when compiling to WASM
2026-03-31 20:21:59 +02:00
Tim van der Meij
48228e2756
Merge pull request #21013 from calixteman/bug2026956
Add attachments when merging/reorganizing a pdf (bug 2026956)
2026-03-31 20:17:54 +02:00
Calixte Denizet
5b8c04f383 Add attachments when merging/reorganizing a pdf (bug 2026956) 2026-03-31 14:48:06 +02:00
Calixte Denizet
63cf35b47f Avoid expressions duplication in the ps AST and use a local instead when compiling to WASM 2026-03-30 16:30:33 +02:00
Jonas Jenwald
bfffb6c0f0 Import fs/promises directly in a few spots in the unit-tests
Also, use the existing PDF.js helper function to fetch text-data when running the "bidi" tests in browsers.
2026-03-30 14:34:53 +02:00
calixteman
952952c905
[api-minor] Rewrite the ps lexer & parser and add a small Wasm compiler
The main goal is to remove the eval-based interpreter.
To get good performance, the new parser performs some optimizations
on the AST (similar to the ones in the previous implementation),
and the Wasm compiler generates code for the optimized AST.
For now, in case of errors or unsupported features, the Wasm compiler returns null
and the old interpreter is used as a fallback.
A few things are still missing:
 - a wasm-based interpreter using a stack (in case the ps code isn't stack-free);
 - a better js implementation in case of disabled wasm.

 but they will be added in follow-up patches.
2026-03-30 09:22:33 +02:00
Tim van der Meij
ada3438039
Merge pull request #21001 from Snuffleupagus/getDestFromStructElement-unit-test
Add a unit-test for the `Catalog.#getDestFromStructElement` method
2026-03-29 16:08:21 +02:00
Jonas Jenwald
498daadf3c Simplify the applyOpacity helper function
This function only has a single call-site (if we ignore the unit-tests), where the colors are split into separate parameters.
Given that all the color components are modified in the exact same way, it seems easier (and shorter) to pass the colors as-is to `applyOpacity` and have it use `Array.prototype.map()` instead.
2026-03-29 14:52:06 +02:00
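The refactoring in 498daadf3c can be pictured like this. The sketch assumes that opacity is applied by blending each component toward white; that formula is an assumption for illustration and may not match the real `applyOpacity` exactly.

```javascript
// Sketch: apply the same opacity blend to every color component by
// mapping over the array, instead of taking r, g, b as separate parameters.
function applyOpacity(colors, opacity) {
  const alpha = Math.min(Math.max(opacity ?? 1, 0), 1);
  const white = 255 * (1 - alpha);
  return colors.map(c => Math.round(c * alpha + white));
}
```

Since every component goes through the identical computation, `Array.prototype.map()` removes the repetition and works for any number of components.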
Jonas Jenwald
d1f15fe352 Add a unit-test for the Catalog.#getDestFromStructElement method
This code already has an integration-test, however also having a unit-test shouldn't hurt since those are often easier to run and debug (and it nicely complements the existing `outline` unit-tests).

The patch also makes the following smaller changes to the method itself:
 - Avoid creating and parsing an empty Array, when doing the `pageRef` search.
 - Use `XRef.prototype.fetch` directly, when walking the parent chain, since the check just above ensures that the value is a Reference.
 - Use the `lookupRect` helper when parsing the /BBox entry.
2026-03-29 14:01:43 +02:00
Calixte Denizet
2e3d79e616
Break text chunks only if the base font is different
It fixes #20956.
2026-03-26 21:39:32 +01:00
Calixte Denizet
42c229c267
Add the bidi tests coming from BidiTest.txt and BidiCharacterTest.txt
Some tests were failing and have been fixed:
 - "Hello" + Alef + "(" + Bet: the "(" (neutral) was not considered as a part of the group Alef(Bet and the group wasn't reversed;
 - some intermediate neutrals were considered as strong.
2026-03-25 15:18:50 +01:00
Jonas Jenwald
a0102abe76 Move the NetworkStream choice from src/display/api.js and into a separate file
This code already isn't used (or even bundled) in the Firefox PDF Viewer, and it also slightly reduces the number of import maps that need to be maintained.
2026-03-24 17:08:04 +01:00
Jonas Jenwald
3a372fde94 [api-minor] Replace the CMapReaderFactory, StandardFontDataFactory, and WasmFactory API options with a single factory/option
Currently we have no less than three different, but very similar, factories for reading built-in CMap files, standard font files, and wasm files on the main-thread.[1]
These factories were added at different points in time, since I cannot imagine that we'd add essentially three copies of the same code otherwise.

Nowadays these factories are often not even used[2], since worker-thread fetching is used whenever possible to improve performance. In particular, they will *only* be used when either:
 - The PDF.js library runs in Node.js environments.
 - The user manually sets `useWorkerFetch = false` when calling `getDocument`.
 - The user provides custom `CMapReaderFactory`, `StandardFontDataFactory`, and/or `WasmFactory` instances when calling `getDocument`.

By replacing these factories with *a single* new `BinaryDataFactory` factory/option the number of `getDocument` options is thus reduced, which cannot hurt.
This also reduces the total bundle-size of the Firefox PDF Viewer a little bit, and it slightly reduces the number of import maps that need to be maintained.

*Please note:* For users that provide custom `CMapReaderFactory`, `StandardFontDataFactory`, and `WasmFactory` instances when calling `getDocument` this will be a breaking change, however it's unlikely that (many) such users exist.
(The *internal* data-format of `CMapReaderFactory` was changed in PR 18951, and there hasn't been a single question/complaint about it in well over a year.)

---

[1] Any new functionality could easily lead to more such factories being added in the future, which wouldn't be great.

[2] Note that the Firefox PDF Viewer no longer uses these factories, since it "forcibly" sets `useWorkerFetch = true` during building.
2026-03-22 15:49:06 +01:00
calixteman
ec24053ddf
Don't add an EOL after a superscript 2026-03-22 14:20:18 +01:00
Jonas Jenwald
262aeef3fa [api-minor] Simplify BaseCMapReaderFactory by having the worker-thread create the filename
The `BaseCMapReaderFactory`, `BaseStandardFontDataFactory`, and `BaseWasmFactory` classes are all very similar, and the only difference is really in their respective `fetch` methods.
By having the worker-thread "compute" the complete `filename` it's possible to simplify the `BaseCMapReaderFactory.prototype.fetch` method, which will allow future improvements to all of these classes.

A couple of things to note:
 - This code is unused, and it's not even bundled, in the Firefox PDF Viewer.
 - In browsers it's unused by default, and worker-thread fetching will always be used when possible since that's more efficient.

*Please note:* For users that provide a custom `CMapReaderFactory` instance when calling `getDocument` this could be a breaking change, however it's unlikely that any such users exist.
(The *internal* format of this data was changed previously in PR 18951, and there hasn't been a single question/complaint about it in well over a year.)
2026-03-21 15:54:40 +01:00
Tim van der Meij
ab228da9ce
Merge pull request #20931 from Snuffleupagus/rm-factory-name-validation
Remove explicit `name`/`filename` validation in the `BaseCMapReaderFactory`, `BaseStandardFontDataFactory`, and `BaseWasmFactory` classes
2026-03-20 20:15:23 +01:00
calixteman
16aee06aac
Merge pull request #20925 from calixteman/reorganize_save_annotations
Add the possibility to save added annotations when reorganizing a pdf (bug 2023086)
2026-03-20 16:32:10 +01:00
Jonas Jenwald
5299eb2b83 Remove explicit name/filename validation in the BaseCMapReaderFactory, BaseStandardFontDataFactory, and BaseWasmFactory classes
Given that these classes are only used from the "FetchBinaryData" message handler, the `name`/`filename` parameters should never actually be missing and if they are that's a bug elsewhere in the code-base.
Furthermore a missing `name`/`filename` parameter would result in a "nonsense" URL and the actual data fetching would then fail instead, hence keeping this old validation code just doesn't seem necessary.
2026-03-20 15:50:26 +01:00
Calixte Denizet
04272de41d
Add the possibility to save added annotations when reorganizing a pdf (bug 2023086) 2026-03-20 10:55:47 +01:00
Calixte Denizet
c17801b77e Avoid getting null value in RefSet when cloning 2026-03-20 10:41:58 +01:00
Tim van der Meij
ff1af5a058
Merge pull request #20916 from calixteman/fix_co
When merging pdfs, fix the CO after the fields have been cloned
2026-03-19 21:22:43 +01:00
Tim van der Meij
6245bb201c
Merge pull request #20915 from calixteman/fix_pageindice
Avoid to use a used slot when looking for a new page position
2026-03-19 21:22:32 +01:00
Tim van der Meij
8cae5d17f2
Merge pull request #20917 from calixteman/fix_dup_name_dest
Fix the destination names when they're duplicated
2026-03-19 21:22:19 +01:00
Jonas Jenwald
7609a42209 Use toBeInstanceOf consistently in the unit-tests
There are currently a lot of unit-tests that manually check `instanceof`, let's replace that with the built-in Jasmine matcher function; see https://jasmine.github.io/api/edge/matchers.html#toBeInstanceOf
2026-03-19 17:18:25 +01:00
Calixte Denizet
cf67c1ef1e
Fix the destination names when they're duplicated 2026-03-19 10:52:39 +01:00
Calixte Denizet
b7da4b80a9
When merging pdfs, fix the CO after the fields have been cloned 2026-03-19 10:09:40 +01:00
Calixte Denizet
0bee641fed
Avoid to use a used slot when looking for a new page position 2026-03-19 09:40:16 +01:00
Jonas Jenwald
bdc16f8999
Merge pull request #20868 from Snuffleupagus/exportData-compileFontInfo
Move the `compileFontInfo` call into the `Font.prototype.exportData` method (PR 20197 follow-up)
2026-03-18 11:14:46 +01:00
Calixte Denizet
e67892d035
Add support for saving outlines after reorganize/merge (bug 2009574) 2026-03-17 22:22:13 +01:00
Jonas Jenwald
7d963ddc7c Move the compileFontInfo call into the Font.prototype.exportData method (PR 20197 follow-up)
After the changes in PR 20197 the code in the `TranslatedFont.prototype.send` method is not all that readable[1] given how it handles e.g. the `charProcOperatorList` data used with Type3 fonts.
Since this is the only spot where `Font.prototype.exportData` is used, it seems much simpler to move the `compileFontInfo` call there and *directly* return the intended data rather than messing with it after the fact.

Finally, while it doesn't really matter, the patch flips the order of the `charProcOperatorList` and `extra` properties throughout the code-base since the former is used with Type3 fonts while the latter (effectively) requires that debugging is enabled.

---

[1] I had to re-read it twice, also looking at all the involved methods, in order to convince myself that it's actually correct.
2026-03-16 09:29:17 +01:00
calixteman
3ff52e415f
Merge pull request #20862 from calixteman/bug2023106
Check for having Ref before adding them in a RefSet (bug 2023106)
2026-03-15 22:15:58 +01:00
Calixte Denizet
0fca64f01e
Check for having Ref before adding them in a RefSet (bug 2023106) 2026-03-15 22:03:39 +01:00
Tim van der Meij
315491dd32
Merge pull request #20840 from Snuffleupagus/getDocument-rm-length
[api-minor] Remove the `length` parameter from `getDocument`
2026-03-15 11:48:02 +01:00
Jonas Jenwald
09a9a7bd0b [api-minor] Remove the length parameter from getDocument
This is an old API-parameter that is now unused within the PDF.js project itself, and its description says that it's (partly) being used for "range requests operations".
Note that the `length` API-parameter is used to set the *initial* `contentLength` in various `BasePDFStreamReader` implementations, however it's always overridden by the "Content-Length" header (sent by the server) when that one exists *and* is a valid number. While we currently fall back to keeping the initial `contentLength` otherwise, note that in that case range requests will always be *disabled* and thus the only spot in the code-base [where `fullReader.contentLength` is necessary](873378b718/src/core/worker.js (L230-L236)) cannot actually be reached.

Hence the only possible reason to use the `length` API-parameter would be for improved progress reporting[1] during streaming of PDF data in rare cases where the "Content-Length" header is missing/invalid, but the user *somehow* has information from another source about the correct `length` of the PDF document.
That situation feels very much like an edge-case, but it's obviously impossible to know if someone is depending on it. However, please note that there's a work-around available for users affected by this removal:
 - Implement a `PDFDataRangeTransport` instance together with custom data-fetching[2], since in that case its `length`-parameter will always be used as-is.

Finally, updates various `BasePDFStreamReader` implementations to only set the `_isRangeSupported` field once the headers are available (since previously we'd just overwrite the "initial" value anyway).

---

[1] I.e. to avoid the "indeterminate" loadingBar being displayed in the viewer.

[2] This is what e.g. the Firefox PDF Viewer uses.
2026-03-13 23:42:45 +01:00
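The pre-removal behaviour described in 09a9a7bd0b can be summarized with a small sketch. The `resolveContentLength` name and the returned object shape are hypothetical; this is not the `BasePDFStreamReader` code, only an illustration of the rule stated in the commit message.

```javascript
// Hypothetical illustration: a valid "Content-Length" header always wins
// over the caller-supplied initial length; without one, the initial length
// is kept but range requests cannot be used.
function resolveContentLength(headerValue, initialLength) {
  const parsed = parseInt(headerValue, 10);
  const headerIsValid = Number.isInteger(parsed);
  return {
    contentLength: headerIsValid ? parsed : initialLength,
    // Other conditions also apply in practice; a missing or invalid header
    // is simply enough to rule range requests out.
    rangeRequestsPossible: headerIsValid,
  };
}
```

In other words, the caller-supplied length only ever mattered when the header was unusable, and in that case it could affect progress reporting but never range requests.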
Jonas Jenwald
3842936edf Split the src/shared/obj-bin-transform.js file into separate files for the main/worker threads (PR 20197 follow-up)
On the worker-thread only the static `write` methods are actually used, and on the main-thread only class instances are being created.
Hence this, after PR 20197, leads to a bunch of dead code in both of the *built* `pdf.mjs` and `pdf.worker.js` files.

This patch reduces the size of the `gulp mozcentral` output by `21 419` bytes, i.e. `21` kilo-bytes, which I believe is way too large of a saving to not do this.
(I can't even remember the last time we managed to reduce build-size this much with a single patch.)
2026-03-13 11:21:24 +01:00
calixteman
9d093d9607
Merge pull request #20626 from nicolo-ribaudo/images-right-click
Add support for right-clicking on images (bug 1012805)
2026-03-11 11:45:51 +01:00