688 Commits

Author SHA1 Message Date
Calixte Denizet
a33e06cafb Use a black backdrop for Luminosity SMasks when /BC is missing
It fixes #21346.
2026-05-26 22:36:50 +02:00
Tim van der Meij
d1c85f87f7
Merge pull request #21330 from calixteman/fix_regex
Enable 'eslint-plugin-regexp' and fix existing findings
2026-05-25 18:22:21 +02:00
Calixte Denizet
8f85e3f20b Load the predefined CMap for composite fonts that omit the FontDescriptor
and add font substitutions for the standard Acrobat CJK families.
2026-05-25 14:44:48 +02:00
Calixte Denizet
7bda0fc97c Enable 'eslint-plugin-regexp' and fix existing findings
Enable the recommended preset and fix or per-line-disable the 78
findings it surfaces. Most are equivalent rewrites, intentional
patterns (control chars, the whatwg email regex, autolinker URL regex)
keep their behavior via targeted disables.
2026-05-25 14:31:55 +02:00
calixteman
adcde1175e Substitute a system font when an embedded CFF is truncated
It fixes #7625.

If the Top DICT's Private DICT extends past the end of the font data,
the Local Subrs INDEX is unreachable and every CharString that calls
a subr ends up as a blank glyph. Throw from parsePrivateDict so the
existing catch in translateFont triggers fallbackToSystemFont, then
run getFontSubstitution post-construction so we pick a close local
match instead of the generic fallbackName.
2026-05-24 18:10:09 +02:00
Tim van der Meij
941e17296e
Merge pull request #21313 from Snuffleupagus/Annotation-OC
Add support for Optional Content in the AnnotationLayer (issue 20433)
2026-05-24 15:02:06 +02:00
Jonas Jenwald
31c6561b91 Shorten the fontFile lookup a tiny bit
Rather than effectively duplicating code, we can use a loop instead.
2026-05-24 10:19:34 +02:00
Jonas Jenwald
fb9758303b Add support for Optional Content in the AnnotationLayer (issue 20433) 2026-05-23 12:33:56 +02:00
calixteman
91f2facce3
Use the CFF program directly for CID fonts wrapped in OpenType FontFile3
When a CIDFontType0 descendant has its program in a FontFile3 stream with
/Subtype /OpenType, the OTF wrapper sometimes lacks a usable cmap and the
CID→GID mapping only exists inside the embedded CFF itself. In that case
the OpenType-table path produces wrong glyphs, so route the font through
CFFFont and let it consume the inner CFF directly.

The file has been found in https://issues.chromium.org/issues/471404119.
2026-05-16 16:30:55 +02:00
Jonas Jenwald
e5330f06fa Move the stringToPDFString helper function into the src/core/string_utils.js file
Given that this function is only ever used during *parsing* of the PDF document, which happens in the worker-thread, this has always added (a little bit of) dead code in the built `pdf.mjs` file.
2026-05-15 12:10:30 +02:00
Calixte Denizet
29fcf0aa76
Avoid cycles when getting operator list in patterns 2026-05-07 22:30:51 +02:00
calixteman
052e29cc56
Take into account CharProcs keys when computing the type3 hash
It fixes #19634.
2026-04-13 14:49:39 +02:00
Jonas Jenwald
654985c621 Add constants for defining the initial BBox and Float32 BBox
Nowadays there's a lot of places in the code-base where we need to initialize or reset bounding boxes. Rather than spelling this out repeatedly, this patch adds new `Array`/`Float32Array` constants that can be copied or used as-is where appropriate.
2026-04-08 20:30:20 +02:00
Calixte Denizet
00ea8db6bf
Avoid as much as possible to have intermediate canvases
We try to detect in the worker if some patterns or groups need to be drawn or not in isolation.
When they don't, we just draw them on the main canvas instead of drawing on a new canvas.
A pattern or a group is considered as being in isolation if it has some compositing rules or some transparency.

It improves the rendering performance of the pdf in bug 1731514.
2026-04-07 21:03:43 +02:00
calixteman
a9f142c796
Unconditionally create a gpu device
One drawback of the current implementation is that the GPU device can be
unavailable at the time of the first pattern fill, which causes the
GPU-accelerated canvas to be move on the main thread because of putImageData.

Most of the shading patterns stuff will be moved to the GPU and in order
to avoid creating some useless data we've to know if the GPU is available or not.

So in this patch we create the GPU device during the worker initialization
and pass a flag to the evaluator to know if the GPU is available or not.
2026-04-06 13:23:29 +02:00
Jonas Jenwald
f6bac014ea [api-minor] Remove PostScriptCompiler and PostScriptEvaluator, since it's now dead code (PR 21023 follow-up)
These classes, and various related code, became unused after PR 21023 with only unit-tests actually running that code now.

Also removes the `isEvalSupported` API option, since the `PostScriptCompiler` was the only remaining code where `eval` was used.
2026-04-03 22:14:14 +02:00
Calixte Denizet
2e3d79e616
Break text chunks only if the base font is different
It fixes #20956.
2026-03-26 21:39:32 +01:00
Jonas Jenwald
3a372fde94 [api-minor] Replace the CMapReaderFactory, StandardFontDataFactory, and WasmFactory API options with a single factory/option
Currently we have no less than three different, but very similar, factories for reading built-in CMap files, standard font files, and wasm files on the main-thread.[1]
These factories were added at different points in time, since I cannot imagine that we'd add essentially three copies of the same code otherwise.

Nowadays these factories are often not even used[2], since worker-thread fetching is used whenever possible to improve performance. In particular, they will *only* be used when either:
 - The PDF.js library runs in Node.js environments.
 - The user manually sets `useWorkerFetch = false` when calling `getDocument`.
 - The user provides custom `CMapReaderFactory`, `StandardFontDataFactory`, and/or `WasmFactory` instances when calling `getDocument`.

By replacing these factories with *a single* new `BinaryDataFactory` factory/option the number of `getDocument` options are thus reduced, which cannot hurt.
This also reduces the total bundle-size of the Firefox PDF Viewer a little bit, and it slightly reduces the number of import maps that need to be maintained.

*Please note:* For users that provide custom `CMapReaderFactory`, `StandardFontDataFactory`, and `WasmFactory` instances when calling `getDocument` this will be a breaking change, however it's unlikely that (many) such users exist.
(The *internal* format data-format of `CMapReaderFactory` was changed in PR 18951, and there hasn't been a single question/complaint about it in well over a year.)

---

[1] Any new functionality could easily lead to more such factories being added in the future, which wouldn't be great.

[2] Note that the Firefox PDF Viewer no longer use these factories, since it "forcibly" sets `useWorkerFetch = true` during building.
2026-03-22 15:49:06 +01:00
calixteman
ec24053ddf
Don't add an EOL after a superscript 2026-03-22 14:20:18 +01:00
Tim van der Meij
869f25a489
Merge pull request #20940 from calixteman/issue20872
Fix the group bbox when the numbers are too big
2026-03-22 12:27:43 +01:00
calixteman
5992d0f097
Fix the group bbox when the numbers are too big
It fixes #20872.
2026-03-21 19:37:42 +01:00
Jonas Jenwald
262aeef3fa [api-minor] Simplify BaseCMapReaderFactory by having the worker-thread create the filename
The `BaseCMapReaderFactory`, `BaseStandardFontDataFactory`, and `BaseWasmFactory` classes are all very similar, and the only difference is really in their respective `fetch` methods.
By have the worker-thread "compute" the complete `filename` it's possible to simplify the `BaseCMapReaderFactory.prototype.fetch` method, which will allow future improvements to all of these classes.

A couple of things to note:
 - This code is unused, and it's not even bundled, in the Firefox PDF Viewer.
 - In browsers it's unused by default, and worker-thread fetching will always be used when possible since that's more efficient.

*Please note:* For users that provide a custom `CMapReaderFactory` instance when calling `getDocument` this could be a breaking change, however it's unlikely that any such users exist.
(The *internal* format of this data was changed previously in PR 18951, and there hasn't been a single question/complaint about it in well over a year.)
2026-03-21 15:54:40 +01:00
calixteman
918a319de6
Merge pull request #20885 from calixteman/gouraud_gpu
Implement Gouraud-based shading using WebGPU.
2026-03-21 15:18:56 +01:00
calixteman
86441e9eb8
Implement Gouraud-based shading using WebGPU.
The WebGPU feature hasn't been released yet but it's interesting to see how
we can use it in order to speed up the rendering of some objects.
This patch allows to render mesh patterns using WebGPU.

I didn't see any significant performance improvement on my machine (mac M2)
but it may be different on other platforms.
2026-03-21 14:34:32 +01:00
Calixte Denizet
eaa5eca73d Fix charSpacing in vertical mode
It fixes #20930.
And use the defaultVMetrics (coming from DW2 property) in the font.
2026-03-20 23:09:03 +01:00
Jonas Jenwald
652822bef0 [Firefox] Ensure that worker-thread fetching is used for built-in CMap, standard font, and wasm data
Given that we "forcibly" set `useWorkerFetch = true` for the MOZCENTRAL build-target there's a small amount of dead code as a result, which we can thus remove during building.
2026-03-20 16:58:57 +01:00
Jonas Jenwald
bdc16f8999
Merge pull request #20868 from Snuffleupagus/exportData-compileFontInfo
Move the `compileFontInfo` call into the `Font.prototype.exportData` method (PR 20197 follow-up)
2026-03-18 11:14:46 +01:00
Calixte Denizet
fd1fea5f6a
Remove some useless operations when getting the text content
The removed code has been added in #20624 and it's useless since these
operations (i.e. save/restore) are already handled in preprocessor.read.
2026-03-17 16:00:29 +01:00
Jonas Jenwald
7d963ddc7c Move the compileFontInfo call into the Font.prototype.exportData method (PR 20197 follow-up)
After the changes in PR 20197 the code in the `TranslatedFont.prototype.send` method is not all that readable[1] given how it handles e.g. the `charProcOperatorList` data used with Type3 fonts.
Since this is the only spot where `Font.prototype.exportData` is used, it seems much simpler to move the `compileFontInfo` call there and *directly* return the intended data rather than messing with it after the fact.

Finally, while it doesn't really matter, the patch flips the order of the `charProcOperatorList` and `extra` properties throughout the code-base since the former is used with Type3 fonts while the latter (effectively) requires that debugging is enabled.

---

[1] I had to re-read it twice, also looking at all the involved methods, in order to convince myself that it's actually correct.
2026-03-16 09:29:17 +01:00
Jonas Jenwald
3842936edf Split the src/shared/obj-bin-transform.js file into separate files for the main/worker threads (PR 20197 follow-up)
On the worker-thread only the static `write` methods are actually used, and on the main-thread only class instances are being created.
Hence this, after PR 20197, leads to a bunch of dead code in both of the *built* `pdf.mjs` and `pdf.worker.js` files.

This patch reduces the size of the `gulp mozcentral` output by `21 419` bytes, i.e. `21` kilo-bytes, which I believe is way too large of a saving to not do this.
(I can't even remember the last time we managed to reduce build-size this much with a single patch.)
2026-03-13 11:21:24 +01:00
jizou
0e1b5cd7bb Fix missing Chinese font name variants (SimFang and XiaoBiaoSong) in GBK encoding detection 2026-03-03 17:04:59 +08:00
calixteman
15e7a551ab
Reset transfer functions when entering in a new group
It fixes #20722.
2026-02-23 22:37:20 +01:00
Jonas Jenwald
7fd939763e Remove unnecessary class constructors in the src folder
There's a number of classes where the constructors can be removed completely by instead using class fields, which help to slightly shorten the code.

It seems that `unicorn/prefer-class-fields` ESLint plugin, see PR 20657, unfortunately isn't able to detect all of these cases.
2026-02-19 00:08:57 +01:00
Jonas Jenwald
6a3d5fea6c Replace a few cases of "manual" font name normalization with the normalizeFontName helper function 2026-02-08 16:56:50 +01:00
Jonas Jenwald
e9c509aca9 Normalize the font name in getBaseFontMetrics (issue 20246)
We tried to lookup the font metrics using the font name as-is, which didn't work since the PDF file in question has non-embedded fonts with names that include commas.
Hence the font names need to be normalized here as well, similar to elsewhere in the font code.
2026-02-08 16:56:15 +01:00
calixteman
22b97d1741
Flush the text content chunk only on real font changes (bug 2013793) 2026-02-03 23:11:31 +01:00
calixteman
9f660be8a2
Use DecompressionStream in async code
Usually, content stream or fonts are compressed using FlateDecode.
So use the DecompressionStream API to decompress those streams
in the async code path.
2026-01-25 14:22:19 +01:00
calixteman
424c7989aa
Get glyph contours when stroking using a pattern
Fix issue #20513 (second part).
2025-12-28 22:55:59 +01:00
calixteman
91033c2199
Fix the encoding for some missing chinese fonts
It fixes #20489.
2025-12-23 14:05:27 +01:00
Ujjwal Sharma
3a85770af1 Encode FontPath data into an ArrayBuffer
Serialize FontPath commands into a binary format
and store it in an ArrayBuffer so that it can
eventually be stored in a SharedArrayBuffer.
2025-12-06 03:00:48 +05:30
calixteman
1a8689b9be
Merge pull request #20340 from Aditi-1400/serialize-pattern-ab
Serialize pattern data into ArrayBuffer
2025-10-22 11:05:22 +02:00
Calixte Denizet
199b3d04df Fix stream use when getting the text (follow-up of #20373) 2025-10-18 22:58:27 +02:00
Aditi
fa631806bf Serialize pattern data into ArrayBuffer
Follow up on https://github.com/mozilla/pdf.js/pull/20197,
This serializes pattern data into an ArrayBuffer which is
then transferred from the worker to the main thread.

It sets up the stage for us to eventually switch to a
SharedArrayBuffer in the future.
2025-10-11 01:58:07 +05:30
Calixte Denizet
4d15bfec0d Only apply word spacing when there is a 0x20 in the text chunk
Fixes #20319.
2025-10-03 22:18:02 +02:00
Ujjwal Sharma
4bed7370f4 [WIP] Serialize font data into an ArrayBuffer
This PR serializes font data into an ArrayBuffer
that is then transfered from the worker to the
main thread. It's more efficient than the current
solution which clones the "export data" object
which includes the font data as a Uint8Array.

It prepares us to switch to a SharedArrayBuffer
in the future, which would allow us to share
the font data with multiple agents, which would be
crucial for the upcoming "renderer" worker.
2025-09-19 12:02:40 +05:30
Calixte Denizet
b6d772d71d Consider a ttf font with both Symbolic and Nonsymbolic flags set with a Differences array in the encoding dict as non-symbolic
It fixes #20232.
2025-09-14 18:52:16 +02:00
Calixte Denizet
1d4ae786f4 Check the setDash arguments
It fixes #20155.
2025-08-09 22:34:44 +02:00
Noritaka Kobayashi
fa568e826d
Fix typos across the codebase 2025-07-07 09:59:36 +09:00
Calixte Denizet
3bdc5d54fe Get the text under highlight/squiggly/underline/strikethrough annotations (bug 1885505)
and add an invisible element containing the text in the annotation layer to make
it readable by a screen reader.
2025-06-22 21:47:29 +02:00
Jonas Jenwald
36b40d959b
Merge pull request #19955 from Snuffleupagus/issue-19954
Support Type3 fonts with an incomplete /FontDescriptor dictionary (issue 19954)
2025-05-19 17:26:46 +02:00