3695 Commits

Author SHA1 Message Date
Titus Wormer
45cdb5d3e8
Add support for encrypted attachments
This PR is related to GH-20732, which is about `AuthEvent` (to delay
prompting for a password), but instead adds the actual support for
encrypted attachments.
“Encrypted attachments” means that only the attachments are encrypted, while the main content is plain text.
Note that some PDF viewers, like Preview/QuickLook/Safari or Chrome,
do not support attachments at all.
Note that the file checked into the tests is the same as
`output-no-auth-event.pdf` referenced in
<https://github.com/mozilla/pdf.js/issues/20139#issuecomment-3952462166>.

Closes GH-20139.
2026-05-28 10:30:37 +02:00
Tim van der Meij
d1c85f87f7
Merge pull request #21330 from calixteman/fix_regex
Enable 'eslint-plugin-regexp' and fix existing findings
2026-05-25 18:22:21 +02:00
calixteman
f82382e010
Merge pull request #21331 from calixteman/fix_cjk_file
Load the predefined CMap for composite fonts that omit the FontDescriptor
2026-05-25 16:40:11 +02:00
Jonas Jenwald
48a12ac225 Avoid a temporary variable and return results directly in a couple of functions 2026-05-25 15:33:39 +02:00
Calixte Denizet
8f85e3f20b Load the predefined CMap for composite fonts that omit the FontDescriptor
and add font substitutions for the standard Acrobat CJK families.
2026-05-25 14:44:48 +02:00
Calixte Denizet
7bda0fc97c Enable 'eslint-plugin-regexp' and fix existing findings
Enable the recommended preset and fix or per-line-disable the 78
findings it surfaces. Most are equivalent rewrites; intentional
patterns (control chars, the whatwg email regex, autolinker URL regex)
keep their behavior via targeted disables.
2026-05-25 14:31:55 +02:00
Calixte Denizet
9391296036 Recover CFF FontBBox with negative coordinates encoded as unsigned 16-bit
It fixes #21312.
2026-05-25 08:36:18 +02:00
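A minimal sketch of the idea behind the FontBBox recovery in 9391296036, assuming the affected coordinates were written as unsigned 16-bit values where signed ones were intended. The helper names `toInt16` and `recoverBBox`, and the heuristic for deciding which values to reinterpret, are hypothetical and not taken from the pdf.js source.

```javascript
// Hypothetical helper: reinterpret a value in [0, 65535] as a signed
// 16-bit integer (two's complement).
function toInt16(value) {
  return (value << 16) >> 16;
}

// Hypothetical recovery step: only integers in the upper half of the
// unsigned 16-bit range are treated as mis-encoded negative numbers.
function recoverBBox(bbox) {
  return bbox.map(v =>
    Number.isInteger(v) && v > 0x7fff && v <= 0xffff ? toInt16(v) : v
  );
}
```

For example, under these assumptions `recoverBBox([65436, 65336, 1000, 900])` yields `[-100, -200, 1000, 900]`.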
calixteman
adcde1175e Substitute a system font when an embedded CFF is truncated
It fixes #7625.

If the Top DICT's Private DICT extends past the end of the font data,
the Local Subrs INDEX is unreachable and every CharString that calls
a subr ends up as a blank glyph. Throw from parsePrivateDict so the
existing catch in translateFont triggers fallbackToSystemFont, then
run getFontSubstitution post-construction so we pick a close local
match instead of the generic fallbackName.
2026-05-24 18:10:09 +02:00
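The control flow described in adcde1175e (throw during parsing so that an existing catch triggers the fallback) could look roughly like the sketch below. All names here (`TruncatedFontError`, `readPrivateDict`, `loadFont`) and the returned object shapes are hypothetical stand-ins, not the actual `parsePrivateDict`/`translateFont` code.

```javascript
// Hypothetical error type used to signal unusable font data.
class TruncatedFontError extends Error {}

// Hypothetical parser step: refuse a Private DICT whose declared range
// extends past the end of the font data.
function readPrivateDict(data, offset, size) {
  if (offset + size > data.length) {
    throw new TruncatedFontError(
      "Private DICT extends past the end of the font data"
    );
  }
  return data.subarray(offset, offset + size);
}

// Hypothetical caller: a parsing failure results in a system-font fallback
// instead of rendering blank glyphs.
function loadFont(data, offset, size) {
  try {
    return { kind: "embedded", privateDict: readPrivateDict(data, offset, size) };
  } catch (ex) {
    if (!(ex instanceof TruncatedFontError)) {
      throw ex;
    }
    return { kind: "system-fallback" };
  }
}
```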
calixteman
143a7244a3
Merge pull request #21315 from calixteman/issue18548
Keep the first /Subrs and /CharStrings block
2026-05-24 18:07:20 +02:00
Tim van der Meij
13a61b1f72
Merge pull request #21319 from Snuffleupagus/XRefWrapper-fix
Fix the `XRefWrapper` implementation, in the `src/core/editor/pdf_editor.js` file
2026-05-24 15:06:22 +02:00
Tim van der Meij
941e17296e
Merge pull request #21313 from Snuffleupagus/Annotation-OC
Add support for Optional Content in the AnnotationLayer (issue 20433)
2026-05-24 15:02:06 +02:00
calixteman
1f8eed020f Keep the first /Subrs and /CharStrings block
Some Type1 fonts (the embedded Optima variants in orw1972.pdf) ship
two /Subrs and /CharStrings blocks wrapped in save/restore frames
gated on an Adobe hires/lores runtime switch.
In such cases, we just use the first /Subrs and /CharStrings block,
which is the one that is actually used by the font renderer in Acrobat.

It fixes #18548.
2026-05-24 15:01:22 +02:00
Jonas Jenwald
31c6561b91 Shorten the fontFile lookup a tiny bit
Rather than effectively duplicating code, we can use a loop instead.
2026-05-24 10:19:34 +02:00
Jonas Jenwald
05de3c8a88 Fix the XRefWrapper implementation, in the src/core/editor/pdf_editor.js file
When comparing this code with the full `XRef` class it doesn't seem to be entirely correctly implemented, since the `fetch` method is basically doing what the `fetchIfRef` method is intended to do.
2026-05-23 22:40:14 +02:00
calixteman
ea18e73de2
Merge pull request #20542 from calixteman/fontfile3
Use the CFF program directly for CID fonts wrapped in OpenType FontFile3
2026-05-23 21:39:13 +02:00
Jonas Jenwald
fb9758303b Add support for Optional Content in the AnnotationLayer (issue 20433) 2026-05-23 12:33:56 +02:00
Calixte Denizet
d6a2b91243 Sanitize glyf composite cycles, OS/2 length and maxp version mismatches
Prune the back-edge components from cyclic composite glyphs in
sanitizeGlyphLocations (leaving non-cyclic siblings intact), reject OS/2
tables whose length is too short for the declared version so a clean
table gets regenerated, and upgrade a version 0.5 maxp table to 1.0 for
TrueType fonts to silence OTS' "wrong maxp version for glyph data".

It fixes #21298.
2026-05-21 21:24:00 +02:00
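The back-edge pruning described in d6a2b91243 can be illustrated with a depth-first traversal. This is only a sketch under stated assumptions: the input is a hypothetical `Map` from a composite glyph id to the ids of its components, and `pruneCompositeCycles` is not the real `sanitizeGlyphLocations` logic.

```javascript
// Hypothetical input: Map<glyphId, componentGlyphIds[]>; simple glyphs
// either map to an empty array or are absent from the map.
function pruneCompositeCycles(components) {
  const ON_STACK = 1,
    DONE = 2;
  const state = new Map();

  const visit = id => {
    state.set(id, ON_STACK);
    const kept = [];
    for (const child of components.get(id) ?? []) {
      const childState = state.get(child);
      if (childState === ON_STACK) {
        continue; // Back-edge: this component closes a cycle, so drop it.
      }
      if (childState === undefined) {
        visit(child);
      }
      kept.push(child); // Non-cyclic siblings are left intact.
    }
    if (components.has(id)) {
      components.set(id, kept);
    }
    state.set(id, DONE);
  };

  for (const id of components.keys()) {
    if (!state.has(id)) {
      visit(id);
    }
  }
  return components;
}
```

With glyph 1 referencing glyphs 2 and 3, and glyph 2 referencing glyphs 1 and 3, only the reference from 2 back to 1 is removed.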
calixteman
cd8a78c4e2
Recover CFF private dict defaults zeroed by Ghostscript
It fixes issue #20633.
2026-05-17 20:51:35 +02:00
calixteman
91f2facce3
Use the CFF program directly for CID fonts wrapped in OpenType FontFile3
When a CIDFontType0 descendant has its program in a FontFile3 stream with
/Subtype /OpenType, the OTF wrapper sometimes lacks a usable cmap and the
CID→GID mapping only exists inside the embedded CFF itself. In that case
the OpenType-table path produces wrong glyphs, so route the font through
CFFFont and let it consume the inner CFF directly.

The file has been found in https://issues.chromium.org/issues/471404119.
2026-05-16 16:30:55 +02:00
Jonas Jenwald
7c5087cc16 Move the SVG_NS definition into src/shared/util.js
This constant is already defined in both the `src/core/` and `src/display/` folders, and in a few spots the same string was also inlined.
2026-05-16 15:17:04 +02:00
Jonas Jenwald
e5330f06fa Move the stringToPDFString helper function into the src/core/string_utils.js file
Given that this function is only ever used during *parsing* of the PDF document, which happens in the worker-thread, this has always added (a little bit of) dead code in the built `pdf.mjs` file.
2026-05-15 12:10:30 +02:00
Jonas Jenwald
7a7e7049c1 Shorten the isAscii helper function a tiny bit 2026-05-15 11:56:33 +02:00
Jonas Jenwald
153cef615e Move a couple of src/core/ string helper functions into their own file
Given that the various utility-files naturally increase in size over time, it shouldn't hurt to shorten `src/core/core_utils.js` a little bit by moving a few of its string helper functions to their own file.
2026-05-15 11:49:54 +02:00
Tim van der Meij
26dc195a65
Collect coverage information for the integration tests
Note that for the integration tests the coverage information ends up
being processed in the Node.js context where `window` is not available,
so we use `globalThis` instead for the function that merges individual
tests' coverage information into the global object because that is
available in all contexts we support. For clarity we also rename said
function since we're not exclusively dealing with `window` nor worker
data anymore.
2026-05-14 12:34:12 +02:00
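As an illustration of why `globalThis` is used in 26dc195a65: it resolves to the global object in browsers, workers and Node.js alike, so a merge function written against it works in all three. The property name `__exampleCoverage`, the function name `mergeCoverage` and the per-file hit-count data shape below are hypothetical and chosen only for this sketch.

```javascript
// Hypothetical merge function: accumulate per-file hit counts on the
// global object, creating the store on first use.
function mergeCoverage(counts) {
  const store = (globalThis.__exampleCoverage ??= {});
  for (const [file, hits] of Object.entries(counts)) {
    store[file] = (store[file] ?? 0) + hits;
  }
  return store;
}
```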
Jonas Jenwald
5bc5791a86
Merge pull request #21257 from Snuffleupagus/deepCompare-Refs
Update the `deepCompare` helper function to handle `Ref`s and `Name`s correctly
2026-05-12 11:53:02 +02:00
Jonas Jenwald
aecb571ea6 Move the getModificationDate helper function into src/core/core_utils.js
Given that this function is only ever used in `src/core/` code, let's avoid a little bit of dead code in the *built* `pdf.mjs` file.

Also, place the `AnnotationPrefix` and `AnnotationEditorPrefix` constants together in `src/shared/util.js` since that should aid readability.
2026-05-11 14:13:23 +02:00
Jonas Jenwald
326df1f711 Update the deepCompare helper function to handle Refs and Names correctly
Note that `Ref`s and `Name`s are cached globally[1], since that helps reduce object creation (a lot) during parsing.
That cache will be cleared after a period of inactivity in the viewer[2], which is why those primitives cannot *safely* be compared with just `===`/`!==` and also (partially) why abstractions such as `RefSet`/`RefSetCache` are necessary.

Currently `deepCompare` doesn't handle `Ref`s and `Name`s correctly, which may lead to future *intermittent* bugs in any code using the `deepCompare` helper function.

---

[1] This applies to `Cmd` as well, however that doesn't matter in the context of this patch.

[2] Currently, and for more than a decade, set to 30 seconds.
2026-05-11 13:18:54 +02:00
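The point made in 326df1f711 can be sketched as follows: two instances that describe the same reference or name are not necessarily the same object, so a deep comparison has to look at their contents. The `Ref` and `Name` classes below are simplified, hypothetical stand-ins for the pdf.js primitives, and this `deepCompare` is not the project's implementation.

```javascript
// Simplified stand-ins for the real primitives (assumption).
class Ref {
  constructor(num, gen) {
    this.num = num;
    this.gen = gen;
  }
}
class Name {
  constructor(name) {
    this.name = name;
  }
}

function deepCompare(a, b) {
  if (a === b) {
    return true;
  }
  // Compare primitives by content rather than by identity.
  if (a instanceof Ref || b instanceof Ref) {
    return (
      a instanceof Ref && b instanceof Ref && a.num === b.num && a.gen === b.gen
    );
  }
  if (a instanceof Name || b instanceof Name) {
    return a instanceof Name && b instanceof Name && a.name === b.name;
  }
  if (Array.isArray(a) || Array.isArray(b)) {
    return (
      Array.isArray(a) &&
      Array.isArray(b) &&
      a.length === b.length &&
      a.every((value, i) => deepCompare(value, b[i]))
    );
  }
  if (a !== null && b !== null && typeof a === "object" && typeof b === "object") {
    const keysA = Object.keys(a),
      keysB = Object.keys(b);
    return (
      keysA.length === keysB.length &&
      keysA.every(
        key =>
          Object.prototype.hasOwnProperty.call(b, key) &&
          deepCompare(a[key], b[key])
      )
    );
  }
  return false;
}
```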
Tim van der Meij
702d60aa18
Merge pull request #21230 from calixteman/avoid_cycles
Avoid cycles when getting operator list in patterns
2026-05-10 18:15:01 +02:00
Tim van der Meij
3b58a339c8
Merge pull request #21213 from saripovdenis/perf-name-tree-getall-queue-index
perf: Avoid multi-second getDestinations stalls for PDFs with many named destinations
2026-05-10 18:13:12 +02:00
Calixte Denizet
29fcf0aa76
Avoid cycles when getting operator list in patterns 2026-05-07 22:30:51 +02:00
Calixte Denizet
b39440b6e0
Simplify '#getFilteredPageIndices' and '#resolveInsertAfterIndices' 2026-05-07 21:41:37 +02:00
Tim van der Meij
e81507c167
Merge pull request #21228 from calixteman/bug2027682
Place new annotations on the correct page when extracting pages (bug 2027682)
2026-05-07 21:12:15 +02:00
Calixte Denizet
4c62a49483
Place new annotations on the correct page when extracting pages (bug 2027682) 2026-05-06 18:44:02 +02:00
Jonas Jenwald
3f6a2feef6 Tweak the WasmImage implementation a little bit (PR 21225 follow-up)
This fixes two things that I overlooked in PR 21225, more specifically:

 - Use proper, rather than semi, private class fields in `WasmImage`.

 - Make tracking of `WasmImage` instances optional, to avoid keeping data alive permanently in the `IMAGE_DECODERS` build.
2026-05-06 17:52:35 +02:00
saripovdenis
473f9b4592 Avoid quadratic traversal in NameOrNumberTree.getAll
Using Array.prototype.shift() to drain the traversal queue makes each
visited node move the remaining queued entries. For large name/number
trees this can make getAll() spend quadratic time in queue management.

Iterate over the queue with for...of instead. Children pushed while
iterating are still visited, and the queue no longer needs repeated
front removals.
2026-05-06 09:51:57 +08:00
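The traversal change in 473f9b4592 relies on the fact that a `for...of` loop over an Array also visits items pushed during iteration, so no front removal is needed. A minimal sketch, assuming a hypothetical tree of plain objects (`kids` for intermediate nodes, `entries` for leaves) rather than the real PDF dictionaries, and omitting the protection against circular references that a real implementation needs:

```javascript
// Hypothetical node shapes:
//   intermediate node: { kids: [node, ...] }
//   leaf node:         { entries: [[key, value], ...] }
function getAll(root) {
  const map = new Map();
  const queue = [root];
  // The Array iterator re-checks the length on every step, hence nodes
  // pushed inside the loop are still visited, without calling `shift()`.
  for (const node of queue) {
    if (Array.isArray(node.kids)) {
      for (const kid of node.kids) {
        queue.push(kid);
      }
      continue;
    }
    for (const [key, value] of node.entries ?? []) {
      map.set(key, value);
    }
  }
  return map;
}
```

The queue grows to hold every visited node, which trades a little memory for linear-time queue handling.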
Jonas Jenwald
6ff0f8690f Add an abstract WasmImage class, that JBig2CCITTFaxImage and JpxImage inherit from
Given that these classes are, with the exception of their `decode` methods, virtually identical this helps reduce code duplication and simplifies maintenance.

These changes reduce the size of the `gulp mozcentral` build-target by `1292` bytes, which obviously isn't a lot but still cannot hurt.
2026-05-05 17:25:18 +02:00
Jonas Jenwald
ac6a9230d1 Replace TrueTypeTableBuilder and CompilerOutput with a single class
Given that both of these classes are so similar, let's replace them with a single `DataBuilder` class instead to reduce unnecessary code-duplication.
2026-05-04 15:01:53 +02:00
Jonas Jenwald
53fd89682c Remove the unused raw field from the CFFCharset class
This was necessary before charset compilation was implemented, however that's been supported for many years and this is just dead code now.
 - PR 9340, back in 2018, stopped using the `raw` field.
 - PR 10591, back in 2019, implemented proper charset compilation.
2026-05-03 18:51:24 +02:00
Jonas Jenwald
027671e6dc Replace a loop with TypedArray.prototype.set() in the compileFDSelect method
Given that the `fdSelect.fdSelect` data is a regular Array, this code can be simplified a tiny bit.
2026-05-03 16:32:48 +02:00
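To illustrate the kind of simplification made in 027671e6dc: `TypedArray.prototype.set()` accepts a regular Array as its source, together with a target offset, and so can replace an element-by-element copy loop. The function names below are hypothetical and the snippet is not the actual `compileFDSelect` code.

```javascript
// Copy with an explicit loop.
function writeWithLoop(out, offset, values) {
  for (let i = 0; i < values.length; i++) {
    out[offset + i] = values[i];
  }
  return out;
}

// Equivalent copy using TypedArray.prototype.set().
function writeWithSet(out, offset, values) {
  out.set(values, offset);
  return out;
}
```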
Jonas Jenwald
e5e82b9617 Don't create a DataView for the "CFF " TrueType table in readTableEntry
Given that the "CFF " table may be replaced completely during font-parsing, it doesn't make sense to read and/or modify it piecewise.
2026-05-03 13:17:23 +02:00
Jonas Jenwald
b65eedc636 Set the correct data if compilation fails in the CFFFont constructor
The `CFFFont.prototype.data` should contain a `Uint8Array`, however if compilation failed it was being set to a `Stream` instance which will thus fail elsewhere in the font-code.

*Please note:* This was found by code inspection, since I don't have a PDF document that's fixed by this change.
2026-05-03 13:17:18 +02:00
Jonas Jenwald
521f4dc554 Remove the CompilerOutput.prototype.finalData getter (PR 21053 follow-up)
Return the data as-is from the `CFFCompiler.prototype.compile` method, rather than making a copy of it first.
The reason that it was implemented this way in PR 21053 was to avoid keeping a potentially large `ArrayBuffer` alive, see https://github.com/mozilla/pdf.js/pull/21053#discussion_r3045402988

Having traced all the call-sites in the font-code that directly or indirectly invoke that code, I've now managed to conclude that the compiled CFF-data is never stored on the `Font` instance and using the data as-is thus shouldn't increase permanent memory usage.
2026-05-03 13:13:50 +02:00
Jonas Jenwald
a8715f6f96 Don't provide unused /DecodeParms when initializing JpxStream 2026-05-02 12:20:28 +02:00
Jonas Jenwald
adf07ea51c
Merge pull request #21200 from Snuffleupagus/Intersector-grid-push
Shorten how intersectors are added to the grid in the `Intersector` constructor
2026-04-30 12:56:38 +02:00
Jonas Jenwald
4a5c455c0b Shorten how intersectors are added to the grid in the Intersector constructor
Thanks to modern JavaScript features this code can be simplified a tiny bit.
2026-04-30 12:06:08 +02:00
Jonas Jenwald
f26b98c7c4 Simplify the nextChunk handling in the DecryptStream class
This is old code, that can be simplified a tiny bit with modern JavaScript features.
2026-04-30 11:40:34 +02:00
Jonas Jenwald
1f6bfa0890 Add an abstract readBlock method in the DecodeStream class
This avoids having to "duplicate" dummy `readBlock` methods in a couple of image-stream classes.
Also, move a few `DecodeStream` field definitions to (ever so slightly) shorten the code.
2026-04-29 13:02:15 +02:00
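JavaScript has no built-in abstract methods, so one common way to express the pattern mentioned in 1f6bfa0890 is a base-class method that throws unless a subclass overrides it. The class names and the error message below are hypothetical and only sketch the approach; they are not the actual `DecodeStream` code.

```javascript
// Hypothetical base class with an "abstract" method.
class BaseDecodeStream {
  readBlock() {
    throw new Error("Abstract method `readBlock` called");
  }
}

// A subclass that needs block decoding overrides the method...
class ExampleBlockStream extends BaseDecodeStream {
  readBlock() {
    return "block";
  }
}

// ...while a subclass that never calls it needs no dummy implementation.
class ExampleImageStream extends BaseDecodeStream {}
```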
Jonas Jenwald
3475806311 Convert Catalog.prototype.getPageIndex to an asynchronous method
This simplifies/shortens a piece of old code, which shouldn't hurt.
2026-04-28 11:34:41 +02:00
Jonas Jenwald
339f755a52 Add more validation in the Catalog.prototype.getPageIndex method
 - Ensure that the /Kids-entries are Arrays, before trying to iterate through them.
 - Ensure that the /Count-entries are (positive) integers.
2026-04-28 11:33:50 +02:00
Calixte Denizet
c9a7ff0506 Fix merging PDFs with conflicting AcroForm /DR (bug 2035197) 2026-04-27 18:54:52 +02:00