3722 Commits

Author SHA1 Message Date
Jonas Jenwald
131d6b7d38 Handle corrupt PDFs that lack /Kids array and just inline the /Page dictionary (issue 21436)
This basically extends PR 9549 to the fallback `getAllPageDicts` method, which didn't exist at the time, in order to support more cases of corrupt PDF documents.
2026-06-12 12:04:58 +02:00
Jonas Jenwald
587abf0ef4 Re-use the getPassword helper function more in the src/core/worker.js file
Currently the same code, for requesting the password from the main-thread, is duplicated three times.
Let's avoid that by moving the new `getPassword` helper function, added in the previous commit, and re-using it everywhere instead.
2026-06-11 10:07:39 +02:00
Jonas Jenwald
f3f5acc418 Fix the unit-tests for on-demand password handling of encrypted attachments (issue 21425)
These unit-tests used a PDF that prompted for password on document load, which meant that the on-demand password handling wasn't actually being tested as intended.

Updating the unit-tests also caused the "re-prompts for encrypted attachments after incorrect passwords" test to fail, since the `INCORRECT_PASSWORD` password reason was being accidentally "swallowed" in the worker-thread.
2026-06-10 23:08:34 +02:00
calixteman
a13f2aa793
Merge pull request #21413 from calixteman/improve_comb
Improve rendering of comb text fields
2026-06-09 23:10:49 +02:00
Calixte Denizet
fe5eb0f779
Improve rendering of comb text fields
Center each glyph within its comb cell instead of left-aligning it,
both in the HTML annotation layer and in the printed/saved appearance,
to match Acrobat. Cell width is now the single source of truth via the
--comb-width CSS variable, and field text-alignment (center/right) is
applied as a whole-cell --comb-offset that stays in sync on input,
blur, resetform and updatefromsandbox. The field no longer grows on
focus; trailing letter-spacing is clipped and cell dividers are drawn
on focus.
2026-06-09 22:15:40 +02:00
Tim van der Meij
8a80f1b8b7
Merge pull request #21418 from Snuffleupagus/getAttachments-Map
[api-minor] Convert `getAttachments` to return data in a `Map`
2026-06-09 19:42:23 +02:00
Jonas Jenwald
ea139e7df1 [api-minor] Convert getAttachments to return data in a Map
Compared to regular `Object`s there's a number of advantages to using `Map`s:
 - They support "proper" iteration.
 - They have a simple way to check for the existence of data.
 - They have a simple/efficient way to check the number of elements.

If this functionality was added today, I cannot imagine that we'd choose an `Object` for this sort of data.
Furthermore, in PR 21351 the data returned by `getAttachments` changed slightly and third-party users will need to update their code anyway (hence why `[api-minor]` should be fine here).
2026-06-09 10:17:23 +02:00
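The three advantages listed for ea139e7df1 can be illustrated with a minimal sketch that compares a plain `Object` with a `Map`. The attachment names and the `{ filename }` value shape are made up for illustration; they are not the actual `getAttachments` return format.

```javascript
// Hypothetical attachment data; names and value shape are illustrative only.
const asObject = {
  "a.txt": { filename: "a.txt" },
  "b.txt": { filename: "b.txt" },
};
const asMap = new Map(Object.entries(asObject));

// Iteration: a Map is directly iterable, an Object needs a helper.
const namesFromMap = [];
for (const [name] of asMap) {
  namesFromMap.push(name);
}
const namesFromObject = Object.keys(asObject);

// Existence check: `has` only sees stored entries, unlike the `in` operator.
const mapHasToString = asMap.has("toString"); // false
const objectHasToString = "toString" in asObject; // true (inherited)

// Element count: `size` is a property, no intermediate array is needed.
const mapCount = asMap.size;
const objectCount = Object.keys(asObject).length;
```

The `in` check is the case most likely to surprise: inherited properties such as `toString` count as present on an `Object`, whereas a `Map` only reports entries that were explicitly stored.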
Calixte Denizet
473b53fe3c Handle TR2 with /Default entry
It fixes #21406.
2026-06-08 19:09:45 +02:00
Tim van der Meij
9c437e6ab4
Merge pull request #21388 from calixteman/strip_jbig2_header
Strip the JBIG2 file header from JBIG2Decode streams
2026-06-05 20:06:02 +02:00
Calixte Denizet
88c52a1523 Strip the JBIG2 file header from JBIG2Decode streams
Such streams render correctly in Acrobat and PDFBox.
2026-06-05 16:31:44 +02:00
Jonas Jenwald
959ce38f5b Reject the stream-capability when aborting the ChunkedStreamManager
Given that any incoming data is already being ignored after loading has been aborted, it seems reasonable to reject the stream-capability to avoid it remaining in a pending state indefinitely.

*Note:* This is something that I noticed while looking at the coverage data, since the `ChunkedStreamManager.prototype.onError` method is not used and from a brief look at the history of the code it never appears to have been used either.
2026-06-05 12:25:53 +02:00
Tim van der Meij
23ea0810d9
Merge pull request #21379 from calixteman/dedup_stream_merging
Deduplicate shared font/image streams when merging PDFs
2026-06-04 20:58:22 +02:00
Jonas Jenwald
d36d3ab893 Shorten the getBytes method in the Stream/ChunkedStream classes
This is very old code and there's currently a bit of unneeded duplication in these methods, especially in the `ChunkedStream` class.
2026-06-04 13:10:24 +02:00
Titus Wormer
4db9e45b8c
Add support for /AuthEvent, on-demand decryption
Normally entire PDFs are encrypted (or not).
But it is also possible to only encrypt attachments.
It is then also possible to *only* prompt for a password when the user opens
them.

In the existing flow, prompting for passwords happens because things are decrypted.
A specific error is thrown, caught, and the user is prompted.
To keep this flow working, this PR changes to decrypting attachments on demand,
instead of eagerly.
This seems logical in any case: attachments don't need to be read on startup.

I’ve extensively tested this, not only with regular attachments, but also with outline items
and attachments in annotations.

This PR builds on GH-21234.
It’s an alternative to the naïve GH-20732.

Closes GH-20049.
2026-06-03 16:44:57 +02:00
Calixte Denizet
1a7821ab13 Deduplicate shared font/image streams when merging PDFs
Identical embedded fonts and images across the merged documents are now
written once and shared, instead of being copied per source file.
Also avoid compressing already-compressed streams with Brotli.
2026-06-02 22:08:21 +02:00
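As a rough illustration of the deduplication idea in 1a7821ab13, the sketch below writes each distinct byte sequence once and lets later duplicates point at the first copy. The `dedupeStreams` name and the string key are assumptions made for this example; a real implementation would more likely key on a content hash, and this is not the pdf.js code.

```javascript
// Returns the unique streams plus, for every input stream, the index of the
// unique copy that should be referenced instead.
function dedupeStreams(streams) {
  const seen = new Map(); // content key -> index in `unique`
  const unique = [];
  const indices = [];
  for (const bytes of streams) {
    // Illustrative key only; a hash would avoid building long strings.
    const key = `${bytes.length}:${bytes.join(",")}`;
    let index = seen.get(key);
    if (index === undefined) {
      index = unique.length;
      unique.push(bytes);
      seen.set(key, index);
    }
    indices.push(index);
  }
  return { unique, indices };
}

const result = dedupeStreams([
  Uint8Array.of(1, 2, 3),
  Uint8Array.of(9),
  Uint8Array.of(1, 2, 3), // Same content as the first stream.
]);
```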
Tim van der Meij
744c1e6d7a
Merge pull request #21372 from calixteman/issue7998
Render gray transparency groups in grayscale
2026-06-02 20:10:18 +02:00
calixteman
69e8d6900f Render gray transparency groups in grayscale
It fixes #7998.
2026-05-31 21:16:29 +02:00
Jonas Jenwald
a6321e7201 Improve the BaseStream.prototype.clone implementations
- The `dict` field is optional, hence avoid an Error if trying to clone a non-existent dictionary.

 - Use the `length` getter in the `Stream` class, to avoid duplication.

 - Fix the `DecodeStream` implementation, since it has a couple of bugs:
    - The `clone` method currently uses `start`/`end` fields, despite these only existing on `Stream` instances.
    - Given the previous point, we ended up creating the cloned `Stream` instance using the *entire* underlying `buffer`. This is problematic since the length of a `DecodeStream` cannot be accurately estimated before decoding, and the `buffer`-length is simply a power of two.
       Unless the size of the decoded-data just happens to also be a power of two, this causes the cloned `Stream` instance to be "padded" with zeros at the end.
2026-05-31 20:24:39 +02:00
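The zero-padding problem described in a6321e7201 can be reproduced with a small stand-in for a stream whose buffer grows by doubling. `GrowableBuffer` is a hypothetical class written for this sketch, not the pdf.js `DecodeStream`; only the `buffer`/`bufferLength` split mirrors the commit message.

```javascript
// Hypothetical stand-in for a decode stream with a growable buffer.
class GrowableBuffer {
  constructor(minLength = 8) {
    this.buffer = new Uint8Array(minLength);
    this.bufferLength = 0; // Number of bytes actually decoded so far.
  }

  // Double the capacity until `requested` bytes fit.
  ensureBuffer(requested) {
    let size = this.buffer.length;
    if (requested <= size) {
      return;
    }
    while (size < requested) {
      size *= 2;
    }
    const grown = new Uint8Array(size);
    grown.set(this.buffer);
    this.buffer = grown;
  }

  write(bytes) {
    this.ensureBuffer(this.bufferLength + bytes.length);
    this.buffer.set(bytes, this.bufferLength);
    this.bufferLength += bytes.length;
  }
}

const decoded = new GrowableBuffer();
decoded.write(Uint8Array.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)); // 10 bytes

// Copying the whole backing buffer keeps the unused capacity as zeros.
const paddedCopy = decoded.buffer.slice();
// Copying only the decoded range yields the real data.
const exactCopy = decoded.buffer.slice(0, decoded.bufferLength);
```

Here the ten decoded bytes live in a sixteen-byte buffer, so a clone built from the whole buffer carries six trailing zeros that were never part of the data.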
Jonas Jenwald
06439a95c3 Update the StringStream constructor to accept an optional dictionary argument
There's currently some amount of `StringStream` usage where the `dict`-parameter is manually assigned, and by updating the signature of the constructor this can be avoided.
2026-05-31 11:36:32 +02:00
calixteman
5d28cf5e88 Skip the format 4 cmap sub-table when it doesn't fit its 16-bits fields
It's a follow-up to bug 1998618 (see https://bugzilla.mozilla.org/show_bug.cgi?id=1998618#c2).
2026-05-30 21:53:29 +02:00
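For context on 5d28cf5e88: a format 4 cmap sub-table stores both its own byte length and `segCountX2` as unsigned 16-bit values, so a large enough segment list cannot be represented. The sketch below computes the size from the table layout in the OpenType specification; which fields pdf.js actually checks is an assumption here, and both function names are hypothetical.

```javascript
// Byte length of a format 4 sub-table: a 14-byte header, the 2-byte
// reservedPad, four uint16 arrays of `segCount` entries, and glyphIdArray.
function format4Length(segCount, glyphIdCount) {
  return 16 + 8 * segCount + 2 * glyphIdCount;
}

// True when both 16-bit fields can hold their values.
function fitsFormat4(segCount, glyphIdCount) {
  const segCountX2 = 2 * segCount;
  return (
    segCountX2 <= 0xffff && format4Length(segCount, glyphIdCount) <= 0xffff
  );
}
```

With no glyphIdArray, 8189 segments still fit (65528 bytes), while 8190 segments need 65536 bytes and overflow the `length` field.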
calixteman
c7a32c3db6
Merge pull request #21343 from calixteman/issue9437
Clamp out-of-range BlueScale to Adobe's valid window
2026-05-29 08:58:05 +02:00
Calixte Denizet
600986b51d Allow inserting an image as a new page when editing a PDF
Image files dropped on or selected via the thumbnail viewer's
"add file" picker are now accepted alongside PDFs and inserted
as synthetic pages sized to the document's modal page dimensions.

The image-encoding helper previously embedded in StampAnnotation has
moved to src/core/editor/pdf_images.js so it can be shared between
stamp annotations and page synthesis.
2026-05-28 22:11:13 +02:00
calixteman
389853d473
Merge pull request #21336 from calixteman/issue15292
Parse CID-keyed Type 1 fonts instead of falling back to a system font
2026-05-28 21:45:30 +02:00
Tim van der Meij
be8a8c4309
Merge pull request #21348 from calixteman/issue21346
Use a black backdrop for Luminosity SMasks when /BC is missing
2026-05-28 20:46:47 +02:00
Titus Wormer
45cdb5d3e8
Add support for encrypted attachments
This PR is related to GH-20732, which is about `AuthEvent` (to delay
prompting for a password), but instead adds the actual support for
encrypted attachments.
“Encrypted attachments” means that only the attachments are encrypted,
while the rest of the document is plain text.
Note that some PDF viewers, like Preview/QuickLook/Safari or Chrome,
do not support attachments at all.
Note that the file checked into the tests is the same as
`output-no-auth-event.pdf` referenced in
<https://github.com/mozilla/pdf.js/issues/20139#issuecomment-3952462166>.

Closes GH-20139.
2026-05-28 10:30:37 +02:00
Calixte Denizet
a33e06cafb Use a black backdrop for Luminosity SMasks when /BC is missing
It fixes #21346.
2026-05-26 22:36:50 +02:00
calixteman
385b1ca412 Clamp out-of-range BlueScale to Adobe's valid window
Fonts that ship a BlueScale outside the range AFDKO considers valid
for their zone heights (0.5/maxZoneHeight <= BlueScale <= 1/maxZoneHeight)
cause Firefox's CFF rasterizer to misalign overshooting glyphs against
flat-topped ones at body sizes.
Clamp into that window, but only apply the lower clamp when BlueScale is
also smaller than the default, so foundry fonts that pair the default
0.039625 with small zones are untouched.

Fixes #9437.
2026-05-26 21:24:51 +02:00
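The clamping rule from 385b1ca412 can be written out as a short function. This is a sketch of the rule as the commit message states it, not the pdf.js implementation; `clampBlueScale` is a hypothetical name and `maxZoneHeight` is assumed to be the tallest alignment zone in font units.

```javascript
const DEFAULT_BLUE_SCALE = 0.039625;

// Clamp `blueScale` into [0.5 / maxZoneHeight, 1 / maxZoneHeight].
function clampBlueScale(blueScale, maxZoneHeight) {
  if (!(maxZoneHeight > 0)) {
    return blueScale; // No usable zone data; leave the value alone.
  }
  const lower = 0.5 / maxZoneHeight;
  const upper = 1 / maxZoneHeight;
  if (blueScale > upper) {
    return upper;
  }
  // The lower clamp is skipped for values at or above the default.
  if (blueScale < lower && blueScale < DEFAULT_BLUE_SCALE) {
    return lower;
  }
  return blueScale;
}
```

For example, with a zone height of 10 the window is [0.05, 0.1]; the default 0.039625 sits below it but is left untouched, which is the foundry-font case the message mentions.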
calixteman
e1de5c30b5 Parse CID-keyed Type 1 fonts instead of falling back to a system font
It fixes #15292.

PDFs can embed a CID-keyed Type 1 program (Adobe TechNote 5014,
CIDFontType 0) under /Subtype /CIDFontType0 + /FontFile. Its binary
CIDMap/SubrMap layout has no eexec block, so Type1Font's eexec-only
parser used to fall through and trigger the work-around added in
PR #15397.
Split the constructor and parse the binary CIDMap, SubrMap
and charstrings (encrypted with the standard Type 1 charstring cipher)
through the existing Type1CharString.convert + CFF wrap pipeline.

Only single-FDArray fonts are supported; the StartData length is
clamped to the stream's remaining bytes before allocating.
2026-05-26 17:49:56 +02:00
Tim van der Meij
d1c85f87f7
Merge pull request #21330 from calixteman/fix_regex
Enable 'eslint-plugin-regexp' and fix existing findings
2026-05-25 18:22:21 +02:00
calixteman
f82382e010
Merge pull request #21331 from calixteman/fix_cjk_file
Load the predefined CMap for composite fonts that omit the FontDescriptor
2026-05-25 16:40:11 +02:00
Jonas Jenwald
48a12ac225 Avoid a temporary variable and return results directly in a couple of functions
2026-05-25 15:33:39 +02:00
Calixte Denizet
8f85e3f20b Load the predefined CMap for composite fonts that omit the FontDescriptor
and add font substitutions for the standard Acrobat CJK families.
2026-05-25 14:44:48 +02:00
Calixte Denizet
7bda0fc97c Enable 'eslint-plugin-regexp' and fix existing findings
Enable the recommended preset and fix or per-line-disable the 78
findings it surfaces. Most are equivalent rewrites; intentional
patterns (control chars, the whatwg email regex, autolinker URL regex)
keep their behavior via targeted disables.
2026-05-25 14:31:55 +02:00
Calixte Denizet
9391296036 Recover CFF FontBBox with negative coordinates encoded as unsigned 16-bit
It fixes #21312.
2026-05-25 08:36:18 +02:00
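The subject line of 9391296036 implies that negative coordinates were stored as their unsigned 16-bit bit patterns. A minimal sketch of undoing that, assuming plain two's-complement reinterpretation, is shown below; `toInt16` and the sample values are made up for illustration, and deciding *when* a FontBBox is mis-encoded is a separate heuristic that this sketch does not cover.

```javascript
// Reinterpret the low 16 bits of `value` as a signed two's-complement number.
function toInt16(value) {
  return (value << 16) >> 16;
}

// Hypothetical FontBBox where the two minima were stored unsigned.
const rawBBox = [65486, 65336, 1000, 900];
const recoveredBBox = rawBBox.map(toInt16);
```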
calixteman
adcde1175e Substitute a system font when an embedded CFF is truncated
It fixes #7625.

If the Top DICT's Private DICT extends past the end of the font data,
the Local Subrs INDEX is unreachable and every CharString that calls
a subr ends up as a blank glyph. Throw from parsePrivateDict so the
existing catch in translateFont triggers fallbackToSystemFont, then
run getFontSubstitution post-construction so we pick a close local
match instead of the generic fallbackName.
2026-05-24 18:10:09 +02:00
calixteman
143a7244a3
Merge pull request #21315 from calixteman/issue18548
Keep the first /Subrs and /CharStrings block
2026-05-24 18:07:20 +02:00
Tim van der Meij
13a61b1f72
Merge pull request #21319 from Snuffleupagus/XRefWrapper-fix
Fix the `XRefWrapper` implementation, in the `src/core/editor/pdf_editor.js` file
2026-05-24 15:06:22 +02:00
Tim van der Meij
941e17296e
Merge pull request #21313 from Snuffleupagus/Annotation-OC
Add support for Optional Content in the AnnotationLayer (issue 20433)
2026-05-24 15:02:06 +02:00
calixteman
1f8eed020f Keep the first /Subrs and /CharStrings block
Some Type1 fonts (the embedded Optima variants in orw1972.pdf) ship
two /Subrs and /CharStrings blocks wrapped in save/restore frames
gated on an Adobe hires/lores runtime switch.
In such cases, we just use the first /Subrs and /CharStrings block,
which is the one that is actually used by the font renderer in Acrobat.

It fixes #18548.
2026-05-24 15:01:22 +02:00
Jonas Jenwald
31c6561b91 Shorten the fontFile lookup a tiny bit
Rather than effectively duplicating code, we can use a loop instead.
2026-05-24 10:19:34 +02:00
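One plausible shape for the loop mentioned in 31c6561b91 is sketched below. The three keys are the standard PDF font-descriptor entries, but the `findFontFile` helper and the use of a plain `Map` as the descriptor are assumptions for this example rather than the actual change.

```javascript
// `descriptor` is a plain Map standing in for a font descriptor dictionary.
function findFontFile(descriptor) {
  for (const key of ["FontFile", "FontFile2", "FontFile3"]) {
    const fontFile = descriptor.get(key);
    if (fontFile) {
      return { key, fontFile };
    }
  }
  return null; // No embedded font program.
}

const found = findFontFile(new Map([["FontFile2", "truetype-bytes"]]));
const missing = findFontFile(new Map());
```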
Jonas Jenwald
05de3c8a88 Fix the XRefWrapper implementation, in the src/core/editor/pdf_editor.js file
When comparing this code with the full `XRef` class it doesn't seem to be entirely correctly implemented, since the `fetch` method is basically doing what the `fetchIfRef` method is intended to do.
2026-05-23 22:40:14 +02:00
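The distinction that 05de3c8a88 relies on can be sketched with two tiny stand-in classes: `fetch` resolves a reference and nothing else, while `fetchIfRef` also passes direct objects through. `MiniRef` and `MiniXRef` are hypothetical and far simpler than the pdf.js `Ref` and `XRef` classes.

```javascript
// Hypothetical minimal reference type.
class MiniRef {
  constructor(num) {
    this.num = num;
  }
}

class MiniXRef {
  constructor(objects) {
    this.objects = objects; // Map from object number to value.
  }

  // Resolve a reference; anything else is a caller error.
  fetch(ref) {
    if (!(ref instanceof MiniRef)) {
      throw new Error("fetch expects a reference");
    }
    return this.objects.get(ref.num);
  }

  // Resolve references, pass direct objects through unchanged.
  fetchIfRef(obj) {
    return obj instanceof MiniRef ? this.fetch(obj) : obj;
  }
}

const xref = new MiniXRef(new Map([[7, "stream data"]]));
```

A `fetch` that silently returns non-reference arguments would behave like `fetchIfRef`, which is the mismatch the commit message describes.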
calixteman
ea18e73de2
Merge pull request #20542 from calixteman/fontfile3
Use the CFF program directly for CID fonts wrapped in OpenType FontFile3
2026-05-23 21:39:13 +02:00
Jonas Jenwald
fb9758303b Add support for Optional Content in the AnnotationLayer (issue 20433)
2026-05-23 12:33:56 +02:00
Calixte Denizet
d6a2b91243 Sanitize glyf composite cycles, OS/2 length and maxp version mismatches
Prune the back-edge components from cyclic composite glyphs in
sanitizeGlyphLocations (leaving non-cyclic siblings intact), reject OS/2
tables whose length is too short for the declared version so a clean
table gets regenerated, and upgrade a version 0.5 maxp table to 1.0 for
TrueType fonts to silence OTS' "wrong maxp version for glyph data".

It fixes #21298.
2026-05-21 21:24:00 +02:00
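The back-edge pruning described in d6a2b91243 can be sketched as a depth-first walk over the composite graph. This is a generic illustration, not the `sanitizeGlyphLocations` code: `pruneCompositeCycles` is a hypothetical name, and glyphs are modelled as a `Map` from glyph id to the ids of the components they reference.

```javascript
// Returns a new Map where every component that points back to a glyph
// currently being expanded (a back-edge) has been dropped.
function pruneCompositeCycles(components) {
  const ACTIVE = 1;
  const DONE = 2;
  const state = new Map();
  const pruned = new Map();

  function visit(glyphId) {
    state.set(glyphId, ACTIVE);
    const kept = [];
    for (const child of components.get(glyphId) ?? []) {
      if (state.get(child) === ACTIVE) {
        continue; // Back-edge: `child` is an ancestor, drop this component.
      }
      if (state.get(child) !== DONE) {
        visit(child);
      }
      kept.push(child); // Non-cyclic siblings stay intact.
    }
    pruned.set(glyphId, kept);
    state.set(glyphId, DONE);
  }

  for (const glyphId of components.keys()) {
    if (!state.has(glyphId)) {
      visit(glyphId);
    }
  }
  return pruned;
}

// Glyph 1 uses 2 and 3; glyph 2 refers back to 1; glyph 4 refers to itself.
const prunedGlyphs = pruneCompositeCycles(
  new Map([
    [1, [2, 3]],
    [2, [1]],
    [3, []],
    [4, [4]],
  ])
);
```

Only the component that closes the cycle is removed, so glyph 1 keeps both of its components while glyphs 2 and 4 lose their back-references.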
calixteman
cd8a78c4e2
Recover CFF private dict defaults zeroed by Ghostscript
It fixes #20633.
2026-05-17 20:51:35 +02:00
calixteman
91f2facce3
Use the CFF program directly for CID fonts wrapped in OpenType FontFile3
When a CIDFontType0 descendant has its program in a FontFile3 stream with
/Subtype /OpenType, the OTF wrapper sometimes lacks a usable cmap and the
CID→GID mapping only exists inside the embedded CFF itself. In that case
the OpenType-table path produces wrong glyphs, so route the font through
CFFFont and let it consume the inner CFF directly.

The file has been found in https://issues.chromium.org/issues/471404119.
2026-05-16 16:30:55 +02:00
Jonas Jenwald
7c5087cc16 Move the SVG_NS definition into src/shared/util.js
This constant is already defined in both the `src/core/` and `src/display/` folders, and in a few spots the same string was also inlined.
2026-05-16 15:17:04 +02:00
Jonas Jenwald
e5330f06fa Move the stringToPDFString helper function into the src/core/string_utils.js file
Given that this function is only ever used during *parsing* of the PDF document, which happens in the worker-thread, this has always added (a little bit of) dead code in the built `pdf.mjs` file.
2026-05-15 12:10:30 +02:00
Jonas Jenwald
7a7e7049c1 Shorten the isAscii helper function a tiny bit
2026-05-15 11:56:33 +02:00
Jonas Jenwald
153cef615e Move a couple of src/core/ string helper functions into their own file
Given that the various utility-files naturally increase in size over time, it shouldn't hurt to shorten `src/core/core_utils.js` a little bit by moving a few of its string helper functions to their own file.
2026-05-15 11:49:54 +02:00