13 Commits

Author SHA1 Message Date
Titus Wormer
4db9e45b8c
Add support for /AuthEvent, on-demand decryption
Normally entire PDFs are encrypted (or not).
But it is also possible to only encrypt attachments.
It is then also possible to *only* prompt for a password when the user opens
them.

In the existing flow, prompting for passwords happens because things are decrypted.
A specific error is thrown, caught, and the user is prompted.
To keep this flow working, this PR changes to decrypting attachments on demand,
instead of eagerly.
This sounds logical: to not read attachments on startup.

I’ve extensively tested this, not only with regular attachments, but also with outline items
and attachments in annotations.

This PR builds on GH-21234.
It’s an alternative to the naïve GH-20732.

Closes GH-20049.
2026-06-03 16:44:57 +02:00
Jonas Jenwald
a6321e7201 Improve the BaseStream.prototype.clone implementations
- The `dict` field is optional, hence avoid an Error if trying to clone a non-existent dictionary.

 - Use the `length` getter in the `Stream` class, to avoid duplication.

 - Fix the `DecodeStream` implementation, since it has a couple of bugs:
    - The `clone` method currently uses `start`/`end` fields, despite these only existing on `Stream` instances.
    - Given the previous point, we ended up creating the cloned `Stream` instance using the *entire* underlying `buffer`. This is problematic since the length of a `DecodeStream` cannot be accurately estimated before decoding, and the `buffer`-length is simply a multiple of two.
       Unless the size of the decoded-data just happens to also be a multiple of two, this causes the cloned `Stream` instance to be "padded" with zeros at the end.
2026-05-31 20:24:39 +02:00
Calixte Denizet
b5ed988267
Don't use contents stream which have an image format
The original bug has been filled in mupdf bug tracker:
https://bugs.ghostscript.com/show_bug.cgi?id=709033

The attached pdf can be open in Chrome but not in Acrobat.
2026-01-13 18:39:17 +01:00
Calixte Denizet
05f368056d Use stream for whatever substrem in stream classes
and add a method in order to get the original stream.
When writing an existing stream it'll help to have the original one instead of the filtered one.
2025-10-17 22:26:05 +02:00
Calixte Denizet
94b4b54ef6 [api-major] Add openjpeg.wasm to pdf.js (bug 1935076)
In order to fix bug 1935076, we'll have to add a pure js fallback in case wasm is disabled
or simd isn't supported. Unfortunately, this fallback will take some space.

So, the main goal of this patch is to reduce the overall size (by ~93k).
As a side effect, it should make easier to use an other wasm file (which must export
_jp2_decode, _malloc and _free).
2025-01-16 21:09:50 +01:00
Calixte Denizet
b6c4f0b69e Use ImageDecoder in order to decode jpeg images (bug 1901223) 2024-10-23 10:42:01 +02:00
Jonas Jenwald
aebb8534f3 Limit base-class initialization checks to development and TESTING modes
We have a number of base-classes that are only intended to be extended, but never to be used directly. To help enforce this during development these base-class constructors will check for direct usage, however that code is obviously not needed in the actual builds.

*Note:* This patch reduces the size of the `gulp mozcentral` output by `~2.7` kilo-bytes, which isn't a lot but still cannot hurt.
2024-08-12 12:26:35 +02:00
Calixte Denizet
196affd8e0 Fix decoding of JPX images having an alpha channel
When an image has a non-zero SMaskInData it means that the image
has an alpha channel.
With JPX images, the colorspace isn't required (by spec) so when we
don't have it, the JPX decoder will handle the conversion in RGBA
format.
2024-06-03 20:08:11 +02:00
Calixte Denizet
9654ad570a Decompress when it's possible images in using DecompressionStream
Getting images is already asynchronous, so we can use this opportunity
to use DecompressStream (which is async too) to decompress images.
2024-06-02 14:00:05 +02:00
Jonas Jenwald
fbf6dee8ee [api-minor] Remove the forceClamped-functionality in the Streams (issue 14849)
As it turns out, most of the code-paths in the `PDFImage`-class won't actually pass the TypedArray (containing the image-data) to the `ColorSpace`-code. Hence we *generally* don't need to force the image-data to be a `Uint8ClampedArray`, and can just as well directly use a `Uint8Array` instead.

In the following cases we're returning the data without any `ColorSpace`-parsing, and the exact TypedArray used shouldn't matter:
 - b72a448327/src/core/image.js (L714)
 - b72a448327/src/core/image.js (L751)

In the following cases the image-data is only used *internally*, and again the exact TypedArray used shouldn't matter:
 - b72a448327/src/core/image.js (L762) with the actual image-data being defined (as `Uint8ClampedArray`) further below
 - b72a448327/src/core/image.js (L837)

*Please note:* This is tagged `api-minor` because it's API-observable, given that *some* image/mask-data will now be returned as `Uint8Array` rather than using `Uint8ClampedArray` unconditionally. However, that seems like a small price to pay to (slightly) reduce memory usage during image-conversion.
2022-04-29 14:46:30 +02:00
Jonas Jenwald
3624f9eac7 Add a new BaseStream.getString(...) method to replace manual bytesToString(BaseStream.getBytes(...)) calls
Given that the `bytesToString(BaseStream.getBytes(...))` pattern is somewhat common throughout the `src/core/` code, it cannot hurt to add a new `BaseStream`-method which handles that case internally.
2021-05-01 19:20:36 +02:00
Jonas Jenwald
67a1cfc1b1 Improve the handling getBaseStreams, on the various Stream implementations
The way that `getBaseStreams` is currently handled has bothered me from time to time, especially how we're checking if the method exists before calling it.
By adding a dummy `BaseStream.getBaseStreams` method, and having the call-sites simply check the return value, we can improve some of the relevant code.

Note in particular how the `ObjectLoader._walk` method didn't actually check that the data in question is a Stream instance, and instead only checked the `currentNode` (which could be anything) for the existence of a `getBaseStreams` property.
2021-04-28 13:44:47 +02:00
Jonas Jenwald
67415bfabe Add an abstract base-class, which all the various Stream implementations inherit from
By having an abstract base-class, it becomes a lot clearer exactly which methods/getters are expected to exist on all Stream instances.
Furthermore, since a number of the methods are *identical* for all Stream implementations, this reduces unnecessary code duplication in the `Stream`, `DecodeStream`, and `ChunkedStream` classes.

For e.g. `gulp mozcentral`, the *built* `pdf.worker.js` files decreases from `1 619 329` to `1 616 115` bytes with this patch-series.
2021-04-28 13:44:45 +02:00