This PR is related to GH-20732, which is about `AuthEvent` (to delay
promting for a password), but instead adds the actual support for
encrypted attachments.
“Encrypted attachments” means that the main things are plain text.
Note that some PDF viewers, like Preview/QuickLook/Safari or Chrome,
do not support attachments at all.
Note that the file checked into the tests is the same as
`output-no-auth-event.pdf` referenced in
<https://github.com/mozilla/pdf.js/issues/20139#issuecomment-3952462166>.
Closes GH-20139.
Enable the recommended preset and fix or per-line-disable the 78
findings it surfaces. Most are equivalent rewrites, intentional
patterns (control chars, the whatwg email regex, autolinker URL regex)
keep their behavior via targeted disables.
It fixes#7625.
If the Top DICT's Private DICT extends past the end of the font data,
the Local Subrs INDEX is unreachable and every CharString that calls
a subr ends up as a blank glyph. Throw from parsePrivateDict so the
existing catch in translateFont triggers fallbackToSystemFont, then
run getFontSubstitution post-construction so we pick a close local
match instead of the generic fallbackName.
Some Type1 fonts (the embedded Optima variants in orw1972.pdf) ship
two /Subrs and /CharStrings blocks wrapped in save/restore frames
gated on an Adobe hires/lores runtime switch.
In such cases, we just use the first /Subrs and /CharStrings block,
which is the one that is actually used by the font renderer in Acrobat.
It fixes#18548.
When comparing this code with the full `XRef` class it doesn't seem to be entirely correctly implemented, since the `fetch` method is basically doing what the `fetchIfRef` method is intended to do.
Prune the back-edge components from cyclic composite glyphs in
sanitizeGlyphLocations (leaving non-cyclic siblings intact), reject OS/2
tables whose length is too short for the declared version so a clean
table gets regenerated, and upgrade a version 0.5 maxp table to 1.0 for
TrueType fonts to silence OTS' "wrong maxp version for glyph data".
It fixes#21298.
When a CIDFontType0 descendant has its program in a FontFile3 stream with
/Subtype /OpenType, the OTF wrapper sometimes lacks a usable cmap and the
CID→GID mapping only exists inside the embedded CFF itself. In that case
the OpenType-table path produces wrong glyphs, so route the font through
CFFFont and let it consume the inner CFF directly.
The file has been found in https://issues.chromium.org/issues/471404119.
Given that this function is only ever used during *parsing* of the PDF document, which happens in the worker-thread, this has always added (a little bit of) dead code in the built `pdf.mjs` file.
Given that the various utility-files naturally increase in size over time, it shouldn't hurt to shorten `src/core/core_utils.js` a little bit by moving a few of its string helper functions to their own file.
Note that for the integration tests the coverage information ends up
being processed in the Node.js context where `window` is not available,
so we use `globalThis` instead for the function that merges individual
test's coverage information into the global object because that is
available in all contexts we support. For clarity we also rename said
function since we're not exclusively dealing with `window` nor worker
data anymore.
Given that this function is only ever used in `src/core/` code, let's avoid a little bit of dead code in the *built* `pdf.mjs` file.
Also, place the `AnnotationPrefix` and `AnnotationEditorPrefix` constants together in `src/shared/util.js` since that should aid readability.
Note that `Ref`s and `Name`s are cached globally[1], since that helps reduce object creation (a lot) during parsing.
That cache will be cleared after a period of inactivity in the viewer[2], which is why those primitives cannot *safely* be compared with just `===`/`!==` and also (partially) why abstractions such as `RefSet`/`RefSetCache` are necessary.
Currently `deepCompare` doesn't handle `Ref`s and `Name`s correctly, which may lead to future *intermittent* bugs in any code using the `deepCompare` helper function.
---
[1] This applies to `Cmd` as well, however that doesn't matter in the context of this patch.
[2] Currently, and for more than a decade, set to 30 seconds.
This fixes two things that I overlooked in PR 21225, more specifically:
- Use proper, rather than semi, private class fields in `WasmImage`.
- Make tracking of `WasmImage` instances optional, to avoid keeping data alive permanently in the `IMAGE_DECODERS` build.
Using Array.prototype.shift() to drain the traversal queue makes each
visited node move the remaining queued entries. For large name/number
trees this can make getAll() spend quadratic time in queue management.
Iterate over the queue with for...of instead. Children pushed while
iterating are still visited, and the queue no longer needs repeated
front removals.
Given that these classes are, with the exception of their `decode` methods, virtually identical this helps reduce code duplication and simplifies maintenance.
These changes reduce the size of the `gulp mozcentral` build-target by `1292` bytes, which obviously isn't a lot but still cannot hurt.
This was necessary before charset compilation was implemented, however that's been supported for many years and this is just dead code now.
- PR 9340, back in 2018, stopped using the `raw` field.
- PR 10591, back in 2019, implemented proper charset compilation.
The `CFFFont.prototype.data` should contain a `Uint8Array`, however if compilation failed it was being set to a `Stream` instance which will thus fail elsewhere in the font-code.
*Please note:* This was found by code inspection, since I don't have a PDF document that's fixed by this change.
Return the data as-is from the `CFFCompiler.prototype.compile` method, rather than making a copy of it first.
The reason that it was implemented this way in PR 21053 was to avoid keeping a potentially large `ArrayBuffer` alive, see https://github.com/mozilla/pdf.js/pull/21053#discussion_r3045402988
Having traced all the call-sites in the font-code that directly or indirectly invoke that code, I've now managed to conclude that the compiled CFF-data is never stored on the `Font` instance and using the data as-is thus shouldn't increase permanent memory usage.
This avoids having to "duplicate" dummy `readBlock` methods in a couple of image-stream classes.
Also, move a few `DecodeStream` field definitions to (ever so slightly) shorten the code.