The main goal is to remove the eval-based interpreter.
In order to have some good performances, the new parser performs some optimizations
on the AST (similar to the ones in the previous implementation),
and the Wasm compiler generates code for the optimized AST.
For now, in case of errors or unsupported features, the Wasm compiler returns null
and the old interpreter is used as a fallback.
Few things are still missing:
- a wasm-based interpreter using a stack (in case the ps code isn't stack-free);
- a better js implementation in case of disabled wasm.
but they will be added in follow-up patches.
This code already has an integration-test, however also having a unit-test shouldn't hurt since those are often easier to run and debug (and it nicely complements the existing `outline` unit-tests).
The patch also makes the following smaller changes to the method itself:
- Avoid creating and parsing an empty Array, when doing the `pageRef` search.
- Use `XRef.prototype.fetch` directly, when walking the parent chain, since the check just above ensures that the value is a Reference.
- Use the `lookupRect` helper when parsing the /BBox entry.
* add support for directly writing masks into `rgbaBuf` if their size matches
* add support for writing into `rgbaBuf` to `resizeImageMask` to avoid extra
allocs/copies
* respect `actualHeight` to avoid unnecessary work on non-emitted rows
* mark more operations as `internal`
This changes the path for what I believe is the common case for masks:
a mask to add transparency to the accompanying opaque image, both being
equal in size.
The other paths are not meaninfully unchanged.
That increases my confidence as these new paths can be easily tested
with a PNG with transparency.
Some tests were failing and has been fixed:
- "Hello" + Alef + "(" + Bet: the "(" (neutral) was not considered as a part of the group Alef(Bet and the group wasn't reverted;
- some intermediate neutrals were considered as strong.
Currently we have no less than three different, but very similar, factories for reading built-in CMap files, standard font files, and wasm files on the main-thread.[1]
These factories were added at different points in time, since I cannot imagine that we'd add essentially three copies of the same code otherwise.
Nowadays these factories are often not even used[2], since worker-thread fetching is used whenever possible to improve performance. In particular, they will *only* be used when either:
- The PDF.js library runs in Node.js environments.
- The user manually sets `useWorkerFetch = false` when calling `getDocument`.
- The user provides custom `CMapReaderFactory`, `StandardFontDataFactory`, and/or `WasmFactory` instances when calling `getDocument`.
By replacing these factories with *a single* new `BinaryDataFactory` factory/option the number of `getDocument` options are thus reduced, which cannot hurt.
This also reduces the total bundle-size of the Firefox PDF Viewer a little bit, and it slightly reduces the number of import maps that need to be maintained.
*Please note:* For users that provide custom `CMapReaderFactory`, `StandardFontDataFactory`, and `WasmFactory` instances when calling `getDocument` this will be a breaking change, however it's unlikely that (many) such users exist.
(The *internal* format data-format of `CMapReaderFactory` was changed in PR 18951, and there hasn't been a single question/complaint about it in well over a year.)
---
[1] Any new functionality could easily lead to more such factories being added in the future, which wouldn't be great.
[2] Note that the Firefox PDF Viewer no longer use these factories, since it "forcibly" sets `useWorkerFetch = true` during building.
The `BaseCMapReaderFactory`, `BaseStandardFontDataFactory`, and `BaseWasmFactory` classes are all very similar, and the only difference is really in their respective `fetch` methods.
By have the worker-thread "compute" the complete `filename` it's possible to simplify the `BaseCMapReaderFactory.prototype.fetch` method, which will allow future improvements to all of these classes.
A couple of things to note:
- This code is unused, and it's not even bundled, in the Firefox PDF Viewer.
- In browsers it's unused by default, and worker-thread fetching will always be used when possible since that's more efficient.
*Please note:* For users that provide a custom `CMapReaderFactory` instance when calling `getDocument` this could be a breaking change, however it's unlikely that any such users exist.
(The *internal* format of this data was changed previously in PR 18951, and there hasn't been a single question/complaint about it in well over a year.)
The WebGPU feature hasn't been released yet but it's interesting to see how
we can use it in order to speed up the rendering of some objects.
This patch allows to render mesh patterns using WebGPU.
I didn't see any significant performance improvement on my machine (mac M2)
but it may be different on other platforms.
Given that we "forcibly" set `useWorkerFetch = true` for the MOZCENTRAL build-target there's a small amount of dead code as a result, which we can thus remove during building.
In PR 20367 the `CompiledFont.prototype.getPathJs` method was changed to return TypedArray data, however the `NOOP` fallback was (likely accidentally) left an empty string.
The compilation of font-paths in PR 20346 was then implemented such that an empty string just happened to be ignored silently, however the assert added in PR 20894 allowed me to spot this return value inconsistency.
*Please note:* Since this only applies to missing or broken glyphs, that wouldn't be rendered anyway, this doesn't show up in reference tests.
Given that `CompiledFont.prototype.getPathJs` already returns data in the desired TypedArray format, we should be able to directly copy the font-path data which helps shorten the code a little bit (rather than the "manual" handling in PR 20346).
To ensure that this keeps working as expected, a non-production `assert` is added to prevent any future surprises.
After the changes in PR 20197 the code in the `TranslatedFont.prototype.send` method is not all that readable[1] given how it handles e.g. the `charProcOperatorList` data used with Type3 fonts.
Since this is the only spot where `Font.prototype.exportData` is used, it seems much simpler to move the `compileFontInfo` call there and *directly* return the intended data rather than messing with it after the fact.
Finally, while it doesn't really matter, the patch flips the order of the `charProcOperatorList` and `extra` properties throughout the code-base since the former is used with Type3 fonts while the latter (effectively) requires that debugging is enabled.
---
[1] I had to re-read it twice, also looking at all the involved methods, in order to convince myself that it's actually correct.