Given that `CompiledFont.prototype.getPathJs` already returns data in the desired TypedArray format, we should be able to directly copy the font-path data which helps shorten the code a little bit (rather than the "manual" handling in PR 20346).
To ensure that this keeps working as expected, a non-production `assert` is added to prevent any future surprises.
The commit message for the patch in PR 20427 is pretty non-descriptive, being only a single line, however there's a bit more context in https://github.com/mozilla/pdf.js/pull/20427#issue-3597370951 but unfortunately the details there don't really make sense.
Note that the PR only changed main-thread code, but all the links are to worker-thread code!?
The `FontFaceObject` class is only used on the main-thread, and when encountering a broken font we fallback to the built-in font renderer; see 820b70eb25/src/display/font_loader.js (L135-L143)
Hence the `FontFaceObject` class *only* needs a way to set the `disableFontFace` property, however nowhere on the main-thread do we ever update the `bbox` of a font.
This is an old API-parameter that is now unused within the PDF.js project itself, and its description says that it's (partly) being used for "range requests operations".
Note that the `length` API-parameter is used to set the *initial* `contentLength` in various `BasePDFStreamReader` implementations, however it's always overridden by the "Content-Length" header (sent by the server) when that one exists *and* is a valid number. While we currently fallback to the keep the initial `contentLength` otherwise, note however how in that case range requests will always be *disabled* and thus the only spot in the code-base [where `fullReader.contentLength` is necessary](873378b718/src/core/worker.js (L230-L236)) cannot actually be reached.
Hence the only possible reason to use the `length` API-parameter would be for improved progress reporting[1] during streaming of PDF data in rare cases where the "Content-Length" header is missing/invalid, but the user *somehow* has information from another source about the correct `length` of the PDF document.
That situation feels very much like an edge-case, but it's obviously impossible to know if someone is depending on it. However, please note that there's a work-around available for users affected by this removal:
- Implement a `PDFDataRangeTransport` instance together with custom data-fetching[2], since in that case its `length`-parameter will always be used as-is.
Finally, updates various `BasePDFStreamReader` implementations to only set the `_isRangeSupported` field once the headers are available (since previously we'd just overwrite the "initial" value anyway).
---
[1] I.e. to avoid the "indeterminate" loadingBar being displayed in the viewer.
[2] This is what e.g. the Firefox PDF Viewer uses.
Currently the `bbox`, `fontMatrix`, and `defaultVMetrics` getters duplicate almost the same code, which we can avoid by adding a new helper method (similar to existing ones for reading numbers and strings).
The added `assert` in the new helper method also caught a bug in how the `defaultVMetrics` length was compiled.
On the worker-thread only the static `write` methods are actually used, and on the main-thread only class instances are being created.
Hence this, after PR 20197, leads to a bunch of dead code in both of the *built* `pdf.mjs` and `pdf.worker.js` files.
This patch reduces the size of the `gulp mozcentral` output by `21 419` bytes, i.e. `21` kilo-bytes, which I believe is way too large of a saving to not do this.
(I can't even remember the last time we managed to reduce build-size this much with a single patch.)
It has few cool features:
- all the canvas used during the rendering can be viewed;
- the different properties in the graphics state can be viewed;
- the different paths can be viewed.
The purpose of PR 11844 was to reduce memory usage once fonts have been attached to the DOM, since the font-data can be quite large in many cases.
Unfortunately the new `clearData` method added in PR 20197 doesn't actually remove *anything*, it just replaces the font-data with zeros which doesn't help when the underlying `ArrayBuffer` itself isn't modified.
The method does include a commented-out `resize` call[1], but uncommenting that just breaks rendering completely.
To address this regression, without having to make large or possibly complex changes, this patch simply changes the `clearData` method to replace the internal buffer/view with its contents *before* the font-data.
While this does lead to a data copy, the size of this data is usually orders of magnitude smaller than the font-data that we're removing.
---
[1] Slightly off-topic, but I don't think that patches should include commented-out code since there's a very real risk that those things never get found/fixed.
At the very least such cases should be clearly marked with `// TODO: ...` comments, and should possibly also have an issue filed about fixing the TODO.
The `PagesMapper` class currently makes up one third of the `src/display/display_utils.js` file size, and since its introduction it's grown (a fair bit) in size.
Note that the intention with files such as `src/display/display_utils.js` was to have somewhere to place functionality too small/simple to deserve its own file.
- Mention the internal viewer in the README, such that it's easier to find.
- Implement a new `INTERNAL_VIEWER` define, such that it's easier to limit code to only the "internal-viewer" gulp target.
- Only include the "GetRawData" message-handler when needed. Note that the `MessageHandler` [already throws](eb159abd6a/src/shared/message_handler.js (L121-L123)) for any missing handler.
- Move the various new helper functions from `src/core/document.js` and into their own file. The reasons for doing this are:
- That file is already quite large and complex as-is, and these helper functions are slightly orthogonal to its main functionality.
- Babel isn't able to remove all of the new code, and by moving this into a separate file we can guarantee that no extra code ends up in e.g. Firefox.
When recording bboxes for images, it's enough to record their
clip box / bounding box without needing to run the full bbox
tracking of the image's dependencies.
This patch adds right-click support for images in the PDF, allowing
users to download them. To minimize memory consumption, we:
- Do not store the images separately, and instead crop them out of the
PDF page canvas
- Only extract the images when needed (i.e. when the user right-clicks
on them), rather than eagery having all of them available.
To do so, we layer one empty 0x0 canvas per image, stretched to cover
the whole image, and only populate its contents on right click.
These images need to be inside the text layer: they cannot be _behind_
it, otherwise they would be covered by the text layer's container and
not be clickable, and they cannot be in front of it, otherwise they
would make the text spans unselectable.
This feature is managed by a new preference, `imagesRightClickMinSize`:
- when it's set to `-1`, right-click support is disabled
- when set to `0`, all images are available for right click
- when set to a positive integer, only images whose width and height are
greater than or equal to that value (in the PDF page frame of
reference) are available for right click.
This features is disabled by default outside of MOZCENTRAL, as it
significantly degrades the text selection experience in non-Firefox
browsers.
In PR 19548 these checks were added to ensure that the font-data sent from the worker-thread *always* include correct `disableFontFace` and `fontExtraProperties` data.
For some reason PR 20197 then changed the code such that these checks became effectively pointless, since these properties are now checked after the fact *and* the new getters provide fallback values.
This method is only used with loops, and it should be a tiny bit more efficient to use an iterator directly rather than first iterating through the underlying data to create a temporary `Array` that we finally iterate through at the call-site.
*Please note:* As port of these changes the chars/glyph caches, on the `Font` instances, are changed to use `Map`s rather than Objects.
This method is only used with loops, and it should be a tiny bit more efficient to use an iterator directly rather than first iterating through the underlying `Map` to create a temporary `Array` that we finally iterate through at the call-site.
The one from pdf.js.utils is a bit too old: a lot of bugs have been fixed
in the code that parses PDF files since then.
It's just an internal development tool, so it doesn't need to be perfect,
but it should be good enough to be useful.
A number of these unit-tests didn't actually cover the intended code-paths, since many of them *accidentally* matched the "file size is smaller than two range requests"-check.
The patch also updates `validateRangeRequestCapabilities` to use return-value names that are consistent with the class fields used in the various stream implementations.
Falling back to use the `loaded` byteLength if the server `contentLength` is unknown doesn't make a lot of sense, since it'd lead to the `onProgress` callback reporting `percent === 100` repeatedly while the document is loading despite that being obviously wrong.
Instead we'll now report `percent === NaN` in that case, thus showing the indeterminate progressBar, which seems more correct if the `contentLength` is unknown.
Please note that this code-path is normally not even reached, since streaming is enabled by default (applies e.g. to the Firefox PDF Viewer).
With these changes `0`, `NaN`, `null`, and `undefined` in the `total`-property all result in `percent === NaN` being reported by the callback, since previously e.g. `0` would result in `percent === 100` being reported unconditionally which doesn't make a lot of sense.
Also, remove the "indeterminate" loadingBar (in the viewer) if the `PDFDocumentLoadingTask` fails since there won't be any more data arriving and displaying the animation thus seems wrong.
This adds a basic non-MOZCENTRAL polyfill for now, which we should be able to remove once the next QuickJS version is released; note the pending changelog at f1139494d1/Changelog (L7)
This adds a *very basic* non-MOZCENTRAL polyfill for now, which we should be able to remove once the next QuickJS version is released; note the pending changelog at f1139494d1/Changelog (L8)