Marmelator/pdf.js.mirror

mirror of https://github.com/mozilla/pdf.js.git synced 2026-05-31 15:21:00 +02:00

Author	SHA1	Message	Date
Jonas Jenwald	f6bac014ea	[api-minor] Remove `PostScriptCompiler` and `PostScriptEvaluator`, since it's now dead code (PR 21023 follow-up) These classes, and various related code, became unused after PR 21023 with only unit-tests actually running that code now. Also removes the `isEvalSupported` API option, since the `PostScriptCompiler` was the only remaining code where `eval` was used.	2026-04-03 22:14:14 +02:00
Tim van der Meij	d1a711bca3	Merge pull request #21023 from calixteman/wasm_stack_js Add a js fallback for interpreting ps code	2026-04-03 20:09:29 +02:00
Jonas Jenwald	68366e31e4	Move the `MathClamp` helper function to its own file This allows using it in the `src/scripting_api/` folder, without increasing the size of the scripting-bundle by also importing a bunch of unused code.	2026-04-02 11:22:28 +02:00
calixteman	8c7a5f3500	Add a js fallback for interpreting ps code It's a basic stack based interpreter. A wasm version will come soon.	2026-04-01 21:40:45 +02:00
Calixte Denizet	f373923170	Encrypt pdf data when merging the same pdf (bug 2028369)	2026-04-01 19:01:11 +02:00
calixteman	399fce6471	Merge pull request #21010 from calixteman/ps_js Add an interpreter for optimized ps code	2026-03-31 22:21:00 +02:00
Calixte Denizet	9f3de1edf6	Add an interpreter for optimized ps code It'll be used as a fallback when wasm is disabled. And add in the debugger a view for the generated js code and one for the ps code.	2026-03-31 21:00:22 +02:00
Calixte Denizet	3727b7095a	Add support for function-based shadings (bug 1254066) It fixes #5046. We just generate a mesh for the pattern rectangle where the color of each vertex is computed from the function. Since the mesh is generated in the worker we don't really take into account the current transform when it's drawn. That being said, there are maybe some possible improvements in using directly the gpu for the shading creation which could then take into account the current transform, but it could only work with ps function we can convert into wgsl language and simple enough color spaces (gray and rgb).	2026-03-31 20:46:01 +02:00
Tim van der Meij	58b807d8e8	Merge pull request #21008 from calixteman/ast_cse Avoid expressions duplication in the ps AST and use a local instead when compiling to WASM	2026-03-31 20:21:59 +02:00
Tim van der Meij	48228e2756	Merge pull request #21013 from calixteman/bug2026956 Add attachments when merging/reorganizing a pdf (bug 2026956)	2026-03-31 20:17:54 +02:00
Calixte Denizet	5b8c04f383	Add attachments when merging/reorganizing a pdf (bug 2026956)	2026-03-31 14:48:06 +02:00
Calixte Denizet	63cf35b47f	Avoid expressions duplication in the ps AST and use a local instead when compiling to WASM	2026-03-30 16:30:33 +02:00
Jonas Jenwald	bfffb6c0f0	Import `fs/promises` directly in a few spots in the unit-tests Also, use the existing PDF.js helper function to fetch text-data when running the "bidi" tests in browsers.	2026-03-30 14:34:53 +02:00
calixteman	952952c905	[api-minor] Rewrite the ps lexer & parser and add a small Wasm compiler The main goal is to remove the eval-based interpreter. In order to have good performance, the new parser performs some optimizations on the AST (similar to the ones in the previous implementation), and the Wasm compiler generates code for the optimized AST. For now, in case of errors or unsupported features, the Wasm compiler returns null and the old interpreter is used as a fallback. A few things are still missing: - a wasm-based interpreter using a stack (in case the ps code isn't stack-free); - a better js implementation in case of disabled wasm. but they will be added in follow-up patches.	2026-03-30 09:22:33 +02:00
Tim van der Meij	ada3438039	Merge pull request #21001 from Snuffleupagus/getDestFromStructElement-unit-test Add a unit-test for the `Catalog.#getDestFromStructElement` method	2026-03-29 16:08:21 +02:00
Jonas Jenwald	498daadf3c	Simplify the `applyOpacity` helper function This function only has a single call-site (if we ignore the unit-tests), where the colors are split into separate parameters. Given that all the color components are modified in the exact same way, it seems easier (and shorter) to pass the colors as-is to `applyOpacity` and have it use `Array.prototype.map()` instead.	2026-03-29 14:52:06 +02:00
Jonas Jenwald	d1f15fe352	Add a unit-test for the `Catalog.#getDestFromStructElement` method This code already has an integration-test, however also having a unit-test shouldn't hurt since those are often easier to run and debug (and it nicely complements the existing `outline` unit-tests). The patch also makes the following smaller changes to the method itself: - Avoid creating and parsing an empty Array, when doing the `pageRef` search. - Use `XRef.prototype.fetch` directly, when walking the parent chain, since the check just above ensures that the value is a Reference. - Use the `lookupRect` helper when parsing the /BBox entry.	2026-03-29 14:01:43 +02:00
Calixte Denizet	2e3d79e616	Break text chunks only if the base font is different It fixes #20956.	2026-03-26 21:39:32 +01:00
Calixte Denizet	42c229c267	Add the bidi tests coming from BidiTest.txt and BidiCharacterTest.txt Some tests were failing and have been fixed: - "Hello" + Alef + "(" + Bet: the "(" (neutral) was not considered as a part of the group Alef(Bet and the group wasn't reverted; - some intermediate neutrals were considered as strong.	2026-03-25 15:18:50 +01:00
Jonas Jenwald	a0102abe76	Move the `NetworkStream` choice from `src/display/api.js` and into a separate file This code already isn't used (or even bundled) in the Firefox PDF Viewer, and it also slightly reduces the number of import maps that need to be maintained.	2026-03-24 17:08:04 +01:00
Jonas Jenwald	3a372fde94	[api-minor] Replace the `CMapReaderFactory`, `StandardFontDataFactory`, and `WasmFactory` API options with a single factory/option Currently we have no less than three different, but very similar, factories for reading built-in CMap files, standard font files, and wasm files on the main-thread.[1] These factories were added at different points in time, since I cannot imagine that we'd add essentially three copies of the same code otherwise. Nowadays these factories are often not even used[2], since worker-thread fetching is used whenever possible to improve performance. In particular, they will only be used when either: - The PDF.js library runs in Node.js environments. - The user manually sets `useWorkerFetch = false` when calling `getDocument`. - The user provides custom `CMapReaderFactory`, `StandardFontDataFactory`, and/or `WasmFactory` instances when calling `getDocument`. By replacing these factories with a single new `BinaryDataFactory` factory/option the number of `getDocument` options is thus reduced, which cannot hurt. This also reduces the total bundle-size of the Firefox PDF Viewer a little bit, and it slightly reduces the number of import maps that need to be maintained. Please note: For users that provide custom `CMapReaderFactory`, `StandardFontDataFactory`, and `WasmFactory` instances when calling `getDocument` this will be a breaking change, however it's unlikely that (many) such users exist. (The internal data-format of `CMapReaderFactory` was changed in PR 18951, and there hasn't been a single question/complaint about it in well over a year.) --- [1] Any new functionality could easily lead to more such factories being added in the future, which wouldn't be great. [2] Note that the Firefox PDF Viewer no longer uses these factories, since it "forcibly" sets `useWorkerFetch = true` during building.	2026-03-22 15:49:06 +01:00
calixteman	ec24053ddf	Don't add an EOL after a superscript	2026-03-22 14:20:18 +01:00
Jonas Jenwald	262aeef3fa	[api-minor] Simplify `BaseCMapReaderFactory` by having the worker-thread create the `filename` The `BaseCMapReaderFactory`, `BaseStandardFontDataFactory`, and `BaseWasmFactory` classes are all very similar, and the only difference is really in their respective `fetch` methods. By having the worker-thread "compute" the complete `filename` it's possible to simplify the `BaseCMapReaderFactory.prototype.fetch` method, which will allow future improvements to all of these classes. A couple of things to note: - This code is unused, and it's not even bundled, in the Firefox PDF Viewer. - In browsers it's unused by default, and worker-thread fetching will always be used when possible since that's more efficient. Please note: For users that provide a custom `CMapReaderFactory` instance when calling `getDocument` this could be a breaking change, however it's unlikely that any such users exist. (The internal format of this data was changed previously in PR 18951, and there hasn't been a single question/complaint about it in well over a year.)	2026-03-21 15:54:40 +01:00
Tim van der Meij	ab228da9ce	Merge pull request #20931 from Snuffleupagus/rm-factory-name-validation Remove explicit `name`/`filename` validation in the `BaseCMapReaderFactory`, `BaseStandardFontDataFactory`, and `BaseWasmFactory` classes	2026-03-20 20:15:23 +01:00
calixteman	16aee06aac	Merge pull request #20925 from calixteman/reorganize_save_annotations Add the possibility to save added annotations when reorganizing a pdf (bug 2023086)	2026-03-20 16:32:10 +01:00
Jonas Jenwald	5299eb2b83	Remove explicit `name`/`filename` validation in the `BaseCMapReaderFactory`, `BaseStandardFontDataFactory`, and `BaseWasmFactory` classes Given that these classes are only used from the "FetchBinaryData" message handler, the `name`/`filename` parameters should never actually be missing and if they are that's a bug elsewhere in the code-base. Furthermore a missing `name`/`filename` parameter would result in a "nonsense" URL and the actual data fetching would then fail instead, hence keeping this old validation code just doesn't seem necessary.	2026-03-20 15:50:26 +01:00
Calixte Denizet	04272de41d	Add the possibility to save added annotations when reorganizing a pdf (bug 2023086)	2026-03-20 10:55:47 +01:00
Calixte Denizet	c17801b77e	Avoid getting null value in RefSet when cloning	2026-03-20 10:41:58 +01:00
Tim van der Meij	ff1af5a058	Merge pull request #20916 from calixteman/fix_co When merging pdfs, fix the CO after the fields have been cloned	2026-03-19 21:22:43 +01:00
Tim van der Meij	6245bb201c	Merge pull request #20915 from calixteman/fix_pageindice Avoid to use a used slot when looking for a new page position	2026-03-19 21:22:32 +01:00
Tim van der Meij	8cae5d17f2	Merge pull request #20917 from calixteman/fix_dup_name_dest Fix the destination names when they're duplicated	2026-03-19 21:22:19 +01:00
Jonas Jenwald	7609a42209	Use `toBeInstanceOf` consistently in the unit-tests There's currently a lot of unit-tests that manually check `instanceof`, let's replace that with the built-in Jasmine matcher function; see https://jasmine.github.io/api/edge/matchers.html#toBeInstanceOf	2026-03-19 17:18:25 +01:00
Calixte Denizet	cf67c1ef1e	Fix the destination names when they're duplicated	2026-03-19 10:52:39 +01:00
Calixte Denizet	b7da4b80a9	When merging pdfs, fix the CO after the fields have been cloned	2026-03-19 10:09:40 +01:00
Calixte Denizet	0bee641fed	Avoid to use a used slot when looking for a new page position	2026-03-19 09:40:16 +01:00
Jonas Jenwald	bdc16f8999	Merge pull request #20868 from Snuffleupagus/exportData-compileFontInfo Move the `compileFontInfo` call into the `Font.prototype.exportData` method (PR 20197 follow-up)	2026-03-18 11:14:46 +01:00
Calixte Denizet	e67892d035	Add support for saving outlines after reorganize/merge (bug 2009574)	2026-03-17 22:22:13 +01:00
Jonas Jenwald	7d963ddc7c	Move the `compileFontInfo` call into the `Font.prototype.exportData` method (PR 20197 follow-up) After the changes in PR 20197 the code in the `TranslatedFont.prototype.send` method is not all that readable[1] given how it handles e.g. the `charProcOperatorList` data used with Type3 fonts. Since this is the only spot where `Font.prototype.exportData` is used, it seems much simpler to move the `compileFontInfo` call there and directly return the intended data rather than messing with it after the fact. Finally, while it doesn't really matter, the patch flips the order of the `charProcOperatorList` and `extra` properties throughout the code-base since the former is used with Type3 fonts while the latter (effectively) requires that debugging is enabled. --- [1] I had to re-read it twice, also looking at all the involved methods, in order to convince myself that it's actually correct.	2026-03-16 09:29:17 +01:00
calixteman	3ff52e415f	Merge pull request #20862 from calixteman/bug2023106 Check for having Ref before adding them in a RefSet (bug 2023106)	2026-03-15 22:15:58 +01:00
Calixte Denizet	0fca64f01e	Check for having Ref before adding them in a RefSet (bug 2023106)	2026-03-15 22:03:39 +01:00
Tim van der Meij	315491dd32	Merge pull request #20840 from Snuffleupagus/getDocument-rm-length [api-minor] Remove the `length` parameter from `getDocument`	2026-03-15 11:48:02 +01:00
Jonas Jenwald	09a9a7bd0b	[api-minor] Remove the `length` parameter from `getDocument` This is an old API-parameter that is now unused within the PDF.js project itself, and its description says that it's (partly) being used for "range requests operations". Note that the `length` API-parameter is used to set the initial `contentLength` in various `BasePDFStreamReader` implementations, however it's always overridden by the "Content-Length" header (sent by the server) when that one exists and is a valid number. While we currently fall back to keeping the initial `contentLength` otherwise, note however how in that case range requests will always be disabled and thus the only spot in the code-base [where `fullReader.contentLength` is necessary](`873378b718/src/core/worker.js (L230-L236)`) cannot actually be reached. Hence the only possible reason to use the `length` API-parameter would be for improved progress reporting[1] during streaming of PDF data in rare cases where the "Content-Length" header is missing/invalid, but the user somehow has information from another source about the correct `length` of the PDF document. That situation feels very much like an edge-case, but it's obviously impossible to know if someone is depending on it. However, please note that there's a work-around available for users affected by this removal: - Implement a `PDFDataRangeTransport` instance together with custom data-fetching[2], since in that case its `length`-parameter will always be used as-is. Finally, updates various `BasePDFStreamReader` implementations to only set the `_isRangeSupported` field once the headers are available (since previously we'd just overwrite the "initial" value anyway). --- [1] I.e. to avoid the "indeterminate" loadingBar being displayed in the viewer. [2] This is what e.g. the Firefox PDF Viewer uses.	2026-03-13 23:42:45 +01:00
Jonas Jenwald	3842936edf	Split the `src/shared/obj-bin-transform.js` file into separate files for the main/worker threads (PR 20197 follow-up) On the worker-thread only the static `write` methods are actually used, and on the main-thread only class instances are being created. Hence this, after PR 20197, leads to a bunch of dead code in both of the built `pdf.mjs` and `pdf.worker.js` files. This patch reduces the size of the `gulp mozcentral` output by `21 419` bytes, i.e. `21` kilo-bytes, which I believe is way too large of a saving to not do this. (I can't even remember the last time we managed to reduce build-size this much with a single patch.)	2026-03-13 11:21:24 +01:00
calixteman	9d093d9607	Merge pull request #20626 from nicolo-ribaudo/images-right-click Add support for right-clicking on images (bug 1012805)	2026-03-11 11:45:51 +01:00
Nicolò Ribaudo	886c90d1a5	Add support for right-clicking on images This patch adds right-click support for images in the PDF, allowing users to download them. To minimize memory consumption, we: - Do not store the images separately, and instead crop them out of the PDF page canvas - Only extract the images when needed (i.e. when the user right-clicks on them), rather than eagerly having all of them available. To do so, we layer one empty 0x0 canvas per image, stretched to cover the whole image, and only populate its contents on right click. These images need to be inside the text layer: they cannot be _behind_ it, otherwise they would be covered by the text layer's container and not be clickable, and they cannot be in front of it, otherwise they would make the text spans unselectable. This feature is managed by a new preference, `imagesRightClickMinSize`: - when it's set to `-1`, right-click support is disabled - when set to `0`, all images are available for right click - when set to a positive integer, only images whose width and height are greater than or equal to that value (in the PDF page frame of reference) are available for right click. This feature is disabled by default outside of MOZCENTRAL, as it significantly degrades the text selection experience in non-Firefox browsers.	2026-03-10 14:51:03 +01:00
Jonas Jenwald	a1b769caea	Improve the `validateRangeRequestCapabilities` unit-tests A number of these unit-tests didn't actually cover the intended code-paths, since many of them accidentally matched the "file size is smaller than two range requests"-check. The patch also updates `validateRangeRequestCapabilities` to use return-value names that are consistent with the class fields used in the various stream implementations.	2026-03-08 18:28:50 +01:00
calixteman	baf8647b1f	Add the possibility to merge/update acroforms when merging/extracting (bug 2015853)	2026-03-07 19:03:02 +01:00
Jonas Jenwald	229e3642be	Change the `Dict.prototype.getRawValues` method to return an iterator This method is usually used with loops, and it should be a tiny bit more efficient to use an iterator directly rather than first iterating through `Map`-values to create a temporary `Array` that we finally iterate through at the call-site. Note that the `getRawValues` method is old code, and originally the `Dict` class stored its data in a regular `Object`, hence why the old code was written that way.	2026-03-04 16:07:49 +01:00
Jonas Jenwald	58996f21b2	Change the `Dict.prototype.getKeys` method to return an iterator This method is usually used with loops, and it should be a tiny bit more efficient to use an iterator directly rather than first iterating through `Map`-keys to create a temporary `Array` that we finally iterate through at the call-site. Note that the `getKeys` method is old code, and originally the `Dict` class stored its data in a regular `Object`, hence why the old code was written that way.	2026-03-04 16:07:49 +01:00
calixteman	ed390c06a1	Fix intermittent issue with a unit test Avoid relying on timing in the test, which can cause intermittent failures. Instead, we check that the image is cached at the document/page level.	2026-03-01 22:59:04 +01:00

1479 Commits