108 Commits

Author SHA1 Message Date
Dominique Eifländer
8924e905ad hotfix: Extend Tesseract instead of Tesseract1 2024-01-15 16:13:43 +01:00
Kilian Schuettler
912f00aa84 RED-8155: bold-detection in ocr-service
* fix application.yml
2024-01-08 13:49:58 +01:00
Kilian Schüttler
be4656189b RED-8155: integrate bold-detection into ocr-service 2024-01-05 16:05:53 +01:00
Dominique Eifländer
8944b57344 Merge branch 'RED-7669' into 'master'
RED-7669: optimize OCR-module performance

Closes RED-7669

See merge request redactmanager/ocr-service!30
2023-12-22 15:14:42 +01:00
Kilian Schuettler
67540950b8 RED-7669: optimize OCR-module performance
* fix thread handling for PDFs without any images
2023-12-22 15:11:29 +01:00
Kilian Schuettler
6f29270e66 RED-7669: optimize OCR-module performance
* fix thread handling for PDFs without any images
2023-12-22 15:04:52 +01:00
Dominique Eifländer
4b6411161e RED-1137: Do not observe actuator endpoints 2023-12-20 14:17:09 +01:00
Dominique Eifländer
99fc16130b RED-5223: Use tracing-commons from fforesight 2023-12-13 16:10:10 +01:00
Kilian Schüttler
c06974ce69 RED-7669: optimize OCR-module performance 2023-12-12 15:27:00 +01:00
Dominique Eifländer
0300a087d4 RED-5223: Enabled tracing, upgrade spring, use logstash-logback-encoder for json logs 2023-12-12 11:55:01 +01:00
Andrei Isvoran
ae09a59a7c RED-7715 - Add log4j config to enable switching between json/line logs 2023-12-06 11:52:01 +02:00
Kilian Schuettler
6fe95c6940 RED-7669: optimize OCR-module performance
* don't interrupt threads, use boolean flag instead
2023-11-28 10:04:56 +01:00
Kilian Schuettler
0264e28cc2 RED-7669: optimize OCR-module performance
* enable caches
2023-11-24 10:21:55 +01:00
Kilian Schuettler
1926707ae1 RED-7669: optimize OCR-module performance
* move all critical stuff to its own singleton thread
* make gs process queue any image once the file has been written
2023-11-23 16:00:53 +01:00
Kilian Schuettler
d3190844a3 RED-7669: optimize OCR-module performance
* move all critical stuff to its own singleton thread
* make gs process queue any image once the file has been written
2023-11-23 16:00:31 +01:00
Kilian Schuettler
c7ccbae6ff RED-7669: optimize OCR-module performance
* move all critical stuff to its own singleton thread
* make gs process queue any image once the file has been written
2023-11-23 16:00:31 +01:00
Kilian Schuettler
880bebcafc RED-7669: optimize OCR-module performance
* move all critical stuff to its own singleton thread
* make gs process queue any image once the file has been written
2023-11-23 16:00:31 +01:00
Kilian Schuettler
955ff6281d RED-7669: optimize OCR-module performance
* move all critical stuff to its own singleton thread
* make gs process queue any image once the file has been written
2023-11-23 16:00:31 +01:00
Kilian Schuettler
efd3a1d952 RED-7669: optimize OCR-module performance
* move all non thread safe stuff to separate thread in the middle
2023-11-23 16:00:29 +01:00
Kilian Schuettler
bb5b4a2fd8 RED-7669: optimize OCR-module performance
* binarize images after reading
2023-11-23 16:00:22 +01:00
Kilian Schuettler
6f99664906 RED-7669: optimize OCR-module performance
* try and synchronize all malloc calls
2023-11-23 16:00:19 +01:00
Kilian Schuettler
574f7ac25e RED-7669: optimize OCR-module performance
* moar sigsegv
2023-11-23 16:00:01 +01:00
Kilian Schuettler
12217f2459 RED-7669: optimize OCR-module performance
* moar sigsegv
2023-11-23 16:00:01 +01:00
Kilian Schuettler
19747cbca5 RED-7669: optimize OCR-module performance
* moar sigsegv
2023-11-23 15:59:59 +01:00
Kilian Schuettler
2632d2023d RED-7669: optimize OCR-module performance
* reset test and settings
2023-11-23 15:59:16 +01:00
Kilian Schuettler
4c225c2219 RED-7669: optimize OCR-module performance
* cleanup Code
2023-11-23 15:59:16 +01:00
Kilian Schuettler
3d09f46844 RED-7669: optimize OCR-module performance
* don't despeckle small images
2023-11-23 15:59:16 +01:00
Kilian Schuettler
77355b5367 RED-7669: optimize OCR-module performance
* second attempt at thread safety
2023-11-23 15:59:16 +01:00
Kilian Schuettler
57e194fcd0 RED-7669: optimize OCR-module performance
* attempt at thread safety
2023-11-23 15:59:14 +01:00
Kilian Schüttler
759bae6499 RED-7669: optimize OCR-module performance 2023-11-20 09:55:48 +01:00
Kilian Schüttler
a82676c36b CYB-001: Improve OCR-Module performance 2023-11-14 09:17:46 +01:00
Corina Olariu
6d3ec8a9db RED-7686 - Specific hidden text in specific file is not removed
- upgraded storage-commons, tenant-commons to the newest windows compatible versions
2023-10-13 14:50:31 +03:00
Corina Olariu
6533501ffc RED-7686 - Specific hidden text in specific file is not removed
- update pdftron-logic-commons dependency
2023-10-13 13:45:44 +03:00
RaphaelArnold
acba4cb103 RED-7075: Watermark recognition improvement 2023-09-01 12:13:22 +02:00
Andrei Isvoran
ede443a47a RED-6864 - Fix storage update 2023-08-18 10:19:33 +03:00
Andrei Isvoran
1c62c5ddf4 RED-6864 - Update ocr-service to new storage 2023-08-16 11:32:49 +03:00
deiflaender
506b888424 RED-7080: Fixed NPE in OCGWatermark removal 2023-08-14 16:35:31 +02:00
deiflaender
37ff2b982a RED-7080: Remove all watermarks that are named as watermarks in OCG 2023-08-14 16:11:48 +02:00
deiflaender
06c49cc412 RED-7080: Remove watermarks that are named as watermarks in OCG 2023-08-14 13:26:03 +02:00
deiflaender
262204bcca hotfix: Fixed npe for inline image where getXObject() returns null 2023-08-11 15:16:28 +02:00
RaphaelArnold
57ef7da5b3 RED-7075: WIP 2023-08-07 13:29:17 +02:00
Kilian Schuettler
7e20541d73 hotfix: fix OCRServiceIntegrationTest 2023-08-07 12:39:49 +02:00
Andrei Isvoran
33412589c0 RED-7290 - Update platform-common-dependency version 2023-08-03 18:14:46 +03:00
Andrei Isvoran
0ad4682571 RED-7080 - Add removeWatermark flag for dossier template 2023-07-31 15:51:22 +03:00
Timo Bejan
74f9f123f4 Update pom.xml 2023-07-27 08:49:44 +02:00
Ali Oezyetimoglu
7209d47862 RED-7012: upgraded pdftron-logic-commons version 2023-07-19 15:06:22 +02:00
Kilian Schuettler
856d52951c DM-326: extend removeInvisibleElements 2023-07-14 12:46:24 +02:00
Kilian Schuettler
95af0faecd added watermark removal test 2023-07-13 11:25:16 +02:00
Kilian Schuettler
098bfcfce3 DM-305: port rules to new schema
* implement remove invisible text, when color equal to background
2023-07-11 17:23:03 +02:00
deiflaender
4df80612ab DM-307: Added non-production-ready code to remove watermarks from SCM Flora prototype files 2023-07-03 12:37:42 +02:00