36 Commits

Author SHA1 Message Date
Kilian Schuettler
7b5a175440 RED-7669: improve ocr
* fix pmd
2024-05-13 11:35:57 +02:00
Kilian Schuettler
18ba1daaef RED-7669: improve ocr
* decrease otsu-scorefract slightly for thin lines
* don't write text that is overlapped with existing text
2024-05-08 10:55:38 +02:00
Kilian Schuettler
c61f71871e RED-7669: improve ocr
* decrease otsu-scorefract slightly for thin lines
* don't write text that is overlapped with existing text
2024-05-08 10:54:25 +02:00
Kilian Schüttler
6be5dcf305 Merge branch 'RED-8800' into 'master'
RED-8800: fix text location for weird mediaboxes

See merge request fforesight/ocr-service!46
2024-04-04 18:10:02 +02:00
Kilian Schuettler
7f0fb149a9 RED-8800: fix text location for weird mediaboxes 2024-04-04 17:03:37 +02:00
Timo Bejan
d8011bdba5 CLARI-30 ocr service compatibility 2024-03-08 14:44:48 +02:00
Timo Bejan
6d69b783f1 wrong conditional 2024-03-06 18:09:16 +02:00
Timo Bejan
2e37b8eec9 CLARI-30 - reworked ocr service to use queues for request/response, moved DLQ listener to consumer of this service. Removed rest API calls 2024-03-04 11:42:30 +02:00
Kilian Schuettler
d2f2def1c2 RED-8156: add ocr debug layers to viewer document
* fix pmd
* disable tests again
2024-02-07 11:36:42 +01:00
Kilian Schuettler
2bbc3775c5 RED-8156: add ocr debug layers to viewer document 2024-02-07 11:31:40 +01:00
Kilian Schuettler
2aaa53f441 RED-8156: add debug layers to viewer document
* wip, fonts need to be created in the original document
2024-02-05 18:28:19 +01:00
Timo Bejan
b48db538fd PMD fix for ocr service RED-8085 2024-01-30 07:17:37 +01:00
Kilian Schüttler
9010ee8691 RED-8212: Pageborders from scanned documents are used for tables 2024-01-24 13:40:17 +01:00
Kilian Schüttler
74d5f8d8e0 RED-8155: bold-detection in ocr-service 2024-01-17 13:54:00 +01:00
Kilian Schüttler
be4656189b RED-8155: integrate bold-detection into ocr-service 2024-01-05 16:05:53 +01:00
Kilian Schuettler
67540950b8 RED-7669: optimize OCR-module performance
* fix thread handling for PDFs without any images
2023-12-22 15:11:29 +01:00
Kilian Schuettler
6f29270e66 RED-7669: optimize OCR-module performance
* fix thread handling for PDFs without any images
2023-12-22 15:04:52 +01:00
Kilian Schüttler
c06974ce69 RED-7669: optimize OCR-module performance 2023-12-12 15:27:00 +01:00
Kilian Schuettler
6fe95c6940 RED-7669: optimize OCR-module performance
* don't interrupt threads, use boolean flag instead
2023-11-28 10:04:56 +01:00
Kilian Schuettler
0264e28cc2 RED-7669: optimize OCR-module performance
* enable caches
2023-11-24 10:21:55 +01:00
Kilian Schuettler
c7ccbae6ff RED-7669: optimize OCR-module performance
* move all critical stuff to its own singleton thread
* make gs process queue any image once the file has been written
2023-11-23 16:00:31 +01:00
Kilian Schuettler
880bebcafc RED-7669: optimize OCR-module performance
* move all critical stuff to its own singleton thread
* make gs process queue any image once the file has been written
2023-11-23 16:00:31 +01:00
Kilian Schuettler
955ff6281d RED-7669: optimize OCR-module performance
* move all critical stuff to its own singleton thread
* make gs process queue any image once the file has been written
2023-11-23 16:00:31 +01:00
Kilian Schuettler
efd3a1d952 RED-7669: optimize OCR-module performance
* move all non thread safe stuff to separate thread in the middle
2023-11-23 16:00:29 +01:00
Kilian Schuettler
bb5b4a2fd8 RED-7669: optimize OCR-module performance
* binarize images after reading
2023-11-23 16:00:22 +01:00
Kilian Schuettler
6f99664906 RED-7669: optimize OCR-module performance
* try and synchronize all malloc calls
2023-11-23 16:00:19 +01:00
Kilian Schuettler
574f7ac25e RED-7669: optimize OCR-module performance
* moar sigsegv
2023-11-23 16:00:01 +01:00
Kilian Schuettler
12217f2459 RED-7669: optimize OCR-module performance
* moar sigsegv
2023-11-23 16:00:01 +01:00
Kilian Schuettler
19747cbca5 RED-7669: optimize OCR-module performance
* moar sigsegv
2023-11-23 15:59:59 +01:00
Kilian Schuettler
2632d2023d RED-7669: optimize OCR-module performance
* reset test and settings
2023-11-23 15:59:16 +01:00
Kilian Schuettler
4c225c2219 RED-7669: optimize OCR-module performance
* cleanup Code
2023-11-23 15:59:16 +01:00
Kilian Schuettler
3d09f46844 RED-7669: optimize OCR-module performance
* don't despeckle small images
2023-11-23 15:59:16 +01:00
Kilian Schuettler
77355b5367 RED-7669: optimize OCR-module performance
* second attempt at thread safety
2023-11-23 15:59:16 +01:00
Kilian Schuettler
57e194fcd0 RED-7669: optimize OCR-module performance
* attempt at thread safety
2023-11-23 15:59:14 +01:00
Kilian Schüttler
759bae6499 RED-7669: optimize OCR-module performance 2023-11-20 09:55:48 +01:00
Kilian Schüttler
a82676c36b CYB-001: Improve OCR-Module performance 2023-11-14 09:17:46 +01:00