60 Commits

Author SHA1 Message Date
Maverick Studer
74f55a5cbf RED-8550: Faulty table recognition and text duplication leads to huge sections 2024-02-28 16:13:56 +01:00
Maverick Studer
1d64028158 RED-8550: Faulty table recognition and text duplication leads to huge sections 2024-02-21 13:54:30 +01:00
yhampe
cc77d19500 RED-8481: Use visual layout parsing to detect signatures
addressed review comments
2024-02-15 13:01:30 +01:00
yhampe
bdf1161c91 RED-8481: Use visual layout parsing to detect signatures
addressed review comments
2024-02-15 12:12:23 +01:00
yhampe
b4a225144d RED-8481: Use visual layout parsing to detect signatures
working on failing tests
2024-02-15 10:16:07 +01:00
yhampe
fbd0196719 RED-8481: Use visual layout parsing to detect signatures
implemented visuallayoutparsingresult
2024-02-14 12:16:37 +01:00
Kilian Schuettler
015984891f RED-8156: refactor ViewerDocumentService as a dependency for ocr-service
* fix pmd
2024-02-06 17:17:26 +01:00
Kilian Schuettler
66fcb62833 RED-8156: refactor ViewerDocumentService as a dependency for ocr-service
* fix pmd
2024-02-06 17:09:21 +01:00
Kilian Schuettler
23eb0c40a3 RED-8156: refactor ViewerDocumentService as a dependency for ocr-service
* various improvements to experimental parsing steps
* added embed fonts functionality to viewer doc
2024-02-06 16:59:51 +01:00
Timo Bejan
88855de2da RED-8085 2024-01-29 10:31:36 +01:00
Kilian Schüttler
ba1c7c07ab RED-7384: fixes for migration 2023-12-20 12:40:00 +01:00
Dominique Eifländer
dacc2f7f43 DM-589: Filter wrongly detected cells that come from borders caused by rotation at scanning 2023-11-20 15:54:02 +01:00
yhampe
b25d46291a * checkstyle 2023-11-16 08:12:47 +01:00
yhampe
84148d3b6e * fixed tests 2023-11-16 07:51:08 +01:00
Dominique Eifländer
a6ba66b1aa TAAS-103: Fixed values in wrong cells 2023-11-15 13:36:46 +01:00
yhampe
c3e69b2cdf * fixed bug with incorrect empty cell count by adding threshold to cell.contains 2023-11-15 10:44:47 +01:00
Corina Olariu
3bab61c446 RED-7434 - Remove Section Grid entirely
- remove sectionGrid relation (including SectionGridCreatorService)
- update junit tests
2023-10-20 09:09:22 +03:00
Dominique Eifländer
567cbc178b hotfix: Fixed parsing for specific taas document 2023-10-17 15:52:19 +02:00
Dominique Eifländer
8647cf5a18 RED-7759: Upgraded storage-commons to newest Windows-compatible version 2023-10-13 12:15:22 +02:00
Corina Olariu
daba0bf8a6 RED-7607 - Rotating pages leads to lost annotations (RM & DM)
- remove finally clause
2023-10-04 17:46:46 +03:00
Corina Olariu
f2c0991987 RED-7607 - Rotating pages leads to lost annotations (RM & DM)
- fix PMD findings
2023-10-04 14:09:46 +03:00
Kilian Schuettler
621c3f269d TAAS-104: merge visually intersecting Paragraphs 2023-09-05 16:09:05 +02:00
deiflaender
306a53ea79 RED-7461: Fixed wrong textblock classification if footer is marked as header 2023-09-01 12:07:47 +02:00
Kilian Schuettler
28ec4c9ccb TAAS-89: added log entry and an end2end test 2023-08-31 14:28:18 +02:00
Kilian Schuettler
3a18923ef5 upgrade PDFBox to 3.0.0
* disable experimental ruling header stuff
2023-08-21 17:54:20 +02:00
Kilian Schuettler
2b15fd1d3c RED-7461: improve header/footer recognition 2023-08-21 17:49:13 +02:00
deiflaender
0cb8029f0a RED-7461: Fixed pr findings 2023-08-21 16:57:37 +02:00
deiflaender
b270b9c942 RED-7461: Use marked content to classify headers and footers if available 2023-08-21 16:02:24 +02:00
deiflaender
60615ec5d8 RED-7461: First working iteration of header and footer improvement 2023-08-21 15:31:11 +02:00
Timo Bejan
83d39ba3a5 Fixed issue with weird colors 2023-08-18 16:21:45 +03:00
Kilian Schuettler
ea0af08c31 RED-7851: add layoutgrid to new viewer document as optional content 2023-08-14 16:06:23 +02:00
Andrei Isvoran
cfca5376a0 RED-6864 - Switch to new storage-commons download 2023-08-08 17:16:40 +02:00
Kilian Schuettler
4a5464d6aa Refactoring to make downstream refactoring easier 2023-08-04 15:16:36 +02:00
Kilian Schuettler
ded00df11e fix build 2023-08-01 09:57:58 +02:00
Kilian Schuettler
d6a74dc9f9 add field id to image data 2023-07-31 16:32:11 +02:00
Kilian Schuettler
72d1e6271a more refactoring, added a comment 2023-07-27 14:35:40 +02:00
Kilian Schuettler
299b5be385 package refactoring in processor 2023-07-27 14:28:09 +02:00
Kilian Schuettler
41267a0f98 ported to gradle 2023-07-27 12:27:30 +02:00
Kilian Schuettler
65ab5eca22 update to redaction-service state 2023-07-25 16:10:57 +02:00
Kilian Schuettler
143ebee25e move and fix layout tests from redaction-service 2023-07-24 19:43:25 +02:00
Kilian Schuettler
47fd8e05d1 rename Data classes 2023-07-24 18:36:27 +02:00
Kilian Schuettler
daa68f3fa6 TAAS-41: disable experimental tests 2023-07-24 16:07:27 +02:00
Kilian Schuettler
ed66043856 TAAS-41: add test files 2023-07-24 16:04:51 +02:00
Kilian Schuettler
526b1c5ad3 TAAS-41: add (inactive) experimental services 2023-07-24 15:58:06 +02:00
Kilian Schuettler
241a32cb4f TAAS-41/ RED-6725: integrate layoutparser into redactmanager 2023-07-24 15:55:31 +02:00
Timo Bejan
3bc88bc9b7 store new document type 2023-07-13 13:01:01 +03:00
Kilian Schuettler
15a6d46f5c RED-7081: getBBox() Performance Improvement 2023-07-13 13:01:01 +03:00
Kilian Schuettler
788613c92e TAAS-41: TAAS Document Structure
* added linebreaks to ParagraphData
* moved List<String> cellText to List<ParagraphData> cellTexts
2023-07-13 13:01:01 +03:00
Kilian Schuettler
7f0aa32d1b TAAS-41: TAAS Document Structure
* added more testFiles
* hacked a workaround for CMMException
2023-07-13 13:01:01 +03:00
Kilian Schuettler
f08c4ced43 TAAS-41: TAAS Document Structure
* changed TextPageBlock splitting
* changed Header and Footer Classification
* added TAAS Document Structure Prototype
2023-07-13 13:01:01 +03:00