39 Commits

Author SHA1 Message Date
Maverick Studer
48b7a22e2b RED-7074: Design Subsection section tree structure algorithm 2024-05-24 13:30:25 +02:00
Kilian Schuettler
b6f0a21886 RED-8825: general layoutparsing improvements
* refactor all coordinates
2024-05-02 21:01:25 +02:00
Kilian Schuettler
15ea385f4d RED-8825: general improvements
* some more refactoring
* fixed text ruling classification for vertical text
* shrunk min graphics size
2024-04-30 10:44:32 +02:00
Kilian Schuettler
3dd215288a RED-8825: improve layoutparsing
* added improved debugging capabilities to viewer-doc
* refactored coordinates (wip)
* refactored line intersection algorithm
* removed cropbox correction from pdfbox text positions
2024-04-29 15:54:53 +02:00
Yannik Hampe
84bdb4d1ed Merge branch 'RED-8701' into 'main'
RED-8701 - Move files to customer data repositories

See merge request fforesight/layout-parser!137
2024-04-25 09:06:35 +02:00
Dominique Eifländer
8442e60055 RED-8932 Fixed not merged headline with identifier 2024-04-24 11:45:38 +02:00
Corina Olariu
0ef67fc07b RED-8701 - Move files to customer data repositories
- update junit tests and syngenta submodule
2024-04-23 14:54:56 +03:00
Corina Olariu
6a86036a78 Merge branch 'main' into RED-8701 2024-04-23 11:46:59 +03:00
Corina Olariu
069a6c0b49 RED-8701 - Move files to customer data repositories
- update syngenta submodule
2024-04-23 10:44:23 +03:00
Corina Olariu
7eab3a4088 RED-8701 - Move files to customer data repositories
- remove customer files from project
2024-04-22 14:57:51 +03:00
Corina Olariu
48c54f63a0 RED-8701 - Move files to customer data repositories
- update submodules
2024-04-22 13:57:39 +03:00
Corina Olariu
20e4e5ddff RED-8701 - Move files to customer data repositories
- update unit tests with the new path to submodules for customer files
2024-04-22 13:37:27 +03:00
Corina Olariu
cc9816c8cb RED-8701 - Move files to customer data repositories
- use git lfs to store customer files
2024-04-18 20:31:35 +03:00
yhampe
8099a00bb6 RED-8402: Header and footer are not indexed / searched
added unit test and file
2024-04-18 14:39:01 +02:00
Corina Olariu
014eba9fc3 RED-8747 - Entities not merged properly - fp
- fix typo
- add validate table test
2024-04-09 12:14:57 +03:00
Corina Olariu
f185b13f2b RED-8747 - Entities not merged properly - fp
- use the rulings from the found tables instead of all rulings as splitting rulings in the blockification service
2024-04-08 09:42:32 +03:00
Dominique Eifländer
8e7e588d26 RED-8627: Fixed scrambled text after sorting 2024-03-19 10:58:36 +01:00
Dominique Eifländer
27aa418029 RED-7141: Fixed overlapping blocks 2024-03-13 16:14:55 +01:00
Dominique Eifländer
d6e3d6fe22 Clarifynd 2024-03-11 11:24:58 +02:00
Dominique Eifländer
79239b751d RED-7141: Implemented docstrum layout parsing 2024-03-06 11:18:40 +01:00
Maverick Studer
74f55a5cbf RED-8550: Faulty table recognition and text duplication leads to huge sections 2024-02-28 16:13:56 +01:00
Maverick Studer
1d64028158 RED-8550: Faulty table recognition and text duplication leads to huge sections 2024-02-21 13:54:30 +01:00
Timo Bejan
88855de2da Red 8085 2024-01-29 10:31:36 +01:00
Dominique Eifländer
dacc2f7f43 DM-589: Filter wrong detected cells that borders from rotation at scanning 2023-11-20 15:54:02 +01:00
Dominique Eifländer
a6ba66b1aa TAAS-103: Fixed values in wrong cells 2023-11-15 13:36:46 +01:00
Dominique Eifländer
567cbc178b hotfix: Fixed parsing for specific taas document 2023-10-17 15:52:19 +02:00
Kilian Schuettler
621c3f269d TAAS-104: merge visually intersecting Paragraphs 2023-09-05 16:09:05 +02:00
deiflaender
306a53ea79 RED-7461: Fixed wrong textblock classification if footer is marked as header 2023-09-01 12:07:47 +02:00
Kilian Schuettler
28ec4c9ccb TAAS-89: added log entry and an end2end test 2023-08-31 14:28:18 +02:00
deiflaender
60615ec5d8 RED-7461: First working iteration of header and footer improvement 2023-08-21 15:31:11 +02:00
Timo Bejan
83d39ba3a5 Fixed issue with weird colors 2023-08-18 16:21:45 +03:00
Kilian Schuettler
4a5464d6aa Refactoring to make downstream refactoring easier 2023-08-04 15:16:36 +02:00
Kilian Schuettler
41267a0f98 ported to gradle 2023-07-27 12:27:30 +02:00
Kilian Schuettler
65ab5eca22 update to redaction-service state 2023-07-25 16:10:57 +02:00
Kilian Schuettler
143ebee25e move and fix layout tests from redaction-service 2023-07-24 19:43:25 +02:00
Kilian Schuettler
ed66043856 TAAS-41: add test files 2023-07-24 16:04:51 +02:00
Kilian Schuettler
788613c92e TAAS-41: TAAS Document Structure
* added linebreaks to ParagraphData
* moved List<String> cellText to List<ParagraphData> cellTexts
2023-07-13 13:01:01 +03:00
Kilian Schuettler
7f0aa32d1b TAAS-41: TAAS Document Structure
* added more testFiles
* hacked a workaround for CMMException
2023-07-13 13:01:01 +03:00
Kilian Schuettler
aac0259caf RED-6009: Document Tree Structure
* moved all layoutparsing code to separate project
* wip (some dependency issues)
2023-04-12 11:06:28 +02:00