216 Commits

Author SHA1 Message Date
Corina Olariu
bdcb9aeda4 RED-8701 - Move files to customer data repositories
- update junit tests
2024-04-23 11:49:29 +03:00
Corina Olariu
6a86036a78 Merge branch 'main' into RED-8701 2024-04-23 11:46:59 +03:00
Corina Olariu
a358d7565e RED-8701 - Move files to customer data repositories
- update junit tests
2024-04-23 11:12:57 +03:00
Corina Olariu
069a6c0b49 RED-8701 - Move files to customer data repositories
- update syngenta submodule
2024-04-23 10:44:23 +03:00
Corina Olariu
7eab3a4088 RED-8701 - Move files to customer data repositories
- remove customer files from project
2024-04-22 14:57:51 +03:00
Corina Olariu
970fc99ed1 RED-8701 - Move files to customer data repositories
- update junit test
2024-04-22 14:14:47 +03:00
Corina Olariu
48c54f63a0 RED-8701 - Move files to customer data repositories
- update submodules
2024-04-22 13:57:39 +03:00
Corina Olariu
20e4e5ddff RED-8701 - Move files to customer data repositories
- update unit tests with the new path to submodules for customer files
2024-04-22 13:37:27 +03:00
Dominique Eifländer
c947d552d2 Merge branch 'RED-8995-fp' into 'main'
RED-8995: unclassified text might be missing from document data

See merge request fforesight/layout-parser!135
2024-04-19 09:21:50 +02:00
Corina Olariu
cc9816c8cb RED-8701 - Move files to customer data repositories
- use git lfs to store customer files
2024-04-18 20:31:35 +03:00
Kilian Schuettler
f256f9b30f RED-8995: unclassified text might be missing from document data
* treat TablePageBlock.OTHER like PARAGRAPH (no special treatment)
2024-04-18 17:42:34 +02:00
yhampe
8099a00bb6 RED-8402: Header and footer are not indexed / searched
added unit test and file
2024-04-18 14:39:01 +02:00
Kilian Schüttler
c4d9c5df02 Merge branch 'RED-8747-fp' into 'main'
RED-8747 - Entities not merged properly - fp

See merge request fforesight/layout-parser!131
2024-04-09 16:30:02 +02:00
Corina Olariu
976f408237 RED-8747 - Entities not merged properly - fp
- rework the extraction of rulings from the table cells
2024-04-09 14:38:48 +03:00
Corina Olariu
319268c53d RED-8747 - Entities not merged properly - fp
- update test
2024-04-09 12:24:19 +03:00
Corina Olariu
014eba9fc3 RED-8747 - Entities not merged properly - fp
- fix typo
- add validate table test
2024-04-09 12:14:57 +03:00
yhampe
c13ff7fbf6 RED-8402: Header and footer are not indexed / searched
checkstyle
added review comments
2024-04-08 12:17:49 +02:00
yhampe
0c3194276a RED-8402: Header and footer are not indexed / searched
added headers and footers to simplifiedtext
2024-04-08 12:02:36 +02:00
Corina Olariu
f185b13f2b RED-8747 - Entities not merged properly - fp
- use the rulings from the found tables, instead of all rulings, as splitting rulings in the blockification service
2024-04-08 09:42:32 +03:00
Dominique Eifländer
990c376ce6 Merge branch 'RED-8873' into 'main'
RED-8773 - Fix images not appearing on specific file

See merge request fforesight/layout-parser!123
2024-04-05 10:11:23 +02:00
Kilian Schuettler
f18bda1d4e RED-8799: LayoutGrid is drawn wrong for some tables 2024-04-04 13:33:22 +02:00
Andrei Isvoran
456b8fe4a1 RED-8773 - Fix images not appearing on specific file 2024-04-03 10:20:46 +03:00
maverickstuder
9778ece992 RED-8702: Explore document databases to store entityLog
* fix for duplicate images in document structure that are linked to multiple sections
2024-04-02 14:19:14 +02:00
Timo Bejan
5c1708f97f Issue with merging text blocks multiple times 2024-03-22 12:47:05 +02:00
Dominique Eifländer
8e7e588d26 RED-8627: Fixed scrambled text after sorting 2024-03-19 10:58:36 +01:00
Dominique Eifländer
1d765a6baa RED-7141: Fixed more overlap problems 2024-03-14 16:30:52 +01:00
Dominique Eifländer
27aa418029 RED-7141: Fixed overlapping blocks 2024-03-13 16:14:55 +01:00
Dominique Eifländer
92fd1a72de RED-7141: Readded lost mergeLinesInZones 2024-03-12 13:42:40 +01:00
Dominique Eifländer
0d3d25e7d7 Merge branch 'RED-7141-hotfix' into 'main'
RED-7141: Align backend text sorting with Webviewer sorting

See merge request fforesight/layout-parser!115
2024-03-12 11:15:41 +01:00
maverickstuder
956fbff872 RED-7141: Align backend text sorting with Webviewer sorting
* hotfix for tables not being detected due to wrong x-y-sorting
2024-03-12 11:06:53 +01:00
maverickstuder
16be2467fd RED-8715: Improve NearestNeighbor Algorithm in LayoutParser
* replaced the old algorithm with an algorithm based on a kd-tree
2024-03-11 14:42:28 +01:00
Timo Bejan
dfc23955d7 Linespacing clarifynd 2024-03-11 11:30:51 +02:00
Dominique Eifländer
d6e3d6fe22 Clarifynd 2024-03-11 11:24:58 +02:00
Timo Bejan
65ab7a1912 CLARI-30 - forward analysis headers 2024-03-08 16:47:27 +02:00
Timo Bejan
56c07a4491 CLARI-30 - identifier fix for clarifynd 2024-03-08 16:23:27 +02:00
Dominique Eifländer
0ad0cd45d6 RED-7141: Moved docstrum to root level of processor package 2024-03-08 14:20:28 +01:00
Dominique Eifländer
d659fe7234 RED-7141: Performance improvements 2024-03-08 10:00:52 +01:00
Dominique Eifländer
cb9127b4f3 RED-7141: Fixed pr finding and improved speed 2024-03-07 16:51:48 +01:00
Timo Bejan
05523585c0 orchestrator/persistence service should control queues 2024-03-06 16:55:44 +02:00
Timo Bejan
4ced572949 orchestrator/persistence service should control queues 2024-03-06 16:53:10 +02:00
Dominique Eifländer
79239b751d RED-7141: Implemented docstrum layout parsing 2024-03-06 11:18:40 +01:00
yhampe
a6ba501fa8 RED-8481: Use visual layout parsing to detect signatures
fixed some nullpointer errors
2024-02-29 09:22:27 +01:00
Maverick Studer
74f55a5cbf RED-8550: Faulty table recognition and text duplication leads to huge sections 2024-02-28 16:13:56 +01:00
Kilian Schuettler
f4d789311c hotfix: double viewerdoc writes in rare cases lead to some contentstreams not being written 2024-02-26 12:24:15 +01:00
yhampe
477f6af886 RED-8481: Use visual layout parsing to detect signatures
added a new layer for visual parsing results

checkstyle
2024-02-23 14:02:53 +01:00
yhampe
2c171b6a9e RED-8481: Use visual layout parsing to detect signatures
added a new layer for visual parsing results

codestyle
2024-02-23 13:55:11 +01:00
yhampe
71477dabde RED-8481: Use visual layout parsing to detect signatures
added a new layer for visual parsing results

codestyle
2024-02-23 12:46:51 +01:00
yhampe
a927cbd9dc RED-8481: Use visual layout parsing to detect signatures
added a new layer for visual parsing results

fixed tests
2024-02-23 12:38:05 +01:00
yhampe
a1521877d7 RED-8481: Use visual layout parsing to detect signatures
added a new layer for visual parsing results

added a source label to image properties to enable rules
2024-02-23 12:20:11 +01:00
Maverick Studer
1d64028158 RED-8550: Faulty table recognition and text duplication leads to huge sections 2024-02-21 13:54:30 +01:00