Timo Bejan
5b6a706c28
CLAR-139 - fixed outline error for unparsable object
2024-08-08 16:20:14 +03:00
Timo Bejan
0c1583c1be
Fixed index out of bounds exception in blockificationpostprocessingservice - this code should be documented btw, there are also probably other use cases where the code doesn't work
2024-07-30 17:45:05 +03:00
Andrei Isvoran
cc4f09711e
RED-9607 - Correctly determine text position sequence based on file rotation
2024-07-24 16:35:11 +03:00
Maverick Studer
8c052c38d7
CLARI: document-data-markdown
2024-07-18 17:19:43 +02:00
Kilian Schüttler
2726fc3fe1
RED-8800: adjust coordinates in BE to ignore cropbox
2024-07-15 17:45:13 +02:00
Kilian Schüttler
ec0dd032c9
RED-9353: refactor PDFTronViewerDocumentService
2024-07-15 12:54:17 +02:00
Kilian Schuettler
e920eb5a78
CLARI-003: add treeId to StructureObject
2024-07-01 13:56:16 +02:00
Kilian Schüttler
66d3433e04
RED-9353: use azure ocr service
2024-07-01 11:13:26 +02:00
Yannik Hampe
39f527a57c
Merge branch 'main' into 'RED-3813'
...
# Conflicts:
# layoutparser-service/layoutparser-service-processor/src/main/java/com/knecon/fforesight/service/layoutparser/processor/LayoutParsingPipeline.java
2024-06-26 09:10:59 +02:00
Kilian Schuettler
2e2f30ba35
RED-9194: roll back single digit headline change
2024-06-21 14:42:30 +02:00
Kilian Schuettler
9f7ed974ec
RED-9194: roll back single digit headline change
2024-06-21 14:41:30 +02:00
Kilian Schuettler
570a348a77
RED-9194: roll back single digit headline change
2024-06-21 14:39:47 +02:00
Maverick Studer
1c5d755111
hotfix for table/paragraph section creation on document start before first headline
2024-06-18 17:36:04 +02:00
Maverick Studer
da91fcff97
RED-9374: Ner Entities are at wrong locations
2024-06-18 16:31:24 +02:00
Kilian Schuettler
b719db86ab
RED-9194: allow single digit headline identifiers
2024-06-06 16:32:05 +02:00
maverickstuder
3d2f66cf10
fixed issue with thread-safety of local fields in the HeadlineClassificationService:
...
* HeadlineClassificationService is no longer a singleton
* instead initialize it in the ClassificationService and pass it to the classifyMethods as required
2024-06-06 14:39:23 +02:00
Maverick Studer
c05f67cf44
RED-7074: Design Subsection section tree structure algorithm
2024-06-06 13:22:14 +02:00
yhampe
9ecf9ca19f
RED-3813: Recategorize same image as experimental feature
...
now writing hash into structure
2024-06-05 14:20:33 +02:00
Corina Olariu
072a8aa3da
RED-9206 - Sections are no longer correctly separated from each other in the test file
...
- add REDACT_MANAGER_WITHOUT_DUPLICATE_PARAGRAPH case
2024-06-05 14:26:54 +03:00
Corina Olariu
5f5a6258c5
Merge branch 'main' into RED-9206
2024-06-05 13:34:14 +03:00
Maverick Studer
5d33ad570e
RED-7074: Design Subsection section tree structure algorithm
2024-06-05 12:28:00 +02:00
Corina Olariu
fd698a78fc
RED-9206 - Sections are no longer correctly separated from each other in the test file
...
- introduce new layout parsing type: REDACT_MANAGER_WITHOUT_DUPLICATE_PARAGRAPH to include changes from REDACT_MANAGER apart from duplicate paragraph.
- updated junit tests
2024-06-04 20:55:37 +03:00
Maverick Studer
fc06dba2ce
RED-7074: Design Subsection section tree structure algorithm
2024-06-04 15:07:40 +02:00
Maverick Studer
efb1a748af
RED-7074: Design Subsection section tree structure algorithm
2024-05-28 14:48:21 +02:00
yhampe
9be672c728
RED-3813: Recategorize same image as experimental feature
...
working on pushing properties to persistence service
2024-05-28 13:51:45 +02:00
Maverick Studer
48b7a22e2b
RED-7074: Design Subsection section tree structure algorithm
2024-05-24 13:30:25 +02:00
Corina Olariu
0ed1481517
RED-9177 - Layout parser fails to process file
...
- use originFile as viewerDocumentFile
- return layoutGridOCGName in case the name is found and not check further properties
2024-05-22 13:02:42 +03:00
Andrei Isvoran
3835d03036
RED-9149 - Remove header detection
2024-05-20 14:59:34 +03:00
yhampe
a5fcebce30
RED-3813: Recategorize same image as experimental feature
...
added representation to image and DocumentStructure
2024-05-17 07:34:05 +02:00
Kilian Schuettler
8648ed0952
hotfix for clarifynd
2024-05-15 14:02:02 +02:00
Andrei Isvoran
40465e8778
RED-9149 - Improvements
2024-05-13 15:13:37 +03:00
Andrei Isvoran
a76b2ace3f
RED-9149 - Address comments
2024-05-13 13:18:33 +03:00
Andrei Isvoran
aeaca2f278
RED-9149 - Header and footer extraction by page-association
2024-05-10 16:04:06 +03:00
Andrei Isvoran
f1dbcc24a2
RED-9149 - Header and footer extraction by page-association
2024-05-10 15:49:08 +03:00
Andrei Isvoran
fda25852d1
RED-9149 - Header and footer extraction by page-association
2024-05-10 15:17:41 +03:00
Dominique Eifländer
87001090d5
RED-8933: Fixed bugs in DocumineClassificationService
2024-05-08 13:01:23 +02:00
Kilian Schuettler
6a65d7f9fc
RED-8825: minor fixes
...
* also added overrides via env variables
2024-05-07 17:37:42 +02:00
Kilian Schuettler
e935cc7b14
RED-8825: some fixes, and experimental column detector
2024-05-06 14:24:39 +02:00
Kilian Schuettler
abb249e966
RED-8825: general layoutparsing improvements
...
* fix checkstyle
2024-05-03 00:15:31 +02:00
Kilian Schuettler
60acbac53f
RED-8825: general layoutparsing improvements
...
* fixing a bunch of coordinates
2024-05-03 00:06:29 +02:00
Kilian Schuettler
a3decd292d
RED-8825: general layoutparsing improvements
...
* fix RulingCleaningService
2024-05-02 23:00:22 +02:00
Kilian Schuettler
b6f0a21886
RED-8825: general layoutparsing improvements
...
* refactor all coordinates
2024-05-02 21:01:25 +02:00
Kilian Schuettler
d61cac8b4f
RED-8825: general layoutparsing improvements
...
* fix tests
2024-04-30 16:06:22 +02:00
Kilian Schuettler
ae46c5f1ca
RED-8825: general layoutparsing improvements
...
* fix tests
2024-04-30 11:55:18 +02:00
Kilian Schuettler
15ea385f4d
RED-8825: general improvements
...
* some more refactoring
* fixed text ruling classification for vertical text
* shrunk min graphics size
2024-04-30 10:44:32 +02:00
Kilian Schuettler
08be18db2d
RED-8825: general improvements
...
* some more refactoring
2024-04-29 20:09:53 +02:00
Kilian Schuettler
64209255cb
RED-8825: general improvements
...
* classify rulings as underline/strikethrough
* improve performance of CleanRulings.lineBetween
* use lineBetween where possible
* wip, still todo:
- Header/Footer by Ruling for all rotations
- actually the ticket, optimizing layoutparsing for documine
2024-04-29 17:24:15 +02:00
Kilian Schuettler
4761d2e1a2
RED-8825: general improvements
...
* classify rulings as underline/strikethrough
* improve performance of CleanRulings.lineBetween
* use lineBetween where possible
* wip, still todo:
- Header/Footer by Ruling for all rotations
- actually the ticket, optimizing layoutparsing for documine
2024-04-29 17:22:33 +02:00
Kilian Schuettler
1916e626df
RED-8825: general improvements
...
* classify rulings as underline/strikethrough
* improve performance of CleanRulings.lineBetween
* use lineBetween where possible
* wip, still todo:
- Header/Footer by Ruling for all rotations
- actually the ticket, optimizing layoutparsing for documine
2024-04-29 17:15:19 +02:00
Kilian Schuettler
e4663ac8db
RED-8825: added split by ruling into every step of docstrum
2024-04-29 15:54:56 +02:00