Kilian Schuettler
8ca41cf340
RED-10127: add more units
2024-10-15 09:47:07 +02:00
Kilian Schuettler
d8394d9a78
RED-10127: improve list classification
* add one more format to list identification
* add 'ppb' to known units
* special case for headlines continuing with 14C after the identifier (quite often in some specific files)
2024-10-14 17:22:19 +02:00
Kilian Schüttler
d614aed96a
RED-10127: add list classification
2024-10-10 10:50:17 +02:00
Kilian Schüttler
8c28a46817
RED-10127: improve headline detection
2024-10-09 09:56:04 +02:00
Maverick Studer
8a11d838b9
RM-187: Footers are recognized in the middle of the page
2024-10-08 14:27:55 +02:00
Dominique Eifländer
dda5a2c719
RED-9975: Fixed missing section numbers in layout grid
2024-09-18 11:20:15 +02:00
Dominique Eifländer
b08c102f76
RED-9974: Disabled failing test because of different header/footers
2024-09-16 13:32:44 +02:00
Dominique Eifländer
6acc85266c
RED-9974: Ignore enoughChars when section identifier regex matches for documine old
2024-09-16 12:16:11 +02:00
Dominique Eifländer
a4d6d2326e
RED-9974: Do not rewrite outline as pdftron crashes in some cases
2024-09-16 10:50:24 +02:00
Dominique Eifländer
a337fdf684
RED-9974: Ignore pmd errors that only occur on build server
2024-09-16 10:18:27 +02:00
Kilian Schuettler
95e6fdecd7
RED-9974: wip
2024-09-16 09:46:41 +02:00
Kilian Schuettler
1337c56591
RED-9974: wip
2024-09-16 09:46:31 +02:00
Kilian Schuettler
31bf4ba8c8
hotfix: viewerDocService doesn't remove existing marked content
2024-09-16 09:46:16 +02:00
Kilian Schüttler
41ba531734
RED-9975: improve SuperSection handling
2024-09-11 13:38:04 +02:00
Dominique Eifländer
4a624f9642
RED-9976: Removed sorting that scrambles text in PDFTextStripper
2024-09-10 12:48:28 +02:00
Kilian Schuettler
90a1187921
hotfix: unmerge super large tables
2024-09-05 14:50:35 +02:00
Kilian Schuettler
09c18c110a
hotfix: unmerge super large tables
2024-09-05 14:26:45 +02:00
Kilian Schuettler
49604cd96e
hotfix: add Java advanced imaging
2024-09-04 15:19:43 +02:00
Kilian Schuettler
302d8b884f
RED-9964: fix errors with images
2024-09-03 16:38:17 +02:00
Dominique Eifländer
8de9d8309f
RED-9988: Fixed NPE when image representation is not present
2024-09-02 09:18:38 +02:00
Kilian Schüttler
e8605f4956
RED-9975: fix outline detection
2024-08-30 17:48:02 +02:00
Kilian Schüttler
8496b48cde
RED-9975: add outline debug layer
2024-08-30 14:18:09 +02:00
Kilian Schüttler
10e525f0de
RED-9964: don't merge tables on non-consecutive pages or with tables in between
2024-08-30 14:00:50 +02:00
Dominique Eifländer
e1d8d1ea3b
RED-9974: Improved headline detection for documine old
2024-08-30 10:35:24 +02:00
Kilian Schuettler
7c88c30ca7
RED-9975: activate outline detection
2024-08-29 14:17:20 +02:00
Kilian Schuettler
338c6c5dd0
RED-9975: activate outline detection
2024-08-29 12:27:20 +02:00
Dominique Eifländer
81469413b0
RED-9760: Fixed NullPointerException in TextPageBlock
2024-08-13 13:18:50 +02:00
Kilian Schüttler
8e115dcd8a
RED-9760: change compareDouble to something sensible
2024-08-12 16:02:50 +02:00
Kilian Schuettler
b0ae00aa02
hotfix: threshold adjustments
2024-08-12 14:52:18 +02:00
Kilian Schuettler
d16377a24a
hotfix: line comparison with center coordinates
2024-08-09 15:45:23 +02:00
Dominique Eifländer
1953b5924f
RED-9760: Changed lineSeparation threshold for documine old
2024-08-09 14:42:14 +02:00
Kilian Schüttler
69bcd4f68d
hotfix: reading order
2024-08-09 11:49:12 +02:00
Timo Bejan
cdc2081785
CLARI-140 - case issue
2024-08-08 22:40:11 +03:00
Timo Bejan
5b6a706c28
CLAR-139 - fixed outline error for unparsable object
2024-08-08 16:20:14 +03:00
Timo Bejan
0c1583c1be
Fixed index-out-of-bounds exception in blockificationpostprocessingservice; this code should be documented, and there are probably other use cases where the code doesn't work
2024-07-30 17:45:05 +03:00
Andrei Isvoran
cc4f09711e
RED-9607 - Correctly determine text position sequence based on file rotation
2024-07-24 16:35:11 +03:00
Maverick Studer
8c052c38d7
CLARI: document-data-markdown
2024-07-18 17:19:43 +02:00
Kilian Schüttler
2726fc3fe1
RED-8800: adjust coordinates in BE to ignore cropbox
2024-07-15 17:45:13 +02:00
Kilian Schüttler
ec0dd032c9
RED-9353: refactor PDFTronViewerDocumentService
2024-07-15 12:54:17 +02:00
Andrei Isvoran
65b1f7d179
RED-9496 - Implement graceful shutdown
2024-07-04 14:21:20 +03:00
Kilian Schuettler
e920eb5a78
CLARI-003: add treeId to StructureObject
2024-07-01 13:56:16 +02:00
Kilian Schüttler
66d3433e04
RED-9353: use azure ocr service
2024-07-01 11:13:26 +02:00
Yannik Hampe
39f527a57c
Merge branch 'main' into 'RED-3813'
# Conflicts:
# layoutparser-service/layoutparser-service-processor/src/main/java/com/knecon/fforesight/service/layoutparser/processor/LayoutParsingPipeline.java
2024-06-26 09:10:59 +02:00
yhampe
5c2844fe31
RED-3813: Recategorize same image as experimental feature
fixed failing test
2024-06-26 09:08:37 +02:00
Kilian Schuettler
2e2f30ba35
RED-9194: roll back single digit headline change
2024-06-21 14:42:30 +02:00
Kilian Schuettler
9f7ed974ec
RED-9194: roll back single digit headline change
2024-06-21 14:41:30 +02:00
Kilian Schuettler
570a348a77
RED-9194: roll back single digit headline change
2024-06-21 14:39:47 +02:00
Maverick Studer
1c5d755111
hotfix for table/paragraph section creation on document start before first headline
2024-06-18 17:36:04 +02:00
Maverick Studer
da91fcff97
RED-9374: Ner Entities are at wrong locations
2024-06-18 16:31:24 +02:00
Kilian Schuettler
b719db86ab
RED-9194: allow single digit headline identifiers
2024-06-06 16:32:05 +02:00