230 Commits

Author SHA1 Message Date
Timo
e48e4e1797 updated redirect 2021-04-20 10:32:54 +03:00
Timo
ba28a3e0d3 code format, dependency and test update, logging for reanalysis 2021-04-20 10:26:27 +03:00
Timo
1d4708ad13 reworked reanalysis and text storage 2021-04-20 09:51:50 +03:00
Timo
5c2596e268 Serialization of text 2021-04-19 13:08:32 +03:00
Timo
42fcea85d3 set image type on error 2021-04-18 11:31:33 +03:00
Timo
a34d2fb675 proper error handling for image classification 2021-04-18 11:30:53 +03:00
Timo
ed5686dc51 Re-worked cache issues 2021-04-17 09:55:46 +03:00
Timo
8112f2035a close final PDDocument 2021-04-17 09:18:01 +03:00
Timo
169ab20351 fixed in-memory storage issues 2021-04-17 09:10:40 +03:00
Timo
8060e3a29f "fixed" memory issues by calling GC manually, removing soft reference cache and disposing images properly 2021-04-16 23:13:12 +03:00
Timo
4749858e80 attempted fix for image classification 2021-04-16 20:53:09 +03:00
Timo
93d75e2f1c added versions to analyze result 2021-04-16 15:52:17 +03:00
Timo
5cb4ea287c Reworked re-analysis and analysis to use memory model / directly read/store files, and dumped pd doc wherever possible 2021-04-16 14:50:04 +03:00
Dominique Eifländer
c157a80630 Fixed endless loop on documents that contains no text 2021-04-16 09:42:29 +02:00
Dominique Eifländer
55ba351362 RED-1260: Enabled to add rules and manual redaction actions for images 2021-04-15 12:51:29 +02:00
Dominique Eifländer
ae28555bf4 RED-1260: First steps for image classification 2021-04-09 13:37:00 +02:00
Dominique Eifländer
2558b3cab8 Integrate image classification 2021-04-09 11:44:12 +02:00
Dominique Eifländer
fc70f972da Do not remove images on reanalysis 2021-04-08 15:47:11 +02:00
Dominique Eifländer
e7c24487c7 Do not remove images at reanalysis 2021-04-08 15:22:25 +02:00
Dominique Eifländer
8375c04829 RED-1276: Fixed annotation position problem for 270° rotated pages 2021-04-08 10:30:51 +02:00
Dominique Eifländer
0638877d0a RED-1061: Upgraded to newest spring boot/cloud and reenabled actuator metrics 2021-03-15 13:22:14 +01:00
Dominique Eifländer
aac4d0437e RED-1170: Do not use non-dictionary hints for local analysis 2021-03-15 11:35:40 +01:00
Dominique Eifländer
bd56364cc3 Fixed importing case-insensitive dictionaries 2021-03-12 15:08:34 +01:00
Dominique Eifländer
f4ea236fc5 Fixed reanalysis for case-insensitive dictionary entries 2021-03-11 13:57:15 +01:00
Dominique Eifländer
511092b9e7 First steps for incremental analysis 2021-03-08 16:31:37 +01:00
Dominique Eifländer
0dd47111c5 RED-1090: Return text with areas and sectionNumber 2021-02-25 14:55:53 +01:00
Dominique Eifländer
b6a13d1ff8 Adjusted rules to new requirements, add possibility to addRedaction for rules 2021-02-18 10:33:38 +01:00
Dominique Eifländer
de1dea7ac3 RED-1070: Fixed not finding annotation on not classified textblocks 2021-02-18 08:58:51 +01:00
Dominique Eifländer
e0fba8d38c RED-1055: Enabled to force redact ignore annotations 2021-02-12 10:41:32 +01:00
lmaldacker
0f263c69b8 Add hint annotation by reg-ex and expand to hint annotation by reg-ex 2021-02-11 16:53:07 +01:00
Dominique Eifländer
00b0cb1603 RED-1039: Fixed finding textpositions, RED-1042: Fixed get rectangles per line 2021-02-08 14:09:11 +01:00
Dominique Eifländer
8965e76548 RED-1046: Ignore dictionary rank for words that are explicitly set in the rules 2021-02-05 14:55:16 +01:00
Dominique Eifländer
577db37b11 RED-1045: Enabled to redact in headers and footers 2021-02-05 13:05:12 +01:00
Dominique Eifländer
a101b98a40 Fixed several table extraction problems 2021-02-04 15:08:48 +01:00
Dominique Eifländer
fc2ac03691 Fixed table extraction problems 2021-02-03 14:34:29 +01:00
Dominique Eifländer
35f3582d08 Expand CBI Authors with firstname initials 2021-02-02 12:20:40 +01:00
Dominique Eifländer
d8e444280b Fixed position problem for rotated images 2021-02-01 16:04:45 +01:00
Dominique Eifländer
39ca191b9c Ignore too small images 2021-02-01 15:14:46 +01:00
Dominique Eifländer
acddfafa5b Annotate images 2021-02-01 13:52:18 +01:00
Dominique Eifländer
1ed9941259 RED-1019: Fixed returning numberOfPages 2021-01-29 13:17:25 +01:00
Timo
1091d6a886 fixed issue affecting service on restarts 2021-01-28 11:05:01 +02:00
Dominique Eifländer
d739a4f2f5 RED-1010: Split redaction endpoint into analysis and annotation endpoints 2021-01-27 15:19:38 +01:00
Dominique Eifländer
62b960f2ea RED-1004: Fixed position problems when mediabox is bigger than cropbox 2021-01-27 11:34:17 +01:00
Dominique Eifländer
a76095c5d6 Always check dictionary rank when overriding annotations 2021-01-26 12:04:26 +01:00
Dominique Eifländer
43a3d76f1c Added possibility to redact all Author tables 2021-01-25 15:59:08 +01:00
Dominique Eifländer
7dfaf604e7 Fixed recognize rules and exceptions for redactAndRecommend, fixed wrong order of rules(published information) 2021-01-25 11:20:35 +01:00
Dominique Eifländer
531eeebae1 Recognize rules and exceptions for redactAndRecommend, Do not redact publish information authors in new ruleset 2021-01-25 09:46:09 +01:00
Dominique Eifländer
33795527fd RED-934: Added rule to redact purity and et al Authors will be redacted where they are found 2021-01-20 13:18:37 +01:00
Dominique Eifländer
89f642ba90 RED-937: Handle rules version per RuleSetId 2021-01-13 12:52:59 +01:00
Dominique Eifländer
9104db6fa4 Clean recommendation values starting with : 2021-01-12 13:41:19 +01:00